Abstract:
In some examples, computing devices can partition timestamped data into groups. The computing devices can then distribute the timestamped data based on the groups. The computing devices can also obtain copies of a script configured to process the timestamped data, such that each computing device receives a copy of the script. The computing devices can determine one or more code segments associated with the groups based on content of the script. The one or more code segments can be in one or more programming languages that are different from the programming language of the script. The computing devices can then run the copies of the script to process the timestamped data within the groups. This may involve interacting with one or more job servers configured to run the one or more code segments associated with the groups.
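The partitioning and distribution steps can be illustrated with a minimal sketch. The record layout, the `region` grouping key, the round-robin assignment, and the function names `partition_by_group` and `assign_groups` are all assumptions made for illustration; the abstract does not specify how groups are formed or distributed, and the script and job-server steps are not shown.

```python
from collections import defaultdict
from datetime import datetime


def partition_by_group(records, key):
    """Group timestamped records by `key`, keeping each group time-ordered."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key]].append(rec)
    for rows in groups.values():
        rows.sort(key=lambda r: r["timestamp"])
    return dict(groups)


def assign_groups(groups, n_workers):
    """Assign whole groups to workers round-robin so that no group is split."""
    assignment = {w: [] for w in range(n_workers)}
    for i, name in enumerate(sorted(groups)):
        assignment[i % n_workers].append(name)
    return assignment


# Hypothetical sample data: three records in two groups.
records = [
    {"timestamp": datetime(2024, 1, 2), "region": "east", "value": 5.0},
    {"timestamp": datetime(2024, 1, 1), "region": "east", "value": 3.0},
    {"timestamp": datetime(2024, 1, 1), "region": "west", "value": 7.0},
]
groups = partition_by_group(records, "region")
assignment = assign_groups(groups, n_workers=2)
```

Keeping each group on a single worker means a per-group script can process a complete series without cross-device communication.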
Abstract:
Systems and methods are provided for analyzing unstructured time-stamped data. A distribution of the time-stamped data is analyzed to identify a plurality of potential time series data hierarchies for structuring the data. Each potential time series data hierarchy may be analyzed, which may include determining an optimal time series frequency and a data sufficiency metric for that hierarchy. One of the potential time series data hierarchies may be selected based on a comparison of the data sufficiency metrics. Multiple time series may be derived in a single-read pass according to the selected time series data hierarchy. A time series forecast corresponding to at least one of the derived time series may be generated.
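One way a frequency and a sufficiency metric could be evaluated is sketched below. The metric used here (the fraction of time buckets that contain at least one observation), the candidate frequencies, the 0.8 threshold, and the names `fill_ratio` and `choose_frequency` are assumptions for illustration only; the abstract does not define the actual metric or selection rule.

```python
from datetime import datetime, timedelta

# Hypothetical candidate frequencies, ordered from finest to coarsest.
CANDIDATE_STEPS = {
    "hour": timedelta(hours=1),
    "day": timedelta(days=1),
    "week": timedelta(weeks=1),
}


def fill_ratio(timestamps, step):
    """Fraction of buckets of width `step` holding at least one timestamp."""
    start, end = min(timestamps), max(timestamps)
    n_buckets = int((end - start) / step) + 1
    occupied = {int((t - start) / step) for t in timestamps}
    return len(occupied) / n_buckets


def choose_frequency(timestamps, min_ratio=0.8):
    """Return the finest candidate whose fill ratio reaches `min_ratio`.

    Falls back to the coarsest candidate if none qualifies.
    """
    name, ratio = None, 0.0
    for name, step in CANDIDATE_STEPS.items():
        ratio = fill_ratio(timestamps, step)
        if ratio >= min_ratio:
            break
    return name, ratio


# Ten observations, one per day at noon.
timestamps = [datetime(2024, 1, 1, 12) + timedelta(days=i) for i in range(10)]
frequency, sufficiency = choose_frequency(timestamps)
```

With this sample, hourly buckets are almost all empty, so the daily frequency is the finest one that passes the threshold.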
Abstract:
Some examples herein describe time-series recognition and analysis techniques with computer vision. In one example, a system can access an image depicting data lines representing time series datasets. The system can execute a clustering process to assign pixels in the image to pixel clusters. The system can generate image masks based on attributes of the pixel clusters, and identify a respective set of line segments defining the respective data line associated with each image mask. The system can determine pixel sets associated with the time series datasets based on the respective set of line segments associated with each image mask, and provide one or more pixel sets as input for a computing operation that processes the pixel sets and returns a processing result. The system may then display the processing result on a display device or perform another task based on the processing result.
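The pixel clustering and mask generation steps can be sketched as follows. This is a simplified stand-in, not the described system: grouping pixels by quantized RGB color replaces the clustering process, the most common cluster is assumed to be the background, line-segment detection is skipped, and all function names are hypothetical.

```python
from collections import defaultdict


def cluster_pixels(image, levels=4):
    """Assign each pixel to a cluster keyed by its quantized RGB color."""
    step = 256 // levels
    clusters = defaultdict(list)
    for y, row in enumerate(image):
        for x, (r, g, b) in enumerate(row):
            clusters[(r // step, g // step, b // step)].append((x, y))
    return clusters


def line_masks(image, levels=4):
    """Build one boolean mask per non-background cluster."""
    clusters = cluster_pixels(image, levels)
    # Assumption: the largest cluster is the plot background.
    background = max(clusters, key=lambda k: len(clusters[k]))
    height, width = len(image), len(image[0])
    masks = {}
    for key, pixels in clusters.items():
        if key == background:
            continue
        mask = [[False] * width for _ in range(height)]
        for x, y in pixels:
            mask[y][x] = True
        masks[key] = mask
    return masks


def mask_to_points(mask):
    """Reduce a mask to one (x, mean y) point per column that has set pixels."""
    points = []
    for x in range(len(mask[0])):
        ys = [y for y, row in enumerate(mask) if row[x]]
        if ys:
            points.append((x, sum(ys) / len(ys)))
    return points


# Hypothetical 4x4 image: a red and a blue diagonal line on white.
W, R, B = (255, 255, 255), (255, 0, 0), (0, 0, 255)
image = [
    [R, W, W, B],
    [W, R, B, W],
    [W, B, R, W],
    [B, W, W, R],
]
masks = line_masks(image)
pixel_sets = {key: mask_to_points(mask) for key, mask in masks.items()}
```

Each resulting pixel set is an ordered list of points that a downstream operation could treat as one recovered time series.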
Abstract:
Systems and methods are provided for adjusting a set of predicted future data points for a time series data set. A system may include a receiver for receiving the time series data set, one or more processors, and one or more non-transitory computer-readable storage media containing instructions. A count series forecasting engine, utilizing the one or more processors, generates a set of counts corresponding to discrete values of the time series data set. An optimal discrete probability distribution for the set of counts is selected. A set of parameters is generated for the optimal discrete probability distribution. A statistical model is selected to generate a set of predicted future data points. The set of predicted future data points is adjusted using the generated set of parameters for the optimal discrete probability distribution in order to provide greater accuracy with respect to predictions of future data points.
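The count generation and distribution selection steps can be sketched as below. The choice of candidates (Poisson and geometric), the use of maximum log-likelihood as the selection criterion, and the function names are assumptions for illustration; the abstract does not say which distributions are considered or how "optimal" is judged, and the adjustment of forecasts is not shown.

```python
import math
from collections import Counter


def poisson_fit(data):
    """Maximum-likelihood Poisson fit: returns (log-likelihood, parameters)."""
    lam = sum(data) / len(data)
    loglik = sum(x * math.log(lam) - lam - math.lgamma(x + 1) for x in data)
    return loglik, {"lambda": lam}


def geometric_fit(data):
    """Maximum-likelihood fit of a geometric distribution on 0, 1, 2, ..."""
    p = 1.0 / (1.0 + sum(data) / len(data))
    loglik = sum(x * math.log(1.0 - p) + math.log(p) for x in data)
    return loglik, {"p": p}


def select_distribution(data):
    """Pick the candidate with the highest log-likelihood.

    Assumes non-negative integer data with a positive mean.
    """
    if not data or sum(data) == 0:
        raise ValueError("need non-empty count data with a positive mean")
    fits = {"poisson": poisson_fit(data), "geometric": geometric_fit(data)}
    name = max(fits, key=lambda k: fits[k][0])
    return name, fits[name][1]


# Hypothetical count series, such as units sold per day.
series = [0, 1, 0, 2, 1, 0, 3, 1, 0, 1]
counts = Counter(series)  # how often each discrete value occurs
best_name, best_params = select_distribution(series)
```

Both candidates have a single parameter here, so comparing raw log-likelihoods is fair; with candidates of differing complexity, an information criterion would be a more natural choice.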
Abstract:
Time-series projections can be analyzed and manipulated via an interactive graphical user interface generated by a system. The graphical user interface can include a graph depicting an aggregated time-series projection (ATSP) over a future time. The ATSP can be generated by aggregating multiple time-series. The system can receive user input indicating that an existing value in the ATSP is to be overridden with an override value. In response, the system can adjust the ATSP using the override value to generate an updated version of the ATSP. The system can display the updated version of the ATSP in the graphical user interface. The system can also propagate the impact of overriding the existing value with the override value through the multiple time-series. The system can display an impact analysis portion within the graphical user interface indicating the impact of overriding the existing value with the override value on the multiple time-series.
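The override and propagation steps can be sketched as follows. The sketch assumes the ATSP is a simple sum of the underlying series and that an override is propagated to them proportionally; the abstract states neither, and the names `aggregate` and `apply_override` are hypothetical. The graphical user interface itself is not shown.

```python
def aggregate(series_by_name):
    """Sum the underlying series point by point to form the aggregate."""
    length = len(next(iter(series_by_name.values())))
    return [sum(s[t] for s in series_by_name.values()) for t in range(length)]


def apply_override(series_by_name, t, override):
    """Scale every series at index `t` so the aggregate equals `override`.

    Returns the updated series and the per-series change (the impact).
    """
    total = sum(s[t] for s in series_by_name.values())
    if total == 0:
        raise ValueError("cannot scale proportionally from a zero aggregate")
    scale = override / total
    updated = {name: list(s) for name, s in series_by_name.items()}
    impact = {}
    for name, s in updated.items():
        new_value = s[t] * scale
        impact[name] = new_value - s[t]
        s[t] = new_value
    return updated, impact


# Hypothetical projections for two products over two future periods.
projections = {"a": [10.0, 20.0], "b": [30.0, 20.0]}
before = aggregate(projections)  # [40.0, 40.0]
updated, impact = apply_override(projections, t=1, override=50.0)
after = aggregate(updated)  # [40.0, 50.0]
```

The `impact` mapping is the kind of per-series difference an impact analysis portion could display.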
Abstract:
Machines can be controlled using advanced control systems that implement an automated version of singular spectrum analysis (SSA). For example, a control system can perform SSA on a time series having one or more time-dependent variables by generating a trajectory matrix from the time series, performing singular value decomposition on the trajectory matrix to determine elementary matrices, and categorizing the elementary matrices into groups. The elementary matrices can be automatically categorized into the groups by generating one or more w-correlation matrices based on spectral components associated with the time series, determining w-correlation values based on the one or more w-correlation matrices, categorizing the w-correlation values into a predefined number of w-correlation sets, and forming the groups based on the predefined number of w-correlation sets. The control system can then generate a predictive forecast using the groups and control operation of a machine using the predictive forecast.
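Two of the SSA building blocks named above, the trajectory matrix and the w-correlation, can be sketched in plain Python using their standard textbook definitions. The singular value decomposition and the automatic grouping are omitted, and the two component series passed to `w_correlation` are hand-made stand-ins for reconstructed components rather than the output of a real decomposition. Function names are hypothetical.

```python
import math


def trajectory_matrix(series, window):
    """Build the L x K Hankel matrix with entries X[i][j] = series[i + j]."""
    k = len(series) - window + 1
    return [[series[i + j] for j in range(k)] for i in range(window)]


def w_weights(n, window):
    """Weight of each position: how many times it appears in the matrix."""
    k = n - window + 1
    l_star = min(window, k)
    return [min(i + 1, l_star, n - i) for i in range(n)]


def w_correlation(x, y, window):
    """Weighted correlation between two component series of equal length."""
    w = w_weights(len(x), window)
    inner = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    norm_x = math.sqrt(sum(wi * xi * xi for wi, xi in zip(w, x)))
    norm_y = math.sqrt(sum(wi * yi * yi for wi, yi in zip(w, y)))
    return inner / (norm_x * norm_y)


series = [1, 2, 3, 4, 5, 6]
X = trajectory_matrix(series, window=3)
weights = w_weights(len(series), window=3)  # [1, 2, 3, 3, 2, 1]

# Hand-made stand-ins for two reconstructed components.
trend = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
oscillation = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
rho = w_correlation(trend, oscillation, window=3)
```

A w-correlation near zero indicates well-separated components, while values near one suggest that two components belong in the same group.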
Abstract:
The operation of a machine can be controlled by performing reconciliation using a cluster of nodes. In one example, a node can receive parent timestamped data from a parent dataset and child timestamped data from child datasets that are children of the parent dataset in a hierarchical relationship. The parent timestamped data and the child timestamped data can relate to an operational characteristic of the machine. The node can generate computer processing-threads. Each computer processing-thread can solve one or more respective reconciliation problems between a parent data point that has a particular timestamp in the parent timestamped data and child data points that also have the particular timestamp in the child timestamped data to generate a reconciled dataset. An operational setting of the machine can then be adjusted based on the reconciled dataset.
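The per-timestamp, multi-threaded structure of the reconciliation can be sketched as below. The reconciliation rule used here (scaling child values proportionally so they sum to the parent value) is an assumption chosen for simplicity; the abstract does not state how each reconciliation problem is solved. The data layout and function names are also hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def reconcile_point(parent_value, child_values):
    """Scale child values so that they sum to the parent value."""
    total = sum(child_values.values())
    if total == 0:
        # No proportions to preserve, so split the parent value evenly.
        share = parent_value / len(child_values)
        return {name: share for name in child_values}
    return {name: v * parent_value / total for name, v in child_values.items()}


def reconcile(parent, children, max_workers=4):
    """Solve one reconciliation problem per timestamp on a thread pool.

    parent:   {timestamp: value}
    children: {child_name: {timestamp: value}}
    """
    def solve(ts):
        at_ts = {name: series[ts] for name, series in children.items()}
        return ts, reconcile_point(parent[ts], at_ts)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(solve, sorted(parent)))


# Hypothetical data: one parent and two children over two timestamps.
parent = {1: 100.0, 2: 60.0}
children = {"a": {1: 30.0, 2: 20.0}, "b": {1: 50.0, 2: 20.0}}
reconciled = reconcile(parent, children)
```

Because each timestamp is reconciled independently, the problems can be handed to separate threads without shared mutable state.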