Abstract:
Machines can be controlled using advanced control systems that implement an automated version of singular spectrum analysis (SSA). For example, a control system can perform SSA on a time series having one or more time-dependent variables by: generating a trajectory matrix from the time series; performing singular value decomposition on the trajectory matrix to determine elementary matrices; and categorizing the elementary matrices into groups. The elementary matrices can be automatically categorized into the groups by: generating one or more w-correlation matrices based on spectral components associated with the time series; determining w-correlation values based on the one or more w-correlation matrices; categorizing the w-correlation values into a predefined number of w-correlation sets; and forming the groups based on the predefined number of w-correlation sets. The control system can then generate a predictive forecast using the groups and control operation of a machine using the predictive forecast.
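The first SSA steps named above (trajectory matrix, singular value decomposition, w-correlation) can be illustrated with a minimal NumPy sketch. This is textbook SSA, not necessarily the claimed implementation; the function names, the window length of 24, and the synthetic series are assumptions made for illustration.

```python
import numpy as np

def trajectory_matrix(series, window):
    """Stack lagged windows of the series as columns (a Hankel matrix)."""
    series = np.asarray(series, dtype=float)
    k = len(series) - window + 1
    return np.column_stack([series[i:i + window] for i in range(k)])

def elementary_matrices(traj):
    """SVD of the trajectory matrix; one rank-1 matrix per singular value."""
    u, s, vt = np.linalg.svd(traj, full_matrices=False)
    return [s[i] * np.outer(u[:, i], vt[i]) for i in range(len(s))]

def diagonal_average(matrix):
    """Average each anti-diagonal to turn a matrix back into a series."""
    rows, cols = matrix.shape
    flipped = matrix[::-1]
    return np.array([flipped.diagonal(d).mean() for d in range(-rows + 1, cols)])

def w_correlation(components, window):
    """Weighted correlation between reconstructed component series (rows)."""
    n = components.shape[1]
    k = n - window + 1
    idx = np.arange(n)
    # Weight of each time point = how often it appears in the trajectory matrix.
    weights = np.minimum(np.minimum(idx + 1, n - idx), min(window, k))
    gram = (components * weights) @ components.T
    norms = np.sqrt(np.diag(gram))
    return gram / np.outer(norms, norms)

# Hypothetical example: linear trend plus a 12-step seasonal cycle.
t = np.arange(60)
series = 0.1 * t + np.sin(2 * np.pi * t / 12)
window = 24
traj = trajectory_matrix(series, window)
elems = elementary_matrices(traj)[:4]  # leading components only
comps = np.array([diagonal_average(m) for m in elems])
wcorr = w_correlation(comps, window)
print(np.round(np.abs(wcorr), 2))
```

Large off-diagonal entries in the printed matrix suggest that the corresponding components belong in the same group. How the groups are actually formed from these values is left open here.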
Abstract:
Timestamped data can be read in parallel by multiple grid-computing devices. The timestamped data, which can be partitioned into groups based on time series criteria, can be deterministically distributed across the multiple grid-computing devices based on the time series criteria. Each grid-computing device can sort and accumulate the timestamped data into a time series for each group it receives and then process the resultant time series based on a previously distributed script, which can be compiled at each grid-computing device, to generate output data. The grid-computing devices can write their output data in parallel. As a result, vast amounts of timestamped data can be analyzed across a readily expandable number of grid-computing devices with reduced computational expense.
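One way to picture the deterministic distribution and the per-group accumulation is the single-process Python sketch below. It assumes a stable hash of the group key chooses the device and that values sharing a timestamp are summed; both choices, and all names, are illustrative assumptions rather than details taken from the abstract.

```python
import hashlib
from collections import defaultdict

def assign_node(group_key, n_nodes):
    """Map a group key to a node index with a stable hash.

    Python's built-in hash() is salted per process, so it would not be
    deterministic across devices; SHA-256 is used here instead.
    """
    digest = hashlib.sha256(group_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_nodes

def distribute(records, n_nodes):
    """Route each (group_key, timestamp, value) record to its node's bucket."""
    buckets = [defaultdict(list) for _ in range(n_nodes)]
    for key, ts, value in records:
        buckets[assign_node(key, n_nodes)][key].append((ts, value))
    return buckets

def accumulate(bucket):
    """Per group: sum values sharing a timestamp, then sort by timestamp."""
    out = {}
    for key, rows in bucket.items():
        totals = defaultdict(float)
        for ts, value in rows:
            totals[ts] += value
        out[key] = sorted(totals.items())
    return out

# Hypothetical records: (group key, integer timestamp, value).
records = [("store_a", 2, 1.0), ("store_b", 1, 4.0),
           ("store_a", 1, 5.0), ("store_a", 2, 2.0)]
for node, bucket in enumerate(distribute(records, n_nodes=3)):
    print(node, accumulate(bucket))
```

Because the assignment depends only on the group key, every reader sends a given group to the same device without coordination, so each device sees the complete data for the groups it owns.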
Abstract:
Systems and methods are provided for analyzing unstructured time-stamped data. A distribution of time-stamped data is analyzed to identify a plurality of potential time series data hierarchies for structuring the data. Each potential time series data hierarchy may then be analyzed, which may include determining an optimal time series frequency and a data sufficiency metric for that hierarchy. One of the potential time series data hierarchies may be selected based on a comparison of the data sufficiency metrics. Multiple time series may be derived in a single-read pass according to the selected time series data hierarchy. A time series forecast corresponding to at least one of the derived time series may be generated.
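The abstract does not define the data sufficiency metric or how the optimal frequency is chosen. As a purely illustrative assumption, the sketch below scores a candidate frequency by the share of periods that contain at least one observation, and picks the finest frequency whose score meets a threshold. The names `bucket`, `sufficiency`, and `pick_frequency`, and the 0.8 threshold, are hypothetical.

```python
from datetime import date

def bucket(d, freq):
    """Map a date to an integer period index for the given frequency."""
    if freq == "day":
        return d.toordinal()
    if freq == "week":
        return d.toordinal() // 7  # fixed 7-day blocks, not ISO weeks
    if freq == "month":
        return d.year * 12 + (d.month - 1)
    raise ValueError(f"unknown frequency: {freq}")

def sufficiency(dates, freq):
    """Share of periods between first and last observation that hold data."""
    periods = {bucket(d, freq) for d in dates}
    return len(periods) / (max(periods) - min(periods) + 1)

def pick_frequency(dates, threshold=0.8, candidates=("day", "week", "month")):
    """Return the finest candidate frequency whose sufficiency meets the threshold."""
    for freq in candidates:
        if sufficiency(dates, freq) >= threshold:
            return freq
    return candidates[-1]

# Hypothetical data: one observation every seven days for ten weeks.
start = date(2024, 1, 1).toordinal()
dates = [date.fromordinal(start + 7 * k) for k in range(10)]
print(pick_frequency(dates), sufficiency(dates, "day"))
```

Under the same assumption, the score could be computed once per candidate hierarchy and the results compared to select among hierarchies.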
Abstract:
Disclosed are methods, systems, and computer program products for generating summary statistics for data predictions based on the aggregation of data from past time intervals. Summary statistics, such as prediction standard errors, variances, confidence limits, and other statistical measures, may be generated in a way that preserves the basic distributional properties of the original data sets. This allows, for example, multiple data sets to be reduced through the aggregation process, which may be useful for a prediction process, while still determining statistical information for the predicted data.
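As a rough illustration of summary statistics for an aggregated prediction, the sketch below sums per-interval predictions and combines their standard errors. It assumes independent, approximately normal prediction errors, so that variances add. That assumption, and the function name `aggregate_predictions`, are not taken from the disclosure.

```python
import math
from statistics import NormalDist

def aggregate_predictions(means, std_errors, level=0.95):
    """Summary statistics for the sum of independent predictions."""
    total = sum(means)
    # Under independence, the variance of a sum is the sum of the variances.
    variance = sum(se ** 2 for se in std_errors)
    std_error = math.sqrt(variance)
    # Two-sided normal quantile for the requested confidence level.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return {
        "prediction": total,
        "variance": variance,
        "std_error": std_error,
        "lower": total - z * std_error,
        "upper": total + z * std_error,
    }

# Hypothetical per-interval predictions and their standard errors.
print(aggregate_predictions([10, 20, 30], [3, 4, 0]))
```

If the prediction errors are correlated, the covariance terms would have to be added to the variance, and the limits above would otherwise be too narrow or too wide.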
Abstract:
Machines can be controlled using advanced control systems. Such control systems may use an automated version of singular spectrum analysis to control a machine. For example, a control system can perform singular spectrum analysis on a time series by: generating a trajectory matrix from the time series, performing singular value decomposition on the trajectory matrix to determine elementary matrices and corresponding eigenvalues, and automatically categorizing the elementary matrices into groups. The elementary matrices can be automatically categorized into the groups by: generating a matrix of w-correlation values based on the eigenvalues, categorizing the w-correlation values into a predefined number of w-correlation sets, and forming the groups based on the predefined number of w-correlation sets. The control system can then determine component time-series based on the groups, and generate a predictive forecast using the component time-series. The control system can use the predictive forecast to control operation of the machine.
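The grouping step can be sketched as a clustering of components by their w-correlation values. The abstract does not say how the values are categorized into the predefined number of sets; the sketch below assumes a greedy average-linkage merge on absolute w-correlation, then sums the member series of each group to obtain the component time series. The function names and the hand-written w-correlation matrix are hypothetical.

```python
import numpy as np

def group_by_w_correlation(wcorr, n_groups):
    """Greedily merge the two most correlated groups until n_groups remain."""
    wcorr = np.abs(np.asarray(wcorr, dtype=float))
    groups = [[i] for i in range(len(wcorr))]
    while len(groups) > n_groups:
        best, pair = -1.0, None
        for a in range(len(groups)):
            for b in range(a + 1, len(groups)):
                # Average |w-correlation| between members of the two groups.
                score = wcorr[np.ix_(groups[a], groups[b])].mean()
                if score > best:
                    best, pair = score, (a, b)
        a, b = pair
        groups[a] = sorted(groups[a] + groups[b])
        del groups[b]
    return groups

def component_series(elementary_series, groups):
    """Sum the elementary series (rows) belonging to each group."""
    elementary_series = np.asarray(elementary_series, dtype=float)
    return [elementary_series[g].sum(axis=0) for g in groups]

# Hypothetical w-correlation matrix for four elementary components.
wcorr = [[1.00, 0.90, 0.05, 0.00],
         [0.90, 1.00, 0.00, 0.10],
         [0.05, 0.00, 1.00, 0.80],
         [0.00, 0.10, 0.80, 1.00]]
groups = group_by_w_correlation(wcorr, n_groups=2)
print(groups)
```

Each resulting component time series could then be forecast separately and the forecasts summed, although the forecasting model itself is outside this sketch.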
Abstract:
Time-series projections can be analyzed and manipulated via an interactive graphical user interface generated by a system. The graphical user interface can include a graph depicting an aggregated time-series projection (ATSP) over a future time. The ATSP can be generated by aggregating multiple time-series. The system can receive user input indicating that an existing value in the ATSP is to be overridden with an override value. In response, the system can adjust the ATSP using the override value to generate an updated version of the ATSP. The system can display the updated version of the ATSP in the graphical user interface. The system can also propagate the impact of overriding the existing value with the override value through the multiple time-series. The system can display an impact analysis portion within the graphical user interface indicating the impact of overriding the existing value with the override value on the multiple time-series.
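To make the override propagation concrete, the sketch below assumes that the ATSP is a simple sum of the underlying time series and that an override is spread across them in proportion to their existing values at that time step. The proportional rule and all names are assumptions for illustration; the abstract does not specify the propagation method.

```python
def aggregate(series_by_name):
    """Sum the underlying series point by point to form the aggregate."""
    length = len(next(iter(series_by_name.values())))
    return [sum(s[t] for s in series_by_name.values()) for t in range(length)]

def apply_override(series_by_name, t, override):
    """Replace the aggregate value at step t and push the change down.

    Returns the updated series and the per-series impact (new minus old).
    """
    old_total = sum(s[t] for s in series_by_name.values())
    updated, impact = {}, {}
    for name, s in series_by_name.items():
        # Split proportionally; fall back to an even split if the total is zero.
        share = s[t] / old_total if old_total else 1 / len(series_by_name)
        new = list(s)
        new[t] = override * share
        updated[name] = new
        impact[name] = new[t] - s[t]
    return updated, impact

# Hypothetical projections for two regions over two future periods.
projections = {"north": [10, 30], "south": [20, 10]}
updated, impact = apply_override(projections, t=1, override=80)
print(aggregate(updated), impact)
```

The `impact` mapping corresponds to what an impact analysis portion of the interface might display for each underlying series.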
Abstract:
Systems and methods are provided for adjusting a set of predicted future data points for a time series data set. A system can include a receiver for receiving the time series data set, one or more processors, and one or more non-transitory computer-readable storage media containing instructions. A count series forecasting engine, utilizing the one or more processors, generates a set of counts corresponding to discrete values of the time series data set. An optimal discrete probability distribution for the set of counts is selected, and a set of parameters is generated for that distribution. A statistical model is selected to generate a set of predicted future data points. The set of predicted future data points is then adjusted using the generated parameters to improve the accuracy of the predictions.
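The count generation and distribution selection steps can be sketched as follows. The candidate set (Poisson and geometric), the maximum-likelihood selection rule, and the function names are assumptions. The sketch stops before the adjustment step because the abstract does not describe how the parameters are applied to the predictions.

```python
import math
from collections import Counter

def value_counts(series):
    """Count how often each discrete value occurs in the series."""
    return Counter(series)

def poisson_loglik(counts, lam):
    """Log-likelihood of the counts under Poisson(lam)."""
    return sum(n * (k * math.log(lam) - lam - math.lgamma(k + 1))
               for k, n in counts.items())

def geometric_loglik(counts, p):
    """Log-likelihood under a geometric law on 0, 1, 2, ...: P(k) = p(1-p)^k."""
    return sum(n * (math.log(p) + k * math.log(1 - p))
               for k, n in counts.items())

def select_distribution(series):
    """Fit each candidate by maximum likelihood and keep the best fit.

    Requires a non-negative integer series with a positive mean.
    """
    counts = value_counts(series)
    mean = sum(k * n for k, n in counts.items()) / sum(counts.values())
    p = 1 / (1 + mean)  # maximum-likelihood estimate for the geometric law
    candidates = {
        "poisson": ({"lam": mean}, poisson_loglik(counts, mean)),
        "geometric": ({"p": p}, geometric_loglik(counts, p)),
    }
    name = max(candidates, key=lambda c: candidates[c][1])
    return name, candidates[name][0]

# Hypothetical count series, for example daily units sold of a slow-moving item.
print(select_distribution([2, 3, 2, 1, 2, 3, 2, 2, 1, 2]))
```

A fuller candidate set would likely include distributions such as the negative binomial or zero-inflated variants, and might compare fits with a penalized criterion rather than raw likelihood.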