Visualizing high cardinality categorical data

    Publication number: US09613106B2

    Publication date: 2017-04-04

    Application number: US14260684

    Filing date: 2014-04-24

    CPC classification number: G06F17/30554

    Abstract: A computer-program causing a computing device to perform an association measurement between a target variable and each non-target variable of a data set; select non-target variables for inclusion in a visualization based on the degree of association; perform correspondence analysis between target values of the target variable and non-target values of each selected non-target variable; order target value markers within a target row based on the degrees of closeness; order non-target value markers within each non-target row based on the degrees of closeness; determine a width of each target value marker based on a frequency of occurrence of its target value in the data set; determine a width of each non-target value marker based on a frequency of occurrence of its non-target value in the data set; and cause generation of the visualization with connection markers emanating from the target value markers and extending among the non-target value markers.
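The selection and sizing steps in this abstract can be sketched as follows. This is a minimal illustration, not the patented method: the abstract does not name the association measure, so Cramér's V is an assumption here, and the function names (`cramers_v`, `select_variables`, `marker_widths`) and the `total_width` parameter are hypothetical. The correspondence analysis and marker ordering steps are not shown.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Cramér's V between two categorical sequences (0 = none, 1 = perfect).

    Assumed stand-in for the abstract's unspecified "association measurement".
    """
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row = Counter(xs)
    col = Counter(ys)
    chi2 = 0.0
    for r, r_count in row.items():
        for c, c_count in col.items():
            expected = r_count * c_count / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row), len(col)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0


def select_variables(data, target, top_k):
    """Rank non-target columns by association with the target; keep the top_k."""
    scores = {
        name: cramers_v(data[target], values)
        for name, values in data.items()
        if name != target
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


def marker_widths(values, total_width=100.0):
    """Marker width proportional to each value's frequency of occurrence."""
    counts = Counter(values)
    n = len(values)
    return {value: total_width * count / n for value, count in counts.items()}
```

Here `data` is assumed to be a dict mapping column names to equal-length lists of category labels; a value that makes up half of the rows gets half of `total_width`.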

    133. COMPUTER-IMPLEMENTED SYSTEM FOR HIERARCHICAL UNCONSTRAINING IN DATA PROCESSES
    Type: Patent application
    Status: In force

    Publication number: US20170068484A1

    Publication date: 2017-03-09

    Application number: US15257545

    Filing date: 2016-09-06

    Abstract: Exemplary embodiments are generally directed to methods, mediums, and systems for correcting censored or constrained historical data with various possible types of computing devices, including cloud-based devices, personal computing devices, and edge-based devices. The corrected data may be used in forecasting, for example to forecast demand for a limited resource. In some embodiments, the data is modeled at a higher level of granularity than an individual record. The aggregated demand may then be pro-rated over a group of categories or users where a given category of users that might be small or nonexistent over a certain time frame may be better accommodated. Moreover, it may be easier or more efficient to make assumptions and employ computing resources at the aggregate level.
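The pro-rating step mentioned in this abstract can be illustrated with a short sketch. It assumes the aggregate demand has already been corrected (unconstrained) at the group level and that each category's share is taken from some historical weight; the function name `prorate` and the use of historical weights are assumptions, not details given in the abstract.

```python
def prorate(aggregate_demand, category_weights):
    """Split an aggregate demand figure across categories in proportion to weights.

    category_weights: dict mapping category -> non-negative weight
    (for example, historical bookings). A category with weight 0 receives 0.
    """
    total = sum(category_weights.values())
    if total <= 0:
        raise ValueError("at least one category needs a positive weight")
    return {
        category: aggregate_demand * weight / total
        for category, weight in category_weights.items()
    }
```

Modeling at the aggregate level and splitting afterwards means a category with few or no observations in a given time frame still receives a well-defined share, rather than needing its own model.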

    134. DYNAMIC PREDICTION AGGREGATION
    Type: Patent application
    Status: Under examination (published)

    Publication number: US20170061315A1

    Publication date: 2017-03-02

    Application number: US15146697

    Filing date: 2016-05-04

    CPC classification number: G06N7/005 G06F16/2462 H04L67/00

    Abstract: Disclosed are methods, systems, and computer program products useful for generating summary statistics for data predictions based on the aggregation of data from past time intervals. Summary statistics such as prediction standard errors, variances, confidence limits, and other statistical measures, may be generated in a way that preserves the basic distributional properties of the original data sets, to allow, for example, a reduction of the multiple data sets through the aggregation process, which may be useful for a prediction process, while determining statistical information for the predicted data.

    135. THREE-STAGE PREDICTOR FOR TIME SERIES
    Type: Patent application
    Status: In force

    Publication number: US20170061296A1

    Publication date: 2017-03-02

    Application number: US15233400

    Filing date: 2016-08-10

    CPC classification number: G06N5/04

    Abstract: Information related to a time series can be predicted. For example, a repetitive characteristic of the time series can be determined by analyzing the time series for a pattern that repeats over a predetermined time period. An adjusted time series can be generated by removing the repetitive characteristic from the time series. An effect of a moving event on the adjusted time series can be determined. The moving event can occur on different dates for two or more consecutive years. A residual time series can be generated by removing the effect of the moving event from the adjusted time series. A base forecast that is independent of the repetitive characteristic and the effect of the moving event can be generated using the residual time series. A predictive forecast can be generated by including the repetitive characteristic and the effect of the moving event into the base forecast.
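The three stages described in this abstract can be sketched as below. This is a deliberately simple illustration under stated assumptions: the repetitive characteristic is estimated as per-position means over a fixed `period`, the moving-event effect as a difference of means, and the base forecast as the mean of the residual series. The abstract does not specify any of these estimators, and the function and parameter names are hypothetical.

```python
from statistics import mean


def three_stage_forecast(series, period, event_steps, future_steps,
                         future_event_steps=()):
    """Forecast values at future_steps (time indices continuing the series).

    Assumes len(series) >= period so every seasonal position is observed.
    """
    events = set(event_steps)
    future_events = set(future_event_steps)

    # Stage 1: estimate and remove the repetitive (seasonal) characteristic.
    overall = mean(series)
    seasonal = [mean(series[i::period]) - overall for i in range(period)]
    adjusted = [y - seasonal[t % period] for t, y in enumerate(series)]

    # Stage 2: estimate and remove the effect of the moving event.
    on_event = [y for t, y in enumerate(adjusted) if t in events]
    off_event = [y for t, y in enumerate(adjusted) if t not in events]
    effect = mean(on_event) - mean(off_event) if on_event and off_event else 0.0
    residual = [y - effect if t in events else y for t, y in enumerate(adjusted)]

    # Stage 3: base forecast from the residual series, then add both
    # components back to obtain the predictive forecast.
    base = mean(residual)
    return [
        base + seasonal[t % period] + (effect if t in future_events else 0.0)
        for t in future_steps
    ]
```

Because the event steps are passed explicitly for both the history and the forecast horizon, an event that falls on different dates in consecutive years is handled by listing its index in each year.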

    136. TECHNIQUES TO PROVIDE PROCESSING ENHANCEMENTS FOR A TEXT EDITOR IN A COMPUTING ENVIRONMENT
    Type: Patent application
    Status: In force

    Publication number: US20170024359A1

    Publication date: 2017-01-26

    Application number: US15073998

    Filing date: 2016-03-18

    Abstract: Various embodiments include a system having interfaces, storage devices, memory, and processing circuitry. The system may include logic to render a portion of a first layer and a portion of a second layer for presentation, and to determine parameters of tokens for the second layer based on a result of the rendering of the second layer, the parameters to include at least one of token width values, token offset values, line height values, and line top values. The system is also to align the first layer and the second layer based on the parameters of the tokens for the second layer, and present the first layer and the second layer on a display, the first layer to present tokens and the second layer to receive events.

    137. Techniques for query homogenization in cache operations
    Type: Granted patent
    Status: In force

    Publication number: US09519679B2

    Publication date: 2016-12-13

    Application number: US14859420

    Filing date: 2015-09-21

    CPC classification number: G06F17/30392 G06F17/30457

    Abstract: An apparatus includes a renaming component to homogenize query instructions for retrieving data items from a data set organized using index labels by identifying a declaration instruction associating an object thereof with an index label, replacing the name provided to the object with an archetypal name based on the index label, and generating change data associating the name with the archetypal name; a hashing component to take an instruction hash of the homogenized instructions; a cache control routine to find a matching instruction hash corresponding to results of earlier database queries in a results cache; and a reversal routine to, in response to finding a matching instruction hash, retrieve a cached result from the results cache associated with the matching instruction hash, and replace a name of a different object therein based on the change data and the query instructions to generate a new result of the new database query.
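The rename, hash, look up, and reverse flow in this abstract can be sketched as follows. Everything concrete here is an assumption made for illustration: the toy `declare <name> on <index_label>` syntax, the `__label__` form of the archetypal name, the use of SHA-256, and the names `homogenize`, `instruction_hash`, and `ResultsCache`. The sketch also assumes results are plain strings that mention object names.

```python
import hashlib
import re

# Hypothetical declaration syntax: "declare <object name> on <index label>".
DECLARATION = re.compile(r"^declare (\w+) on (\w+)$")
WORD = re.compile(r"\w+")


def homogenize(instructions):
    """Replace user-chosen object names with archetypal names.

    Returns the homogenized instructions and the change data
    (user name -> archetypal name).
    """
    change = {}
    for line in instructions:
        match = DECLARATION.match(line)
        if match:
            name, label = match.groups()
            change[name] = f"__{label}__"
    homogenized = [
        WORD.sub(lambda m: change.get(m.group(0), m.group(0)), line)
        for line in instructions
    ]
    return homogenized, change


def instruction_hash(instructions):
    """Hash of the homogenized instructions, used as the cache key."""
    return hashlib.sha256("\n".join(instructions).encode("utf-8")).hexdigest()


class ResultsCache:
    def __init__(self):
        self._store = {}

    def run(self, instructions, execute):
        homogenized, change = homogenize(instructions)
        key = instruction_hash(homogenized)
        if key not in self._store:
            # Cache miss: run the query once, in its homogenized form.
            self._store[key] = execute(homogenized)
        # Reversal: map archetypal names back to this query's own names.
        reverse = {archetype: name for name, archetype in change.items()}
        return WORD.sub(
            lambda m: reverse.get(m.group(0), m.group(0)), self._store[key]
        )
```

Two queries that differ only in the name given to an object then hash to the same key, so the second is answered from the cache with its own name substituted back into the result.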

    138. ELECTRICAL TRANSFORMER FAILURE PREDICTION
    Type: Patent application
    Status: In force

    Publication number: US20160358106A1

    Publication date: 2016-12-08

    Application number: US15173927

    Filing date: 2016-06-06

    Abstract: A computing device predicts a probability of a transformer failure. An analysis type indicator defined by a user is received. A worth value for each of a plurality of variables is computed. Highest worth variables from the plurality of variables are selected based on the computed worth values. A number of variables of the highest worth variables is limited to a predetermined number based on the received analysis type indicator. A first model and a second model are also selected based on the received analysis type indicator. Historical electrical system data is partitioned into a training dataset and a validation dataset that are used to train and validate, respectively, the first model and the second model. A probability of failure model is selected as the first model or the second model based on a comparison between a fit of each model.

    139. NORMALIZING ELECTRONIC COMMUNICATIONS USING A NEURAL NETWORK
    Type: Patent application
    Status: In force

    Publication number: US20160350646A1

    Publication date: 2016-12-01

    Application number: US15175503

    Filing date: 2016-06-07

    CPC classification number: G06N3/0445 G06N3/0454 G06N3/0472

    Abstract: Electronic communications can be normalized using a neural network. For example, a noncanonical communication that includes multiple terms can be received. The noncanonical communication can be preprocessed by (I) generating a vector including multiple characters from a term of the multiple terms; and (II) repeating a substring of the term in the vector such that a last character of the substring is positioned in a last position in the vector. The vector can be transmitted to a neural network configured to receive the vector and generate multiple probabilities based on the vector. A normalized version of the noncanonical communication can be determined using one or more of the multiple probabilities generated by the neural network. Whether the normalized version of the noncanonical communication should be outputted can also be determined using at least one of the multiple probabilities generated by the neural network.
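One possible reading of the preprocessing step (I) and (II) is sketched below: a fixed-length character vector whose first half holds the start of the term and whose second half repeats the end of the term, right-aligned so that the term's last character lands in the vector's last position. The vector length, the padding character, and the half-and-half split are assumptions; the abstract fixes none of them, and the function name `encode_term` is hypothetical.

```python
def encode_term(term, length=10, pad="_"):
    """Build a fixed-length character vector for one term.

    The first half is the term's leading characters, left-aligned and padded.
    The second half repeats the term's trailing substring, right-aligned, so
    the last character of the substring sits in the last vector position.
    """
    if length % 2 != 0:
        raise ValueError("length must be even")
    half = length // 2
    head = list(term[:half].ljust(half, pad))
    tail = list(term[-half:].rjust(half, pad))
    return head + tail
```

For example, `encode_term("tomorrow")` yields the characters of `tomororrow`, and `encode_term("cat")` yields `cat____cat`. Anchoring both ends of the term at fixed positions gives a downstream network a stable place to look for prefixes and suffixes regardless of term length.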

    140. DISTRIBUTED CORRELATION AND ANALYSIS OF PATIENT THERAPY DATA
    Type: Patent application
    Status: Under examination (published)

    Publication number: US20160342742A1

    Publication date: 2016-11-24

    Application number: US15145222

    Filing date: 2016-05-03

    CPC classification number: G16H50/70 G16H10/60

    Abstract: An apparatus includes a processor and storage to store instructions that cause the processor to identify at least one correlation between a diagnosis group and a medication class for each patient of a first set of patients to derive a set of models for each diagnosis group that correlates the diagnosis group to at least one medication class based on the at least one identified correlation; and for each patient of a second set of patients, employ each model of each set of models to make at least one prediction of at least one diagnosis group as indicated in the corresponding diagnosis group record based on at least one medication class indicated in the corresponding medication class record, and compare the at least one prediction to the corresponding diagnosis group record to derive a tally of at least one of true positives or false positives for each prediction.
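The derive-then-tally flow in this abstract can be sketched as follows. The "model" here is a simple rule table from medication class to diagnosis groups, built from co-occurrence rates in the first patient set; that choice, the `min_rate` threshold, and the function names are all assumptions for illustration, since the abstract does not say how correlations are identified or what form the models take.

```python
from collections import Counter


def derive_model(patients, min_rate=0.5):
    """Derive medication class -> diagnosis groups rules from a first patient set.

    patients: iterable of (medication_classes, diagnosis_groups) set pairs.
    A rule is kept when at least min_rate of the patients on the medication
    class also carry the diagnosis group.
    """
    med_count = Counter()
    pair_count = Counter()
    for meds, diagnoses in patients:
        for med in meds:
            med_count[med] += 1
            for dx in diagnoses:
                pair_count[(med, dx)] += 1
    model = {}
    for (med, dx), count in pair_count.items():
        if count / med_count[med] >= min_rate:
            model.setdefault(med, set()).add(dx)
    return model


def tally_predictions(model, patients):
    """Apply the model to a second patient set and tally outcomes per diagnosis group."""
    true_pos, false_pos = Counter(), Counter()
    for meds, diagnoses in patients:
        predicted = set()
        for med in meds:
            predicted |= model.get(med, set())
        for dx in predicted:
            if dx in diagnoses:
                true_pos[dx] += 1   # prediction matches the diagnosis record
            else:
                false_pos[dx] += 1  # predicted, but absent from the record
    return true_pos, false_pos
```

The two patient sets are kept separate so the tallies measure how the derived rules perform on records that were not used to derive them.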
