Personalized summary generation of data visualizations

    Publication No.: US10909313B2

    Publication date: 2021-02-02

    Application No.: US15630462

    Application date: 2017-06-22

    Abstract: Various embodiments are generally directed to systems for summarizing data visualizations (i.e., images of data visualizations), such as a graph image, for instance. Some embodiments are particularly directed to a personalized graph summarizer that analyzes a data visualization, or image, to detect pre-defined patterns within the data visualization, and produces a textual summary of the data visualization based on the pre-defined patterns detected within the data visualization. In various embodiments, the personalized graph summarizer may include features to adapt to the preferences of a user for generating an automated, personalized computer-generated narrative. For instance, additional pre-defined patterns may be created for detection and/or the textual summary may be tailored based on user preferences. In some such instances, one or more of the user preferences may be automatically determined by the personalized graph summarizer without requiring the user to explicitly indicate them. Embodiments may integrate machine learning and computer vision concepts.
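
    Illustrative sketch (not taken from the patent): the abstract describes detecting pre-defined patterns and turning them into a textual summary. The Python below shows that idea on a list of y-values, assuming the data points have already been extracted from the graph image (the computer-vision step is out of scope here). The pattern names, the detector functions, and the `preferred` filter standing in for user preferences are all hypothetical.

```python
from typing import Callable, Dict, List, Optional, Set


def is_increasing(ys: List[float]) -> bool:
    """True when every value is strictly larger than the one before it."""
    return len(ys) > 1 and all(b > a for a, b in zip(ys, ys[1:]))


def has_single_peak(ys: List[float]) -> bool:
    """True when values rise strictly to one interior maximum, then fall strictly."""
    if len(ys) < 3:
        return False
    peak = ys.index(max(ys))
    rising = all(b > a for a, b in zip(ys[:peak + 1], ys[1:peak + 1]))
    falling = all(b < a for a, b in zip(ys[peak:], ys[peak + 1:]))
    return 0 < peak < len(ys) - 1 and rising and falling


# Hypothetical registry of pre-defined patterns: summary phrase -> detector.
PATTERNS: Dict[str, Callable[[List[float]], bool]] = {
    "a steady increase": is_increasing,
    "a single peak": has_single_peak,
}


def summarize(ys: List[float], preferred: Optional[Set[str]] = None) -> str:
    """Build a one-sentence summary from the patterns detected in ys.

    `preferred` is an assumed stand-in for user preferences: when given,
    only patterns named in it are reported.
    """
    found = [name for name, detect in PATTERNS.items() if detect(ys)]
    if preferred is not None:
        found = [name for name in found if name in preferred]
    if not found:
        return "No pre-defined pattern was detected."
    return "The graph shows " + " and ".join(found) + "."


example_summary = summarize([1.0, 3.0, 5.0, 2.0])
```

    New patterns could be supported by adding entries to the registry, which loosely mirrors the abstract's statement that additional pre-defined patterns may be created.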

    MACHINE LEARNING PREDICTIVE LABELING SYSTEM
    Patent application

    Publication No.: US20190034766A1

    Publication date: 2019-01-31

    Application No.: US16108293

    Application date: 2018-08-22

    Abstract: A computing device automatically classifies an observation vector. (a) A converged classification matrix is computed that defines a label probability for each observation vector. (b) The value of the target variable associated with a maximum label probability value is selected for each observation vector. Each observation vector is assigned to a cluster. A distance value is computed between observation vectors assigned to the same cluster. An average distance value is computed for each observation vector. A predefined number of observation vectors are selected that have minimum values for the average distance value. The supervised data is updated to include the selected observation vectors with the value of the target variable selected in (b). The selected observation vectors are removed from the unlabeled subset. (a) and (b) are repeated. The value of the target variable for each observation vector is output to a labeled dataset.
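
    Illustrative sketch (not taken from the patent): two steps of the abstract can be shown in a few lines of Python, namely picking the label with the maximum probability for each observation and selecting the observations with the smallest average distance to the other members of their cluster. Euclidean distance, the treatment of single-member clusters, and all function names are assumptions; computing the converged classification matrix and the clustering itself are not shown.

```python
import math
from typing import List, Sequence


def pick_labels(prob_rows: Sequence[Sequence[float]], labels: Sequence[str]) -> List[str]:
    """For each row of label probabilities, return the label with the maximum value."""
    return [labels[max(range(len(row)), key=row.__getitem__)] for row in prob_rows]


def average_cluster_distance(
    vectors: Sequence[Sequence[float]], clusters: Sequence[int]
) -> List[float]:
    """Average Euclidean distance from each vector to the others in its cluster.

    Assumption: a vector alone in its cluster gets math.inf, so it is never
    preferred by the selection step below.
    """
    averages = []
    for i, (v, c) in enumerate(zip(vectors, clusters)):
        dists = [
            math.dist(v, w)
            for j, (w, d) in enumerate(zip(vectors, clusters))
            if d == c and j != i
        ]
        averages.append(sum(dists) / len(dists) if dists else math.inf)
    return averages


def select_most_central(
    vectors: Sequence[Sequence[float]], clusters: Sequence[int], count: int
) -> List[int]:
    """Indices of the `count` vectors with the smallest average distance."""
    averages = average_cluster_distance(vectors, clusters)
    order = sorted(range(len(vectors)), key=lambda i: averages[i])
    return order[:count]


# Toy data: three points in cluster 0, two points in cluster 1.
points = [(0.0, 0.0), (0.0, 1.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
assignments = [0, 0, 0, 1, 1]
chosen = select_most_central(points, assignments, count=2)
```

    In the loop the abstract describes, the chosen observations would then be moved from the unlabeled subset into the supervised data with their selected labels before the classification is recomputed.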

    Word Embeddings and Virtual Terms

    Publication No.: US20210027024A1

    Publication date: 2021-01-28

    Application No.: US17060198

    Application date: 2020-10-01

    Abstract: A computing system receives a collection comprising multiple sets of ordered terms, including a first set. The system generates a dataset indicating an association between each pair of terms within a same set of the collection by generating co-occurrence score(s) for the first set. The system generates computed probabilities based on the co-occurrence score(s) for the first set. The computed probabilities indicate a likelihood that one term in a given pair of terms of the collection appears in a given set of the collection given that another term in the given pair of terms of the collection occurs. The system smoothes the computed probabilities by adding one or more random observations. The system generates one or more association indications for the first set based on the smoothed computed probabilities. The system outputs an indication of the dataset. Additionally, or alternatively, based on association measure(s), the system generates a virtual term.
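
    Illustrative sketch (not taken from the patent): the Python below counts how often pairs of terms occur in the same set and turns the counts into a conditional probability, i.e. the likelihood that one term appears in a set given that the other occurs. The abstract smooths the probabilities by adding random observations; this sketch swaps in deterministic additive (Laplace-style) pseudo-counts instead, and it ignores term order within a set. All function and parameter names are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import Iterable, Sequence, Tuple


def cooccurrence_counts(collection: Iterable[Sequence[str]]) -> Tuple[Counter, Counter]:
    """Count, per term and per unordered term pair, the sets they occur in."""
    term_counts: Counter = Counter()
    pair_counts: Counter = Counter()
    for terms in collection:
        unique = sorted(set(terms))  # each term counts once per set
        term_counts.update(unique)
        pair_counts.update(combinations(unique, 2))
    return term_counts, pair_counts


def conditional_probability(
    a: str,
    b: str,
    term_counts: Counter,
    pair_counts: Counter,
    pseudo: float = 0.0,
) -> float:
    """Estimate P(a in set | b in set).

    `pseudo` adds that many imagined sets containing b with a and the same
    number without a (additive smoothing, an assumed substitute for the
    random observations mentioned in the abstract).
    """
    pair = tuple(sorted((a, b)))
    joint = pair_counts[pair] + pseudo
    given = term_counts[b] + 2 * pseudo
    return joint / given if given else 0.0


collection = [["a", "b", "c"], ["a", "b"], ["b", "c"]]
term_counts, pair_counts = cooccurrence_counts(collection)
p_a_given_b = conditional_probability("a", "b", term_counts, pair_counts)
p_b_given_a_smoothed = conditional_probability("b", "a", term_counts, pair_counts, pseudo=1.0)
```

    Smoothing pulls extreme estimates toward 0.5, which keeps a pair seen together in only a handful of sets from being treated as a certain association.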

    Computer-based visualization of machine-learning models and behavior

    Publication No.: US10762390B2

    Publication date: 2020-09-01

    Application No.: US15952833

    Application date: 2018-04-13

    Abstract: Machine-learning models and behavior can be visualized. For example, a machine-learning model can be taught using a teaching dataset. A test input can then be provided to the machine-learning model to determine a baseline confidence-score of the machine-learning model. Next, weights for elements in the teaching dataset can be determined. An analysis dataset can be generated that includes a subset of the elements that have corresponding weights above a predefined threshold. For each overlapping element in both the analysis dataset and the test input, (i) a modified version of the test input can be generated that excludes the overlapping element, and (ii) the modified version of the test input can be provided to the machine-learning model to determine an effect of the overlapping element on the baseline confidence-score. A graphical user interface can be generated that visually depicts the test input and various elements' effects on the baseline confidence-score.
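
    Illustrative sketch (not taken from the patent): the per-element analysis in the abstract resembles a leave-one-out test, which the Python below demonstrates. The toy confidence function, the element weights, and the threshold value are hypothetical placeholders; a real system would call a trained model and derive the weights from the teaching dataset.

```python
from typing import Callable, Dict, List, Sequence, Set, Tuple


def build_analysis_set(weights: Dict[str, float], threshold: float) -> Set[str]:
    """Keep the elements whose weight is above the predefined threshold."""
    return {element for element, weight in weights.items() if weight > threshold}


def element_effects(
    test_input: Sequence[str],
    analysis_set: Set[str],
    confidence: Callable[[Sequence[str]], float],
) -> Tuple[float, Dict[str, float]]:
    """Measure how removing each overlapping element changes the confidence.

    Returns the baseline confidence and, per overlapping element, the value
    baseline minus confidence-without-the-element (positive means the
    element raised the score).
    """
    baseline = confidence(test_input)
    effects: Dict[str, float] = {}
    for element in test_input:
        if element in analysis_set and element not in effects:
            modified = [e for e in test_input if e != element]
            effects[element] = baseline - confidence(modified)
    return baseline, effects


def toy_confidence(tokens: Sequence[str]) -> float:
    """Hypothetical stand-in for a model: 0.5 plus per-token scores, clamped to [0, 1]."""
    scores = {"great": 0.3, "boring": -0.2}
    raw = 0.5 + sum(scores.get(t, 0.0) for t in tokens)
    return min(1.0, max(0.0, raw))


# Hypothetical element weights and threshold.
weights = {"great": 0.9, "boring": 0.7, "film": 0.1}
analysis = build_analysis_set(weights, threshold=0.5)
baseline, effects = element_effects(
    ["a", "great", "but", "boring", "film"], analysis, toy_confidence
)
```

    The resulting `effects` mapping is the kind of data a graphical user interface could render, for example by coloring each element of the test input according to the sign and size of its effect.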

    Rule development for natural language processing of text
    Granted patent (status: in force)

    Publication No.: US09460071B2

    Publication date: 2016-10-04

    Application No.: US14692333

    Application date: 2015-04-21

    CPC classification number: G06F17/241 G06F17/30734

    Abstract: In a computing device that defines a rule for natural language processing of text, annotated text is selected from a first document of a plurality of annotated documents. An entity rule type is selected from a plurality of entity rule types. An argument of the selected entity rule type is identified. A value for the identified argument is randomly selected based on the selected annotated text to generate a rule instance. The generated rule instance is applied to remaining documents of the plurality of annotated documents. A rule performance measure is computed based on application of the generated rule instance. The generated rule instance and the computed rule performance measure are stored for application to other documents.
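
    Illustrative sketch (not taken from the patent): the generate-and-score loop in the abstract can be shown in Python with a single hypothetical rule type, "document contains token", whose one argument is filled by randomly choosing a token from the annotated text. Using F1 as the rule performance measure, representing each remaining document as a (text, is_annotated) pair, and all names are assumptions.

```python
import random
from typing import List, Sequence, Tuple

# (document text, whether the document carries the target annotation)
Document = Tuple[str, bool]


def generate_rule_instance(annotated_text: str, rng: random.Random) -> str:
    """Randomly pick the rule's argument value (one token) from the annotated text."""
    return rng.choice(annotated_text.split())


def apply_rule(token: str, text: str) -> bool:
    """Hypothetical rule type: the rule fires when the token occurs in the text."""
    return token in text.split()


def rule_f1(token: str, documents: Sequence[Document]) -> float:
    """Score the rule instance on the remaining annotated documents with F1."""
    tp = fp = fn = 0
    for text, annotated in documents:
        predicted = apply_rule(token, text)
        if predicted and annotated:
            tp += 1
        elif predicted:
            fp += 1
        elif annotated:
            fn += 1
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


remaining: List[Document] = [
    ("please issue a refund", True),
    ("refund policy page", False),
    ("thanks for the help", False),
    ("I want my money back", True),
]

rng = random.Random(0)  # seeded so repeated runs pick the same token
rule_token = generate_rule_instance("refund", rng)
stored = (rule_token, rule_f1(rule_token, remaining))
```

    Repeating the generation step many times and keeping the `stored` pairs would give a pool of scored rule instances from which the better-performing ones could be applied to other documents.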

    Word embeddings and virtual terms

    Publication No.: US11048884B2

    Publication date: 2021-06-29

    Application No.: US17060198

    Application date: 2020-10-01

    Abstract: A computing system receives a collection comprising multiple sets of ordered terms, including a first set. The system generates a dataset indicating an association between each pair of terms within a same set of the collection by generating co-occurrence score(s) for the first set. The system generates computed probabilities based on the co-occurrence score(s) for the first set. The computed probabilities indicate a likelihood that one term in a given pair of terms of the collection appears in a given set of the collection given that another term in the given pair of terms of the collection occurs. The system smoothes the computed probabilities by adding one or more random observations. The system generates one or more association indications for the first set based on the smoothed computed probabilities. The system outputs an indication of the dataset. Additionally, or alternatively, based on association measure(s), the system generates a virtual term.
