Blockwise extraction of document metadata

    Publication No.: GB2583290B

    Publication Date: 2022-03-16

    Application No.: GB202009894

    Filing Date: 2018-11-23

    Applicant: IBM

    Abstract: Methods, computer program products, and systems are presented. The methods include, for instance: obtaining a document image, wherein the document image includes a plurality of objects; identifying a plurality of macroblocks within the document image; performing microblock processing within macroblocks of the plurality of macroblocks, wherein the microblock processing includes examining content of microblocks within a macroblock for extraction of key-value pairs, the examining content including performing an ontological analysis of microblocks, wherein the microblock processing includes associating confidence levels to the extracted key-value pairs; and outputting metadata based on the performing microblock processing within macroblocks of the plurality of macroblocks.

    Automated resolution of over and under-specification in a knowledge graph

    Publication No.: GB2596729A

    Publication Date: 2022-01-05

    Application No.: GB202114479

    Filing Date: 2020-05-15

    Applicant: IBM

    Abstract: Systems and methods for automated resolution of over-specification and under-specification in a knowledge graph are disclosed. In embodiments, a method includes: determining, by a computing device, that a size of an object cluster of a knowledge graph meets a threshold value indicating under-specification of a knowledge base of the knowledge graph; determining, by the computing device, sub-classes for objects of the knowledge graph; re-initializing, by the computing device, the knowledge graph based on the sub-classes to generate a refined knowledge graph, wherein the size of the object cluster is reduced in the refined knowledge graph; and generating, by the computing device, an output based on information determined from the refined knowledge graph.
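    The threshold check and re-initialization described in this abstract can be pictured with a minimal sketch. It is an illustration under stated assumptions, not the patented method: the object records, the `cls` and `subclass` fields, and the value of `CLUSTER_SIZE_THRESHOLD` are all hypothetical, and the sub-classes are supplied up front rather than determined automatically.

```python
from collections import defaultdict

# Hypothetical threshold: a cluster this large is treated as under-specified.
CLUSTER_SIZE_THRESHOLD = 3


def cluster_by(objects, key):
    """Group object names into clusters keyed by the given attribute."""
    clusters = defaultdict(list)
    for obj in objects:
        clusters[obj[key]].append(obj["name"])
    return dict(clusters)


def refine(objects):
    """Re-cluster by sub-class if any class-level cluster meets the threshold."""
    clusters = cluster_by(objects, "cls")
    if any(len(m) >= CLUSTER_SIZE_THRESHOLD for m in clusters.values()):
        return cluster_by(objects, "subclass")
    return clusters


# Hypothetical objects; the sub-classes are given here, not inferred.
objects = [
    {"name": "sparrow", "cls": "animal", "subclass": "bird"},
    {"name": "eagle", "cls": "animal", "subclass": "bird"},
    {"name": "salmon", "cls": "animal", "subclass": "fish"},
]
refined = refine(objects)
```

    In this toy data the single "animal" cluster meets the threshold, so the refined grouping splits it into smaller "bird" and "fish" clusters.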

    Phonetic patterns for fuzzy matching in natural language processing

    Publication No.: GB2585492B

    Publication Date: 2021-03-17

    Application No.: GB202008024

    Filing Date: 2018-10-31

    Applicant: IBM

    Abstract: A token is extracted from a natural language input. A phonetic pattern is computed corresponding to the token, the phonetic pattern including a sound pattern that represents a part of the token when the token is spoken. New data is created from data of the phonetic pattern, the new data including a syllable sequence corresponding to the phonetic pattern. A state of a data storage device is changed by storing the new data in a matrix of syllable sequences corresponding to the token. An option is selected that corresponds to the token by executing a fuzzy matching algorithm using a processor and a memory, wherein the selecting of the option is based on a syllable sequence in the matrix.
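    As a rough illustration of the pipeline in this abstract (token, phonetic pattern, syllable sequence, fuzzy match), the sketch below uses a few hand-written spelling-to-sound rewrites and Python's standard `difflib.SequenceMatcher`. The rewrite rules, the syllable splitter, and the function names are assumptions made for the example; the patent does not specify them.

```python
import re
from difflib import SequenceMatcher


def phonetic_pattern(token):
    """Crude sound pattern: lowercase plus a few spelling-to-sound rewrites."""
    s = token.lower()
    for src, dst in (("ph", "f"), ("ck", "k"), ("c", "k"), ("x", "ks"), ("z", "s")):
        s = s.replace(src, dst)
    return s


def syllables(pattern):
    """Split into consonant*-vowel+ chunks; trailing consonants join the last chunk."""
    chunks = re.findall(r"[^aeiou]*[aeiou]+", pattern)
    tail = re.sub(r"^(?:[^aeiou]*[aeiou]+)*", "", pattern)
    if chunks and tail:
        chunks[-1] += tail
    return chunks or [pattern]


def best_option(token, options):
    """Pick the option whose syllable sequence is most similar to the token's."""
    target = syllables(phonetic_pattern(token))
    # Stand-in for the "matrix" of syllable sequences: one row per option.
    matrix = {opt: syllables(phonetic_pattern(opt)) for opt in options}
    return max(matrix, key=lambda opt: SequenceMatcher(None, target, matrix[opt]).ratio())


choice = best_option("fonix", ["phonics", "tonic", "photos"])
```

    Here the misspelled token "fonix" and the option "phonics" reduce to the same syllable sequence, so "phonics" is selected even though the spellings differ.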

    Blockwise extraction of document metadata

    Publication No.: GB2583290A

    Publication Date: 2020-10-21

    Application No.: GB202009894

    Filing Date: 2018-11-23

    Applicant: IBM

    Abstract: Methods, computer program products, and systems are presented. The methods include, for instance: obtaining a document image, wherein the document image includes a plurality of objects; identifying a plurality of macroblocks within the document image; performing microblock processing within macroblocks of the plurality of macroblocks, wherein the microblock processing includes examining content of microblocks within a macroblock for extraction of key-value pairs, the examining content including performing an ontological analysis of microblocks, wherein the microblock processing includes associating confidence levels to the extracted key-value pairs; and outputting metadata based on the performing microblock processing within macroblocks of the plurality of macroblocks.

    Dynamically determining a region

    Publication No.: GB2590257A

    Publication Date: 2021-06-23

    Application No.: GB202100856

    Filing Date: 2019-06-26

    Applicant: IBM

    Abstract: A computer-implemented method includes monitoring, by a computing device, sensor data during gameplay of a sporting event; determining, by the computing device, predictive factors associated with a target based on the monitoring the sensor data; determining, by the computing device, a real-time region of effectiveness for the target based on the predictive factors and training data identifying historical effectiveness of the target; and outputting, by the computing device, the real-time region of effectiveness for displaying the real-time region of effectiveness around the target.

    Cognitive document image digitization

    Publication No.: GB2582722A

    Publication Date: 2020-09-30

    Application No.: GB202009558

    Filing Date: 2018-11-23

    Applicant: IBM

    Abstract: Methods, computer program products, and systems are presented. The methods include, for instance, obtaining a document image with objects and identifying microblocks corresponding to each object. A position of a microblock is analyzed for collinearity with another microblock based on respective positional characteristics and adjustable collinearity parameters. Collinear microblocks are identified into a macroblock, and computational data of a key-value pair is created from the macroblock. A heuristic confidence level is associated with the key-value pair. Also, based on data cluster formation, a table may be classified and data extracted.
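    The collinearity test and macroblock grouping described here can be sketched as follows. This is a simplified illustration only: the `Microblock` fields, the vertical-centre test, the `y_tolerance` parameter, and the rule that the leftmost block is the key are assumptions chosen for the example, and the heuristic confidence level is left out.

```python
from dataclasses import dataclass


@dataclass
class Microblock:
    text: str
    x: float       # left edge
    y: float       # top edge
    height: float


def collinear(a, b, y_tolerance=0.5):
    """Treat two blocks as on one line if their vertical centres are close.

    y_tolerance is an adjustable fraction of the smaller block height.
    """
    centre_a = a.y + a.height / 2
    centre_b = b.y + b.height / 2
    return abs(centre_a - centre_b) <= y_tolerance * min(a.height, b.height)


def group_macroblocks(blocks, y_tolerance=0.5):
    """Greedily collect collinear microblocks into macroblocks, left to right."""
    macroblocks = []
    for block in sorted(blocks, key=lambda b: (b.y, b.x)):
        for macro in macroblocks:
            if collinear(macro[0], block, y_tolerance):
                macro.append(block)
                break
        else:
            macroblocks.append([block])
    for macro in macroblocks:
        macro.sort(key=lambda b: b.x)
    return macroblocks


def to_key_value(macro):
    """Assume the leftmost block is the key and the rest form the value."""
    key = macro[0].text.rstrip(":").strip()
    value = " ".join(b.text for b in macro[1:])
    return key, value


blocks = [
    Microblock("Invoice No:", 10, 100, 12),
    Microblock("12345", 120, 101, 12),
    Microblock("Date:", 10, 130, 12),
    Microblock("2018-11-23", 120, 129, 12),
]
pairs = [to_key_value(m) for m in group_macroblocks(blocks) if len(m) >= 2]
```

    Raising `y_tolerance` makes the grouping more forgiving of skewed scans, at the cost of merging lines that sit close together.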

    Semantic normalization in document digitization

    Publication No.: GB2581461A

    Publication Date: 2020-08-19

    Application No.: GB202009248

    Filing Date: 2018-11-30

    Applicant: IBM

    Abstract: A method for normalizing a key in a document image includes identifying a candidate key corresponding to an object in a document image with a key in key ontology data, on the basis that the candidate key is semantically interchangeable with the key. A context, position, and style of each object of the document image are represented in the document metadata. The candidate key is normalized into a normal form. A key class corresponding to the normal form is determined, and a confidence score indicating a likelihood of the key class being representative of the candidate key is assessed. A semantic database is updated with the key class upon verification for enhanced processing of future documents.
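    A minimal sketch of the normalize-then-classify flow in this abstract is shown below. The ontology contents, the normalization rules, and the use of string similarity from the standard `difflib` module as a confidence score are all assumptions for illustration; the abstract speaks of semantic interchangeability, which plain string similarity only approximates.

```python
import re
from difflib import SequenceMatcher

# Hypothetical key ontology: key class -> interchangeable normal forms.
KEY_ONTOLOGY = {
    "invoice_number": ["invoice number", "invoice no", "inv no"],
    "due_date": ["due date", "payment due"],
}


def normal_form(candidate):
    """Lowercase, drop punctuation, and collapse whitespace."""
    text = re.sub(r"[^a-z0-9 ]+", " ", candidate.lower())
    return " ".join(text.split())


def classify_key(candidate):
    """Return the best-matching key class and a similarity-based confidence."""
    form = normal_form(candidate)
    best_class, best_score = None, 0.0
    for key_class, aliases in KEY_ONTOLOGY.items():
        for alias in aliases:
            score = SequenceMatcher(None, form, alias).ratio()
            if score > best_score:
                best_class, best_score = key_class, score
    return best_class, best_score


result = classify_key("Invoice No.:")
```

    A verified result could then be appended to the ontology as a new alias, which corresponds loosely to the database update step the abstract mentions.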

    Cognitively controlling data delivery

    Publication No.: GB2601087A

    Publication Date: 2022-05-18

    Application No.: GB202202265

    Filing Date: 2020-07-10

    Applicant: IBM

    Abstract: An approach is provided for cognitive control of channel bandwidth. Devices connected to access point(s) of a network are detected. Locations of the devices are detected. Based on (i) the devices being connected to the access point(s) and (ii) the locations of the devices, a gathering of people is detected as a group of users who are operating the detected devices at a current time within a geographical area that includes the locations of the devices. Data access patterns of the devices are detected. Based on the detected data access patterns and the gathering of people being detected, a quality of service class identifier (QCI) is updated from a normal setting to a new setting to satisfy bandwidth requirements of the devices.

    Cognitive document image digitization

    Publication No.: GB2582722B

    Publication Date: 2021-03-03

    Application No.: GB202009558

    Filing Date: 2018-11-23

    Applicant: IBM

    Abstract: Methods, computer program products, and systems are presented. The methods include, for instance, obtaining a document image with objects and identifying microblocks corresponding to each object. A position of a microblock is analyzed for collinearity with another microblock based on respective positional characteristics and adjustable collinearity parameters. Collinear microblocks are identified into a macroblock, and computational data of a key-value pair is created from the macroblock. A heuristic confidence level is associated with the key-value pair. Also, based on data cluster formation, a table may be classified and data extracted.

    Phonetic patterns for fuzzy matching in natural language processing

    Publication No.: GB2585492A

    Publication Date: 2021-01-13

    Application No.: GB202008024

    Filing Date: 2018-10-31

    Applicant: IBM

    Abstract: A token is extracted from a natural language input. A phonetic pattern is computed corresponding to the token, the phonetic pattern including a sound pattern that represents a part of the token when the token is spoken. New data is created from data of the phonetic pattern, the new data including a syllable sequence corresponding to the phonetic pattern. A state of a data storage device is changed by storing the new data in a matrix of syllable sequences corresponding to the token. An option is selected that corresponds to the token by executing a fuzzy matching algorithm using a processor and a memory, wherein the selecting of the option is based on a syllable sequence in the matrix.
