Invention Grant
- Patent Title: Identification of co-located artifacts in cognitively analyzed corpora
- Application No.: US16041091
- Application Date: 2018-07-20
- Publication No.: US10971273B2
- Publication Date: 2021-04-06
- Inventor: Brendan Bull, Paul Lewis Felt, Andrew Hicks
- Applicant: International Business Machines Corporation
- Applicant Address: US NY Armonk
- Assignee: International Business Machines Corporation
- Current Assignee: International Business Machines Corporation
- Current Assignee Address: US NY Armonk
- Agency: Patterson + Sheridan, LLP
- Main IPC: G06F17/28
- IPC: G06F17/28; G16H70/20; G06N20/00

Abstract:
Techniques for cognitive corpora analysis are provided. Vector representations are generated by processing documents in a corpus using a passage encoder. One or more concepts are identified in the documents by processing the documents with the passage encoder, where the concepts are assigned respective importance scores by the passage encoder. Further, a selection of a first document is received, and a sub-corpus of documents is generated by computing a similarity measure between the vector representation of the first document and the vector representation of at least one other document in the corpus. An overall importance score is generated for a first concept, with respect to the generated sub-corpus, by identifying a respective importance score of the first concept in at least two respective documents in the sub-corpus, and aggregating the respective importance scores. Finally, an indication of the generated overall importance score is provided.
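The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract does not name the similarity measure, the sub-corpus selection rule, or the aggregation function, so cosine similarity, a fixed threshold of 0.8, and an arithmetic mean are assumptions made here for illustration. The passage encoder is not modeled: its outputs (document vectors and per-document concept importance scores) are supplied as hypothetical toy data, and all function and variable names are invented for this example.

```python
import math
from statistics import mean


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_sub_corpus(vectors, selected_id, threshold=0.8):
    """Return IDs of the selected document and of every document whose
    vector is at least `threshold` similar to it (threshold is an assumption)."""
    selected = vectors[selected_id]
    return [
        doc_id
        for doc_id, vec in vectors.items()
        if doc_id == selected_id or cosine(selected, vec) >= threshold
    ]


def overall_importance(concept, sub_corpus, concept_scores):
    """Aggregate a concept's per-document importance scores over the
    sub-corpus. The mean is an assumed aggregation. Returns None when the
    concept appears in fewer than two documents of the sub-corpus."""
    scores = [
        concept_scores[doc_id][concept]
        for doc_id in sub_corpus
        if concept in concept_scores.get(doc_id, {})
    ]
    if len(scores) < 2:
        return None
    return mean(scores)


# Hypothetical encoder outputs: one vector and one concept-score map per document.
vectors = {
    "doc_a": [1.0, 0.0],
    "doc_b": [0.9, 0.1],
    "doc_c": [0.0, 1.0],
}
concept_scores = {
    "doc_a": {"diabetes": 0.75, "insulin": 0.5},
    "doc_b": {"diabetes": 0.5},
    "doc_c": {"diabetes": 0.25},
}

sub_corpus = build_sub_corpus(vectors, "doc_a")
diabetes_score = overall_importance("diabetes", sub_corpus, concept_scores)
insulin_score = overall_importance("insulin", sub_corpus, concept_scores)
print(sub_corpus, diabetes_score, insulin_score)
```

In this toy data, doc_c is orthogonal to the selected document and falls outside the sub-corpus, so its score for "diabetes" does not contribute to the aggregate. The "insulin" concept appears in only one sub-corpus document, so no overall score is produced for it.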
Public/Granted literature
- US20200027566A1 IDENTIFICATION OF CO-LOCATED ARTIFACTS IN COGNITIVELY ANALYZED CORPORA, Public/Granted day: 2020-01-23