Identification of co-located artifacts in cognitively analyzed corpora
Abstract:
Techniques for cognitive corpora analysis are provided. Vector representations are generated by processing documents in a corpus using a passage encoder. The passage encoder also identifies one or more concepts in the documents and assigns each concept a respective importance score. A selection of a first document is then received, and a sub-corpus of documents is generated by computing a similarity measure between the vector representation of the first document and the vector representation of at least one other document in the corpus. An overall importance score is generated for a first concept, with respect to the generated sub-corpus, by identifying the respective importance score of the first concept in each of at least two documents in the sub-corpus and aggregating those scores. Finally, an indication of the overall importance score is provided.
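The sub-corpus and aggregation steps described in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation: the abstract does not name the similarity measure, the selection rule, or the aggregation function, so cosine similarity, a fixed threshold of 0.8, and the arithmetic mean are all assumptions made here for illustration. The passage encoder is likewise not modeled; its outputs (document vectors and per-document concept scores) are supplied as hypothetical hand-written values, and the function names are invented for this example.

```python
import math
from statistics import mean


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_sub_corpus(vectors, selected_id, threshold=0.8):
    """Return ids of documents whose vector is similar enough to the selected one.

    The threshold rule is an assumption; the selected document is kept in the
    sub-corpus because its similarity to itself is 1.0.
    """
    query = vectors[selected_id]
    return [doc_id for doc_id, vec in vectors.items()
            if cosine(query, vec) >= threshold]


def overall_importance(concept, sub_corpus, concept_scores):
    """Aggregate a concept's per-document scores over the sub-corpus.

    Uses the mean as the aggregation (an assumption) and returns None when the
    concept appears in fewer than two sub-corpus documents.
    """
    scores = [concept_scores[doc_id][concept]
              for doc_id in sub_corpus
              if concept in concept_scores.get(doc_id, {})]
    if len(scores) < 2:
        return None
    return mean(scores)


# Hypothetical encoder outputs: one vector and one concept-score map per document.
vectors = {
    "doc_a": [1.0, 0.0],
    "doc_b": [0.9, 0.1],
    "doc_c": [0.0, 1.0],
}
concept_scores = {
    "doc_a": {"hypertension": 0.75, "aspirin": 0.25},
    "doc_b": {"hypertension": 0.25},
    "doc_c": {"hypertension": 0.5},
}

sub_corpus = build_sub_corpus(vectors, "doc_a")
overall = overall_importance("hypertension", sub_corpus, concept_scores)
print(sub_corpus, overall)
```

With these toy values, doc_c is orthogonal to the selected document and is excluded, so the overall score for "hypertension" is computed only from doc_a and doc_b. Other aggregations (sum, maximum, similarity-weighted mean) would fit the abstract's wording equally well.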