Identifying similar documents using graphs

Invention Grant

US10025783B2 Identifying similar documents using graphs (Active)

Patent Title: Identifying similar documents using graphs
Application No.: US14610261

Application Date: 2015-01-30
Publication No.: US10025783B2

Publication Date: 2018-07-17
Inventor: Rakesh Agrawal, Sreenivas Gollapudi, Anitha Kannan, Krishnaram Kenthapadi, Nathaniel Dion Parrish
Applicant: Microsoft Technology Licensing, LLC
Applicant Address: US WA Redmond
Assignee: Microsoft Technology Licensing, LLC
Current Assignee: Microsoft Technology Licensing, LLC
Current Assignee Address: US WA Redmond
Agent: Jonathan M. Waldman
Main IPC: G06F17/30
IPC: G06F17/30

Abstract:

While a document, such as an e-book, is read by a user on a computing device such as an e-reader, concept phrases are extracted from the document. The extracted concept phrases may be words or phrases that match known concept phrases such as headings. Based on a universal concept phrase graph that includes nodes for each known concept phrase, core concept phrases are determined for the document. These core concept phrases are associated with nodes of the universal concept phrase graph that are located within a predetermined distance of nodes that represent the concept phrases extracted from the document. Each core concept phrase is combined with one or more of the concept phrases to generate multiple queries. These queries are submitted to search engines, and indicators of documents from the corresponding search results are presented to the user with the original document that is being read.
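The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the tiny graph, the substring-based phrase matching, the function names (extract_concept_phrases, nodes_within, build_queries) and the rule of treating every node within the distance limit as a core concept phrase are all assumptions made for illustration. Submitting the queries to a search engine is left out.

```python
from collections import deque

# Hypothetical "universal concept phrase graph": phrase -> neighboring phrases.
GRAPH = {
    "photosynthesis": ["chlorophyll", "plant biology"],
    "chlorophyll": ["photosynthesis", "plant biology"],
    "plant biology": ["photosynthesis", "chlorophyll", "biology"],
    "biology": ["plant biology", "genetics"],
    "genetics": ["biology"],
}


def extract_concept_phrases(text, known_phrases):
    """Return the known phrases that occur in the text (naive substring match)."""
    lowered = text.lower()
    return [phrase for phrase in known_phrases if phrase in lowered]


def nodes_within(graph, sources, max_dist):
    """Multi-source breadth-first search.

    Returns {node: distance} for every node at most max_dist edges away
    from any of the source nodes.
    """
    dist = {s: 0 for s in sources if s in graph}
    queue = deque(dist)
    while queue:
        node = queue.popleft()
        if dist[node] == max_dist:
            continue  # do not expand past the distance limit
        for neighbor in graph.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def build_queries(core_phrases, extracted_phrases):
    """Pair each core phrase with each different extracted phrase."""
    return [
        f"{core} {phrase}"
        for core in core_phrases
        for phrase in extracted_phrases
        if core != phrase
    ]


text = "Photosynthesis depends on chlorophyll in the leaves."
extracted = extract_concept_phrases(text, GRAPH)
# Assumption: every node within one edge of an extracted phrase is "core".
core = sorted(nodes_within(GRAPH, extracted, max_dist=1))
queries = build_queries(core, extracted)
```

In this sketch "plant biology" becomes a core concept phrase because it is one edge away from both extracted phrases, while "biology" and "genetics" fall outside the distance limit. A real system would likely rank or filter candidate nodes rather than keep all of them.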

Public/Granted literature

US20160224547A1 IDENTIFYING SIMILAR DOCUMENTS USING GRAPHS Public/Granted day: 2016-08-04