Identifying similar documents using graphs
Abstract:
While a document, such as an e-book, is read by a user on a computing device such as an e-reader, concept phrases are extracted from the document. The extracted concept phrases may be words or phrases that match known concept phrases such as headings. Based on a universal concept phrase graph that includes nodes for each known concept phrase, core concept phrases are determined for the document. These core concept phrases are associated with nodes of the universal concept phrase graph that are located within a predetermined distance of nodes that represent the concept phrases extracted from the document. Each core concept phrase is combined with one or more of the concept phrases to generate multiple queries. These queries are submitted to search engines, and indicators of documents from the corresponding search results are presented to the user with the original document that is being read.
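The graph lookup and query generation described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the method claimed in the patent: the graph is a plain adjacency dictionary, "distance" is counted in hops via breadth-first search, a node is treated as a core concept phrase only if it lies within the distance limit of every extracted phrase found in the graph, and the helper names (`nodes_within`, `core_concept_phrases`, `build_queries`) and the sample phrases are hypothetical.

```python
from collections import deque
from itertools import product


def nodes_within(graph, start, max_distance):
    """Breadth-first search: map each node at most max_distance hops from start to its hop count."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_distance:
            continue  # do not expand past the distance limit
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen[neighbor] = seen[node] + 1
                queue.append(neighbor)
    return seen


def core_concept_phrases(graph, extracted, max_distance):
    """Assumed rule: a node is core if it is within max_distance of every extracted phrase in the graph."""
    known = [phrase for phrase in extracted if phrase in graph]
    if not known:
        return set()
    reachable = [set(nodes_within(graph, phrase, max_distance)) for phrase in known]
    return set.intersection(*reachable)


def build_queries(core, extracted):
    """Combine each core concept phrase with each extracted phrase into one query string."""
    return sorted(f"{c} {p}" for c, p in product(core, extracted) if c != p)


# Hypothetical universal concept phrase graph (undirected, stored as adjacency lists).
graph = {
    "machine learning": ["neural networks", "decision trees"],
    "neural networks": ["machine learning", "backpropagation"],
    "decision trees": ["machine learning", "random forests"],
    "backpropagation": ["neural networks"],
    "random forests": ["decision trees"],
}

extracted = ["backpropagation", "random forests"]  # phrases found in the document being read
core = core_concept_phrases(graph, extracted, max_distance=2)
queries = build_queries(core, extracted)
print(core)     # {'machine learning'}
print(queries)  # ['machine learning backpropagation', 'machine learning random forests']
```

Each query string would then be submitted to a search engine. That step is omitted here because the abstract does not name a specific engine or interface.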