Invention Grant
US08214363B2 Recognizing domain specific entities in search queries (Expired)

Recognizing domain specific entities in search queries
Abstract:
Disclosed is a computer-implemented system and method of discerning an entity class from a query. A query is received and broken into query fragments comprised of terms. The terms of the query fragments are compared to terms belonging to one or more bag-of-words models, and matching terms are removed from the fragments. The bag-of-words models to which the removed terms belong are remembered. The remaining n terms of the query fragments are processed to obtain query phrases consisting of the 1-grams through n-grams of the query fragments. Each query phrase is submitted to a search engine. A sample of snippets is extracted from the search results. Non-stop words from the snippets are stemmed. A similarity score is computed for the stemmed snippet words with respect to each entity class bag-of-words model. Snippet entity classes are selected based on the bag-of-words models having the highest similarity score with the stemmed snippet words. The similarity scores of the selected snippet entity classes are summed to obtain a candidate list of entity classes to which the query phrase may belong. The entity classes whose summed scores exceed a predetermined threshold are selected as the entity classes to which the query phrase belongs. The remembered bag-of-words models, to which the terms removed from the query fragments belong, are used to choose the context-sensitive entity classes to which the query phrase belongs.
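The steps in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the two bag-of-words models, the stop-word list, the naive suffix-stripping stemmer, the overlap-fraction similarity measure, and the context rule in `apply_context` are all hypothetical choices, since the abstract does not specify them. The search engine call is omitted, so snippets are passed in directly.

```python
from collections import defaultdict

# Hypothetical bag-of-words models, one per entity class (illustrative only).
MODELS = {
    "movie": {"film", "actor", "director", "cast", "trailer"},
    "restaurant": {"menu", "dinner", "chef", "cuisine", "reservation"},
}
STOP_WORDS = {"the", "a", "an", "of", "in", "and", "for", "with", "is"}


def stem(word):
    """Naive suffix stripper; a stand-in for a real stemmer (assumption)."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def strip_model_terms(terms):
    """Remove terms found in any model and remember which models matched."""
    remaining, remembered = [], set()
    for term in terms:
        hits = {name for name, bag in MODELS.items() if stem(term) in bag}
        if hits:
            remembered |= hits
        else:
            remaining.append(term)
    return remaining, remembered


def ngrams(terms):
    """All contiguous 1-grams through n-grams of the remaining n terms."""
    n = len(terms)
    return [" ".join(terms[i:i + k])
            for k in range(1, n + 1) for i in range(n - k + 1)]


def similarity(words, bag):
    """Fraction of snippet words found in the model (assumed measure)."""
    return sum(w in bag for w in words) / len(words) if words else 0.0


def classify_phrase(snippets, threshold):
    """Sum the best per-snippet class scores and keep classes above threshold."""
    totals = defaultdict(float)
    for snippet in snippets:
        words = [stem(w) for w in snippet.lower().split()
                 if w not in STOP_WORDS]
        scores = {name: similarity(words, bag) for name, bag in MODELS.items()}
        best = max(scores, key=scores.get)
        if scores[best] > 0:
            totals[best] += scores[best]
    return {name: total for name, total in totals.items() if total > threshold}


def apply_context(candidates, remembered):
    """Prefer classes whose model matched a removed query term (assumed rule)."""
    preferred = {c: s for c, s in candidates.items() if c in remembered}
    return preferred or candidates
```

For example, the query "inception cast" would lose the term "cast" to the hypothetical movie model, leaving "inception" as the only query phrase. Snippets returned for that phrase would then be scored against each model, and the remembered movie model would be used to favor the movie class among the candidates.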