Web document keyword and phrase extraction

Invention Grant

US08135728B2 Web document keyword and phrase extraction (Active)

Patent Title: Web document keyword and phrase extraction
Application No.: US11619230

Application Date: 2007-01-03
Publication No.: US08135728B2

Publication Date: 2012-03-13
Inventor: Wen-tau Yih, Joshua T. Goodman, Vitor Rocha de Carvalho
Applicant: Wen-tau Yih, Joshua T. Goodman, Vitor Rocha de Carvalho
Applicant Address: US WA Redmond
Assignee: Microsoft Corporation
Current Assignee: Microsoft Corporation
Current Assignee Address: US WA Redmond
Agency: Lee & Hayes, PLLC
Main IPC: G06F7/00
IPC: G06F7/00; G06F17/30; G06F13/14

Abstract:

Extraction analysis techniques biased, in part, by query frequency information from a query log file and/or search engine cache are employed along with machine learning processes to determine candidate keywords and/or phrases of web documents. Web oriented features associated with the candidate keywords and/or phrases are also utilized to analyze the web documents. A keyword and/or phrase extraction mechanism can be utilized to score keywords and/or phrases in a web document and estimate a likelihood that the keywords and/or phrases are relevant, for example, in an advertising system and the like.
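
The scoring idea in the abstract can be illustrated with a minimal sketch. It is not the patented method: the three features, the hand-set weights, and the names `candidates`, `score_phrases`, `QUERY_FREQ`, `WEIGHTS`, and `BIAS` are all assumptions made for illustration. In the abstract's setting the weights would come from a machine learning process and the query frequencies from a query log file or search engine cache.

```python
import math
import re
from collections import Counter


def candidates(text, max_len=3):
    """Return every word n-gram (1..max_len words) in text, lowercased."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [" ".join(words[i:i + n])
            for n in range(1, max_len + 1)
            for i in range(len(words) - n + 1)]


def score_phrases(title, body, query_freq, weights, bias):
    """Estimate a relevance likelihood in (0, 1) for each candidate phrase."""
    body_counts = Counter(candidates(body))
    title_set = set(candidates(title))
    scores = {}
    for phrase in set(body_counts) | title_set:
        features = {
            # How often the phrase occurs in the document body (log-damped).
            "tf": math.log1p(body_counts[phrase]),
            # A web-oriented feature: does the phrase appear in the page title?
            "in_title": 1.0 if phrase in title_set else 0.0,
            # Bias toward phrases that users actually type as queries.
            "query_log": math.log1p(query_freq.get(phrase, 0)),
        }
        z = bias + sum(weights[name] * value for name, value in features.items())
        scores[phrase] = 1.0 / (1.0 + math.exp(-z))  # logistic function
    return scores


# Hypothetical query-log counts and weights; real values would be learned.
QUERY_FREQ = {"digital camera": 5000, "camera": 12000, "the": 3}
WEIGHTS = {"tf": 0.8, "in_title": 1.5, "query_log": 0.6}
BIAS = -6.0

scores = score_phrases(
    "Digital camera review",
    "This digital camera takes sharp photos. The camera is light.",
    QUERY_FREQ, WEIGHTS, BIAS)
top = sorted(scores, key=scores.get, reverse=True)[:3]
print(top)
```

With these made-up numbers, a frequently queried phrase that also appears in the title scores far higher than a common word such as "the", which is the kind of bias from query frequency that the abstract describes.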

Public/Granted literature

US20070112764A1 WEB DOCUMENT KEYWORD AND PHRASE EXTRACTION Public/Granted day: 2007-05-17

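IPC classification:

G	Physics
G06	Computing; calculating or counting
G06F	Electric digital data processing (computer systems based on specific computational models: see G06N)
G06F7/00	Methods or arrangements for processing data by operating upon the order or content of the data handled (logic circuits: see H03K19/00)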