Invention Grant
US09280528B2 Method and system for processing and learning rules for extracting information from incoming web pages
In force
- Patent Title: Method and system for processing and learning rules for extracting information from incoming web pages
- Application No.: US12896942
- Application Date: 2010-10-04
- Publication No.: US09280528B2
- Publication Date: 2016-03-08
- Inventor: Srinivasan Hanumantha Rao Sengamedu, Charu Tiwari, Amit Madaan, Rupesh Rasiklal Mehta, S R Jeyashankher, Rajeev Rastogi
- Applicant: Srinivasan Hanumantha Rao Sengamedu, Charu Tiwari, Amit Madaan, Rupesh Rasiklal Mehta, S R Jeyashankher, Rajeev Rastogi
- Applicant Address: US CA Sunnyvale
- Assignee: Yahoo! Inc.
- Current Assignee: Yahoo! Inc.
- Current Assignee Address: US CA Sunnyvale
- Agency: Brinks Gilson & Lione
- Main IPC: G06F17/00
- IPC: G06F17/00; G06F17/22; G06F17/30

Abstract:
An example of a method includes determining features of a first type for a first web page of a plurality of web pages. The method also includes electronically determining a plurality of rules for an attribute of the first web page, wherein the plurality of rules is determined based on the features of the first type. The method also includes electronically identifying a first rule, from the plurality of rules, that satisfies a first predefined criterion. The first predefined criterion includes at least one of a first threshold for a precision parameter, a second threshold for a support parameter, a third threshold for a distance parameter, and a fourth threshold for a recall parameter. The method further includes storing the first rule to enable extraction of a value of the attribute from a second web page.
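The rule-selection step in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name in it (`Rule`, `Criterion`, `score`, `first_satisfying`, the feature strings, the sample pages) is hypothetical. The sketch assumes a page is a list of (feature, text) pairs, that precision, recall, and support are measured against a few labeled pages, and that the distance parameter is supplied as a precomputed number, because the abstract does not define how it is calculated.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple

# Hypothetical page model: a sequence of (feature, text) pairs.
Page = Sequence[Tuple[str, str]]


@dataclass
class Rule:
    name: str
    extract: Callable[[Page], Optional[str]]  # extracted value, or None
    distance: float = 0.0  # assumed precomputed; not defined in the abstract


@dataclass
class Criterion:
    # Defaults leave a threshold inactive, so any subset can be applied.
    min_precision: float = 0.0
    min_support: int = 0
    max_distance: float = float("inf")
    min_recall: float = 0.0


def rule_for_feature(feature: str, distance: float = 0.0) -> Rule:
    """Build a rule that returns the text of the first node with this feature."""
    def extract(page: Page) -> Optional[str]:
        for feat, text in page:
            if feat == feature:
                return text
        return None
    return Rule(name=feature, extract=extract, distance=distance)


def score(rule: Rule, labeled: List[Tuple[Page, str]]) -> Tuple[float, float, int]:
    """Return (precision, recall, support) of a rule on labeled pages."""
    fired = correct = 0
    for page, truth in labeled:
        value = rule.extract(page)
        if value is not None:
            fired += 1
            correct += value == truth
    precision = correct / fired if fired else 0.0
    recall = correct / len(labeled) if labeled else 0.0
    return precision, recall, fired  # support: pages on which the rule fires


def first_satisfying(rules, labeled, crit: Criterion) -> Optional[Rule]:
    """Return the first candidate rule that meets every active threshold."""
    for rule in rules:
        precision, recall, support = score(rule, labeled)
        if (precision >= crit.min_precision and support >= crit.min_support
                and rule.distance <= crit.max_distance
                and recall >= crit.min_recall):
            return rule
    return None


# Labeled pages for a hypothetical "price" attribute.
labeled = [
    ([("h1.title", "Red Shoes"), ("span.price", "$10")], "$10"),
    ([("h1.title", "Blue Hat"), ("span.price", "$7")], "$7"),
    ([("h1.title", "Green Bag"), ("div.cost", "$3")], "$3"),
]
rules = [rule_for_feature("h1.title"), rule_for_feature("span.price")]
best = first_satisfying(rules, labeled, Criterion(min_precision=0.9, min_support=2))

# The stored rule is then applied to a second, unseen page.
second_page = [("h1.title", "Grey Scarf"), ("span.price", "$42")]
extracted = best.extract(second_page) if best else None
```

Here the `h1.title` rule fires on every page but never returns the labeled value, so it fails the precision threshold, while the `span.price` rule is correct on both pages where it fires and is the one retained.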
Public/Granted literature
- US20120084636A1 METHOD AND SYSTEM FOR WEB INFORMATION EXTRACTION Public/Granted day: 2012-04-05