Invention Grant
- Patent Title: Selective content extraction
- Patent Title (Chinese): Selective content extraction
- Application No.: US13378153
- Application Date: 2009-06-30
- Publication No.: US09032285B2
- Publication Date: 2015-05-12
- Inventor: Sam Liu, Parag Joshi, Yuhong Xiong, Clayton Atkins, Jerry Liu
- Applicant: Sam Liu, Parag Joshi, Yuhong Xiong, Clayton Atkins, Jerry Liu
- Applicant Address: US TX Houston
- Assignee: Hewlett-Packard Development Company, L.P.
- Current Assignee: Hewlett-Packard Development Company, L.P.
- Current Assignee Address: US TX Houston
- International Application: PCT/US2009/049298 WO 20090630
- International Announcement: WO2011/002456 WO 20110106
- Main IPC: G06F17/00
- IPC: G06F17/00; G06F17/30

Abstract:
A method for extracting web content includes detecting, within a web page, a hierarchical structure that includes a plurality of nodes. Potential article nodes are identified from the plurality of nodes. The potential article node with the highest rank in the hierarchical structure is selected as the article node, and content is extracted from that node.
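The four steps in the abstract (detect the hierarchy, identify potential article nodes, pick the highest-ranked one, extract its content) can be illustrated with a minimal Python sketch. This is not the patented implementation. The abstract does not say how potential article nodes are recognised or how rank is measured, so the sketch makes two assumptions: a node is a candidate when it has enough text in direct `<p>` children, and "highest rank" means closest to the root of the tree. The names `Node`, `TreeBuilder`, `is_potential_article`, `extract_article` and the thresholds are all hypothetical.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}
# Elements whose text is not readable page content.
SKIP_TAGS = {"script", "style"}


class Node:
    """One element of the page hierarchy."""

    def __init__(self, tag, depth):
        self.tag = tag
        self.depth = depth   # 0 for the synthetic document root
        self.items = []      # child Nodes and text strings, in document order

    def children(self):
        return [i for i in self.items if isinstance(i, Node)]

    def text(self):
        # Concatenate own and descendant text, collapsing whitespace.
        parts = [i.text() if isinstance(i, Node) else i for i in self.items]
        return " ".join(" ".join(parts).split())

    def walk(self):
        # Pre-order traversal: the node itself, then its descendants.
        yield self
        for child in self.children():
            yield from child.walk()


class TreeBuilder(HTMLParser):
    """Step 1: detect the hierarchical structure of the page."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document", 0)
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag, len(self.stack))
        self.stack[-1].items.append(node)
        if tag not in VOID_TAGS:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open tag; ignore stray end tags.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        if self.stack[-1].tag not in SKIP_TAGS:
            self.stack[-1].items.append(data)


def is_potential_article(node, min_paragraphs=2, min_chars=80):
    """Step 2 (assumed heuristic): enough text in direct <p> children."""
    paragraphs = [c for c in node.children() if c.tag == "p"]
    total_chars = sum(len(p.text()) for p in paragraphs)
    return len(paragraphs) >= min_paragraphs and total_chars >= min_chars


def extract_article(html):
    """Return the text of the selected article node, or None if no candidate."""
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    candidates = [n for n in builder.root.walk() if is_potential_article(n)]
    if not candidates:
        return None
    # Step 3 (assumed meaning of rank): the candidate nearest the root wins;
    # ties go to the candidate that appears first in the document.
    article = min(candidates, key=lambda n: n.depth)
    # Step 4: extract the content of the article node.
    return article.text()
```

Calling `extract_article(html)` on a page string returns the article text, including any nested candidate blocks it contains, while navigation and footer blocks that fail the heuristic are left out. Preferring the candidate nearest the root keeps a nested block such as a pull quote from being mistaken for the whole article.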
Public/Granted literature
- US20120089903A1 SELECTIVE CONTENT EXTRACTION Public/Granted day: 2012-04-12