Invention Grant
- Patent Title: Method and apparatus for extracting information
- Application No.: US15564187
- Application Date: 2016-06-17
- Publication No.: US10679051B2
- Publication Date: 2020-06-09
- Inventor: Shouke Qin, You Han, Zhiyang Chen, Feichao Ma, Peizhi Xu
- Applicant: Baidu Online Network Technology (Beijing) Co., Ltd.
- Applicant Address: CN Beijing
- Assignee: Baidu Online Network Technology (Beijing) Co., Ltd.
- Current Assignee: Baidu Online Network Technology (Beijing) Co., Ltd.
- Current Assignee Address: CN Beijing
- Agency: Knobbe, Martens, Olson & Bear, LLP
- International Application: PCT/CN2016/086213 WO 2016-06-17
- International Announcement: WO2017/113645 WO 2017-07-06
- Main IPC: G06K9/00
- IPC: G06K9/00; G06F16/35; G06F16/958; G06F40/14; G06F40/117; G06F40/154

Abstract:
The present application discloses a method and apparatus for extracting information. A specific implementation of the method comprises: parsing a pre-acquired web page file into a tag tree structure, and recognizing, among the nodes of the tag tree, at least one body node at which the web page body in the web page file is located; dividing the content contained in the at least one body node into paragraphs to generate paragraph blocks, and setting a tag attribute for each paragraph block according to an attribute of the tag associated with that paragraph block; classifying the text content contained in each paragraph block based on the tag attribute of that paragraph block; and extracting information comprising a question and an answer from the text content contained in each paragraph block based on the classification result. This implementation enables automatic and precise extraction of information.
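
The four steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not say how the body node is recognized or how paragraph blocks are classified, so the heuristics here are assumptions chosen for illustration (the body node is taken to be the node whose direct block-level children hold the most text, and a heading ending in "?" is treated as a question). The names `Node`, `TagTreeBuilder`, `find_body_node`, `paragraph_blocks`, `extract_qa`, and `extract` are all hypothetical.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}
BLOCK_TAGS = HEADING_TAGS | {"p", "li"}
VOID_TAGS = {"br", "img", "hr", "meta", "link", "input"}


class Node:
    """One node of the tag tree; children mix Node and str in document order."""

    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []

    def text(self):
        parts = [c if isinstance(c, str) else c.text() for c in self.children]
        return " ".join(p for p in parts if p)


class TagTreeBuilder(HTMLParser):
    """Step 1: parse the web page file into a tag tree."""

    def __init__(self):
        super().__init__()
        self.root = Node("root")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:  # void elements never get children
            self.current = node

    def handle_endtag(self, tag):
        # Close the nearest open element with this tag; ignore stray end tags.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent

    def handle_data(self, data):
        if data.strip():
            self.current.children.append(data.strip())


def iter_nodes(node):
    yield node
    for child in node.children:
        if isinstance(child, Node):
            yield from iter_nodes(child)


def find_body_node(root):
    """Step 1 (cont.): assumed heuristic, most text in direct block children."""
    def score(n):
        return sum(len(c.text()) for c in n.children
                   if isinstance(c, Node) and c.tag in BLOCK_TAGS)
    return max(iter_nodes(root), key=score)


def paragraph_blocks(body):
    """Step 2: split the body into blocks and set a tag attribute on each."""
    return [("heading" if c.tag in HEADING_TAGS else "text", c.text())
            for c in body.children
            if isinstance(c, Node) and c.tag in BLOCK_TAGS]


def extract_qa(blocks):
    """Steps 3 and 4: classify blocks by tag attribute, then pair Q with A."""
    pairs, question, answer = [], None, []
    for attr, text in blocks:
        if attr == "heading":
            if question and answer:
                pairs.append((question, " ".join(answer)))
            # Assumed rule: only headings ending in "?" count as questions.
            question, answer = (text if text.endswith("?") else None), []
        elif question:
            answer.append(text)
    if question and answer:
        pairs.append((question, " ".join(answer)))
    return pairs


def extract(html):
    builder = TagTreeBuilder()
    builder.feed(html)
    builder.close()
    return extract_qa(paragraph_blocks(find_body_node(builder.root)))


SAMPLE = """<html><body><div id="nav"><a href="/">Home</a></div>
<div id="main"><h2>What is a tag tree?</h2>
<p>It is the nested structure of a page's HTML tags.</p>
<p>Each element becomes one node.</p>
<h2>Related pages</h2><p>See the index.</p></div></body></html>"""

print(extract(SAMPLE))
```

On the sample page, the navigation `div` is skipped because it has no block-level text, and the "Related pages" heading is discarded because it is not classified as a question. A real system would likely replace both heuristics with trained classifiers and would need to handle unclosed tags, which this sketch does not.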
Public/Granted literature
- US20180322341A1 METHOD AND APPARATUS FOR EXTRACTING INFORMATION, Public/Granted day: 2018-11-08