Invention Grant
- Patent Title: Information extraction system, information extraction method, information extraction program, and information service system
- Patent Title (中): 信息提取系统,信息提取方法,信息提取程序和信息服务系统
- Application No.: US12294143
- Application Date: 2007-03-23
- Publication No.: US08886661B2
- Publication Date: 2014-11-11
- Inventor: Hironori Mizuguchi, Masaaki Tsuchida, Dai Kusui, Hideki Kawai
- Applicant: Hironori Mizuguchi, Masaaki Tsuchida, Dai Kusui, Hideki Kawai
- Applicant Address: JP Tokyo
- Assignee: NEC Corporation
- Current Assignee: NEC Corporation
- Current Assignee Address: JP Tokyo
- Agency: Young & Thompson
- Priority: JP2006-081598 20060323
- International Application: PCT/JP2007/055958 WO 20070323
- International Announcement: WO2007/108529 WO 20070927
- Main IPC: G06F17/30
- IPC: G06F17/30; G06F17/28; G06F17/27; G06Q30/02; G06Q30/06

Abstract:
According to the present invention, phrases of the same kind can be extracted from a plurality of documents having various formats. A storage device stores a plurality of documents that have various formats. A pattern candidate creating unit receives a list of input words that are selected as samples among phrases that are to be included in a dictionary. The pattern candidate creating unit selects one document, determines forward and backward character strings of input words in the selected document as candidates of patterns, and stores the forward and backward character strings as a pattern candidate. The pattern candidate creating unit executes the above processes for each of the documents. A phrase candidate creating unit extracts phrases interposed between patterns included in the pattern candidate as candidates of phrases to be output, and stores the extracted phrases as a phrase candidate. A phrase selecting unit outputs a candidate of a phrase satisfying a predetermined condition among candidates of phrases included in the phrase candidate as an output word to an output device.
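The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the fixed context `width`, the `max_len` cap, and the frequency threshold `min_count` (standing in for the unspecified "predetermined condition") are all assumptions made for illustration.

```python
import re
from collections import Counter


def pattern_candidates(document, seeds, width=4):
    """Collect (before, after) character strings around each seed occurrence."""
    patterns = set()
    for seed in seeds:
        for m in re.finditer(re.escape(seed), document):
            before = document[max(0, m.start() - width):m.start()]
            after = document[m.end():m.end() + width]
            if before and after:  # skip seeds touching the document edge
                patterns.add((before, after))
    return patterns


def phrase_candidates(document, patterns, max_len=30):
    """Extract the text interposed between each (before, after) pattern."""
    phrases = []
    for before, after in patterns:
        regex = re.escape(before) + r"(.+?)" + re.escape(after)
        for m in re.finditer(regex, document):
            if len(m.group(1)) <= max_len:  # assumed sanity cap
                phrases.append(m.group(1))
    return phrases


def extract(documents, seeds, width=4, min_count=1):
    """Learn patterns per document, then keep phrases meeting the condition."""
    counts = Counter()
    for doc in documents:  # patterns are derived and applied per document
        patterns = pattern_candidates(doc, seeds, width)
        counts.update(phrase_candidates(doc, patterns))
    # Assumed selection condition: minimum frequency, excluding the seeds.
    return sorted(p for p, c in counts.items()
                  if c >= min_count and p not in seeds)


# Two hypothetical documents with different formats, one seed word.
docs = [
    "<li>Tokyo</li><li>Osaka</li><li>Kyoto</li>",
    "City: Tokyo; City: Nagoya; City: Osaka; ",
]
print(extract(docs, ["Tokyo"]))  # ['Kyoto', 'Nagoya', 'Osaka']
```

Because the patterns are derived separately for each document, the HTML list and the plain-text list each contribute their own delimiters, which is how the sketch copes with differing formats.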