Invention Grant
- Patent Title: Extracting terms from document data including text segment
- Patent Title (中): 从包括文本段的文档数据中提取术语
- Application No.: US13899020
- Application Date: 2013-05-21
- Publication No.: US09043339B2
- Publication Date: 2015-05-26
- Inventor: Yohei Ikawa, Shiho Negishi, Hironori Takeuchi
- Applicant: International Business Machines Corporation
- Applicant Address: US NY Armonk
- Assignee: International Business Machines Corporation
- Current Assignee: International Business Machines Corporation
- Current Assignee Address: US NY Armonk
- Agency: Cantor Colburn LLP
- Agent: Gail Zarick
- Priority: JP2008-257388 20081002; JP2010-531786 20090730
- Main IPC: G06F17/30
- IPC: G06F17/30; G06F17/28; G06F17/27

Abstract:
A computer system, method, and article of manufacture for extracting a term from electronic document data that includes a text segment. The system includes: a first extraction unit that uses a first text processing information to extract a noun word from the document data; a second extraction unit that uses a second text processing information to extract a term candidate in relation to the noun word or a corpus that includes text data described in the same language used in the document data; a weight assignment unit that uses a third text processing information to select which type to assign a weight from the plurality of types and assigns the weight to the selected type for each noun word and term candidate; a determination unit that determines the type to which the noun word and term candidate belong; and an output unit to output the noun word and term candidate.
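The pipeline described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: the abstract does not say what the three kinds of text processing information are, so the noun lexicon, the stopword list, the cue-word table, the type names, and every function name below are hypothetical stand-ins chosen only to make the data flow concrete.

```python
import re
from collections import defaultdict

# Hypothetical stand-ins for the three kinds of "text processing information".
NOUN_LEXICON = {"customer", "server", "database"}   # first: nouns to look for
STOPWORDS = {"a", "an", "the", "that", "this"}      # second: modifiers to skip
TYPE_CUES = {                                       # third: cue words per type
    "actor": {"reports", "requests"},
    "component": {"restart", "install"},
}


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def extract_nouns(document):
    """First extraction unit: noun words found in the document, in order."""
    nouns = []
    for token in tokenize(document):
        if token in NOUN_LEXICON and token not in nouns:
            nouns.append(token)
    return nouns


def extract_candidates(nouns, texts):
    """Second extraction unit: 'modifier noun' phrases from document or corpus."""
    candidates = []
    for noun in nouns:
        pattern = re.compile(r"\b([a-z]+) " + re.escape(noun) + r"\b")
        for text in texts:
            for match in pattern.finditer(text.lower()):
                phrase = f"{match.group(1)} {noun}"
                if match.group(1) not in STOPWORDS and phrase not in candidates:
                    candidates.append(phrase)
    return candidates


def assign_weights(terms, texts):
    """Weight assignment unit: add 1.0 to a type per sentence sharing a cue word."""
    weights = {term: defaultdict(float) for term in terms}
    sentences = [s for text in texts for s in re.split(r"[.!?]", text.lower())]
    for term in terms:
        term_re = re.compile(r"\b" + re.escape(term) + r"\b")
        for sentence in sentences:
            if not term_re.search(sentence):
                continue
            words = set(tokenize(sentence))
            for type_name, cues in TYPE_CUES.items():
                if words & cues:
                    weights[term][type_name] += 1.0
    return weights


def determine_types(weights):
    """Determination unit: pick the heaviest type, or None if nothing was weighted."""
    return {
        term: max(by_type, key=by_type.get) if by_type else None
        for term, by_type in weights.items()
    }


def classify_terms(document, corpus):
    """Run all units in sequence and return a term-to-type mapping (the output)."""
    texts = [document] + list(corpus)
    nouns = extract_nouns(document)
    terms = nouns + extract_candidates(nouns, texts)
    return determine_types(assign_weights(terms, texts))
```

Calling `classify_terms("The customer reports a fault. Restart the mail server.", ["Install the database server first."])` under these toy assumptions labels "customer" as `actor` and "server", "mail server", and "database server" as `component`. A real system would presumably replace the lexicon and cue table with a morphological analyzer and learned or curated rules.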
Public/Granted literature
- US20130253916A1 EXTRACTING TERMS FROM DOCUMENT DATA INCLUDING TEXT SEGMENT Public/Granted day: 2013-09-26