Invention Grant
- Patent Title: Corpus expansion system and method thereof
- Patent Title (中): 语料库扩展系统及其方法
- Application No.: US12138139
- Application Date: 2008-06-12
- Publication No.: US07805288B2
- Publication Date: 2010-09-28
- Inventor: Hong Lei Guo, Zhi Li Guo, Zhao Ming Qiu, Li Qin Shen, Li Zhang
- Applicant: Hong Lei Guo, Zhi Li Guo, Zhao Ming Qiu, Li Qin Shen, Li Zhang
- Applicant Address: US NY Armonk
- Assignee: International Business Machines Corporation
- Current Assignee: International Business Machines Corporation
- Current Assignee Address: US NY Armonk
- Agency: Scully, Scott, Murphy & Presser, P.C.
- Agent: Kenneth Corsello, Esq.
- Priority: CN200510108065 20050929
- Main IPC: G06F17/20
- IPC: G06F17/20

Abstract:
A system and method are provided for automatically expanding corpora by generating new sample seeds, where sample seeds are used to collect corpus material. New sample seeds are generated from the existing sample seeds and the collected corpora; a corpus expansion strategy is determined from all sample seeds used so far and the new sample seeds; the new sample seeds are refined according to the corpus expansion strategy; and the refined new sample seeds are used to collect further corpus material. These steps are repeated until a predefined condition is satisfied. According to the invention, a corpus may be expanded automatically from the web or other resources at low cost and in a convenient way, improving the coverage of corpora.
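
The loop described in the abstract can be sketched roughly as follows. This is only an illustrative reading, not the patented implementation: the function name `expand_corpus`, the `collect` callback, the word-frequency heuristic for proposing seeds, and the fixed round limit used as the "predefined condition" are all assumptions introduced here. The abstract does not say how seeds are generated, how the strategy is chosen, or what the stopping condition is.

```python
from collections import Counter
from typing import Callable, Iterable, List, Set, Tuple


def expand_corpus(
    initial_seeds: Iterable[str],
    collect: Callable[[str], List[str]],  # hypothetical: returns documents for a seed
    max_rounds: int = 3,                  # assumed stopping condition
    seeds_per_round: int = 2,             # assumed part of the "strategy"
) -> Tuple[List[str], Set[str]]:
    """Grow a corpus iteratively from sample seeds (illustrative sketch)."""
    used_seeds: Set[str] = set()
    corpus: List[str] = []
    seeds = list(dict.fromkeys(initial_seeds))  # de-duplicate, keep order

    for _ in range(max_rounds):
        if not seeds:
            break

        # Step 1: collect corpus material with the current seeds.
        new_docs: List[str] = []
        for seed in seeds:
            for doc in collect(seed):
                if doc not in corpus and doc not in new_docs:
                    new_docs.append(doc)
        corpus.extend(new_docs)
        used_seeds.update(seeds)

        # Step 2: propose new seeds from the collected text
        # (assumption: plain word frequency stands in for seed generation).
        counts = Counter(w for doc in new_docs for w in doc.lower().split())

        # Step 3: a stand-in "strategy" based on all used seeds and the
        # candidates: skip seeds already used and cap the number kept.
        refined = [w for w, _ in counts.most_common() if w not in used_seeds]

        # Step 4: the refined seeds drive the next round of collection.
        seeds = refined[:seeds_per_round]

    return corpus, used_seeds
```

In this sketch, `collect` could wrap a web search or any other document source; a toy dictionary lookup is enough to exercise the loop.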
Public/Granted literature
- US20080250015A1 CORPUS EXPANSION SYSTEM AND METHOD THEREOF Public/Granted day: 2008-10-09