Invention Grant
- Patent Title: Word extraction method and system for use in word-breaking using statistical information
- Patent Title (Chinese, translated): Word extraction method and system using statistical information
- Application No.: US10839144
- Application Date: 2004-05-05
- Publication No.: US07783476B2
- Publication Date: 2010-08-24
- Inventor: Jung-Chuan Yang
- Applicant: Jung-Chuan Yang
- Applicant Address: US WA Redmond
- Assignee: Microsoft Corporation
- Current Assignee: Microsoft Corporation
- Current Assignee Address: US WA Redmond
- Agency: Westman, Champlin & Kelly, P.A.
- Agent: Joseph R. Kelly
- Main IPC: G06F17/21
- IPC: G06F17/21 ; G06F17/27 ; G06F17/20

Abstract:
A method, computer readable medium and system are provided which collect new words for addition to a lexicon for an agglutinative language. Sentences in the agglutinative language are retrieved from documents, for example from web pages. New word candidate character strings are identified in the retrieved sentences. The identified new word candidate character strings are filtered using a combination of a plurality of statistical criteria to generate a new words list. Words from the new words list are added to the lexicon.
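The pipeline in the abstract (retrieve sentences, identify candidate character strings, filter them with several statistical criteria, add the survivors to the lexicon) can be illustrated with a minimal sketch. The abstract does not name the statistical criteria, so the two used here (total occurrence count and number of distinct sentences containing the string) are assumptions chosen for illustration, as are the function names, the n-gram length bounds, and the thresholds. This is not the patented method.

```python
from collections import Counter


def candidate_ngrams(sentence, min_len=2, max_len=4):
    """Yield every character substring of length min_len..max_len (hypothetical candidate generator)."""
    for n in range(min_len, max_len + 1):
        for i in range(len(sentence) - n + 1):
            yield sentence[i:i + n]


def extract_new_words(sentences, lexicon, min_count=3, min_sentences=2):
    """Return candidate strings that are not in the lexicon and pass both assumed criteria."""
    counts = Counter()           # total occurrences of each candidate
    sentence_counts = Counter()  # number of sentences containing each candidate
    for sentence in sentences:
        grams = list(candidate_ngrams(sentence))
        counts.update(grams)
        sentence_counts.update(set(grams))
    # Keep a candidate only if it satisfies every criterion at once.
    return sorted(
        gram
        for gram in counts
        if gram not in lexicon
        and counts[gram] >= min_count
        and sentence_counts[gram] >= min_sentences
    )
```

Calling `extract_new_words(sentences, lexicon)` on sentences already retrieved from documents returns a new-words list; `lexicon.update(new_words)` would then correspond to the final step of adding those words to the lexicon.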
Public/Granted literature
- US20050251384A1: Word extraction method and system for use in word-breaking (Public/Granted day: 2005-11-10)