Invention Grant
US07783476B2 Word extraction method and system for use in word-breaking using statistical information (Active)
Word extraction method and system using statistical information

Abstract:
A method, computer readable medium and system are provided which collect new words for addition to a lexicon for an agglutinative language. Sentences in the agglutinative language are retrieved from documents, for example from web pages. New word candidate character strings are identified in the retrieved sentences. The identified new word candidate character strings are filtered using a combination of a plurality of statistical criteria to generate a new words list. Words from the new words list are added to the lexicon.
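The pipeline in the abstract can be illustrated with a minimal sketch. The abstract does not name the statistical criteria, so the two used here (total frequency and the number of distinct sentences a candidate appears in) are assumptions chosen for illustration, as are the function name `collect_new_words`, its parameters, and the choice of character n-grams as candidates. It is not the patented method.

```python
from collections import Counter

def collect_new_words(sentences, lexicon, min_len=2, max_len=4,
                      min_count=3, min_sentences=2):
    """Return candidate strings that pass both (assumed) statistical filters.

    Candidates are character n-grams taken inside whitespace-delimited chunks.
    Strings already in the lexicon are skipped.
    """
    count = Counter()   # total occurrences of each candidate
    spread = Counter()  # number of distinct sentences containing it
    for sentence in sentences:
        seen = set()
        for chunk in sentence.split():
            for n in range(min_len, max_len + 1):
                for i in range(len(chunk) - n + 1):
                    cand = chunk[i:i + n]
                    if cand in lexicon:
                        continue
                    count[cand] += 1
                    seen.add(cand)
        spread.update(seen)
    # Combine the criteria: a candidate must satisfy every threshold.
    return sorted(c for c in count
                  if count[c] >= min_count and spread[c] >= min_sentences)

# Toy usage: retrieved sentences, an existing lexicon, then the lexicon update.
sentences = ["abcab", "abc", "xab"]
lexicon = {"bc"}
new_words = collect_new_words(sentences, lexicon, min_len=2, max_len=3)
lexicon.update(new_words)
```

In practice the sentences would come from retrieved documents such as web pages, and the thresholds would need tuning against the corpus size. Requiring every criterion to hold is one simple way to combine several filters; a weighted score would be another.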