Invention Grant
US08510099B2 Method and system of selecting word sequence for text written in language without word boundary markers
Active
- Patent Title: Method and system of selecting word sequence for text written in language without word boundary markers
- Patent Title (中): 为没有字边界标记的语言编写的文本选择字序列的方法和系统
- Application No.: US12746889
- Application Date: 2009-12-04
- Publication No.: US08510099B2
- Publication Date: 2013-08-13
- Inventor: Neng Dai
- Applicant: Neng Dai
- Applicant Address: KY Grand Cayman
- Assignee: Alibaba Group Holding Limited
- Current Assignee: Alibaba Group Holding Limited
- Current Assignee Address: KY Grand Cayman
- Agency: Lee & Hayes, PLLC
- Priority: CN200810192934 2008-12-31
- International Application: PCT/US2009/066753 WO 2009-12-04
- International Announcement: WO2010/077572 WO 2010-07-08
- Main IPC: G06F17/27
- IPC: G06F17/27

Abstract:
The present disclosure describes a method and apparatus for selecting a word sequence for text written in a language without word boundary markers, addressing the excessively large computation load that existing technologies incur when selecting an optimal word sequence. The disclosed method includes: segmenting a segment of the text to obtain different word sequences; determining a common word boundary for the word sequences; and performing optimal word sequence selection for the portions of the word sequences prior to the common word boundary. Because optimal word sequence selection is performed for the portions of the word sequences prior to a common word boundary, shorter independent units are obtained, which reduces the computation load of word segmentation.
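The three steps named in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names (`boundaries`, `split_at`, `select_sequence`), the example sentence, and the scoring rule (penalizing single-character words) are all assumptions made for illustration, since the abstract does not say how candidate sequences are generated or scored. The sketch takes candidate word sequences for the same text as given, finds the character offsets where every candidate has a word boundary, and picks the best option separately inside each unit ending at such a common boundary.

```python
from itertools import accumulate


def boundaries(words):
    """Return the character offsets at which the words of a sequence end."""
    return set(accumulate(len(w) for w in words))


def split_at(words, cuts):
    """Split a word sequence into units, each ending at an offset in `cuts`."""
    units, unit, pos = [], [], 0
    for w in words:
        unit.append(w)
        pos += len(w)
        if pos in cuts:
            units.append(tuple(unit))
            unit = []
    return units


def fewest_single_chars(unit):
    """Hypothetical score: prefer units with fewer one-character words."""
    return -sum(len(w) == 1 for w in unit)


def select_sequence(candidates, score):
    """Pick the best option per unit between consecutive common boundaries.

    All candidates are assumed to segment the same text, so the end of the
    text is always a common boundary and every candidate yields the same
    number of units.
    """
    common = set.intersection(*(boundaries(c) for c in candidates))
    per_candidate = [split_at(c, common) for c in candidates]
    best = []
    for options in zip(*per_candidate):
        # dict.fromkeys removes duplicate options while keeping their order,
        # so ties are resolved in favor of the earliest candidate.
        best.extend(max(dict.fromkeys(options), key=score))
    return best


# Two illustrative segmentations of the same six-character text.
candidates = [
    ["研究", "生命", "起源"],
    ["研究生", "命", "起源"],
]
result = select_sequence(candidates, fewest_single_chars)
```

In this example the candidates share boundaries at offsets 4 and 6, so the text splits into two independent units. Scoring then compares short alternatives within each unit instead of whole sequences, which is the kind of reduction in work the abstract attributes to common word boundaries.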
Public/Granted literature
- US20110252010A1 Method and System of Selecting Word Sequence for Text Written in Language Without Word Boundary Markers (Public/Granted day: 2011-10-13)