Invention Grant
- Patent Title: Compound splitting
- Application No.: US13026936
- Application Date: 2011-02-14
- Publication No.: US09075792B2
- Publication Date: 2015-07-07
- Inventor: Andrew M. Dai, Klaus Macherey, Franz Josef Och, Ashok C. Popat, David R. Talbot
- Applicant: Andrew M. Dai, Klaus Macherey, Franz Josef Och, Ashok C. Popat, David R. Talbot
- Applicant Address: US CA Mountain View
- Assignee: Google Inc.
- Current Assignee: Google Inc.
- Current Assignee Address: US CA Mountain View
- Agency: Remarck Law Group PLC
- Main IPC: G06F17/28
- IPC: G06F17/28; G06F17/27

Abstract:
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for decompounding compound words are disclosed. In one aspect, a method includes obtaining a token that includes a sequence of characters, identifying two or more candidate sub-words that are constituents of the token, and one or more morphological operations that are required to transform the sub-words into the token, where at least one of the morphological operations involves a use of a non-dictionary word, and determining a cost associated with each sub-word and a cost associated with each morphological operation.
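The steps named in the abstract (candidate sub-words, morphological operations, and a cost for each) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the patent: the toy German lexicon, the cost values, the treatment of the linking morpheme "s" as the non-dictionary element, and the choice to pick the split with the lowest summed cost by dynamic programming.

```python
# Hypothetical sub-word costs (lower is better); illustrative values only.
WORD_COST = {"arbeit": 2.0, "amt": 3.0, "arbeits": 9.0, "samt": 6.0}
# Hypothetical linking morphemes inserted between sub-words, with their costs.
# The empty string means "no morphological operation needed".
LINKER_COST = {"": 0.0, "s": 1.0}


def split_compound(token):
    """Return (total_cost, sub_words, operations) for the cheapest split,
    or None if the token cannot be covered by known sub-words."""
    token = token.lower()
    n = len(token)
    # best[i] holds the cheapest analysis of token[:i], or None if unreachable.
    best = [None] * (n + 1)
    best[0] = (0.0, [], [])
    for end in range(1, n + 1):
        for start in range(end):
            if best[start] is None:
                continue
            piece = token[start:end]
            for linker, op_cost in LINKER_COST.items():
                if linker and (end == n or not piece.endswith(linker)):
                    # A linker only joins two sub-words, so never at token end.
                    continue
                word = piece[:len(piece) - len(linker)]
                if word not in WORD_COST:
                    continue
                prev_cost, prev_words, prev_ops = best[start]
                cost = prev_cost + WORD_COST[word] + op_cost
                if best[end] is None or cost < best[end][0]:
                    ops = prev_ops + ([f"insert '{linker}' after '{word}'"] if linker else [])
                    best[end] = (cost, prev_words + [word], ops)
    return best[n]


print(split_compound("Arbeitsamt"))
```

With these toy costs, "Arbeitsamt" is analysed as "arbeit" + "amt" joined by the linker "s", which is cheaper than the competing "arbeit" + "samt" reading. A real system would presumably derive costs from corpus statistics rather than a hand-written table.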
Public/Granted literature
- US20110202330A1 Compound Splitting, Public/Granted day: 2011-08-18