Invention Grant
US08744833B2 Method and apparatus for creating a language model and kana-kanji conversion
In force
- Patent Title: Method and apparatus for creating a language model and kana-kanji conversion
- Application No.: US11917657
- Application Date: 2006-06-23
- Publication No.: US08744833B2
- Publication Date: 2014-06-03
- Inventor: Rie Maeda, Yoshiharu Sato, Miyuki Seki
- Applicant: Rie Maeda, Yoshiharu Sato, Miyuki Seki
- Applicant Address: US WA Redmond
- Assignee: Microsoft Corporation
- Current Assignee: Microsoft Corporation
- Current Assignee Address: US WA Redmond
- Agent: Steve Crocker; Jim Ross; Micky Minhas
- Priority: JP2005-185765 20050624
- International Application: PCT/US2006/024566 WO 20060623
- International Announcement: WO2007/002456 WO 20070104
- Main IPC: G06F17/20
- IPC: G06F17/20

Abstract:
A method for creating a language model capable of preventing the deterioration of quality caused by the conventional back-off to unigram. Parts-of-speech with the same display and reading are obtained from a storage device (206). A cluster (204) is created by combining the obtained parts-of-speech, and the created cluster (204) is stored in the storage device (206). In addition, when an instruction (214) for dividing the cluster is input, the cluster stored in the storage device (206) is divided (210) in accordance with the input instruction (212). Two of the clusters stored in the storage device are combined (218), and the probability of occurrence of the combined clusters in the text corpus is calculated (222). The combined cluster is associated with a bigram indicating the calculated probability and is stored in the storage device.
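The clustering and bigram steps described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation. It assumes a lexicon of (display, reading, part-of-speech) triples, identifies each cluster by the set of parts-of-speech it merges, and uses a plain maximum-likelihood estimate for the cluster bigram. The names `build_clusters` and `cluster_bigram_probs` are hypothetical, and the cluster-dividing step is left out.

```python
from collections import Counter, defaultdict


def build_clusters(lexicon):
    """Combine parts-of-speech that share the same (display, reading).

    `lexicon` is an iterable of (display, reading, pos) triples; this
    layout is an assumption made for illustration only.
    """
    grouped = defaultdict(set)
    for display, reading, pos in lexicon:
        grouped[(display, reading)].add(pos)
    # Each cluster is represented by the frozen set of tags it merges.
    return {key: frozenset(tags) for key, tags in grouped.items()}


def cluster_bigram_probs(corpus, clusters):
    """Estimate P(next cluster | previous cluster) by relative frequency.

    `corpus` is a list of sentences, each a list of (display, reading)
    tokens that appear in `clusters`.
    """
    prev_counts = Counter()
    pair_counts = Counter()
    for sentence in corpus:
        ids = [clusters[token] for token in sentence]
        for prev, nxt in zip(ids, ids[1:]):
            pair_counts[(prev, nxt)] += 1
            prev_counts[prev] += 1
    # Associate each pair of clusters with its bigram probability.
    return {pair: n / prev_counts[pair[0]] for pair, n in pair_counts.items()}
```

Keying the result on pairs of clusters rather than on individual parts-of-speech is what lets a conversion engine look up a bigram even when a specific tag pair was never observed, which is the situation that would otherwise force a back-off to unigram.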
Public/Granted literature
- US20110106523A1 Method and Apparatus for Creating a Language Model and Kana-Kanji Conversion (Public/Granted day: 2011-05-05)