Invention Grant
- Patent Title: Word breaker from cross-lingual phrase table
- Application No.: US13861146
- Application Date: 2013-04-11
- Publication No.: US09330087B2
- Publication Date: 2016-05-03
- Inventor: Mohamed Ahmed El-Sharqwi, Achraf Abdel-Moneim Tawfik Mahmoud Chalabi
- Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC
- Applicant Address: US WA Redmond
- Assignee: Microsoft Technology Licensing, LLC
- Current Assignee: Microsoft Technology Licensing, LLC
- Current Assignee Address: US WA Redmond
- Agent: Alin Corie; Cassandra T. Swain; Micky Minhas
- Main IPC: G06F17/27
- IPC: G06F17/27; G06F17/28; G06F17/20; G10L21/00; G10L25/00

Abstract:
Automatic creation of word breakers that segment words into morphemes is described, for example, to improve information retrieval, machine translation or speech systems. In embodiments, a cross-lingual phrase table is available, comprising source language (such as Turkish) phrases and potential translations in a target language (such as English) with associated probabilities. In various examples, blocks of source language phrases that have similar target language translations are created from the phrase table. In various examples, inference using the target language translations in a block enables stem and affix combinations to be found for source language words without the need for input from human judges or prior knowledge of source language linguistic rules or a source language lexicon.
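The blocking idea in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy phrase table, the function-word list, the crude plural stripping, and the names PHRASE_TABLE, block_key, build_blocks, split_block and segment are all hypothetical. The longest-common-prefix split is a simple stand-in for the inference step and is not the method claimed in the patent.

```python
from collections import defaultdict
from os.path import commonprefix

# Hypothetical toy phrase table: (source word, target translation, probability).
PHRASE_TABLE = [
    ("evler", "houses", 0.8),
    ("evde", "in the house", 0.7),
    ("evden", "from the house", 0.6),
    ("kitaplar", "books", 0.9),
    ("kitapta", "in the book", 0.5),
]

# Assumed list of target-language function words to ignore when blocking.
FUNCTION_WORDS = {"in", "the", "from", "to", "of", "a"}


def crude_lemma(word):
    """Toy stand-in for target-language stemming: strip a trailing plural 's'."""
    return word[:-1] if word.endswith("s") and len(word) > 3 else word


def block_key(translation):
    """Reduce a target translation to its content words, used as the block key."""
    words = translation.lower().split()
    return " ".join(crude_lemma(w) for w in words if w not in FUNCTION_WORDS)


def build_blocks(table, min_prob=0.1):
    """Group source words whose translations share the same content words."""
    blocks = defaultdict(set)
    for source, target, prob in table:
        if prob >= min_prob:
            blocks[block_key(target)].add(source)
    return blocks


def split_block(words):
    """Use the longest common prefix of a block as the candidate stem."""
    stem = commonprefix(sorted(words))
    return {w: (stem, w[len(stem):]) for w in words}


def segment(table):
    """Return {source word: (stem, affix)} for every block with 2+ words."""
    result = {}
    for words in build_blocks(table).values():
        if len(words) > 1:
            result.update(split_block(words))
    return result


if __name__ == "__main__":
    for word, (stem, affix) in sorted(segment(PHRASE_TABLE).items()):
        print(f"{word} -> {stem} + {affix}")
```

In this toy setting the English translations alone decide which Turkish words belong together, so no Turkish lexicon or rule set is needed. A real system would presumably also use the translation probabilities and handle stem alternations, which a plain prefix split cannot.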
Public/Granted literature
- US20140309986A1 WORD BREAKER FROM CROSS-LINGUAL PHRASE TABLE, Public/Granted day: 2014-10-16