Mining multi-lingual data

Invention Grant

US09864744B2 Mining multi-lingual data (Active)

Patent Title: Mining multi-lingual data
Application No.: US14559540

Application Date: 2014-12-03
Publication No.: US09864744B2

Publication Date: 2018-01-09
Inventor: Matthias Gerhard Eck, Ying Zhang, Yury Andreyevich Zemlyanskiy, Alexander Waibel
Applicant: Facebook, Inc.
Applicant Address: US CA Menlo Park
Assignee: Facebook, Inc.
Current Assignee: Facebook, Inc.
Current Assignee Address: US CA Menlo Park
Agency: Perkins Coie LLP
Main IPC: G06F17/30
IPC: G06F17/30; G06F17/28

Abstract:

Technology is disclosed for mining training data to create machine translation engines. Training data can be mined as translation pairs from single content items that contain multiple languages; multiple content items in different languages that are related to the same or similar target; or multiple content items that are generated by the same author in different languages. Locating content items can include identifying potential sources of translation pairs that fall into these categories and applying filtering techniques to quickly gather those that are good candidates for being actual translation pairs. When actual translation pairs are located, they can be used to retrain a machine translation engine as in-domain for social media content items.
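
As an illustration of the third category in the abstract (content items by the same author in different languages), the following is a minimal sketch and not the patented method. The `ContentItem` type, the `candidate_pairs` function, and the word-count ratio filter with its thresholds are all assumptions made for this example; the abstract does not say which filtering techniques are used.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class ContentItem:
    """Hypothetical record for one social media post."""
    author: str
    language: str
    text: str


def candidate_pairs(items, min_ratio=0.5, max_ratio=2.0):
    """Return same-author, cross-language pairs that pass a cheap filter.

    The filter (an assumption, not taken from the patent) keeps a pair only
    if the word-count ratio of the two texts lies in [min_ratio, max_ratio].
    """
    # Group posts by author so only same-author items are compared.
    by_author = {}
    for item in items:
        by_author.setdefault(item.author, []).append(item)

    pairs = []
    for posts in by_author.values():
        for a, b in combinations(posts, 2):
            if a.language == b.language:
                continue  # a translation pair needs two different languages
            len_a, len_b = len(a.text.split()), len(b.text.split())
            if len_a == 0 or len_b == 0:
                continue  # skip empty texts to avoid dividing by zero
            if min_ratio <= len_a / len_b <= max_ratio:
                pairs.append((a, b))
    return pairs


if __name__ == "__main__":
    posts = [
        ContentItem("ana", "en", "Happy birthday to my dear friend"),
        ContentItem("ana", "es", "Feliz cumpleaños a mi querido amigo"),
        ContentItem("ana", "en", "ok"),
        ContentItem("bob", "en", "Hello there"),
    ]
    for first, second in candidate_pairs(posts):
        print(first.text, "<->", second.text)
```

Pairs that survive a cheap filter like this are only candidates. Per the abstract, actual translation pairs still have to be located among them before they are used to retrain a machine translation engine.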

Public/Granted literature

US20160162575A1 MINING MULTI-LINGUAL DATA Public/Granted day: 2016-06-09
