Invention Grant
- Patent Title: Aligning hierarchal and sequential document trees to identify parallel data
- Patent Title (Chinese, translated): Aligning hierarchical and sequential document trees to identify parallel data
- Application No.: US11483941
- Application Date: 2006-07-10
- Publication No.: US07805289B2
- Publication Date: 2010-09-28
- Inventor: Ming Zhou, Cheng Niu, Lei Shi
- Applicant: Ming Zhou, Cheng Niu, Lei Shi
- Applicant Address: US WA Redmond
- Assignee: Microsoft Corporation
- Current Assignee: Microsoft Corporation
- Current Assignee Address: US WA Redmond
- Agency: Westman, Champlin & Kelly, P.A.
- Main IPC: G06F17/28
- IPC: G06F17/28; G06F17/20

Abstract:
A set of candidate parallel pages is identified based on trigger words in one or more pages downloaded from a given network location (such as a website). Document trees representing the candidate pages are aligned to identify translationally parallel content and hyperlinks. The parallel content is then fed into a conventional sentence aligner to extract parallel sentences. The parallel hyperlinks usually refer to other parallel documents, so following them leads to recursive mining of further parallel documents.
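
The first step in the abstract, finding candidate parallel pages from trigger words, can be illustrated with a minimal sketch. It assumes the trigger words appear in hyperlink anchor text (for example a link labelled "English Version"); the abstract does not say where they are matched or which words are used, so `TRIGGER_WORDS`, `AnchorCollector`, and `candidate_parallel_links` are all hypothetical names chosen for illustration, not the patented method.

```python
from html.parser import HTMLParser

# Hypothetical trigger words; the abstract does not list the actual ones.
TRIGGER_WORDS = {"english", "chinese", "中文"}


class AnchorCollector(HTMLParser):
    """Collects (href, anchor text) pairs from an HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def candidate_parallel_links(html, trigger_words=TRIGGER_WORDS):
    """Return hrefs whose anchor text contains any trigger word (assumed heuristic)."""
    collector = AnchorCollector()
    collector.feed(html)
    return [
        href
        for href, text in collector.links
        if any(word in text.lower() for word in trigger_words)
    ]


page = (
    '<p><a href="/en/index.html">English Version</a> '
    '<a href="/about.html">About us</a></p>'
)
candidates = candidate_parallel_links(page)
```

In a fuller pipeline, each returned link would be downloaded and paired with the page it came from before any tree alignment is attempted; that pairing and the alignment itself are outside this sketch.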
Public/Granted literature
- US20080010056A1 Aligning hierarchal and sequential document trees to identify parallel data (Public/Granted day: 2008-01-10)