Invention Grant
- Patent Title: Systems and methods for converting legacy and proprietary documents into extended mark-up language format
- Application No.: US11598083
- Application Date: 2006-11-13
- Publication No.: US07730396B2
- Publication Date: 2010-06-01
- Inventor: Boris Chidlovskii, Hervé Dejean
- Applicant: Boris Chidlovskii, Hervé Dejean
- Applicant Address: US CT Norwalk
- Assignee: Xerox Corporation
- Current Assignee: Xerox Corporation
- Current Assignee Address: US CT Norwalk
- Agency: Oliff & Berridge, PLC
- Main IPC: G06F17/22
- IPC: G06F17/22

Abstract:
A system and method for converting legacy and proprietary documents into extended mark-up language format treats the conversion as a transformation of ordered trees of one schema and/or model into ordered trees of another schema and/or model. In embodiments, the tree transformers are coded using a learning method that decomposes the conversion task into three components: path re-labeling, structural composition, and input tree traversal, each of which involves learning approaches. Transforming an input tree into an output tree may involve decomposing the input document, labeling components of the input tree with valid labels or paths from a particular output schema, composing the labeled elements into an output tree with a valid structure, and finding a traversal of the input tree that achieves the correct composition of the output tree and applies structural rules.
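
The three components named in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: the learned path re-labeler is replaced by a hand-written lookup table, the traversal is fixed to document order, and the single structural rule (a heading opens a new section) is hard-coded. The names `Node`, `RELABEL`, `compose`, `convert`, and `to_xml`, as well as the HTML-like input and the `article` output schema, are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from xml.sax.saxutils import escape


@dataclass
class Node:
    """One element of an ordered tree; leaves carry text."""
    label: str
    text: str = ""
    children: list = field(default_factory=list)


def leaves_in_document_order(node, path=()):
    """Input tree traversal: depth-first, left to right, one (path, text) per leaf."""
    path = path + (node.label,)
    if not node.children:
        yield "/".join(path), node.text
        return
    for child in node.children:
        yield from leaves_in_document_order(child, path)


# Path re-labeling: a hand-written table standing in for a learned classifier
# (assumption; the abstract says this step is learned).
RELABEL = {
    "html/body/h1": "article/title",
    "html/body/p/b": "article/section/heading",
    "html/body/p": "article/section/para",
}


def compose(labeled, openers=frozenset({"heading"})):
    """Structural composition: merge (output_path, text) pairs into one tree.

    Assumes every output path has at least two components and the same root.
    A leaf whose label is in `openers` always starts a new parent element;
    this is an illustrative structural rule, not one taken from the patent.
    """
    root = None
    for out_path, text in labeled:
        *ancestors, leaf = out_path.split("/")
        if root is None:
            root = Node(ancestors[0])
        node = root
        for depth, label in enumerate(ancestors[1:], start=1):
            is_parent_of_leaf = depth == len(ancestors) - 1
            last = node.children[-1] if node.children else None
            reuse = (
                last is not None
                and last.label == label
                and not (is_parent_of_leaf and leaf in openers)
            )
            if reuse:
                node = last
            else:
                child = Node(label)
                node.children.append(child)
                node = child
        node.children.append(Node(leaf, text))
    return root


def convert(tree):
    """Traverse the input tree, re-label each leaf path, compose the output tree."""
    labeled = [(RELABEL[p], t) for p, t in leaves_in_document_order(tree)]
    return compose(labeled)


def to_xml(node):
    """Serialize the output tree as a compact XML string."""
    if not node.children:
        return f"<{node.label}>{escape(node.text)}</{node.label}>"
    inner = "".join(to_xml(child) for child in node.children)
    return f"<{node.label}>{inner}</{node.label}>"


# A small layout-oriented input tree, as a legacy document might yield.
doc = Node("html", children=[Node("body", children=[
    Node("h1", "Report"),
    Node("p", children=[Node("b", "Intro")]),
    Node("p", "First paragraph."),
    Node("p", children=[Node("b", "Method")]),
    Node("p", "Second paragraph."),
])])

print(to_xml(convert(doc)))
```

In this sketch the bold-only paragraphs become section headings, and each heading opens a new `section` element that collects the paragraphs following it, so a flat, layout-oriented input tree yields a nested, semantically labeled output tree.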