PROGRAM FOR ALLOWING COMPUTER TO CARRY OUT AUTOMATIC SELECTION OF TRANSLATION SYSTEM, AND COMPUTER-READABLE RECORDING MEDIUM RECORDING THE PROGRAM

    Publication No.: JP2003263434A

    Publication date: 2003-09-19

    Application No.: JP2002065365

    Filing date: 2002-03-11

    Abstract: PROBLEM TO BE SOLVED: To provide a program for allowing a computer to carry out automatic selection of a translation system suitable for the translation of input data, out of a plurality of translation systems. SOLUTION: Input data is translated by two translation systems TDMT and EBMT (step S10). A sentence structure score showing similarity between the input data and examples in translating the input data by the EBMT, and a DP distance showing similarity between the input data and examples in translating the input data by the EBMT, are computed (step S20). Evaluation data showing whether the TDMT and EBMT are suitable for the translation of the input data, and the sentence structure score and DP distance computed in the step S20, are used to generate a selector for selecting the translation system suitable for the translation of the input data by a decision tree learning method (step S30). COPYRIGHT: (C)2003,JPO
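
The selector generation in step S30 can be illustrated with a minimal sketch, assuming each training sentence has already been reduced to two numbers (a sentence structure score and a DP distance) plus a label saying which system was judged suitable. The function names, the Gini-based split criterion, the depth limit, and the toy training values are all assumptions made for illustration; the abstract only states that a decision tree learning method is used.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return (feature_index, threshold) with the lowest weighted Gini, or None."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for f in range(len(rows[0])):
        values = sorted({r[f] for r in rows})
        # Candidate thresholds lie midway between neighbouring feature values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels, depth=0, max_depth=3):
    """Grow a binary tree; a leaf is a system name, a node is a 4-tuple."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (
        f,
        t,
        build_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
        build_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth),
    )

def select(tree, features):
    """Walk the tree and return the name of the selected translation system."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if features[f] <= t else right
    return tree

# Hypothetical training data: (sentence structure score, DP distance) -> suitable system.
# The numbers are invented for this sketch and do not come from the patent.
train = [
    ((0.9, 0.6), "TDMT"),
    ((0.8, 0.5), "TDMT"),
    ((0.7, 0.7), "TDMT"),
    ((0.3, 0.1), "EBMT"),
    ((0.4, 0.0), "EBMT"),
    ((0.2, 0.2), "EBMT"),
]
tree = build_tree([x for x, _ in train], [y for _, y in train])
print(select(tree, (0.85, 0.4)))
print(select(tree, (0.25, 0.05)))
```

Once trained, the tree plays the role of the selector: new input is scored the same way and `select` returns the system to use. A real implementation would compute the two scores from the translation systems themselves rather than take them as given.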

    LANGUAGE DATA COLLECTING METHOD
    Invention patent

    Publication No.: JP2003263430A

    Publication date: 2003-09-19

    Application No.: JP2002064099

    Filing date: 2002-03-08

    Abstract: PROBLEM TO BE SOLVED: To efficiently collect large-scale data by collecting words and phrases divided into cells and adding synonyms and the like to the cells. SOLUTION: When an original sentence written in English is presented (S1), a Japanese translation of it (a Japanese sentence) is inputted (S3). When a first prescribed number (at least two, for instance) of sentences have been inputted (S5), the sentences are divided into character strings such as words or phrases. At this time, identical character strings are put together in the same cell, and different character strings are distributed to different cells (S7). Partial information such as synonyms and related words of the character string is added to every cell (S11). The words or phrases, i.e., language data, are thus collected divided into cells. COPYRIGHT: (C)2003,JPO
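
The cell-building steps (S7 and S11) can be sketched roughly as follows, assuming the Japanese sentences have already been segmented into space-separated words or phrases. The function name `collect_cells`, the dictionary layout of a cell, the romanized example sentences, and the synonym table are all hypothetical; the abstract does not specify how segmentation or synonym lookup is done.

```python
def collect_cells(sentences, synonyms=None):
    """Group identical character strings from several sentences into shared cells.

    sentences: list of pre-segmented sentences (tokens separated by spaces).
    synonyms:  optional mapping from a character string to related strings.
    """
    synonyms = synonyms or {}
    cells = {}
    for idx, sentence in enumerate(sentences):
        for token in sentence.split():
            # Identical strings land in the same cell; a new string opens a new cell.
            cell = cells.setdefault(token, {"sentences": [], "related": []})
            if idx not in cell["sentences"]:
                cell["sentences"].append(idx)
    # Attach partial information (synonyms, related words) to every cell.
    for token, cell in cells.items():
        cell["related"] = list(synonyms.get(token, []))
    return cells

# Two hypothetical, pre-segmented translations of the same English sentence.
translations = [
    "watashi wa hon o yomu",
    "boku wa hon o yomu",
]
cells = collect_cells(translations, synonyms={"hon": ["shoseki"]})
for token, cell in cells.items():
    print(token, cell)
```

Strings shared by both translations end up in one cell that records both sentence indices, while strings that differ between translations each get a cell of their own.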
