-
Publication No.: JP2003263434A
Publication date: 2003-09-19
Application No.: JP2002065365
Filing date: 2002-03-11
Applicant: ATR ADVANCED TELECOMM RES INST
Inventor: YASUDA YOSHIYUKI , SUGAYA FUMIAKI , TAKEZAWA TOSHIYUKI , YAMAMOTO SEIICHI
IPC: G06F17/28
Abstract: PROBLEM TO BE SOLVED: To provide a program that causes a computer to automatically select, from a plurality of translation systems, the one suitable for translating input data. SOLUTION: Input data is translated by two translation systems, TDMT and EBMT (step S10). A sentence structure score and a DP distance, each showing the similarity between the input data and the examples used when the EBMT translates the input data, are computed (step S20). Evaluation data showing whether the TDMT and the EBMT are suitable for translating the input data, together with the sentence structure score and DP distance computed in step S20, are used to generate, by a decision tree learning method, a selector that selects the translation system suitable for translating the input data (step S30). COPYRIGHT: (C)2003,JPO
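Illustrative sketch (not part of the patent record): steps S20 and S30 can be pictured roughly as follows. Everything here is an assumption made for illustration. The abstract does not define the DP distance, so a word-level edit distance normalized by the longer sequence length is used as a stand-in. The abstract's selector is learned from two features by a decision tree method; this sketch learns only a depth-1 tree (a single threshold) on the DP distance. The names `dp_distance`, `learn_stump`, and `select_system` are hypothetical, as is the sample data in the usage note.

```python
def dp_distance(input_words, example_words):
    """Word-level edit distance computed by dynamic programming,
    normalized by the longer sequence length (assumed definition)."""
    n, m = len(input_words), len(example_words)
    if n == 0 and m == 0:
        return 0.0
    prev = list(range(m + 1))  # distances for the empty input prefix
    for i in range(1, n + 1):
        curr = [i] + [0] * m
        for j in range(1, m + 1):
            cost = 0 if input_words[i - 1] == example_words[j - 1] else 1
            curr[j] = min(prev[j] + 1,         # deletion
                          curr[j - 1] + 1,     # insertion
                          prev[j - 1] + cost)  # match or substitution
        prev = curr
    return prev[m] / max(n, m)


def learn_stump(samples):
    """Learn a depth-1 decision tree from (dp_distance, label) pairs,
    where label is 'EBMT' or 'TDMT' (hypothetical evaluation data).
    Returns the threshold t with the fewest training errors for the
    rule: choose 'EBMT' when distance <= t, otherwise 'TDMT'."""
    # -1.0 lets the learner fall back to "always TDMT".
    candidates = [-1.0] + sorted({d for d, _ in samples})

    def errors(t):
        return sum((("EBMT" if d <= t else "TDMT") != label)
                   for d, label in samples)

    return min(candidates, key=errors)


def select_system(distance, threshold):
    """Apply the learned rule to a new input's DP distance."""
    return "EBMT" if distance <= threshold else "TDMT"
```

With hypothetical evaluation data such as `[(0.0, "EBMT"), (0.1, "EBMT"), (0.5, "TDMT"), (0.8, "TDMT")]`, `learn_stump` returns a threshold that `select_system` then applies to the distance computed for a new input. A fuller version would add the sentence structure score as a second feature and grow a deeper tree.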
-
Publication No.: JP2003263430A
Publication date: 2003-09-19
Application No.: JP2002064099
Filing date: 2002-03-08
Applicant: ATR ADVANCED TELECOMM RES INST
Inventor: SUGAYA FUMIAKI , KANESHIRO YUMIKO , TAKEZAWA TOSHIYUKI , KIKUI GENICHIRO , YAMAMOTO SEIICHI
IPC: G06F17/28
Abstract: PROBLEM TO BE SOLVED: To efficiently collect large-scale data by collecting words and phrases divided into cells and adding synonyms and the like to the cells. SOLUTION: When an original sentence written in English is presented (S1), a translation of it into Japanese (a Japanese sentence) is input (S3). When a first prescribed number of sentences (at least two, for instance) have been input (S5), the sentences are divided into character strings such as words or phrases. At this time, identical character strings are put together in the same cell, and different character strings are distributed to different cells (S7). Partial information such as synonyms and related words of the character string is added to each cell (S11). The words or phrases, i.e., language data, are thus collected in a form divided into cells. COPYRIGHT: (C)2003,JPO
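Illustrative sketch (not part of the patent record): steps S7 and S11 can be pictured as grouping identical character strings into one cell and then attaching related words to each cell. The assumptions are stated plainly: sentences are taken to be already segmented with spaces (real Japanese text would need a word segmenter), the cell layout is invented for illustration, and the function names `build_cells` and `add_related` are hypothetical.

```python
def build_cells(sentences):
    """Split each sentence into strings and place identical strings in
    the same cell, different strings in different cells (cf. S7).
    Assumes whitespace-separated tokens."""
    cells = {}
    for sentence in sentences:
        for token in sentence.split():
            # One cell per distinct string; count how often it was seen.
            cell = cells.setdefault(token, {"count": 0, "related": set()})
            cell["count"] += 1
    return cells


def add_related(cells, token, related_words):
    """Attach synonyms or related words to an existing cell (cf. S11)."""
    cells[token]["related"].update(related_words)
```

For example, calling `build_cells` on two romanized placeholder translations such as `"kore wa pen desu"` and `"kore wa enpitsu desu"` yields five cells, with the shared strings counted twice; `add_related` then records extra words against any one of those cells.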
-