Abstract:
A method for establishing paraphrasing data for a machine translation system includes selecting a paraphrasing target sentence through application of an object language model to a translated sentence that is obtained by machine-translating a source language sentence, extracting paraphrasing candidates that can be paraphrased with the paraphrasing target sentence from a source language corpus DB, performing machine translation with respect to the paraphrasing candidates, selecting a final paraphrasing candidate by applying the object language model to the result of the machine translation with respect to the paraphrasing candidates, and confirming the paraphrasing target sentence and the final paraphrasing candidate as paraphrasing lexical patterns using a bilingual corpus and storing the paraphrasing lexical patterns in a paraphrasing DB. According to the present invention, the consistent paraphrasing data can be established since the paraphrasing data is automatically established.