Abstract:
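The automatic translation apparatus and method using two-stage parsing according to the present invention comprise: a morphological analysis unit that analyzes the morphemes of an input sentence; a tagging unit that determines part-of-speech candidates for each morpheme; a parsing unit that first parses the tagged input sentence centered on its verb phrases and then parses the input sentence as a whole; and a transfer and generation unit that generates a translation of the input sentence based on the parse tree produced by the parsing. Among the parsing ambiguities, tagging ambiguity and noun phrase chunking ambiguity are separated from coordination and attachment ambiguity. Resolving tagging and noun phrase chunking ambiguity in the first stage, and coordination and attachment ambiguity in the second stage, raises analysis efficiency without a large loss in performance. Keywords: automatic translation, parsing, sentence segmentation, base noun phrase, verb phrase analysis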
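The benefit of separating noun phrase chunking from later attachment decisions can be illustrated with a minimal sketch. It covers only the first-stage reduction: contiguous base noun phrase tokens are collapsed into single units so that a second stage has fewer units to attach. The tag names, `NP_TAGS`, and `chunk_base_nps` are assumptions made for illustration and do not come from the abstract.

```python
from typing import List, Tuple, Union

Token = Tuple[str, str]            # (word, POS tag)
Chunk = Tuple[str, List[Token]]    # ("NP", member tokens)

# Assumed tag set for base noun phrases; a real tagger would differ.
NP_TAGS = {"DET", "ADJ", "NOUN"}


def chunk_base_nps(tagged: List[Token]) -> List[Union[Token, Chunk]]:
    """Stage 1 sketch: collapse runs of NP-internal tags into one NP unit."""
    units: List[Union[Token, Chunk]] = []
    current: List[Token] = []
    for tok in tagged:
        if tok[1] in NP_TAGS:
            current.append(tok)          # still inside a base noun phrase
        else:
            if current:                  # close the open noun phrase
                units.append(("NP", current))
                current = []
            units.append(tok)
    if current:                          # noun phrase at end of sentence
        units.append(("NP", current))
    return units


tagged = [("the", "DET"), ("old", "ADJ"), ("man", "NOUN"),
          ("saw", "VERB"), ("a", "DET"), ("dog", "NOUN")]
units = chunk_base_nps(tagged)
print(len(tagged), "tokens reduced to", len(units), "units")
```

A second stage would then resolve coordination and attachment over the reduced unit sequence; that stage is not sketched here.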
Abstract:
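The present invention relates to an apparatus and method for translating the verb '하다' in a Korean-Chinese automatic translation system based on verb phrase patterns. Its object is to reduce the number of verb phrase patterns the system requires, while still enabling high-quality translation, by removing the need to build separate '하다' verb phrase patterns for handling the 'X를 하다' construction. The Korean structure analyzer according to the invention performs: a first step of recognizing whether an 'X를 하다' construction is present in the input and converting it into an 'X하다' construction; a second step of retrieving 'X하다' verb phrase patterns from a database and selecting the optimal 'X하다' verb phrase pattern that satisfies the constraints of the 'X를 하다' construction received in the first step; a third step of analyzing the syntactic structure of the Korean input sentence using the 'X하다' verb phrase pattern; and a fourth step of replacing the 'X하다' result parsed in the third step with the 'X를 하다' structure. In addition, the target-sentence converter of the invention performs: a fifth step of recognizing whether an 'X를 하다' construction in which 'X' is modified by an adnominal clause has been input; a sixth step of determining, when 'X' is so modified, which type the 'X를 하다' construction belongs to; and a seventh step of processing the adnominal word or clause according to that type.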
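The first and fourth steps (rewriting 'X를 하다' as 'X하다' and restoring it afterwards) can be sketched as follows. This is a minimal illustration under stated assumptions: the morpheme representation, the tag names `NOUN`, `PARTICLE`, and `VERB`, and the function names are hypothetical, and pattern selection and parsing (the second and third steps) are not shown.

```python
from typing import Dict, List, Tuple

Morph = Tuple[str, str]  # (surface form, hypothetical tag)

OBJ_PARTICLES = {"를", "을"}  # object particle variants


def merge_x_hada(morphs: List[Morph]) -> Tuple[List[Morph], Dict[int, Tuple[str, str]]]:
    """Step 1 sketch: rewrite noun + object particle + 하다 as a single X하다 verb.

    Returns the rewritten sequence and, for each merged position, the
    original noun and particle so the construction can be restored later.
    """
    out: List[Morph] = []
    merged: Dict[int, Tuple[str, str]] = {}
    i = 0
    while i < len(morphs):
        if (i + 2 < len(morphs)
                and morphs[i][1] == "NOUN"
                and morphs[i + 1][1] == "PARTICLE"
                and morphs[i + 1][0] in OBJ_PARTICLES
                and morphs[i + 2] == ("하다", "VERB")):
            merged[len(out)] = (morphs[i][0], morphs[i + 1][0])
            out.append((morphs[i][0] + "하다", "VERB"))
            i += 3
        else:
            out.append(morphs[i])
            i += 1
    return out, merged


def restore_x_hada(morphs: List[Morph], merged: Dict[int, Tuple[str, str]]) -> List[Morph]:
    """Step 4 sketch: put the original 'X를 하다' structure back."""
    out: List[Morph] = []
    for idx, morph in enumerate(morphs):
        if idx in merged:
            noun, particle = merged[idx]
            out.extend([(noun, "NOUN"), (particle, "PARTICLE"), ("하다", "VERB")])
        else:
            out.append(morph)
    return out


sentence = [("그", "PRON"), ("는", "PARTICLE"), ("공부", "NOUN"),
            ("를", "PARTICLE"), ("하다", "VERB")]
rewritten, record = merge_x_hada(sentence)
print(rewritten)
print(restore_x_hada(rewritten, record) == sentence)
```

Keeping a record of each merged position lets the restoration step run without re-analyzing the sentence, which matches the idea of reusing existing 'X하다' patterns rather than adding new ones.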
Abstract:
A hybrid automatic translation device, a method thereof, and a computer-readable recording medium recording a program are provided. The device combines a rule-based method with a translation pattern method to generate high-quality, high-coverage translation results. The aim is to minimize parsing ambiguity and the side effects of sentence division, and to increase the correctness of the phrasal pattern used for translation pattern matching by extracting only the phrase-unit result from the parsing result. A morpheme parser(101) parses morphemes from an input sentence. A tagger(102) determines the part of speech of each item in the morpheme parsing result. A parser(103) parses the tagging result and outputs a parsing tree. A phrasal pattern generator(104) generates the phrasal pattern by extracting a chunking result of the phrases included in a subcategory of verbs in the parsing tree. A phrasal pattern translator(105) translates the phrasal pattern by using a translation pattern. A clause structure parser(106) checks the clause-unit structure of the sentence when translation pattern matching for the phrasal pattern fails. A partial pattern translator(105-1) recognizes a partial phrasal pattern for each sub-clause by referring to the clause structure parsing result and performs the translation by using a partial translation pattern.
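The fallback from full phrasal pattern matching to per-clause partial patterns can be sketched as below. This is an illustration only: the string keys, the two lookup tables, and `translate_with_fallback` are hypothetical stand-ins for the translation pattern stores and the components numbered 105, 106, and 105-1.

```python
from typing import Dict, List, Optional


def translate_with_fallback(
    phrasal_pattern: str,
    clause_patterns: List[str],
    full_table: Dict[str, str],
    partial_table: Dict[str, str],
) -> Optional[str]:
    """Try the whole-sentence pattern first, then fall back clause by clause."""
    hit = full_table.get(phrasal_pattern)
    if hit is not None:
        return hit                       # full pattern matched
    parts: List[str] = []
    for clause in clause_patterns:       # clause structure from a clause parser
        part = partial_table.get(clause)
        if part is None:
            return None                  # no partial pattern either
        parts.append(part)
    return " ".join(parts)


# Toy tables; the bracketed outputs are placeholders, not real translations.
full_table = {"NP VP": "[full translation]"}
partial_table = {"NP V": "[clause 1]", "CONJ NP V": "[clause 2]"}

print(translate_with_fallback("NP VP", ["NP VP"], full_table, partial_table))
print(translate_with_fallback("NP V CONJ NP V", ["NP V", "CONJ NP V"],
                              full_table, partial_table))
```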
Abstract:
PURPOSE: A device and a method for analyzing a complex morpheme formed of multiple words are provided to enhance analysis correctness by applying syllable-based or consonant/vowel-based morpheme analysis and complex morpheme analysis as two separate steps, without using complex morpheme connection information, and by additionally constructing spacing information. CONSTITUTION: A previously analyzed dictionary database stores word-unit morpheme analysis results obtained by performing word-unit morpheme analysis of Korean sentences in advance. A preprocessor(202) receives and normalizes the sentence and uses the previously analyzed dictionary database to determine whether word-inside analysis should be applied. A word-inside morpheme analyzer(204) performs morpheme analysis within the word by using a word combination rule and an analysis algorithm. A word-outside morpheme analyzer(206) performs word-outside morpheme analysis by using the spacing information for the input sentence. A part-of-speech tagging part(208) performs morpheme tagging by using context tagging data and vocabulary tagging data.
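The role of the previously analyzed dictionary can be sketched as a lookup that skips word-inside analysis when a stored result exists. The names `analyze_word`, `pre_analyzed`, and `in_word_analyzer` are assumptions for illustration, and the fallback analyzer here is a stub rather than a real combination-rule analyzer.

```python
from typing import Callable, Dict, List, Tuple

Morph = Tuple[str, str]  # (morpheme, hypothetical tag)


def analyze_word(
    word: str,
    pre_analyzed: Dict[str, List[Morph]],
    in_word_analyzer: Callable[[str], List[Morph]],
) -> List[Morph]:
    """Use a stored analysis when available; otherwise analyze inside the word."""
    cached = pre_analyzed.get(word)
    if cached is not None:
        return cached
    return in_word_analyzer(word)


pre_analyzed = {"학교에": [("학교", "NOUN"), ("에", "PARTICLE")]}


def stub_analyzer(word: str) -> List[Morph]:
    # Placeholder for the word-inside morpheme analyzer.
    return [(word, "UNKNOWN")]


print(analyze_word("학교에", pre_analyzed, stub_analyzer))
print(analyze_word("집에서", pre_analyzed, stub_analyzer))
```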
Abstract:
PURPOSE: A passive and causative sentence structure analysis system and method are provided to improve case information determination by using case frames, and to reduce the number of case frames. CONSTITUTION: The system comprises a morpheme analyzer(10), a sentence normalizer(20), an active type case frame(30), a case frame converter(40), a case frame applicator(50), and a sentence structure analysis tree generator(60). The morpheme analyzer(10) analyzes the morphemes of an input passive or causative type sentence written in Hangul. The sentence normalizer(20) converts the morpheme analysis result into an active type sentence. The active type case frame(30) determines the case information based on the sentence normalization result of the sentence normalizer(20). The case frame converter(40) automatically converts the auxiliary words and verb conjugations of the active type sentence into those of the corresponding passive or causative type sentence to generate a passive or causative type conversion case frame. The case frame applicator(50) compares the morpheme analysis result with the conversion case frame, compares the sentence normalization result with the active type case frame, assigns a weighting factor to each case frame according to the comparison result, and selects the case frame with the highest weighting factor as the final case frame. The sentence structure analysis tree generator(60) determines the case information based on the final case frame, analyzes the input passive or causative type sentence, and generates a sentence structure analysis result.
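The conversion and weighting described for the case frame converter(40) and case frame applicator(50) can be sketched in a few lines. Everything here is an assumption for illustration: a case frame is modeled as a mapping from particle to case role, the conversion rule covers only one simple passive alternation, and the weight is a plain count of matching particles rather than the weighting factor the abstract leaves unspecified.

```python
from typing import Dict, List

CaseFrame = Dict[str, str]  # particle -> case role

# Illustrative particle mapping for one passive alternation only:
# the active subject becomes an 에게 phrase, the active object becomes the subject.
ACTIVE_TO_PASSIVE = {"가": "에게", "를": "가"}


def to_passive_frame(active: CaseFrame) -> CaseFrame:
    """Derive a passive conversion frame from an active frame."""
    return {ACTIVE_TO_PASSIVE.get(p, p): role for p, role in active.items()}


def weight(frame: CaseFrame, particles: List[str]) -> int:
    """Toy weighting: number of observed particles the frame accounts for."""
    return sum(1 for p in particles if p in frame)


def best_frame(frames: List[CaseFrame], particles: List[str]) -> CaseFrame:
    """Pick the frame with the highest weight."""
    return max(frames, key=lambda f: weight(f, particles))


active = {"가": "agent", "를": "theme"}
passive = to_passive_frame(active)
observed = ["가", "에게"]  # particles found in a passive input sentence
print(best_frame([active, passive], observed))
```

Because the passive frame is derived from the active one, only active frames need to be stored, which is the reduction in frame count that the abstract describes.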
Abstract:
PURPOSE: A long sentence division method is provided to improve the success rate of sentence frame based machine translation on a long sentence by dividing the long sentence into plural short sentences. CONSTITUTION: The method comprises steps of extracting candidate start points of a short sentence by using sentence structure data(401), extracting candidate start points of a short sentence by using a start point pattern(402), extracting candidate end points of a short sentence by using the extracted candidate start points(403), extracting candidate end points of a short sentence by using an end point pattern(404), and extracting short sentences by extracting candidate short sentences, restoring them, and then performing sentence frame matching(405). The short sentence extraction step(405) includes steps of extracting candidate short sentences between the candidate start points and the candidate end points, restoring the candidate short sentences to grammatically complete form, matching the restored short sentences against short sentence frames, determining the start points and the end points according to the matching result, and repeating these steps until the full long sentence is divided into short sentences.
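The repeated extract-and-match loop of step 405 can be sketched as follows, assuming the candidate start and end points have already been produced by steps 401 to 404. The tag sequences, the `frames` set, and the longest-match-first policy are assumptions for illustration, and the grammatical restoration of candidates is omitted.

```python
from typing import List, Optional, Set, Tuple


def divide_long_sentence(
    tags: List[str],
    starts: Set[int],
    ends: Set[int],
    frames: Set[Tuple[str, ...]],
) -> Optional[List[Tuple[int, int]]]:
    """Split a tag sequence into (start, end) spans that each match a frame.

    Returns None when some part of the sentence cannot be covered.
    """
    pieces: List[Tuple[int, int]] = []
    pos = 0
    while pos < len(tags):
        if pos not in starts:
            return None                  # not a candidate start point
        match_end = None
        # Try the longest candidate short sentence first (assumed policy).
        for end in sorted((e for e in ends if e >= pos), reverse=True):
            if tuple(tags[pos:end + 1]) in frames:
                match_end = end
                break
        if match_end is None:
            return None                  # no short sentence frame matched
        pieces.append((pos, match_end))
        pos = match_end + 1              # continue with the remainder
    return pieces


tags = ["NP", "VP", "CONJ", "NP", "VP"]
frames = {("NP", "VP"), ("CONJ", "NP", "VP")}
print(divide_long_sentence(tags, starts={0, 2}, ends={1, 4}, frames=frames))
```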
Abstract:
PURPOSE: A long sentence translation system and method are provided to reduce the length of a long sentence to be translated through sentence division and short sentence translation or replacement, and to repeat sentence frame matching and the combination of partial sentence frames step by step, in order to produce a semantically natural translation and to solve the coverage problem that occurs in sentence frame translation. CONSTITUTION: The method comprises steps of performing preprocessing such as morpheme analysis, part-of-speech determination, recognition of fixed expressions, protector finding, and partial analysis(201), performing sentence division on the slot queues resulting from the preprocessing to divide one long English sentence into one or more short sentences(202), recognizing the divided short sentences and translating them by sentence frame matching(203), determining whether the short sentences are successfully translated(204), if the short sentences are successfully translated, replacing each divided sentence that corresponds to a partial sentence frame with a sentence structure node and translating all the sentence frames while searching the sentence frames(205), determining whether all the sentence frames are successfully translated(206), if they are not, attempting a translation by combining the partial sentence frames according to an inter-phrase structure analysis rule(207), and checking whether the partial sentence translation is successful(208).
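Steps 203 to 205 (translate each short sentence, replace it with a sentence structure node, then match the shortened sentence against a full frame) can be sketched as below. The node label `"S"`, the frame tables, the template syntax, and the toy English-to-Korean entries are all assumptions for illustration. The fallback of step 207 is not shown.

```python
from typing import Dict, List, Optional, Tuple

Frame = Tuple[str, ...]


def translate_long(
    slots: List[str],
    spans: List[Tuple[int, int]],
    short_frames: Dict[Frame, str],
    full_frames: Dict[Frame, str],
) -> Optional[str]:
    """Replace translated short sentences with an "S" node, then match the rest.

    spans are inclusive, non-overlapping (start, end) positions of the
    divided short sentences.
    """
    reduced: List[str] = []
    parts: List[str] = []
    pos = 0
    for start, end in sorted(spans):
        key = tuple(slots[start:end + 1])
        if key not in short_frames:
            return None                      # short sentence translation failed
        reduced.extend(slots[pos:start])     # keep the material before the span
        reduced.append("S")                  # sentence structure node
        parts.append(short_frames[key])
        pos = end + 1
    reduced.extend(slots[pos:])
    template = full_frames.get(tuple(reduced))
    if template is None:
        return None                          # full frame translation failed
    return template.format(*parts)           # fill each S node in order


slots = ["I", "know", "that", "he", "came"]
short_frames = {("he", "came"): "그가 왔다"}
full_frames = {("I", "know", "that", "S"): "나는 {0}는 것을 안다"}
print(translate_long(slots, [(3, 4)], short_frames, full_frames))
```

Replacing a translated clause with a single node shortens the sequence that the full sentence frame has to match, which is how the method reduces the effective length of the long sentence.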
Abstract:
PURPOSE: An apparatus for automatic translation using a sentence frame that includes protectors and syntax nodes is provided to offer a translation function to natural language processing components by rapidly producing a natural, high-quality translation of an original sentence. CONSTITUTION: The original sentence is input to a morpheme analysis part(102) via an input part(101). The morpheme analysis part(102) analyzes the morphemes of each word in the input original sentence. A part-of-speech determination part(103) decides the part of speech of each analyzed word. A fixed expression recognizing part(104) attaches a new part of speech to recognized fixed expressions. A protector finding part(105) designates as protectors the parts of speech and the words that serve an important role in the sentence. A partial syntax analysis part(106) analyzes the words between protectors and attaches a proper syntax tag to the analyzed words. An original sentence frame generation part(107) creates the original sentence frame for the input original sentence.