Abstract:
PURPOSE: A rule-based syntax analysis device and method are provided that combine high processing performance with the efficiency of the rule-based approach by resolving ambiguity with lexical dependency information drawn from a corpus annotated with syntax trees. CONSTITUTION: A rule-based parsing module (103) parses an input sentence according to syntax rules and selects the optimal syntax tree. A rule weight calculating module (105) calculates a rule weight from the lexical weight and the rule probability of each rule applied to the input sentence, and provides that weight to the rule-based parsing module.
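The weighting step can be illustrated with a minimal sketch. The abstract does not say how the lexical weight and the rule probability are combined, so the product used here is an assumption, as are the function names and the toy numbers.

```python
import math

def rule_weight(rule_prob, lexical_weight):
    """Assumed combination: the product of rule probability and lexical weight."""
    return rule_prob * lexical_weight

def tree_score(applied_rules):
    """Sum of log rule weights over the rules used in one candidate tree."""
    return sum(math.log(rule_weight(p, w)) for p, w in applied_rules)

def select_best_tree(candidates):
    """candidates maps a tree label to its list of (rule_prob, lexical_weight)."""
    return max(candidates, key=lambda name: tree_score(candidates[name]))

# Hypothetical ambiguity: the two trees differ only in the lexical weight
# of the second rule, so lexical dependency decides between them.
candidates = {
    "attach_to_verb": [(0.6, 0.9), (0.5, 0.8)],
    "attach_to_noun": [(0.6, 0.9), (0.5, 0.2)],
}
best = select_best_tree(candidates)
print(best)
```

Working in log space keeps the score a simple sum and avoids underflow when a tree uses many rules.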
Abstract:
PURPOSE: A post-processing knowledge generation apparatus is provided to improve translation performance by correcting translation errors based on post-processing knowledge. CONSTITUTION: An original text extracting unit (204) extracts the original text from a parallel corpus. A machine translation unit (206) machine-translates the original text to create a machine translation corpus. An automatic alignment unit (210) statistically aligns the machine translation corpus with the correct translation corpus extracted from the parallel corpus. An extracting unit (212) extracts text alignment information from the alignment result. A filter (214) corrects errors in the text alignment information and creates the post-processing knowledge.
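The align, extract, and filter sequence can be sketched as follows. This is only an illustration: the standard-library `difflib.SequenceMatcher` stands in for the statistical aligner, a frequency threshold stands in for the filter, and the function name, the `min_count` parameter, and the sample sentences are all hypothetical.

```python
from collections import Counter
from difflib import SequenceMatcher

def extract_corrections(mt_sentences, ref_sentences, min_count=2):
    """Collect (MT phrase -> reference phrase) substitutions seen at least min_count times."""
    counts = Counter()
    for mt, ref in zip(mt_sentences, ref_sentences):
        mt_tok, ref_tok = mt.split(), ref.split()
        # Token-level alignment of the MT output against the reference translation.
        matcher = SequenceMatcher(a=mt_tok, b=ref_tok, autojunk=False)
        for op, i1, i2, j1, j2 in matcher.get_opcodes():
            if op == "replace":
                pair = (" ".join(mt_tok[i1:i2]), " ".join(ref_tok[j1:j2]))
                counts[pair] += 1
    # Assumed filter: rare substitutions are treated as alignment noise and dropped.
    return {wrong: right for (wrong, right), n in counts.items() if n >= min_count}

mt_corpus = ["he go to school", "she go to work", "they eat rice"]
ref_corpus = ["he goes to school", "she goes to work", "they eat rice"]
knowledge = extract_corrections(mt_corpus, ref_corpus)
print(knowledge)
```

If the same MT phrase maps to several reference phrases above the threshold, this sketch keeps only one of them; a fuller version would rank the alternatives.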
Abstract:
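The present invention relates to a technique for specializing the translation dictionary of an automatic translation system when the domain changes. Co-occurring words are extracted from a source-language corpus and a target-language corpus belonging to the target domain and mapped onto the translation dictionary to obtain translation candidates; errors in the translation relations are filtered out, and a representative translation is then determined and reflected in the translation dictionary. The translation dictionary of the automatic translation system can thus be specialized automatically, reducing the cost of building it. Automatic translation, translation dictionary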
Abstract:
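The present invention relates to a method for removing transitions whose input symbol is null, for example ε (epsilon) input transitions, from a weighted finite state transducer. While preserving the equivalence of the weighted finite state transducer, ε input transitions are removed in a way that minimizes the size of the transducer, using a different removal method depending on the type of each state (or node). Weighted finite state transducer, null symbol transition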
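For orientation, the sketch below shows generic closure-based removal of ε input transitions. It is not the state-type-dependent procedure of the invention, whose details the abstract does not give. It assumes tropical weights (added along a path, minimized across paths), output labels stored as tuples so that outputs on ε input arcs can be carried forward, and no cycles made only of ε input arcs. All names are hypothetical.

```python
from collections import defaultdict

EPS = None  # hypothetical marker for a null input symbol

def epsilon_closure(state, eps_arcs):
    """Map (reached_state, accumulated_output) to the minimum ε-path weight.

    Assumes the ε input arcs contain no cycle.
    """
    best = {(state, ()): 0.0}
    stack = [(state, (), 0.0)]
    while stack:
        q, out, w = stack.pop()
        for o, w2, dst in eps_arcs.get(q, []):
            key, cand = (dst, out + o), w + w2
            if cand < best.get(key, float("inf")):
                best[key] = cand
                stack.append((dst, out + o, cand))
    return best

def remove_epsilon_inputs(arcs, finals):
    """arcs: list of (src, ilabel, olabel_tuple, weight, dst).
    finals: dict mapping a state to {final_output_tuple: weight}.
    Returns an equivalent (arcs, finals) pair with no ε input arcs.
    """
    eps_arcs, real_arcs = defaultdict(list), defaultdict(list)
    states = set(finals)
    for src, i, o, w, dst in arcs:
        states.update((src, dst))
        if i is EPS:
            eps_arcs[src].append((o, w, dst))
        else:
            real_arcs[src].append((i, o, w, dst))

    new_arcs, new_finals = {}, defaultdict(dict)
    for q in states:
        for (r, out), w in epsilon_closure(q, eps_arcs).items():
            # Pull every non-ε arc leaving r back to q, prefixing the ε-path output.
            for i, o, w2, dst in real_arcs.get(r, []):
                key = (q, i, out + o, dst)
                new_arcs[key] = min(new_arcs.get(key, float("inf")), w + w2)
            # q becomes final if a final state is reachable through ε inputs alone.
            for f_out, f_w in finals.get(r, {}).items():
                k = out + f_out
                new_finals[q][k] = min(new_finals[q].get(k, float("inf")), w + f_w)
    result = [(q, i, o, w, dst) for (q, i, o, dst), w in new_arcs.items()]
    return result, dict(new_finals)

arcs = [
    (0, EPS, ("x",), 1.0, 1),   # ε input arc that still emits output "x"
    (1, "a", ("y",), 2.0, 2),
    (0, "b", (), 0.5, 2),
]
finals = {2: {(): 0.0}}
clean_arcs, clean_finals = remove_epsilon_inputs(arcs, finals)
print(sorted(clean_arcs))
```

The sketch leaves state 1 and its arc in place even though nothing reaches it any more; pruning such states is a separate trimming pass, and it is one place where the size of the result can be reduced.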
Abstract:
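The present invention relates to a morpheme part-of-speech tagging method using a prosody model, and to an apparatus therefor. The morphemes of a speech recognition result or of a transcribed text sentence are analyzed. For portions whose analysis is ambiguous, prosody models built in advance from a speech DB for each morpheme part-of-speech sequence are compared with the prosody model of the input speech to find the optimal morpheme part-of-speech sequence for the input speech. That result is combined with a morpheme part-of-speech tagging method to tag the parts of speech, thereby maximizing the accuracy of morpheme part-of-speech tagging. In addition, because a prosody model is applied to the tagging, the speaker's intended meaning can be identified. Morpheme, prosody model, part of speech, tagging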
Abstract:
PURPOSE: A statistical HMM part-of-speech tagging apparatus and method are provided that improve morpheme part-of-speech tagging performance, increasing tagging accuracy for documents of a given domain by using learned information. CONSTITUTION: A real-time-learning-based statistical part-of-speech tagging unit (103) uses the lexical probabilities stored in a lexical probability information DB and the context probabilities stored in a context probability information DB. A real-time-learning-based tagging error correction unit (105) corrects errors by means of a tagging error correction DB. A real-time document information learning unit (101) extracts context probability information and then builds the real-time context probability information DB.
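How lexical and context probabilities combine in a statistical HMM tagger can be shown with a small Viterbi decoder. This is a generic textbook sketch, not the apparatus described above: the real-time learning and error correction units are omitted, and the probability tables, the `floor` smoothing value, and the tag set are invented for illustration.

```python
import math

def viterbi(words, tags, start_p, trans_p, emit_p, floor=1e-6):
    """Return the most probable tag sequence for a non-empty list of words.

    trans_p holds context probabilities keyed by (previous_tag, tag);
    emit_p holds lexical probabilities keyed by (tag, word).
    Unseen events fall back to the assumed constant `floor`.
    """
    def lp(table, key):
        return math.log(table.get(key, floor))

    # Each column maps a tag to (best log score ending in that tag, backpointer).
    best = [{t: (lp(start_p, t) + lp(emit_p, (t, words[0])), None) for t in tags}]
    for word in words[1:]:
        column = {}
        for t in tags:
            prev, score = max(
                ((p, best[-1][p][0] + lp(trans_p, (p, t))) for p in tags),
                key=lambda pair: pair[1],
            )
            column[t] = (score + lp(emit_p, (t, word)), prev)
        best.append(column)

    # Follow the backpointers from the best final tag to the first word.
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for column in reversed(best[1:]):
        tag = column[tag][1]
        path.append(tag)
    return path[::-1]

tags = ["N", "V"]
start_p = {"N": 0.8, "V": 0.2}
trans_p = {("N", "V"): 0.7, ("N", "N"): 0.3, ("V", "N"): 0.6, ("V", "V"): 0.4}
emit_p = {("N", "time"): 0.6, ("V", "time"): 0.1,
          ("N", "flies"): 0.3, ("V", "flies"): 0.5}
tagged = viterbi(["time", "flies"], tags, start_p, trans_p, emit_p)
print(tagged)
```

In a domain-adaptive setting, the `trans_p` table is the part that would be re-estimated from the documents being processed.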
Abstract:
PURPOSE: An apparatus and method for searching word entries in a portable electronic dictionary are provided that output N-best recognition results and let the user select one of them when looking up a foreign-language word through voice recognition. CONSTITUTION: A pre-processing unit receives a voice signal that is either the dictionary pronunciation of a word or a combination of the continuously pronounced letters that make up the word. A word network configuration unit (512) receives pronunciation strings from stored multi-pronunciation dictionary information and configures a network by matching them against the extracted phoneme sequence. A searching unit (514) refers to a triphone-unit acoustic model supplied by a training unit and to the configured network in order to find the word corresponding to the voice signal.
Abstract:
PURPOSE: A neologism selection device and method are provided to determine the priority of neologism candidates based on topic and to select neologisms according to that priority. CONSTITUTION: A morpheme analyzer (102) performs morphological analysis on an input web document. A keyword extractor (106) extracts keywords corresponding to neologism candidates from the analyzed sentences. A topic detecting and tracking unit (108) performs topic detection and topic tracking using the extracted keywords. A priority determination unit (110) determines the priority of each keyword according to its topic keyword weight and then selects neologisms according to that priority.
Abstract:
PURPOSE: A method for removing null symbol transitions in a weighted finite state transducer is provided that removes all ε input transitions except loops formed by ε input transitions, thereby minimizing the size of the weighted finite state transducer according to the form of the input transitions across the whole transducer. CONSTITUTION: An input transition search unit (100) traverses the whole weighted finite state transducer and searches for null symbol transitions, for example ε input transitions. An input transition determination unit (102) determines the type of each ε input transition. An input transition removing unit (104) removes the ε input transition according to the determined type. If an input symbol is null, forward processing suited to the node type is performed. If the input symbol is not null, processing moves on to the next transition.
Abstract:
PURPOSE: A method and device for generating translated sentences in an automatic English-Korean translation system are provided to automatically generate the Korean prefinal ending that corresponds to the information an English auxiliary verb or temporal adverbial phrase conveys to the verb. CONSTITUTION: An English morpheme analyzer (100) analyzes an input English sentence in morpheme units. A structure analyzer (300) analyzes the English structure. An English transforming unit (400) converts the structure analysis result into a Korean structure. A Korean generator (500) generates the Korean structure in Korean morpheme units. A prefinal-ending processor (600) passes to the Korean generator the Korean prefinal-ending information corresponding to the information about the English auxiliary verb group.