Abstract:
PURPOSE: A voice recognition apparatus and method based on multiple searches are provided, which perform multiple searches on an input speech signal and improve voice recognition performance by using both an FSN(Finite State Network) mode and an N-gram mode. CONSTITUTION: A speech feature extraction block(102) extracts feature data from the input voice signal. A language model database(108) stores an FSN language model and an N-gram language model. A multi-search block(104) performs a first voice search and a second voice search in parallel. The multi-search block then creates an integrated search network and outputs the voice recognition result according to a third voice search.
Abstract:
PURPOSE: A method and a device for voice recognition using a domain ontology are provided, which build a domain ontology for the voice recognition target, generate a voice recognition grammar from the built ontology, and recognize voice through that grammar, thereby improving the performance of the voice recognition device. CONSTITUTION: When a voice signal is input through a microphone, a feature extraction unit extracts a frame-level feature vector from the voice signal(S401). An acoustic model unit models the signal characteristics of the voice signal and provides the resulting voice model to a voice recognition unit(S403). The voice recognition unit performs voice recognition using the voice model, a voice recognition dictionary(S405), and the voice recognition grammar(S407)(S409).
Abstract:
PURPOSE: A voice synthesis device and a method thereof are provided, in which the parameter values of the sections of a synthesized sound that a voice recognition unit recognizes with low reliability are readjusted automatically. CONSTITUTION: A text-to-speech synthesis unit(200) outputs a first synthesized sound by synthesizing the input text sentence. A voice recognition unit(202) performs a first voice recognition on the first synthesized sound with ambient noise added. The voice recognition unit readjusts the voice parameter values of each voice section whose recognition reliability value is lower than a predetermined threshold. A voice synthesis unit(204) outputs a second synthesized sound using the changed voice parameter values and the recognized voice.
Abstract:
An automatic Korean/English translation method and apparatus including a pattern-based automatic translation technique are provided to generate a final translation result by accurately dividing a Korean sentence. Based on the results of morphological analysis and phrase analysis, the original sentence is cut into two or more partial original sentences(S208). A pattern-based machine translation sentence and a statistical machine translation sentence are generated from the divided result(S210, S212). The partial translated sentences for the Korean original sentence are synthesized into one English sentence by using the translation results(S218).
Abstract:
A scalable coding apparatus and method are provided to divide a video signal into a base layer video signal and an enhancement layer video signal, encode the video signal using an MPEG4 SVC method, and multiplex the encoded data by layer, thereby enabling a receiver to selectively receive video encoding data by layer according to its own performance. A video encoding unit(10) receives a video signal and performs SVC(Scalable Video Coding) in an MPEG4(Moving Picture Experts Group4) SVC method to output the video encoding data of a base layer and the video encoding data of an enhancement layer. An audio encoding unit(20) receives an audio signal and performs audio encoding in an MPEG4 audio encoding method to output the audio encoding data of the base layer and, selectively, the audio encoding data of the enhancement layer. A base layer multiplexing unit(30) performs SL(Sync Layer) packetizing, PES(Packetized Elementary Stream) packetizing, section generation, and TS(Transport Stream) multiplexing on the video encoding data of the base layer. An enhancement layer multiplexing unit(40) performs SL packetizing, PES packetizing, section generation, and TS multiplexing on the video encoding data of the enhancement layer.
Abstract:
A device and a method for segmenting an English sentence are provided to improve translation accuracy of machine translation and build a full text database from a simple English raw corpus for the machine translation for an English patent document. An input processor(100) segments paragraphs from an inputted English patent document. A token segmentation part(200) segments each word included in the paragraph into a token and sets a type of the token. A sentence segmentation part(300) segments a patent sentence by using the segmented token and the token type as input for an abbreviation database(610) and a proper noun database(620). A sentence segmentation knowledge builder(700) builds the abbreviation database and the proper noun database from a patent document raw corpus automatically. A sentence transformer(400) transforms an asyntactic patent sentence segmented in the sentence segmentation part by using a sentence transformation rule database(630). An output processor(500) outputs the segmented and transformed patent sentence as a result.
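The interaction between token typing and the abbreviation and proper noun databases can be illustrated with a minimal Python sketch. It is not the patented implementation: the two sets, the token type names, and the functions `tokenize` and `segment` are hypothetical stand-ins, and the only rule shown is that a period belonging to a known abbreviation or proper noun does not end a sentence.

```python
# Hypothetical stand-ins for the abbreviation database (610)
# and the proper noun database (620); real entries would be
# built automatically from a patent raw corpus.
ABBREVIATIONS = {"fig.", "no.", "e.g.", "i.e."}
PROPER_NOUNS = {"Inc.", "Corp."}


def tokenize(paragraph):
    """Split a paragraph on whitespace and assign each token a coarse type."""
    tokens = []
    for word in paragraph.split():
        if word.lower() in ABBREVIATIONS:
            kind = "ABBREV"
        elif word in PROPER_NOUNS:
            kind = "PROPER"
        elif word[-1] in ".?!":
            kind = "END"
        else:
            kind = "WORD"
        tokens.append((word, kind))
    return tokens


def segment(paragraph):
    """Close a sentence only at tokens typed END."""
    sentences, current = [], []
    for word, kind in tokenize(paragraph):
        current.append(word)
        if kind == "END":
            sentences.append(" ".join(current))
            current = []
    if current:  # keep trailing text that lacks end punctuation
        sentences.append(" ".join(current))
    return sentences


sentences = segment("The valve shown in Fig. 2 is closed. The pump of Acme Inc. starts.")
```

In this sketch an abbreviation that happens to end a sentence is never treated as a boundary; a fuller system would need extra rules for that case.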
Abstract:
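The present invention relates to an apparatus and method for improving the efficiency of building a translation dictionary used to select target-language equivalents for technical terms that appear frequently in patent documents written in Korean. By automatically generating and presenting the dictionary information, the invention semi-automates dictionary construction work that was previously done manually. The method comprises: extracting compound-noun technical term entries and their translations by using translation information for the unit nouns and affixes that make up technical terms in patent documents; selecting translation candidates for unregistered single-noun technical terms from the extracted compound-noun technical term entries and translations; and, when no translation candidate exists, extracting and presenting example sentences of the technical term for manual construction. Automatic translation, technical term extraction, patent document translation, translation selection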
Abstract:
A statistical HMM(Hidden Markov Model) part-of-speech tagging apparatus and method that can be applied to a new domain without a tagged domain corpus are provided. Because the lexical probability of a lexicon entry varies with the domain in which it is used, the lexical probability is updated for each domain, which improves tagging accuracy in a specific domain without a tagged corpus for that domain. Tagging probability information is learned from a previously tagged corpus to construct a lexical/part-of-speech/contextual probability information database and a lexical probability information database(S210). The lexical probability information database is learned and updated domain-dependently from a raw corpus of the application domain(S220). Morpheme analysis is performed on an input sentence on the basis of a morpheme analysis dictionary database(S240). Statistical part-of-speech tagging is carried out on the morpheme analysis result based on the lexical/part-of-speech/contextual probability information database and the updated lexical probability information database(S250). Errors in the tagging result are corrected according to a tagging error correction rule database(S260).
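The effect of swapping in domain-specific lexical probabilities can be sketched with a standard Viterbi decoder in Python. This is only an illustration under stated assumptions: the tag set, every probability value, and the names `viterbi`, `GENERAL_LEX`, and `PATENT_LEX` are hypothetical, and the sketch does not show how the abstract's update step (S220) derives the new values from a raw corpus.

```python
def viterbi(words, tags, start_p, trans_p, lex_p, floor=1e-6):
    """Return the most probable tag sequence for words under a bigram HMM.

    lex_p[tag][word] is the lexical probability P(word | tag); unseen
    pairs fall back to a small floor value (an assumption of this sketch).
    """
    # Each column maps tag -> (best path probability, previous tag).
    best = [{t: (start_p[t] * lex_p[t].get(words[0], floor), None) for t in tags}]
    for word in words[1:]:
        column = {}
        for t in tags:
            column[t] = max(
                (best[-1][p][0] * trans_p[p][t] * lex_p[t].get(word, floor), p)
                for p in tags
            )
        best.append(column)
    # Backtrack from the most probable final tag.
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for column in reversed(best[1:]):
        tag = column[tag][1]
        path.append(tag)
    return path[::-1]


# Hypothetical toy model: determiner (D), noun (N), verb (V).
TAGS = ["D", "N", "V"]
START = {"D": 0.6, "N": 0.3, "V": 0.1}
TRANS = {
    "D": {"D": 0.1, "N": 0.8, "V": 0.1},
    "N": {"D": 0.1, "N": 0.3, "V": 0.6},
    "V": {"D": 0.5, "N": 0.4, "V": 0.1},
}
# Lexical probabilities from a general tagged corpus (made-up values).
GENERAL_LEX = {"D": {"the": 0.9}, "N": {"device": 0.4, "claims": 0.05}, "V": {"claims": 0.5}}
# The same table after a hypothetical update for the patent domain,
# where "claims" is assumed to be mostly a noun.
PATENT_LEX = {"D": {"the": 0.9}, "N": {"device": 0.4, "claims": 0.6}, "V": {"claims": 0.05}}

sentence = ["the", "device", "claims"]
general_tags = viterbi(sentence, TAGS, START, TRANS, GENERAL_LEX)
patent_tags = viterbi(sentence, TAGS, START, TRANS, PATENT_LEX)
```

The transition and start probabilities are shared between the two calls; only the lexical table changes, which is the part the abstract describes as domain-dependent.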
Abstract:
A method and a device for automatically generating a compound noun translation using translation co-occurrence/probability information of a translation dictionary are provided to resolve semantic ambiguity and choose among synonymous translations by automatically extracting the translation co-occurrence/probability information from the dictionary and selecting the translation based on the extracted information. A translation co-occurrence information extractor and a translation probability information extractor(107,108) respectively extract the translation co-occurrence information and the translation probability information from the translation database(106). A compound noun extractor(102) extracts the compound noun and decomposes it into noun-unit words. A context-based translation selector(103) selects the translation with the highest context probability for each word based on the translation co-occurrence information. A probability-based translation selector(104) selects the translation with the highest probability for each word based on the translation probability information. A compound noun translation generator(105) generates the translation of the extracted compound noun by combining the selected translations.
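A small Python sketch can show how co-occurrence and probability information might jointly pick a translation. The abstract does not say how the two selectors are combined, so this sketch assumes one simple policy: prefer the candidate combination with the most co-occurrence evidence and break ties by translation probability. The German-to-English toy data and the names `CANDIDATES`, `TRANSLATION_PROB`, `COOCCURRENCE`, and `translate_compound` are all hypothetical.

```python
from itertools import product

# Hypothetical toy dictionary: source noun -> candidate translations.
CANDIDATES = {"Bank": ["bank", "bench"], "Konto": ["account"]}
# Hypothetical translation probabilities P(translation | source noun).
TRANSLATION_PROB = {
    ("Bank", "bank"): 0.4,
    ("Bank", "bench"): 0.6,
    ("Konto", "account"): 1.0,
}
# Hypothetical co-occurrence counts of adjacent target-language words.
COOCCURRENCE = {("bank", "account"): 12}


def translate_compound(nouns):
    """Translate a compound noun given as a list of its unit nouns."""
    best, best_score = None, None
    for combo in product(*(CANDIDATES[n] for n in nouns)):
        # Context evidence: co-occurrence of each adjacent translation pair.
        cooc = sum(COOCCURRENCE.get(pair, 0) for pair in zip(combo, combo[1:]))
        # Fallback evidence: product of per-word translation probabilities.
        prob = 1.0
        for noun, target in zip(nouns, combo):
            prob *= TRANSLATION_PROB[(noun, target)]
        score = (cooc, prob)  # co-occurrence first, probability as tie-break
        if best_score is None or score > best_score:
            best, best_score = combo, score
    return " ".join(best)


compound = translate_compound(["Bank", "Konto"])
single = translate_compound(["Bank"])
```

With context available, the co-occurrence count overrides the more probable sense "bench"; for the isolated noun there is no context, so the probability alone decides.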
Abstract:
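A method and apparatus for reanalyzing compound-noun technical term dictionary entries according to the present invention comprise: separating single-noun technical terms from compound-noun technical terms in a technical term dictionary; generating a partial sentence by attaching a word having a predetermined part of speech to each compound-noun technical term; analyzing the morphemes of the partial sentence on the basis of the single-noun technical terms and a basic morphological analysis dictionary; and determining whether to register the compound noun according to whether the analysis shows that the generated partial sentence could be interpreted as a part of speech other than a single noun. By reanalyzing the compound-noun technical term entries that are candidates for registration in the morphological analysis dictionary, the invention judges whether deleting a compound-noun technical term would introduce analysis ambiguity and selects the technical term entries to be registered accordingly. This effectively reduces the size of the analysis dictionary, which grows with large volumes of technical terms, while maintaining analysis accuracy, thereby improving system efficiency.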