Abstract:
A statistics-based multiple pronunciation dictionary generation apparatus according to the present invention comprises: a database containing uttered and recorded speech signal files, a word-level transcription corresponding to each speech signal file, and speaker information corresponding to each speech signal file; a speech-pronunciation alignment unit that, from the speech signal files, the word-level transcriptions, and a multiple pronunciation dictionary containing a plurality of pronunciation sequences for each word, uses the alignment function of a speech recognizer to detect, for each word contained in a speech signal file, the closest pronunciation sequence in the multiple pronunciation dictionary; a word-pronunciation pair extraction unit that applies this closest-pronunciation detection to the speech signal files and word-level transcriptions stored in the database to extract word and pronunciation-sequence pairs; and a pronunciation statistics extraction unit that, based on the extracted word and pronunciation-sequence pairs, computes and stores statistical information on the pronunciation sequences of each word in the multiple pronunciation dictionary.
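As an illustration of the statistics step, the following minimal Python sketch (all names and the sample alignment output are hypothetical, not from the invention) counts how often each pronunciation variant is chosen per word and converts the counts into relative frequencies:

```python
# Minimal sketch of the pronunciation-statistics step, assuming the
# alignment stage has already produced (word, pronunciation) pairs.
from collections import Counter, defaultdict

def pronunciation_statistics(aligned_pairs):
    """Count how often each pronunciation variant was chosen per word."""
    counts = defaultdict(Counter)
    for word, pron in aligned_pairs:
        counts[word][pron] += 1
    # Convert raw counts to per-word relative frequencies.
    stats = {}
    for word, variants in counts.items():
        total = sum(variants.values())
        stats[word] = {pron: n / total for pron, n in variants.items()}
    return stats

# Hypothetical alignment output: for each spoken word, the variant of
# the dictionary entry closest to the actual utterance was selected.
pairs = [("data", "d ey t ah"), ("data", "d ae t ah"), ("data", "d ey t ah")]
print(pronunciation_statistics(pairs))
# {'data': {'d ey t ah': 0.666..., 'd ae t ah': 0.333...}}
```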
Abstract:
The present invention relates to a semantic representation processing technique applied to an automatic interpretation system. The word sequence of the speech recognition result corresponding to input speech is converted into a semantic representation; the converted semantic representation is searched against a pre-built semantic representation set database; a final semantic representation is determined according to the search result; and the determined final semantic representation is generated and output as a final sentence. By determining a semantic representation that reflects the user's intent, the performance of the automatic interpretation system can be improved. Automatic interpretation system, speech recognition, semantic representation
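The described flow can be sketched as follows; the semantic-representation converter, the database contents, and the fallback behavior are illustrative assumptions only, not the invention's actual components:

```python
# Toy pipeline: recognized word sequence -> semantic representation ->
# lookup in a pre-built semantic-representation set -> final sentence.

SEMANTIC_DB = {
    ("request", "reserve", "room"): "I would like to reserve a room.",
}

def to_semantic(words):
    """Toy converter: keep only content words as a semantic tuple."""
    content = [w for w in words if w not in {"please", "a", "the"}]
    return tuple(content)

def interpret(words):
    rep = to_semantic(words)
    # Decide the final representation by searching the DB; fall back
    # to the raw representation when no entry matches.
    if rep in SEMANTIC_DB:
        return SEMANTIC_DB[rep]
    return " ".join(rep)

print(interpret(["please", "request", "reserve", "a", "room"]))
```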
Abstract:
A method and a device for discriminating a lip motion image are provided, enabling labeling over an online network and realizing SVM (Support Vector Machine) pattern classification. Face motion image frames received from an imaging unit are analyzed, and final candidates estimated to be a lip motion image are extracted (S10). The lip motion image is determined by classifying the final candidates on a coordinate plane based on a discrimination feature of the lip motion image (S20). Among the classified final candidates, those positioned in a critical region of the classification criterion are determined to be the lip motion image based on a region separation line (S80).
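A minimal sketch of the classification step, assuming scikit-learn's SVC as the SVM implementation (the abstract does not name a library) and a toy threshold rule standing in for the region separation line:

```python
# Candidates whose decision value falls inside a small band around the
# separating hyperplane are the "critical region" cases, resolved here
# by a secondary rule. Features and thresholds are hypothetical.
import numpy as np
from sklearn.svm import SVC

# Hypothetical 2-D discrimination features for candidate regions.
X_train = np.array([[0.1, 0.2], [0.2, 0.1], [0.8, 0.9], [0.9, 0.8]])
y_train = np.array([0, 0, 1, 1])          # 0: non-lip, 1: lip motion

clf = SVC(kernel="linear").fit(X_train, y_train)

candidates = np.array([[0.15, 0.15], [0.5, 0.5], [0.85, 0.85]])
scores = clf.decision_function(candidates)

BAND = 0.5   # width of the critical region around the hyperplane
for x, s in zip(candidates, scores):
    if abs(s) < BAND:
        # Near the boundary: apply a secondary region-separation rule
        # (stubbed here as a simple threshold on the first feature).
        label = int(x[0] > 0.5)
    else:
        label = int(s > 0)
    print(x, "->", "lip" if label else "non-lip")
```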
Abstract:
A method and system for generating POI allomorphs for navigation are provided, producing a plurality of allomorphs so that a location information service can be reached not only through the normal expression but also through its allomorphs. A normal expression DB(10) stores the normal expression of a POI for navigation. An allomorph DB(20) stores allomorphs that are derived in different forms but carry the same meaning as the normal expression. A preprocessing module(30) selects syllables in the normal expression whose alteration could damage the core meaning or introduce errors, and pre-processes the corresponding syllables as unchangeable. An allomorph creation module(40) applies predetermined creation rules, according to types classified by the allomorph creation principle, to generate a plurality of allomorphs from the normal expression.
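A toy sketch of the two-stage flow, with a hypothetical protected-token set and creation rules standing in for the patent's syllable-level rule types:

```python
# Step 1: pre-process protected tokens as unchangeable.
# Step 2: apply type-classified creation rules to derive variants.

PROTECTED = {"station"}   # tokens whose change would damage the core meaning

RULES = [
    ("saint", "st."),      # abbreviation-type rule
    ("mount", "mt."),
    ("center", "centre"),  # spelling-variant-type rule
]

def generate_allomorphs(normal_expression):
    tokens = normal_expression.lower().split()
    variants = set()
    for i, tok in enumerate(tokens):
        if tok in PROTECTED:
            continue                      # pre-processed as unchangeable
        for src, dst in RULES:
            if tok == src:
                alt = tokens[:i] + [dst] + tokens[i + 1:]
                variants.add(" ".join(alt))
    return variants

print(generate_allomorphs("Saint Mary Station"))
# {'st. mary station'}
```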
Abstract:
A continuous voice recognition apparatus using search-space limiting based on phoneme recognition, and a method therefor, are provided to improve voice recognition speed and performance by performing phoneme recognition in a primary voice recognition stage, then, in a secondary voice recognition stage, limiting the connected words to be expanded at word boundaries based on the phoneme recognition result, thereby reducing the search space before performing recognition. A voice feature extracting unit(110) extracts a feature vector from an input voice signal. A phoneme recognition unit(120) recognizes phonemes based on the feature vector of the voice signal. A phoneme-based voice recognition unit(150) constructs a connected-word search network whose search space is limited according to the phoneme recognition result, and performs voice recognition on the basis of that network.
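The search-space limiting idea can be sketched as follows; the lexicon, phoneme strings, and prefix length are illustrative assumptions rather than the patent's actual data:

```python
# At a word boundary, only successor words whose initial phonemes agree
# with the first-stage phoneme recognition result are kept in the
# connected-word search network.

LEXICON = {
    "seoul": ["s", "eo", "u", "l"],
    "suwon": ["s", "u", "w", "o", "n"],
    "busan": ["b", "u", "s", "a", "n"],
}

def prune_successors(recognized_phonemes, prefix_len=2):
    """Keep words whose leading phonemes match the recognized sequence."""
    prefix = recognized_phonemes[:prefix_len]
    return [w for w, prons in LEXICON.items()
            if prons[:prefix_len] == prefix]

# First-stage phoneme recognizer hypothesis at the word boundary:
print(prune_successors(["s", "u", "w", "o", "n"]))   # ['suwon']
```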
Abstract:
The present invention relates to an apparatus and method for generating high-frequency word-sequence recognition units for a large-vocabulary continuous speech recognition system for conversational and read speech, in which a high-frequency pseudo-morpheme sequence is used as a single recognition unit, producing recognition units intermediate between pseudo-morphemes and word phrases. The invention comprises: a frequency information extraction unit(301) that extracts frequency information of consecutive word pairs from a pseudo-morpheme-tagged text corpus; a merge vocabulary selection unit(302) that selects the set of word pairs to be merged based on the frequency information extracted by the frequency information extraction unit(301) and the length information of each word pair; a pseudo-morpheme merge information modification unit(303) that revises the text corpus based on the vocabulary set selected by the merge vocabulary selection unit(302), merging each high-frequency consecutive word pair into a single unit to generate a modified text corpus; and a recognition unit generation unit(304) that generates high-frequency word-sequence recognition units from the text corpus generated by the pseudo-morpheme merge information modification unit(303). Conversational and read-speech large vocabulary, text corpus, vocabulary dictionary, language model, pronunciation dictionary
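A minimal sketch of the merge loop on a toy corpus; the frequency and length thresholds are hypothetical stand-ins for the selection criteria:

```python
# Count consecutive pseudo-morpheme pairs, select frequent (and
# sufficiently short) pairs, then rewrite the corpus so each selected
# pair becomes a single recognition unit.
from collections import Counter

corpus = [["I", "go", "to", "school"], ["I", "go", "home"], ["I", "go", "to", "work"]]

MIN_FREQ, MAX_LEN = 2, 10

pair_counts = Counter()
for sent in corpus:
    pair_counts.update(zip(sent, sent[1:]))

merge_set = {p for p, n in pair_counts.items()
             if n >= MIN_FREQ and len(p[0]) + len(p[1]) <= MAX_LEN}

def merge(sent):
    out, i = [], 0
    while i < len(sent):
        if i + 1 < len(sent) and (sent[i], sent[i + 1]) in merge_set:
            out.append(sent[i] + "+" + sent[i + 1])   # one recognition unit
            i += 2
        else:
            out.append(sent[i])
            i += 1
    return out

print([merge(s) for s in corpus])
```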
Abstract:
PURPOSE: A translation engine device for translating a source language into a target language, and a translation method therefor, are provided to handle dialogue sentences in multiple domain environments and accurately translate a source-language sentence input by a user into the target language. CONSTITUTION: A mapping table(408) stores the target-language cluster mapped to each source-language cluster. A DTST(Direct Translation Sentence Table) direct translation unit(401) directly translates any sentence in the input source language that is capable of direct translation. A pre-processing module(402) preserves the core words of the source-language sentence through morpheme analysis, hides the other portions, and simplifies the structure of the sentence. A clustering unit(404) divides the source-language sentence into clusters. A mapping unit(405) determines the target-language cluster mapped to each source-language cluster using the mapping table(408). A post-processing and generating unit(406) reorders the target-language clusters and restores the target language to a completed sentence form.
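A toy sketch of the cluster mapping and generation steps; the mapping table contents and the identity reordering are illustrative only:

```python
# Map each source-language cluster to a target-language cluster via a
# mapping table, then generate a completed sentence form.

MAPPING_TABLE = {
    ("i", "want"): "je veux",
    ("some", "water"): "de l'eau",
}

def translate(source_clusters):
    # Map each source-language cluster to its target-language cluster.
    target = [MAPPING_TABLE.get(tuple(c), "<unk>") for c in source_clusters]
    # Post-processing: reorder the clusters (identity order in this toy
    # example) and join them into a completed sentence form.
    return " ".join(target).capitalize() + "."

print(translate([["i", "want"], ["some", "water"]]))
# Je veux de l'eau.
```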
Abstract:
PURPOSE: A method and a device for generating a translated sentence using word-level statistics are provided to generate a high-quality translated sentence at high speed by building and using a word-order information database extracted from a large target-language corpus through statistical methods. CONSTITUTION: A training module(110) statistically extracts and stores the order information from the target-language corpus. A morpheme analyzer(121) receives a source-language sentence and analyzes its morphemes. A parameterizer(123) parameterizes the words belonging to a first part of speech in the morpheme-segmented source sentence, hides the words belonging to a second part of speech, and forms a morpheme-tagged sentence. A word arranger(125) receives the tagged sentence and replaces each morpheme with a target-language word from a translation dictionary database(130). A recovery part(127) restores the original words of the parameterized parts of speech into the replaced target-language sentence and recovers the hidden words. A post-processor(129) removes the tags and outputs the source sentence and the translated sentence based on the generation information.
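As a sketch of the statistical ordering step, the following toy example scores candidate word orders with bigram counts assumed to have been gathered from a target-language corpus (the counts themselves are invented):

```python
# Score candidate orders of the target-language words with bigram
# counts from a target corpus and keep the highest-scoring order.
from itertools import permutations

BIGRAM_COUNTS = {("the", "red"): 40, ("red", "car"): 55, ("car", "red"): 2,
                 ("the", "car"): 70, ("red", "the"): 1, ("car", "the"): 3}

def order_score(words):
    return sum(BIGRAM_COUNTS.get(pair, 0) for pair in zip(words, words[1:]))

def best_order(words):
    return max(permutations(words), key=order_score)

print(best_order(["car", "the", "red"]))   # ('the', 'red', 'car')
```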
Abstract:
PURPOSE: A video communication method for an avatar-based TTS system is provided to synchronize the lip shape of an avatar with the synthesized voice, controlling the avatar's expression and movement according to the speech content, and thereby performing video communication. CONSTITUTION: An avatar model producer determines whether to perform an operation for taking a photograph online or to select an existing avatar model(301). If the online photograph is needed, the photograph is taken(302) and the reference points of the avatar model are marked(303). If an existing avatar model is selected, the data of the selected avatar model is transmitted to an avatar server(305). A voice recognition module recognizes voice input from outside, generates a character string, and transmits it to a language translation module(306). The language translation module translates the generated character string and transmits it to a voice synthesizing module(307). The voice synthesizing module extracts the movement-related information from the translated character string.
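A minimal sketch of the lip synchronization idea, assuming the synthesizer emits timed phonemes and using a toy phoneme-to-viseme table (not from the patent):

```python
# Timed phonemes from the synthesizer are mapped to viseme keys that
# drive the avatar's mouth shape at each time point.

VISEMES = {"a": "open_wide", "m": "closed", "o": "round", "s": "narrow"}

def lip_sync_track(phoneme_timing):
    """Turn timed phonemes into a timed sequence of mouth shapes."""
    return [(start, VISEMES.get(ph, "neutral"))
            for ph, start, dur in phoneme_timing]

# Hypothetical synthesizer output for the syllable sequence "m-a-s":
timing = [("m", 0.00, 0.08), ("a", 0.08, 0.12), ("s", 0.20, 0.10)]
for start, shape in lip_sync_track(timing):
    print(f"t={start:.2f}s -> mouth shape: {shape}")
```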
Abstract:
PURPOSE: A stereo image display apparatus and method are provided to prevent eye fatigue by allowing the user to gaze at a desired point and change the gaze point freely. CONSTITUTION: The stereo image display apparatus comprises: a three-dimensional model storage unit(11) for creating and storing a three-dimensional model of an object to be displayed in a virtual reality space; a head and eye movement detection unit(16) for detecting the position of the user's head (face) and extracting images of the user's eyes; a gaze direction and distance measurement unit(12) for extracting information on the user's current gaze point from the head position and eye images output by the head and eye movement detection unit; an image creating unit(13) for generating the stereo image corresponding to the current gaze point extracted by the gaze direction and distance measurement unit, based on the three-dimensional model of the object stored in the three-dimensional model storage unit; and display units(14,15) for displaying the left and right images created by the image creating unit.
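A sketch of the gaze-point estimation step: given hypothetical eye positions and gaze direction vectors, the gaze point is taken as the midpoint of closest approach between the two gaze rays (standard line-line geometry; the coordinates are illustrative):

```python
import numpy as np

def gaze_point(p_left, d_left, p_right, d_right):
    """Midpoint of closest approach between two gaze rays p + t*d."""
    d1 = d_left / np.linalg.norm(d_left)
    d2 = d_right / np.linalg.norm(d_right)
    w = p_left - p_right
    a, b, c = d1 @ d1, d1 @ d2, d2 @ d2
    d, e = d1 @ w, d2 @ w
    denom = a * c - b * b                 # ~0 only for parallel rays
    t1 = (b * e - c * d) / denom
    t2 = (a * e - b * d) / denom
    return ((p_left + t1 * d1) + (p_right + t2 * d2)) / 2

# Eyes 6.4 cm apart, both converging on a point ~50 cm ahead.
left = np.array([-0.032, 0.0, 0.0])
right = np.array([0.032, 0.0, 0.0])
target = np.array([0.0, 0.0, 0.5])
print(gaze_point(left, target - left, right, target - right))  # ~[0, 0, 0.5]
```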