Abstract:
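The present invention presents a compound-noun decomposition and tagging method for POI (points of interest) names, used to generate spoken variant forms for speech recognition in vehicle navigation terminals. A speech recognition engine embedded in a small vehicle navigation terminal generally takes isolated words as its recognition targets. An isolated word is the name of a specific point on the map, and users utter such names in a variety of variant forms. To generate these variant forms, the present invention presents a methodology for decomposing and tagging vocabulary items in compound-noun form that are written as place names. Decomposition is based on a chart-based dynamic programming methodology, and tagging is based on maximum entropy, attaching a semantic label to each single word that makes up a POI name. Keywords: compound noun, compound-noun decomposition, tagging, POI, variant form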
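The chart-based dynamic programming decomposition mentioned in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the lexicon, its per-word costs, the romanized toy POI name, and the function name `decompose` are all hypothetical and do not come from the patent. The maximum-entropy tagging step is not shown.

```python
from typing import Dict, List, Optional, Tuple

def decompose(compound: str, lexicon: Dict[str, float]) -> Optional[List[str]]:
    """Return the lowest-cost split of `compound` into lexicon words, or None."""
    n = len(compound)
    # chart[i] holds (total cost, start index of the last word) for the best
    # analysis of compound[:i]; None means that prefix cannot be covered.
    chart: List[Optional[Tuple[float, int]]] = [None] * (n + 1)
    chart[0] = (0.0, 0)
    for end in range(1, n + 1):
        for start in range(end):
            word = compound[start:end]
            if chart[start] is None or word not in lexicon:
                continue
            cost = chart[start][0] + lexicon[word]
            if chart[end] is None or cost < chart[end][0]:
                chart[end] = (cost, start)
    if chart[n] is None:
        return None
    # Follow the back-pointers from the end of the string to recover the words.
    words: List[str] = []
    end = n
    while end > 0:
        start = chart[end][1]
        words.append(compound[start:end])
        end = start
    return words[::-1]

# Hypothetical toy lexicon: lower cost means a more preferred unit.
LEXICON = {"seoul": 1.0, "city": 1.0, "hall": 1.0, "cityhall": 1.5, "station": 1.0}
print(decompose("seoulcityhallstation", LEXICON))
```

With these assumed costs the split through `cityhall` (total 3.5) beats the four-word split (total 4.0). In a real system the costs would presumably come from corpus statistics, which the abstract does not specify.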
Abstract:
PURPOSE: A speaker adaptation apparatus and method for speech recognition are provided to markedly improve speaker adaptation performance by estimating, with high probability, the correct transcription of actual speech data through an N-best recognition result screen output function. CONSTITUTION: A voice data verification unit (202) obtains per-phoneme measurement data from accumulated data through reliability evaluation. The accumulated data includes voice data and N-best recognition result data. An acoustic model speaker adaptation unit (204) performs speaker adaptation using the acquired per-phoneme measurement data. An acoustic model updating unit (206) updates the acoustic model with the new speaker-dependent acoustic model obtained through the speaker adaptation.
Abstract:
PURPOSE: An environment adaptation method using discriminative training based on channel estimation is provided. The method first finds the channel characteristics of the adaptation data while maintaining discriminability, performs model conversion, and combines the converted model with a discriminative learning technique, thereby providing effective environmental adaptation. CONSTITUTION: A noise removing unit (110) eliminates noise components from training data (101). A base recognition performing unit (130) recognizes adaptation data (103). A channel characteristic estimator obtains a phoneme-unit statistical model from the correct-answer data (104) of the adaptation data and combines the statistical model with a base acoustic model (102). A discriminative environment adapting unit (150) modifies the statistical model by applying the discriminative learning technique and outputs an adapted acoustic model (106).
Abstract:
PURPOSE: A remote controller and a method and apparatus for controlling an input interface are provided to enable a user to conveniently input Hangul, English, numeric, and symbol characters through a keypad. CONSTITUTION: An input keypad (1100) combines two keys from among a number key, an asterisk key, a sharp key, a directional key, and a special character key to select one input from among Hangul, English, and number characters and symbols. A control unit (1200) recognizes a key operation made through the input keypad and processes a key signal corresponding to the recognized key operation. A wireless transmission unit (1400) transmits the key signal processed by the control unit.
Abstract:
PURPOSE: A rejection apparatus and method based on garbage and anti-word models in voice recognition are provided to effectively reject various operating noises or unregistered words by applying a rejection process to a recognized word. CONSTITUTION: An extracting unit (104) extracts a feature vector from a voice signal. A searcher (110) scores the feature vector through pattern matching and outputs a recognition result. A rejection network generator (114) generates a rejection network for rejection evaluation from the recognition result. A rejection searcher (124) outputs recognition scores for the word models comprising the rejection network, based on a garbage acoustic model. A decision logic unit (128) decides whether to reject the recognized word by comparing the recognition scores.
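The final decision step, comparing the recognized word's score against the scores from the rejection network, might look something like the sketch below. The abstract does not state the comparison rule, so the frame-normalized log-likelihood ratio, the threshold value, and the function name `should_reject` are assumptions made purely for illustration.

```python
from typing import Dict

def should_reject(word_score: float,
                  rejection_scores: Dict[str, float],
                  num_frames: int,
                  threshold: float = 0.5) -> bool:
    """Decide rejection from log-likelihood scores (assumed decision rule).

    word_score       -- log-likelihood of the recognized word
    rejection_scores -- log-likelihoods of the competing rejection-network
                        models, e.g. garbage and anti-word models
    num_frames       -- utterance length, used to normalize the margin
    """
    best_alternative = max(rejection_scores.values())
    # Per-frame margin between the recognized word and its best competitor.
    margin = (word_score - best_alternative) / num_frames
    # A small margin means the garbage/anti-word models explain the audio
    # almost as well as the word itself, so the result is rejected.
    return margin < threshold

print(should_reject(-1200.0, {"garbage": -1400.0, "anti_word": -1350.0}, 100))
print(should_reject(-1200.0, {"garbage": -1210.0, "anti_word": -1230.0}, 100))
```

The first call has a per-frame margin of 1.5 and is accepted; the second has a margin of 0.1 and is rejected.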
Abstract:
PURPOSE: A home network service method using a ubiquitous intelligent robot is provided, which offers services based on the user's location and the robot's coordinate information, and which removes the need for a remote controller by supplying a robot that accepts voice input and uses a location sensor. CONSTITUTION: User interface information is input through a ubiquitous intelligent robot. The input user interface information is transmitted to the ubiquitous intelligent robot server (S300, S302). The ubiquitous intelligent robot server searches a home network device group for a multimedia device holding the multimedia information that corresponds to the user interface information (S304). If the multimedia device is detected, the search result is output as user interface information through the ubiquitous intelligent robot.
Abstract:
PURPOSE: A multiple recognition candidate generation apparatus and method are provided, which improve the efficiency of a voice recognition engine by reducing the memory usage and search time needed to create multiple recognition candidates. CONSTITUTION: A voice feature extractor (502) creates feature vectors from connected-digit speech. A search unit (504) creates a single recognition candidate string through pattern recognition on the feature vectors, and outputs the likelihood score and feature vectors for each individual digit sound in the single recognition candidate string. A multiple recognition candidate generation part (508) creates multiple recognition candidates by referring to the per-digit ranking from the confidence measure generator (506) and a preset confusion matrix.
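One way to read the candidate generation step is that the least confident digits of the single best string are replaced by the digits they are most often confused with. The sketch below follows that reading. The confusion table contents, the confidence values, and the function name `generate_candidates` are hypothetical, and the patent may order or combine substitutions differently.

```python
from typing import Dict, List

def generate_candidates(best: str,
                        confidences: List[float],
                        confusion: Dict[str, List[str]],
                        max_candidates: int) -> List[str]:
    """Expand one recognized digit string into several candidates.

    confusion maps a digit to the digits it is most often mistaken for,
    most likely first (an assumed, preset table).
    """
    candidates = [best]
    # Visit digit positions from least to most confident.
    order = sorted(range(len(best)), key=lambda i: confidences[i])
    for pos in order:
        for alt in confusion.get(best[pos], []):
            if len(candidates) >= max_candidates:
                return candidates
            # Substitute one digit and keep the rest of the string intact.
            candidates.append(best[:pos] + alt + best[pos + 1:])
    return candidates

# Hypothetical confusion table and per-digit confidences.
CONFUSION = {"1": ["7"], "7": ["1"], "2": ["5"]}
print(generate_candidates("172", [0.9, 0.4, 0.8], CONFUSION, 3))
```

Because the alternatives are derived from the single best string rather than from a full N-best search, the extra memory and search time stay small, which matches the stated purpose.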
Abstract:
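The present invention relates to a method and apparatus for speech recognition in noisy environments using sub-band uncertainty information. Uncertainty information is extracted for each sub-band from the estimated speech obtained through noise signal modeling and is used as a weight for that sub-band to extract noise-robust speech features. The acoustic model is transformed according to the sub-band weights, and speech recognition is performed based on the transformed acoustic model and the extracted speech features. As a result, even when the noise modeling over time is inaccurate, the influence of highly uncertain sub-bands is reduced according to the sub-band uncertainty information, so speech recognition performance can be improved even in noisy environments.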
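The idea of turning per-sub-band uncertainty into weights can be illustrated with a small sketch. The mapping from uncertainty to weight (an inverse form, normalized to sum to one) and both function names are assumptions, since the abstract does not give the formula. The matching transformation of the acoustic model is not shown.

```python
from typing import List

def subband_weights(uncertainties: List[float]) -> List[float]:
    """Map non-negative uncertainties to weights that sum to 1 (assumed form)."""
    # Higher uncertainty gives a smaller raw weight.
    raw = [1.0 / (1.0 + u) for u in uncertainties]
    total = sum(raw)
    return [r / total for r in raw]

def weighted_features(log_energies: List[float], weights: List[float]) -> List[float]:
    """Scale each sub-band log energy by its weight before further processing."""
    return [w * e for w, e in zip(weights, log_energies)]

weights = subband_weights([0.0, 1.0, 3.0])
print(weights)
print(weighted_features([10.0, 10.0, 10.0], weights))
```

The third sub-band, which has the highest assumed uncertainty, contributes the least to the resulting feature vector.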
Abstract:
A voice recognition method is provided that statistically models various textual language phenomena drawn from various knowledge sources. Morphological analysis is performed on a raw text corpus consisting of Korean space-delimited words (S201), producing a morpheme corpus in which each word is split into morphemes. From the morpheme corpus, a word trigram, the language model consisting of morpheme unigrams, bigrams, and trigrams, is generated (S202). A first N-best list of up to N recognition candidates is generated for an input utterance (S204). The recognition candidates are re-evaluated by applying morpho-syntactic constraints (S205). The second N-best list generated in that step is re-evaluated (S206), and a final N-best list is generated.
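The N-best re-evaluation with a morpheme trigram model could be sketched as below. This is a simplified illustration, not the patented procedure: it uses add-one smoothing, romanized placeholder morphemes, a made-up acoustic score per hypothesis, and hypothetical function names, and it leaves out the morpho-syntactic constraint step.

```python
import math
from collections import Counter
from typing import List, Tuple

def train_trigram(corpus: List[List[str]]):
    """Count trigrams and their two-token histories over a morpheme corpus."""
    tri, hist, vocab = Counter(), Counter(), set()
    for sent in corpus:
        toks = ["<s>", "<s>"] + sent + ["</s>"]
        vocab.update(toks)
        for a, b, c in zip(toks, toks[1:], toks[2:]):
            tri[(a, b, c)] += 1
            hist[(a, b)] += 1
    return tri, hist, len(vocab)

def lm_logprob(sent: List[str], model) -> float:
    """Add-one smoothed trigram log-probability of a morpheme sequence."""
    tri, hist, v = model
    toks = ["<s>", "<s>"] + sent + ["</s>"]
    return sum(math.log((tri[(a, b, c)] + 1) / (hist[(a, b)] + v))
               for a, b, c in zip(toks, toks[1:], toks[2:]))

def rescore(nbest: List[Tuple[List[str], float]], model, lm_weight: float = 1.0):
    """Re-rank (hypothesis, acoustic log score) pairs with the language model."""
    scored = [(hyp, ac + lm_weight * lm_logprob(hyp, model)) for hyp, ac in nbest]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Toy morpheme corpus and first-pass N-best list (all values hypothetical).
model = train_trigram([["hak", "gyo", "e"], ["hak", "gyo", "ga"]])
nbest = [(["gyo", "hak", "e"], -9.5), (["hak", "gyo", "e"], -10.0)]
for hyp, score in rescore(nbest, model):
    print(hyp, round(score, 3))
```

Here the hypothesis with the slightly worse acoustic score moves to the top because its morpheme sequence is far more likely under the trigram model.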
Abstract:
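The present invention is an apparatus that performs speech recognition using feature vectors of a speech signal. It comprises an active node selection unit that selects active nodes using the feature vectors; an observation probability calculation method determination unit that determines the observation probability calculation method from the number of active nodes selected by the active node selection unit; an observation probability calculation unit that obtains observation probabilities according to the determined method; and a speech recognition result generation unit that performs speech recognition using the obtained observation probabilities and outputs the result. Because the observation probability calculation method is chosen differently according to the number of active nodes, the speech recognition rate can increase and recognition speed can improve. Keywords: speech recognition, HMM, observation probability, active node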
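Switching the observation probability calculation according to the number of active nodes might be sketched as follows. The abstract does not name the calculation methods, so the two used here (a full Gaussian-mixture sum versus a cheaper best-component approximation), the threshold of 100, and the function names are assumptions chosen only to illustrate the switching idea.

```python
import math
from typing import List, Tuple

Component = Tuple[float, List[float], List[float]]  # (weight, mean, variance)

def log_gauss(x: List[float], mean: List[float], var: List[float]) -> float:
    """Log density of a diagonal-covariance Gaussian."""
    return -0.5 * sum(math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
                      for xi, m, v in zip(x, mean, var))

def observation_logprob(x: List[float], mixture: List[Component],
                        num_active: int, threshold: int = 100) -> float:
    """Pick the calculation method from the number of active nodes."""
    comps = [math.log(w) + log_gauss(x, m, v) for w, m, v in mixture]
    best = max(comps)
    if num_active > threshold:
        # Many active nodes: use only the best component (assumed fast method).
        return best
    # Few active nodes: evaluate the full mixture with a stable log-sum-exp.
    return best + math.log(sum(math.exp(c - best) for c in comps))

mixture = [(0.5, [0.0], [1.0]), (0.5, [2.0], [1.0])]
print(observation_logprob([0.0], mixture, num_active=20))
print(observation_logprob([0.0], mixture, num_active=500))
```

For brevity the sketch computes every component score in both branches; a real fast path would avoid that work, for example by evaluating only a preselected subset of components.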