Abstract:
PURPOSE: A pronunciation evaluation device and a method are provided to evaluate foreign language pronunciations using an acoustic model of a foreign language learner, pronunciations generated using a pronunciation model in which pronunciation errors are reflected, and an acoustic model of a native speaker, thereby increasing the accuracy of the pronunciation generated for the sound of the foreign language learner. CONSTITUTION: A pronunciation evaluation device(100) includes a sound input part(110), a sentence input part(120), a storage part(130), a pronunciation generation part(140), a pronunciation evaluation part(150), and an output part(160). The sound input part receives the sound of a foreign language learner, and the sentence input part receives a sentence corresponding to the sound of the foreign language learner. The storage part stores an acoustic model for the sound of the foreign language learner and a pronunciation dictionary for the sound of the foreign language learner. The pronunciation generation part performs sound recognition based on the acoustic model and pronunciation dictionary for the sound of the foreign language learner stored in the storage part. The pronunciation evaluation part detects the vocalization errors by analyzing the pronunciations for the sound of the foreign language learner. The output part outputs the vocalization errors of the foreign language learner detected from the pronunciation evaluation part. [Reference numerals] (110) Sound input part; (120) Sentence input part; (130) Storage part; (140) Pronunciation generation part; (150) Pronunciation evaluation part; (160) Output part
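The error-detection step of the pronunciation evaluation part can be illustrated with a minimal sketch. The abstract does not say how the recognized pronunciation is analyzed, so the sketch assumes that the phoneme sequence recognized from the learner's sound is aligned against a canonical phoneme sequence for the input sentence. The function name `detect_pronunciation_errors`, the phoneme labels, and the use of Python's `difflib` for alignment are all illustrative assumptions, not part of the described device.

```python
from difflib import SequenceMatcher
from typing import List, Tuple

# (error type, expected phonemes, phonemes actually recognized)
PronunciationError = Tuple[str, List[str], List[str]]


def detect_pronunciation_errors(
    canonical: List[str], recognized: List[str]
) -> List[PronunciationError]:
    """Align two phoneme sequences and report every mismatching region.

    Assumption: `canonical` comes from the input sentence and `recognized`
    from the pronunciation generation part. Error types are the difflib
    opcode tags 'replace', 'delete', and 'insert'.
    """
    matcher = SequenceMatcher(None, canonical, recognized, autojunk=False)
    errors: List[PronunciationError] = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            errors.append((tag, canonical[i1:i2], recognized[j1:j2]))
    return errors


if __name__ == "__main__":
    # Hypothetical example: the learner says "light" where "right" is expected.
    for error in detect_pronunciation_errors(["r", "ay", "t"], ["l", "ay", "t"]):
        print(error)
```

A real system would more likely score each phoneme with both the learner and native-speaker acoustic models; the alignment above only shows where the output part would find the errors it reports.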
Abstract:
PURPOSE: A confusion network rescoring device for Korean continuous voice recognition, a method for generating a confusion network by using the same, and a rescoring method thereof are provided to improve the generation speed of the confusion network by setting a limit on the lattice link probability in the process of converting a lattice structure into a confusion network structure. CONSTITUTION: A confusion network rescoring device receives one or more lattices generated through voice recognition(S105). The device calculates the posterior probabilities of the lattices(S110). The device allocates the nodes included in the lattices to plural equivalence classes based on the posterior probabilities(S120,S130,S135). The device generates confusion sets by using the equivalence classes(S150,S155). The device generates a confusion network based on the confusion sets. [Reference numerals] (AA) Start; (BB,DD,FF,HH,JJ) No; (CC,EE,GG,II,KK) Yes; (LL) End; (S105) Inputting lattices through voice recognition; (S110) Calculating each posterior probability of the lattices; (S115) Inputting SLF?; (S120) Allocating a first node(n_0) of the lattices to a first equivalence class(N_0); (S125) N_i and n_i links exist?; (S130) Allocating an i-th node(n_i) of the lattices to a j-th equivalence class(N_j); (S135) Allocating the i-th node(n_i) of the lattices to an i-th equivalence class(N_i); (S140) All nodes of the lattices allocated?; (S145) If u∈N_s, n_i∈N_t, t=s+1 in e(u->n_i)?; (S150) Classifying the e(u->n_i) as CS(N_s,N_t); (S155) Classifying the e(u->n_i) as CS(N_k,N_k+1); (S160) Normalizing link probability in an extracted CS sequence; (S165) Adding a Null link, and allocating the remaining probability value after normalization; (S170) Probability value of the Null link > probability value of the other links?; (S175) Excluding the CS sequence from a voice recognition result
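Steps S160 to S175 can be sketched as follows. The abstract gives no formulas, so this is one plausible reading only: links below a posterior limit are dropped, the Null link receives whatever probability mass the remaining links leave unused, and a confusion set (CS) is excluded from the result when its Null link outweighs every other link. The names `finalize_confusion_set`, `decode`, `min_posterior`, and the `<null>` label are hypothetical.

```python
from typing import Dict, List, Optional

NULL = "<null>"  # hypothetical label for the Null link


def finalize_confusion_set(
    link_posteriors: Dict[str, float], min_posterior: float = 0.0
) -> Dict[str, float]:
    """Prune weak links, normalize, and add a Null link (cf. S160, S165)."""
    # Assumed form of the "limit of a lattice link probability".
    kept = {w: p for w, p in link_posteriors.items() if p >= min_posterior}
    total = sum(kept.values())
    if total > 1.0:  # rescale only if the links carry more than unit mass
        kept = {w: p / total for w, p in kept.items()}
        total = 1.0
    kept[NULL] = max(0.0, 1.0 - total)  # Null link takes the remaining mass
    return kept


def best_word(confusion_set: Dict[str, float]) -> Optional[str]:
    """Return the top word, or None when the Null link wins (cf. S170)."""
    word = max(confusion_set, key=confusion_set.get)
    return None if word == NULL else word


def decode(confusion_sets: List[Dict[str, float]], min_posterior: float = 0.0) -> List[str]:
    """Read out one word per confusion set, skipping excluded sets (cf. S175)."""
    words = []
    for links in confusion_sets:
        word = best_word(finalize_confusion_set(links, min_posterior))
        if word is not None:
            words.append(word)
    return words


if __name__ == "__main__":
    network = [{"I": 0.9, "eye": 0.05}, {"uh": 0.2}, {"see": 0.6, "sea": 0.3}]
    print(decode(network))
```

In the example network the middle set carries only 0.2 of link mass, so its Null link dominates and the set is skipped.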
Abstract:
PURPOSE: A voice recognition apparatus and a method thereof are provided to increase the recognition speed for an input signal by performing recognition of the input signal in parallel. CONSTITUTION: A global database unit(10) includes a global feature vector(12), a global vocabulary model(14), and a global sound model(16). A recognition unit(20) includes a plurality of separate recognition units(22a~22n), which perform voice recognition in parallel. A separate database unit(30) includes separate language models. A collection and evaluation unit(40) collects and evaluates the recognition results of the separate recognition units.
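The parallel arrangement described above can be sketched in a few lines. This is a toy illustration under stated assumptions: each separate recognition unit is modeled as a function that scores the shared feature vector against its own small vocabulary, and the collection and evaluation unit simply keeps the highest-scoring hypothesis. The names `make_recognizer` and `recognize_parallel`, the template vocabularies, and the distance-based score are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Sequence, Tuple

Result = Tuple[str, float]  # (hypothesis, score); higher score is better
Recognizer = Callable[[Sequence[float]], Result]


def make_recognizer(vocabulary: Dict[str, Sequence[float]]) -> Recognizer:
    """Build a toy recognition unit with its own (non-empty) vocabulary."""

    def recognize(features: Sequence[float]) -> Result:
        # Score each word by negative squared distance to its template.
        scored = [
            (word, -sum((f - t) ** 2 for f, t in zip(features, template)))
            for word, template in vocabulary.items()
        ]
        return max(scored, key=lambda pair: pair[1])

    return recognize


def recognize_parallel(features: Sequence[float], units: List[Recognizer]) -> Result:
    """Run all units on the same features, then collect and evaluate."""
    with ThreadPoolExecutor(max_workers=len(units)) as pool:
        results = list(pool.map(lambda unit: unit(features), units))
    return max(results, key=lambda pair: pair[1])


if __name__ == "__main__":
    units = [
        make_recognizer({"yes": [1.0, 0.0], "yet": [0.8, 0.4]}),
        make_recognizer({"no": [0.0, 1.0]}),
    ]
    print(recognize_parallel([0.9, 0.1], units))
```

A thread pool keeps the sketch short; CPU-bound recognizers in CPython would more likely run in separate processes or on separate hardware.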
Abstract:
The present invention provides a speech recognition device and a method thereof. More particularly, it relates to a speech recognition device that estimates the speaker of the speech and uses the estimated speaker information for recognition, and a method thereof. The speech recognition device of the present invention includes: an input unit for receiving speech; a speaker estimation unit for analyzing the characteristics of the speech, analyzing the speaker-dependent variation of those characteristics, and estimating speaker information of the speech; and a speech recognition unit for recognizing the speech by taking into account the speaker information.
Abstract:
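The present invention relates to a channel normalization apparatus and method for real-time speech recognition. The invention proposes a channel normalization apparatus including: a feature vector extraction unit that extracts a feature vector for each frame of input speech; a feature vector transformation unit that transforms the feature vectors of the frames from which they were extracted, using a pre-trained linear transformation matrix; and a channel normalization unit that performs channel normalization for speech recognition based on the transformed feature vectors. According to the invention, real-time speech recognition becomes possible, and the discriminative power for speech recognition can be improved together with the removal of bias components.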
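The frame-by-frame pipeline can be sketched as below. The abstract does not name the normalization method, so the sketch assumes a running-mean subtraction (in the spirit of cepstral mean normalization), which removes a constant channel bias without waiting for the end of the utterance. The class name `OnlineChannelNormalizer`, the smoothing factor `alpha`, and the example matrix are hypothetical; a real matrix would come from training.

```python
from typing import List, Optional, Sequence

Matrix = Sequence[Sequence[float]]


def transform(matrix: Matrix, vec: Sequence[float]) -> List[float]:
    """Apply the linear transformation matrix to one feature vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


class OnlineChannelNormalizer:
    """Transform each frame, then subtract a running mean (assumed method)."""

    def __init__(self, matrix: Matrix, alpha: float = 0.9) -> None:
        self.matrix = matrix
        self.alpha = alpha  # weight given to the previous mean estimate
        self.mean: Optional[List[float]] = None

    def process(self, frame: Sequence[float]) -> List[float]:
        y = transform(self.matrix, frame)
        if self.mean is None:
            self.mean = list(y)  # first frame initializes the mean
        else:
            self.mean = [
                self.alpha * m + (1.0 - self.alpha) * v
                for m, v in zip(self.mean, y)
            ]
        return [v - m for v, m in zip(y, self.mean)]


if __name__ == "__main__":
    matrix = [[1.0, 0.5], [0.0, 2.0]]  # stand-in for a pre-trained matrix
    normalizer = OnlineChannelNormalizer(matrix)
    for frame in [[1.0, 2.0], [1.5, 1.0], [0.5, 3.0]]:
        print(normalizer.process(frame))
```

Because both the transform and the running mean are linear, adding the same constant offset to every input frame leaves the output unchanged, which is the bias-removal property the abstract refers to.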