Abstract:
PURPOSE: A method and apparatus for reducing multi-channel noise are provided that selectively apply a beamforming method or a sound-source separation method according to environmental conditions during multi-channel noise processing in a multi-channel voice recognition environment, thereby maximizing noise-processing performance. CONSTITUTION: A noise environment monitoring unit (210) determines the number of sound sources and the relative location information of the background sound sources and the user's voice. Based on the number of sound sources and the relative location information of the background sound sources and the user's voice, a multi-channel noise processor (220) selects a multi-channel noise processing method from among a plurality of multi-channel noise processing modes. The multi-channel noise processor then performs noise processing using the selected method.
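The mode-selection step described above can be sketched as a simple rule over the monitored environment. This is a minimal illustration, not the patented method: the class name, thresholds, and the angular-separation heuristic are all assumptions for the example.

```python
from dataclasses import dataclass

@dataclass
class NoiseEnvironment:
    num_sources: int            # number of active background sound sources
    user_angle_deg: float       # estimated direction of the user's voice
    source_angles_deg: list     # estimated directions of the background sources

def select_mode(env: NoiseEnvironment, min_separation_deg: float = 20.0) -> str:
    """Pick a multi-channel noise processing mode from the monitored environment."""
    if env.num_sources == 0:
        return "passthrough"    # no interferers: no spatial filtering needed
    # Beamforming works well when every interferer is angularly well separated
    # from the user; otherwise fall back to blind source separation.
    if all(abs(a - env.user_angle_deg) >= min_separation_deg
           for a in env.source_angles_deg):
        return "beamforming"
    return "source_separation"
```

A real system would estimate the angles with direction-of-arrival methods and could weight the decision by SNR as well; the point here is only the selective-switching structure.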
Abstract:
PURPOSE: An apparatus for generating keywords for speech recognition in a navigation device is provided that enables a voice-based POI retrieval service by automatically producing the allomorphs of a POI name that a user may say to the navigation device. CONSTITUTION: The apparatus for generating keywords for speech recognition in a navigation device comprises a statistical model training unit (202) and an allomorph generating unit. The statistical model training unit analyzes POI character strings and builds probability values as statistical information. The allomorph generating unit creates allomorphs of a POI name using that statistical information.
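One way the allomorph-generation step could work is to expand a POI name with learned substitution probabilities and prune unlikely variants. This is a hedged sketch under assumed data structures; the rule table, threshold, and English example tokens are illustrative, not from the patent.

```python
from itertools import product

def generate_allomorphs(poi: str, rules: dict, threshold: float = 0.1) -> dict:
    """Generate spoken variants of a POI name from learned substitution
    probabilities. `rules` maps a token to [(variant, prob), ...]; variants
    whose joint probability falls below `threshold` are pruned."""
    options = []
    for tok in poi.split():
        # Always keep the original surface form with probability 1.0.
        options.append(rules.get(tok, []) + [(tok, 1.0)])
    variants = {}
    for combo in product(*options):
        prob = 1.0
        for _, p in combo:
            prob *= p
        if prob >= threshold:
            name = " ".join(word for word, _ in combo)
            variants[name] = max(prob, variants.get(name, 0.0))
    return variants
```

For example, with a rule saying "Station" is spoken as "Stn" half the time, "Seoul Station" would expand to both "Seoul Station" and "Seoul Stn", each scored by its joint probability.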
Abstract:
PURPOSE: A method for separating noise from an audio signal is provided that improves sound-source separation performance and increases convergence speed in the weight-learning stage, thereby improving computational efficiency. CONSTITUTION: A plurality of microphones records an audio signal spoken by a user together with a noise signal. A beamformer (20) performs a beamforming process and a blind source separation procedure on the recorded audio and noise signals, spatially and statistically separating the audio signal from the noise signal. A sound source separator (30) separates the sound-source signal and outputs the separated signal.
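The spatial stage of such a pipeline is commonly a delay-and-sum beamformer: each microphone channel is delayed so the desired direction adds coherently, then the channels are averaged. The following is a minimal pure-Python sketch with integer sample delays; it stands in for the beamformer (20) only in spirit, and the function name and interface are assumptions.

```python
def delay_and_sum(channels, delays):
    """Align each microphone channel by its integer sample delay and average.

    channels: list of equal-length sample lists, one per microphone
    delays:   per-channel integer delays (in samples) steering the beam
              toward the desired direction
    """
    n = len(channels[0])
    out = []
    for t in range(n):
        acc = 0.0
        for ch, d in zip(channels, delays):
            idx = t - d
            if 0 <= idx < n:            # samples shifted out of range are dropped
                acc += ch[idx]
        out.append(acc / len(channels))
    return out
```

A production system would use fractional delays (e.g., via interpolation or frequency-domain phase shifts) and would follow this with the statistical separation stage.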
Abstract:
PURPOSE: A voice recognition information generation device, a method thereof, and a broadcast service method thereof are provided that generate a database from allomorph character strings, thereby offering a broadcast service based on voice recognition. CONSTITUTION: The voice recognition information generation device includes a prior matching unit (302), a section boundary partition unit (308), a normalization unit (310), and an allomorph generation unit (312). The prior matching unit performs prior matching according to the character string information of broadcast data. The section boundary partition unit partitions the section boundaries of the character string on which prior matching has been performed, in order to generate voice recognition target character string data. The normalization unit normalizes the generated voice recognition target character string. The allomorph generation unit generates allomorph character string data from the normalized voice recognition target character string data.
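The normalization step in such a pipeline typically maps a raw broadcast string to a canonical recognition-target form. This is a simplified stand-in for the patent's normalization unit (310), assuming only punctuation stripping, whitespace collapsing, and case folding; the actual rules are not specified in the abstract.

```python
import re

def normalize(text: str) -> str:
    """Normalize a recognition-target string: replace punctuation with spaces,
    collapse runs of whitespace, and lowercase."""
    text = re.sub(r"[^\w\s]", " ", text)
    return " ".join(text.split()).lower()
```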
Abstract:
An apparatus for evaluating the performance of speech recognition includes a speech database storing N test speech signals for evaluation. A speech recognizer located in an actual environment executes speech recognition on the test speech signals, which are reproduced from the speech database through a loudspeaker in that environment, to produce speech recognition results. A performance evaluation module evaluates recognition performance by comparing the correct answers with the speech recognition results.
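The comparison step in such an evaluation module is conventionally a word error rate (WER) computed by word-level edit distance between the correct answer and the recognition result. The abstract does not name the metric, so this standard formulation is an assumption; the sketch below uses the textbook dynamic-programming recurrence.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level Levenshtein distance (substitutions + insertions +
    deletions) divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,          # deletion
                           dp[i][j - 1] + 1,          # insertion
                           dp[i - 1][j - 1] + cost)   # substitution or match
    return dp[len(ref)][len(hyp)] / max(len(ref), 1)
```

Averaging this over the N test utterances gives a single performance figure for the recognizer in the actual environment.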
Abstract:
The present invention relates to a method and apparatus for generating a noise-adaptive acoustic model that includes a noise-adaptive discriminative training method. The method comprises generating base acoustic model parameters from large-scale speech training data containing various environmental noises, and then taking the generated base acoustic model parameters as input and applying a discriminative training technique to generate adaptive acoustic model parameters suited to the actual deployment environment. Keywords: acoustic model, environmental noise
Abstract:
A device and method for evaluating the performance of a speech recognition engine are provided that require no human intervention in any noise environment, by adjusting the SNR (signal-to-noise ratio) through free volume control of the speech played through a loudspeaker. An evaluation speech database (201) stores evaluation speeches. An automatic voice recognition evaluator (203) plays the stored evaluation speech, and transmits an answer list and an audio signal file of the evaluation data when voice recognition control for the evaluation data is completed. A speech recognizer (207) recognizes the voice and stores a voice recognition result list and the voice signal file used in recognition. A performance evaluation block (209) evaluates the recognizer's performance by comparing the answer list and audio file with the voice recognition result list and voice signal file.
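The SNR-by-volume-control idea reduces to computing the playback gain that moves the measured SNR to a target value. A minimal sketch, assuming RMS-based SNR in dB (the abstract does not specify how SNR is measured):

```python
import math

def rms(samples):
    """Root-mean-square level of a sample sequence."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def gain_for_target_snr(speech, noise, target_snr_db):
    """Linear gain to apply to the speech playback volume so that
    20*log10(rms(gain * speech) / rms(noise)) equals target_snr_db."""
    current_snr_db = 20 * math.log10(rms(speech) / rms(noise))
    return 10 ** ((target_snr_db - current_snr_db) / 20)
```

Sweeping `target_snr_db` lets the evaluator characterize the recognizer across noise conditions without anyone touching the volume knob, which is the automation claim of the abstract.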
Abstract:
A microphone-array-based voice recognition system, and a target-voice extraction method for that system, are provided that automatically find the one target voice uttered for voice recognition by using an HMM (Hidden Markov Model) and a GMM (Gaussian Mixture Model), thereby obtaining a higher recognition rate even when noise is present. A signal separator (110) separates the mixed signals individually input through plural microphones into sound-source signals through independent component analysis. A target voice extractor (120) extracts the one target voice uttered for voice recognition from among the separated sound-source signals. A voice recognizer (130) recognizes the desired voice from the extracted target voice. An additional information unit transmits additional information used for target-voice extraction to the target voice extractor.
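The extractor's core decision is: score each separated channel under a speech model and a non-speech model, and keep the channel that looks most speech-like. The sketch below uses a 1-D diagonal GMM log-likelihood ratio as a stand-in for the patent's HMM/GMM scoring; the feature dimension, model parameters, and function names are assumptions for illustration.

```python
import math

def gauss_logpdf(x, mean, var):
    """Log density of a 1-D Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def gmm_loglik(frames, weights, means, variances):
    """Average per-frame log-likelihood of 1-D features under a GMM."""
    total = 0.0
    for x in frames:
        # log-sum-exp over mixture components for numerical stability
        comps = [math.log(w) + gauss_logpdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
        mx = max(comps)
        total += mx + math.log(sum(math.exp(c - mx) for c in comps))
    return total / len(frames)

def pick_target(channels, speech_gmm, noise_gmm):
    """Choose the separated channel whose features fit the speech GMM best
    relative to the noise GMM (a stand-in for the HMM/GMM scoring)."""
    def score(frames):
        return gmm_loglik(frames, *speech_gmm) - gmm_loglik(frames, *noise_gmm)
    return max(range(len(channels)), key=lambda i: score(channels[i]))
```

In practice the frames would be MFCC vectors and the models trained offline; the likelihood-ratio selection structure is what carries over.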
Abstract:
The present invention relates to a method and system for constructing a multiple-pronunciation dictionary, and a conversational speech recognition method using it, in which pronunciation variations that frequently appear in conversational speech are accommodated within pseudo-morpheme-based representative words to construct an extended multiple-pronunciation dictionary, and the language model and lexicon are built using only the representative words, thereby improving the performance of conversational continuous speech recognition and yielding normalized output patterns. The invention comprises the steps of: extracting a representative-pronunciation text corpus and a variant-pronunciation text corpus from a conversational text corpus; performing pseudo-morpheme analysis and tagging on each of the representative and variant corpora; comparing the tagging results word by word (per eojeol) to extract representative/variant pronunciation pairs in pseudo-morpheme units; generating a representative-pronunciation lexicon from the pseudo-morpheme tagging results of the representative corpus alone; and generating the multiple-pronunciation dictionary and the representative-pronunciation language model from the representative lexicon and the extracted representative/variant pairs. Keywords: pseudo-morpheme, multiple-pronunciation dictionary, language model, continuous speech recognition system, pronunciation variation