Abstract:
PURPOSE: A multi-channel sound quality enhancement apparatus using similar auditory information, and a method of operating it, are provided to apply a soft-masking filter that uses the location or direction of a target sound source in a noisy environment. CONSTITUTION: The operating method comprises the following steps: estimating a normalized cross-correlation (NCC) function from the sound signal by a multichannel NCC estimation unit (130); identifying at least one of an interference signal and a background noise signal in the sound signal by a signal decision unit (135); estimating the NCC density distribution of the identified interference signal or background noise signal by a density estimation unit (140); and estimating soft-mask information by a soft-mask estimation unit (145).
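The chain of estimation steps in this abstract can be illustrated with a minimal sketch. The abstract does not give formulas, so everything here is an assumption made for illustration: a zero-lag NCC between two channel frames, a Gaussian fit as the interference "density", and a mask of the form 1 - exp(-z²/2). The function names `ncc`, `gaussian_fit`, and `soft_mask` are hypothetical.

```python
import math

def ncc(x, y):
    """Zero-lag normalized cross-correlation of two equal-length frames."""
    num = sum(a * b for a, b in zip(x, y))
    den = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return num / den if den > 0 else 0.0

def gaussian_fit(values):
    """Mean and standard deviation of NCC values from interference frames
    (assumed stand-in for the density estimation step)."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(var)

def soft_mask(value, mean, std, floor=1e-3):
    """Weight in [0, 1): close to 0 when the NCC value resembles interference,
    close to 1 when it lies far from the interference distribution."""
    z = (value - mean) / max(std, floor)
    return 1.0 - math.exp(-0.5 * z * z)

# Usage: fit the interference model, then weight a new frame's NCC value.
mean, std = gaussian_fit([0.1, 0.3])
weight = soft_mask(ncc([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]), mean, std)
```

A soft mask of this kind attenuates frames gradually instead of making a hard keep/discard decision, which is the usual motivation for soft masking.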
Abstract:
PURPOSE: A voice recognition system for personalized natural language is provided to enable various voice search services through natural-language utterances. CONSTITUTION: The voice recognition system comprises: a control unit (123) which provides a customized model to a voice recognition unit (143) when the user is registered and controls provision of the customized model when the user is not registered; and a service processing unit (133) which controls updating of the utterance and the voice recognition result when the user agrees with the result.
Abstract:
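The present invention relates to a method and apparatus for separating speech and noise signals. When a source separation technique that uses statistical information about the sound sources is combined with a beamforming technique that uses their spatial information in a system equipped with two or more microphones, the speech signal and the noise signal can be separated more effectively, and as a result a clean speech signal with the noise removed can be extracted from a signal recorded in a noisy environment. In addition, for the blind signal separation technique, no learning process is required, so the computational load is small and there is no risk of performance degradation caused by incorrect learning; the invention thus not only raises source separation performance but also improves computational efficiency by increasing the convergence speed in the weight learning stage. The beamforming technique likewise operates robustly across environments regardless of the number and locations of noise sources, which are generally unknown.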
Abstract:
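The present invention relates to a rejection technique for speech recognizers based on garbage and anti-word models. In particular, it rejects recognition results by using a garbage model for rejecting non-speech, a method of constructing anti-word models based on phoneme similarity, a rejection network that integrates the two, and a frame dropping method, based on the similarity between adjacent frames, for fast re-evaluation of the rejection network. According to the present invention, effective rejection can be performed not only for out-of-vocabulary or ungrammatical words that are not registered in the pronunciation dictionary used for conventional speech recognition, but also for unregistered acoustic-phonetic input signals. Because rejection can be evaluated quickly, the performance of the speech recognizer can be improved in terms of recognition success rate and response time. Keywords: speech recognition, rejection, frame dropping, garbage model, anti-word model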
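Frame dropping based on adjacent-frame similarity can be sketched as follows. The abstract does not specify the similarity measure, so this example assumes Euclidean distance between feature vectors and a fixed threshold; the function name `drop_similar_frames` is hypothetical.

```python
import math

def drop_similar_frames(frames, threshold):
    """Keep a frame only if it differs enough from the last kept frame.

    frames: list of feature vectors (lists of floats).
    threshold: assumed minimum Euclidean distance for a frame to be kept.
    """
    kept = []
    for frame in frames:
        if kept and math.dist(frame, kept[-1]) < threshold:
            continue  # too similar to the previous kept frame: skip it
        kept.append(frame)
    return kept

frames = [[0.0, 0.0], [0.1, 0.0], [1.0, 0.0], [1.05, 0.0], [2.0, 0.0]]
reduced = drop_similar_frames(frames, threshold=0.5)
```

Re-evaluating the rejection network on `reduced` instead of `frames` touches fewer frames, which is where the speed-up would come from under these assumptions.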
Abstract:
PURPOSE: An utterance verification apparatus based on word-specific reliability thresholds, and a method thereof, are provided to apply a different reliability threshold to each recognized word in a word-based utterance verification system that checks voice recognition results. CONSTITUTION: A phoneme segment information extractor (130) extracts phoneme segment information by analyzing a recognized word. Likelihood value calculators (140, 150) calculate likelihood values for the extracted phonemes and anti-phonemes. A threshold calculator (170) calculates a threshold value corresponding to the recognized word. A comparator (190) compares the threshold value with an LLR (Log Likelihood Ratio) calculated by the likelihood value calculators. Depending on the comparison result, the comparator outputs or rejects the voice recognition result.
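The comparison of an LLR against a per-word threshold can be illustrated with a short sketch. The abstract does not say how the LLR is aggregated or how thresholds are computed, so this example assumes an average of per-segment log-likelihood differences between the phoneme model and a competing (anti-phoneme) model, and a simple lookup table of thresholds. All names are hypothetical.

```python
def log_likelihood_ratio(phoneme_loglik, anti_loglik):
    """Average per-segment difference between phoneme-model and
    competing-model log-likelihoods (assumed aggregation)."""
    diffs = [p - a for p, a in zip(phoneme_loglik, anti_loglik)]
    return sum(diffs) / len(diffs)

def verify(word, phoneme_loglik, anti_loglik, thresholds, default=0.0):
    """Accept the recognized word if its LLR reaches that word's threshold."""
    llr = log_likelihood_ratio(phoneme_loglik, anti_loglik)
    return llr >= thresholds.get(word, default)

# Usage: the same scores pass for one word and fail for a stricter one.
thresholds = {"seoul": 1.5, "busan": 2.5}
accepted = verify("seoul", [-1.0, -2.0], [-3.0, -4.0], thresholds)
rejected = verify("busan", [-1.0, -2.0], [-3.0, -4.0], thresholds)
```

Keeping the threshold per word lets easily confused words demand stronger evidence than distinctive ones, which a single global threshold cannot do.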
Abstract:
PURPOSE: A logistics search method using a voice recognition function is provided to find suitable transportation requests by searching for freight along the travel route in real time. CONSTITUTION: The location of a truck is tracked, and the truck owner selects a freight search mode by voice (202). If the selected freight search mode is the automatic search mode, the truck owner selects either the forward data search or the backward data search by voice (203, 204). If the forward data search is selected, transportation request information for locations after the current location is provided (205). If the backward data search is selected, transportation request information for locations before the current location is provided (206).
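The forward and backward searches can be sketched as a filter over requests positioned along the route. The data layout is an assumption: each request is a dict with a hypothetical `pickup_km` field giving its distance along the route, and results are ordered by closeness to the truck.

```python
def search_requests(requests, current_km, direction):
    """Return requests ahead of ("forward") or behind ("backward")
    the truck's current position, nearest first."""
    if direction == "forward":
        hits = [r for r in requests if r["pickup_km"] >= current_km]
    elif direction == "backward":
        hits = [r for r in requests if r["pickup_km"] < current_km]
    else:
        raise ValueError("direction must be 'forward' or 'backward'")
    return sorted(hits, key=lambda r: abs(r["pickup_km"] - current_km))

requests = [
    {"id": "A", "pickup_km": 10.0},
    {"id": "B", "pickup_km": 55.0},
    {"id": "C", "pickup_km": 80.0},
    {"id": "D", "pickup_km": 40.0},
]
ahead = [r["id"] for r in search_requests(requests, 50.0, "forward")]
behind = [r["id"] for r in search_requests(requests, 50.0, "backward")]
```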
Abstract:
PURPOSE: A voice recognition apparatus for generating a plurality of recognition results is provided to generate N-best recognition results using a phoneme sequence-based search unit. CONSTITUTION: A continuous voice recognition unit (101) performs voice recognition on input voice data and outputs, as the recognition result, the word sequence most similar to the input voice data. A phoneme sequence converter (102) converts the recognition result into a phoneme sequence. A phoneme sequence-based search unit (103) searches a language model (105) for a plurality of word sequences whose phoneme sequence distance from the recognition result is small.
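Ranking candidate word sequences by phoneme distance can be sketched as below. The abstract does not name the distance measure, so this example assumes Levenshtein edit distance over phoneme symbols and a flat candidate list in place of a real language model search. The names `edit_distance` and `n_best` are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two symbol sequences."""
    prev = list(range(len(b) + 1))
    for i, sym_a in enumerate(a, 1):
        cur = [i]
        for j, sym_b in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                      # deletion
                           cur[j - 1] + 1,                   # insertion
                           prev[j - 1] + (sym_a != sym_b)))  # substitution
        prev = cur
    return prev[-1]

def n_best(recognized_phonemes, candidates, n):
    """Return the n word sequences whose phonemes are closest to the result.

    candidates: list of (word_sequence, phoneme_sequence) pairs.
    """
    ranked = sorted(candidates,
                    key=lambda c: edit_distance(recognized_phonemes, c[1]))
    return [words for words, _ in ranked[:n]]

candidates = [("cat", list("kat")), ("cut", list("kut")), ("dog", list("dog"))]
top_two = n_best(list("kat"), candidates, 2)
```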
Abstract:
PURPOSE: A voice recognition apparatus and method with a two-step utterance verification structure that reduces the complexity of N-best recognized word calculation are provided to prompt the user to repeat the utterance or to notify the user of an utterance error. CONSTITUTION: A voice recognition module (130) recognizes input voice data using a first model and outputs a first N-best word list. An utterance verification module (140) creates a second N-best word list and, using a second model, creates a final N-best word list from the second N-best word list.
Abstract:
PURPOSE: A target signal detecting device using a statistical model, and a method thereof, are provided to detect voice frame intervals in which a user's voice is present, even in a noisy environment. CONSTITUTION: A cross-correlation function estimation unit (23-1) calculates conditional probabilities for a plurality of sound source frames corresponding to an audio signal. From these conditional probabilities, the unit estimates the likelihood ratio between the target-signal-present and target-signal-absent cases for the normalized cross-correlation function. A density estimating unit (25) estimates the density of the cross-correlation function as a moving average. An interference signal density estimation unit (29) estimates the statistical mean and deviation of the normalized cross-correlation function for interference signal frames, given the conditional target signal absence probability.
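A likelihood-ratio test on the normalized cross-correlation value can be sketched as follows. The abstract does not state the form of the statistical model, so this example assumes Gaussian densities for both the target-present and target-absent (interference) cases, each described by a (mean, deviation) pair, and a fixed decision threshold. All names are hypothetical.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def likelihood_ratio(value, target, interference):
    """Ratio of target-present to target-absent likelihoods for one NCC value.

    target, interference: (mean, std) pairs of the assumed Gaussian models.
    """
    return gaussian_pdf(value, *target) / gaussian_pdf(value, *interference)

def detect_target_frames(ncc_values, target, interference, threshold=1.0):
    """Flag frames whose likelihood ratio exceeds the threshold."""
    return [likelihood_ratio(v, target, interference) > threshold
            for v in ncc_values]

flags = detect_target_frames([0.9, 0.1], target=(0.9, 0.1),
                             interference=(0.1, 0.1))
```

In a running system the interference mean and deviation would be updated over time, for example with a moving average over frames judged to contain no target signal; that update loop is omitted here.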