Abstract:
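The present invention relates to an utterance verification method and apparatus for isolated-word N-best recognition results. For results recognized at the word level by N-best speech recognition, a confidence score is measured through N-best utterance verification, and the similarity between phonemes is measured by dynamic time warping. Based on the confidence and the similarity, the recognition result is marked as accepted, rejected, or undecidable, which enables more reliable speech recognition. Keywords: N-best, utterance verification, speech, recognition, confidence, similarity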
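As a rough illustration of the abstract above, the sketch below computes a dynamic-time-warping similarity between two phoneme sequences and combines it with a confidence score into a three-way decision. The 0/1 phoneme cost, the threshold values, and the decision rule itself are assumptions made for this example; the abstract does not specify them.

```python
def dtw_distance(a, b, cost):
    """Classic DTW: minimal accumulated cost of aligning sequence a with b."""
    inf = float("inf")
    n, m = len(a), len(b)
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = cost(a[i - 1], b[j - 1])
            d[i][j] = step + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


def phoneme_similarity(a, b):
    """Map the DTW distance to [0, 1]; 1.0 means identical sequences.

    Assumption: a 0/1 cost per phoneme pair (a real system would likely
    use an acoustic or confusion-based phoneme distance).
    """
    if not a and not b:
        return 1.0
    if not a or not b:
        return 0.0
    dist = dtw_distance(a, b, lambda x, y: 0.0 if x == y else 1.0)
    return 1.0 - dist / max(len(a), len(b))


def verify(confidence, similarity_to_runner_up,
           accept_at=0.7, reject_below=0.3, confusable_at=0.8):
    """Hypothetical decision rule: thresholds are illustrative only."""
    if confidence < reject_below:
        return "reject"
    if confidence >= accept_at and similarity_to_runner_up < confusable_at:
        return "accept"
    # Confident but acoustically close to another candidate, or mid-range.
    return "undecidable"
```

In this sketch a high-confidence top candidate is still reported as undecidable when its phoneme sequence is very close to the runner-up, which is one plausible way to use both measurements together.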
Abstract:
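The present invention relates to a technique that performs speech recognition on continuously spoken Korean digit speech and generates multiple recognition candidates from the recognition result, based on a confusion matrix and a confidence value. Because the confusion matrix consists of the digits that are misrecognized for each spoken digit, it is built beforehand by running recognition on an experimental database. The confidence value is the log-likelihood ratio obtained by dividing the statistical likelihood, which is the per-digit recognition score produced by the recognizer, by the number of frames, which is the duration index of that word. According to the present invention, the memory usage and search time needed to generate the N-best list are reduced without degrading performance, which raises the efficiency of the speech recognition engine. Keywords: speech recognition, digit speech, N-best, confidence value, confusion matrix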
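The idea in the abstract above can be sketched as follows: each recognized digit gets a confidence equal to its log-likelihood score divided by its frame count, and digits whose confidence falls below a threshold are expanded with their confusable alternatives. The function names, the threshold, and the sample confusion table are hypothetical and chosen only for illustration.

```python
from itertools import product


def confidence(log_likelihood, num_frames):
    """Duration-normalized score: log-likelihood per frame."""
    return log_likelihood / num_frames


def expand_candidates(digits, confusions, threshold):
    """Build candidate digit strings from a 1-best result.

    digits:     list of (digit, log_likelihood, num_frames) tuples
    confusions: dict mapping a digit to the digits it is often mistaken for
                (assumed to be derived beforehand from a confusion matrix)
    threshold:  digits with confidence below this value are expanded
    """
    options = []
    for digit, log_likelihood, num_frames in digits:
        alternatives = [digit]
        if confidence(log_likelihood, num_frames) < threshold:
            alternatives += confusions.get(digit, [])
        options.append(alternatives)
    # The first string is always the original 1-best hypothesis.
    return ["".join(combo) for combo in product(*options)]


# Illustrative values only: "1" is low-confidence, "2" is not.
result = expand_candidates(
    [("1", -50.0, 10), ("2", -20.0, 10)],
    {"1": ["7"], "2": ["5"]},
    threshold=-3.0,
)
```

Because candidates are produced after a single 1-best pass, no N-best bookkeeping is needed during the search itself, which is the saving the abstract describes.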
Abstract:
PURPOSE: An interactive contents providing device and method in an e-book system are provided to create interactive contents by editing contents supplied by a CP. CONSTITUTION: CPs (Contents Providers)(102,103) generate and provide the contents. A contents making device(110) creates interactive contents by processing and editing the contents received from the CPs, and provides the interactive contents to the CPs. Terminals(105-108,119) receive the interactive contents from the CPs and present them to users. The interactive contents comprise script, object data, and scene data.
Abstract:
PURPOSE: A device for separating sound sources and a method thereof are provided to extract only a desired sound from a mixture of sound sources. CONSTITUTION: An input unit(610) converts the input signal into the frequency domain. A processing unit(620) separates the sound sources of the converted signal in units of frequency bands. The processing unit aligns the separated sound sources using the phase difference of the mixing filter that mixes the sound sources. An output unit(630) converts the aligned sound sources back into the time domain.
Abstract:
PURPOSE: An apparatus for model-based noise filtering with distortion compensation for voice recognition is provided to remove noise without introducing distortion. CONSTITUTION: A voice absence probability calculator(206) calculates a voice absence probability, and a noise estimating and updating unit(208) updates the estimated noise. A first noise removing filter uses the voice absence probability and the updated noise estimate to output a first pure voice estimate that still contains distortion. A second noise removing filter filters the pure voice estimate and outputs a distortion-compensated final voice signal. The pure voice estimate is obtained based on the a posteriori probability.
Abstract:
PURPOSE: A Viterbi decoder and a method for recognizing a voice are provided to prevent the sharp drop in observation probability that an unintended impulse noise causes in the contaminated portion of the signal. CONSTITUTION: An optimal state calculator(220) obtains the state with the maximum accumulated similarity for each observation vector in the observation vector sequence of the input voice. A buffer unit(240) stores the observation probability values of the voice frames input before the current one. A non-linear filtering unit(250) calculates a filtered observation probability value from the value calculated by an observation probability calculator(230). A maximum similarity producer(260) calculates a local maximum similarity value based on the filtered observation probability value.
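To make the buffering and filtering step above concrete, here is a minimal sketch that smooths a stream of observation log-probabilities with a sliding median. The abstract only says the filter is non-linear; the choice of a median, the buffer size, and the class name are assumptions for this example, and the surrounding Viterbi recursion is omitted.

```python
from collections import deque
from statistics import median


class MedianSmoother:
    """Sliding median over the most recent observation log-probabilities."""

    def __init__(self, size=3):
        # Plays the role of the buffer unit: keeps only the last `size` values.
        self.buffer = deque(maxlen=size)

    def push(self, log_prob):
        """Store the new value and return the median of the buffered values."""
        self.buffer.append(log_prob)
        return median(self.buffer)


# Illustrative values: the third frame is hit by an impulse noise.
smoother = MedianSmoother(size=3)
raw = [-2.0, -2.1, -30.0, -2.2]
smoothed = [smoother.push(value) for value in raw]
```

With these sample values the outlier of -30.0 never reaches the output, so a decoder consuming `smoothed` would not see the abrupt probability drop. A median is only one option; any order-statistic or clipping filter would serve the same illustrative purpose.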
Abstract:
PURPOSE: A method and an apparatus for reducing noise are provided to strengthen the separation of voice and noise through a voice/noise separation function such as a soft masking technique, thereby estimating the clean voice accurately. CONSTITUTION: A noise estimator(130) estimates the noise component within the input voice signal. A posterior probability estimator(140) estimates a posterior probability value from the noise component. A noise parameter adapting unit(150) applies a noise Gaussian mixture model to the input voice signal. A voice/noise separating unit(160) performs a primary separation of the noise and the voice signal. A noise removing unit(170) eliminates the residual noise components of the voice signal.
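The soft masking mentioned above can be illustrated with a per-bin gain between 0 and 1 instead of a hard keep/discard decision. The sketch below derives the gain from noisy and estimated noise power in a spectral-subtraction style; this formula, the function names, and the sample values are assumptions for illustration and do not reproduce the posterior-probability and Gaussian-mixture steps of the abstract.

```python
def soft_mask(noisy_power, noise_power, floor=0.0):
    """Per-bin gain in [0, 1]: estimated speech power over noisy power."""
    masks = []
    for y, n in zip(noisy_power, noise_power):
        speech = max(y - n, 0.0)          # crude speech power estimate
        gain = speech / y if y > 0 else 0.0
        masks.append(max(gain, floor))    # optional floor limits over-suppression
    return masks


def apply_mask(magnitudes, masks):
    """Scale each spectral magnitude by its soft gain."""
    return [m * g for m, g in zip(magnitudes, masks)]


# Illustrative values only: three frequency bins of one frame.
gains = soft_mask([4.0, 1.0, 0.0], [1.0, 2.0, 0.0])
cleaned = apply_mask([2.0, 1.0, 0.0], gains)
```

Bins dominated by speech keep most of their energy while bins dominated by noise are attenuated toward zero, which is the graded behavior that distinguishes soft masking from a binary mask.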
Abstract:
An apparatus and a method for generating a noise adaptive acoustic model, including discriminative noise adaptive training for environment transfer, are provided to apply a voice recognition system effectively to a noisy environment. Voice training data(201) is large-volume voice data, reflecting various noise environments, that is primarily used for acoustic model training. A noise reduction unit(203) removes the various noise components included in the voice training data. A noise adaptive training unit(205) trains on the voice training data using the acoustic model training method. The trained acoustic model is set as the base acoustic model parameter(207). Voice data(211) for environmental adaptation is a small amount of voice data collected in the environment where the voice recognition system is applied.
Abstract:
A home network system based on a voice interface and a control method thereof are provided to search, through a single voice command, multimedia contents that are stored across the various multimedia devices connected to a home network, and to reproduce the selected contents on a desired device regardless of location, thereby maximizing user convenience. The home network system based on a voice interface comprises a home media server. The home media server(100) comprises the following: a profile manager module; a media transmission module(120); a device recognition and control module(130); a communication module(140); a voice processing module(160) which recognizes the user's voice command, searches for information about the recognized voice command, and generates and outputs a synthesized sound for the search result; and a central control unit(150) which stores information about the multimedia contents held in the multimedia devices connected to the home network, and controls the multimedia contents which the user wants.