-
Publication number: KR101178801B1
Publication date: 2012-08-31
Application number: KR1020080124371
Application date: 2008-12-09
Applicant: 한국전자통신연구원
IPC: G10L15/10 , G10L15/28 , G10L21/0272 , G10L15/20
CPC classification number: G10L15/20 , G10L21/0272 , G10L2021/02166
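Abstract: The present invention relates to speech recognition using sound source separation and sound source identification. In an environment where the voice of the speech recognizer's user is mixed with noise sources, it separates each original source using multiple microphones and independent component analysis (ICA), and performs high-performance speech recognition on that basis. So that the recognizer can automatically pick out, from among the sources separated by ICA, the speech that the user uttered to operate the recognizer, the invention computes the speech recognition confidence and direction information of the separated sources and assumes that noise sources do not move. With this approach, even when several noise sources surround the user, the user can speak from any position regardless of where they stand relative to the microphone array, and high recognition performance can be obtained.
Microphone array, speech recognition, noise processing, sound source separation, sound source identification, independent component analysis (ICA)
-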
Publication number: KR101134682B1
Publication date: 2012-04-09
Application number: KR1020090056120
Application date: 2009-06-23
Applicant: 한국전자통신연구원
Abstract: PURPOSE: A speaker adaptation system and method are provided that use a cumulative variable to obtain sufficient statistics for unsupervised adaptation during the voice recognition process, without a separate adaptation training step, thereby enabling gradual adaptation. CONSTITUTION: A characteristic detecting part (110) extracts a feature vector from a voice signal. A sound model storage (120) stores an acoustic model consisting of a recursive tree. A conversion parameter class determiner (130) computes the Gaussian posterior probability of a candidate state based on the feature vector and the acoustic model, and determines the cumulative variable and a conversion parameter class based on that posterior probability. A sound model updater (140) computes the conversion parameter based on the conversion parameter class and the cumulative variable, and updates the acoustic model.
-
Publication number: KR1020120019011A
Publication date: 2012-03-06
Application number: KR1020100082078
Application date: 2010-08-24
Applicant: 한국전자통신연구원
Abstract: PURPOSE: An interaction service providing device using user information combination is provided to analyze the state of users and improve the quality of a service. CONSTITUTION: A condition determining unit (130) combines the input personal information of a user with the received personal information of another user. The condition determining unit analyzes the condition of the user and the other user based on the combined information. A service adjusting unit (140) adjusts support service information.
-
Publication number: KR101095867B1
Publication date: 2011-12-21
Application number: KR1020090026451
Application date: 2009-03-27
Applicant: 한국전자통신연구원
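Abstract: The present invention relates to a speech synthesis apparatus and method. To raise the intelligibility of synthesized speech in very noisy or varying noise environments, a speech recognition unit inside the speech synthesis apparatus automatically readjusts the parameter values of those segments of the first-pass synthesized speech whose confidence has dropped because of the noise environment, setting them to values suited to that environment. A second pass then generates synthesized speech from the readjusted parameter values, yielding synthesized speech that remains highly intelligible in noise.
Speech recognition, synthesis, noise, intelligibility, confidence
-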
Publication number: KR101068120B1
Publication date: 2011-09-28
Application number: KR1020080126244
Application date: 2008-12-12
Applicant: 한국전자통신연구원
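Abstract: The present invention relates to a technique for performing speech recognition through multiple searches over an input speech signal. Unlike conventional methods that recognize the input speech signal using a technique such as the FSN method or the N-gram method, the invention runs speech searches with the FSN method and the N-gram method in parallel, builds an integrated search network from the resulting first word lattice and second word lattice, and re-runs the speech search over the integrated search network to output the speech recognition result. Through this multiple search with the FSN and N-gram methods, the recognition rate for the input speech signal can be improved.
Speech recognition, FSN (Finite State Network) method, N-gram language model method
-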
Publication number: KR1020100068965A
Publication date: 2010-06-24
Application number: KR1020080127491
Application date: 2008-12-15
Applicant: 한국전자통신연구원
Abstract: PURPOSE: A device and a method for automatic interpretation are provided to improve translation accuracy by using various information extracted from the input speech. CONSTITUTION: A feeling state analyzer (106) judges the feeling state of a first voice in a first language and passes on the resulting feeling information. A gender determiner (108) judges the gender of the first voice and passes on the gender information. A voice recognizer (110) recognizes the first voice and outputs a first text in the first language. A sentence style determiner (112) determines the sentence pattern of the first text and passes on the sentence pattern information. A translation unit (114) translates the first text into a second text in a second language with reference to the feeling information. A voice synthesizer (116) synthesizes the second text into a second voice in the second language with reference to the speaker, feeling, gender, and sentence pattern information.
-
Publication number: KR1020100065811A
Publication date: 2010-06-17
Application number: KR1020080124371
Application date: 2008-12-09
Applicant: 한국전자통신연구원
IPC: G10L15/10 , G10L15/28 , G10L21/0272 , G10L15/20
CPC classification number: G10L15/20 , G10L21/0272 , G10L2021/02166
Abstract: PURPOSE: A speech recognition apparatus using sound source separation and sound source identification, and a method therefor, are provided so that a speech recognizer can be used even in an environment with point-source noise, thereby enabling a variety of application systems for the speech recognizer. CONSTITUTION: A sound source separator separates mixed signals into sound source signals by independent component analysis. The sound source separator extracts DOA (Direction Of Arrival) information for the separated sound source signals. A speech recognizer (108) computes normalized log-likelihood values for the separated sound source signals. A user voice signal identifying device (112) uses the speech recognition reliability to identify the sound source corresponding to the user's voice signal.
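The identification step in this record (KR1020100065811A) can be illustrated with a minimal sketch. It assumes each separated source already carries a DOA estimate in degrees and a normalized log-likelihood score from the recognizer, and it borrows the "noise sources do not move" assumption stated in the abstract of the granted publication of the same application (KR101178801B1). The function name `pick_user_source`, the angular tolerance, and the selection rule itself are hypothetical; the abstracts do not say how the two cues are combined.

```python
def pick_user_source(sources, noise_doas, tolerance_deg=10.0):
    """Return the index of the source judged to be the user's voice, or None.

    sources:       list of dicts with 'doa' (degrees) and 'score'
                   (normalized log-likelihood from the recognizer)
    noise_doas:    directions already attributed to stationary noise
    tolerance_deg: hypothetical matching window around a noise direction
    """
    def angular_gap(a, b):
        # Smallest absolute difference between two angles on a circle.
        d = abs(a - b) % 360.0
        return min(d, 360.0 - d)

    # Drop sources that sit on a known (assumed stationary) noise direction.
    candidates = [
        (src["score"], i)
        for i, src in enumerate(sources)
        if all(angular_gap(src["doa"], n) > tolerance_deg for n in noise_doas)
    ]
    if not candidates:
        return None
    # Among the remaining sources, prefer the highest recognition score.
    return max(candidates)[1]


sources = [
    {"doa": 30.0, "score": -2.0},
    {"doa": 120.0, "score": -5.0},
    {"doa": 200.0, "score": -1.0},
]
print(pick_user_source(sources, noise_doas=[198.0]))
```

In this toy input the third source has the best score but lies within the tolerance of a known noise direction, so the first source is selected instead.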
-
Publication number: KR100930587B1
Publication date: 2009-12-09
Application number: KR1020070122185
Application date: 2007-11-28
Applicant: 한국전자통신연구원
IPC: G10L15/01 , G10L15/10 , G10L15/187
Abstract: A confusion-matrix-based utterance verification method and apparatus are provided that select phonemes with high discrimination by using a probability value of a confusion matrix as a weight on the likelihood value of a monophone model. An input voice is recognized by Viterbi decoding with a context-dependent phoneme model (307). A likelihood value is calculated for each phoneme in a pre-trained context-independent phoneme model and for each phoneme in the character string produced as the voice recognition result (309). The reliability of the recognized character string is measured from the calculated likelihood value of each phoneme and the pre-calculated probability value of the confusion matrix (311). Whether to accept or reject the recognized character string is determined from the measured reliability (313, 315, 317).
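The weighting idea in this record can be illustrated with a minimal sketch. It assumes per-phoneme monophone log-likelihoods are already available for each aligned segment and that each row of the confusion matrix holds probabilities. The function names `confidence` and `verify`, the confusion-weighted anti-model formula, and the threshold are hypothetical; the abstract only says that confusion-matrix probabilities weight the monophone likelihoods.

```python
def confidence(recognized, loglik, confusion):
    """Average confusion-weighted log-likelihood ratio over the phonemes.

    recognized: phoneme labels of the recognized string
    loglik:     loglik[i][q] is the monophone log-likelihood of phoneme q
                on the segment aligned to recognized[i]
    confusion:  confusion[p][q] is the probability that p is confused with q
    """
    total = 0.0
    for i, p in enumerate(recognized):
        # Rival phonemes count in proportion to how confusable they are with p.
        rivals = {q: w for q, w in confusion[p].items() if q != p and w > 0.0}
        norm = sum(rivals.values())
        if norm == 0.0:
            continue  # no confusable rival: this phoneme adds no evidence
        anti = sum(w * loglik[i][q] for q, w in rivals.items()) / norm
        total += loglik[i][p] - anti
    return total / len(recognized)


def verify(recognized, loglik, confusion, threshold=0.0):
    """Accept the recognized string when its confidence reaches the threshold."""
    return confidence(recognized, loglik, confusion) >= threshold


recognized = ["a", "b"]
loglik = [
    {"a": -1.0, "b": -3.0, "c": -5.0},
    {"a": -4.0, "b": -2.0, "c": -6.0},
]
confusion = {
    "a": {"a": 0.8, "b": 0.15, "c": 0.05},
    "b": {"a": 0.3, "b": 0.6, "c": 0.1},
}
print(verify(recognized, loglik, confusion, threshold=1.0))
```

A larger margin between the recognized phoneme and its most confusable rivals raises the score, so the decision leans on exactly the phoneme pairs that are hardest to tell apart.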
-
Publication number: KR1020090055320A
Publication date: 2009-06-02
Application number: KR1020070122185
Application date: 2007-11-28
Applicant: 한국전자통신연구원
IPC: G10L15/01 , G10L15/10 , G10L15/187
Abstract: A confusion-matrix-based utterance verification method and apparatus are provided that select phonemes with high discrimination by using a probability value of a confusion matrix as a weight on the likelihood value of a monophone model. An input voice is recognized by Viterbi decoding with a context-dependent phoneme model (307). A likelihood value is calculated for each phoneme in a pre-trained context-independent phoneme model and for each phoneme in the character string produced as the voice recognition result (309). The reliability of the recognized character string is measured from the calculated likelihood value of each phoneme and the pre-calculated probability value of the confusion matrix (311). Whether to accept or reject the recognized character string is determined from the measured reliability (313, 315, 317).