-
Publication number: KR1020140074636A
Publication date: 2014-06-18
Application number: KR1020120142816
Filing date: 2012-12-10
Applicant: 한국전자통신연구원
CPC classification number: G10L15/10, G10L15/005
Abstract: A pronunciation evaluation device according to an embodiment makes it easy to evaluate uttered English speech input with respect to the phonological and acoustic aspects of native English speech. The device comprises: a likelihood ratio measuring module, which includes a first null hypothesis calculation unit that calculates a first null hypothesis for the uttered English speech input, a first alternative hypothesis calculation unit that calculates a first alternative hypothesis corresponding to the first null hypothesis, a second null hypothesis calculation unit that calculates a second null hypothesis for native English speech set in correspondence with the English speech, and a second alternative hypothesis calculation unit that calculates a second alternative hypothesis corresponding to the second null hypothesis; and a pronunciation scoring module that calculates a pronunciation match probability value for the English speech relative to the native English speech, based on the first and second null hypotheses and the first and second alternative hypotheses.
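The abstract does not give a formula for combining the four hypotheses, so the following is only a minimal sketch under stated assumptions: each hypothesis is represented by a total log-likelihood, each null/alternative pair is reduced to a per-frame log-likelihood ratio, and the match value is an exponential of the gap between the learner ratio and the native ratio. The function names, the `scale` parameter, and the mapping itself are hypothetical and are not taken from the patent.

```python
import math


def log_likelihood_ratio(null_loglik, alt_loglik, num_frames):
    """Per-frame log-likelihood ratio of a null hypothesis against its alternative."""
    return (null_loglik - alt_loglik) / num_frames


def pronunciation_match_score(learner_llr, native_llr, scale=1.0):
    """Map the gap between learner and native ratios to a value in (0, 1].

    Assumption: a smaller gap means a closer match; the exponential mapping
    is an illustrative choice, not the scoring rule of the patent.
    """
    return math.exp(-scale * abs(learner_llr - native_llr))


# Hypothetical total log-likelihoods over 100 frames each.
learner_llr = log_likelihood_ratio(-520.0, -500.0, 100)  # first null vs. first alternative
native_llr = log_likelihood_ratio(-450.0, -480.0, 100)   # second null vs. second alternative
score = pronunciation_match_score(learner_llr, native_llr)
```

Normalizing by the frame count keeps utterances of different lengths comparable, which is one common reason to work with per-frame ratios.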
-
Publication number: KR101756287B1
Publication date: 2017-07-26
Application number: KR1020130077494
Filing date: 2013-07-03
Applicant: 한국전자통신연구원
IPC: G10L15/02
CPC classification number: G10L15/02
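Abstract: A feature extraction apparatus for speech recognition according to the present invention comprises: a frame forming unit that divides an input speech signal into frames of a predetermined size; a static feature extraction unit that extracts a static feature vector for each frame of the speech signal; a dynamic feature extraction unit that uses a basis function or basis vector to extract a dynamic feature vector representing the change over time of the extracted static feature vectors; and a feature vector combining unit that combines the extracted static feature vectors and dynamic feature vectors to form a feature vector stream.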
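The static-plus-dynamic pipeline described in this abstract can be sketched roughly as follows. The abstract does not say which basis is used, so this example assumes a single linear-slope basis vector over a five-frame window (the familiar delta regression weights) and clamps indices at the utterance edges; `dynamic_features`, `combine`, and `SLOPE_BASIS` are hypothetical names.

```python
def dynamic_features(static, basis):
    """Project each frame's context window onto `basis` (one weight per frame offset)."""
    half = len(basis) // 2
    last = len(static) - 1
    out = []
    for t in range(len(static)):
        vec = [0.0] * len(static[0])
        for k, weight in enumerate(basis):
            # Clamp at the edges so border frames reuse the nearest available frame.
            src = static[min(max(t + k - half, 0), last)]
            for d, value in enumerate(src):
                vec[d] += weight * value
        out.append(vec)
    return out


def combine(static, dynamic):
    """Concatenate static and dynamic vectors frame by frame into one stream."""
    return [s + d for s, d in zip(static, dynamic)]


# Assumed basis: least-squares slope over offsets -2..2 (weights n / 10).
SLOPE_BASIS = [-0.2, -0.1, 0.0, 0.1, 0.2]

# Toy static features: two dimensions that grow linearly with the frame index.
static = [[float(t), 2.0 * t] for t in range(6)]
stream = combine(static, dynamic_features(static, SLOPE_BASIS))
```

For an interior frame of this toy input the dynamic part recovers the slopes 1.0 and 2.0, so each entry of `stream` holds four values: two static and two dynamic.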
-
Publication number: KR101697650B1
Publication date: 2017-01-18
Application number: KR1020120142816
Filing date: 2012-12-10
Applicant: 한국전자통신연구원
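Abstract: A pronunciation evaluation device according to an embodiment makes it easy to evaluate uttered English speech input with respect to the phonological and acoustic aspects of native English speech. The embodiment provides a pronunciation evaluation device comprising: a likelihood ratio measuring module, which includes a first null hypothesis calculation unit that calculates a first null hypothesis for the uttered English speech input, a first alternative hypothesis calculation unit that calculates a first alternative hypothesis corresponding to the first null hypothesis, a second null hypothesis calculation unit that calculates a second null hypothesis for native English speech set in correspondence with the English speech, and a second alternative hypothesis calculation unit that calculates a second alternative hypothesis corresponding to the second null hypothesis; and a pronunciation scoring module that calculates a pronunciation match probability value for the English speech relative to the native English speech, based on the first and second null hypotheses and the first and second alternative hypotheses.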
-
Publication number: KR1020140077788A
Publication date: 2014-06-24
Application number: KR1020120146925
Filing date: 2012-12-14
Applicant: 한국전자통신연구원
IPC: G10L15/01
Abstract: The present invention relates to a method for generating out-of-vocabulary (OOV) words based on similarity in a speech recognition system. The method includes the steps of: when speech test data is prepared, generating a recognition-target vocabulary dictionary that holds a phoneme string for each word; selecting an OOV word from the speech test data, comparing the phoneme string of the OOV word with that of at least one recognition-target word stored in the dictionary, and calculating a similarity; classifying into a first group the recognition-target words whose similarity falls within a first range, adding them to an OOV dictionary, and revising the grammar; and classifying into a second group the recognition-target words whose similarity falls within a second range, and adding them to the OOV dictionary.
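The comparison and grouping steps above could look roughly like the sketch below. The abstract does not name the similarity measure or the range boundaries, so this example assumes a normalized Levenshtein distance over phoneme strings and two made-up ranges; all names, thresholds, and the toy lexicon are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, 1):
        cur = [i]
        for j, pb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (pa != pb)))  # substitution or match
        prev = cur
    return prev[-1]


def similarity(a, b):
    """Similarity in [0, 1]; 1.0 means identical phoneme strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def group_by_similarity(oov_phones, lexicon,
                        first_range=(0.75, 1.0), second_range=(0.4, 0.75)):
    """Split recognition-target words into two groups by similarity range."""
    first, second = [], []
    for word, phones in lexicon.items():
        s = similarity(oov_phones, phones)
        if first_range[0] <= s <= first_range[1]:
            first.append(word)
        elif second_range[0] <= s < second_range[1]:
            second.append(word)
    return first, second


lexicon = {
    "kitchen": ["k", "ih", "ch", "ah", "n"],
    "kit": ["k", "ih", "t"],
    "dog": ["d", "ao", "g"],
}
oov = ["k", "ih", "t", "ah", "n"]  # phoneme string of the OOV word "kitten"
first_group, second_group = group_by_similarity(oov, lexicon)
```

With these assumed thresholds, "kitchen" (one substitution out of five phonemes) lands in the first group, "kit" in the second, and "dog" in neither.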
-
Publication number: KR1020140076215A
Publication date: 2014-06-20
Application number: KR1020120144528
Filing date: 2012-12-12
Applicant: 한국전자통신연구원
IPC: G10L15/14
Abstract: The present invention discloses a method that uses joint phones contained in multilingual speech data to supplement the speech data that is lacking for a particular language and, at the same time, to retrain the acoustic model without changing its structure. The acoustic model adaptation method of the present invention comprises the steps of: preparing a native-speaker acoustic model containing utterance speech information of native speakers and a foreigner acoustic model containing utterance speech information of foreigners who are not native speakers; and adapting the native-speaker acoustic model and the foreigner acoustic model according to a predetermined criterion.
-
Publication number: KR1020130005160A
Publication date: 2013-01-15
Application number: KR1020110066574
Filing date: 2011-07-05
Applicant: 한국전자통신연구원
CPC classification number: H04M1/72552, G10L15/083, G10L15/30, H04M2250/74, H04W4/12
Abstract: PURPOSE: A message service method using a speech recognition function is provided that delivers a message by combining a speech recognition result with the user's actual voice. CONSTITUTION: A message server (20) recognizes speech transmitted from a transmission terminal (10) (S14). The message server generates a recognition result from the speech and an N-best result based on a confusion network. The message server transmits the generated N-best result to the transmission terminal (S20). The message server receives from the transmission terminal the selected message and an evaluation result for the message accuracy (S26). The message server transmits the message and the evaluation result to a reception terminal (30) (S32). [Reference numerals] (10) Transmission terminal; (20) Message server; (30) Reception terminal; (S10) Inputting speech; (S12, S40) Transmitting the speech; (S14) Recognizing the speech; (S16) Generating a recognition result and an N-best result; (S18) Storing log data; (S20) Transmitting the recognition result and the N-best result; (S22) Displaying the recognition result and the N-best result; (S24) Determining a message and an evaluation result; (S26, S32) Transmitting the message and the evaluation result; (S28) Storing additional log data; (S30) Correcting errors in the recognition result; (S34) Displaying the message and the evaluation result; (S36) Requesting the speech; (S38) Extracting the speech; (S42) Outputting the speech
-
Publication number: KR1020170088165A
Publication date: 2017-08-01
Application number: KR1020160008167
Filing date: 2016-01-22
Applicant: 한국전자통신연구원
Abstract: A deep neural network based speech recognition method according to one aspect of the present invention comprises the steps of: receiving a speech signal; converting the speech signal into a frequency signal; obtaining, as weighted sums of a vector signal composed of the frequency signal and weight vectors, a plurality of max-pooling input node values corresponding to each node of the next hidden layer; and determining the largest of the plurality of max-pooling input node values as the node value of the next hidden layer, wherein the weight vectors are obtained by compressing, along the time axis, a reference weight vector preset by training.
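As a rough illustration of the max-pooling step, the sketch below computes one hidden-node value from several time-compressed variants of a single reference weight vector. The abstract does not specify how the compression is performed or how the shorter weight vectors are aligned with the input, so this example assumes linear resampling and center alignment, and it uses one scalar per frame to stay short. `compress_in_time`, `max_pooled_node`, and the toy numbers are hypothetical.

```python
def compress_in_time(weights, new_len):
    """Linearly resample a per-frame weight sequence down to `new_len` frames."""
    if new_len == 1:
        return [weights[len(weights) // 2]]
    step = (len(weights) - 1) / (new_len - 1)
    out = []
    for i in range(new_len):
        pos = i * step
        lo = int(pos)
        hi = min(lo + 1, len(weights) - 1)
        frac = pos - lo
        out.append((1 - frac) * weights[lo] + frac * weights[hi])
    return out


def max_pooled_node(frames, reference_weights, lengths):
    """Return the largest weighted sum over the time-compressed weight variants."""
    center = len(frames) // 2
    candidates = []
    for n in lengths:
        w = compress_in_time(reference_weights, n)
        start = center - n // 2  # assumption: align each variant on the center frame
        window = frames[start:start + n]
        candidates.append(sum(x * wi for x, wi in zip(window, w)))
    return max(candidates), candidates


frames = [0, 1, 2, 3, 2, 1, 0]                   # toy per-frame inputs
reference_weights = [0.0, 0.5, 1.0, 0.5, 0.0]    # assumed trained reference weights
node_value, pool_inputs = max_pooled_node(frames, reference_weights, lengths=(5, 3))
```

Here the five-frame variant yields 5.0 and the three-frame variant 3.0, so the node takes the value 5.0. Pooling over such variants is one plausible way to make a node less sensitive to speaking rate, but that reading is an interpretation rather than something the abstract states.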
-
Publication number: KR101729972B1
Publication date: 2017-04-25
Application number: KR1020130055449
Filing date: 2013-05-16
Applicant: 한국전자통신연구원
IPC: G10L15/183, G10L15/28
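Abstract: A speech recognition apparatus according to the present invention comprises: a feature extraction unit that extracts information useful for recognition from input speech and converts it into a feature vector; an acoustic model database that stores a predetermined acoustic model; a language model database that stores a predetermined language model; a pronunciation model database that stores a pronunciation model in which variant pronunciations arising from grammatical errors that non-native speakers may make are further added to a native-speaker pronunciation model; a search unit that, based on the feature vector, finds the most probable word sequence using the acoustic model database, the pronunciation model database, and the language model database; and a recognition result output unit that provides a recognition result for the input speech using the output of the search unit.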