Apparatus and method for determining speech rate based on harmonic component reset detection
    31.
    Invention publication
    Status: Pending (substantive examination)

    Publication No.: KR1020170082892A

    Publication date: 2017-07-17

    Application No.: KR1020160002180

    Filing date: 2016-01-07

    Abstract: The present invention relates to an apparatus and method for determining speech rate based on detection of harmonic component resets, designed to improve natural language recognition performance for speech of varying rates in a speech recognition system. By exploiting the harmonic component resets that arise from the strong harmonic components of vowels, performance degradation of the natural language recognizer caused by differences in speech rate can be reduced; by estimating syllable boundaries, vowel lengthening can be detected and used to improve the recognizer's performance; and because estimating harmonic components in the frequency domain is more precise than methods that compute the pitch gain, an accurate speech rate is obtained and speech recognition performance is improved.
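
    The abstract describes the idea only at a high level. The following minimal sketch, written as an illustration rather than the patented method, counts upward crossings of a crude autocorrelation-based voicing contour as "harmonic resets" and converts them into an approximate syllable rate; the thresholds, frame sizes, and the voicing measure itself are assumptions.

import numpy as np

def frame_harmonicity(frame, fmin=80, fmax=400, sr=16000):
    """Crude voicing strength: peak of the normalized autocorrelation
    inside the plausible pitch-lag range (illustrative, not the patent's method)."""
    frame = frame - frame.mean()
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    if ac[0] <= 0:
        return 0.0
    ac /= ac[0]
    lo, hi = int(sr / fmax), int(sr / fmin)
    return float(ac[lo:hi].max())

def estimate_speech_rate(signal, sr=16000, frame_len=0.025, hop=0.010, thr=0.4):
    """Count upward crossings of the voicing contour ("resets") and
    convert them to an approximate syllables-per-second rate."""
    n, h = int(frame_len * sr), int(hop * sr)
    voiced = np.array([frame_harmonicity(signal[i:i + n], sr=sr)
                       for i in range(0, len(signal) - n, h)])
    resets = np.sum((voiced[1:] >= thr) & (voiced[:-1] < thr))
    duration = len(signal) / sr
    return resets / duration  # rough syllable-nuclei onsets per second

# Example: synthetic vowel-like tones separated by silences
sr = 16000
t = np.arange(sr) / sr
syllable = np.sin(2 * np.pi * 150 * t[:2400]) + 0.5 * np.sin(2 * np.pi * 300 * t[:2400])
gap = np.zeros(1600)
speech_like = np.concatenate([syllable, gap] * 4)
print(round(estimate_speech_rate(speech_like, sr), 2), "syllable-like onsets per second")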


    Method and apparatus for speech recognition in a noisy environment using uncertainty
    32.
    Invention grant
    Status: In force

    Publication No.: KR101740637B1

    Publication date: 2017-06-08

    Application No.: KR1020130130299

    Filing date: 2013-10-30

    Inventors: 정호영, 송화전

    Abstract: The speech recognition method according to the present invention comprises: extracting speech features from an input speech signal; estimating a noise component of the speech signal; compensating the extracted speech features using the estimated noise component; transforming a given acoustic model based on the extracted speech features, the compensated speech features, and the noise component; and performing speech recognition using the compensated speech features and the transformed acoustic model.
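
    As a rough illustration of the claimed pipeline shape (feature extraction, noise estimation, feature compensation, acoustic model transformation, recognition), the sketch below strings together deliberately simple stand-ins: log-energy features, a leading-frames noise estimate, mean subtraction, and variance inflation of a toy Gaussian model. None of these specific choices come from the patent.

import numpy as np

def extract_features(frames):
    """Placeholder features: log frame energies."""
    return np.log(np.maximum(np.mean(frames ** 2, axis=1), 1e-10))

def estimate_noise(feats, n_lead=10):
    """Assume the first few frames are noise-only (a common, simple heuristic)."""
    return feats[:n_lead].mean(), feats[:n_lead].var() + 1e-6

def compensate(feats, noise_mean):
    """Subtract the noise estimate from the features (crude compensation)."""
    return feats - noise_mean

def adapt_model(means, variances, noise_var):
    """Inflate model variances by the noise uncertainty (illustrative stand-in
    for the patent's acoustic-model transformation)."""
    return means, variances + noise_var

def decode(comp_feats, means, variances):
    """Pick, per frame, the Gaussian state with the highest log-likelihood."""
    ll = -0.5 * ((comp_feats[:, None] - means[None, :]) ** 2 / variances[None, :]
                 + np.log(2 * np.pi * variances[None, :]))
    return ll.argmax(axis=1)

# Toy usage with random "audio" frames and a two-state model
rng = np.random.default_rng(0)
frames = rng.normal(scale=0.2, size=(50, 160))
feats = extract_features(frames)
n_mean, n_var = estimate_noise(feats)
comp = compensate(feats, n_mean)
means, variances = adapt_model(np.array([-1.0, 1.0]), np.array([0.5, 0.5]), n_var)
print(decode(comp, means, variances)[:10])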


    Acoustic model generation method and apparatus therefor
    33.
    Invention grant
    Status: In force

    Publication No.: KR101697649B1

    Publication date: 2017-01-18

    Application No.: KR1020120125935

    Filing date: 2012-11-08

    Abstract: An acoustic model generation method and an apparatus therefor are disclosed. According to one embodiment of the present invention, the acoustic model generation method comprises: generating an acoustic model using pre-collected training speech data; performing tree-based state clustering based on the generated acoustic model and the training speech data; generating a state tree through the tree-based state clustering; and generating a final acoustic model using log speech data obtained from users' speech and the generated state tree. The clustering step performs the tree-based state clustering based on statistics of context-dependent phones obtained from the training speech data and a question set obtained from a phonetic knowledge base, so that an acoustic model optimized for the actual usage environment is generated and speech recognition performance is thereby improved.
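
    The core of tree-based state clustering is a greedy split driven by a phonetic question set and a likelihood gain computed from state statistics. The sketch below shows one such split step on toy context-dependent states; the question set, the single-Gaussian likelihood, and the data are illustrative assumptions rather than the patented procedure.

import numpy as np

# Toy context-dependent states: (left_context, center, right_context) -> feature samples
rng = np.random.default_rng(1)
states = {("a", "t", "a"): rng.normal(0.0, 1, 40),
          ("a", "t", "i"): rng.normal(0.2, 1, 40),
          ("s", "t", "a"): rng.normal(2.0, 1, 40),
          ("s", "t", "i"): rng.normal(2.2, 1, 40)}

# Phonetically motivated yes/no questions (illustrative question set)
questions = {"L=vowel": lambda c: c[0] in "aeiou",
             "R=vowel": lambda c: c[2] in "aeiou",
             "L=fricative": lambda c: c[0] in "sfzv"}

def gauss_loglik(x):
    """Log-likelihood of samples under their own maximum-likelihood Gaussian."""
    var = x.var() + 1e-6
    return -0.5 * len(x) * (np.log(2 * np.pi * var) + 1)

def best_split(contexts):
    """Greedy step of tree-based clustering: pick the question with the
    largest likelihood gain over keeping all states pooled together."""
    pooled = np.concatenate([states[c] for c in contexts])
    base = gauss_loglik(pooled)
    best = None
    for name, q in questions.items():
        yes = [c for c in contexts if q(c)]
        no = [c for c in contexts if not q(c)]
        if not yes or not no:
            continue
        gain = (gauss_loglik(np.concatenate([states[c] for c in yes]))
                + gauss_loglik(np.concatenate([states[c] for c in no])) - base)
        if best is None or gain > best[0]:
            best = (gain, name, yes, no)
    return best

gain, name, yes, no = best_split(list(states))
print(f"split on {name}: gain={gain:.1f}, yes={yes}, no={no}")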


    Feature compensation apparatus and method for speech recognition in a noisy environment
    34.
    Invention publication
    Status: Pending (substantive examination)

    Publication No.: KR1020160112793A

    Publication date: 2016-09-28

    Application No.: KR1020150039098

    Filing date: 2015-03-20

    CPC classification number: G10L15/20 G10L15/02 G10L21/0216 G10L15/142

    Abstract: The feature compensation apparatus for speech recognition in a noisy environment according to the present invention comprises: a feature extraction unit that extracts speech feature information from a noise-corrupted speech signal consisting of two or more frames; a noise estimation unit that estimates noise feature information from the extracted speech feature information and the compensated speech features; a probability calculation unit that computes the correlation between adjacent frames of the noise-corrupted speech signal; and a speech feature compensation unit that generates the compensated speech features by removing the noise feature from the extracted speech feature information in consideration of the inter-frame correlation and the estimated noise feature information.
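
    A minimal sketch of the described structure follows: an inter-frame correlation is computed between adjacent feature vectors and used to weight how much of a noise estimate is removed from each frame. The weighting rule and the toy features are assumptions made for illustration only.

import numpy as np

def adjacent_frame_correlation(feats):
    """Pearson correlation between each pair of adjacent frame feature vectors."""
    a, b = feats[:-1], feats[1:]
    a = a - a.mean(axis=1, keepdims=True)
    b = b - b.mean(axis=1, keepdims=True)
    num = (a * b).sum(axis=1)
    den = np.sqrt((a ** 2).sum(axis=1) * (b ** 2).sum(axis=1)) + 1e-12
    return num / den

def compensate(feats, noise_mean, corr):
    """Remove more of the noise estimate where adjacent frames are strongly
    correlated (taken here as a sign of stationary noise); the weighting rule
    is illustrative, not the patented one."""
    w = np.clip(np.concatenate([[corr[0]], corr]), 0.0, 1.0)  # one weight per frame
    return feats - w[:, None] * noise_mean[None, :]

rng = np.random.default_rng(2)
clean = rng.normal(size=(100, 13))                 # stand-in for MFCC-like features
noise_mean = np.linspace(0.2, 1.0, 13)             # assumed stationary noise estimate
noisy = clean + noise_mean + rng.normal(scale=0.05, size=clean.shape)
corr = adjacent_frame_correlation(noisy)
compensated = compensate(noisy, noise_mean, corr)
print(noisy.mean().round(3), compensated.mean().round(3))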


    Apparatus and method for controlling a mobile terminal through conversation recognition, and apparatus for providing information through conversation recognition during a meeting
    35.
    Invention publication
    Status: Pending (substantive examination)

    Publication No.: KR1020140078258A

    Publication date: 2014-06-25

    Application No.: KR1020120147429

    Filing date: 2012-12-17

    Inventor: 정호영

    Abstract: A device for controlling a mobile terminal according to the present invention comprises: a conversation recognizing unit to recognize a conversation among users by mobile terminals; a user intention identifying unit to identify an intention of at least one user among the users based on the recognized result; and an additional function control unit to execute an additional function corresponding to the user intention identified in the mobile terminal of the user. According to the present invention, the device can recognize the conversation among the users to directly provide information associated with the conversation or to provide a service, thereby improving communication among the users.
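
    The three units can be pictured as a small pipeline: recognize the conversation, map it to an intention, and trigger a function on the terminal. The sketch below stubs the recognizer with a fixed transcript and uses naive keyword spotting for intent identification; the phrase-to-action table is entirely hypothetical.

from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Hypothetical intent table: phrase fragment -> additional function to run.
ACTIONS: Dict[str, Callable[[], str]] = {
    "schedule a meeting": lambda: "opening the calendar",
    "send the file": lambda: "opening the file-share dialog",
    "call": lambda: "starting a call",
}

@dataclass
class ConversationController:
    transcript: List[str]

    def recognize_conversation(self) -> str:
        """Stand-in for the conversation recognizing unit; a real system would
        run speech recognition on the terminals' microphone input."""
        return " ".join(self.transcript).lower()

    def identify_intention(self, text: str) -> Optional[str]:
        """User intention identifying unit: naive keyword spotting."""
        for phrase in ACTIONS:
            if phrase in text:
                return phrase
        return None

    def execute_additional_function(self, intent: str) -> str:
        """Additional function control unit: run the mapped action."""
        return ACTIONS[intent]()

ctrl = ConversationController(["Let's schedule a meeting for Friday", "Sounds good"])
intent = ctrl.identify_intention(ctrl.recognize_conversation())
print(ctrl.execute_additional_function(intent) if intent else "no action")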


    Method for improving automatic speech recognition performance using intra-frame features
    36.
    Invention publication
    Status: Pending (substantive examination)

    Publication No.: KR1020140059601A

    Publication date: 2014-05-16

    Application No.: KR1020120126211

    Filing date: 2012-11-08

    CPC classification number: G10L15/02 G10L21/0272 G10L21/038

    Abstract: Disclosed is a method for improving automatic voice recognition performance using an intra-frame feature. According to the present invention, the method includes: a step of collecting speech signals and preprocessing the collected speech signals by boosting or attenuating the signals; a step of dividing the preprocessed speech signals by threshold band using a gammatone filter bank and channelizing the signals in each threshold band; a step of frame-blocking the channelized speech signals with a frame shift size of 10 ms and a frame size of 20 - 25 ms; a step of Hamming-windowing each blocked channel and extracting a predefined amount of data from the predefined section; a step of estimating signal intensity from the extracted data based on time-frequency analysis and estimating energy based on the estimated signal intensity; a step of obtaining cepstral coefficients and derivatives through a logarithmic operation and a discrete cosine transform on the estimated energy; a step of performing sub-frame analysis on the preprocessed speech signals and extracting intra-frame features from the sub-frame-analyzed speech signals; and a step of obtaining voice recognition features by combining the cepstral coefficients, the derivatives, and the intra-frame features.
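
    The steps above map naturally onto a conventional cepstral front end with an extra intra-frame term. The sketch below follows that flow with simplifications: a triangular filterbank stands in for the gammatone filterbank, and the intra-frame feature is reduced to the spread of sub-frame energies. The frame sizes match the abstract; everything else is an assumption.

import numpy as np

def frame_signal(x, sr, frame_ms=25, shift_ms=10):
    """Frame blocking: 20-25 ms frames with a 10 ms shift, as in the abstract."""
    n, h = int(sr * frame_ms / 1000), int(sr * shift_ms / 1000)
    idx = np.arange(0, len(x) - n, h)[:, None] + np.arange(n)[None, :]
    return x[idx] * np.hamming(n)            # Hamming-windowed frames

def filterbank_energies(frames, sr, n_bands=24):
    """Band energies from an FFT power spectrum. A triangular filterbank
    stands in here for the gammatone filterbank of the abstract."""
    spec = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    edges = np.linspace(0, spec.shape[1] - 1, n_bands + 2).astype(int)
    fb = np.zeros((n_bands, spec.shape[1]))
    for b in range(n_bands):
        lo, mid, hi = edges[b], edges[b + 1], edges[b + 2]
        fb[b, lo:mid] = np.linspace(0, 1, mid - lo, endpoint=False)
        fb[b, mid:hi] = np.linspace(1, 0, hi - mid, endpoint=False)
    return spec @ fb.T

def cepstra(energies, n_ceps=13):
    """Log band energies followed by a DCT-II give cepstral coefficients."""
    logE = np.log(np.maximum(energies, 1e-10))
    n = logE.shape[1]
    dct = np.cos(np.pi / n * (np.arange(n)[None, :] + 0.5) * np.arange(n_ceps)[:, None])
    return logE @ dct.T

def intra_frame_feature(frames, n_sub=4):
    """Illustrative intra-frame feature: spread of sub-frame log energies
    inside each frame (a stand-in for the patent's sub-frame analysis)."""
    sub = np.stack(np.array_split(frames, n_sub, axis=1), axis=1)  # (T, n_sub, len)
    return np.std(np.log(np.maximum((sub ** 2).mean(axis=2), 1e-10)), axis=1, keepdims=True)

sr = 16000
x = np.random.default_rng(3).normal(size=sr)      # 1 s of noise as a stand-in signal
frames = frame_signal(x, sr)
feats = np.hstack([cepstra(filterbank_energies(frames, sr)), intra_frame_feature(frames)])
print(feats.shape)                                 # (frames, 13 cepstra + 1 intra-frame term)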


    Apparatus and method for evaluating the pronunciation of foreign language learners
    37.
    Invention publication
    Status: No longer in force

    Publication No.: KR1020130068598A

    Publication date: 2013-06-26

    Application No.: KR1020110135888

    Filing date: 2011-12-15

    CPC classification number: G09B19/06 G09B5/04 G09B7/04 G10L15/005 G10L15/26

    Abstract: PURPOSE: A pronunciation evaluation device and a method are provided to evaluate foreign language pronunciations using an acoustic model of a foreign language learner, pronunciations generated using a pronunciation model in which pronunciation errors are reflected, and an acoustic model of a native speaker, thereby increasing the accuracy of the pronunciation generated for the sound of the foreign language learner. CONSTITUTION: A pronunciation evaluation device(100) includes a sound input part(110), a sentence input part(120), a storage part(130), a pronunciation generation part(140), a pronunciation evaluation part(150), and an output part(160). The sound input part receives the sound of a foreign language learner, and the sentence input part receives a sentence corresponding to the sound of the foreign language learner. The storage part stores an acoustic model for the sound of the foreign language learner and a pronunciation dictionary for the sound of the foreign language learner. The pronunciation generation part performs sound recognition based on the acoustic model and pronunciation dictionary for the sound of the foreign language learner stored in the storage part. The pronunciation evaluation part detects the vocalization errors by analyzing the pronunciations for the sound of the foreign language learner. The output part outputs the vocalization errors of the foreign language learner detected from the pronunciation evaluation part. [Reference numerals] (110) Sound input part; (120) Sentence input part; (130) Storage part; (140) Pronunciation generation part; (150) Pronunciation evaluation part; (160) Output part
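
    One generic way to realize the error-detection step is to align the recognized phone sequence against the reference phone sequence of the input sentence and flag the mismatches. The edit-distance sketch below does exactly that; it is a stand-in for, not a description of, the patented evaluation logic.

def align_errors(reference, recognized):
    """Levenshtein alignment between the reference phone sequence (from the
    input sentence) and the recognized phones; mismatches are flagged as
    candidate pronunciation errors."""
    m, n = len(reference), len(recognized)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if reference[i - 1] == recognized[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
    # Backtrack to list substitutions, deletions, and insertions
    errors, i, j = [], m, n
    while i > 0 or j > 0:
        if i > 0 and j > 0 and d[i][j] == d[i - 1][j - 1] + (reference[i - 1] != recognized[j - 1]):
            if reference[i - 1] != recognized[j - 1]:
                errors.append(("substitution", reference[i - 1], recognized[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and d[i][j] == d[i - 1][j] + 1:
            errors.append(("deletion", reference[i - 1], None))
            i -= 1
        else:
            errors.append(("insertion", None, recognized[j - 1]))
            j -= 1
    return list(reversed(errors))

# Hypothetical phone sequences: the learner drops the final vowel
print(align_errors(["h", "e", "l", "ou"], ["h", "e", "l"]))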


    Corpus-based language model discriminative training method and apparatus therefor
    38.
    Invention publication
    Status: No longer in force

    Publication No.: KR1020130067854A

    Publication date: 2013-06-25

    Application No.: KR1020110134848

    Filing date: 2011-12-14

    CPC classification number: G06F17/277 G06F17/18

    Abstract: PURPOSE: A corpus-based language model discrimination learning method and a device thereof are provided to easily build and use a learning database corresponding to a target domain by building a discrimination learning training corpus database from a text corpus. CONSTITUTION: A database for language model discrimination learning is built(S301). A voice feature vector is extracted from the built corpus database(S302). Continuous speech voice recognition is performed by receiving the voice feature vector(S303). The language model discrimination learning is performed by using a sentence score and the voice recognition result output through the continuous speech voice recognition(S304). A discriminative language model is generated(S305). [Reference numerals] (AA) Start; (BB) End; (S301) Build a DB for language model discrimination learning; (S302) Extract a voice feature vector; (S303) Recognize voice of continuous speech; (S304) Perform the language model discrimination learning; (S305) Generate a discriminative language model
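
    A common concrete form of discriminative language model training over a corpus is a structured-perceptron update on N-best lists: n-grams of the correct sentence are boosted and n-grams of the top-scoring wrong hypothesis are penalized. The sketch below implements that generic recipe on a toy N-best list; it is not taken from the patent.

from collections import Counter
from typing import Dict, List, Tuple

def bigrams(sentence: List[str]) -> Counter:
    """Bigram counts with sentence-boundary markers."""
    return Counter(zip(["<s>"] + sentence, sentence + ["</s>"]))

def rerank_score(weights: Dict[Tuple[str, str], float], sentence: List[str], asr_score: float) -> float:
    """Recognizer score plus discriminative bigram weights."""
    return asr_score + sum(weights.get(bg, 0.0) * c for bg, c in bigrams(sentence).items())

def perceptron_update(weights, reference, nbest, lr=0.1):
    """One structured-perceptron step: boost n-grams of the correct sentence,
    penalize those of the currently top-ranked wrong hypothesis."""
    top = max(nbest, key=lambda h: rerank_score(weights, h[0], h[1]))
    if top[0] == reference:
        return weights
    for bg, c in bigrams(reference).items():
        weights[bg] = weights.get(bg, 0.0) + lr * c
    for bg, c in bigrams(top[0]).items():
        weights[bg] = weights.get(bg, 0.0) - lr * c
    return weights

# Toy corpus entry: a reference transcript and an N-best list of (hypothesis, ASR score)
reference = ["recognize", "speech"]
nbest = [(["wreck", "a", "nice", "beach"], 1.2), (["recognize", "speech"], 1.0)]
w: Dict[Tuple[str, str], float] = {}
for _ in range(5):
    w = perceptron_update(w, reference, nbest)
print(max(nbest, key=lambda h: rerank_score(w, h[0], h[1]))[0])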


    Confusion network rescoring apparatus for Korean continuous speech recognition, and confusion network generation and rescoring methods using the same
    39.
    Invention publication
    Status: In force

    Publication No.: KR1020130011574A

    Publication date: 2013-01-30

    Application No.: KR1020110072813

    Filing date: 2011-07-22

    Abstract: PURPOSE: A confusion network rescoring device for Korean continuous voice recognition, a method for generating a confusion network by using the same, and a rescoring method thereof are provided to improve a generation speed of the confusion network by setting a limit of a lattice link probability in a process for converting a lattice structure into a confusion network structure. CONSTITUTION: A confusion network rescoring device receives one or more lattices generated through voice recognition(S105). The device calculates each posterior probability of the lattices(S110). The device allocates a node included in the lattices to plural equivalence classes based on the posterior probability(S120,S130,S135). The device generates a confusion set by using the equivalence classes(S150,S155). The device generates a confusion network based on the confusion set. [Reference numerals] (AA) Start; (BB,DD,FF,HH,JJ) No; (CC,EE,GG,II,KK) Yes; (LL) End; (S105) Inputting lattices through voice recognition; (S110) Calculating each posterior probability of the lattices; (S115) Inputting SLF?; (S120) Allocating a first node(no) of the lattices to a first equivalence class(NO); (S125) N_i and n_i links exist?; (S130) Allocating an i-th node(n_i) of the lattices to a j-th equivalence class(N_j); (S135) Allocating the i-th node(n_i) of the lattices to a i-th equivalence class(N_i); (S140) Allocating all nodes of the lattices?; (S145) If u∈N_s n_i∈N_t, t=s+1 in e(u->n_i); (S150) Classifying the e(u->n_i) as CS(N_s,N_t); (S155) Classifying the e(u->n_i) as CS(N_k,N_k+1); (S160) Normalizing link probability in an extracted CS sequence; (S165) Adding a Null link, and allocating remaining probability values of a normalized value; (S170) Possibility value of the Null link > possibility value of the other link; (S175) Excluding the CS sequence from a voice recognition result
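
    Once lattice nodes have been allocated to equivalence classes (steps S120 to S140), the remaining steps amount to normalizing link posteriors within each confusion set, adding a null link that carries the leftover probability mass, and dropping sets won by the null link (S160 to S175). The sketch below starts from already-binned links and performs only those later steps; the class assignment itself is assumed as given input.

from collections import defaultdict

def build_confusion_network(links, n_classes):
    """links: (class_index, word, posterior) triples, with the equivalence-class
    assignment of lattice nodes assumed to be done already. Each class becomes
    one confusion set whose posteriors are normalized; leftover mass goes to a
    null ('-') link, and a set whose null link wins is dropped from the result."""
    sets = defaultdict(dict)
    for k, word, post in links:
        sets[k][word] = sets[k].get(word, 0.0) + post
    result = []
    for k in range(n_classes):
        cs = sets[k]
        total = sum(cs.values())
        if total > 1.0:                      # normalize if posteriors overflow
            cs = {w: p / total for w, p in cs.items()}
            total = 1.0
        cs["-"] = 1.0 - total                # null link carries the remainder
        best = max(cs, key=cs.get)
        if best != "-":                      # skip sets won by the null link
            result.append(best)
    return result

# Toy lattice links already binned into three confusion sets
links = [(0, "나는", 0.9),
         (1, "학교", 0.5), (1, "학원", 0.3),
         (2, "간다", 0.2), (2, "갔다", 0.15)]
print(build_confusion_network(links, 3))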


    Speech recognition method and apparatus
    40.
    Invention publication
    Status: No longer in force

    Publication No.: KR1020120072145A

    Publication date: 2012-07-03

    Application No.: KR1020100133957

    Filing date: 2010-12-23

    CPC classification number: G10L15/02 G10L15/04 G10L15/142

    Abstract: PURPOSE: A voice recognizing method and a device thereof are provided to obtain features of both a short section and a long section of speech, wherein temporal characteristics are reflected in the long section. CONSTITUTION: A segment dividing unit(531) partitions a voice signal into segment sections. A temporal length of a segment section is longer than a temporal length of a frame section. A segment feature extracting unit(532) extracts a segment voice feature vector around a partition boundary portion of the segment section. A segment voice recognizing unit(533) recognizes a voice using the segment voice feature vector and a segment-based probability model. A combination synchronizing unit(540) combines a voice recognizing result of a frame-based voice recognizing unit with a voice recognizing result of the segment voice recognizing unit.
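
    The description combines a conventional frame-based front end with longer segment sections whose features are taken around the segment boundaries, and then fuses the two recognizers' results. The sketch below illustrates that division of labor with log-energy features and a simple weighted score combination; the fusion weight and all feature choices are assumptions, not taken from the patent.

import numpy as np

def frame_features(x, sr, frame_ms=25, shift_ms=10):
    """Short-section (frame-level) features: log energy per frame."""
    n, h = int(sr * frame_ms / 1000), int(sr * shift_ms / 1000)
    return np.array([np.log(np.mean(x[i:i + n] ** 2) + 1e-10)
                     for i in range(0, len(x) - n, h)])

def segment_features(x, sr, seg_ms=200, ctx_ms=50):
    """Segment sections are much longer than frames; a feature vector is taken
    around each segment boundary (mean energy just before and just after it)."""
    s, c = int(sr * seg_ms / 1000), int(sr * ctx_ms / 1000)
    bounds = range(s, len(x) - c, s)
    return np.array([[np.log(np.mean(x[b - c:b] ** 2) + 1e-10),
                      np.log(np.mean(x[b:b + c] ** 2) + 1e-10)] for b in bounds])

def combine_scores(frame_scores, segment_scores, alpha=0.7):
    """Late fusion of the frame-based and segment-based recognizer scores."""
    return alpha * frame_scores + (1 - alpha) * segment_scores

sr = 16000
x = np.random.default_rng(4).normal(size=sr)     # 1 s stand-in signal
f = frame_features(x, sr)
g = segment_features(x, sr)
# Pretend per-hypothesis scores from the two recognizers (illustrative numbers)
print(f.shape, g.shape, combine_scores(np.array([-12.0, -15.0]), np.array([-10.0, -9.0])))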

