-
Publication number: KR1020160062666A
Publication date: 2016-06-02
Application number: KR1020150094041
Application date: 2015-07-01
Applicant: 한국전자통신연구원
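Abstract: The present invention relates to an automatic interpretation system that communicates with a PC or a portable device such as a mobile phone, smartphone, PDA, or laptop, or is used directly in an automatic interpretation terminal. The automatic interpretation system includes: a wearable automatic interpretation input/output device that transmits, over a network, the speaker's microphone signal for speech recognition, a bone-conduction microphone signal, and the speaker's gesture signal, and that outputs an interpretation result signal received over the network; and a server that uses the bone-conduction microphone signal or the gesture signal transmitted over the network from the wearable device to detect the speech data segment in the speech-recognition microphone signal, performs speech recognition and interpretation on the speech data within the detected segment, and then transmits the interpretation result signal over the network to the wearable automatic interpretation input/output device.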
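The segment-detection step described in this abstract could be sketched as follows. This is a minimal sketch, assuming a plain frame-energy threshold on the bone-conduction signal; the abstract does not say how the server detects the segment, and the function names, frame length, and threshold are hypothetical.

```python
from typing import List, Sequence, Tuple


def detect_speech_segments(bone: Sequence[float], frame_len: int = 160,
                           threshold: float = 0.01) -> List[Tuple[int, int]]:
    """Return (start, end) sample ranges where the bone-mic frame energy
    exceeds the threshold. The end index is exclusive."""
    segments: List[Tuple[int, int]] = []
    start = None
    n_frames = len(bone) // frame_len
    for i in range(n_frames):
        frame = bone[i * frame_len:(i + 1) * frame_len]
        energy = sum(x * x for x in frame) / frame_len  # mean power of the frame
        active = energy > threshold
        if active and start is None:
            start = i * frame_len                       # segment opens
        elif not active and start is not None:
            segments.append((start, i * frame_len))     # segment closes
            start = None
    if start is not None:                               # signal ended while active
        segments.append((start, n_frames * frame_len))
    return segments


def cut_speech(mic: Sequence[float],
               segments: List[Tuple[int, int]]) -> List[list]:
    """Slice the speech-recognition microphone signal with the detected ranges
    (assumes both signals share one sample clock)."""
    return [list(mic[s:e]) for s, e in segments]
```

Only the slices returned by `cut_speech` would then be passed on to recognition and interpretation, which is the point of using the bone-conduction channel as a gate.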
-
Publication number: KR101333194B1
Publication date: 2013-11-26
Application number: KR1020110072394
Application date: 2011-07-21
Applicant: 한국전자통신연구원
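Abstract: A statistics-based multiple-pronunciation dictionary generation apparatus according to the present invention comprises: a database containing uttered and recorded speech signal files, a word-level transcription corresponding to each speech signal file, and speaker information corresponding to each speech signal file; a speech-to-pronunciation-sequence alignment unit that, from the speech signal file, the word-level transcription, and a multiple-pronunciation dictionary containing a plurality of pronunciation sequences for each word, uses the alignment function of a speech recognizer to detect, for each word contained in the speech signal file, the closest pronunciation sequence in the multiple-pronunciation dictionary; a word-pronunciation-sequence pair extraction unit that applies this closest-pronunciation-sequence detection to the speech signal files and word-level transcriptions stored in the database to extract pairs of words and pronunciation sequences; and a pronunciation-sequence statistics extraction unit that, based on the extracted pairs, computes and stores statistical information on the pronunciation sequences of each word in the multiple-pronunciation dictionary.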
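The final statistics step could look roughly like the sketch below. It assumes the alignment stage has already produced (word, pronunciation) pairs and that the "statistical information" is a relative frequency per pronunciation; the abstract does not specify which statistics are stored, and `pronunciation_stats` is a hypothetical name.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def pronunciation_stats(
    pairs: Iterable[Tuple[str, str]]
) -> Dict[str, Dict[str, float]]:
    """Map each word to the relative frequency of each aligned pronunciation."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for word, pron in pairs:
        counts[word][pron] += 1          # one observation per aligned word token
    stats: Dict[str, Dict[str, float]] = {}
    for word, c in counts.items():
        total = sum(c.values())
        stats[word] = {p: n / total for p, n in c.items()}
    return stats
```

For example, three aligned tokens of "the", two as "dh ah" and one as "dh iy", would give that word the frequencies 2/3 and 1/3.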
-
Publication number: KR1020130014895A
Publication date: 2013-02-12
Application number: KR1020110076622
Application date: 2011-08-01
Applicant: 한국전자통신연구원
CPC classification number: G10L21/0232 , G10L21/0264
Abstract: PURPOSE: A sound source division reference determination device and a method thereof are provided to detect a sound source direction in a noisy environment by using an ITD (Interaural Time Delay) value and an IID (Interaural Intensity Difference) value. CONSTITUTION: A histogram generator (110) generates a histogram of the sound source directions contained in the input signal, based on an SNR (Signal-to-Noise Ratio) value or an input signal energy value. A noise area detecting unit (120) detects a noise area from the input signal. A sound source division standard determination unit (130) determines a boundary value as the standard value for dividing the sound sources. [Reference numerals] (110) Histogram generator; (120) Noise area detecting unit; (130) Sound source division standard determination unit; (140) First power unit; (150) First main control unit
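A rough sketch of the histogram idea follows. It covers only the ITD cue and makes several assumptions that are not in the abstract: the ITD is taken as the lag that maximizes a brute-force cross-correlation, frames are admitted by a simple energy floor, and the boundary is placed midway between the two most populated bins. All function names are hypothetical.

```python
from collections import Counter
from typing import Iterable, Sequence, Tuple


def estimate_itd(left: Sequence[float], right: Sequence[float],
                 max_lag: int) -> int:
    """Lag in samples that maximizes sum(left[i] * right[i + lag]).
    A positive result means the right channel is delayed relative to the left."""
    best_lag, best_score = 0, float("-inf")
    n = len(left)
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += left[i] * right[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def direction_histogram(frames: Iterable[Tuple[Sequence[float], Sequence[float]]],
                        max_lag: int, min_energy: float) -> Counter:
    """Count ITD estimates over frames whose total energy reaches min_energy."""
    hist: Counter = Counter()
    for left, right in frames:
        energy = sum(x * x for x in left) + sum(x * x for x in right)
        if energy >= min_energy:             # skip frames too weak to trust
            hist[estimate_itd(left, right, max_lag)] += 1
    return hist


def boundary_between_peaks(hist: Counter) -> float:
    """Assumed rule: midpoint of the two most populated ITD bins.
    Raises ValueError if the histogram has fewer than two bins."""
    (a, _), (b, _) = hist.most_common(2)
    return (a + b) / 2
```

An IID histogram could be built the same way by replacing `estimate_itd` with a per-frame level difference between the two channels.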
-
Publication number: KR1020110038448A
Publication date: 2011-04-14
Application number: KR1020090095741
Application date: 2009-10-08
Applicant: 한국전자통신연구원
CPC classification number: G06F17/2854
Abstract: PURPOSE: An automatic interpretation terminal, a service, a system and a method for servicing automatic interpretation are provided to supply a rapid and accurate interpretation service by performing direct interpretation and relay interpretation with a plurality of interpretation supporters and providing the interpretation result to a user's terminal. CONSTITUTION: A communication unit (300) receives an interpretation request from a user and transmits an interpretation result according to the interpretation request. An interpretation applicant information DB (306) stores list information on the interpretation supporters capable of interpreting into a target language. A server control unit (304) searches for an interpretation supporter capable of interpreting into the requested target language.
-
Publication number: KR1020100137873A
Publication date: 2010-12-31
Application number: KR1020090056120
Application date: 2009-06-23
Applicant: 한국전자통신연구원
Abstract: PURPOSE: A speaker adaptation system and a method thereof are provided to use a cumulative variable to obtain sufficient statistics for unsupervised adaptation during the voice recognition process without performing adaptation training, thereby enabling gradual adaptation. CONSTITUTION: A characteristic detecting part (110) extracts a feature vector from a voice signal. A sound model storage (120) stores an acoustic model consisting of a recursive tree. A conversion parameter class determiner (130) produces the Gaussian posterior probability of a candidate state based on the feature vector and the sound model, and determines the cumulative variable and a conversion parameter class based on the Gaussian posterior probability. A sound model updater (140) produces the conversion parameter based on the conversion parameter class and the cumulative variable, and renews the acoustic model.
-
Publication number: KR102223653B1
Publication date: 2021-03-05
Application number: KR1020160076806
Application date: 2016-06-20
Applicant: 한국전자통신연구원
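Abstract: A speech signal processing apparatus according to an embodiment of the present invention may include: an input unit that receives a user's speech signal; a sensing unit that senses movement caused by the user's utterance to detect an auxiliary signal for identifying the utterance segment of the user's speech signal; a switch that receives from the user information on at least one of a selection of an operation mode and a selection of how protocols are applied to the speech signal and the auxiliary signal; and a signal processing unit that, when the selected operation mode is a first operation mode, transmits the speech signal to an external terminal using a first protocol, and, when the selected operation mode is a second operation mode, either transmits the speech signal and the auxiliary signal to the external terminal using the first protocol, or transmits them using, separately for each signal, one of the first protocol and a second protocol.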
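The mode-dependent routing in this abstract could be expressed as a small lookup, sketched below. The names `Mode`, `plan_transmission`, `"voice"`, `"aux"`, `"protocol_1"`, and `"protocol_2"` are placeholders chosen for illustration; the abstract names neither concrete protocols nor any API.

```python
from enum import Enum
from typing import Dict, Optional


class Mode(Enum):
    FIRST = 1    # speech signal only
    SECOND = 2   # speech signal plus auxiliary signal


ALLOWED = {"protocol_1", "protocol_2"}


def plan_transmission(
    mode: Mode,
    per_signal: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Return which protocol carries which signal for the selected mode."""
    if mode is Mode.FIRST:
        return {"voice": "protocol_1"}
    if per_signal is None:
        # Second mode, shared option: both signals use the first protocol.
        return {"voice": "protocol_1", "aux": "protocol_1"}
    # Second mode, per-signal option: each signal picks one of the two protocols.
    if set(per_signal) != {"voice", "aux"}:
        raise ValueError("per_signal must map exactly 'voice' and 'aux'")
    if not set(per_signal.values()) <= ALLOWED:
        raise ValueError("unknown protocol name")
    return dict(per_signal)
```

Calling `plan_transmission(Mode.SECOND, {"voice": "protocol_1", "aux": "protocol_2"})` models the case where the two signals travel over different protocols.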
-
Publication number: KR101578766B1
Publication date: 2015-12-22
Application number: KR1020110090283
Application date: 2011-09-06
Applicant: 한국전자통신연구원
IPC: G10L15/08
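Abstract: The present invention relates to an apparatus and method for generating a search space for speech recognition that, by constructing the component WFST based on a list of words into which an optional pause is to be inserted, can minimize the growth in search-space size without degrading speech recognition performance. To this end, the present invention provides a search space generation apparatus for speech recognition comprising: a pronunciation dictionary; a word list database storing the list of words into which an optional pause is to be inserted; a search space implementation unit that generates the search space using the pronunciation sequence of each word read from the pronunciation dictionary, inserting an optional pause into a read word when that word is included in the word list database; and a database in which the search space with the optional pauses inserted is stored.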
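The selective pause insertion could be sketched as below. This is an illustration only: it emits plain (source, destination, input, output) arc tuples for a lexicon transducer rather than using any WFST library, it assumes every word has at least one phone, and the pause symbol `sil`, the epsilon symbol `<eps>`, and the function name are assumptions.

```python
from typing import Dict, List, Set, Tuple

EPS = "<eps>"
Arc = Tuple[int, int, str, str]  # (source state, destination state, input, output)


def build_lexicon_arcs(lexicon: Dict[str, List[str]],
                       pause_words: Set[str],
                       pause: str = "sil") -> List[Arc]:
    """Build lexicon arcs rooted at state 0. Only words in pause_words get an
    optional pause after their last phone; all other words return to state 0
    directly, so they add no extra state or arc."""
    arcs: List[Arc] = []
    next_state = 1
    for word, phones in lexicon.items():
        src = 0
        for k, phone in enumerate(phones):
            last = k == len(phones) - 1
            if last and word not in pause_words:
                dst = 0                      # close the word path immediately
            else:
                dst = next_state
                next_state += 1
            # The word label is emitted on the first phone arc only.
            arcs.append((src, dst, phone, word if k == 0 else EPS))
            src = dst
        if word in pause_words:
            arcs.append((src, 0, EPS, EPS))      # skip the pause
            arcs.append((src, 0, pause, EPS))    # or consume one pause
    return arcs
```

Restricting the extra state and two extra arcs to listed words is what keeps the search space from growing for every dictionary entry.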
-
Publication number: KR1020130057668A
Publication date: 2013-06-03
Application number: KR1020110123528
Application date: 2011-11-24
Applicant: 한국전자통신연구원
CPC classification number: G10L15/20
Abstract: PURPOSE: A voice recognition apparatus based on a cepstrum feature vector and a method thereof are provided to estimate the reliability of each segment of an input voice signal that includes noise, and to apply that reliability as a weighted value to a sound model and the input voice signal in the decoding step of voice recognition. CONSTITUTION: A reliability estimating unit (108) estimates the reliability of time-frequency segments from an input voice signal. A reliability reflecting unit (110) applies the estimated reliability to a normalized cepstrum feature vector extracted from the input voice signal and to a cepstrum average vector used in decoding for the state of an HMM (Hidden Markov Model). A cepstrum transforming unit (112) transforms the reliability-weighted cepstrum feature and average vectors through a cosine transformation matrix. An output probability calculating unit (113) calculates an output probability value of the time-frequency segments. [Reference numerals] (101) Frame based dividing unit; (102) Filter bank analyzing unit; (104,111) Cosine transformation unit (DCT); (105) Cepstrum normalization unit; (106) HMM sound model; (107) HMM average vector; (108) Reliability estimating unit; (109) Cosine reverse-transformation unit (IDCT); (110) Reliability reflecting unit; (112) Cepstrum transforming unit; (113) Output probability calculating unit; (AA) Background noise sound signal input; (BB) Log filter bank energy
-
Publication number: KR1020130026855A
Publication date: 2013-03-14
Application number: KR1020110090283
Application date: 2011-09-06
Applicant: 한국전자통신연구원
IPC: G10L15/08
Abstract: PURPOSE: A search space generator for recognizing voice is provided to improve the accuracy of voice recognition by recognizing the voice using a voice articulation database for training a voice model. CONSTITUTION: A search space generator for recognizing voice includes a pronunciation dictionary (100), a word list database (120), a WFST (Weighted Finite State Transducer) L realization unit (140), and a WFST L database (160). The WFST L realization unit acquires a pronunciation string for each word by reading the pronunciation dictionary. The WFST L realization unit generates a WFST L in which a selective pause is inserted by comparing the acquired pronunciation dictionary with the word list stored in the word list database. [Reference numerals] (100) Pronunciation dictionary; (120) Word list database; (140) WFST L realization unit; (160) WFST L database
-
Publication number: KR101134682B1
Publication date: 2012-04-09
Application number: KR1020090056120
Application date: 2009-06-23
Applicant: 한국전자통신연구원
Abstract: PURPOSE: A speaker adaptation system and a method thereof are provided to use a cumulative variable to obtain sufficient statistics for unsupervised adaptation during the voice recognition process without performing adaptation training, thereby enabling gradual adaptation. CONSTITUTION: A characteristic detecting part (110) extracts a feature vector from a voice signal. A sound model storage (120) stores an acoustic model consisting of a recursive tree. A conversion parameter class determiner (130) produces the Gaussian posterior probability of a candidate state based on the feature vector and the sound model, and determines the cumulative variable and a conversion parameter class based on the Gaussian posterior probability. A sound model updater (140) produces the conversion parameter based on the conversion parameter class and the cumulative variable, and renews the acoustic model.