-
51.
Publication No.: KR101001618B1
Publication date: 2010-12-17
Application No.: KR1020080085095
Filing date: 2008-08-29
Applicant: 한국전자통신연구원
IPC: G10L15/08 , G10L15/26 , H04N21/438
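Abstract: The present invention relates to a technique for generating speech recognition information and using it to provide a broadcast service through voice input. Prior matching is performed according to character string information in broadcast data, and the section boundaries of the matched character string are partitioned to generate speech recognition target character string data. This data is normalized by abbreviation processing, and the normalized data is then combined into utterance allomorph (spoken-variant) character string data and stored. As a result, the system can respond effectively to user utterances at voice input time and provide the corresponding broadcast service effectively.
IP TV (Internet Protocol Television), speech recognition
-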
Publication No.: KR1020100073161A
Publication date: 2010-07-01
Application No.: KR1020080131755
Filing date: 2008-12-22
Applicant: 한국전자통신연구원
CPC classification number: G10L15/187 , G10L15/10
Abstract: PURPOSE: An utterance verification method and device for isolated-word N-best recognition results are provided to make voice recognition more reliable by measuring inter-phoneme similarity through DTW (Dynamic Time Warping) and indicating acceptance, rejection, or decision failure for the recognition result. CONSTITUTION: A pre-processing unit(104) performs feature extraction and end point detection to detect the voice section, and performs noise processing. An N-best voice recognition unit(106) performs N-best voice recognition through a Viterbi search using a context-dependent acoustic model(26). An N-best utterance verification unit(108) compares the result of an SVM (Support Vector Machine) with the similarity measurement to judge the voice recognition result, and then indicates acceptance, rejection, or decision failure for that result.
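The abstract names DTW as the similarity measure but gives neither the cost function nor the decision rule. The following is a minimal sketch, assuming a 0/1 phoneme mismatch cost and a hypothetical rule that compares the DTW distance between the top two N-best hypotheses against two thresholds. The names `phoneme_cost`, `verify`, `accept_margin`, and `reject_margin` are illustrative and do not come from the patent, and the SVM stage is omitted.

```python
from typing import Callable, Sequence


def dtw_distance(a: Sequence[str], b: Sequence[str],
                 cost: Callable[[str, str], float]) -> float:
    """Classic dynamic time warping distance between two phoneme sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # d[i][j] holds the cheapest alignment cost of a[:i] with b[:j].
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = cost(a[i - 1], b[j - 1])
            d[i][j] = step + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


def phoneme_cost(p: str, q: str) -> float:
    # Assumed 0/1 cost; a real system would likely use a confusion table.
    return 0.0 if p == q else 1.0


def verify(nbest: list[list[str]], accept_margin: float = 2.0,
           reject_margin: float = 1.0) -> str:
    """Three-way decision from the DTW gap between the top two hypotheses.

    Assumed rule: a runner-up that is far from the top hypothesis suggests a
    reliable result; a near-identical runner-up suggests an unreliable one.
    """
    if len(nbest) < 2:
        return "undecided"
    gap = dtw_distance(nbest[0], nbest[1], phoneme_cost)
    if gap >= accept_margin:
        return "accept"
    if gap < reject_margin:
        return "reject"
    return "undecided"
```

With these defaults, two hypotheses that differ in a single phoneme fall between the thresholds and come back as `"undecided"`, which corresponds to the decision-failure outcome in the abstract.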
-
Publication No.: KR1020100072746A
Publication date: 2010-07-01
Application No.: KR1020080131238
Filing date: 2008-12-22
Applicant: 한국전자통신연구원
IPC: G10L15/20 , G10L21/0208 , G10L21/0272
Abstract: PURPOSE: A method and an apparatus for reducing multi-channel noise are provided that selectively apply a beamforming method or a sound source separation method according to environmental conditions when processing multi-channel noise in a multi-channel voice recognition environment, thereby maximizing noise processing performance. CONSTITUTION: A noise environment monitoring unit(210) determines the number of sound sources and the relative locations of the background sound sources and the user's voice. According to the number of sound sources and the relative location information of the background sound sources and the user's voice, a multi-channel noise processor(220) selects a multi-channel noise processing method from among a plurality of multi-channel noise processing modes. The multi-channel noise processor then performs noise processing with the selected method.
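The abstract says the processing mode depends on the source count and on where the background sources sit relative to the user, but it does not state the selection rule. The sketch below assumes one plausible rule: use beamforming when every noise source is angularly well separated from the user, and fall back to source separation when a noise source is close to the user and there are no more sources than microphones. All names, thresholds, and mode labels here are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class NoiseEnvironment:
    num_sources: int                # estimated number of active sound sources
    user_angle_deg: float           # direction of the user's voice
    noise_angles_deg: list[float]   # directions of the background sources


def angular_gap(a_deg: float, b_deg: float) -> float:
    """Smallest absolute angle between two directions, in degrees."""
    return abs((a_deg - b_deg + 180.0) % 360.0 - 180.0)


def select_mode(env: NoiseEnvironment, num_mics: int = 4,
                min_separation_deg: float = 30.0) -> str:
    """Pick a processing mode; the rule and thresholds are assumptions."""
    if not env.noise_angles_deg:
        return "beamforming"
    closest = min(angular_gap(env.user_angle_deg, a)
                  for a in env.noise_angles_deg)
    if closest >= min_separation_deg:
        # Noise is spatially distinct from the user, so steering suffices.
        return "beamforming"
    if env.num_sources <= num_mics:
        # Enough microphones for a determined separation problem.
        return "source_separation"
    return "beamforming"  # assumed fallback when separation is underdetermined
```

The microphone-count check reflects the usual requirement of determined blind separation methods that the number of sources not exceed the number of sensors; whether the patent applies the same criterion is not stated.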
-
Publication No.: KR1020100072731A
Publication date: 2010-07-01
Application No.: KR1020080131221
Filing date: 2008-12-22
Applicant: 한국전자통신연구원
CPC classification number: G01C21/3608 , G01C21/3611 , G01C21/3679 , G10L15/26
Abstract: PURPOSE: An apparatus for generating speech recognition keywords for a navigation device is provided to enable voice-based retrieval of POIs (points of interest) by automatically generating the allomorphs (spoken variants) of a POI name that a user may say to the navigation device. CONSTITUTION: The apparatus comprises a statistical model learning unit(202) and an allomorph generating unit. The statistical model learning unit analyzes POI character strings and builds probability values as statistical information. The allomorph generating unit creates allomorphs of a POI name using the statistical information.
-
Publication No.: KR1020100066916A
Publication date: 2010-06-18
Application No.: KR1020080125433
Filing date: 2008-12-10
Applicant: 한국전자통신연구원
IPC: G10L99/00
Abstract: PURPOSE: A method for separating noise from an audio signal is provided that improves sound source separation performance and speeds up convergence in the weighted learning stage, thereby increasing computational efficiency. CONSTITUTION: A plurality of microphones record the audio signal spoken by a user together with a noise signal. A beamformer(20) performs beamforming and a blind source separation procedure on the recorded audio and noise signals, dividing them spatially and statistically. A sound source separator(30) separates the sound source signal and outputs the separated signal.
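The abstract does not say which beamformer is used. As a minimal sketch of the spatial stage only, the code below assumes the simplest variant, delay-and-sum, with a known integer sample delay per microphone. The function name, the integer-delay simplification, and the averaging are assumptions for illustration; the blind separation stage is not shown.

```python
def delay_and_sum(channels: list[list[float]], delays: list[int]) -> list[float]:
    """Align each channel by its integer sample delay, then average.

    channels[k] is the recording from microphone k, and delays[k] is how many
    samples later the target signal arrives at microphone k (assumed known).
    """
    # Only keep the span where every delayed channel still has samples.
    n = min(len(ch) - d for ch, d in zip(channels, delays))
    return [
        sum(ch[d + t] for ch, d in zip(channels, delays)) / len(channels)
        for t in range(n)
    ]
```

Signals arriving from the steered direction add coherently after alignment, while sound from other directions stays misaligned and is attenuated by the averaging.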
-
56.
Publication No.: KR1020100026187A
Publication date: 2010-03-10
Application No.: KR1020080085095
Filing date: 2008-08-29
Applicant: 한국전자통신연구원
IPC: G10L15/08 , G10L15/26 , H04N21/438
Abstract: PURPOSE: A voice recognition information generation device, a method thereof, and a broadcast service method are provided that build a database of allomorph character strings, thereby offering a broadcast service driven by voice recognition. CONSTITUTION: The voice recognition information generation device includes a prior matching unit(302), a section boundary partition unit(308), a normalization unit(310), and an allomorph generation unit(312). The prior matching unit performs prior matching according to character string information of broadcast data. The section boundary partition unit partitions the section boundaries of the matched character string to generate voice recognition target character string data. The normalization unit normalizes the generated target character string data. The allomorph generation unit generates allomorph character string data from the normalized target character string data.
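The four units in this abstract form a small text pipeline: match, partition, normalize, generate variants. The sketch below illustrates that flow under several assumptions: matching is a greedy longest match against a hypothetical lexicon, normalization is a lookup in a hypothetical abbreviation table, and variants are all order-preserving combinations of the normalized segments. None of these specifics, including the sample data, come from the patent.

```python
from itertools import combinations

# Hypothetical lexicon and abbreviation table for illustration only.
LEXICON = {"nat", "geo", "ch"}
ABBREVIATIONS = {"nat": "national", "geo": "geographic", "ch": "channel"}


def match_and_partition(text: str, lexicon: set[str]) -> list[str]:
    """Greedy longest-match segmentation of an unspaced string."""
    segments, i = [], 0
    while i < len(text):
        for j in range(len(text), i, -1):
            if text[i:j] in lexicon:
                segments.append(text[i:j])
                i = j
                break
        else:
            # No lexicon entry starts here; keep the character as a segment.
            segments.append(text[i])
            i += 1
    return segments


def normalize(segments: list[str]) -> list[str]:
    """Expand known abbreviations; leave other segments unchanged."""
    return [ABBREVIATIONS.get(s, s) for s in segments]


def spoken_variants(segments: list[str]) -> list[str]:
    """All order-preserving combinations of segments, shortest first."""
    out = []
    for r in range(1, len(segments) + 1):
        for combo in combinations(segments, r):
            out.append(" ".join(combo))
    return out


def build_entries(text: str) -> list[str]:
    """Run the whole pipeline for one program or channel name."""
    return spoken_variants(normalize(match_and_partition(text, LEXICON)))
```

For the made-up input `"natgeoch"`, `build_entries` yields seven variants, including `"national geographic"` and the full `"national geographic channel"`. The number of combinations grows exponentially with the segment count, so a practical system would presumably prune them.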
-
Publication No.: KR100669241B1
Publication date: 2007-01-15
Application No.: KR1020040106610
Filing date: 2004-12-15
Applicant: 한국전자통신연구원
IPC: G10L13/10
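Abstract: The present invention relates to a dialog-style speech synthesis system and method using speech act information. For expressions in dialog text that need a different intonation depending on the context of the dialog, the system tags the intonation type using speech act information extracted from the utterances of the two speakers. During synthesis, it retrieves speech signals whose intonation matches the tag from the speech database and uses them for synthesis. This produces natural and varied intonation that fits the flow of the dialog, conveys the interactive aspect of the dialog more realistically, and can be expected to improve the naturalness of synthesized dialog speech.
Dialog-style Text-to-Speech system, dialog text, speech synthesis, context, speech act, intonation
-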
Publication No.: KR100620898B1
Publication date: 2006-09-07
Application No.: KR1020050064097
Filing date: 2005-07-15
Applicant: 한국전자통신연구원
Inventor: 김종진
IPC: G10L13/00
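Abstract: The present invention relates to a speaking-rate conversion method for a text-to-speech (TTS) system. The method comprises: extracting an utterance list from the synthesis database and having it spoken in each speaking style (fast, normal, and slow) to build a duration probability distribution for each synthesis unit; in response to a synthesis request, finding the optimal sequence of synthesis unit candidates through a Viterbi search and generating duration target parameters for the synthesis units; and using the duration parameters of that optimal candidate sequence to search again for an optimal candidate sequence and generate the synthesized speech. Because the synthesized speech is produced by a two-pass search using the new durations, no signal processing of the synthesized speech is needed, unlike the existing SOLA approach. In addition, the formula that computes the new durations itself accounts for contexts that are sensitive to speaking rate and contexts that are insensitive to it, so no separate training or prediction model is needed to identify such contexts.
Text-to-Speech system, speaking rate conversion, SOLA, break indexing
-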
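The abstract does not give the formula for the new duration targets. One way to picture a formula that handles rate-sensitive and rate-insensitive contexts without a separate model is a z-score mapping between per-unit duration distributions, sketched below. This mapping, the Gaussian summary by mean and standard deviation, and all names are assumptions for illustration, not the patented formula.

```python
from dataclasses import dataclass


@dataclass
class DurationStats:
    mean_ms: float  # mean duration of a synthesis unit in one speaking style
    std_ms: float   # standard deviation of that duration


def target_duration(selected_ms: float, normal: DurationStats,
                    style: DurationStats) -> float:
    """Map a first-pass unit duration onto the requested style's distribution.

    The duration keeps its relative position (z-score) when moved from the
    normal-rate distribution to the fast or slow one.
    """
    if normal.std_ms == 0:
        return style.mean_ms
    z = (selected_ms - normal.mean_ms) / normal.std_ms
    return max(0.0, style.mean_ms + z * style.std_ms)
```

Under this assumed mapping, a unit whose fast and normal distributions are nearly identical keeps nearly the same target, while a unit whose distributions differ strongly is shortened or lengthened accordingly. The resulting targets would then feed the second search pass described in the abstract.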
Publication No.: KR100399574B1
Publication date: 2003-09-26
Application No.: KR1020000083014
Filing date: 2000-12-27
Applicant: 한국전자통신연구원
Inventor: 김종진
IPC: H04M3/50
Abstract: PURPOSE: A phone guide automatic interpreting system and method for foreigners are provided so that a foreign user and a domestic guide can each inquire and reply in their own native language. When the foreign user connects to the phone guide automatic interpreting system and inquires in his/her native language, the system automatically interprets the inquiry and transmits it to the domestic guide; when the domestic guide replies to the inquiry, the system automatically interprets the reply and transmits it to the foreign user. CONSTITUTION: An interface unit(61) interfaces between a domestic guide and a foreign user and processes connection information for billing. The first voice recognizing unit(62) recognizes the voice of the foreign user received through the interface unit(61) and converts it into a foreign language sentence. The first intermediate language generating unit(63) generates, from the foreign language sentence converted by the first voice recognizing unit(62), an intermediate language form, that is, the semantic structure required to generate the domestic language. The first language translation unit(64) translates the intermediate language form generated by the first intermediate language generating unit(63) into a domestic language sentence. The first voice synthesizing unit(65) synthesizes the domestic language sentence translated by the first language translation unit(64) into speech and transmits it to the domestic guide. The second voice recognizing unit(66) recognizes the voice of the domestic guide and converts it into a domestic language sentence. The second intermediate language generating unit(67) generates, from the domestic language sentence converted by the second voice recognizing unit(66), an intermediate language form, that is, the semantic structure required to generate the foreign language. The second language translation unit(68) translates the intermediate language form generated by the second intermediate language generating unit(67) into a foreign language sentence. The second voice synthesizing unit(66) synthesizes the foreign language sentence translated by the second language translation unit(68) into speech and transmits it to the foreign user through the interface unit(61).
-
60.
Publication No.: KR1020030042285A
Publication date: 2003-05-28
Application No.: KR1020010073005
Filing date: 2001-11-22
Applicant: 한국전자통신연구원
IPC: G10L13/00
Abstract: PURPOSE: An apparatus and a method for evaluating synthesized sound, and a system that uses them to compare the performance of sound synthesizing systems, are provided to obtain objectively and directly both the user's degree of satisfaction and a diagnosis of the apparatus that generates the synthesized sound. CONSTITUTION: A cerebral cortex displacement input unit(320) is placed in contact with the cerebral cortex of an evaluator to measure the stress intensity generated in the cerebral cortex. A preprocessor(330) extracts the desired evaluation signal from the measured stress intensity signals. A storing unit(340) stores the preprocessed signals so that they can be analyzed online and offline. A microprocessor(350) analyzes the preprocessed signals stored in the storing unit. An output unit(360) outputs the result of the analysis.
-