-
Publication No.: KR101294024B1
Publication Date: 2013-08-08
Application No.: KR1020090127336
Filing Date: 2009-12-18
Applicant: Electronics and Telecommunications Research Institute (ETRI)
Abstract: The present invention relates to an apparatus and method for producing, distributing, and using interactive content for an e-book system.
The invention provides a content providing apparatus comprising: an authoring-tool distribution unit that distributes content authoring tools and content component items; an interactive content production unit that generates interactive content using the authoring tools and component items from the authoring-tool distribution unit, optionally together with previously produced content; and an interactive content distribution unit that distributes the generated interactive content to terminals or further to other interactive content production units, wherein the interactive content includes a script, object data, and scene data.
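The three-part content bundle named in the claim (script, object data, scene data) can be sketched as plain data structures; this is an illustrative guess at the layout, not the patent's actual format, and every name here is hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class ObjectData:
    # A drawable or audible asset referenced by scenes and the script
    object_id: str
    media_type: str  # e.g. "image", "audio"

@dataclass
class SceneData:
    # Layout for one page of the interactive book
    scene_id: str
    objects: list  # ids of ObjectData placed in this scene

@dataclass
class InteractiveContent:
    # The three parts named in the claim: script, object data, scene data
    script: str                                  # behavior description
    objects: dict = field(default_factory=dict)  # object_id -> ObjectData
    scenes: list = field(default_factory=list)   # ordered SceneData

content = InteractiveContent(
    script="on_touch(cat): play('meow')",
    objects={"cat": ObjectData("cat", "image")},
    scenes=[SceneData("page1", ["cat"])],
)
```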
e-book, speech recognition, speech synthesis, interactive storybook, children, infants
-
Publication No.: KR1020130068624A
Publication Date: 2013-06-26
Application No.: KR1020110135919
Filing Date: 2011-12-15
Applicant: Electronics and Telecommunications Research Institute (ETRI)
IPC: G10L15/14
CPC classification number: G10L17/04 , G10L15/063 , G10L15/18
Abstract: PURPOSE: A speaker-group-based speech recognition apparatus and method are provided to automatically determine the speaker group for speech uttered by a member of a group sharing a specific interest, to build a language model from the linguistic phenomena common to the determined speaker group, and to perform speech recognition using the built language model. CONSTITUTION: A first decoder unit (120) performs speech recognition on the input speaker voice using a universal language model and generates a first recognition result. A language model interpolation unit (130) generates an interpolated language model from the generated first recognition result and the speaker-group language model. A second decoder unit (140) performs speech recognition on the speaker voice input using the generated interpolated language model. [Reference numerals] (110) Voice recognition storage unit; (120) First decoder unit; (130) Language model interpolation unit; (140) Second decoder unit; (150) Speaker classification unit; (160) Corpus storage unit; (170) Classification index unit; (180) Language model training unit; (AA) User voice input
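The interpolation step described above can be illustrated with simple unigram models; this is a minimal sketch, not the patented estimator, with hypothetical vocabularies and a fixed mixing weight `lam`:

```python
def interpolate_lm(universal, group, lam=0.5):
    """Linearly interpolate two unigram language models.

    P(w) = lam * P_group(w) + (1 - lam) * P_universal(w)
    """
    vocab = set(universal) | set(group)
    return {w: lam * group.get(w, 0.0) + (1 - lam) * universal.get(w, 0.0)
            for w in vocab}

universal = {"hello": 0.6, "world": 0.4}       # general-domain unigrams
group = {"hello": 0.2, "seoul": 0.8}           # speaker-group unigrams
mixed = interpolate_lm(universal, group, lam=0.5)
# "hello": 0.5 * 0.2 + 0.5 * 0.6 = 0.4
```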
-
Publication No.: KR1020130067848A
Publication Date: 2013-06-25
Application No.: KR1020110134837
Filing Date: 2011-12-14
Applicant: Electronics and Telecommunications Research Institute (ETRI)
Abstract: PURPOSE: A voice recognition server and method are provided that apply terminal information from a user terminal and voice information from the user and perform recognition with the matching acoustic model, thereby producing recognition results under a channel environment matched to the acoustic model's training environment. CONSTITUTION: An acoustic model storage part (220) stores multiple acoustic models. An acoustic model extraction part (260) extracts the matching model from among the stored models based on the gender of the user of the user terminal, determined by a user gender determination part (240), and on terminal information verified by a terminal information verifying part (250). A decoding part (270) applies the extracted acoustic model and recognizes the user's voice. [Reference numerals] (210) Control part; (220) Acoustic model storage part; (230) Communication part; (240) User gender determination part; (250) Terminal information verifying part; (260) Acoustic model extraction part; (270) Decoding part; (AA) Terminal/gender acoustic model 1; (BB) Terminal/gender acoustic model 2; (CC) Terminal/gender acoustic model N
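The extraction step keyed on terminal type and gender can be sketched as a table lookup with a fallback, mirroring the "terminal/gender acoustic model 1..N" blocks in the figure; the registry and model names are hypothetical:

```python
# Hypothetical registry of acoustic models keyed by (terminal type, gender)
ACOUSTIC_MODELS = {
    ("smartphone", "female"): "am_smartphone_female",
    ("smartphone", "male"): "am_smartphone_male",
    ("tablet", "female"): "am_tablet_female",
}

def extract_acoustic_model(terminal, gender, default="am_universal"):
    # Fall back to a terminal/gender-independent model when no match exists.
    return ACOUSTIC_MODELS.get((terminal, gender), default)
```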
-
Publication No.: KR101217525B1
Publication Date: 2013-01-18
Application No.: KR1020080131365
Filing Date: 2008-12-22
Applicant: Electronics and Telecommunications Research Institute (ETRI)
CPC classification number: G10L15/08 , G10L15/142
Abstract: A Viterbi decoder according to the present invention computes the observation probability for the observation vector of an input speech frame, updates the current observation probability through nonlinear filtering against the observation probabilities computed for past speech frames, computes the maximum-likelihood value on that basis, and outputs the recognized word.
By applying nonlinear filtering to the observation probabilities, the invention restores observation probability values based on the correlation that exists between speech signals, preventing the observation probability of segments corrupted by unintended impulsive noise from dropping sharply.
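One common nonlinear filter that behaves as described, suppressing impulsive dips in per-frame observation scores while tracking the neighboring frames, is a running median; the patent does not specify the filter, so this sketch is an assumption:

```python
def median_filter_loglik(logliks, radius=1):
    """Replace each frame's observation log-likelihood with the median of
    a window over neighboring frames, suppressing impulsive dips."""
    out = []
    for i in range(len(logliks)):
        lo, hi = max(0, i - radius), min(len(logliks), i + radius + 1)
        window = sorted(logliks[lo:hi])
        out.append(window[len(window) // 2])
    return out

# A single impulsive-noise frame (-90.0) is pulled back toward its neighbors.
smoothed = median_filter_loglik([-10.0, -11.0, -90.0, -12.0, -10.5])
```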
Viterbi decoder, speech, observation probability, nonlinear filtering, noise
-
Publication No.: KR1020130005160A
Publication Date: 2013-01-15
Application No.: KR1020110066574
Filing Date: 2011-07-05
Applicant: Electronics and Telecommunications Research Institute (ETRI)
CPC classification number: H04M1/72552 , G10L15/083 , G10L15/30 , H04M2250/74 , H04W4/12
Abstract: PURPOSE: A message service method using a voice recognition function is provided that delivers a message by combining the voice recognition result with the user's real voice. CONSTITUTION: A message server (20) recognizes the voice transmitted from a transmission terminal (10) (S14). The message server generates a recognition result for the voice and an N-best result based on a confusion network. The message server transmits the generated N-best result to the transmission terminal (S20). The message server receives the selected message and an evaluation of its accuracy from the transmission terminal (S26). The message server transmits the message and the evaluation result to a reception terminal (30) (S32). [Reference numerals] (10) Transmission terminal; (20) Message server; (30) Reception terminal; (S10) Inputting voice; (S12, S40) Transmitting the voice; (S14) Recognizing the voice; (S16) Generating a recognition result and an N-best result; (S18) Storing log data; (S20) Transmitting the recognition result and the N-best result; (S22) Displaying the recognition result and the N-best result; (S24) Determining a message and an evaluation result; (S26, S32) Transmitting the message and the evaluation result; (S28) Storing additional log data; (S30) Correcting errors in the recognition result; (S34) Displaying the message and the evaluation result; (S36) Requesting the voice; (S38) Extracting the voice; (S42) Outputting the voice
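The N-best generation step (S16) can be illustrated with a toy scorer; real systems read posteriors off the confusion network, but this sketch simply sorts hypothetical (text, score) pairs:

```python
def n_best(hypotheses, n=3):
    """Return the n highest-scoring recognition hypotheses.

    `hypotheses` is a list of (text, score) pairs, e.g. posterior scores
    read off a confusion network; this sketch just sorts them.
    """
    ranked = sorted(hypotheses, key=lambda h: h[1], reverse=True)
    return [text for text, _ in ranked[:n]]

# The sender would see these candidates (S22) and pick one (S24).
candidates = n_best([("meet at nine", 0.7), ("meat at nine", 0.2),
                     ("meet at night", 0.6)], n=2)
```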
-
Publication No.: KR1020120075585A
Publication Date: 2012-07-09
Application No.: KR1020100129360
Filing Date: 2010-12-16
Applicant: Electronics and Telecommunications Research Institute (ETRI)
Abstract: PURPOSE: A dialogue method and system are provided that enable the system's utterance level to be adjusted to the user's learning stage by controlling various utterance flows. CONSTITUTION: A voice recognition unit (102) converts the received user utterance into utterance text using utterance information. A language understanding unit (103) determines the user's dialogue act from the converted utterance text. A dialogue progress management unit (104) selects one utterance point from among the utterance points connected to a target utterance point. A system dialogue creation unit (106) searches the utterance patterns connected to the selected utterance point.
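The selection of one utterance point among those connected toward a target can be sketched as a lookup in a small flow graph; the graph, the `level` knob standing in for the learner stage, and all labels are hypothetical:

```python
# Hypothetical utterance-flow graph: each utterance point lists the points
# that may follow it on the way to a target utterance point.
FLOW = {
    "greet": ["ask_name", "ask_topic"],
    "ask_name": ["confirm_name"],
    "ask_topic": ["explain_topic"],
}

def next_utterance_point(current, target, level=0):
    """Pick one follow-up utterance point; `level` models the learner
    stage from the abstract by indexing into the candidate list."""
    candidates = FLOW.get(current, [])
    if not candidates:
        return target  # no outgoing edges: jump to the target point
    return candidates[min(level, len(candidates) - 1)]
```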
-
Publication No.: KR1020120066530A
Publication Date: 2012-06-22
Application No.: KR1020100127907
Filing Date: 2010-12-14
Applicant: Electronics and Telecommunications Research Institute (ETRI)
CPC classification number: G10L15/065 , G10L15/187
Abstract: PURPOSE: An apparatus for estimating language model weight is provided that enhances the performance of the secondary search and improves the performance of a voice recognition system. CONSTITUTION: The apparatus comprises: a first search unit (101) that performs a primary search by applying a first language model; a phoneme recognition unit (102) that outputs a second acoustic score by applying an acoustic model to an acoustic feature vector; a weight estimation unit (103) that outputs a language model weight according to whether the acoustic score of the voice recognition result is higher than the acoustic score of the phoneme recognition result; and a second search unit (104) that applies the estimated language model weight to the word lattice.
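The weight decision described, comparing the recognizer's acoustic score against free phoneme recognition, can be sketched as follows; the concrete weight values are hypothetical, not taken from the patent:

```python
def choose_lm_weight(asr_acoustic_score, phoneme_acoustic_score,
                     high_weight=12.0, low_weight=8.0):
    """When the word recognizer's acoustic score beats unconstrained
    phoneme recognition, the acoustic evidence supports the word
    hypothesis, so a higher language-model weight is safe for the
    secondary (lattice) search; otherwise back off to a lower weight."""
    if asr_acoustic_score > phoneme_acoustic_score:
        return high_weight
    return low_weight

w = choose_lm_weight(-4200.0, -4350.0)  # word hypothesis scores better
```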
-
Publication No.: KR1020120056661A
Publication Date: 2012-06-04
Application No.: KR1020100118310
Filing Date: 2010-11-25
Applicant: Electronics and Telecommunications Research Institute (ETRI)
Abstract: PURPOSE: A voice signal pre-processing device and method are provided that interpolate and restore voice signals with abnormal amplitude in mobile environments, thereby improving voice recognition performance. CONSTITUTION: A voiced sound section detecting unit (120) detects a voiced section containing a voiced signal within a speech section. A pre-processing method determining unit (140) detects a clipped signal occurring within the voiced section. A clipping signal processing unit (160) extracts signal samples close to the clipped signal and interpolates the clipped signal using those samples.
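Restoring clipped samples from nearby unclipped ones can be sketched with linear interpolation across each clipped run; the patent does not fix the interpolation rule, so linear is an assumption here:

```python
def interpolate_clipping(samples, limit=32767):
    """Detect runs of clipped samples (|x| >= limit) and replace them by
    linear interpolation between the nearest unclipped neighbors."""
    out = list(samples)
    i = 0
    while i < len(out):
        if abs(out[i]) >= limit:
            start = i
            while i < len(out) and abs(out[i]) >= limit:
                i += 1
            left = out[start - 1] if start > 0 else 0
            right = out[i] if i < len(out) else left
            run = i - start
            for k in range(run):
                out[start + k] = left + (right - left) * (k + 1) / (run + 1)
        else:
            i += 1
    return out

# Two clipped samples are replaced by values on the line from 100 to 220.
restored = interpolate_clipping([100, 32767, 32767, 220])
```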
-
Publication No.: KR1020120026357A
Publication Date: 2012-03-19
Application No.: KR1020100088526
Filing Date: 2010-09-09
Applicant: Electronics and Telecommunications Research Institute (ETRI)
Abstract: PURPOSE: A device for driving a voice recognition system is provided that performs voice recognition when a pre-stored keyword is spoken, without any additional key operation, thereby increasing user convenience. CONSTITUTION: When a user speaks the keyword to be registered, a user registration unit (100) calculates a threshold value from the keyword and stores it in a storage unit (114). When utterance data is input, a voice recognition and driving unit (150) calculates a likelihood ratio for the utterance data and drives the system by comparing the likelihood ratio with the threshold value.
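The enrollment threshold and the likelihood-ratio wake-up test can be sketched as follows; the mean-minus-margin threshold rule is an assumption for illustration, not the patent's formula:

```python
def enroll_threshold(ratios, margin=0.5):
    """Derive a per-user threshold from the log-likelihood ratios observed
    while the user speaks the registration keyword (mean minus a margin)."""
    return sum(ratios) / len(ratios) - margin

def should_wake(keyword_loglik, filler_loglik, threshold):
    """Wake the system when the log-likelihood ratio between the keyword
    model and a filler (background) model clears the enrolled threshold."""
    return (keyword_loglik - filler_loglik) > threshold

thr = enroll_threshold([3.0, 3.4, 2.6])  # enrollment utterances -> 2.5
```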
-
Publication No.: KR101082837B1
Publication Date: 2011-11-11
Application No.: KR1020080131243
Filing Date: 2008-12-22
Applicant: Electronics and Telecommunications Research Institute (ETRI)
IPC: G10L21/0208 , G10L15/20
Abstract: The present invention relates to a noise removal apparatus and method. To improve noise removal efficiency in environments with rapidly changing noise or a mixture of several noise types, it strengthens the separation of speech and noise through speech/noise separation techniques such as soft masking, and applies a noise adaptation technique to compensate for the limits of a noise Gaussian mixture model in modeling the noise component of the input signal, thereby estimating clean speech more accurately and improving speech recognition performance.
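The soft-masking idea, weighting each spectral bin by its estimated speech share instead of making a hard keep/drop decision, can be sketched per frequency bin; this is a generic Wiener-style mask, not the patent's exact estimator:

```python
def soft_mask(speech_power, noise_power):
    """Per-bin soft mask: the estimated fraction of power due to speech."""
    return [s / (s + n) if (s + n) > 0 else 0.0
            for s, n in zip(speech_power, noise_power)]

def apply_mask(spectrum, mask):
    # Attenuate each bin of the noisy power spectrum by its mask weight.
    return [x * m for x, m in zip(spectrum, mask)]

mask = soft_mask([9.0, 1.0], [1.0, 9.0])   # speech-dominant vs noise-dominant bin
cleaned = apply_mask([10.0, 10.0], mask)
```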
-