-
Publication No.: KR1020100073167A
Publication date: 2010-07-01
Application No.: KR1020080131761
Filing date: 2008-12-22
Applicant: 한국전자통신연구원
IPC: G10L15/20 , G10L15/10 , G10L21/0272 , G10L19/00
CPC classification number: H04R3/005 , H04R27/00 , H04R2430/03
Abstract: PURPOSE: A method and apparatus for separating source signals are provided to improve recording, transmission, and recognition performance by separating only the desired sound source signal in an environment with multiple sound sources. CONSTITUTION: A Fourier transformer (10) transforms the mixed input signal (S1) of each channel into the frequency domain through the Fourier transform. A frequency bandwidth divider (20) forms frequency clusters from each channel's frequency-domain signal. A frequency domain signal divider (30) applies blind source separation to each frequency cluster. An inverse Fourier transformer (40) combines the spectra of the separated signals through the inverse Fourier transform.
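The four blocks in this abstract (transform, cluster the frequency bins, separate per cluster, inverse transform) can be illustrated with a minimal sketch. It is not the patented method: the abstract does not say how clusters are chosen or which blind source separation algorithm is used, so contiguous fixed-size clusters and a caller-supplied unmixing matrix per cluster are assumptions made here, and all function names are hypothetical. The demo uses a known instantaneous mixing matrix in place of a real BSS estimate.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform of one channel (block 10)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Naive inverse DFT (block 40)."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def make_clusters(n_bins, cluster_size):
    """Group bin indices into contiguous clusters (block 20, assumed layout)."""
    return [list(range(s, min(s + cluster_size, n_bins)))
            for s in range(0, n_bins, cluster_size)]

def separate(channels, cluster_size, unmix_for_cluster):
    """Apply one square unmixing matrix per frequency cluster (block 30).

    unmix_for_cluster(cluster_index, bins) stands in for a real BSS step.
    """
    spectra = [dft(ch) for ch in channels]
    n_bins = len(spectra[0])
    out = [[0j] * n_bins for _ in channels]
    for ci, bins in enumerate(make_clusters(n_bins, cluster_size)):
        w = unmix_for_cluster(ci, bins)
        for k in bins:
            mixed = [sp[k] for sp in spectra]
            for i, row in enumerate(w):
                out[i][k] = sum(c * m for c, m in zip(row, mixed))
    return [[v.real for v in idft(sp)] for sp in out]

# Demo: two sources mixed by a known 2x2 matrix A; its inverse is the "oracle" unmixer.
A = [[1.0, 0.5], [0.3, 1.0]]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
W = [[A[1][1] / det, -A[0][1] / det], [-A[1][0] / det, A[0][0] / det]]
s1 = [math.sin(2 * math.pi * t / 8) for t in range(8)]
s2 = [1.0 if t % 2 == 0 else -1.0 for t in range(8)]
x1 = [A[0][0] * a + A[0][1] * b for a, b in zip(s1, s2)]
x2 = [A[1][0] * a + A[1][1] * b for a, b in zip(s1, s2)]
estimates = separate([x1, x2], cluster_size=3, unmix_for_cluster=lambda ci, bins: W)
```

In a real system the unmixing matrix would be estimated separately for each cluster from the data, which is where the per-cluster structure pays off; the oracle matrix above only demonstrates the data flow.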
-
Publication No.: KR1020100072842A
Publication date: 2010-07-01
Application No.: KR1020080131369
Filing date: 2008-12-22
Applicant: 한국전자통신연구원
CPC classification number: G10L21/0208 , G10L15/20 , G10L25/48
Abstract: PURPOSE: A speech enhancement apparatus and a speech recognition system and method are provided to improve the recognition performance of a speech recognition system in a resource-constrained moving body by performing signal decoding through an acoustic model database. CONSTITUTION: A speed level divider (100) measures the moving speed level of the moving body from the noise signal input at the initial stage of speech recognition. When the speed level of the moving body is lower than a predetermined value, a first sound quality improvement unit (112) enhances the input voice signal with a Wiener filter. When the speed level exceeds the predetermined value, a second sound quality improvement unit (114) enhances the input voice signal with a GMM (Gaussian Mixture Model).
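The speed-dependent switching described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the abstract does not say how the speed level is derived from the noise signal, so average noise power compared against hypothetical thresholds is assumed; the Wiener gain uses the common power-subtraction estimate with a gain floor; the GMM-based path is left as a caller-supplied function; and the case where the level equals the limit, which the abstract leaves open, is routed to the Wiener path.

```python
def frame_power(frame):
    """Mean squared amplitude of one frame of samples."""
    return sum(v * v for v in frame) / len(frame)

def speed_level(noise_frames, thresholds):
    """Map leading noise-only frames to a discrete level (assumed rule):
    the level is the number of ascending thresholds the average power exceeds."""
    avg = sum(frame_power(f) for f in noise_frames) / len(noise_frames)
    return sum(1 for th in thresholds if avg > th)

def wiener_gain(noisy_power, noise_power, floor=0.05):
    """Wiener gain G = SNR / (1 + SNR), with clean power estimated as
    noisy_power - noise_power, which reduces to 1 - noise/noisy (floored)."""
    if noisy_power <= 0.0:
        return floor
    return max(1.0 - noise_power / noisy_power, floor)

def select_enhancer(level, level_limit, wiener_enhancer, gmm_enhancer):
    """Pick the enhancement path: GMM-based above the limit, Wiener otherwise."""
    return gmm_enhancer if level > level_limit else wiener_enhancer

# Hypothetical usage: two thresholds give levels 0, 1, or 2.
level = speed_level([[0.1, -0.1, 0.1, -0.1]], thresholds=[0.005, 0.05])
chosen = select_enhancer(level, 1, "wiener", "gmm")
```

Strings stand in for the two enhancer callables in the usage lines to keep the sketch short.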
-
Publication No.: KR1020100069117A
Publication date: 2010-06-24
Application No.: KR1020080127707
Filing date: 2008-12-16
Applicant: 한국전자통신연구원
Abstract: PURPOSE: A CMS (Cepstrum Mean Subtraction) method and device are provided to accurately normalize channel characteristics by estimating the cepstral mean of the actual speech section from the cepstral mean of the silence sections. CONSTITUTION: A feature extractor (200) extracts features from the silence section before the start point, the speech section, and the silence section after the end point. An utterance-level CMS value calculator (600) calculates the actual utterance-level cepstral mean over the entire speech section. A cepstral mean estimator (300) estimates the cepstral mean of the entire section based on the features of the silence sections. A feature vector CMS applier (400) performs channel normalization using the estimated mean. A decoder decodes the channel-normalized MFCC feature vectors.
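For reference, plain cepstral mean subtraction can be sketched as below. The abstract does not specify how the estimator (300) maps silence-section statistics to a speech-section mean, so that mapping is not reproduced here; the sketch only shows the subtraction step, with the mean passed in so that either the true utterance mean or an estimated one could be used. Function names are hypothetical.

```python
def cepstral_mean(frames):
    """Per-coefficient mean over a list of equal-length feature vectors."""
    n = len(frames)
    dim = len(frames[0])
    return [sum(f[d] for f in frames) / n for d in range(dim)]

def apply_cms(frames, mean):
    """Subtract the given mean vector from every frame (channel normalization)."""
    return [[c - m for c, m in zip(f, mean)] for f in frames]

# Hypothetical two-frame, two-coefficient example.
speech = [[1.0, 2.0], [3.0, 6.0]]
mean = cepstral_mean(speech)
normalized = apply_cms(speech, mean)
```

A constant channel offset added to every frame shifts the mean by the same offset, so it cancels in the subtraction; this is why an accurate mean estimate matters for short utterances.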
-
Publication No.: KR1020100066917A
Publication date: 2010-06-18
Application No.: KR1020080125434
Filing date: 2008-12-10
Applicant: 한국전자통신연구원
CPC classification number: G01C21/3608 , G01C21/3611 , G01C21/3629 , G06F17/3074 , G10L15/18 , G10L15/22
Abstract: PURPOSE: A voice recognition method for a vehicle navigation terminal is provided that presents a semantic classification system for the POI name domain and generates utterance variants through simple pattern construction using the analyzed and tagged results. CONSTITUTION: The voice recognition method of the vehicle navigation terminal is as follows. A point-of-interest (POI) list and POI learning data are recognized from the voice information of utterance variants input to the vehicle navigation terminal (S200). A resource is built from the recognized POI list and POI learning data (S202). Analysis and tagging are performed on the built resource using the POI list (S204). The analyzed and tagged result is stored as a POI database (S206). A simplex/analyzed database is built based on the POI list and the POI learning data. An N-gram vocabulary is extracted from the POI learning data.
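The final step, extracting an N-gram vocabulary from the POI learning data, might look like the following minimal sketch. The abstract does not define the tokenization or the N-gram unit, so whitespace-separated words are assumed here, and the sample POI names and the function name are hypothetical.

```python
from collections import Counter

def extract_ngrams(names, n):
    """Count word-level n-grams across a list of POI name strings."""
    counts = Counter()
    for name in names:
        tokens = name.lower().split()
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts

# Hypothetical learning data.
poi_names = ["Seoul Station", "Seoul City Hall"]
unigrams = extract_ngrams(poi_names, 1)
bigrams = extract_ngrams(poi_names, 2)
```

Frequent N-grams such as a shared city name are natural candidates for the pattern slots from which utterance variants are generated.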
-
Publication No.: KR100961716B1
Publication date: 2010-06-10
Application No.: KR1020080069879
Filing date: 2008-07-18
Applicant: 한국전자통신연구원
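Abstract: The present invention relates to a method and apparatus for converting poses between digital characters. Applied to digital video, in which an increasingly diverse range of characters appears, it can convert the pose of one character into a variety of poses for a new character at low cost, shortening the production period and greatly reducing production costs. In addition, the invention can be applied between characters whose joint structure or number of joints is similar or different. It preserves the basic characteristics of the original pose that the user wishes to keep while reflecting the pose characteristics of the new character, and it uses an optimization method, so that computation is efficient and a wider variety of natural character poses can be obtained from a single character pose, greatly improving productivity.
Digital character, joint, probability distribution function, modeling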
-