-
Publication No.: KR1020100096580A
Publication date: 2010-09-02
Application No.: KR1020090015507
Application date: 2009-02-24
CPC classification number: G10L17/26
Abstract: PURPOSE: An emotion recognition method using a minimum classification error technique is provided that applies discriminative weights to the features of emotions that are hard to classify, thereby improving emotion recognition performance. CONSTITUTION: An emotion recognition feature vector is extracted from a voice signal produced by a speaker and from the speaker's galvanic skin response (S11). Based on the extracted emotion recognition feature vector, neutral emotion is classified using a Gaussian mixture model (S12). The pre-classified emotions other than neutral are then classified using a Gaussian mixture model to which discriminative weights are applied, the weights being those that minimize the loss function value for the emotion recognition feature vector (S13).
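Illustrative sketch (not part of the patent text): the abstract does not give the loss function, so the code below assumes the standard smoothed minimum-classification-error (MCE) loss, in which a misclassification measure compares the score of the correct class against a soft maximum over the competing classes and a sigmoid maps it to the range (0, 1). The function name `mce_loss` and the parameters `eta` and `gamma` are hypothetical. In an MCE scheme, discriminative weights would typically be adjusted by gradient descent on the sum of this loss over the training data.

```python
import math

def mce_loss(scores, true_class, eta=1.0, gamma=1.0):
    """Smoothed MCE loss for one sample (assumed standard formulation).

    scores: per-class discriminant values, e.g. GMM log-likelihoods.
    true_class: index of the correct class.
    """
    g_true = scores[true_class]
    rivals = [g for i, g in enumerate(scores) if i != true_class]
    # Soft maximum over the competing classes (computed stably);
    # it approaches max(rivals) as eta grows.
    m = max(rivals)
    mean_exp = sum(math.exp(eta * (g - m)) for g in rivals) / len(rivals)
    soft_max = m + math.log(mean_exp) / eta
    # Misclassification measure: positive when a rival outscores the truth.
    d = soft_max - g_true
    # Sigmoid turns the measure into a differentiable 0..1 error count.
    return 1.0 / (1.0 + math.exp(-gamma * d))
```

A loss below 0.5 corresponds to a correct decision and a loss above 0.5 to an error, which is what makes the quantity a smooth stand-in for the classification error rate.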
-
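Publication No.: KR1020040079773A
Publication date: 2004-09-16
Application No.: KR1020030014814
Application date: 2003-03-10
Applicant: 한국전자통신연구원 (Electronics and Telecommunications Research Institute, ETRI)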
IPC: G10L25/93
Abstract: PURPOSE: An apparatus and a method for discriminating voiced sound from unvoiced sound based on a statistical model are provided to improve voiced/unvoiced discrimination performance through a simple calculation, even in a noisy environment. CONSTITUTION: The apparatus includes a signal converter (100) that converts a speech signal containing noise into a frequency-domain signal, a noise power estimator (200) that estimates the noise power from the frequency-domain signal, and a first likelihood ratio calculator (310) that calculates a decision rule for voice detection on the frequency-domain signal in a low frequency band with reference to the estimated noise power. The apparatus further includes a second likelihood ratio calculator (320) that calculates a decision rule for voice detection on the frequency-domain signal in a high frequency band with reference to the estimated noise power, and a final likelihood ratio tester (400) that calculates a voiced/unvoiced likelihood ratio from the likelihood ratios produced by the first and second likelihood ratio calculators and compares it with a predetermined threshold to decide between voiced and unvoiced sound.
-
Publication No.: KR101014321B1
Publication date: 2011-02-14
Application No.: KR1020090015507
Application date: 2009-02-24
CPC classification number: G10L17/26
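Abstract: The disclosed emotion recognition method using a minimum classification error technique classifies neutral emotion using a Gaussian mixture model when recognizing a speaker's emotion, and classifies the emotions other than neutral using a Gaussian mixture model to which discriminative weights are applied, the weights minimizing the loss function value of the classification error for the emotion recognition feature vector. Because the discriminative weights obtained by the minimum classification error technique are applied to the feature vectors of the non-neutral emotions, which are hard to tell apart, emotion recognition performance can be improved.
Keywords: emotion recognition, minimum classification error technique, Gaussian mixture model
-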
Publication No.: KR100901191B1
Publication date: 2009-06-04
Application No.: KR1020070045691
Application date: 2007-05-10
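Abstract: The present invention relates to a method and apparatus for gender recognition based on a voice signal. In the gender recognition method according to the invention, a feature vector is extracted after removing the unvoiced components, which, among the voiced and unvoiced components that make up a voice signal, show no distinct variation with gender. Comparing this feature vector with a previously generated Gaussian mixture model (GMM) raises the accuracy of gender recognition. In addition, when gender is recognized using the pitch (the period of vocal cord vibration in voiced sounds) or the formant spectrum, working from a voice signal from which the unvoiced sounds have been removed according to the invention allows more accurate gender recognition when the pitch or the average vocal cord length is used.
Keywords: gender recognition, vocal cords, pitch, voiced sound, unvoiced sound, feature vector
-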
Publication No.: KR100824312B1
Publication date: 2008-04-22
Application No.: KR1020070076362
Application date: 2007-07-30
Abstract: A system and a method for identifying the gender of a voice signal are provided that obtain a feature vector from the group delay of the voice signal's phase spectrum and compare it with a Gaussian mixture model of each gender's voice signal according to an expectation-maximization algorithm, thereby detecting the gender of the voice signal effectively. The gender identifying system comprises the following. A storage unit (40) stores a Gaussian mixture model of each gender's voice signal. An input unit (10) receives a signal that includes a voice signal. An analyzing unit (30) detects the gender of the input voice signal by comparing a feature vector, obtained from the group delay of the input voice signal, with the Gaussian mixture models stored in the storage unit. The system also comprises a voice detector (20) that extracts only the voice signal from the signal received by the input unit and passes the extracted voice signal to the analyzing unit.
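Illustrative sketch (not part of the patent text): the abstract does not say how the group delay is computed. A common way to obtain it without phase unwrapping uses the identity τ(ω) = (X_R·Y_R + X_I·Y_I) / |X|², where X is the DFT of the frame x[n] and Y is the DFT of n·x[n]. The function names and the `eps` guard below are hypothetical, and the naive O(N²) DFT is used only to keep the example self-contained.

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def group_delay(x, eps=1e-12):
    """Group delay, in samples, at each DFT bin of the frame x."""
    X = dft(x)
    Y = dft([t * v for t, v in enumerate(x)])  # DFT of n * x[n]
    # Real part of Y/X, written out so no complex division is needed;
    # eps guards against bins where the spectrum is (near) zero.
    return [(a.real * b.real + a.imag * b.imag) / max(abs(a) ** 2, eps)
            for a, b in zip(X, Y)]
```

For a unit impulse delayed by two samples, `group_delay([0.0, 0.0, 1.0, 0.0])` gives a delay of 2 samples at every bin, as expected for a pure delay.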
-
Publication No.: KR100822024B1
Publication date: 2008-04-15
Application No.: KR1020070076335
Application date: 2007-07-30
Abstract: An acoustic-signal-based environment recognition method for a context-aware communication terminal is provided that performs environment recognition without a separate feature vector extraction step, using only important parameters that are produced automatically during the encoding process. The environment recognition method comprises the following steps: configuring a Gaussian mixture model using one or more feature vectors extracted from a real-time voice signal during the SMV (Selectable Mode Vocoder) encoding process; obtaining a likelihood for the real-time voice signal using the pre-trained Gaussian mixture model; and selecting the environment flag corresponding to the maximum likelihood value.
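Illustrative sketch (not part of the patent text): the last two steps of the abstract, scoring the signal against pre-trained Gaussian mixture models and picking the environment with the maximum likelihood, can be outlined as follows. Diagonal covariances, the data layout, and the environment labels in the usage note are assumptions; the SMV parameter extraction and the model training are not shown.

```python
import math

def gmm_log_likelihood(x, gmm):
    """Log-likelihood of vector x under a diagonal-covariance GMM.

    gmm: list of (weight, means, variances) tuples, one per component.
    """
    terms = []
    for weight, means, variances in gmm:
        log_p = math.log(weight)
        for xi, mu, var in zip(x, means, variances):
            log_p -= 0.5 * (math.log(2.0 * math.pi * var) + (xi - mu) ** 2 / var)
        terms.append(log_p)
    m = max(terms)
    # Log-sum-exp over components, shifted by the maximum for stability.
    return m + math.log(sum(math.exp(t - m) for t in terms))

def classify_environment(frames, models):
    """Return the label whose GMM gives the highest total log-likelihood."""
    totals = {label: sum(gmm_log_likelihood(x, gmm) for x in frames)
              for label, gmm in models.items()}
    return max(totals, key=totals.get)
```

With two hypothetical one-dimensional models, `{"street": [(1.0, [0.0], [1.0])], "office": [(1.0, [5.0], [1.0])]}`, frames near 5 are assigned to `"office"` and frames near 0 to `"street"`.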
-
Publication No.: KR100893154B1
Publication date: 2009-04-16
Application No.: KR1020080100278
Application date: 2008-10-13
IPC: G10L15/02
Abstract: A method and an apparatus for recognizing the gender of a voice signal are provided that apply a different weight, obtained through discriminative weight training, to each order of an MFCC feature vector, thereby improving gender recognition performance. A voice signal is input (S501). An MFCC feature vector is extracted from the input voice signal (S502). A discriminative weight is assigned to each order of the extracted MFCC feature vector (S503). The gender of the input signal is recognized by a decision formula to which the weights assigned to each order are applied. In particular, the gender is decided from a value obtained by multiplying each order of the MFCC feature vector by its weight, together with a kernel function.
-
Publication No.: KR100839065B1
Publication date: 2008-06-19
Application No.: KR1020070076351
Application date: 2007-07-30
Abstract: A voice detector and a voice detection method are provided that apply an optimized weight, derived by a minimum classification error scheme, to the likelihood ratio of each frequency band and compute the geometric mean of the weighted likelihood ratios, thereby improving voice detection performance and detecting a voice signal efficiently even in a very noisy environment. The voice detector comprises an input unit (10) and an analysis unit (20). The input unit receives a signal that includes a voice signal and applies the DFT (discrete Fourier transform) to it. The analysis unit comprises a likelihood ratio calculating module (21), a weight applying module (22), and a voice detecting module (23). The likelihood ratio calculating module calculates a likelihood ratio for each frequency band of the transformed signal. The weight applying module multiplies the likelihood ratio of each frequency band by a different weight to obtain the weighted likelihood ratios. The voice detecting module compares the geometric mean of the weighted likelihood ratios with a preset value to determine whether the transformed signal contains a voice signal. The weight applying module calculates the weight for each frequency band by an MCE (minimum classification error) method.
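Illustrative sketch (not part of the patent text): the decision stage described in the abstract, weighting the per-band likelihood ratios and comparing their geometric mean with a threshold, might look like the code below. Two points are assumptions: the per-band likelihood ratio uses the common complex-Gaussian statistical model, which the abstract does not name, and the weights are applied in the log domain, so the statistic is the logarithm of a weighted geometric mean. The weights are taken as given; their MCE training is not shown, and all function and parameter names are hypothetical.

```python
import math

def band_log_likelihood_ratios(power, noise_power, prior_snr):
    """Per-band log likelihood ratios under an assumed complex-Gaussian model.

    power: observed power |X_k|^2 in each frequency band.
    noise_power: estimated noise power in each band.
    prior_snr: a priori SNR estimate in each band.
    """
    llrs = []
    for p, n, xi in zip(power, noise_power, prior_snr):
        post_snr = p / n  # a posteriori SNR
        llrs.append(post_snr * xi / (1.0 + xi) - math.log(1.0 + xi))
    return llrs

def weighted_log_geometric_mean(llrs, weights):
    """Weighted average of log ratios, i.e. log of a weighted geometric mean."""
    return sum(w * l for w, l in zip(weights, llrs)) / sum(weights)

def is_speech(power, noise_power, prior_snr, weights, threshold=0.0):
    """Declare speech when the weighted statistic exceeds the threshold."""
    llrs = band_log_likelihood_ratios(power, noise_power, prior_snr)
    return weighted_log_geometric_mean(llrs, weights) > threshold
```

With equal weights this reduces to the plain geometric mean test; unequal weights let reliable bands dominate the decision, which is the effect the abstract attributes to the MCE-derived weights.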
-
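Publication No.: KR100530261B1
Publication date: 2005-11-22
Application No.: KR1020030014814
Application date: 2003-03-10
Applicant: 한국전자통신연구원 (Electronics and Telecommunications Research Institute, ETRI)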
IPC: G10L25/93
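Abstract: The present invention relates to an apparatus and a method for discriminating voiced and unvoiced sounds that achieve high performance even in noisy environments without requiring complex parameter calculations.
The invention includes: a signal converter that converts a speech signal containing noise into a frequency-domain signal; a noise power estimator that estimates the noise power from the frequency-domain signal; a first likelihood ratio calculator that, with reference to the estimated noise power, calculates a decision rule for voice detection on the frequency-domain signal in a low frequency band; a second likelihood ratio calculator that, with reference to the estimated noise power, calculates a decision rule for voice detection on the frequency-domain signal in a high frequency band; and a final likelihood ratio tester that decides between voiced and unvoiced sound based on the likelihood ratios calculated by the first and second likelihood ratio calculators. Here, the first and second likelihood ratio calculators calculate the decision rules for voice detection in the low and high frequency bands, respectively, on the basis of a statistical model.
-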
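Publication No.: KR1020040056977A
Publication date: 2004-07-01
Application No.: KR1020020083728
Application date: 2002-12-24
Applicant: 한국전자통신연구원 (Electronics and Telecommunications Research Institute, ETRI)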
IPC: G10L15/14
CPC classification number: G10L25/78
Abstract: PURPOSE: A voice activity detecting apparatus and method using a complex Laplacian statistical model are provided to compare a Laplacian model with a Gaussian model. CONSTITUTION: A voice activity detector using a complex Laplacian statistical model includes a fast Fourier transformer (10), a noise power estimator (20), and a likelihood ratio test calculator (30). The fast Fourier transformer applies the fast Fourier transform to the input speech so that the time-domain speech signal can be analyzed in the frequency domain. The noise power estimator estimates the power of the noise signal from the noise-contaminated speech signal in the frequency domain, as output by the fast Fourier transformer. The likelihood ratio test calculator calculates a decision rule for voice activity detection from the power of the noise signal and the complex Laplacian statistical model.
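Illustrative sketch (not part of the patent text): the abstract does not state the likelihood ratio itself. Assuming the real and imaginary parts of a DFT coefficient are independent Laplacian variables that each carry half of the total variance λ, the density is p(X) = (1/λ)·exp(−2(|X_R| + |X_I|)/√λ), with λ = λ_n when only noise is present and λ = λ_n + λ_s when speech is present. Dividing the two densities gives the per-bin log likelihood ratio computed below. The function and parameter names are hypothetical.

```python
import math

def laplacian_log_lr(x_real, x_imag, noise_power, speech_power):
    """Log likelihood ratio (speech present vs. absent) for one DFT bin.

    Assumes independent Laplacian real and imaginary parts, each with
    half of the total variance (an assumption, not taken from the patent).
    """
    total = noise_power + speech_power
    magnitude = abs(x_real) + abs(x_imag)
    return (math.log(noise_power / total)
            + 2.0 * magnitude * (1.0 / math.sqrt(noise_power)
                                 - 1.0 / math.sqrt(total)))
```

The ratio is negative for small coefficients and grows linearly with |X_R| + |X_I|, in contrast to the Gaussian model, where the exponent grows with the squared magnitude.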
-