Abstract:
A system and method are presented for optimization of audio fingerprint search. In an embodiment, the audio fingerprints are organized into a recursive tree with different branches containing fingerprint sets that are dissimilar to each other. The tree is constructed using a clustering algorithm based on a similarity measure. The similarity measure may comprise a Hamming distance for binary fingerprints or a Euclidean distance for continuous-valued fingerprints. In another embodiment, each fingerprint is stored at a plurality of resolutions and clustering is performed hierarchically. The recognition of an incoming fingerprint begins from the root of the tree and proceeds down its branches until a match or mismatch is declared. In yet another embodiment, the fingerprint definition is generalized to include more detailed audio information than in the previous definition.
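A minimal sketch of such a tree, assuming binary fingerprints stored as NumPy arrays and Hamming distance as the similarity measure; the two-way split, leaf size, and match threshold are placeholder choices, and the crude seed-based split stands in for whatever clustering algorithm the system actually uses:

```python
import numpy as np

def hamming(a, b):
    """Hamming distance between two equal-length binary fingerprints."""
    return int(np.count_nonzero(a != b))

class FingerprintNode:
    """One node of the recursive fingerprint tree; leaves hold small
    fingerprint sets, inner nodes hold one representative per branch."""

    def __init__(self, fingerprints, leaf_size=8, branching=2, seed=0):
        self.fingerprints = fingerprints
        self.centroids, self.children = [], []
        rng = np.random.default_rng(seed)
        if len(fingerprints) <= leaf_size:
            return
        # Crude split: pick `branching` seed fingerprints and assign every
        # fingerprint to the nearest seed by Hamming distance.
        seeds = rng.choice(len(fingerprints), size=branching, replace=False)
        centroids = [fingerprints[i] for i in seeds]
        buckets = [[] for _ in range(branching)]
        for fp in fingerprints:
            buckets[int(np.argmin([hamming(fp, c) for c in centroids]))].append(fp)
        kept = [(c, b) for c, b in zip(centroids, buckets) if b]
        if len(kept) < 2:   # degenerate split -> keep this node as a leaf
            return
        self.centroids = [c for c, _ in kept]
        self.children = [FingerprintNode(b, leaf_size, branching, seed)
                         for _, b in kept]

    def search(self, query, threshold):
        """Descend into the most similar branch; at a leaf, declare a match
        if the closest fingerprint is within `threshold`, else a mismatch."""
        if not self.children:
            best = min(self.fingerprints, key=lambda fp: hamming(query, fp))
            return best if hamming(query, best) <= threshold else None
        i = int(np.argmin([hamming(query, c) for c in self.centroids]))
        return self.children[i].search(query, threshold)
```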
Abstract:
A method is presented for training the neural network of a neural-network-based speaker classifier for use in speaker change detection. The method comprises: a) preprocessing input speech data; b) extracting a plurality of feature frames from the preprocessed input speech data; c) normalizing the extracted feature frames of each speaker within the preprocessed input speech data with that speaker's mean and variance; d) concatenating the normalized feature frames to form overlapped longer frames having a frame length and a hop size; e) inputting the overlapped longer frames to the neural-network-based speaker classifier; and f) training the neural network through forward and backward propagation.
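A minimal sketch of steps (c) and (d), assuming the feature frames (for example, MFCCs) have already been extracted as NumPy arrays; the frame length and hop size shown are placeholder values, and the resulting overlapped longer frames would then be fed to the neural network classifier in steps (e) and (f):

```python
import numpy as np

def normalize_per_speaker(frames_by_speaker):
    """Step (c): z-normalize each speaker's feature frames (shape
    [n_frames, n_dims]) with that speaker's own mean and variance."""
    normalized = {}
    for speaker, frames in frames_by_speaker.items():
        mu = frames.mean(axis=0)
        sigma = frames.std(axis=0) + 1e-8   # guard against zero variance
        normalized[speaker] = (frames - mu) / sigma
    return normalized

def concatenate_frames(frames, frame_len=10, hop=3):
    """Step (d): stack `frame_len` consecutive feature frames into one
    overlapped longer frame, advancing by `hop` frames between windows.
    Assumes the input has at least `frame_len` frames."""
    longer = [frames[s:s + frame_len].reshape(-1)
              for s in range(0, len(frames) - frame_len + 1, hop)]
    return np.stack(longer)
```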
Abstract:
A system and method for learning alternate pronunciations for speech recognition are disclosed. Through pronunciation learning, alternative name pronunciations that are not covered in a general pronunciation dictionary may be learned. In an embodiment, the detection of phone-level and syllable-level mispronunciations in words and sentences may be based on acoustic models trained as Hidden Markov Models. Mispronunciations may be detected by comparing the likelihood of the potential state of the target pronunciation unit with a pre-determined threshold through a series of tests. It is also within the scope of an embodiment to detect accents.
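The abstract does not spell out the form of the likelihood tests; one common way to realize a per-unit likelihood comparison against a threshold is a goodness-of-pronunciation (GOP)-style score, sketched below under the assumption that per-unit log-likelihoods from HMM forced alignment are already available (the threshold value is a placeholder):

```python
def gop_score(target_loglik, competing_logliks, n_frames):
    """GOP-style score: per-frame log-likelihood of the expected unit minus
    that of the best competing unit under the HMM acoustic models; low
    scores suggest a mispronunciation."""
    return (target_loglik - max(competing_logliks)) / n_frames

def detect_mispronunciations(units, threshold=-2.0):
    """units: iterable of (label, target_loglik, competing_logliks, n_frames)
    tuples, e.g. obtained by forced alignment of the expected pronunciation.
    Returns the labels of units scoring below the pre-determined threshold."""
    return [label for label, tgt, comp, n in units
            if gop_score(tgt, comp, n) < threshold]
```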
Abstract:
A system and method are presented for the correction of packet loss in audio in automatic speech recognition (ASR) systems. Packet loss correction, as presented herein, occurs at the recognition stage without modifying any of the acoustic models generated during training. The behavior of the ASR engine in the absence of packet loss is thus not altered. To accomplish this, the actual input signal may be rectified, the recognition scores may be normalized to account for signal errors, and a best-estimate method using information from previous frames and acoustic models may be used to replace the noisy signal.
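As an illustration of the best-estimate idea at the feature and scoring level, the sketch below assumes per-frame features and acoustic scores are available together with a mask marking frames affected by packet loss; the last-good-frame substitution and the down-weighting factor are placeholder choices, not the specific method of the abstract:

```python
import numpy as np

def repair_lost_frames(features, lost_mask):
    """Replace feature frames flagged as lost with a best estimate derived
    from previous reliable frames (here simply the last good frame; a
    model-based prediction from the acoustic models could be substituted)."""
    repaired = features.copy()
    last_good = None
    for t in range(len(features)):
        if lost_mask[t] and last_good is not None:
            repaired[t] = last_good
        elif not lost_mask[t]:
            last_good = features[t]
    return repaired

def normalize_utterance_score(frame_scores, lost_mask, repaired_weight=0.5):
    """Down-weight the acoustic scores of repaired frames so that a burst of
    packet loss does not dominate the utterance-level recognition score
    (repaired_weight is a placeholder value)."""
    weights = np.where(lost_mask, repaired_weight, 1.0)
    return float(np.sum(frame_scores * weights) / np.sum(weights))
```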
Abstract:
Technologies for authenticating a speaker in a voice authentication system using voice biometrics include a speech collection computing device and a speech authentication computing device. The speech collection computing device is configured to collect a speech signal from a speaker and transmit the speech signal to the speech authentication computing device. The speech authentication computing device is configured to compute a speech signal feature vector for the received speech signal, retrieve a speech signal classifier associated with the speaker, and feed the speech signal feature vector to the retrieved speech signal classifier. Additionally, the speech authentication computing device is configured to determine whether the speaker is an authorized speaker based on an output of the retrieved speech signal classifier. Additional embodiments are described herein.
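A minimal sketch of the authentication flow, assuming speech-signal feature vectors have already been computed; the SVM classifier, the binary genuine/impostor labeling, and the acceptance threshold are placeholder choices standing in for whatever per-speaker classifier the system actually stores:

```python
import numpy as np
from sklearn.svm import SVC

classifiers = {}   # per-speaker classifier store, keyed by claimed identity

def enroll_speaker(speaker_id, feature_vectors, labels):
    """Train and store a classifier for one speaker. `labels` marks each
    enrollment vector as genuine (1) or impostor (0); an SVM stands in for
    the actual speech signal classifier."""
    clf = SVC(probability=True)
    clf.fit(feature_vectors, labels)
    classifiers[speaker_id] = clf

def authenticate(speaker_id, speech_feature_vector, accept_threshold=0.5):
    """Retrieve the claimed speaker's classifier, feed it the speech-signal
    feature vector, and accept or reject based on its output."""
    clf = classifiers.get(speaker_id)
    if clf is None:
        return False
    genuine_prob = clf.predict_proba(speech_feature_vector.reshape(1, -1))[0, 1]
    return bool(genuine_prob >= accept_threshold)
```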
Abstract:
A system and method are presented for neural-network-based feature extraction for acoustic model development. A neural network may be used to extract acoustic features from raw MFCCs or from the spectrum. Feature extraction may be performed by optimizing a cost function drawn from linear discriminant analysis; however, where classical linear discriminant analysis performs only linear operations on the MFCCs to generate lower-dimensional features, general non-linear functions generated by the neural network are used for the transformation. The extracted acoustic features may then be used for training acoustic models for speech recognition systems.
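A minimal sketch of the idea, assuming MFCC frames with frame-level class labels (for example, HMM state or phone labels) and using PyTorch; the network sizes, the exact form of the LDA-style scatter-ratio cost, and the training settings are placeholder choices:

```python
import torch
import torch.nn as nn

class FeatureNet(nn.Module):
    """Small non-linear network mapping MFCC frames to lower-dimensional
    acoustic features (layer sizes are placeholders)."""
    def __init__(self, in_dim=39, hidden=256, out_dim=13):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, hidden), nn.Tanh(),
                                 nn.Linear(hidden, out_dim))

    def forward(self, x):
        return self.net(x)

def lda_cost(features, labels):
    """LDA-style cost: within-class scatter divided by between-class scatter
    of the extracted features, so that minimizing it separates the classes."""
    overall_mean = features.mean(dim=0)
    within = features.new_zeros(())
    between = features.new_zeros(())
    for c in labels.unique():
        cls = features[labels == c]
        mu = cls.mean(dim=0)
        within = within + ((cls - mu) ** 2).sum()
        between = between + cls.shape[0] * ((mu - overall_mean) ** 2).sum()
    return within / (between + 1e-8)

def train(mfccs, states, epochs=10, lr=1e-3):
    """mfccs: [n_frames, n_dims] float tensor; states: [n_frames] int labels."""
    model = FeatureNet(in_dim=mfccs.shape[1])
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = lda_cost(model(mfccs), states)
        loss.backward()
        optimizer.step()
    return model
```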