System and method for optimization of audio fingerprint search

    Publication No.: US10303800B2

    Publication date: 2019-05-28

    Application No.: US14636474

    Filing date: 2015-03-03

    Abstract: A system and method are presented for optimization of audio fingerprint search. In an embodiment, the audio fingerprints are organized into a recursive tree with different branches containing fingerprint sets that are dissimilar to each other. The tree is constructed using a clustering algorithm based on a similarity measure. The similarity measure may comprise a Hamming distance for a binary fingerprint or a Euclidean distance for continuous valued fingerprints. In another embodiment, each fingerprint is stored at a plurality of resolutions and clustering is performed hierarchically. The recognition of an incoming fingerprint begins from the root of the tree and proceeds down its branches until a match or mismatch is declared. In yet another embodiment, a fingerprint definition is generalized to include more detailed audio information than in the previous definition.
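    The tree organization and root-to-leaf recognition described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the two-seed splitting rule, the `leaf_size` and `max_dist` parameters, and the names `Node`, `build`, and `search` are all assumptions made for illustration. Binary fingerprints are represented as Python integers, the list passed to `build` is assumed to be non-empty, and the greedy descent follows only the closest branch, so it may miss the true nearest fingerprint.

```python
from dataclasses import dataclass, field
from typing import List, Optional


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary fingerprints."""
    return bin(a ^ b).count("1")


@dataclass
class Node:
    center: int                                        # representative fingerprint
    items: List[int] = field(default_factory=list)     # filled only at leaves
    children: List["Node"] = field(default_factory=list)


def build(fps: List[int], leaf_size: int = 2, center: Optional[int] = None) -> Node:
    """Recursively split fingerprints into two dissimilar branches (assumed rule)."""
    center = fps[0] if center is None else center
    if len(fps) <= leaf_size:
        return Node(center, items=list(fps))
    a = center
    b = max(fps, key=lambda f: hamming(a, f))          # seed farthest from a
    left = [f for f in fps if hamming(f, a) <= hamming(f, b)]
    right = [f for f in fps if hamming(f, a) > hamming(f, b)]
    if not right:                                      # all fingerprints identical
        return Node(center, items=list(fps))
    return Node(center, children=[build(left, leaf_size, a),
                                  build(right, leaf_size, b)])


def search(root: Node, query: int, max_dist: int) -> Optional[int]:
    """Descend from the root toward the closest branch, then match or reject."""
    node = root
    while node.children:
        node = min(node.children, key=lambda c: hamming(query, c.center))
    best = min(node.items, key=lambda f: hamming(query, f))
    return best if hamming(query, best) <= max_dist else None
```

    Calling `search(build(fingerprints), query, max_dist)` returns the matched fingerprint, or `None` when a mismatch is declared. For continuous-valued fingerprints, `hamming` would be replaced by a Euclidean distance.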

    System and method for speaker change detection

    Publication No.: US10535000B2

    Publication date: 2020-01-14

    Application No.: US15727498

    Filing date: 2017-10-06

    Abstract: A method for training a neural network of a neural network based speaker classifier for use in speaker change detection. The method comprises: a) preprocessing input speech data; b) extracting a plurality of feature frames from the preprocessed input speech data; c) normalizing the extracted feature frames of each speaker within the preprocessed input speech data with each speaker's mean and variance; d) concatenating the normalized feature frames to form overlapped longer frames having a frame length and a hop size; e) inputting the overlapped longer frames to the neural network based speaker classifier; and f) training the neural network through forward-backward propagation.
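    Steps c) and d) of the abstract can be sketched as follows. This is an illustration under stated assumptions, not the claimed method: normalization is assumed to be a per-dimension z-score (subtract the speaker's mean, divide by the square root of the speaker's variance), and the function names `normalize_speaker` and `stack_frames` and the `eps` guard are hypothetical.

```python
from statistics import fmean, pvariance
from typing import List

Frame = List[float]


def normalize_speaker(frames: List[Frame], eps: float = 1e-8) -> List[Frame]:
    """Normalize each feature dimension over one speaker's frames (assumed z-score)."""
    dims = list(zip(*frames))                          # one tuple per feature dimension
    means = [fmean(d) for d in dims]
    stds = [(pvariance(d) + eps) ** 0.5 for d in dims] # eps avoids division by zero
    return [[(x - m) / s for x, m, s in zip(f, means, stds)] for f in frames]


def stack_frames(frames: List[Frame], frame_length: int, hop_size: int) -> List[Frame]:
    """Concatenate `frame_length` consecutive frames, advancing by `hop_size` frames."""
    stacked = []
    for start in range(0, len(frames) - frame_length + 1, hop_size):
        window = frames[start:start + frame_length]
        stacked.append([x for f in window for x in f])
    return stacked
```

    With a hop size smaller than the frame length, consecutive stacked frames overlap, which matches the "overlapped longer frames" wording of step d). The stacked frames would then be the input to the classifier in step e).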

    7. SYSTEM AND METHOD FOR OPTIMIZATION OF AUDIO FINGERPRINT SEARCH
    Patent application (under examination, published)

    Publication No.: US20150254338A1

    Publication date: 2015-09-10

    Application No.: US14636474

    Filing date: 2015-03-03

    CPC classification number: G06F17/30743 G10L25/51

    Abstract: A system and method are presented for optimization of audio fingerprint search. In an embodiment, the audio fingerprints are organized into a recursive tree with different branches containing fingerprint sets that are dissimilar to each other. The tree is constructed using a clustering algorithm based on a similarity measure. The similarity measure may comprise a Hamming distance for a binary fingerprint or a Euclidean distance for continuous valued fingerprints. In another embodiment, each fingerprint is stored at a plurality of resolutions and clustering is performed hierarchically. The recognition of an incoming fingerprint begins from the root of the tree and proceeds down its branches until a match or mismatch is declared. In yet another embodiment, a fingerprint definition is generalized to include more detailed audio information than in the previous definition.

    8. System and Method for Learning Alternate Pronunciations for Speech Recognition
    Patent application (granted)

    Publication No.: US20150106082A1

    Publication date: 2015-04-16

    Application No.: US14515607

    Filing date: 2014-10-16

    Abstract: A system and method for learning alternate pronunciations for speech recognition is disclosed. Alternative name pronunciations may be covered, through pronunciation learning, that have not been previously covered in a general pronunciation dictionary. In an embodiment, the detection of phone-level and syllable-level mispronunciations in words and sentences may be based on acoustic models trained by Hidden Markov Models. Mispronunciations may be detected by comparing the likelihood of the potential state of the targeting pronunciation unit with a pre-determined threshold through a series of tests. It is also within the scope of an embodiment to detect accents.
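    The threshold comparison mentioned in the abstract can be pictured with a very small sketch. Everything here is an assumption for illustration: the scores are taken to be per-unit log-likelihoods already produced by an acoustic model, a score below the threshold is taken to indicate a mispronunciation, and the name `flag_mispronunciations` is hypothetical. The abstract's "series of tests" is reduced to a single comparison.

```python
from typing import Dict, List


def flag_mispronunciations(scores: Dict[str, float], threshold: float) -> List[str]:
    """Return the pronunciation units whose score falls below the threshold."""
    # Assumption: lower log-likelihood means a poorer match to the target unit.
    return [unit for unit, log_likelihood in scores.items()
            if log_likelihood < threshold]
```

    For example, with hypothetical scores `{"k": -2.0, "ae": -9.5, "t": -3.1}` and a threshold of `-5.0`, only `"ae"` would be flagged.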
