System and Method to Correct for Packet Loss in ASR Systems
    Patent application. Status: under examination (published)

    Publication No.: US20150255075A1

    Publication date: 2015-09-10

    Application No.: US14638198

    Filing date: 2015-03-04

    Abstract: A system and method are presented for the correction of packet loss in audio in automatic speech recognition (ASR) systems. Packet loss correction, as presented herein, occurs at the recognition stage without modifying any of the acoustic models generated during training. The behavior of the ASR engine in the absence of packet loss is thus not altered. To accomplish this, the actual input signal may be rectified, the recognition scores may be normalized to account for signal errors, and a best-estimate method using information from previous frames and acoustic models may be used to replace the noisy signal.
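
The abstract mentions replacing the damaged signal with a best estimate drawn from previous frames. As a rough illustration only, the sketch below uses plain frame repetition, a much simpler stand-in than the estimator the application describes (which also draws on the acoustic models). The function name `conceal_lost_frames` and the `lost` flag list are hypothetical and do not come from the patent text.

```python
def conceal_lost_frames(frames, lost):
    """Replace each lost frame with the most recent intact frame.

    frames: list of feature vectors (lists of floats), one per frame.
    lost:   list of booleans, True where the frame was hit by packet loss.
    Assumption: simple repetition stands in for the patent's best-estimate step.
    """
    repaired = []
    last_good = None
    for frame, is_lost in zip(frames, lost):
        if is_lost and last_good is not None:
            # Reuse the last intact frame in place of the damaged one.
            repaired.append(list(last_good))
        else:
            # Intact frame, or a lost frame with no earlier frame to fall back on.
            repaired.append(list(frame))
            if not is_lost:
                last_good = frame
    return repaired
```

Because the repair happens on the input features at recognition time, the trained acoustic models stay untouched, which matches the constraint stated in the abstract.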

    System and method for speaker change detection

    Publication No.: US10535000B2

    Publication date: 2020-01-14

    Application No.: US15727498

    Filing date: 2017-10-06

    Abstract: A method for training a neural network of a neural network based speaker classifier for use in speaker change detection. The method comprises: a) preprocessing input speech data; b) extracting a plurality of feature frames from the preprocessed input speech data; c) normalizing the extracted feature frames of each speaker within the preprocessed input speech data with each speaker's mean and variance; d) concatenating the normalized feature frames to form overlapped longer frames having a frame length and a hop size; e) inputting the overlapped longer frames to the neural network based speaker classifier; and f) training the neural network through forward-backward propagation.
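
Steps (c) and (d) of this abstract can be sketched in a few lines. The version below is a minimal illustration using only the Python standard library. The function names, the dictionary layout of `frames_by_speaker`, and the zero-variance guard are assumptions made for the example, not details taken from the patent.

```python
from statistics import fmean, pstdev


def normalize_per_speaker(frames_by_speaker):
    """Step (c): normalize each feature dimension with the speaker's own mean and variance."""
    normalized = {}
    for speaker, frames in frames_by_speaker.items():
        dims = list(zip(*frames))  # one tuple of values per feature dimension
        means = [fmean(d) for d in dims]
        # Assumption: fall back to 1.0 when a dimension has zero variance.
        stds = [pstdev(d) or 1.0 for d in dims]
        normalized[speaker] = [
            [(x - m) / s for x, m, s in zip(frame, means, stds)]
            for frame in frames
        ]
    return normalized


def stack_frames(frames, frame_length, hop_size):
    """Step (d): concatenate frame_length consecutive frames, advancing by hop_size."""
    stacked = []
    for start in range(0, len(frames) - frame_length + 1, hop_size):
        window = frames[start:start + frame_length]
        # Flatten the window into one longer feature vector.
        stacked.append([x for frame in window for x in frame])
    return stacked
```

With a hop size smaller than the frame length, neighbouring stacked frames share some of their input frames, which is what makes them "overlapped" in the sense of the abstract.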

    Method and system for acoustic data selection for training the parameters of an acoustic model

    Publication No.: US10157610B2

    Publication date: 2018-12-18

    Application No.: US15850106

    Filing date: 2017-12-21

    Abstract: A system and method are presented for acoustic data selection of a particular quality for training the parameters of an acoustic model, such as a Hidden Markov Model and Gaussian Mixture Model, for example, in automatic speech recognition systems in the speech analytics field. A raw acoustic model may be trained using a given speech corpus and maximum likelihood criteria. A series of operations are performed, such as a forced Viterbi-alignment, calculations of likelihood scores, and phoneme recognition, for example, to form a subset corpus of training data. During the process, audio files of a quality that does not meet a criterion, such as poor quality audio files, may be automatically rejected from the corpus. The subset may then be used to train a new acoustic model.
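
The rejection step can be pictured as a filter over likelihood scores. The sketch below assumes each utterance has already been force-aligned and scored, and that the criterion is a threshold on the average log-likelihood per frame. The abstract does not state the actual criterion, so the threshold, the tuple layout, and the name `select_subset` are all hypothetical.

```python
def select_subset(scored_utterances, min_avg_loglik):
    """Split utterances into kept and rejected by average per-frame log-likelihood.

    scored_utterances: iterable of (utterance_id, total_log_likelihood, num_frames).
    min_avg_loglik:    assumed quality threshold (not specified in the abstract).
    """
    kept, rejected = [], []
    for utt_id, total_loglik, num_frames in scored_utterances:
        # Utterances with no frames cannot be scored and are rejected.
        avg = total_loglik / num_frames if num_frames else float("-inf")
        if avg >= min_avg_loglik:
            kept.append(utt_id)
        else:
            rejected.append(utt_id)
    return kept, rejected
```

The kept list plays the role of the subset corpus that would then be used to train the new acoustic model.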

    SYSTEM AND METHOD FOR OUTLIER IDENTIFICATION TO REMOVE POOR ALIGNMENTS IN SPEECH SYNTHESIS
    Patent application. Status: granted

    Publication No.: US20160365085A1

    Publication date: 2016-12-15

    Application No.: US14737080

    Filing date: 2015-06-11

    Abstract: A system and method are presented for outlier identification to remove poor alignments in speech synthesis. The quality of the output of a text-to-speech system directly depends on the accuracy of alignments of a speech utterance. The identification of mis-alignments and mis-pronunciations from automated alignments may be made based on fundamental frequency methods and group delay based outlier methods. The identification of these outliers allows for their removal, which improves the synthesis quality of the text-to-speech system.
