TEXT-DEPENDENT SPEAKER IDENTIFICATION
    1.
    Patent application
    Legal status: granted

    Publication number: US20150294670A1

    Publication date: 2015-10-15

    Application number: US14612830

    Filing date: 2015-02-03

    Applicant: Google Inc.

    CPC classification number: G10L17/18 G10L17/005

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speaker verification. The methods, systems, and apparatus include actions of inputting speech data that corresponds to a particular utterance to a first neural network and determining an evaluation vector based on output at a hidden layer of the first neural network. Additional actions include obtaining a reference vector that corresponds to a past utterance of a particular speaker. Further actions include inputting the evaluation vector and the reference vector to a second neural network that is trained on a set of labeled pairs of feature vectors to identify whether speakers associated with the labeled pairs of feature vectors are the same speaker. More actions include determining, based on an output of the second neural network, whether the particular utterance was likely spoken by the particular speaker.
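The flow in this abstract can be sketched as follows. This is a minimal illustration only, not the patented implementation: the layer sizes, the random placeholder weights, averaging hidden activations over frames, concatenating the two vectors, and the 0.5 decision threshold are all assumptions, and every function name is hypothetical.

```python
import math
import random


def relu(x):
    return max(0.0, x)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_matrix(rows, cols, rng):
    # Placeholder weights; a real system would load trained parameters.
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def dense(vec, weights, activation):
    # weights has len(vec) rows and one column per output unit.
    n_out = len(weights[0])
    return [
        activation(sum(v * weights[i][j] for i, v in enumerate(vec)))
        for j in range(n_out)
    ]


def evaluation_vector(frames, net1_hidden_weights):
    # First network: take the hidden-layer output for each speech frame,
    # then average over frames (the averaging step is an assumption).
    hidden = [dense(frame, net1_hidden_weights, relu) for frame in frames]
    return [sum(column) / len(hidden) for column in zip(*hidden)]


def same_speaker_score(evaluation, reference, net2_hidden, net2_out):
    # Second network: scores a pair of vectors. Concatenation is assumed here.
    pair = evaluation + reference
    return dense(dense(pair, net2_hidden, relu), net2_out, sigmoid)[0]


def likely_same_speaker(frames, reference, net1_hidden_weights,
                        net2_hidden, net2_out, threshold=0.5):
    evaluation = evaluation_vector(frames, net1_hidden_weights)
    score = same_speaker_score(evaluation, reference, net2_hidden, net2_out)
    return score >= threshold, score
```

With untrained weights the score carries no meaning; the sketch only shows how the evaluation vector, the stored reference vector, and the second network fit together.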


    Systems and methods for performing actions in response to user gestures in captured images

    Publication number: US09953216B2

    Publication date: 2018-04-24

    Application number: US14596168

    Filing date: 2015-01-13

    Applicant: Google Inc.

    Inventor: Raziel Alvarez

    CPC classification number: G06K9/00355 G06F3/017 G06K9/2081

    Abstract: Systems, methods, and computer-readable media are provided for performing actions in response to gestures made by a user in captured images. In accordance with one implementation, a computer-implemented system is provided that includes an image capture device that captures at least one image, a memory device that stores instructions, and at least one processor that executes the instructions stored in the memory device. In some implementations, the processor receives, from the image capture device, at least one image including a gesture made by a user and analyzes the at least one image to identify the gesture made by the user. In some implementations, the processor also determines, based on the identified gesture, one or more actions to perform on the at least one image.

    AUTOMATIC GAIN CONTROL FOR SPEECH RECOGNITION
    3.
    Patent application
    Legal status: granted

    Publication number: US20160099007A1

    Publication date: 2016-04-07

    Application number: US14727741

    Filing date: 2015-06-01

    Applicant: Google Inc.

    CPC classification number: G10L21/034 G10L25/78 H03G3/3005

    Abstract: This specification describes, among other things, a computer-implemented method. The method can include receiving a stream of audio data at a computing device. The stream of audio data can be segmented into a plurality of audio segments. Respective intensity levels are determined for each of the plurality of audio segments. For each of the plurality of audio segments and based on the respective intensity levels, a determination can be made as to whether the audio segment includes a speech signal. Selective gain control can be performed on the stream of audio data by automatically adjusting a gain of particular ones of the plurality of audio segments that are determined to include a speech signal.
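The steps in this abstract (segment the stream, measure each segment's intensity, decide whether it contains speech, adjust gain only on speech segments) can be sketched as below. This is an illustrative sketch under assumptions, not the method claimed in the application: RMS as the intensity measure, a fixed intensity threshold as the speech test, the target level, and all function and parameter names are hypothetical.

```python
import math


def split_segments(samples, segment_size):
    # Cut the sample stream into consecutive fixed-size segments.
    return [samples[i:i + segment_size]
            for i in range(0, len(samples), segment_size)]


def rms(segment):
    # Root-mean-square amplitude, used here as the intensity level (assumption).
    if not segment:
        return 0.0
    return math.sqrt(sum(s * s for s in segment) / len(segment))


def selective_gain(samples, segment_size=4, speech_threshold=0.05,
                   target_rms=0.5):
    output = []
    for segment in split_segments(samples, segment_size):
        level = rms(segment)
        if level >= speech_threshold:
            # Treated as speech: scale the segment toward the target level.
            gain = target_rms / level
            output.extend(s * gain for s in segment)
        else:
            # Treated as non-speech: pass through unchanged so that
            # background noise is not amplified.
            output.extend(segment)
    return output
```

For example, with the defaults above, a quiet segment of amplitude 0.01 passes through unchanged, while a segment of amplitude 0.1 is scaled by a gain of 5. A practical system would also smooth gain changes between segments and guard against clipping, which this sketch omits.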

