Unified recognition of speech and music
    112.
    Type: Granted invention patent
    Status: In force

    Publication number: US09224385B1

    Publication date: 2015-12-29

    Application number: US13919170

    Filing date: 2013-06-17

    Applicant: Google Inc.

    Abstract: Methods, systems, and computer programs are presented for unified recognition of speech and music. One method includes an operation for starting an audio recognition mode by a computing device while receiving an audio stream. Segments of the audio stream are analyzed as the audio stream is received, where the analysis includes simultaneous checking for speech and music. Further, the method includes an operation for determining a first confidence score for speech and a second confidence score for music. As the audio stream is received, additional segments are analyzed until the end of the audio stream or until the first and second confidence scores indicate that the audio stream has been identified as speech or music. Further, results are presented on a display based on the identification of the audio stream, including text entered if the audio stream was speech or song information if the audio stream was music.
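    Sketch (illustrative, not taken from the patent): the loop described in the abstract can be pictured as follows. Each incoming segment is passed to a speech scorer and a music scorer, and the loop stops as soon as either confidence score reaches a threshold or the stream ends. The names `recognize_stream`, `Recognition`, the toy scorers, and the 0.8 threshold are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable, List


@dataclass
class Recognition:
    kind: str            # "speech", "music", or "unknown"
    speech_score: float  # first confidence score
    music_score: float   # second confidence score
    segments_used: int


def recognize_stream(
    segments: Iterable[Any],
    score_speech: Callable[[List[Any]], float],
    score_music: Callable[[List[Any]], float],
    threshold: float = 0.8,  # hypothetical decision threshold
) -> Recognition:
    """Analyze segments as they arrive, checking for speech and music together."""
    seen: List[Any] = []
    speech = music = 0.0
    for segment in segments:
        seen.append(segment)
        # Both scorers see every segment received so far.
        speech = score_speech(seen)
        music = score_music(seen)
        if max(speech, music) >= threshold:
            break  # the stream has been identified; stop early
    if max(speech, music) < threshold:
        kind = "unknown"  # stream ended without a confident decision
    elif speech >= music:
        kind = "speech"
    else:
        kind = "music"
    return Recognition(kind, speech, music, len(seen))


# Toy scorers (assumptions): confidence grows with the number of "s" or "m" segments.
def toy_speech(seen: List[Any]) -> float:
    return min(1.0, seen.count("s") / 4)


def toy_music(seen: List[Any]) -> float:
    return min(1.0, seen.count("m") / 4)


result = recognize_stream("ssssss", toy_speech, toy_music)
```

    With these toy scorers the loop stops after four segments, before the six-segment stream ends. A real system would replace the scorers with a speech recognizer and a music fingerprint matcher.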


    IDF weighting of LSH bands for live reference ingestion
    113.
    Type: Granted invention patent
    Status: In force

    Publication number: US09208154B1

    Publication date: 2015-12-08

    Application number: US14458387

    Filing date: 2014-08-13

    Applicant: Google Inc.

    Abstract: Down scoring overcrowded bands via IDF weighting scores provides a soft way to reduce the effect of common bands from Locality Sensitive Hashing (LSH) processes. An index component indexes live video references of a live streaming infrastructure pathway process in a reference index. A scoring component scores a set of bands with a set of inverse document frequency (IDF) weighting scores in the reference index. A high score is generated for bands that are featured in a small number of references and a low score is generated for bands featured in a high number of references.
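    Sketch (illustrative, not taken from the patent): one way to picture IDF weighting of LSH bands. The abstract does not give a formula, so this example assumes the textbook form log(N / df), where N is the number of references and df is the number of references containing a band. The function names and the sample data are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Hashable, Set


def band_idf_weights(references: Dict[str, Set[Hashable]]) -> Dict[Hashable, float]:
    """Weight each band by log(N / df); an assumed, textbook IDF formula."""
    n = len(references)
    doc_freq = Counter(band for bands in references.values() for band in set(bands))
    return {band: math.log(n / count) for band, count in doc_freq.items()}


def score_references(
    query_bands: Set[Hashable],
    references: Dict[str, Set[Hashable]],
    weights: Dict[Hashable, float],
) -> Dict[str, float]:
    """Sum the IDF weights of the bands each reference shares with the query."""
    return {
        ref_id: sum(weights[band] for band in bands & query_bands)
        for ref_id, bands in references.items()
    }


# Hypothetical references: "common" is an overcrowded band present everywhere.
references = {
    "stream_a": {"x", "common"},
    "stream_b": {"y", "common"},
    "stream_c": {"z", "common"},
}
weights = band_idf_weights(references)
scores = score_references({"x", "common"}, references, weights)
```

    Because "common" appears in every reference, its weight is zero and it no longer produces matches on its own. The rare band "x" carries the whole score for "stream_a".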


    Classifying music by genre using discrete cosine transforms
    114.
    Type: Granted invention patent
    Status: In force

    Publication number: US09055376B1

    Publication date: 2015-06-09

    Application number: US13791131

    Filing date: 2013-03-08

    Applicant: Google Inc.

    CPC classification number: H04R3/00 G06F17/30743 H04R2430/03

    Abstract: Systems and methods are provided herein relating to audio classification. Genres of music can be identified by detecting unique spectral features inherent to those genres. One example genre detected is techno music. Two dimensional discrete cosine transforms can be generated for consecutive windows of the spectrogram or chromagram. A max value of the energy of portions of the two dimensional discrete cosine transforms can be determined. The max value can be normalized and aggregated with max values related to neighboring windows. If the aggregate scores meet a genre threshold, the audio sample, or portions thereof, can be associated with a genre of music.
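    Sketch (illustrative, not taken from the patent): a small pure-Python rendering of the steps in the abstract. It computes a 2-D DCT-II per window, takes the maximum coefficient energy, normalizes it, averages it with neighboring windows, and compares the result with a threshold. The normalization (max non-DC energy divided by total non-DC energy), the window width, the neighbor count, and the 0.5 threshold are assumptions chosen for this example.

```python
import math
from typing import List

Window = List[List[float]]  # time-major: window[t][f]


def dct2(window: Window) -> Window:
    """Unnormalized 2-D DCT-II, written out directly for clarity (not speed)."""
    rows, cols = len(window), len(window[0])
    out = [[0.0] * cols for _ in range(rows)]
    for u in range(rows):
        for v in range(cols):
            total = 0.0
            for i in range(rows):
                ci = math.cos(math.pi * (i + 0.5) * u / rows)
                for j in range(cols):
                    cj = math.cos(math.pi * (j + 0.5) * v / cols)
                    total += window[i][j] * ci * cj
            out[u][v] = total
    return out


def window_score(window: Window, eps: float = 1e-12) -> float:
    """Max non-DC energy over total non-DC energy (an assumed normalization)."""
    coeffs = dct2(window)
    energies = [
        c * c
        for u, row in enumerate(coeffs)
        for v, c in enumerate(row)
        if (u, v) != (0, 0)  # skip the DC term, which only encodes loudness
    ]
    total = sum(energies)
    return max(energies) / total if total > eps else 0.0


def genre_flags(spectrogram: Window, width: int, threshold: float = 0.5,
                neighbors: int = 1) -> List[bool]:
    """Score consecutive windows, average with neighbors, apply the threshold."""
    scores = [
        window_score(spectrogram[start:start + width])
        for start in range(0, len(spectrogram) - width + 1, width)
    ]
    flags = []
    for k in range(len(scores)):
        lo, hi = max(0, k - neighbors), min(len(scores), k + neighbors + 1)
        flags.append(sum(scores[lo:hi]) / (hi - lo) >= threshold)
    return flags


# Synthetic data: a strictly periodic "beat" versus a flat, featureless signal.
beat = [[1.0 + math.cos(math.pi * (t + 0.5) * 2 / 8)] * 4 for t in range(8)]
flat = [[1.0] * 4 for _ in range(24)]
beat_flags = genre_flags(beat * 3, width=8)
flat_flags = genre_flags(flat, width=8)
```

    The periodic signal concentrates nearly all of its non-DC energy in one DCT coefficient, so every window is flagged. The flat signal has no such peak and is not flagged.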


    SPEAKER IDENTIFICATION
    115.
    Type: Invention patent application
    Status: In force

    Publication number: US20150127342A1

    Publication date: 2015-05-07

    Application number: US14523198

    Filing date: 2014-10-24

    Applicant: Google Inc.

    CPC classification number: G10L17/02 G10L17/005 G10L17/08 G10L17/18 G10L25/51

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing speaker identification. In some implementations, an utterance vector that is derived from an utterance is obtained. Hash values are determined for the utterance vector according to multiple different hash functions. A set of speaker vectors from a plurality of hash tables is determined using the hash values, where each speaker vector was derived from one or more utterances of a respective speaker. The speaker vectors in the set are compared with the utterance vector. A speaker vector is selected based on comparing the speaker vectors in the set with the utterance vector.
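    Sketch (illustrative, not taken from the patent): the lookup described in the abstract, using random-hyperplane locality sensitive hashing as a stand-in for the unspecified hash functions. The class `SpeakerIndex`, the cosine comparison, and the table and bit counts are assumptions made for this example.

```python
import math
import random
from collections import defaultdict
from typing import Callable, Dict, List, Optional, Sequence, Set, Tuple

Vector = Sequence[float]


def make_hash_function(dim: int, bits: int, rng: random.Random) -> Callable[[Vector], Tuple[bool, ...]]:
    """Random-hyperplane hash: one sign bit per hyperplane (an assumed hash family)."""
    planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits)]

    def hash_vector(vec: Vector) -> Tuple[bool, ...]:
        return tuple(sum(p * x for p, x in zip(plane, vec)) >= 0 for plane in planes)

    return hash_vector


def cosine(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class SpeakerIndex:
    def __init__(self, dim: int, tables: int = 4, bits: int = 3, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.hashers = [make_hash_function(dim, bits, rng) for _ in range(tables)]
        self.tables: List[Dict[Tuple[bool, ...], Set[str]]] = [defaultdict(set) for _ in range(tables)]
        self.vectors: Dict[str, Vector] = {}

    def add(self, speaker: str, vec: Vector) -> None:
        """Store a speaker vector in every hash table."""
        self.vectors[speaker] = vec
        for hasher, table in zip(self.hashers, self.tables):
            table[hasher(vec)].add(speaker)

    def identify(self, utterance: Vector) -> Optional[str]:
        """Collect candidates from matching buckets, then pick the closest one."""
        candidates: Set[str] = set()
        for hasher, table in zip(self.hashers, self.tables):
            candidates |= table.get(hasher(utterance), set())
        if not candidates:
            return None
        return max(candidates, key=lambda s: cosine(self.vectors[s], utterance))


index = SpeakerIndex(dim=3)
index.add("alice", [1.0, 0.0, 0.0])
index.add("bob", [0.0, 1.0, 0.0])
match = index.identify([2.0, 0.0, 0.0])  # same direction as alice's vector
```

    Only the speakers that share a bucket with the utterance are compared directly, which is what keeps the final comparison step small when the number of enrolled speakers is large.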


    Query response using media consumption history
    116.
    Type: Granted invention patent
    Status: In force

    Publication number: US09002835B2

    Publication date: 2015-04-07

    Application number: US14047708

    Filing date: 2013-10-07

    Applicant: Google Inc.

    Inventor: Matthew Sharifi

    Abstract: Methods, systems, and apparatus for receiving a natural language query of a user, and environmental data, identifying a media item based on the environmental data, determining an entity type based on the natural language query, selecting an entity associated with the media item that matches the entity type, selecting, from a media consumption database that identifies media items that have been indicated as consumed by the user, one or more media items that have been indicated as consumed by the user and that are associated with the selected entity, and providing a response to the query based on selecting the one or more media items that have been indicated as consumed by the user and that are associated with the selected entity.
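    Sketch (illustrative, not taken from the patent): the sequence of selections in the abstract, reduced to dictionary lookups. The catalog layout, the keyword-based `entity_type_from_query`, the `identify_media` callback, and all titles and names are hypothetical placeholders.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Catalog = Dict[str, Dict[str, List[str]]]  # title -> entity type -> entity names


def entity_type_from_query(query: str) -> Optional[str]:
    """Very rough keyword matching; a real system would parse the query properly."""
    q = query.lower()
    if "director" in q or "directed" in q:
        return "director"
    if "actor" in q or "actress" in q:
        return "actor"
    return None


def respond(
    query: str,
    environmental_data: Any,
    identify_media: Callable[[Any], Optional[str]],
    catalog: Catalog,
    consumed: List[str],
) -> Optional[Tuple[str, List[str]]]:
    # 1. Identify the media item from the environmental data (e.g. ambient audio).
    item = identify_media(environmental_data)
    # 2. Determine the entity type the query is asking about.
    entity_type = entity_type_from_query(query)
    if item not in catalog or entity_type is None:
        return None
    # 3. Select an entity of that type associated with the identified item.
    candidates = catalog[item].get(entity_type, [])
    if not candidates:
        return None
    entity = candidates[0]  # assumption: take the first listed entity
    # 4. Select consumed items that are associated with the same entity.
    matches = [
        title for title in consumed
        if entity in catalog.get(title, {}).get(entity_type, [])
    ]
    return entity, matches


catalog = {
    "Movie A": {"actor": ["Actor X"], "director": ["Director Y"]},
    "Movie B": {"actor": ["Actor X"]},
    "Movie C": {"actor": ["Actor Z"]},
}
answer = respond(
    "Where have I seen this actor before?",
    b"ambient-audio",
    lambda env: "Movie A",  # stand-in for an audio fingerprint matcher
    catalog,
    consumed=["Movie B", "Movie C"],
)
```

    Here the response is built from "Movie B", the only previously consumed item that shares an actor with the item playing in the environment.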


    IDF weighting of LSH bands for live reference ingestion
    118.
    Type: Granted invention patent
    Status: In force

    Publication number: US08838609B1

    Publication date: 2014-09-16

    Application number: US13648511

    Filing date: 2012-10-10

    Applicant: Google Inc.

    Abstract: Down scoring overcrowded bands via IDF weighting scores provides a soft way to reduce the effect of common bands from Locality Sensitive Hashing (LSH) processes. An index component indexes live video references of a live streaming infrastructure pathway process in a reference index. A scoring component scores a set of bands with a set of inverse document frequency (IDF) weighting scores in the reference index. A high score is generated for bands that are featured in a small number of references and a low score is generated for bands featured in a high number of references.


    ANSWERING QUESTIONS USING ENVIRONMENTAL CONTEXT
    119.
    Type: Invention patent application
    Status: Under examination (published)

    Publication number: US20140074466A1

    Publication date: 2014-03-13

    Application number: US13626439

    Filing date: 2012-09-25

    Applicant: GOOGLE INC.

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for receiving audio data encoding an utterance and environmental data, obtaining a transcription of the utterance, identifying an entity using the environmental data, submitting a query to a natural language query processing engine, wherein the query includes at least a portion of the transcription and data that identifies the entity, and obtaining one or more results of the query.


    Selectively obscuring private information based on contextual information

    Publication number: US10311249B2

    Publication date: 2019-06-04

    Application number: US15476392

    Filing date: 2017-03-31

    Applicant: Google Inc.

    Abstract: A method includes determining, based at least in part on a type of information to be displayed at a display device associated with a computing device, a privacy level for the information to be displayed; and determining whether the privacy level satisfies a threshold privacy level. The method also includes, responsive to determining that the privacy level satisfies the threshold privacy level, determining whether an individual not associated with a currently active user account of the computing device is proximate to the display device. The method also includes determining an estimated speed of the individual not associated with the currently active user account relative to the display device. The method further includes determining whether the estimated speed satisfies a threshold speed, and responsive to determining that the estimated speed satisfies the threshold speed, outputting the information such that at least a first portion of the information is obscured.
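    Sketch (illustrative, not taken from the patent): the chain of checks in the abstract as plain functions. The abstract does not say which direction "satisfies a threshold speed" runs, so this example assumes that a slow or stationary bystander (speed at or below the threshold) triggers obscuring. The function names, units, and the masking scheme are hypothetical.

```python
def estimate_speed(distance_then_m: float, distance_now_m: float, elapsed_s: float) -> float:
    """Speed relative to the display from two distance readings (assumed inputs)."""
    return abs(distance_now_m - distance_then_m) / elapsed_s


def should_obscure(
    privacy_level: int,
    privacy_threshold: int,
    bystander_present: bool,
    bystander_speed: float,
    speed_threshold: float,
) -> bool:
    if privacy_level < privacy_threshold:
        return False  # information is not sensitive enough to protect
    if not bystander_present:
        return False  # nobody other than the active user is near the display
    # Assumption: a slow-moving bystander has time to read the screen.
    return bystander_speed <= speed_threshold


def mask(text: str, keep_last: int = 4) -> str:
    """Obscure a leading portion of the text, keeping at most the last few characters."""
    hidden = len(text) - keep_last if len(text) > keep_last else len(text)
    return "*" * hidden + text[hidden:]


speed = estimate_speed(3.0, 2.5, 1.0)  # bystander moved 0.5 m in one second
card = "4111111111111111"
shown = mask(card) if should_obscure(3, 2, True, speed, 1.0) else card
```

    With these values the bystander is slow enough to count, so only the last four characters of the string remain visible.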
