-
Publication No.: US20160322066A1
Publication date: 2016-11-03
Application No.: US13932198
Filing date: 2013-07-01
Applicant: Google Inc.
Inventor: Matthew Sharifi , Dominik Roblek
IPC: G10L25/81
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for analyzing an audio sample to determine whether the audio sample includes music audio data. One or more detectors, including a spectral fluctuation detector, a peak repetition detector, and a beat pitch detector, may analyze the audio sample and generate a score that represents whether the audio sample includes music audio data. One or more of the scores may be combined to determine whether the audio sample includes music audio data or non-music audio data.
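The abstract above says that per-detector scores may be combined into a music / non-music decision, but it does not say how. The following is a minimal sketch that assumes each detector emits a score in [0, 1] and that the combination is a weighted average compared against a threshold. The detector keys, the weights, and the 0.5 threshold are hypothetical and are not taken from the patent.

```python
from typing import Mapping

# Hypothetical detector names and equal weights; the abstract does not
# specify how the individual scores are weighted or combined.
DEFAULT_WEIGHTS = {
    "spectral_fluctuation": 1.0,
    "peak_repetition": 1.0,
    "beat_pitch": 1.0,
}


def combine_scores(scores: Mapping[str, float],
                   weights: Mapping[str, float] = DEFAULT_WEIGHTS) -> float:
    """Weighted average of whichever known detector scores are supplied.

    Assumes every score lies in [0, 1] and every weight is positive.
    """
    used = [name for name in scores if name in weights]
    if not used:
        raise ValueError("no recognised detector scores supplied")
    total_weight = sum(weights[name] for name in used)
    return sum(weights[name] * scores[name] for name in used) / total_weight


def is_music(scores: Mapping[str, float],
             threshold: float = 0.5,
             weights: Mapping[str, float] = DEFAULT_WEIGHTS) -> bool:
    """Classify the sample as music when the combined score reaches the threshold."""
    return combine_scores(scores, weights) >= threshold
```

Because the average is taken only over the detectors that reported a score, the sketch also covers the "one or more of the scores" case, where some detectors are not run.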
-
Publication No.: US09477709B2
Publication date: 2016-10-25
Application No.: US14217940
Filing date: 2014-03-18
Applicant: Google Inc.
Inventor: Matthew Sharifi
CPC classification number: G06F17/30041 , G06F17/30026 , G06F17/30029 , G06F17/30035 , G06F17/30044 , G06F17/30401 , G06F17/30424 , G06F17/30477 , G06F17/3053 , G06F17/30746 , G06F17/30787 , G06F17/30867 , G06F17/30876 , G06Q30/02 , G06Q30/0631 , G10L25/54
Abstract: Methods, systems, and apparatus for receiving a natural language query of a user, and environmental data, identifying a media item based on the environmental data, determining an entity type based on the natural language query, selecting an entity associated with the media item that matches the entity type, selecting, from a media consumption database that identifies media items that have been indicated as consumed by the user, one or more media items that have been indicated as consumed by the user and that are associated with the selected entity, and providing a response to the query based on selecting the one or more media items that have been indicated as consumed by the user and that are associated with the selected entity.
-
Publication No.: US09435878B1
Publication date: 2016-09-06
Application No.: US14488858
Filing date: 2014-09-17
Applicant: Google Inc.
Inventor: Matthew Sharifi
CPC classification number: G01S5/24 , G06F17/30743 , G10L25/48
Abstract: Systems and methods for determining location based on audio fingerprinting are disclosed. An extraction component extracts a set of interest points from an audio signal associated with an audio announcement. Then a matching component determines if the extracted set of interest points matches a set of interest points representative of an audio fingerprint in a data store comprising audio fingerprints. In an aspect, the audio fingerprints in the audio fingerprint data store represent announcements for underground transportation systems. A location component further determines location information associated with the audio fingerprint based in part on the set of extracted interest points matching the set of audio interest points representative of the audio fingerprint in the data store.
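The matching and location steps in the abstract above can be illustrated with a small sketch. It assumes that an interest point is a (time bin, frequency bin) pair and that two sets "match" when their Jaccard similarity reaches a threshold. Both choices, along with the names `StoredFingerprint`, `locate`, and `min_similarity`, are hypothetical. The abstract does not describe the actual matching criterion, and this sketch ignores time offsets between the recording and the stored announcement.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional, Tuple

# Hypothetical representation: (time_bin, frequency_bin).
InterestPoint = Tuple[int, int]


@dataclass(frozen=True)
class StoredFingerprint:
    """An announcement fingerprint together with the location it implies."""
    location: str
    points: FrozenSet[InterestPoint]


def jaccard(a: FrozenSet[InterestPoint], b: FrozenSet[InterestPoint]) -> float:
    """Size of the intersection divided by size of the union (0.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def locate(extracted: Iterable[InterestPoint],
           store: Iterable[StoredFingerprint],
           min_similarity: float = 0.6) -> Optional[str]:
    """Return the location of the best-matching fingerprint, or None if no match."""
    query = frozenset(extracted)
    best = max(store, key=lambda fp: jaccard(query, fp.points), default=None)
    if best is None or jaccard(query, best.points) < min_similarity:
        return None
    return best.location
```

A production matcher would need to tolerate noise and time shifts. The set-overlap measure here only shows the flow from extracted interest points, to a stored fingerprint, to location information.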
-
Publication No.: US20160117072A1
Publication date: 2016-04-28
Application No.: US14522927
Filing date: 2014-10-24
Applicant: GOOGLE INC.
Inventor: Matthew Sharifi , David Petrou
IPC: G06F3/0486 , G06F3/0484 , G06F3/0488
CPC classification number: G06F3/0486 , G06F3/04842 , G06F3/0488 , G06F3/04883 , G06F3/167 , G06F9/543 , G06F2203/04803
Abstract: Implementations provide an improved drag-and-drop operation on a mobile device. For example, a method includes identifying a drag area in a user interface of a first mobile application in response to a drag command, identifying an entity from a data store based on recognition performed on content in the drag area, receiving a drop location associated with a second mobile application, determining an action to perform in the second mobile application based on the drop location, and performing the action in the second mobile application using the entity. Another method may include receiving a selection of a smart copy control for a text input control in a first mobile application, receiving a selected area of a display generated by a second mobile application, identifying an entity in the selected area, automatically navigating back to the text input control, and pasting a description of the entity in the text input control.
-
Publication No.: US20160104480A1
Publication date: 2016-04-14
Application No.: US14675932
Filing date: 2015-04-01
Applicant: Google Inc.
Inventor: Matthew Sharifi
CPC classification number: G10L15/285 , G06F3/167 , G10L15/01 , G10L15/08 , G10L15/22 , G10L15/32 , G10L17/22 , G10L2015/088 , G10L2015/223
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for hotword detection on multiple devices are disclosed. In one aspect, a method includes the actions of receiving, by a first computing device, audio data that corresponds to an utterance. The actions further include determining a first value corresponding to a likelihood that the utterance includes a hotword. The actions further include receiving a second value corresponding to a likelihood that the utterance includes the hotword, the second value being determined by a second computing device. The actions further include comparing the first value and the second value. The actions further include, based on comparing the first value to the second value, initiating speech recognition processing on the audio data.
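The comparison step in the abstract above can be sketched as follows. The abstract only says that speech recognition is initiated "based on comparing" the two values. This sketch assumes the device proceeds only when its own score is strictly the highest and also clears a minimum hotword threshold. The function name, the threshold, and the tie-breaking rule are hypothetical.

```python
from typing import Iterable


def should_start_recognition(local_score: float,
                             remote_scores: Iterable[float],
                             hotword_threshold: float = 0.5) -> bool:
    """Decide whether this device should begin speech recognition.

    local_score   -- this device's hotword likelihood (the "first value")
    remote_scores -- likelihoods reported by other devices (the "second value"s)

    Assumption: a strict comparison is used, so devices that tie exactly
    all stay silent; a real system would need a tie-breaker.
    """
    if local_score < hotword_threshold:
        return False
    return all(local_score > remote for remote in remote_scores)
```

With this rule, when several nearby devices hear the same utterance, at most one of them responds, which appears to be the purpose of exchanging the scores.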
-
Publication No.: US09263042B1
Publication date: 2016-02-16
Application No.: US14340833
Filing date: 2014-07-25
Applicant: Google Inc.
Inventor: Matthew Sharifi
CPC classification number: G10L15/22 , G06F3/04842 , G06F3/167 , G10L15/063 , G10L15/08 , G10L15/18 , G10L15/265 , G10L15/30 , G10L2015/0631 , G10L2015/0638 , G10L2015/088 , G10L2015/223
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for obtaining, for each of multiple words or sub-words, audio data corresponding to multiple users speaking the word or sub-word; training, for each of the multiple words or sub-words, a pre-computed hotword model for the word or sub-word based on the audio data for the word or sub-word; receiving a candidate hotword from a computing device; identifying one or more pre-computed hotword models that correspond to the candidate hotword; and providing the identified, pre-computed hotword models to the computing device.
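The "identify pre-computed models for a candidate hotword" step in the abstract above can be pictured as a lookup keyed by word. The sketch below assumes the candidate is split on whitespace and that each word has one opaque, already-trained model. The function name and the use of `bytes` as a stand-in for a model are hypothetical, and sub-word units are not handled.

```python
from typing import Dict, List


def identify_models(candidate_hotword: str,
                    precomputed: Dict[str, bytes]) -> List[bytes]:
    """Return the pre-computed per-word models for a candidate hotword, in order.

    `precomputed` maps a lower-case word to an opaque model blob that is
    assumed to have been trained earlier on audio from multiple users.
    """
    words = candidate_hotword.lower().split()
    missing = [word for word in words if word not in precomputed]
    if missing:
        raise KeyError(f"no pre-computed model for: {', '.join(missing)}")
    return [precomputed[word] for word in words]
```

The design point suggested by the abstract is that training happens ahead of time on the server, so a new hotword needs only this lookup rather than fresh training data.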
-
Publication No.: US09240183B2
Publication date: 2016-01-19
Application No.: US14181374
Filing date: 2014-02-14
Applicant: Google Inc.
Inventor: Matthew Sharifi , Dominik Roblek
IPC: G10L15/20 , G10L15/00 , G10L15/06 , G10L15/22 , G10L15/26 , G10L21/0208 , G10L13/00 , G10L15/02 , G10L25/24
CPC classification number: G10L15/20 , G10L13/00 , G10L15/02 , G10L15/222 , G10L15/26 , G10L25/24 , G10L2021/02087
Abstract: The technology described herein can be embodied in a method that includes receiving a first signal representing an output of a speaker device, and a second signal comprising the output of the speaker device and an audio signal corresponding to an utterance of a speaker. The method includes aligning one or more segments of the first signal with one or more segments of the second signal. Acoustic features of the one or more segments of the first and second signals are classified to obtain a first set of vectors and a second set of vectors, respectively, the vectors being associated with speech units. The second set is modified using the first set, such that the modified second set represents a suppression of the output of the speaker device in the second signal. A transcription of the utterance of the speaker can be generated from the modified second set of vectors.
-
Publication No.: US09213703B1
Publication date: 2015-12-15
Application No.: US13670453
Filing date: 2012-11-06
Applicant: Google Inc.
Inventor: Gheorghe Postelnicu , Matthew Sharifi
CPC classification number: G06F17/3002 , G06F17/30743 , G06F17/30761 , G10L25/18 , G10L25/51 , G10L25/54 , G10L2025/906
Abstract: Systems and methods are provided herein relating to audio matching. Descriptors can be generated based on anchor points and interest points that characterize the local neighborhood surrounding the anchor point. Characterizing the local spectrogram neighborhood surrounding anchor points can be more robust to pitch shift distortions and time stretch distortions. Those anchor points surrounded by a lack of spectral activity or even spectral activity can be filtered from further examination. Using these pitch shift and time stretch resistant audio features within descriptors can provide for more accurate and efficient audio matching.
-
Publication No.: US20150235651A1
Publication date: 2015-08-20
Application No.: US14181374
Filing date: 2014-02-14
Applicant: Google Inc.
Inventor: Matthew Sharifi , Dominik Roblek
IPC: G10L21/0208 , G10L15/08 , G10L15/26 , G10L15/20
CPC classification number: G10L15/20 , G10L13/00 , G10L15/02 , G10L15/222 , G10L15/26 , G10L25/24 , G10L2021/02087
Abstract: The technology described herein can be embodied in a method that includes receiving a first signal representing an output of a speaker device, and a second signal comprising the output of the speaker device and an audio signal corresponding to an utterance of a speaker. The method includes aligning one or more segments of the first signal with one or more segments of the second signal. Acoustic features of the one or more segments of the first and second signals are classified to obtain a first set of vectors and a second set of vectors, respectively, the vectors being associated with speech units. The second set is modified using the first set, such that the modified second set represents a suppression of the output of the speaker device in the second signal. A transcription of the utterance of the speaker can be generated from the modified second set of vectors.
-
Publication No.: US20150161990A1
Publication date: 2015-06-11
Application No.: US14221520
Filing date: 2014-03-21
Applicant: Google Inc.
Inventor: Matthew Sharifi
CPC classification number: G10L15/22 , G10L15/02 , G10L15/063 , G10L15/08 , G10L15/265 , G10L15/285 , G10L17/22 , G10L2015/088 , G10L2015/223
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for designating certain voice commands as hotwords. The methods, systems, and apparatus include actions of receiving a hotword followed by a voice command. Additional actions include determining that the voice command satisfies one or more predetermined criteria associated with designating the voice command as a hotword, where a voice command that is designated as a hotword is treated as a voice input regardless of whether the voice command is preceded by another hotword. Further actions include, in response to determining that the voice command satisfies one or more predetermined criteria associated with designating the voice command as a hotword, designating the voice command as a hotword.
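The designation flow in the abstract above can be sketched with a small registry. The abstract leaves the "predetermined criteria" unspecified. This sketch assumes a single criterion: the command has been issued after the base hotword at least `min_uses` times. The class name, that criterion, and the text-based matching (a real system would operate on audio) are all hypothetical.

```python
from collections import Counter
from typing import Optional


class HotwordRegistry:
    """Tracks voice commands and promotes frequent ones to hotwords."""

    def __init__(self, base_hotword: str, min_uses: int = 3) -> None:
        self.base_hotword = base_hotword.lower()
        self.min_uses = min_uses          # hypothetical designation criterion
        self.designated = set()           # commands now treated as hotwords
        self._uses = Counter()            # times each command followed the hotword

    def handle(self, utterance: str) -> Optional[str]:
        """Return the command to act on, or None if the utterance is ignored."""
        text = " ".join(utterance.lower().split())
        # A designated command is accepted without a preceding hotword.
        if text in self.designated:
            return text
        prefix = self.base_hotword + " "
        if not text.startswith(prefix):
            return None
        command = text[len(prefix):]
        self._uses[command] += 1
        if self._uses[command] >= self.min_uses:
            self.designated.add(command)
        return command
```

For example, with `min_uses=2`, saying "ok computer call mom" twice would designate "call mom", after which "call mom" alone is treated as voice input.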