Abstract:
Methods, systems, and apparatus for receiving a natural language query of a user, and environmental data, identifying a media item based on the environmental data, determining an entity type based on the natural language query, selecting an entity associated with the media item that matches the entity type, selecting, from a media consumption database that identifies media items that have been indicated as consumed by the user, one or more media items that have been indicated as consumed by the user and that are associated with the selected entity, and providing a response to the query based on selecting the one or more media items that have been indicated as consumed by the user and that are associated with the selected entity.
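The flow described above can be illustrated with a minimal sketch. It assumes the media item has already been identified from the environmental data, and it uses a hypothetical in-memory catalog, a hypothetical keyword table for entity types, and a plain list as the media consumption database; none of these names come from the abstract.

```python
# Hypothetical catalog: media title -> {entity type: [entity names]}.
CATALOG = {
    "Movie A": {"actor": ["Pat Doe"], "director": ["Lee Roe"]},
    "Movie B": {"actor": ["Pat Doe"], "director": ["Sam Poe"]},
    "Movie C": {"actor": ["Kim Moe"], "director": ["Lee Roe"]},
}

# Hypothetical keyword table standing in for natural language understanding.
ENTITY_KEYWORDS = {
    "actor": ("actor", "actress"),
    "director": ("director", "directed"),
}


def entity_type_from_query(query):
    """Return the first entity type whose keyword appears in the query."""
    text = query.lower()
    for entity_type, keywords in ENTITY_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return entity_type
    return None


def answer(query, current_item, consumed_titles):
    """Pair each matching entity of the current item with consumed titles."""
    entity_type = entity_type_from_query(query)
    if entity_type is None:
        return []
    results = []
    for entity in CATALOG[current_item].get(entity_type, []):
        for title in consumed_titles:
            if entity in CATALOG.get(title, {}).get(entity_type, []):
                results.append((entity, title))
    return results


consumed = ["Movie B", "Movie C"]
print(answer("Where have I seen this actor before?", "Movie A", consumed))
```

A real system would replace the keyword table with a language-understanding component; the sketch only shows how the entity type narrows the lookup in the consumption database.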
Abstract:
In one example, a system comprises at least one processor configured to determine an indication of an audio portion of video content, determine, based at least in part on the indication, one or more candidate audio tracks, determine, based at least in part on the one or more candidate audio tracks, one or more search terms, and provide a search query that includes the search terms. The at least one processor may be further configured to, in response to the search query, receive a response that indicates a number of search results, wherein each one of the search results is associated with content that includes the one or more search terms, select, based at least in part on the response, a particular audio track of the one or more candidate audio tracks, and send a message that associates the video content with at least the particular audio track.
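One way to picture the selection step is the sketch below. It assumes that search terms are built from the video title plus each candidate's title and artist, that one query is issued per candidate, and that the candidate with the most results is selected. The abstract does not fix any of these choices, and the tiny corpus and `count_results` function are hypothetical stand-ins for a search service.

```python
# Hypothetical corpus standing in for a search backend.
CORPUS = [
    "Road Trip Vlog soundtrack: Open Skies by The Larks",
    "Road Trip Vlog uses Open Skies by The Larks in the intro",
    "Open Skies by The Finches, live session",
]


def count_results(terms):
    """Count documents that contain every search term (case-insensitive)."""
    return sum(all(t.lower() in doc.lower() for t in terms) for doc in CORPUS)


def search_terms(video_title, track):
    """Assumed term construction: video title plus track title and artist."""
    return [video_title, track["title"], track["artist"]]


def select_track(video_title, candidates, counter=count_results):
    """Pick the candidate whose search terms return the most results."""
    best, best_count = None, -1
    for track in candidates:
        n = counter(search_terms(video_title, track))
        if n > best_count:
            best, best_count = track, n
    return best


candidates = [
    {"title": "Open Skies", "artist": "The Finches"},
    {"title": "Open Skies", "artist": "The Larks"},
]
print(select_track("Road Trip Vlog", candidates))
```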
Abstract:
Systems and methods are disclosed for providing device-specific instructions in response to a perception of a media content segment. In one implementation, a processing device receives one or more media content segments from a user device. The processing device processes the one or more media content segments to determine one or more operations associated with the one or more media content segments. The processing device selects, based on one or more characteristics associated with the user device, at least one of the one or more operations. The processing device provides one or more instructions to perform the at least one of the one or more operations in relation to the user device.
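As a rough illustration, the selection of operations by device characteristics might look like the following. The operation table, the trait names, and the assumption that a segment has already been recognized and reduced to an identifier are all hypothetical.

```python
# Hypothetical mapping: recognized segment -> operations and the device
# traits each operation requires.
OPERATIONS = {
    "coupon-jingle": [
        {"name": "open_app_deeplink", "requires": {"has_touchscreen"}},
        {"name": "show_notification", "requires": {"has_display"}},
        {"name": "speak_offer", "requires": {"has_speaker"}},
    ],
}


def select_operations(segment_id, device_traits):
    """Keep only operations whose required traits the device provides."""
    return [
        op["name"]
        for op in OPERATIONS.get(segment_id, [])
        if op["requires"] <= device_traits  # subset test on sets
    ]


def instructions_for(segment_ids, device_traits):
    """Build one instruction per selected operation across all segments."""
    return [
        {"segment": seg, "do": name}
        for seg in segment_ids
        for name in select_operations(seg, device_traits)
    ]


print(instructions_for(["coupon-jingle"], {"has_speaker"}))
```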
Abstract:
Systems and methods are provided for suggesting actions for selected text based on content displayed on a mobile device. An example method can include converting a selection made via a display device into a query, providing the query to an action suggestion model that is trained to predict an action given a query, each action being associated with a mobile application, receiving one or more predicted actions, and initiating display of the one or more predicted actions on the display device. Another example method can include identifying, from search records, queries where a website is highly ranked, the website being one of a plurality of websites in a mapping of websites to mobile applications. The method can also include generating positive training examples for an action suggestion model from the identified queries, and training the action suggestion model using the positive training examples.
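The training-data step of the second example method can be sketched as follows. The record layout, the example domains and application identifiers, and the `max_rank` cutoff used to interpret "highly ranked" are assumptions made for illustration only.

```python
# Hypothetical mapping of websites to mobile applications.
SITE_TO_APP = {
    "maps.example.com": "com.example.maps",
    "food.example.com": "com.example.food",
}


def positive_examples(search_records, site_to_app, max_rank=3):
    """Emit (query, app) pairs for queries where a mapped site ranks highly.

    search_records is a list of (query, ranked_sites) tuples, with the best
    result first. max_rank is an assumed cutoff for "highly ranked".
    """
    examples = []
    for query, ranked_sites in search_records:
        for site in ranked_sites[:max_rank]:
            app = site_to_app.get(site)
            if app is not None:
                examples.append((query, app))
    return examples


records = [
    ("pizza near me", ["food.example.com", "blog.example.org"]),
    ("directions to airport", ["news.example.org", "maps.example.com"]),
    ("weather", ["a.example.org", "b.example.org", "maps.example.com"]),
]
print(positive_examples(records, SITE_TO_APP, max_rank=2))
```

The resulting pairs would then serve as positive examples for training the action suggestion model; the model itself is outside the scope of this sketch.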
Abstract:
A method includes determining, based at least in part on a type of information to be displayed at a display device associated with a computing device, a privacy level for the information to be displayed; and determining whether the privacy level satisfies a threshold privacy level. The method also includes, responsive to determining that the privacy level satisfies the threshold privacy level, determining whether an individual not associated with a currently active user account of the computing device is proximate to the display device. The method also includes determining an estimated speed of the individual not associated with the currently active user account relative to the display device. The method further includes determining whether the estimated speed satisfies a threshold speed and, responsive to determining that the estimated speed satisfies the threshold speed, outputting the information such that at least a first portion of the information is obscured.
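The chain of checks can be sketched as a single decision function. The privacy table, both threshold values, and the reading that a speed "satisfies" the threshold when it is at or below it (a slow or stationary bystander) are assumptions; the abstract does not state the direction of the comparison.

```python
# Hypothetical privacy levels per information type (higher = more private).
PRIVACY_BY_TYPE = {"weather": 0, "email": 2, "banking": 3}


def should_obscure(info_type, bystander_speed_mps,
                   privacy_threshold=2, speed_threshold_mps=0.5):
    """Decide whether part of the information should be obscured.

    bystander_speed_mps is None when no individual outside the active
    user account is proximate to the display.
    """
    privacy = PRIVACY_BY_TYPE.get(info_type, 0)
    if privacy < privacy_threshold:
        return False  # information is not private enough to check further
    if bystander_speed_mps is None:
        return False  # nobody else is near the display
    # Assumed direction: slow or stationary bystanders satisfy the threshold.
    return bystander_speed_mps <= speed_threshold_mps


print(should_obscure("banking", 0.2))
print(should_obscure("banking", 1.5))
```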
Abstract:
Systems and methods are provided for detecting and ranking entities identified in screen content displayed on a mobile device. For example, a method includes receiving an image captured from a mobile device display for a mobile application and determining a window that includes a chronological set of images, the images each representing a respective screen captured from a display of a mobile device and having an associated timestamp. The method also includes identifying entities appearing in images in a first portion of the window using text for images in a remaining portion of the window as context to disambiguate ambiguous entity references.
Abstract:
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for receiving audio data corresponding to an utterance, determining that the audio data corresponds to a hotword, generating a hotword audio fingerprint of the audio data that is determined to correspond to the hotword, comparing the hotword audio fingerprint to one or more stored audio fingerprints of audio data that was previously determined to correspond to the hotword, detecting whether the hotword audio fingerprint matches a stored audio fingerprint of audio data that was previously determined to correspond to the hotword based on whether the comparison indicates a similarity between the hotword audio fingerprint and one of the one or more stored audio fingerprints that satisfies a predetermined threshold, and in response to detecting that the hotword audio fingerprint matches a stored audio fingerprint, disabling access to a computing device into which the utterance was spoken.
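A toy version of the fingerprint comparison is sketched below. The quantization-based fingerprint, the position-wise similarity measure, and the 0.95 threshold are illustrative assumptions, not the fingerprinting scheme of the abstract. One plausible reading, also an assumption here, is that a near-identical repeat of an earlier hotword utterance suggests a replayed recording.

```python
def fingerprint(samples, levels=16):
    """Toy fingerprint: quantize samples in [-1, 1] to coarse integer levels."""
    return tuple(
        min(levels - 1, int((s + 1.0) / 2.0 * levels)) for s in samples
    )


def similarity(a, b):
    """Fraction of positions at which two fingerprints agree."""
    if not a or not b:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / max(len(a), len(b))


class HotwordGuard:
    """Stores past hotword fingerprints and disables access on a match."""

    def __init__(self, threshold=0.95):
        self.threshold = threshold
        self.stored = []
        self.access_enabled = True

    def on_hotword(self, samples):
        """Return True when the utterance matches a stored fingerprint."""
        fp = fingerprint(samples)
        matched = any(
            similarity(fp, old) >= self.threshold for old in self.stored
        )
        self.stored.append(fp)
        if matched:
            self.access_enabled = False
        return matched


guard = HotwordGuard()
first = [0.1, -0.3, 0.5, 0.9, -0.7, 0.0, 0.25, -0.55]
print(guard.on_hotword(first), guard.on_hotword(first), guard.access_enabled)
```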
Abstract:
Methods, systems, and media for recommending content based on network conditions are provided. In some embodiments, the method comprises: receiving, from a first user device, a request to present media content recommendations on the first user device; in response to receiving the request, determining information indicating a user context associated with the first user device and network connectivity information associated with a connection status of the first user device over a communications network; identifying a group of media content items to recommend based on the user context and the network connectivity information; and causing recommendations for the group of media content items to be presented on the first user device.
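A minimal sketch of network-aware recommendation follows. The item fields (`min_mbps`, `downloaded`, `genre`), the shape of the context and network dictionaries, and the ranking rule are hypothetical; they only illustrate how connectivity and user context can jointly narrow the recommended group.

```python
def recommend(items, user_context, network, limit=3):
    """Recommend titles that are playable now, preferred genres first."""

    def playable(item):
        # Downloaded items always play; others need enough bandwidth.
        return item["downloaded"] or (
            network["connected"] and item["min_mbps"] <= network["mbps"]
        )

    pool = [item for item in items if playable(item)]
    # Stable sort: items in a preferred genre (key False) come first.
    pool.sort(key=lambda i: i["genre"] not in user_context["preferred_genres"])
    return [item["title"] for item in pool[:limit]]


ITEMS = [
    {"title": "4K Documentary", "genre": "documentary", "min_mbps": 25.0, "downloaded": False},
    {"title": "Podcast Episode", "genre": "comedy", "min_mbps": 0.5, "downloaded": False},
    {"title": "Saved Concert", "genre": "music", "min_mbps": 5.0, "downloaded": True},
    {"title": "SD News Clip", "genre": "news", "min_mbps": 1.5, "downloaded": False},
]
context = {"preferred_genres": {"music", "news"}}

print(recommend(ITEMS, context, {"connected": False, "mbps": 0.0}))
print(recommend(ITEMS, context, {"connected": True, "mbps": 2.0}))
```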
Abstract:
A method includes receiving, from an audio streaming system, a probe audio sample and identifying sufficiently matching reference audio samples based on a first comparison of a first portion of the probe audio sample to reference audio samples. The method also includes, in response to determining that the sufficiently matching reference audio samples do not meet a predetermined score threshold, retaining the sufficiently matching reference audio samples, identifying additional matching reference audio samples based on a second comparison of a second portion of the probe audio sample to the reference audio samples, and outputting at least one of the reference audio samples based on the first comparison and the second comparison.
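The two-stage comparison can be sketched with toy data. Audio is represented as lists of integer tokens, the score is a simple overlap fraction, the probe is split in half, and scores from both stages are summed; these are all assumptions for illustration. Returning early when the first stage already meets the score threshold is also an assumed behavior that the abstract does not spell out.

```python
def match_score(probe_part, reference):
    """Toy score: fraction of probe tokens that appear in the reference."""
    if not probe_part:
        return 0.0
    ref = set(reference)
    return sum(tok in ref for tok in probe_part) / len(probe_part)


def identify(probe, references, match_floor=0.5, score_threshold=0.9):
    """Return the best reference name, or None when nothing matches."""
    half = len(probe) // 2
    first, second = probe[:half], probe[half:]

    # First comparison: keep references that match sufficiently.
    candidates = {}
    for name, ref in references.items():
        score = match_score(first, ref)
        if score >= match_floor:
            candidates[name] = score

    if any(score >= score_threshold for score in candidates.values()):
        return max(candidates, key=candidates.get)

    # Threshold not met: retain candidates and compare the second portion.
    combined = dict(candidates)
    for name, ref in references.items():
        score = match_score(second, ref)
        if score >= match_floor:
            combined[name] = combined.get(name, 0.0) + score
    return max(combined, key=combined.get) if combined else None


REFERENCES = {
    "song_a": [1, 2, 3, 4, 5, 6, 7, 8],
    "song_b": [1, 2, 9, 10, 11, 12, 13, 14],
    "song_c": [20, 21, 22, 23],
}
print(identify([1, 2, 30, 31, 11, 12, 13, 40], REFERENCES))
```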
Abstract:
In some implementations, a language proficiency of a user of a client device is determined by one or more computers. The one or more computers then determine a text segment for output by a text-to-speech module based on the determined language proficiency of the user. After determining the text segment for output, the one or more computers generate audio data including a synthesized utterance of the text segment. The audio data including the synthesized utterance of the text segment is then provided to the client device for output.
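The pipeline can be outlined in a few lines. The proficiency score range, the level cutoffs, the text variants, and the `synthesize` placeholder are hypothetical; in particular, `synthesize` only encodes the text as bytes and stands in for a real text-to-speech engine.

```python
# Hypothetical text variants for the same message at different levels.
TEXT_VARIANTS = {
    "beginner": "Rain today. Take an umbrella.",
    "intermediate": "It will rain this afternoon, so bring an umbrella.",
    "fluent": "Expect scattered showers this afternoon; an umbrella is advisable.",
}


def proficiency_level(score):
    """Map an assumed proficiency score in [0, 1] to a coarse level."""
    if score < 0.4:
        return "beginner"
    if score < 0.75:
        return "intermediate"
    return "fluent"


def select_text(score):
    """Choose the text segment that suits the user's proficiency."""
    return TEXT_VARIANTS[proficiency_level(score)]


def synthesize(text):
    """Placeholder for a text-to-speech engine: returns the text as bytes."""
    return text.encode("utf-8")


def respond(score):
    """Produce audio data for the client from the proficiency score."""
    return synthesize(select_text(score))


print(select_text(0.2))
print(select_text(0.9))
```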