LLM AS A TRANSCRIPTION FILTER
    1.
    Invention application

    Publication No.: US20250157473A1

    Publication date: 2025-05-15

    Application No.: US18506038

    Application date: 2023-11-09

    Applicant: Google LLC

    Abstract: A user electronic device comprising: one or more microphones configured to capture raw audio data; and one or more processors and one or more storage devices storing instructions that when executed by the one or more processors cause the one or more processors to perform operations comprising: receiving the raw audio data captured by the one or more microphones; processing the raw audio data using a speech transcriber to generate a live transcription of the raw audio data that comprises a plurality of text tokens; processing the raw audio data to generate a speaker identification output that identifies a respective speaker for each of the text tokens in the live transcription; and processing a first input comprising (i) a first input prompt and (ii) an input text generated from the live transcription using a language model neural network to generate a modified transcription.
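
The data flow in this abstract (speaker-labelled tokens, then a prompt plus input text, then a language model) can be sketched roughly as follows. This is only an illustration under assumptions: `Token`, `build_input_text`, `filter_transcription`, and the `language_model` callable are hypothetical names, and the "Speaker: words" layout of the input text is one possible choice that the abstract does not specify.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Token:
    text: str     # one text token from the live transcription
    speaker: str  # label assigned by the speaker identification step


def build_input_text(tokens: List[Token]) -> str:
    """Group consecutive tokens by speaker into 'Speaker: words' lines.

    The line layout is an assumption made for this sketch.
    """
    lines: List[str] = []
    current_speaker: Optional[str] = None
    for tok in tokens:
        if tok.speaker != current_speaker:
            # A new speaker starts a new line.
            lines.append(f"{tok.speaker}: {tok.text}")
            current_speaker = tok.speaker
        else:
            # Same speaker: extend the current line.
            lines[-1] += f" {tok.text}"
    return "\n".join(lines)


def filter_transcription(
    tokens: List[Token],
    prompt: str,
    language_model: Callable[[str], str],
) -> str:
    """Combine the prompt with the input text and pass it to the model.

    `language_model` is a placeholder for whatever model is used; it is
    any function mapping an input string to a modified transcription.
    """
    input_text = build_input_text(tokens)
    return language_model(f"{prompt}\n\n{input_text}")
```

Passing the model as a plain callable keeps the sketch independent of any particular model API.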

    INCREMENTAL HEAD-RELATED TRANSFER FUNCTION UPDATES

    Publication No.: US20250142277A1

    Publication date: 2025-05-01

    Application No.: US18934895

    Application date: 2024-11-01

    Applicant: GOOGLE LLC

    Inventor: Dongeek Shin

    Abstract: Disclosed are implementations for generating personalized audio. Sensor data corresponding with at least one physical characteristic of a user is received. A three-dimensional mesh of the user is updated based on the sensor data. An impulse response for the user is determined based on the three-dimensional mesh. An audio stream is generated based on the impulse response.
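
The abstract does not say how the audio stream is derived from the impulse response. A common approach, assumed here, is to convolve the source signal with a per-ear impulse response. The sketch below uses hypothetical names (`convolve`, `render_binaural`) and a direct time-domain convolution chosen for clarity rather than speed.

```python
from typing import List, Sequence, Tuple


def convolve(signal: Sequence[float], impulse_response: Sequence[float]) -> List[float]:
    """Direct (time-domain) linear convolution of two sample sequences."""
    if not signal or not impulse_response:
        return []
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def render_binaural(
    mono: Sequence[float],
    left_ir: Sequence[float],
    right_ir: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Apply a left and a right impulse response to a mono signal.

    Using one impulse response per ear is an assumption of this sketch.
    """
    return convolve(mono, left_ir), convolve(mono, right_ir)
```

For real audio lengths an FFT-based convolution would normally replace the nested loop.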

    Time marking of media items at a platform using machine learning

    Publication No.: US12192550B2

    Publication date: 2025-01-07

    Application No.: US17746818

    Application date: 2022-05-17

    Applicant: Google LLC

    Inventor: Dongeek Shin

    Abstract: Methods and systems for time marking of media items at a platform using machine learning are provided herein. A media item to be provided to users of a platform is identified. The media item includes two or more content segments. An indication of the identified media item is provided as input to a machine learning model. The machine learning model is trained to predict, for a given media item, content segments of the given media item depicting an event of interest to the one or more users. One or more outputs of the machine learning model are obtained. The one or more obtained outputs include event data identifying each content segment of the media item and an indication of a level of confidence that each respective content segment depicts an event of interest. In response to determining that at least one content segment is associated with a level of confidence that satisfies a level of confidence criterion, the at least one content segment is associated with a bookmark for a timeline of the media item. The media item and an indication of the bookmark are provided for presentation to the at least one user.
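
The bookmarking step described above amounts to filtering model outputs by confidence. A minimal sketch follows, assuming the criterion is a simple threshold and that a bookmark is placed at the segment start. Both are assumptions, and `SegmentPrediction`, `select_bookmarks`, and the default threshold of 0.8 are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SegmentPrediction:
    start_s: float     # segment start on the media timeline, in seconds
    end_s: float       # segment end, in seconds
    confidence: float  # model confidence that the segment depicts an event of interest


def select_bookmarks(
    predictions: List[SegmentPrediction],
    threshold: float = 0.8,  # assumed criterion: confidence >= threshold
) -> List[float]:
    """Return bookmark times (segment starts) for confident segments, in timeline order."""
    return sorted(p.start_s for p in predictions if p.confidence >= threshold)
```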

    AUDIO SIGNAL SYNTHESIS FROM A NETWORK OF DEVICES

    Publication No.: US20240304186A1

    Publication date: 2024-09-12

    Application No.: US18119137

    Application date: 2023-03-08

    Applicant: GOOGLE LLC

    Inventor: Dongeek Shin

    CPC classification number: G10L15/20 G10L15/16 G10L15/30

    Abstract: Merging first and second audio data to generate merged audio data, where the first audio data captures a spoken utterance of a user and is collected by a first computing device within an environment, and the second audio data captures the spoken utterance and is collected by a distinct second computing device that is within the environment. In some implementations, the merging includes merging the first audio data using a first weight value and merging the second audio data using a second weight value. The first and second weight values can be based on predicted signal-to-noise ratios (SNRs) for the first audio data and the second audio data, respectively, such as a first SNR predicted by processing the first audio data using a neural network model and a second SNR predicted by processing the second audio data using the neural network model.
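
The weighted merge can be illustrated with a small sketch. It assumes the two streams are already time-aligned at the same sample rate, that the SNRs are supplied in decibels (in the abstract they are predicted by a neural network, which is not modelled here), and that the weights are proportional to linear SNR. The abstract only says the weights are "based on" the SNRs, so that mapping and the names `snr_weights` and `merge_audio` are hypothetical.

```python
from typing import List, Sequence, Tuple


def snr_weights(snr_db_first: float, snr_db_second: float) -> Tuple[float, float]:
    """Map two SNRs in dB to weights that sum to 1.

    Assumed mapping: weights proportional to the linear power ratio.
    """
    p1 = 10 ** (snr_db_first / 10)
    p2 = 10 ** (snr_db_second / 10)
    total = p1 + p2
    return p1 / total, p2 / total


def merge_audio(
    first: Sequence[float],
    second: Sequence[float],
    w1: float,
    w2: float,
) -> List[float]:
    """Weighted sample-wise sum of two aligned streams, truncated to the shorter one."""
    n = min(len(first), len(second))
    return [w1 * first[i] + w2 * second[i] for i in range(n)]
```

With this mapping, the device that hears the utterance more cleanly dominates the merged signal, while the noisier capture still contributes.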

    Hand gesture recognition based on detected wrist muscular movements

    Publication No.: US12073028B2

    Publication date: 2024-08-27

    Application No.: US18174358

    Application date: 2023-02-24

    Applicant: GOOGLE LLC

    CPC classification number: G06F3/017 G06F1/163 G06F3/014 G06V40/28

    Abstract: Techniques of identifying gestures include detecting and classifying inner-wrist muscle motions at a user's wrist using micron-resolution radar sensors. For example, a user of an AR system may wear a band around their wrist. When the user makes a gesture to manipulate a virtual object in the AR system as seen in a head-mounted display (HMD), muscles and ligaments in the user's wrist make small movements on the order of 1-3 mm. The band contains a small radar device that has a transmitter and a number of receivers (e.g., three) of electromagnetic (EM) radiation on a chip (e.g., a Soli chip). This radiation reflects off the wrist muscles and ligaments and is received by the receivers on the chip in the band. The received reflected signal, or signal samples, are then sent to processing circuitry for classification to identify the wrist movement as a gesture.

    SYSTEM AND METHOD FOR MOTION CAPTURE
    6.
    Invention publication

    Publication No.: US20240280687A1

    Publication date: 2024-08-22

    Application No.: US18571627

    Application date: 2021-06-28

    Applicant: Google LLC

    Inventor: Dongeek Shin

    CPC classification number: G01S13/878 G01S13/765 G06K7/10366

    Abstract: Ultra-wideband (UWB) tags can be used as part of a high-resolution motion capture system that may not require a cost or a complexity that is typically associated with visually based motion capture systems. The UWB based motion capture uses a bundle of UWB tags, which in a possible implementation, can be affixed to body parts of a user to sense motion of the body parts. The absolute positions of each UWB tag can then be determined by reconstructing a skeletal topology from a Euclidean distance matrix based on inter-tag ranging measurements using handshake signals of a UWB protocol.
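
The abstract does not name the reconstruction method. One standard way to recover coordinates from a Euclidean distance matrix is classical multidimensional scaling, shown here as an assumed illustration. The symbols are introduced for this sketch only: $n$ tags with positions $x_i \in \mathbb{R}^3$ stacked as the rows of $X$, measured ranges $d_{ij}$, and the all-ones vector $\mathbf{1}$.

```latex
\begin{aligned}
D^{(2)}_{ij} &= d_{ij}^{2} = \lVert x_i - x_j \rVert^{2}, \\
J &= I_n - \tfrac{1}{n}\,\mathbf{1}\mathbf{1}^{\top}, \\
B &= -\tfrac{1}{2}\, J D^{(2)} J = (JX)(JX)^{\top}, \\
B &= V \Lambda V^{\top}, \qquad \hat{X} = V_{3}\, \Lambda_{3}^{1/2}.
\end{aligned}
```

Here $V_3$ and $\Lambda_3$ hold the three largest eigenpairs of $B$. The estimate $\hat{X}$ matches the true layout only up to rotation, translation, and reflection, so some additional reference (for example, tags at known positions) would be needed to fix absolute positions.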

    USING HISTORICAL USER ROUTES TO RECOMMEND NAVIGATIONAL ROUTES

    Publication No.: US20240200960A1

    Publication date: 2024-06-20

    Application No.: US17801435

    Application date: 2022-02-22

    Applicant: Google LLC

    Inventor: Dongeek Shin

    CPC classification number: G01C21/3484 G01C21/343

    Abstract: A computing system and method for a mapping system that can recommend paths for navigational routing to a primary user. More particularly, a primary user may be interested in navigational routes that secondary users, different from the primary user, have taken in the past. The mapping systems described herein can provide improved user navigational services by leveraging the insight that users who are connected (e.g., via social media, address books, etc.) may often be interested in visiting similar points of interest. More particularly, aspects of the present disclosure enable various interactions between connected users such as following in prior users' footsteps and taking a tour according to one or more prior users' route. Alternatively, aspects of the present disclosure enable an optimization of a primary user's navigational routing by leveraging the insight that the primary user may share interests with connected secondary users. Thus, the mapping system can generate a tailored navigational routing that guides the primary user optimally to the one or more predicted points of interest based on one or more historical navigational routings by connected secondary users.
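
As a rough illustration of predicting points of interest from connected users' historical routes, the sketch below ranks points of interest by how many connected users' routes include them. The abstract does not describe the actual prediction method, so the frequency ranking, the data layout, and the name `predict_points_of_interest` are all assumptions.

```python
from collections import Counter
from typing import Dict, List, Set


def predict_points_of_interest(
    primary_user: str,
    connections: Dict[str, Set[str]],          # user -> connected secondary users
    route_history: Dict[str, List[List[str]]], # user -> past routes, each a list of POI names
    top_k: int = 3,
) -> List[str]:
    """Rank POIs by how often they appear in connected users' past routes."""
    counts: Counter = Counter()
    for user in connections.get(primary_user, set()):
        for route in route_history.get(user, []):
            # Count each POI at most once per route.
            counts.update(set(route))
    # Sort by descending count, then by name so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [poi for poi, _ in ranked[:top_k]]
```

A routing step, not shown, would then plan a path through the selected points of interest.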
