Invention Grant
- Patent Title: Speaker separation based on real-time latent speaker state characterization
- Application No.: US18368459
- Application Date: 2023-09-14
- Publication No.: US12315516B2
- Publication Date: 2025-05-27
- Inventor: Valentin Alain Jean Perret, Nándor Kedves, Nicolas Lucien Perony
- Applicant: Unity Technologies SF
- Applicant Address: US CA San Francisco
- Assignee: Unity Technologies SF
- Current Assignee: Unity Technologies SF
- Current Assignee Address: US CA San Francisco
- Agency: Schwegman Lundberg & Woessner, P.A.
- Main IPC: G10L17/18
- IPC: G10L17/18; G06N3/04; G06N3/045; G06N3/049; G06N3/08; G10L17/02; G10L17/04; G10L17/06; G10L17/08; G10L21/0272

Abstract:
Systems, methods, and non-transitory computer-readable media can obtain a stream of audio waveform data that represents speech involving a plurality of speakers. As the stream of audio waveform data is obtained, a plurality of audio chunks can be determined. An audio chunk can be associated with one or more identity embeddings. The stream of audio waveform data can be segmented into a plurality of segments based on the plurality of audio chunks and respective identity embeddings associated with the plurality of audio chunks. A segment can be associated with a speaker included in the plurality of speakers. Information describing the plurality of segments associated with the stream of audio waveform data can be provided.
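The flow the abstract describes (chunk the incoming stream, attach an identity embedding to each chunk, then group chunks into per-speaker segments) can be illustrated with a minimal sketch. This is not the patented method: the abstract does not say how embeddings are computed or compared, so a simple online cosine-similarity clustering is swapped in here. The names `chunks`, `segment_stream`, `toy_embed`, `Segment`, and the `threshold` parameter are all hypothetical, and the sketch assumes exactly one embedding per chunk, although the abstract allows one or more.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Callable, Iterable, Iterator, List, Sequence

Embedding = Sequence[float]


@dataclass
class Segment:
    speaker: int  # index of the speaker cluster (hypothetical labeling)
    start: int    # first sample index, inclusive
    end: int      # last sample index, exclusive


def chunks(stream: Iterable[float], size: int) -> Iterator[List[float]]:
    """Yield fixed-size chunks as samples arrive; keep a trailing partial chunk."""
    buf: List[float] = []
    for sample in stream:
        buf.append(sample)
        if len(buf) == size:
            yield buf
            buf = []
    if buf:
        yield buf


def cosine(a: Embedding, b: Embedding) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def segment_stream(
    stream: Iterable[float],
    embed: Callable[[List[float]], Embedding],
    chunk_size: int,
    threshold: float = 0.8,  # assumed similarity cutoff, not from the source
) -> List[Segment]:
    """Assign each chunk to the most similar speaker centroid and merge runs."""
    centroids: List[List[float]] = []
    counts: List[int] = []
    segments: List[Segment] = []
    pos = 0
    for chunk in chunks(stream, chunk_size):
        emb = list(embed(chunk))
        # Find the most similar known speaker at or above the threshold.
        best, best_sim = -1, threshold
        for i, centroid in enumerate(centroids):
            sim = cosine(emb, centroid)
            if sim >= best_sim:
                best, best_sim = i, sim
        if best < 0:
            # No centroid is close enough: treat this as a new speaker.
            centroids.append(emb)
            counts.append(1)
            best = len(centroids) - 1
        else:
            # Update the running mean of the matched speaker's centroid.
            counts[best] += 1
            n = counts[best]
            centroids[best] = [c + (x - c) / n for c, x in zip(centroids[best], emb)]
        end = pos + len(chunk)
        if segments and segments[-1].speaker == best:
            segments[-1].end = end  # extend the current segment
        else:
            segments.append(Segment(best, pos, end))
        pos = end
    return segments


def toy_embed(chunk: List[float]) -> Embedding:
    """Toy stand-in for a learned embedding model: the sign of the chunk mean."""
    mean = sum(chunk) / len(chunk)
    return (1.0, 0.0) if mean >= 0 else (0.0, 1.0)


if __name__ == "__main__":
    stream = [0.5] * 4 + [-0.5] * 4 + [0.5] * 2
    for seg in segment_stream(stream, toy_embed, chunk_size=2):
        print(seg)
```

With the toy input above, the first speaker is recognized again when it returns, so the ten samples come out as three segments attributed to two speakers. In practice `toy_embed` would be replaced by a trained speaker-embedding model, and the threshold would need tuning for that model.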
Public/Granted literature
- US20240153509A1 SPEAKER SEPARATION BASED ON REAL-TIME LATENT SPEAKER STATE CHARACTERIZATION (Public/Granted day: 2024-05-09)