Invention Grant
- Patent Title: Sample-efficient representation learning for real-time latent speaker state characterization
- Application No.: US17115382
- Application Date: 2020-12-08
- Publication No.: US11646037B2
- Publication Date: 2023-05-09
- Inventor: Valentin Alain Jean Perret, Nicolas Lucien Perony, Nándor Kedves
- Applicant: OTO Systems Inc.
- Applicant Address: US NY New York
- Assignee: OTO Systems Inc.
- Current Assignee: OTO Systems Inc.
- Current Assignee Address: US NY New York
- Agency: Schwegman Lundberg & Woessner, P.A.
- Main IPC: G10L17/18
- IPC: G10L17/18; G10L17/02; G06N3/04; G06N3/08; G06N3/049; G06N3/045; G06N3/048; G10L17/08

Abstract:
Systems, methods, and non-transitory computer-readable media can provide audio waveform data that corresponds to a voice sample to a temporal convolutional network for evaluation. The temporal convolutional network can pre-process the audio waveform data and can output an identity embedding associated with the audio waveform data. The identity embedding associated with the voice sample can be obtained from the temporal convolutional network. Information describing a speaker associated with the voice sample can be determined based at least in part on the identity embedding.
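
The data flow described in the abstract (raw waveform in, temporal convolutional network, identity embedding out, speaker information derived from the embedding) can be illustrated with a minimal sketch. This is not the patented architecture: the function names (`causal_dilated_conv`, `toy_tcn_embedding`, `cosine_similarity`), the fixed untrained kernels, the dilation schedule, and the mean-pooling step are all assumptions chosen for illustration only.

```python
import math

def causal_dilated_conv(x, kernel, dilation):
    """1-D causal convolution: out[t] uses only x[t], x[t-d], x[t-2d], ..."""
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:  # samples before the start are treated as zero
                acc += w * x[idx]
        out.append(acc)
    return out

def relu(x):
    return [max(0.0, v) for v in x]

def toy_tcn_embedding(waveform, channel_kernels, dilations=(1, 2, 4)):
    """Hypothetical toy TCN: one stack of dilated causal convolutions per
    embedding dimension, mean-pooled over time and L2-normalized."""
    embedding = []
    for kernels in channel_kernels:  # one list of per-layer kernels per dimension
        h = list(waveform)
        for kernel, d in zip(kernels, dilations):
            h = relu(causal_dilated_conv(h, kernel, d))
        embedding.append(sum(h) / len(h))  # pool over time to a single value
    norm = math.sqrt(sum(v * v for v in embedding)) or 1.0
    return [v / norm for v in embedding]

def cosine_similarity(a, b):
    """Compare two embeddings; values near 1 suggest similar inputs."""
    dot = sum(p * q for p, q in zip(a, b))
    na = math.sqrt(sum(p * p for p in a))
    nb = math.sqrt(sum(q * q for q in b))
    return dot / (na * nb) if na and nb else 0.0

# Illustrative, untrained weights: 2 embedding dimensions, 3 layers each.
KERNELS = [
    [[1.0, 0.5], [1.0, 0.5], [1.0, 0.5]],
    [[0.5, -0.5], [1.0, 0.25], [0.75, 0.5]],
]

# Synthetic stand-in for a voice sample (a short sine wave).
voice_sample = [math.sin(2 * math.pi * 5 * n / 100) for n in range(100)]
identity_embedding = toy_tcn_embedding(voice_sample, KERNELS)
```

In a trained system, an embedding like `identity_embedding` would typically be compared against stored embeddings (for example with `cosine_similarity`) or passed to a downstream classifier to determine information about the speaker; the abstract does not specify which approach is used.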