Patent search ap:("SRI International") AND inv:"Alexander Erdmann" Page 1

1.

Invention publication
SPEECH MODIFICATION USING ACCENT EMBEDDINGS (Under examination, published)

Publication (announcement) number: US20240304175A1

Publication (announcement) date: 2024-09-12

Application number: US18599018

Application date: 2024-03-07

Applicant: SRI International

Inventor: Alexander Erdmann, Sarah Bakst, Harry Bratt, Dimitra Vergyri, Horacio Franco

IPC: G10L13/047, G10L15/16

CPC classification number: G10L13/047, G10L15/16

Abstract: Techniques for a machine learning system configured to obtain a dataset of a plurality of sample speech clips; generate a plurality of sequence embeddings; initialize a plurality of speaker embeddings and a plurality of accent embeddings; update the plurality of speaker embeddings; update the plurality of accent embeddings; generate a plurality of augmented embeddings based on the plurality of sequence embeddings, the plurality of speaker embeddings, and the plurality of accent embeddings; and generate a plurality of synthetic speech clips based on the plurality of augmented embeddings. The machine learning system may further be configured to obtain an audio waveform; decompose the audio waveform into first magnitude spectral slices and an original phase; process the first magnitude spectral slices to map the first magnitude spectral slices to second magnitude spectral slices; and generate a modified audio waveform in part by combining the second magnitude spectral slices and the original phase.
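
Illustrative note (not part of the patent record): the second half of the abstract describes splitting a waveform into magnitude spectral slices and a phase, mapping the magnitudes, and recombining the result with the original phase. The Python sketch below shows one minimal way such a pipeline could look. It is an assumption-laden illustration only: the frame length `FRAME`, the non-overlapping rectangular framing, the naive DFT, and the function names `decompose`, `recombine`, and `modify` are all hypothetical, and the `mapping` callable merely stands in for whatever learned model the application actually uses.

```python
import cmath
import math

FRAME = 8  # hypothetical frame length, kept small for illustration


def dft(frame):
    """Naive discrete Fourier transform of one frame."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft(); keeps the real part of each reconstructed sample."""
    n = len(spectrum)
    return [
        (sum(c * cmath.exp(2j * math.pi * k * t / n)
             for k, c in enumerate(spectrum)) / n).real
        for t in range(n)
    ]


def decompose(waveform, frame=FRAME):
    """Split a waveform into per-frame magnitude slices and phases.

    Frames do not overlap; trailing samples that do not fill a whole
    frame are dropped (a simplification made for this sketch).
    """
    magnitudes, phases = [], []
    for start in range(0, len(waveform) - frame + 1, frame):
        spectrum = dft(waveform[start:start + frame])
        magnitudes.append([abs(c) for c in spectrum])
        phases.append([cmath.phase(c) for c in spectrum])
    return magnitudes, phases


def recombine(magnitudes, phases):
    """Rebuild a waveform from magnitude slices and the stored phases."""
    out = []
    for mags, phs in zip(magnitudes, phases):
        out.extend(idft([cmath.rect(m, p) for m, p in zip(mags, phs)]))
    return out


def modify(waveform, mapping, frame=FRAME):
    """Map each magnitude slice with `mapping`, reusing the original phase."""
    magnitudes, phases = decompose(waveform, frame)
    mapped = [mapping(slice_) for slice_ in magnitudes]
    return recombine(mapped, phases)
```

For example, `modify(wave, lambda m: m)` should return the input unchanged up to floating-point error, since the magnitudes and phases are recombined untouched. A practical system would more likely use overlapping windowed frames and a fast transform; the naive version here only makes the magnitude and phase bookkeeping explicit.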
