Patent search ap:("SRI International") AND inv:"Clay Spence"

1.

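Invention publication
VOICE MODIFICATION Under examination - published

Publication (announcement) number: US20240355346A1

Publication (announcement) date: 2024-10-24

Application number: US18576127

Application date: 2022-07-14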

Applicant: SRI International

Inventor: Jeffrey Lubin, Clay Spence

IPC: G10L21/013, G10L17/02, G10L17/04, G10L17/26

CPC classification number: G10L21/013, G10L17/02, G10L17/04, G10L17/26, G10L2021/0135

Abstract: A computing system receives an audio waveform representing speech from an individual and produces as output a modified version of the audio waveform that maintains the speaker's speech characteristics as well as prosody for specific utterances (e.g., voice timbre, intonation, timing, intensity). The system uses a bottleneck-based autoencoder with speech spectrograms as input and output. To produce the output audio waveform, the system combines a reconstruction error-based loss function with two additional loss functions. The second loss function is a speaker “real vs fake” discriminator that penalizes the output for not sounding like the speaker. The third loss function is a speech intelligibility scorer that penalizes the output for speech that is difficult for the target population to understand. The modified audio waveform is an enhanced speech output that delivers speech in a target accent without sacrificing the personality of the speaker.
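
Illustrative sketch (not part of the patent record): the abstract names three loss terms but gives no formulas or weights. The Python below shows one way such terms could be summed into a single training objective. Everything specific here is an assumption made for illustration: the function names, the mean-squared-error form of the reconstruction term, the negative-log form of the discriminator penalty, the "1 minus score" form of the intelligibility penalty, and the weights w_speaker and w_intel. The discriminator probability and intelligibility score are passed in as plain numbers; the models that would produce them are not shown.

```python
import math


def reconstruction_loss(target, output):
    """Mean squared error between two equal-length spectrogram frames (assumed form)."""
    if len(target) != len(output):
        raise ValueError("frames must have the same length")
    return sum((t - o) ** 2 for t, o in zip(target, output)) / len(target)


def speaker_penalty(p_real):
    """Penalty that grows as the discriminator's 'real speaker' probability falls.

    p_real is a hypothetical discriminator output in [0, 1]; the negative-log
    form is an assumption, not something the abstract specifies.
    """
    return -math.log(max(p_real, 1e-12))  # clamp to avoid log(0)


def intelligibility_penalty(score):
    """Penalty for hard-to-understand speech; score is a hypothetical value in [0, 1]."""
    return 1.0 - score


def total_loss(target, output, p_real, score, w_speaker=1.0, w_intel=1.0):
    """Weighted sum of the three terms; the weights are illustrative assumptions."""
    return (
        reconstruction_loss(target, output)
        + w_speaker * speaker_penalty(p_real)
        + w_intel * intelligibility_penalty(score)
    )
```

In this sketch, an output that matches the target exactly, fully convinces the discriminator (p_real = 1.0), and is scored fully intelligible (score = 1.0) has a total loss of zero; any shortfall on one of the three criteria raises the total.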
