Automatic synthesis of translated speech using speaker-specific phonemes

Invention Grant

US11594226B2 Automatic synthesis of translated speech using speaker-specific phonemes (Active)

Patent Title: Automatic synthesis of translated speech using speaker-specific phonemes
Application No.: US17131043

Application Date: 2020-12-22
Publication No.: US11594226B2

Publication Date: 2023-02-28
Inventor: Su Liu, Yang Liang, Debbie Anglin, Fan Yang
Applicant: International Business Machines Corporation
Applicant Address: US NY Armonk
Assignee: International Business Machines Corporation
Current Assignee: International Business Machines Corporation
Current Assignee Address: US NY Armonk
Agency: Garg Law Firm, PLLC
Agent: Rakesh Garg; Nathan Rau
Main IPC: G10L15/26
IPC: G10L15/26; G10L15/02; G10L13/02; G06F40/279; G06F40/58; G10L25/54; G06F16/683

Abstract:

An embodiment includes converting an original audio signal to an original text string, the original audio signal being from a recording of the original text string spoken by a specific person in a source language. The embodiment generates a translated text string by translating the original text string from the source language to a target language, including translation of a word from the source language to the target language. The embodiment assembles a standard phoneme sequence from a set of standard phonemes, where the standard phoneme sequence includes a standard pronunciation of the translated word. The embodiment also associates a custom phoneme with a standard phoneme of the standard phoneme sequence, where the custom phoneme includes the specific person's pronunciation of a sound in the translated word. The embodiment synthesizes the translated text string into a translated audio signal that includes the translated word pronounced using the custom phoneme.
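
The phoneme-association step described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the lexicon, the phoneme symbols, the clip identifiers, and every function and class name below are hypothetical, and speech recognition, translation, and audio synthesis are left out entirely. The sketch only shows how a standard phoneme sequence for a translated word might have speaker-specific phonemes substituted in wherever the speaker's own recording of that sound is available.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Phoneme:
    symbol: str  # phoneme label, e.g. "HH"
    clip: str    # identifier of the audio unit used at synthesis time


# Hypothetical pronunciation lexicon for the target language:
# translated word -> standard phoneme symbols.
STANDARD_LEXICON: Dict[str, List[str]] = {
    "hello": ["HH", "AH", "L", "OW"],
}


def standard_sequence(word: str) -> List[Phoneme]:
    """Assemble the standard pronunciation of a translated word."""
    symbols = STANDARD_LEXICON[word.lower()]
    return [Phoneme(s, f"standard/{s}") for s in symbols]


def apply_custom_phonemes(
    sequence: List[Phoneme], speaker_clips: Dict[str, str]
) -> List[Phoneme]:
    """Replace each standard phoneme with the speaker's own version
    when one exists; otherwise keep the standard phoneme."""
    return [
        Phoneme(p.symbol, speaker_clips[p.symbol])
        if p.symbol in speaker_clips
        else p
        for p in sequence
    ]


# Assumed data: custom phonemes extracted from the speaker's recording.
speaker_clips = {"HH": "speaker42/HH", "OW": "speaker42/OW"}
result = apply_custom_phonemes(standard_sequence("Hello"), speaker_clips)
```

In this sketch, sounds the speaker never produced in the source recording fall back to the standard phoneme, so the word can always be synthesized even when only part of it can be rendered in the speaker's voice. That fallback is a design choice of the sketch, not something stated in the abstract.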

Public/Granted literature

US20220199086A1 AUTOMATIC SYNTHESIS OF TRANSLATED SPEECH USING SPEAKER-SPECIFIC PHONEMES Public/Granted date: 2022-06-23

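IPC Classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L15/00	Speech recognition (G10L17/00 takes precedence)
G10L15/26	. Speech-to-text systems (G10L15/08 takes precedence)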