Speech separation method, electronic device, chip, and computer-readable storage medium

Invention Grant

US12334092B2 Speech separation method, electronic device, chip, and computer-readable storage medium (Active)

Patent Title: Speech separation method, electronic device, chip, and computer-readable storage medium
Application No.: US18026960

Application Date: 2021-08-24
Publication No.: US12334092B2

Publication Date: 2025-06-17
Inventor: Henghui Lu, Lei Qin, Peng Zhang, Jiaming Xu, Bo Xu
Applicant: Huawei Technologies Co., Ltd.; INSTITUTE OF AUTOMATION, CHINESE ACADEMY OF SCIENCES
Applicant Address: CN Shenzhen; CN Beijing
Assignee: Huawei Technologies Co., Ltd.; INSTITUTE OF AUTOMATION, CHINESE ACADEMY OF SCIENCES
Current Assignee: Huawei Technologies Co., Ltd.; INSTITUTE OF AUTOMATION, CHINESE ACADEMY OF SCIENCES
Current Assignee Address: CN Shenzhen; CN Beijing
Agency: Leydig, Voit & Mayer, Ltd.
Priority: CN202011027680.8 20200925
International Application: PCT/CN2021/114204 WO 20210824
International Announcement: WO2022/062800 WO 20220331
Main IPC: G10L21/0208
IPC: G10L21/0208; G06V20/40; G10L21/055

Abstract:

A speech separation method is provided, and relates to the field of speech. The method includes: obtaining, in a speaking process of a user, audio information including a user speech and video information including a user face; coding the audio information to obtain a mixed acoustic feature; extracting a visual semantic feature of the user from the video information; inputting the mixed acoustic feature and the visual semantic feature into a preset visual speech separation network to obtain an acoustic feature of the user; and decoding the acoustic feature of the user to obtain a speech signal of the user. An electronic device, a chip, and a computer-readable storage medium are provided.
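The data flow named in the abstract (code the audio, extract a visual feature, fuse both in a separation network, decode) can be illustrated with a small NumPy sketch. Everything in it is a hypothetical stand-in rather than the patented implementation: the weights are random instead of trained, the frame and feature sizes are arbitrary, the function names are invented for illustration, and the mask-based fusion is a common technique assumed here, not something the abstract states.

```python
import numpy as np

rng = np.random.default_rng(0)

FRAME = 16     # samples per audio frame (hypothetical)
N_BASIS = 32   # acoustic feature size per frame (hypothetical)
VIS_DIM = 8    # visual semantic feature size (hypothetical)

# Random stand-ins for weights that a real system would learn from data.
W_enc = rng.standard_normal((FRAME, N_BASIS)) / np.sqrt(FRAME)
W_dec = np.linalg.pinv(W_enc)  # shape (N_BASIS, FRAME); inverts the encoder
W_vis = rng.standard_normal((64, VIS_DIM)) / 8.0
W_sep = rng.standard_normal((N_BASIS + VIS_DIM, N_BASIS)) / 6.0


def encode_audio(audio):
    """Code the mixed audio into a mixed acoustic feature, one row per frame."""
    n_frames = len(audio) // FRAME
    frames = audio[: n_frames * FRAME].reshape(n_frames, FRAME)
    return frames @ W_enc  # shape (n_frames, N_BASIS)


def extract_visual_feature(video, n_frames):
    """Map 8x8 face crops to features and align them to the audio frame rate."""
    flat = video.reshape(len(video), -1)           # shape (T_video, 64)
    feats = np.tanh(flat @ W_vis)                  # shape (T_video, VIS_DIM)
    idx = np.arange(n_frames) * len(video) // n_frames
    return feats[idx]                              # shape (n_frames, VIS_DIM)


def separate(mixed_feat, visual_feat):
    """Assumed mask-based fusion: predict a (0, 1) mask from both modalities."""
    fused = np.concatenate([mixed_feat, visual_feat], axis=1)
    mask = 1.0 / (1.0 + np.exp(-(fused @ W_sep)))
    return mask * mixed_feat                       # acoustic feature of the user


def decode_audio(user_feat):
    """Decode per-frame acoustic features back into a waveform."""
    return (user_feat @ W_dec).reshape(-1)


def separate_user_speech(audio, video):
    """Run the four stages in the order the abstract lists them."""
    mixed_feat = encode_audio(audio)
    visual_feat = extract_visual_feature(video, len(mixed_feat))
    user_feat = separate(mixed_feat, visual_feat)
    return decode_audio(user_feat)


audio = rng.standard_normal(160)        # placeholder mixed audio
video = rng.standard_normal((5, 8, 8))  # placeholder face crops
speech = separate_user_speech(audio, video)
```

With untrained weights the output is not a meaningful separation; the sketch only shows how the shapes and stages connect. The decoder is chosen as the pseudo-inverse of the encoder so that an all-ones mask would reproduce the input audio.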

Public/Granted literature

US20230335148A1 Speech Separation Method, Electronic Device, Chip, and Computer-Readable Storage Medium Public/Granted day: 2023-10-19

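IPC Classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L21/00	Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility (G10L19/00 takes precedence)
G10L21/02	.Speech enhancement, e.g. noise reduction or echo cancellation (reducing echo effects in line transmission systems: H04B3/20; echo suppression in hands-free telephones: H04M9/08)
G10L21/0208	..Noise filtering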