Invention Publication
- Patent Title: SYSTEMS AND METHODS FOR PROCESSING BI-MODE DUAL-CHANNEL SOUND DATA FOR AUTOMATIC SPEECH RECOGNITION MODELS
- Application No.: US17930567
- Application Date: 2022-09-08
- Publication No.: US20240087592A1
- Publication Date: 2024-03-14
- Inventor: James J. Mou, Jun Li, Julie Zhu
- Applicant: Optum, Inc.
- Applicant Address: US MN Minnetonka
- Assignee: Optum, Inc.
- Current Assignee: Optum, Inc.
- Current Assignee Address: US MN Minnetonka
- Main IPC: G10L25/21
- IPC: G10L25/21; G10L15/187; G10L15/197; G10L15/26; G10L25/18; G10L25/45

Abstract:
Various embodiments of the present disclosure provide methods, apparatus, systems, computing devices, computing entities, and/or the like for pre-processing dual-channel voice data for an automatic speech recognition model. The method comprises creating one or more spectrograms for each channel of the dual-channel voice data by applying a fast Fourier transform and generating a power spectral density. One or more balanced power spectrograms are then created by merging the spectrograms of the channels and are provided as input for acoustic and language processing by an automatic speech recognition machine learning model.
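
The pipeline described in the abstract (per-channel FFT, power spectral density, merge) can be illustrated with a minimal sketch. This is not the patented method: the abstract does not say how the channels are balanced or merged, so the `merge_balanced` rule below (scale each channel to unit mean power, then average) is a hypothetical stand-in. The function names, frame length, hop size, Hann window, and synthetic test tones are likewise assumptions chosen for illustration.

```python
import numpy as np


def power_spectrogram(signal, sample_rate, frame_len=256, hop=128):
    """Short-time FFT of one channel, returned as a power spectral density.

    Returns (freqs, psd) with psd of shape (n_frames, frame_len // 2 + 1).
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] for i in range(n_frames)]
    )
    spectrum = np.fft.rfft(frames * window, axis=1)
    # Periodogram-style estimate: |X|^2 scaled by sample rate and window energy.
    # (Simplified: the usual one-sided doubling of non-DC bins is omitted.)
    psd = np.abs(spectrum) ** 2 / (sample_rate * np.sum(window ** 2))
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    return freqs, psd


def merge_balanced(psd_a, psd_b, eps=1e-12):
    """Hypothetical balancing rule (an assumption, not taken from the patent):
    scale each channel to unit mean power, then average the two."""
    a = psd_a / (psd_a.mean() + eps)
    b = psd_b / (psd_b.mean() + eps)
    return 0.5 * (a + b)


# Synthetic dual-channel input: a loud 440 Hz tone and a quiet 1000 Hz tone.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
channel_a = 1.0 * np.sin(2 * np.pi * 440 * t)
channel_b = 0.1 * np.sin(2 * np.pi * 1000 * t)

freqs, psd_a = power_spectrogram(channel_a, sample_rate)
_, psd_b = power_spectrogram(channel_b, sample_rate)
merged = merge_balanced(psd_a, psd_b)

# Time-averaged spectrum of the merged spectrogram.
profile = merged.mean(axis=0)
```

In this sketch the quiet channel is not drowned out by the loud one: after scaling, both tones appear as peaks of comparable height in `profile`. A real system would pass `merged` (often after a log or mel transform) to the speech recognition model.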
Public/Granted literature
- US12154589B2 Systems and methods for processing bi-mode dual-channel sound data for automatic speech recognition models (Public/Granted day: 2024-11-26)