SYSTEMS AND METHODS FOR PROCESSING BI-MODE DUAL-CHANNEL SOUND DATA FOR AUTOMATIC SPEECH RECOGNITION MODELS
Abstract:
Various embodiments of the present disclosure provide methods, apparatus, systems, computing devices, computing entities, and/or the like for pre-processing dual-channel voice data for an automatic speech recognition model. The method comprises creating one or more spectrograms for each channel of the dual-channel voice data by applying a fast Fourier transform and generating a power spectral density. One or more balanced power spectrograms are then created by merging the spectrograms of the channels, and are provided as input for acoustic and language processing by an automatic speech recognition machine learning model.
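By way of non-limiting illustration, the pre-processing summarized above may be sketched in Python with NumPy. The abstract does not specify how the channel spectrograms are merged or balanced, so the merge rule below (scaling each channel to equal mean power, then averaging) is an assumption, as are the function names `power_spectrogram` and `balanced_power_spectrogram`, the Hann window, and the frame and hop sizes.

```python
import numpy as np


def power_spectrogram(signal, frame_len=256, hop=128):
    """Short-time power spectral density of one channel, shape (frames, bins)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop: i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # Fast Fourier transform of each frame (real input, one-sided spectrum).
    spectrum = np.fft.rfft(frames, axis=1)
    # Periodogram-style PSD estimate, normalized by the window energy.
    return np.abs(spectrum) ** 2 / np.sum(window ** 2)


def balanced_power_spectrogram(stereo, frame_len=256, hop=128, eps=1e-12):
    """Merge the two channel spectrograms of a (samples, 2) array.

    ASSUMPTION: "balancing" is modeled here as scaling each channel to unit
    mean power before averaging; the source does not define the merge rule.
    """
    left = power_spectrogram(stereo[:, 0], frame_len, hop)
    right = power_spectrogram(stereo[:, 1], frame_len, hop)
    left_scaled = left / (left.mean() + eps)
    right_scaled = right / (right.mean() + eps)
    return 0.5 * (left_scaled + right_scaled)
```

In this sketch, a loud channel and a quiet channel contribute equally to the merged spectrogram, which could then be passed to a speech recognition model as a feature matrix of shape (frames, frequency bins).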