Publication (Announcement) No.: GB2615421B
Publication (Announcement) Date: 2025-05-07
Application No.: GB202303909
Filing Date: 2021-08-24
Applicant: IBM
Inventors: AARON BAUGHMAN, COREY SHELTON, SHIKHAR KWATRA, STEPHEN HAMMER
Abstract: The disclosure includes using dilation of speech content from an interlaced audio input for speech recognition. A learning model is initiated to determine dilation parameters for each of a plurality of audible sounds of speech content from a plurality of speakers received at a computer as an audio input. As part of the learning model, a change of each of a plurality of independent sounds is determined in response to an audio stimulus, the independent sounds being derived from the audio input. The disclosure applies the dilation parameters, respectively, based on the change of each of the independent sounds. A voice print is constructed for each of the speakers based on the independent sounds and the dilation parameters, respectively. Speech content is attributed to each of the plurality of speakers based at least in part on the voice print, respectively, and the independent sounds.
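As a rough illustration of the flow the abstract describes (dilate each independent sound, build a voice print per speaker, then attribute new speech to the closest voice print), the following Python sketch may help. It is not the patented method: the abstract does not say how dilation parameters are learned, what a voice print contains, or how attribution is scored. The function names (`dilate`, `features`, `build_voice_print`, `attribute`), the linear-interpolation time stretch, the two toy features (mean absolute amplitude and zero-crossing rate), and the nearest-neighbour matching are all assumptions chosen for brevity, and the dilation factors are simply supplied by the caller rather than produced by a learning model.

```python
import math


def dilate(samples, factor):
    """Stretch (factor > 1) or compress (factor < 1) a sound in time.

    Assumption: dilation is modelled as linear-interpolation resampling.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    if factor <= 0:
        raise ValueError("factor must be positive")
    n_out = max(1, round(len(samples) * factor))
    if len(samples) == 1 or n_out == 1:
        return [samples[0]] * n_out
    out = []
    for i in range(n_out):
        # Map output index i onto a fractional position in the input.
        pos = i * (len(samples) - 1) / (n_out - 1)
        lo = int(math.floor(pos))
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out


def features(samples):
    """Toy feature vector: (mean absolute amplitude, zero-crossing rate)."""
    mean_abs = sum(abs(s) for s in samples) / len(samples)
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    )
    return (mean_abs, crossings / max(1, len(samples) - 1))


def build_voice_print(sounds_with_factors):
    """Average the features of a speaker's dilated sounds.

    sounds_with_factors: list of (samples, dilation_factor) pairs.
    """
    vectors = [features(dilate(s, f)) for s, f in sounds_with_factors]
    return tuple(sum(col) / len(vectors) for col in zip(*vectors))


def attribute(samples, factor, voice_prints):
    """Return the speaker whose voice print is nearest to the dilated sound."""
    vec = features(dilate(samples, factor))
    return min(voice_prints, key=lambda name: math.dist(vec, voice_prints[name]))


if __name__ == "__main__":
    # Hypothetical data: speaker "A" is loud and slow, "B" is quiet and fast.
    prints = {
        "A": build_voice_print([([0.8, 0.9, 0.8, -0.8, -0.9, -0.8], 1.0)]),
        "B": build_voice_print([([0.1, -0.1, 0.1, -0.1, 0.1, -0.1], 1.0)]),
    }
    print(attribute([0.7, 0.8, -0.7, -0.8], 1.0, prints))
```

In a fuller system the per-sound dilation factors would presumably come from the learning model mentioned in the abstract, and the voice print would use richer acoustic features; the sketch only shows how the three stages could fit together.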