Knowledge transfer in permutation invariant training for single-channel multi-talker speech recognition
Abstract:
Provided are a speech recognition training processing method and an apparatus for performing the same. The method includes acquiring a multi-talker mixed speech signal from a plurality of speakers; performing permutation invariant training (PIT) on the multi-talker mixed speech signal, based on knowledge from a single-talker speech recognition model; and updating a multi-talker speech recognition model based on a result of the PIT.
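The abstract does not say how the knowledge from the single-talker model enters the PIT objective. As a minimal sketch only, the example below assumes one common form of knowledge transfer: the single-talker model acts as a teacher whose per-frame posteriors, computed on each speaker's clean speech, serve as soft targets for the output streams of the multi-talker model. The function names `cross_entropy` and `pit_loss`, the array shapes, and the use of NumPy are illustrative assumptions, not part of the source.

```python
from itertools import permutations

import numpy as np


def cross_entropy(targets, log_probs):
    """Mean per-frame cross-entropy between soft targets and log-probabilities.

    targets:   (T, C) teacher posteriors (assumed soft labels).
    log_probs: (T, C) log-posteriors from one student output stream.
    """
    return float(-(targets * log_probs).sum(axis=-1).mean())


def pit_loss(student_log_probs, teacher_posteriors):
    """Permutation invariant loss over all stream-to-speaker assignments.

    student_log_probs:  (S, T, C) output streams of the multi-talker model.
    teacher_posteriors: (S, T, C) hypothetical soft targets, one per speaker,
                        assumed to come from a single-talker model.
    Returns the lowest average loss and the permutation that achieves it.
    """
    n_streams = student_log_probs.shape[0]
    best_loss, best_perm = float("inf"), None
    for perm in permutations(range(n_streams)):
        # Stream s is scored against the targets of speaker perm[s].
        loss = sum(
            cross_entropy(teacher_posteriors[perm[s]], student_log_probs[s])
            for s in range(n_streams)
        ) / n_streams
        if loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm
```

In a training loop under these assumptions, the returned loss would be the quantity whose gradient updates the multi-talker model, while the teacher stays fixed. Enumerating every permutation costs S! evaluations, which is practical only for the small speaker counts typical of this setting.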