Adaptive permutation invariant training with auxiliary information for monaural multi-talker speech recognition
Abstract:
Provided are a speech recognition training processing method and an apparatus including the same. The speech recognition training processing method includes acquiring a stream of speech data from one or more speakers, extracting an auxiliary feature corresponding to a speech characteristic of the one or more speakers, and updating an acoustic model by performing permutation invariant training (PIT) based on the auxiliary feature.
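For illustration only, the PIT criterion named in the abstract can be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed method: the per-frame cross-entropy loss, the function names `append_auxiliary`, `cross_entropy`, and `pit_loss`, and the idea of concatenating the auxiliary feature to every input frame are all hypothetical choices made for the example. The abstract does not state how the auxiliary feature enters the model.

```python
import itertools
import math


def append_auxiliary(frames, aux):
    """Concatenate one auxiliary vector to every acoustic frame.

    Assumption: the abstract does not say how the auxiliary feature is
    used; simple concatenation is only one possible choice.
    """
    return [list(frame) + list(aux) for frame in frames]


def cross_entropy(posteriors, labels):
    """Sum of -log p(label) over the frames of one output stream."""
    return -sum(math.log(frame[label]) for frame, label in zip(posteriors, labels))


def pit_loss(stream_posteriors, speaker_labels):
    """Evaluate every assignment of output streams to reference speakers
    and return the smallest total loss together with that assignment.

    perm[i] is the index of the reference speaker assigned to output stream i.
    """
    n = len(stream_posteriors)
    best_loss, best_perm = float("inf"), None
    for perm in itertools.permutations(range(n)):
        loss = sum(
            cross_entropy(stream_posteriors[i], speaker_labels[perm[i]])
            for i in range(n)
        )
        if loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm


# Toy data (hypothetical): two output streams, two frames, two classes.
stream_posteriors = [
    [[0.9, 0.1], [0.8, 0.2]],  # output stream 0
    [[0.2, 0.8], [0.3, 0.7]],  # output stream 1
]
speaker_labels = [
    [1, 1],  # reference speaker 0
    [0, 0],  # reference speaker 1
]
loss, perm = pit_loss(stream_posteriors, speaker_labels)
print(perm)  # (1, 0): stream 0 matches speaker 1, stream 1 matches speaker 0
```

In training, the acoustic model would be updated with the gradient of the minimum loss only, so the network is never penalized for emitting the speakers in a different order from the reference labels. The exhaustive search over permutations grows factorially with the number of speakers, which is acceptable for the two or three simultaneous talkers typical of this setting.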