Systems, apparatuses, and methods for speaker verification using artificial neural networks
Abstract:
In one aspect, instead of discriminatively training a single K-class ANN, the proposed architecture discriminatively trains K ANNs (e.g., the following K 2-class ANNs are trained: ANN_1, ANN_2, …, ANN_K). Each of these K 2-class ANNs learns to discriminate between audio material from one of the enrolled speakers and “average” speech material (e.g., a feature vector generated using a Universal Background Model trained as a Gaussian Mixture Model (GMM-UBM)). That is, for example, ANN_i is trained to discriminate between audio material from the ith enrolled speaker and the “average” speech material. In the event that a new speaker is to be enrolled in the system, an additional ANN (e.g., ANN_(K+1)) is trained with the available audio material (audio features) from that particular speaker and audio features produced by the GMM-UBM system.
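The enrollment scheme in the abstract can be sketched in code. The following is a minimal, hypothetical illustration (not the patented implementation): each "ANN" is reduced to a one-layer logistic-regression network trained speaker-vs-background, the GMM-UBM "average" speech features are simulated with random vectors, and all names, dimensions, and hyperparameters are assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 20  # feature-vector dimensionality (assumed for this sketch)

def train_binary_ann(speaker_feats, ubm_feats, epochs=200, lr=0.1):
    """Train one 2-class model: speaker audio (label 1) vs UBM 'average' speech (label 0)."""
    X = np.vstack([speaker_feats, ubm_feats])
    y = np.concatenate([np.ones(len(speaker_feats)), np.zeros(len(ubm_feats))])
    w, b = np.zeros(DIM), 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # sigmoid output
        grad = p - y                             # cross-entropy gradient
        w -= lr * X.T @ grad / len(y)
        b -= lr * grad.mean()
    return w, b

def score(model, feats):
    """Mean verification score of a batch of feature vectors under one model."""
    w, b = model
    return float(np.mean(1.0 / (1.0 + np.exp(-(feats @ w + b)))))

# Enroll K speakers: one binary ANN per speaker, all sharing the same UBM features.
ubm_feats = rng.normal(0.0, 1.0, size=(100, DIM))        # simulated "average" speech
speakers = {i: rng.normal(i + 1.0, 0.5, size=(50, DIM))  # synthetic enrolled speakers
            for i in range(3)}
models = {i: train_binary_ann(feats, ubm_feats) for i, feats in speakers.items()}

# Enrolling a new speaker only requires training one additional ANN (ANN_(K+1));
# the K existing models are untouched.
new_feats = rng.normal(-5.0, 0.5, size=(50, DIM))
models[len(models)] = train_binary_ann(new_feats, ubm_feats)

# Verification probe: a held-out batch from speaker 1 should score high under
# its own model and low under a dissimilar speaker's model.
probe = speakers[1][:10]
print(score(models[1], probe), score(models[3], probe))
```

The design point the abstract emphasizes is visible here: adding speaker K+1 trains one new 2-class model against the shared background features instead of retraining a single (K+1)-class network.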