Patent search ap:("IBM") AND inv:"BELLEGARDA JEROME R"

1.

Invention patent
SPEECH CODING APPARATUS HAVING SPEAKER DEPENDENT PROTOTYPES GENERATED FROM A NONUSER REFERENCE DATA Unknown

Publication (announcement) number: CA2077728A1

Publication (announcement) date: 1993-06-06

Application number: CA2077728

Application date: 1992-09-08

Applicant: IBM

Inventor: BAHL LALIT R , BELLEGARDA JEROME R , DE SOUZA PETER V , GOPALAKRISHNAN PONANI S , NADAS ARTHUR J , NAHAMOO DAVID , PICHENY MICHAEL A

IPC: G10L19/00 , G10L15/02 , G10L15/06 , G10L15/10

Abstract: A speech coding apparatus and method for use in a speech recognition apparatus and method. The value of at least one feature of an utterance is measured during each of a series of successive time intervals to produce a series of feature vector signals representing the feature values. A plurality of prototype vector signals, each having at least one parameter value and a unique identification value, are stored. The feature value of each feature vector signal is compared to the parameter values of the prototype vector signals to obtain prototype match scores for the feature vector signal and each prototype vector signal. The identification value of the prototype vector signal having the best prototype match score is output as a coded representation signal of the feature vector signal. Speaker-dependent prototype vector signals are generated from both synthesized training vector signals and measured training vector signals. The synthesized training vector signals are transformed reference feature vector signals representing the values of features of one or more utterances of one or more speakers in a reference set of speakers. The measured training feature vector signals represent the values of features of one or more utterances of a new speaker/user not in the reference set.
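
The coding step described in this abstract (score each feature vector against every stored prototype and output the identifier of the best match) can be illustrated with a minimal sketch. The function name, the sample data, and the use of squared Euclidean distance as the match score are assumptions made for illustration; the abstract does not say how the match score is computed.

```python
def label_feature_vectors(feature_vectors, prototypes):
    """Return, for each feature vector, the ID of the closest prototype.

    prototypes maps an identification value to a parameter vector.
    Squared Euclidean distance stands in for the prototype match score
    (an assumption; smaller distance means a better match).
    """
    labels = []
    for fv in feature_vectors:
        best_id = min(
            prototypes,
            key=lambda pid: sum((a - b) ** 2 for a, b in zip(fv, prototypes[pid])),
        )
        labels.append(best_id)
    return labels


# Hypothetical two-dimensional prototypes and measured feature vectors.
prototypes = {"P1": [0.0, 0.0], "P2": [1.0, 1.0]}
features = [[0.1, -0.2], [0.9, 1.2], [0.4, 0.3]]
labels = label_feature_vectors(features, prototypes)
print(labels)
```

Each utterance is thereby reduced to a sequence of prototype identifiers, which is the coded representation a recognizer would consume.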

2.

Invention patent
Unknown

Publication (announcement) number: DE69224253D1

Publication (announcement) date: 1998-03-05

Application number: DE69224253

Application date: 1992-08-31

Applicant: IBM

Inventor: BAHL LALIT R , BELLEGARDA JEROME R , EPSTEIN EDWARD ADAM , LUCASSEN JOHN M , NAHAMOO DAVID , PICHENY MICHAEL ALAN

IPC: G10L19/00 , G10L15/02 , G10L19/02 , H03M7/30 , G10L5/00 , G10L5/06 , G10L7/08 , G10L9/06 , G10L9/16

3.

Invention patent
Unknown

Publication (announcement) number: DE69221403T2

Publication (announcement) date: 1998-02-19

Application number: DE69221403

Application date: 1992-05-20

Applicant: IBM

Inventor: BAHL LALIT R , BELLEGARDA JEROME R , DE SOUZA PETER VINCENT , NAHAMOO DAVID , PICHENY MICHAEL ALAN

IPC: G10L19/00 , G10L15/02 , G10L15/06 , G10L5/06 , G10L3/00

Abstract: An apparatus for generating a set of acoustic prototype signals for encoding speech includes means for storing a training script model comprising a series of word-segment models. Each word-segment model comprises a series of elementary models. Means are provided for measuring the value of at least one feature of an utterance of the training script during each of a series of time intervals to produce a series of feature vector signals representing the feature values of the utterance. Means are provided for estimating at least one path through the training script model which would produce the entire series of measured feature vector signals. From the estimated path, the elementary model in the training script model which would produce each feature vector signal is estimated. The apparatus further comprises means for clustering the feature vector signals into a plurality of clusters. Each feature vector signal in a cluster corresponds to a single elementary model in a single location in a single word-segment model. Each cluster signal has a cluster value equal to an average of the feature values of all feature vectors in the signal. Finally, the apparatus includes means for storing a plurality of prototype vector signals. Each prototype vector signal corresponds to an elementary model, has an identifier, and comprises at least two partition values. The partition values are equal to combinations of the cluster values of one or more cluster signals corresponding to the elementary model.
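
The clustering step in this abstract can be sketched as follows, assuming the path estimation has already assigned each feature vector to a (word-segment, location, elementary model) triple. The function names and sample data are hypothetical. The abstract does not state how cluster values are combined into partition values, so the sketch stops at collecting the cluster means that belong to each elementary model.

```python
from collections import defaultdict


def cluster_means(aligned):
    """Average the feature vectors that share one alignment key.

    aligned: iterable of (feature_vector, key) pairs, where key is a
    (word_segment, location, elementary_model) triple taken from the
    estimated path (assumed to be computed elsewhere).
    """
    groups = defaultdict(list)
    for vec, key in aligned:
        groups[key].append(vec)
    return {
        key: [sum(col) / len(vecs) for col in zip(*vecs)]
        for key, vecs in groups.items()
    }


def means_by_elementary_model(means):
    """Collect the cluster means that correspond to each elementary model."""
    by_model = defaultdict(list)
    for (_segment, _location, model), mean in means.items():
        by_model[model].append(mean)
    return dict(by_model)


# Hypothetical aligned training data.
aligned = [
    ([1.0, 2.0], ("seg_a", 0, "m1")),
    ([3.0, 4.0], ("seg_a", 0, "m1")),
    ([10.0, 0.0], ("seg_b", 2, "m1")),
    ([5.0, 5.0], ("seg_a", 1, "m2")),
]
means = cluster_means(aligned)
grouped = means_by_elementary_model(means)
print(grouped)
```

Here elementary model "m1" occurs in two different locations and so contributes two cluster means, which is the raw material from which its partition values would be combined.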

4.

Invention patent
Unknown

Publication (announcement) number: DE69221403D1

Publication (announcement) date: 1997-09-11

Application number: DE69221403

Application date: 1992-05-20

Applicant: IBM

Inventor: BAHL LALIT R , BELLEGARDA JEROME R , DE SOUZA PETER VINCENT , NAHAMOO DAVID , PICHENY MICHAEL ALAN

IPC: G10L19/00 , G10L15/02 , G10L15/06 , G10L5/06 , G10L3/00

Abstract: An apparatus for generating a set of acoustic prototype signals for encoding speech includes means for storing a training script model comprising a series of word-segment models. Each word-segment model comprises a series of elementary models. Means are provided for measuring the value of at least one feature of an utterance of the training script during each of a series of time intervals to produce a series of feature vector signals representing the feature values of the utterance. Means are provided for estimating at least one path through the training script model which would produce the entire series of measured feature vector signals. From the estimated path, the elementary model in the training script model which would produce each feature vector signal is estimated. The apparatus further comprises means for clustering the feature vector signals into a plurality of clusters. Each feature vector signal in a cluster corresponds to a single elementary model in a single location in a single word-segment model. Each cluster signal has a cluster value equal to an average of the feature values of all feature vectors in the signal. Finally, the apparatus includes means for storing a plurality of prototype vector signals. Each prototype vector signal corresponds to an elementary model, has an identifier, and comprises at least two partition values. The partition values are equal to combinations of the cluster values of one or more cluster signals corresponding to the elementary model.

5.

Invention patent
Unknown

Publication (announcement) number: DE69028842D1

Publication (announcement) date: 1996-11-14

Application number: DE69028842

Application date: 1990-12-13

Applicant: IBM

Inventor: BELLEGARDA JEROME R , DE SOUZA PETER VINCENT , GOPALAKRISHNAN PONANI S , NAHAMOO DAVID , PICHENY MICHAEL ALAN

IPC: G06F7/00 , G06F17/18 , G10L15/06 , G10L15/10 , G10L15/14 , G10L5/06 , G10L3/00

Abstract: A method and apparatus of modeling a word by concatenating a series of elemental models to form a word model. At least one elemental model in the series is a composite elemental model formed by combining the starting states of at least first and second primitive elemental models. Each primitive elemental model represents a speech component. The primitive elemental models are combined by a weighted combination of their parameters in proportion to the values of the weighting factors. In order to tailor the word model to closely represent variations in the pronunciation of the word, the word is uttered a plurality of times by a plurality of different speakers. From the prior values of the weighting factors, and from the values of the parameters of the first and second primitive elemental models, the conditional probability of occurrence of the first primitive elemental model given the occurrence of the composite elemental model and given the occurrence of the observed sequence of component sounds is estimated. A posterior value for the first weighting factor is estimated from the conditional probability. By constructing word models from composite elemental models, and by constructing composite elemental models from primitive elemental models, it is possible for the resulting word model to closely represent many variations in the pronunciation of a word. By providing a relatively small set of primitive elemental models in comparison to a relatively large vocabulary of words, the models can be trained to the voice of a new speaker by having the new speaker utter only a small subset of the words in the vocabulary.
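
The weighting-factor re-estimation described in this abstract resembles a mixture-weight update, and a minimal sketch of that idea follows. It assumes two primitive models with discrete output distributions, a prior weight for the first model, and a list of observed labels already attributed to the composite model. All names and numbers are hypothetical, and this averaging rule is one standard way to turn conditional probabilities into a posterior weight, not necessarily the procedure claimed in the patent.

```python
def update_weight(w1, p1, p2, observations):
    """Re-estimate the weight of the first primitive model.

    w1: prior weight of the first primitive model (second gets 1 - w1).
    p1, p2: dicts mapping an observed label to its output probability
            under each primitive model (assumed discrete distributions).
    observations: labels attributed to the composite model; every label
            is assumed to have nonzero probability under the mixture.
    """
    w2 = 1.0 - w1
    posteriors = []
    for o in observations:
        num = w1 * p1[o]
        den = num + w2 * p2[o]
        # Conditional probability that the first primitive model produced o.
        posteriors.append(num / den)
    # The posterior weight is the average of those conditional probabilities.
    return sum(posteriors) / len(posteriors)


# Hypothetical primitive models and observed component sounds.
p1 = {"a": 0.8, "b": 0.2}
p2 = {"a": 0.2, "b": 0.8}
new_w1 = update_weight(0.5, p1, p2, ["a", "a", "a", "b"])
print(new_w1)
```

Because three of the four observations favour the first primitive model, its weight moves above the prior value of 0.5.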

6.

Invention patent
FAST ALGORITHM FOR DERIVING ACOUSTIC PROTOTYPES FOR AUTOMATIC SPEECH RECOGNITION Unknown

Publication (announcement) number: CA2068041C

Publication (announcement) date: 1996-10-29

Application number: CA2068041

Application date: 1992-05-05

Applicant: IBM

Inventor: BAHL LALIT R , BELLEGARDA JEROME R , DE SOUZA PETER V , NAHAMOO DAVID , PICHENY MICHAEL A

IPC: G10L19/00 , G10L15/02 , G10L15/06 , G10L5/00 , G10L5/04 , G10L5/06

Abstract: An apparatus for generating a set of acoustic prototype signals for encoding speech includes means for storing a training script model comprising a series of word-segment models. Each word-segment model comprises a series of elementary models. Means are provided for measuring the value of at least one feature of an utterance of the training script during each of a series of time intervals to produce a series of feature vector signals representing the feature values of the utterance. Means are provided for estimating at least one path through the training script model which would produce the entire series of measured feature vector signals. From the estimated path, the elementary model in the training script model which would produce each feature vector signal is estimated. The apparatus further comprises means for clustering the feature vector signals into a plurality of clusters. Each feature vector signal in a cluster corresponds to a single elementary model in a single location in a single word-segment model. Each cluster signal has a cluster value equal to an average of the feature values of all feature vectors in the signal. Finally, the apparatus includes means for storing a plurality of prototype vector signals. Each prototype vector signal corresponds to an elementary model, has an identifier, and comprises at least two partition values. The partition values are equal to combinations of the cluster values of one or more cluster signals corresponding to the elementary model.

7.

Invention patent
SPEECH CODING APPARATUS HAVING SPEAKER DEPENDENT PROTOTYPES GENERATED FROM A NONUSER REFERENCE DATA Unknown

Publication (announcement) number: CA2077728C

Publication (announcement) date: 1996-08-06

Application number: CA2077728

Application date: 1992-09-08

Applicant: IBM

Inventor: BAHL LALIT R , BELLEGARDA JEROME R , DE SOUZA PETER V , GOPALAKRISHNAN PONANI S , NADAS ARTHUR J , NAHAMOO DAVID , PICHENY MICHAEL A

IPC: G10L19/00 , G10L15/02 , G10L15/06 , G10L15/10 , G10L5/06

Abstract: A speech coding apparatus and method for use in a speech recognition apparatus and method. The value of at least one feature of an utterance is measured during each of a series of successive time intervals to produce a series of feature vector signals representing the feature values. A plurality of prototype vector signals, each having at least one parameter value and a unique identification value, are stored. The feature value of each feature vector signal is compared to the parameter values of the prototype vector signals to obtain prototype match scores for the feature vector signal and each prototype vector signal. The identification value of the prototype vector signal having the best prototype match score is output as a coded representation signal of the feature vector signal. Speaker-dependent prototype vector signals are generated from both synthesized training vector signals and measured training vector signals. The synthesized training vector signals are transformed reference feature vector signals representing the values of features of one or more utterances of one or more speakers in a reference set of speakers. The measured training feature vector signals represent the values of features of one or more utterances of a new speaker/user not in the reference set.

8.

Invention patent
Unknown

Publication (announcement) number: DE69224253T2

Publication (announcement) date: 1998-08-13

Application number: DE69224253

Application date: 1992-08-31

Applicant: IBM

Inventor: BAHL LALIT R , BELLEGARDA JEROME R , EPSTEIN EDWARD ADAM , LUCASSEN JOHN M , NAHAMOO DAVID , PICHENY MICHAEL ALAN

IPC: G10L19/00 , G10L15/02 , G10L19/02 , H03M7/30 , G10L5/00 , G10L5/06 , G10L7/08 , G10L9/06 , G10L9/16

9.

Invention patent
SPEECH CODING APPARATUS WITH SINGLE-DIMENSION ACOUSTIC PROTOTYPES FOR A SPEECH RECOGNIZER Unknown

Publication (announcement) number: CA2072721A1

Publication (announcement) date: 1993-04-04

Application number: CA2072721

Application date: 1992-06-29

Applicant: IBM

Inventor: BAHL LALIT R , BELLEGARDA JEROME R , EPSTEIN EDWARD A , LUCASSEN JOHN M , NAHAMOO DAVID , PICHENY MICHAEL A

IPC: G10L19/00 , G10L15/02 , G10L19/02 , H03M7/30

10.

Invention patent
FAST ALGORITHM FOR DERIVING ACOUSTIC PROTOTYPES FOR AUTOMATIC SPEECH RECOGNITION Unknown

Publication (announcement) number: CA2068041A1

Publication (announcement) date: 1993-01-17

Application number: CA2068041

Application date: 1992-05-05

Applicant: IBM

Inventor: BAHL LALIT R , BELLEGARDA JEROME R , DE SOUZA PETER V , NAHAMOO DAVID , PICHENY MICHAEL A

IPC: G10L19/00 , G10L15/02 , G10L15/06 , G10L5/00 , G10L5/04 , G10L5/06

Abstract: An apparatus for generating a set of acoustic prototype signals for encoding speech includes means for storing a training script model comprising a series of word-segment models. Each word-segment model comprises a series of elementary models. Means are provided for measuring the value of at least one feature of an utterance of the training script during each of a series of time intervals to produce a series of feature vector signals representing the feature values of the utterance. Means are provided for estimating at least one path through the training script model which would produce the entire series of measured feature vector signals. From the estimated path, the elementary model in the training script model which would produce each feature vector signal is estimated. The apparatus further comprises means for clustering the feature vector signals into a plurality of clusters. Each feature vector signal in a cluster corresponds to a single elementary model in a single location in a single word-segment model. Each cluster signal has a cluster value equal to an average of the feature values of all feature vectors in the signal. Finally, the apparatus includes means for storing a plurality of prototype vector signals. Each prototype vector signal corresponds to an elementary model, has an identifier, and comprises at least two partition values. The partition values are equal to combinations of the cluster values of one or more cluster signals corresponding to the elementary model.
