Patent search ap:("IBM") AND inv:"MERCER ROBERT L" Page 2

11.

Invention patent
TRAINING OF MARKOV MODELS USED IN A SPEECH RECOGNITION SYSTEM (status: unknown)

Publication (announcement) number: CA1262188A

Publication (announcement) date: 1989-10-03

Application number: CA528790

Application date: 1987-02-02

Applicant: IBM

Inventor: BAHL LALIT R , BROWN PETER F , DESOUZA PETER V , MERCER ROBERT L

IPC: G10L11/00 , G10L15/14 , G10L5/06

Abstract: IMPROVING THE TRAINING OF MARKOV MODELS USED IN A SPEECH RECOGNITION SYSTEM. In a word, or speech, recognition system for decoding a vocabulary word from outputs selected from an alphabet of outputs in response to a communicated word input, wherein each word in the vocabulary is represented by a baseform of at least one probabilistic finite state model, wherein each probabilistic model has transition probability items and output probability items, and wherein a value is stored for each of at least some probability items, the present invention relates to apparatus and method for determining probability values for probability items by biasing at least some of the stored values to enhance the likelihood that outputs generated in response to communication of a known word input are produced by the baseform for the known word relative to the respective likelihood of the generated outputs being produced by the baseform for at least one other word. Specifically, the current values of counts, from which probability items are derived, are adjusted by uttering a known word and determining how often probability events occur relative to (a) the model corresponding to the known uttered "correct" word and (b) the model of at least one other "incorrect" word. The current count values are increased based on the event occurrences relating to the correct word and are reduced based on the event occurrences relating to the incorrect word or words.
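
The count adjustment described in this abstract can be illustrated with a minimal sketch. The function names, the `step` size, the `floor` value, and the event labels `t1`/`t2` are all hypothetical; the abstract does not say how large the adjustments are or how counts are kept positive, so those choices are assumptions.

```python
def adjust_counts(counts, correct_events, incorrect_events, step=1.0, floor=1e-6):
    """Return adjusted counts (all parameter names are illustrative assumptions).

    counts           -- current count per probability event
    correct_events   -- how often each event occurred under the correct word's model
    incorrect_events -- how often each event occurred under an incorrect word's model
    """
    new_counts = dict(counts)
    # Raise counts for events observed with the correct word's model.
    for event, occurrences in correct_events.items():
        new_counts[event] = new_counts.get(event, 0.0) + step * occurrences
    # Lower counts for events observed with the incorrect word's model,
    # keeping them above a small floor (an assumption, not from the abstract).
    for event, occurrences in incorrect_events.items():
        new_counts[event] = max(floor, new_counts.get(event, 0.0) - step * occurrences)
    return new_counts


def normalize(counts):
    """Derive probabilities from the counts of events that share one state."""
    total = sum(counts.values())
    return {event: c / total for event, c in counts.items()}


counts = {"t1": 4.0, "t2": 6.0}
adjusted = adjust_counts(counts, correct_events={"t1": 2.0}, incorrect_events={"t2": 1.0})
probabilities = normalize(adjusted)
```

Here the event favoured by the correct word gains probability mass at the expense of the event favoured by the incorrect word.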

12.

Invention patent
SPEECH RECOGNITION EMPLOYING A SET OF MARKOV MODELS THAT INCLUDES MARKOV MODELS REPRESENTING TRANSITIONS TO AND FROM SILENCE (status: unknown)

Publication (announcement) number: CA1259411A

Publication (announcement) date: 1989-09-12

Application number: CA504807

Application date: 1986-03-24

Applicant: IBM

Inventor: BAHL LALIT R , DESOUZA PETER V , MERCER ROBERT L , PICHENY MICHAEL A

IPC: G10L9/02

Abstract: The present invention relates to apparatus and method for constructing word baseforms which can be matched against a string of generated acoustic labels which includes: forming a set of phonetic phone machines, wherein each phone machine has (i) a plurality of states, (ii) a plurality of transitions each of which extends from a state to a state, (iii) a stored probability for each transition, and (iv) stored label output probabilities, each label output probability corresponding to the probability of said each phone machine producing a corresponding label; wherein said set of phonetic machines is formed to include a subset of onset phone machines, the stored probabilities of each onset phone machine corresponding to at least one phonetic element being uttered at the beginning of a speech segment; and wherein said set of phonetic machines is formed to include a subset of trailing phone machines, the stored probabilities of each trailing phone machine corresponding to at least one single phonetic element being uttered at the end of a speech segment. Word baseforms are constructed by concatenating phone machines selected from the set.

13.

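Invention patent
APPARATUS AND METHOD FOR PERFORMING ACOUSTIC MATCHING (status: unknown)

Publication (announcement) number: CA1236577A

Publication (announcement) date: 1988-05-10

Application number: CA494697

Application date: 1985-11-06

Applicant: IBM

Inventor: BAHL LALIT R , MERCER ROBERT L , DEGENNARO STEVEN V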

IPC: G10L5/00

Abstract: The invention herein provides, in a speech recognition system which represents each vocabulary word or a portion thereof by at least one sequence of phones wherein each phone corresponds to a respective phone machine, each phone machine having associated therewith (a) at least one transition and (b) actual label output probabilities, each actual label probability representing the probability that a subject label is generated at a given transition in the phone machine, a method of performing an acoustic match between phones and a string of labels produced by an acoustic processor in response to a speech input, the method comprising the steps of: forming simplified phone machines which includes the step of replacing by a single specific value the actual label probabilities for a given label at all transitions at which the given label may be generated in a particular phone machine; and determining the probability of a phone generating the labels in the string based on the simplified phone machine corresponding thereto.
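
As a rough illustration of the simplification step, the sketch below collapses the per-transition label probabilities of one phone machine into a single value per label. The abstract does not say which value is used; taking the maximum over transitions is an assumption made here, and the scoring function is a toy stand-in that ignores transition probabilities entirely. All names are hypothetical.

```python
def simplify_label_probs(label_probs_by_transition):
    """Collapse {transition: {label: prob}} into {label: single value}.

    Assumption: the single value is the maximum over all transitions at
    which the label may be generated. The abstract does not specify this.
    """
    simplified = {}
    for probs in label_probs_by_transition.values():
        for label, p in probs.items():
            simplified[label] = max(simplified.get(label, 0.0), p)
    return simplified


def approximate_match_score(simplified, labels):
    """Toy score: product of the simplified label values along the string.

    Transition probabilities are deliberately left out of this sketch.
    """
    score = 1.0
    for label in labels:
        score *= simplified.get(label, 0.0)
    return score


phone = {
    "t1": {"A": 0.5, "B": 0.25},
    "t2": {"A": 0.25, "C": 0.5},
}
simplified = simplify_label_probs(phone)
score = approximate_match_score(simplified, ["A", "C"])
```

Because each label now has one value regardless of transition, the score no longer depends on which transition produced a label, which is what makes the simplified match cheaper to evaluate.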

14.

Invention patent
Language Translation Apparatus and Method Using Context-Based Translation Models (status: unknown)

Publication (announcement) number: CA2125200A1

Publication (announcement) date: 1995-04-29

Application number: CA2125200

Application date: 1994-06-06

Applicant: IBM

Inventor: BERGER ADAM L , BROWN PETER F , DELLA PIETRA STEPHEN A , DELLA PIETRA VINCENT J , KEHLER ANDREW S , MERCER ROBERT L

IPC: G06F17/28 , G06F15/38

15.

Invention patent
CONSTRUCTING MARKOV MODELS OF WORDS FROM MULTIPLE UTTERANCES (status: unknown)

Publication (announcement) number: CA1241751A

Publication (announcement) date: 1988-09-06

Application number: CA504801

Application date: 1986-03-24

Applicant: IBM

Inventor: BAHL LALIT R , DESOUZA PETER V , MERCER ROBERT L , PICHENY MICHAEL A

IPC: G10L15/14 , G10L5/06

Abstract: The present invention addresses the problem of constructing fenemic baseforms which take into account variations in pronunciation of words from one utterance thereof to another. Specifically, the invention relates to a method of constructing a fenemic baseform for a word in a vocabulary of word segments including the steps of: (a) transforming multiple utterances of the word into respective strings of fenemes; (b) defining a set of fenemic Markov model phone machines; (c) determining the best single phone machine P1 for producing the multiple feneme strings; (d) determining the best two phone baseform of the form P1P2 or P2P1 for producing the multiple feneme strings; (e) aligning the best two phone baseform against each feneme string; (f) splitting each feneme string into a left portion and a right portion with the left portion corresponding to the first phone machine of the two phone baseform and the right portion corresponding to the second phone machine of the two phone baseform; (g) identifying each left portion as a left substring and each right portion as a right substring; (h) processing the set of left substrings and the set of right substrings in the same manner as the set of feneme strings corresponding to the multiple utterances including the further step of inhibiting further splitting of a substring when the single phone baseform thereof has a higher probability of producing the substring than does the best two phone baseform; and (k) concatenating the unsplit single phones in an order corresponding to the order of the feneme substrings to which they correspond.
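
The recursive splitting in steps (c) through (k) can be sketched as follows. This is a toy under strong assumptions: each hypothetical phone machine is identified with one feneme and emits it with probability `HIT` (any other feneme with `MISS`), feneme strings are plain Python strings, and alignment is reduced to picking the best cut point. None of these modelling choices come from the abstract.

```python
import math

HIT, MISS = 0.9, 0.05  # hypothetical toy output probabilities


def log_prob(phone, fenemes):
    """Log-probability that one toy phone machine produces the feneme string."""
    return sum(math.log(HIT if f == phone else MISS) for f in fenemes)


def total(phone, strings):
    """Joint log-probability of a single-phone baseform over all strings."""
    return sum(log_prob(phone, s) for s in strings)


def best_split(first, second, s):
    """Best (score, cut) for the two-phone baseform first+second on string s."""
    return max((log_prob(first, s[:k]) + log_prob(second, s[k:]), k)
               for k in range(1, len(s)))


def build_baseform(strings, phones):
    # Step (c): best single phone machine P1 for all strings.
    p1 = max(phones, key=lambda p: total(p, strings))
    if len(phones) < 2 or any(len(s) < 2 for s in strings):
        return [p1]  # nothing left to split
    # Step (d): best two-phone baseform of the form P1P2 or P2P1.
    candidates = [pair for p2 in phones if p2 != p1
                  for pair in ((p1, p2), (p2, p1))]

    def pair_score(pair):
        return sum(best_split(pair[0], pair[1], s)[0] for s in strings)

    best_pair = max(candidates, key=pair_score)
    # Step (h): inhibit splitting when the single phone is more probable.
    if total(p1, strings) > pair_score(best_pair):
        return [p1]
    # Steps (e) to (g): align, then split each string into left and right parts.
    cuts = [best_split(best_pair[0], best_pair[1], s)[1] for s in strings]
    left = [s[:k] for s, k in zip(strings, cuts)]
    right = [s[k:] for s, k in zip(strings, cuts)]
    # Steps (h) and (k): recurse on both sides and concatenate in order.
    return build_baseform(left, phones) + build_baseform(right, phones)


baseform = build_baseform(["AABB", "AAAB", "AAB"], ["A", "B", "C"])
```

With these three toy utterances the recursion splits once and then stops on each side, so the resulting baseform is the phone sequence A followed by B.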

16.

Invention patent
SPEECH RECOGNITION SYSTEM FOR NATURAL LANGUAGE TRANSLATION (status: unknown)

Publication (announcement) number: CA2091912C

Publication (announcement) date: 1996-12-03

Application number: CA2091912

Application date: 1993-03-18

Applicant: IBM

Inventor: BROWN PETER F , DELLA PIETRA STEPHEN A , DELLA PIETRA VINCENT J , JELINEK FREDERICK , MERCER ROBERT L

IPC: G10L15/18 , G10L5/02

Abstract: A speech recognition system displays a source text of one or more words in a source language. The system has an acoustic processor for generating a sequence of coded representations of an utterance to be recognized. The utterance comprises a series of one or more words in a target language different from the source language. A set of one or more speech hypotheses, each comprising one or more words from the target language, is produced. Each speech hypothesis is modeled with an acoustic model. An acoustic match score for each speech hypothesis comprises an estimate of the closeness of a match between the acoustic model of the speech hypothesis and the sequence of coded representations of the utterance. A translation match score for each speech hypothesis comprises an estimate of the probability of occurrence of the speech hypothesis given the occurrence of the source text. A hypothesis score for each hypothesis comprises a combination of the acoustic match score and the translation match score. At least one word of one or more speech hypotheses having the best hypothesis scores is output as a recognition result.
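
A minimal sketch of the score combination follows. The abstract only says the hypothesis score is "a combination" of the two match scores; summing their logarithms (that is, multiplying the two probabilities) is an assumption, and the example hypotheses and numbers are invented for illustration.

```python
import math


def hypothesis_score(acoustic_score, translation_score):
    """Combine the two match scores in the log domain (assumed combination)."""
    return math.log(acoustic_score) + math.log(translation_score)


def best_hypothesis(hypotheses):
    """Pick the text of the best hypothesis from (text, acoustic, translation) triples."""
    return max(hypotheses, key=lambda h: hypothesis_score(h[1], h[2]))[0]


# Invented example: two acoustically similar target-language hypotheses,
# only one of which is a plausible translation of the displayed source text.
hypotheses = [
    ("the key", 0.3, 0.6),
    ("the quay", 0.4, 0.01),
]
result = best_hypothesis(hypotheses)
```

The second hypothesis has the better acoustic match, but its low translation match score lets the first one win once the two scores are combined.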

17.

Invention patent
APPARATUS AND METHOD FOR DETERMINING A LIKELY WORD SEQUENCE FROM LABELS GENERATED BY AN ACOUSTIC PROCESSOR (status: unknown)

Publication (announcement) number: CA1248633A

Publication (announcement) date: 1989-01-10

Application number: CA504800

Application date: 1986-03-24

Applicant: IBM

Inventor: BAHL LALIT R , JELINEK FREDERICK , MERCER ROBERT L

IPC: G10L15/14 , G06F5/00

Abstract: The present invention addresses the problem of determining, in a speech recognition context, a likely sequence or path of words from a plurality of word paths given a string of labels that are generated at successive intervals. The invention features multiple stack decoding and a unique strategy for extending one word path at a time without undue reliance on word path length. With multiple stack decoding, a stack is associated with each label of the label string. Word paths that most likely end at a given label are assigned to the stack corresponding to the given label and are ordered according to likelihood at the given label. The strategy of deciding which word path to extend includes the forming of a likelihood envelope against which the word paths are compared to determine if a word path is sufficiently likely to be extended. From among the word paths that are found to be extendible, the word path of highest likelihood in the earliest stack (i.e., the shortest most likely word path) is selected for extension. After a word path is extended, it is deleted from its stack and the word paths extended therefrom are entered into appropriate stacks.
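
The selection step of this strategy can be sketched as below. The abstract does not define how the likelihood envelope is formed; here it is assumed to be, at each label, the best log-likelihood of any path at that label minus a fixed `margin`. The `WordPath` structure, its field names, and the example numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class WordPath:
    words: tuple
    end: int       # index of the label at which the path most likely ends (its stack)
    profile: dict  # label index -> log-likelihood of the path at that label


def likelihood_envelope(paths, margin):
    """Assumed envelope: best log-likelihood at each label, lowered by a margin."""
    best = {}
    for path in paths:
        for t, ll in path.profile.items():
            best[t] = max(best.get(t, float("-inf")), ll)
    return {t: ll - margin for t, ll in best.items()}


def select_path_to_extend(paths, margin=2.0):
    """Return the most likely extendible path in the earliest stack, or None."""
    envelope = likelihood_envelope(paths, margin)
    extendible = [p for p in paths if p.profile[p.end] >= envelope[p.end]]
    if not extendible:
        return None
    # Earliest stack first, then highest likelihood within that stack.
    return min(extendible, key=lambda p: (p.end, -p.profile[p.end]))


paths = [
    WordPath(("a",), end=2, profile={2: -6.0}),
    WordPath(("there",), end=3, profile={2: -3.0, 3: -3.5}),
]
chosen = select_path_to_extend(paths)
```

In this example the path in the earlier stack falls below the envelope at its end label, so the path in the later stack is the one selected for extension. Deleting the extended path and entering its extensions into their stacks is left out of the sketch.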

18.

Invention patent
APPARATUS AND METHOD FOR PRODUCING A LIST OF LIKELY CANDIDATE WORDS CORRESPONDING TO A SPOKEN INPUT (status: unknown)

Publication (announcement) number: CA1246229A

Publication (announcement) date: 1988-12-06

Application number: CA504806

Application date: 1986-03-24

Applicant: IBM

Inventor: BAHL LALIT R , DESOUZA PETER V , MERCER ROBERT L

IPC: G06F1/00

Abstract: A speech recognition apparatus and method of selecting likely words from a vocabulary of words, wherein each word is represented by a sequence of at least one probabilistic finite state phone machine and wherein an acoustic processor generates acoustic labels in response to a spoken input, include: (a) forming a first table in which each label in the alphabet provides a vote for each word in the vocabulary, each label vote for a subject word indicating the likelihood of the subject word producing the label providing the vote; (b) forming a second table in which each label is assigned a penalty for each word in the vocabulary, the penalty assigned to a given label for a given word being indicative of the likelihood of the given label not being produced according to the model for the given word; and (c) for a given string of labels, determining the likelihood of a particular word which includes the step of combining the votes of all labels in the string for the particular word together with the penalties of all labels not in the string for the particular word.
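
Steps (a) through (c) can be sketched with two small tables. The abstract does not give formulas for votes or penalties; using log P(label | word) as the vote and log(1 - P(label | word)) as the penalty is an assumption, as are the function names and the example probabilities.

```python
import math


def build_tables(label_probs):
    """Build vote and penalty tables from {word: {label: P(label | word)}}.

    Assumed definitions: vote = log P, penalty = log(1 - P).
    Probabilities must lie strictly between 0 and 1.
    """
    votes, penalties = {}, {}
    for word, probs in label_probs.items():
        votes[word] = {label: math.log(p) for label, p in probs.items()}
        penalties[word] = {label: math.log(1.0 - p) for label, p in probs.items()}
    return votes, penalties


def word_score(word, labels, votes, penalties, alphabet):
    """Votes of all labels in the string plus penalties of all labels not in it."""
    present = set(labels)
    score = sum(votes[word][label] for label in labels)
    score += sum(penalties[word][label] for label in alphabet if label not in present)
    return score


alphabet = ["A", "B", "C"]
label_probs = {
    "cat": {"A": 0.6, "B": 0.3, "C": 0.1},
    "dog": {"A": 0.1, "B": 0.3, "C": 0.6},
}
votes, penalties = build_tables(label_probs)
labels = ["A", "B", "A"]
ranked = sorted(label_probs,
                key=lambda w: word_score(w, labels, votes, penalties, alphabet),
                reverse=True)
```

Sorting the vocabulary by this score yields the list of likely candidate words; since both tables are precomputed, scoring a word needs only lookups and additions.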

19.

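Invention patent
FENEME-BASED MARKOV MODELS FOR WORDS (status: unknown)

Publication (announcement) number: CA1236578A

Publication (announcement) date: 1988-05-10

Application number: CA496161

Application date: 1985-11-26

Applicant: IBM

Inventor: BAHL LALIT R , DESOUZA PETER V , MERCER ROBERT L , PICHENY MICHAEL A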

IPC: G10L5/00

Abstract: In a speech recognition system, apparatus and method for modelling words with label-based Markov models are disclosed. The modelling includes: entering a first speech input, corresponding to words in a vocabulary, into an acoustic processor which converts each spoken word into a sequence of standard labels, where each standard label corresponds to a sound type assignable to an interval of time; representing each standard label as a probabilistic model which has a plurality of states, at least one transition from a state to a state, and at least one settable output probability at some transitions; entering selected acoustic inputs into an acoustic processor which converts the selected acoustic inputs into personalized labels, each personalized label corresponding to a sound type assigned to an interval of time; and setting each output probability as the probability of the standard label represented by a given model producing a particular personalized label at a given transition in the given model. The present invention addresses the problem of generating models of words simply and automatically in a speech recognition system.
