-
Publication No.: CA2068780C
Publication Date: 1998-12-22
Application No.: CA2068780
Application Date: 1992-05-15
Applicant: IBM
Inventor: BROWN PETER F , COCKE JOHN , DELLA PIETRA STEPHEN A , DELLA PIETRA VINCENT J , JELINEK FREDERICK , LAI JENNIFER C , MERCER ROBERT L
Abstract: The present invention is a system for translating text from a first source language into a second target language. The system assigns probabilities or scores to various target-language translations and then displays or otherwise makes available the highest-scoring translations. The source text is first transduced into one or more intermediate structural representations. From these intermediate source structures, a set of intermediate target-structure hypotheses is generated. These hypotheses are scored by two different models: a language model, which assigns a probability or score to an intermediate target structure, and a translation model, which assigns a probability or score to the event that an intermediate target structure is translated into an intermediate source structure. Scores from the translation model and language model are combined into a combined score for each intermediate target-structure hypothesis. Finally, a set of target-text hypotheses is produced by transducing the highest-scoring target-structure hypotheses into portions of text in the target language. The system can either run in batch mode, in which case it translates source-language text into a target language without human assistance, or it can function as an aid to a human translator. When functioning as an aid to a human translator, the human may simply select from the various translation hypotheses provided by the system, or he may optionally provide hints or constraints on how to perform one or more of the stages of source transduction, hypothesis generation and target transduction.
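The scoring step described in this abstract can be illustrated with a minimal sketch. It assumes the two model scores are probabilities combined by multiplication (summed in log space); the abstract only says they are "combined", so that choice is an assumption. The function names, the dictionary-based models, and all numbers are hypothetical toy values, and plain strings stand in for the intermediate structures.

```python
import math

def combined_score(lm_prob, tm_prob):
    # Assumption: combine the two scores by multiplying probabilities,
    # computed here as a sum of logs.
    return math.log(lm_prob) + math.log(tm_prob)

def rank_hypotheses(source, hypotheses, language_model, translation_model):
    """Return target hypotheses sorted from best to worst combined score."""
    return sorted(
        hypotheses,
        key=lambda target: combined_score(
            language_model[target], translation_model[(source, target)]
        ),
        reverse=True,
    )

# Hypothetical toy models. The language model scores a target hypothesis;
# the translation model scores the source given that target hypothesis.
source = "le chien"
language_model = {"the dog": 0.02, "dog the": 0.0001, "the hound": 0.005}
translation_model = {
    ("le chien", "the dog"): 0.5,
    ("le chien", "dog the"): 0.5,
    ("le chien", "the hound"): 0.1,
}

ranking = rank_hypotheses(source, list(language_model), language_model, translation_model)
print(ranking)  # ['the dog', 'the hound', 'dog the']
```

In this toy setup the translation model cannot distinguish "the dog" from "dog the"; the language model is what penalizes the ill-formed word order.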
-
Publication No.: CA2125200C
Publication Date: 1999-03-02
Application No.: CA2125200
Application Date: 1994-06-06
Applicant: IBM
Inventor: BERGER ADAM L , MERCER ROBERT L , BROWN PETER F , DELLA PIETRA STEPHEN A , DELLA PIETRA VINCENT J , KEHLER ANDREW SCOTT
IPC: G06F17/28
Abstract: An apparatus for translating a series of source words in a first language to a series of target words in a second language. For an input series of source words, at least two target hypotheses, each comprising a series of target words, are generated. Each target word has a context comprising at least one other word in the target hypothesis. For each target hypothesis, a language model match score comprises an estimate of the probability of occurrence of the series of words in the target hypothesis. At least one alignment connecting each source word with at least one target word in the target hypothesis is identified. For each source word and each target hypothesis, a word match score comprises an estimate of the conditional probability of occurrence of the source word, given the target word in the target hypothesis which is connected to the source word and given the context in the target hypothesis of the target word which is connected to the source word. For each target hypothesis, a translation match score comprises a combination of the word match scores for the target hypothesis and the source words in the input series of source words. A target hypothesis match score comprises a combination of the language model match score for the target hypothesis and the translation match score for the target hypothesis. The target hypothesis having the best target hypothesis match score is output.
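The context-dependent word match score described above can be sketched as follows. This is an illustration under several assumptions that the abstract does not state: the context is taken to be the single preceding target word (with a start marker for the first word), each source word is connected to exactly one target word, and scores are combined by multiplying probabilities. The table `WORD_PROBS`, the floor value, and all numbers are hypothetical.

```python
import math

# Hypothetical toy table: P(source word | connected target word, context word).
WORD_PROBS = {
    ("prendre", "take", "to"): 0.6,
    ("prendre", "make", "to"): 0.1,
    ("une", "a", "take"): 0.8,
    ("une", "a", "make"): 0.8,
    ("pause", "break", "a"): 0.9,
}

def word_match_score(source_word, target_word, context, floor=1e-6):
    # Unseen triples fall back to a small floor probability (an assumption).
    return WORD_PROBS.get((source_word, target_word, context), floor)

def translation_match_score(source_words, target_words, alignment):
    """alignment[i] is the index of the target word connected to source word i."""
    log_score = 0.0
    for i, source_word in enumerate(source_words):
        j = alignment[i]
        # Simplifying assumption: the context is the preceding target word.
        context = target_words[j - 1] if j > 0 else "<s>"
        log_score += math.log(word_match_score(source_word, target_words[j], context))
    return log_score

def hypothesis_match_score(source_words, target_words, alignment, lm_prob):
    # Language model match score combined with the translation match score.
    return math.log(lm_prob) + translation_match_score(source_words, target_words, alignment)

source_words = ["prendre", "une", "pause"]
# Each candidate: (target words, alignment, language model probability).
candidates = [
    (["to", "take", "a", "break"], [1, 2, 3], 0.001),
    (["to", "make", "a", "break"], [1, 2, 3], 0.002),
]
best = max(candidates, key=lambda c: hypothesis_match_score(source_words, *c))
print(" ".join(best[0]))  # to take a break
```

Here the second candidate has the higher language model probability, but the word match scores favor "take" as the translation of "prendre", so the first candidate wins on the combined score.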
-
Publication No.: CA2125200A1
Publication Date: 1995-04-29
Application No.: CA2125200
Application Date: 1994-06-06
Applicant: IBM
Inventor: BERGER ADAM L , BROWN PETER F , DELLA PIETRA STEPHEN A , DELLA PIETRA VINCENT J , KEHLER ANDREW S , MERCER ROBERT L
-
Publication No.: CA2091912C
Publication Date: 1996-12-03
Application No.: CA2091912
Application Date: 1993-03-18
Applicant: IBM
Inventor: BROWN PETER F , DELLA PIETRA STEPHEN A , DELLA PIETRA VINCENT J , JELINEK FREDERICK , MERCER ROBERT L
Abstract: A speech recognition system displays a source text of one or more words in a source language. The system has an acoustic processor for generating a sequence of coded representations of an utterance to be recognized. The utterance comprises a series of one or more words in a target language different from the source language. A set of one or more speech hypotheses, each comprising one or more words from the target language, are produced. Each speech hypothesis is modeled with an acoustic model. An acoustic match score for each speech hypothesis comprises an estimate of the closeness of a match between the acoustic model of the speech hypothesis and the sequence of coded representations of the utterance. A translation match score for each speech hypothesis comprises an estimate of the probability of occurrence of the speech hypothesis given the occurrence of the source text. A hypothesis score for each hypothesis comprises a combination of the acoustic match score and the translation match score. At least one word of one or more speech hypotheses having the best hypothesis scores is output as a recognition result.
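As a rough illustration of the hypothesis score in this abstract, the sketch below combines an acoustic match score with a translation match score and picks the best speech hypothesis. It assumes both scores are probability-like values in (0, 1] that are combined by multiplication; the abstract says only that they are combined. The function names and all numbers are hypothetical.

```python
import math

def hypothesis_score(acoustic_score, translation_prob):
    # Assumption: multiply the two scores (sum of logs).
    return math.log(acoustic_score) + math.log(translation_prob)

def recognize(hypotheses):
    """hypotheses maps each speech hypothesis to
    (acoustic match score, P(hypothesis | displayed source text))."""
    return max(hypotheses, key=lambda h: hypothesis_score(*hypotheses[h]))

# Hypothetical toy values for a displayed source text "la maison blanche"
# and an utterance spoken in the target language.
hypotheses = {
    "the white house": (0.30, 0.60),
    "the wide house": (0.35, 0.01),
}
result = recognize(hypotheses)
print(result)  # the white house
```

In this example the acoustically closer hypothesis loses because it is an unlikely translation of the displayed source text.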
-
Publication No.: CA2091912A1
Publication Date: 1993-11-22
Application No.: CA2091912
Application Date: 1993-03-18
Applicant: IBM
Inventor: BROWN PETER F , DELLA PIETRA STEPHEN A , DELLA PIETRA VINCENT J , JELINEK FREDERICK , MERCER ROBERT L
Abstract: A speech recognition system displays a source text of one or more words in a source language. The system has an acoustic processor for generating a sequence of coded representations of an utterance to be recognized. The utterance comprises a series of one or more words in a target language different from the source language. A set of one or more speech hypotheses, each comprising one or more words from the target language, are produced. Each speech hypothesis is modeled with an acoustic model. An acoustic match score for each speech hypothesis comprises an estimate of the closeness of a match between the acoustic model of the speech hypothesis and the sequence of coded representations of the utterance. A translation match score for each speech hypothesis comprises an estimate of the probability of occurrence of the speech hypothesis given the occurrence of the source text. A hypothesis score for each hypothesis comprises a combination of the acoustic match score and the translation match score. At least one word of one or more speech hypotheses having the best hypothesis scores is output as a recognition result.
-
Publication No.: CA2068780A1
Publication Date: 1993-01-26
Application No.: CA2068780
Application Date: 1992-05-15
Applicant: IBM
-