-
Publication No.: DE69129015D1
Publication date: 1998-04-09
Application No.: DE69129015
Filing date: 1991-12-10
Applicant: IBM
Inventor: NAHAMOO DAVID , DE SOUZA PETER VINCENT , BAHL LALIT R , PICHENY MICHAEL ALAN
Abstract: The present invention is related to speech recognition and particularly to a new type of vector quantizer and a new vector quantization technique in which the error rate of associating a sound with an incoming speech signal is drastically reduced. To achieve this end, the present invention technique groups the feature vectors in a space into different prototypes at least two of which represent a class of sound. Each of the prototypes may in turn have a number of subclasses or partitions. Each of the prototypes and their subclasses may be assigned respective identifying values. To identify an incoming speech feature vector, at least one of the feature values of the incoming feature vector is compared with the different values of the respective prototypes, or the subclasses of the prototypes. The class of sound whose group of prototypes, or at least one of the prototypes, whose combined value most closely matches the value of the feature value of the feature vector is deemed to be the class corresponding to the feature vector. The feature vector is then labeled with the identifier associated with that class.
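The labeling step described in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the patented method: the prototype values, the class names `AA` and `IY`, the Euclidean distance, and the "nearest single prototype wins" rule are all hypothetical choices made for the example.

```python
import math

# Hypothetical prototypes: each sound class owns at least two prototype vectors.
PROTOTYPES = {
    "AA": [(1.0, 1.0), (1.5, 0.5)],
    "IY": [(5.0, 5.0), (4.5, 6.0)],
}


def label_vector(feature, prototypes=PROTOTYPES):
    """Return the class whose closest prototype is nearest to `feature`.

    Assumption: closeness is plain Euclidean distance, and a class is
    scored by its single best-matching prototype.
    """
    best_label, best_dist = None, math.inf
    for label, protos in prototypes.items():
        # Score the class by its best-matching prototype.
        dist = min(math.dist(feature, p) for p in protos)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


print(label_vector((1.2, 0.8)))
print(label_vector((4.8, 5.5)))
```

Scoring a class by a combination of its prototypes, which the abstract also allows, would replace the `min` with a combined score over `protos`.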
-
Publication No.: DE3874049D1
Publication date: 1992-10-01
Application No.: DE3874049
Filing date: 1988-06-16
Applicant: IBM
Inventor: BAHL LALIT RAI , MERCER ROBERT LEROY , NAHAMOO DAVID
Abstract: Apparatus and method for training the statistics of a Markov Model speech recognizer to a subsequent speaker who utters part of a training text after the recognizer has been trained for the statistics of a reference speaker who utters a full training text. Where labels generated by an acoustic processor in response to uttered speech serve as outputs for Markov models, the present apparatus and method determine label output probabilities at transitions in the Markov models corresponding to the subsequent speaker where there is sparse training data. Specifically, label output probabilities for the subsequent speaker are re-parameterized based on confusion matrix entries having values indicative of the similarity between an lth label output of the subsequent speaker and a kth label output for the reference speaker. The label output probabilities based on re-parameterized data are combined with initialized label output probabilities to form "smoothed" label output probabilities which feature smoothed probability distributions. Based on label outputs generated when the subsequent speaker utters the shortened training text, "basic" label output probabilities computed by conventional methodology are linearly averaged against the smoothed label output probabilities to produce improved label output probabilities.
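The final step of this abstract, linearly averaging "basic" and "smoothed" label output probabilities, can be illustrated with a minimal sketch. The function name `interpolate`, the fixed `weight`, and the sample distributions are assumptions for the example; the abstract does not say how the averaging weight is chosen.

```python
def interpolate(basic, smoothed, weight=0.5):
    """Linearly average two probability distributions over the same labels.

    `weight` is the share given to the basic (sparse-data) estimate;
    its value here is a hypothetical choice.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    if len(basic) != len(smoothed):
        raise ValueError("distributions must cover the same labels")
    return [weight * b + (1.0 - weight) * s for b, s in zip(basic, smoothed)]


# Basic estimate from sparse data puts all mass on the first label;
# the smoothed estimate spreads some mass to the other labels.
improved = interpolate([1.0, 0.0, 0.0], [0.5, 0.25, 0.25], weight=0.5)
print(improved)
```

Because both inputs sum to one and the weights sum to one, the result is again a valid probability distribution.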
-
Publication No.: GB2506278A
Publication date: 2014-03-26
Application No.: GB201316988
Filing date: 2012-03-13
Applicant: IBM
Inventor: KONS ZVI , HOORY RON , NAHAMOO DAVID , BEN-DAVID SHAY
IPC: G10L21/003 , G10L19/018
Abstract: Method, system, and computer program product for voice transformation are provided. The method includes transforming a source speech using transformation parameters, and encoding information on the transformation parameters in an output speech using steganography, wherein the source speech can be reconstructed using the output speech and the information on the transformation parameters. A method for reconstructing voice transformation is also provided including: receiving an output speech of a voice transformation system wherein the output speech is transformed speech which has encoded information on the transformation parameters using steganography; extracting the information on the transformation parameters; and carrying out an inverse transformation of the output speech to obtain an approximation of an original source speech.
-
Publication No.: DE69423692D1
Publication date: 2000-05-04
Application No.: DE69423692
Filing date: 1994-09-08
Applicant: IBM
Inventor: EPSTEIN MARK EDWARD , GOPALAKRISHNAN PONANI S , NAHAMOO DAVID , PICHENY MICHAEL ALAN , SEDIVY JAN
Abstract: A speech coding apparatus and method uses classification rules to code an utterance while consuming fewer computing resources. The value of at least one feature of an utterance is measured during each of a series of successive time intervals to produce a series of feature vector signals representing the feature values. Classification rules map each feature vector signal from a set of all possible feature vector signals to exactly one of at least two different classes of prototype vector signals. Each class contains a plurality of prototype vector signals. According to the classification rules, a first feature vector signal is mapped to a first class of prototype vector signals. The closeness of the feature value of the first feature vector signal is compared to the parameter values of only the prototype vector signals in the first class of prototype vector signals to obtain prototype match scores for the first feature vector signal and each prototype vector signal in the first class. At least the identification value of at least the prototype vector signal having the best prototype match score is output as a coded utterance representation signal of the first feature vector signal.
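The two-stage coding described in this abstract (a classification rule picks one class, then only that class's prototypes are scored) can be sketched as below. Everything concrete here is hypothetical: the threshold rule in `classify`, the class names, the prototype identifiers `p1` to `p4`, and the use of Euclidean distance as the match score.

```python
import math

# Hypothetical classes, each holding several identified prototype vectors.
CLASSES = {
    "low": {"p1": (0.0, 0.0), "p2": (1.0, 2.0)},
    "high": {"p3": (5.0, 5.0), "p4": (7.0, 4.0)},
}


def classify(feature):
    """Toy classification rule: map every feature vector to exactly one class."""
    return "low" if feature[0] < 3.0 else "high"


def encode(feature):
    """Return the identifier of the best-matching prototype in the chosen class.

    Only the prototypes of one class are compared, which is where the
    saving in computation comes from.
    """
    candidates = CLASSES[classify(feature)]
    return min(candidates, key=lambda pid: math.dist(feature, candidates[pid]))


print(encode((0.8, 1.5)))
print(encode((6.5, 4.2)))
```

With two classes of two prototypes each, every call scores two prototypes instead of four; the saving grows with the number of classes.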
-
Publication No.: CA2345665A1
Publication date: 2000-04-13
Application No.: CA2345665
Filing date: 1999-10-01
Applicant: IBM
Inventor: MAES STEPHANE H , DEGENNARO STEVEN V , COFFMAN DANIEL , NAHAMOO DAVID , COMERFORD LIAM D , EPSTEIN EDWARD A , GOPALAKRISHNAN PONANI
IPC: G06F3/16 , G06F9/44 , G06F9/46 , G06F9/54 , G06F12/00 , G06F15/00 , G06F17/28 , G06F17/30 , G06F40/00 , G10L13/00 , G10L15/22 , G10L15/26 , H04M1/253 , H04M1/27 , H04M1/725 , H04M3/42 , H04M3/44 , H04M3/493 , H04M3/50 , H04M7/00 , H04M11/00 , G06F9/00
Abstract: A conversational computing system that provides a universal coordinated multi-modal conversational user interface (CUI) (10) across a plurality of conversationally aware applications (11) (i.e., applications that "speak" conversational protocols) and conventional applications (12). The conversationally aware applications (11) communicate with a conversational kernel (14) via conversational application API's (13). The conversational kernel (14) controls the dialog across applications and devices (local and networked) on the basis of their registered conversational capabilities and requirements and provides a unified conversational user interface and conversational services and behaviors. The conversational computing system may be built on top of a conventional operating system and API's (15) and conventional device hardware (16). The conversational kernel (14) handles all I/O processing and controls conversational engines (18). The conversational kernel (14) converts voice requests into queries and converts outputs and results into spoken messages using conversational engines (18) and conversational arguments (17). The conversational application API (13) conveys all the information for the conversational kernel (14) to transform queries into application calls and conversely convert output into speech, appropriately sorted before being provided to the user.
-
Publication No.: DE69129015T2
Publication date: 1998-10-29
Application No.: DE69129015
Filing date: 1991-12-10
Applicant: IBM
Inventor: NAHAMOO DAVID , DE SOUZA PETER VINCENT , BAHL LALIT R , PICHENY MICHAEL ALAN
Abstract: The present invention is related to speech recognition and particularly to a new type of vector quantizer and a new vector quantization technique in which the error rate of associating a sound with an incoming speech signal is drastically reduced. To achieve this end, the present invention technique groups the feature vectors in a space into different prototypes at least two of which represent a class of sound. Each of the prototypes may in turn have a number of subclasses or partitions. Each of the prototypes and their subclasses may be assigned respective identifying values. To identify an incoming speech feature vector, at least one of the feature values of the incoming feature vector is compared with the different values of the respective prototypes, or the subclasses of the prototypes. The class of sound whose group of prototypes, or at least one of the prototypes, whose combined value most closely matches the value of the feature value of the feature vector is deemed to be the class corresponding to the feature vector. The feature vector is then labeled with the identifier associated with that class.
-
Publication No.: DE69224253D1
Publication date: 1998-03-05
Application No.: DE69224253
Filing date: 1992-08-31
Applicant: IBM
Inventor: BAHL LALIT R , BELLEGARDA JEROME R , EPSTEIN EDWARD ADAM , LUCASSEN JOHN M , NAHAMOO DAVID , PICHENY MICHAEL ALAN
-
Publication No.: DE69221403T2
Publication date: 1998-02-19
Application No.: DE69221403
Filing date: 1992-05-20
Applicant: IBM
Inventor: BAHL LALIT R , BELLEGARDA JEROME R , DE SOUZA PETER VINCENT , NAHAMOO DAVID , PICHENY MICHAEL ALAN
Abstract: An apparatus for generating a set of acoustic prototype signals for encoding speech includes means for storing a training script model comprising a series of word-segment models. Each word-segment model comprises a series of elementary models. Means are provided for measuring the value of at least one feature of an utterance of the training script during each of a series of time intervals to produce a series of feature vector signals representing the feature values of the utterance. Means are provided for estimating at least one path through the training script model which would produce the entire series of measured feature vector signals. From the estimated path, the elementary model in the training script model which would produce each feature vector signal is estimated. The apparatus further comprises means for clustering the feature vector signals into a plurality of clusters. Each feature vector signal in a cluster corresponds to a single elementary model in a single location in a single word-segment model. Each cluster signal has a cluster value equal to an average of the feature values of all feature vectors in the signal. Finally, the apparatus includes means for storing a plurality of prototype vector signals. Each prototype vector signal corresponds to an elementary model, has an identifier, and comprises at least two partition values. The partition values are equal to combinations of the cluster values of one or more cluster signals corresponding to the elementary model.
-
Publication No.: DE3876379T2
Publication date: 1993-06-09
Application No.: DE3876379
Filing date: 1988-09-16
Applicant: IBM
Inventor: PICHENY MICHAEL ALAN , NAHAMOO DAVID , DE SOUZA PETER VINCENT , BROWN PETER FITZHUGH
-
Publication No.: DE102012220130B4
Publication date: 2019-04-04
Application No.: DE102012220130
Filing date: 2012-11-06
Applicant: IBM
Inventor: BEN-DAVID SHAY , CONNELL JONATHAN HUDSON , HOORY RON , NAHAMOO DAVID , SICCONI ROBERTO
IPC: G06F21/32
Abstract: A method, a system, and a computer program product for access to secure facilities are provided. The method may include: receiving a request for access to a secure facility from a mobile device; authenticating a user by means of multi-factor biometric authentication using data from the mobile device; obtaining data from one or more stationary sensor devices at a location in physical proximity to the secure facility; cross-checking the data from the mobile device against data from the one or more stationary sensor devices; and granting access to the secure facility if the user authentication and the cross-check succeed. In the cross-check, data from the one or more stationary sensor devices can be used to determine whether the access request from the mobile device originates near the secure facility. The method may include: obtaining data from one or more stationary sensor devices and using the data to provide authentication data; and cross-checking some of the authentication data from the mobile device against some of the authentication data from the one or more stationary sensor devices.
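The access decision in this abstract combines two conditions: multi-factor biometric authentication from the mobile device, and a cross-check against stationary sensors near the facility. A minimal sketch of that decision logic follows. The `AccessRequest` fields, the two-factor threshold, and the location strings are hypothetical; the abstract does not specify how factors are counted or how proximity is established.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AccessRequest:
    # Number of biometric factors (e.g. voice, face) verified from mobile data.
    biometric_factors_passed: int
    # Location the mobile device reports for itself.
    mobile_location: str
    # Location where a stationary sensor observed the user, if any.
    sensor_location: Optional[str]


def grant_access(request, facility_location, required_factors=2):
    """Grant access only if authentication AND the cross-check both succeed."""
    authenticated = request.biometric_factors_passed >= required_factors
    cross_checked = (
        request.sensor_location == facility_location
        and request.mobile_location == facility_location
    )
    return authenticated and cross_checked


ok = AccessRequest(2, "door-7", "door-7")
remote = AccessRequest(2, "door-7", None)  # no stationary sensor confirms presence
print(grant_access(ok, "door-7"), grant_access(remote, "door-7"))
```

Requiring both conditions means a request that passes biometric checks but is not confirmed by a nearby stationary sensor is still refused.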