Method and system for encoding and decoding speech signals

    Publication No.: GB2357231B

    Publication date: 2004-06-09

    Application No.: GB0023864

    Filing date: 2000-09-29

    Applicant: IBM

    Abstract: A method for encoding a digitized speech signal so as to generate data capable of being decoded as speech. A digitized speech signal is first converted to a series of feature vectors using, for example, known Mel-frequency cepstral coefficient (MFCC) techniques. At successive instances of time, a respective pitch value of the digitized speech signal is computed, and successive acoustic vectors, each containing the respective pitch value and feature vector, are compressed so as to derive a bit stream from them. A suitable decoder reverses the operation so as to extract the feature vectors and pitch values, thus allowing speech reproduction and playback. In addition, speech recognition is possible using the decompressed feature vectors, with no impairment of the recognition accuracy and no computational overhead.
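The abstract does not say how the acoustic vectors are compressed. As a minimal sketch only, the example below assumes each frame already has a pitch value and a feature vector (the MFCC computation is not shown) and packs them with plain 8-bit uniform quantization. The value ranges and the function names `encode_frames` and `decode_frames` are hypothetical and chosen for illustration; they are not taken from the patent.

```python
from typing import List, Tuple

# Hypothetical value ranges, assumed only for this illustration.
PITCH_RANGE = (0.0, 500.0)     # Hz; 0.0 is used here to mark an unvoiced frame
FEATURE_RANGE = (-50.0, 50.0)  # assumed bounds on each feature coefficient

Frame = Tuple[float, List[float]]  # (pitch value, feature vector)


def _quantize(x: float, lo: float, hi: float) -> int:
    """Clamp x to [lo, hi] and map it to an integer in 0..255."""
    x = min(max(x, lo), hi)
    return round((x - lo) / (hi - lo) * 255)


def _dequantize(q: int, lo: float, hi: float) -> float:
    """Map an integer in 0..255 back to a value in [lo, hi]."""
    return lo + q / 255 * (hi - lo)


def encode_frames(frames: List[Frame]) -> bytes:
    """Pack each acoustic vector (pitch followed by features) into bytes."""
    out = bytearray()
    for pitch, features in frames:
        out.append(_quantize(pitch, *PITCH_RANGE))
        out.extend(_quantize(c, *FEATURE_RANGE) for c in features)
    return bytes(out)


def decode_frames(stream: bytes, n_features: int) -> List[Frame]:
    """Reverse encode_frames, recovering approximate pitch and features."""
    step = 1 + n_features  # one pitch byte plus one byte per feature
    frames = []
    for i in range(0, len(stream), step):
        chunk = stream[i:i + step]
        pitch = _dequantize(chunk[0], *PITCH_RANGE)
        features = [_dequantize(q, *FEATURE_RANGE) for q in chunk[1:]]
        frames.append((pitch, features))
    return frames
```

The decoded feature vectors could then be passed to a recognizer and the decoded pitch values to a synthesizer; a practical codec would likely use a more efficient quantizer than this one.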

    FAST FREQUENCY-DOMAIN PITCH ESTIMATION

    Publication No.: CA2413138A1

    Publication date: 2002-01-24

    Application No.: CA2413138

    Filing date: 2001-07-12

    Applicant: IBM

    Abstract: A method for estimating a pitch frequency of an audio signal includes computing a first transform of the signal to a frequency domain over a first time interval (42), and computing a second transform of the signal to the frequency domain over a second time interval (44), which contains the first time interval. A line spectrum of the signal is found, based on the first and second transforms, the spectrum including spectral lines having respective line amplitudes and line frequencies. A utility function (130) that is periodic in the frequencies of the lines in the spectrum is then computed. This function is indicative (158), for each candidate pitch frequency in a given pitch frequency range, of a compatibility of the spectrum with the candidate pitch frequency. The pitch frequency of the speech signal is estimated responsive to the utility function (176, 178).
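The abstract does not give the form of the utility function. The sketch below assumes one simple choice that is periodic in the line frequencies: an amplitude-weighted sum of cosines with period equal to the candidate pitch. It takes an already extracted line spectrum as input (the two transforms are not shown), and the function name `estimate_pitch` and the grid search are illustrative assumptions, not the patented procedure.

```python
import math
from typing import List, Tuple


def estimate_pitch(lines: List[Tuple[float, float]],
                   f_min: float, f_max: float, step: float = 1.0) -> float:
    """Return the candidate pitch (Hz) that best fits a line spectrum.

    lines holds (amplitude, frequency_hz) pairs. The utility used here is an
    assumed example: each line contributes amplitude * cos(2*pi*f / f0),
    which is largest when f is an integer multiple of the candidate f0.
    """
    best_f0, best_utility = f_min, -math.inf
    n_candidates = int((f_max - f_min) / step) + 1
    for k in range(n_candidates):
        f0 = f_min + k * step
        utility = sum(a * math.cos(2 * math.pi * f / f0) for a, f in lines)
        if utility > best_utility:
            best_f0, best_utility = f0, utility
    return best_f0
```

With this particular utility, a candidate at half the true pitch scores just as well as the true pitch, because every harmonic is also a multiple of the half frequency. The search range therefore matters, and a practical estimator would need some rule for preferring the higher candidate.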

    Voice transformation with encoded information

    Publication No.: DE112012000698B4

    Publication date: 2019-04-18

    Application No.: DE112012000698

    Filing date: 2012-03-13

    Applicant: IBM

    Abstract: A method for voice transformation, the method comprising: transforming a source speech recording of a person using transformation parameters, such that the transformed source speech recording gives the impression that the speech it contains comes from a different person; encoding information about the transformation parameters into a speech output using steganography; wherein the source speech recording can be reconstructed using the speech output and the information about the transformation parameters.

    Secure facilities access

    Publication No.: GB2498042B

    Publication date: 2014-05-14

    Application No.: GB201220270

    Filing date: 2012-11-12

    Applicant: IBM

    Abstract: Method, system, and computer program product are provided for secure facilities access. The method may include: receiving an access request from a mobile device to a secure facility; authenticating a user using multifactor biometric authentication with data from the mobile device; obtaining data from one or more fixed sensor devices at a location in the physical vicinity of the secure facility; cross-validating data from the mobile device with data from the one or more fixed sensor devices; and granting access to the secure facility if the authentication of the user and the cross-validation are successful. The cross-validating may determine that the access request from the mobile device is made in the vicinity of the secure facility using data from the one or more fixed sensor devices.

    Voice transformation with encoded information

    Publication No.: DE112012000698T5

    Publication date: 2013-11-14

    Application No.: DE112012000698

    Filing date: 2012-03-13

    Applicant: IBM

    Abstract: A method, a system, and a computer program product for voice transformation are provided. The method comprises transforming source speech using transformation parameters and encoding information about the transformation parameters into output speech using steganography, wherein the source speech can be reconstructed using the output speech and the information about the transformation parameters. Also provided is a method for reconstructing a voice transformation, the method comprising: receiving output speech of a voice transformation system, the output speech being transformed speech that carries encoded information about the transformation parameters using steganography; extracting the information about the transformation parameters; and performing an inverse transformation of the output speech to obtain an approximation of the original source speech.
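The abstract names steganography but not a specific scheme. As an illustration only, the sketch below assumes least-significant-bit embedding in integer PCM samples, with the parameters serialized as JSON behind a 2-byte length prefix. The parameter names in the usage note and the functions `embed` and `extract` are hypothetical, and the voice transformation and its inverse are not shown.

```python
import json
from typing import Dict, List


def embed(samples: List[int], params: Dict[str, float]) -> List[int]:
    """Hide the transformation parameters in the lowest bit of each sample."""
    payload = json.dumps(params, sort_keys=True).encode("utf-8")
    data = len(payload).to_bytes(2, "big") + payload  # length prefix + body
    bits = [(byte >> (7 - i)) & 1 for byte in data for i in range(8)]
    if len(bits) > len(samples):
        raise ValueError("recording too short to carry the parameters")
    out = list(samples)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & ~1) | bit  # overwrite only the lowest bit
    return out


def extract(samples: List[int]) -> Dict[str, float]:
    """Recover the parameters written by embed."""
    def read_bytes(start_bit: int, count: int) -> bytes:
        value = bytearray()
        for b in range(count):
            byte = 0
            for i in range(8):
                byte = (byte << 1) | (samples[start_bit + 8 * b + i] & 1)
            value.append(byte)
        return bytes(value)

    length = int.from_bytes(read_bytes(0, 2), "big")
    return json.loads(read_bytes(16, length).decode("utf-8"))
```

For example, `extract(embed(samples, {"pitch_shift": 1.5}))` returns the original dictionary while changing each carrier sample by at most one quantization step. A receiver would then use the recovered parameters to run the inverse transformation.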

    Title unknown

    Publication No.: BRPI0613699A2

    Publication date: 2011-01-25

    Application No.: BRPI0613699

    Filing date: 2006-05-12

    Applicant: IBM

    Abstract: A method for querying an electronic dictionary using letters of an alphabet enunciated by a user includes accepting a speech input from the user. The speech input includes a sequence of spelled letters enunciated by the user that spell a query word. The speech input is analyzed to determine one or more sequences of the letters that approximate the sequence of spelled letters. The one or more sequences of the letters are post-processed so as to produce a plurality of recognized words approximating the query word. The electronic dictionary is queried with the plurality of recognized words so as to retrieve a respective plurality of dictionary entries. A list of results including the plurality of recognized words and the respective plurality of dictionary entries is presented to the user.
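The post-processing step is not specified in the abstract. The sketch below assumes the recognizer has already produced one or more candidate letter sequences, and uses fuzzy string matching from Python's standard `difflib` module as a stand-in for the post-processing. The function name `query_dictionary`, the cutoff value, and the sample dictionary in the usage note are assumptions made for illustration.

```python
import difflib
from typing import Dict, List, Tuple


def query_dictionary(letter_sequences: List[List[str]],
                     dictionary: Dict[str, str],
                     max_words: int = 3) -> List[Tuple[str, str]]:
    """Map recognized letter sequences to dictionary words and their entries.

    letter_sequences holds the recognizer's candidate spellings, each a list
    of single letters. Speech recognition itself is not shown here.
    """
    headwords = list(dictionary)
    recognized: List[str] = []
    for letters in letter_sequences:
        candidate = "".join(letters).lower()
        # Assumed post-processing: keep headwords that closely match the
        # candidate spelling, best match first.
        matches = difflib.get_close_matches(
            candidate, headwords, n=max_words, cutoff=0.6)
        for word in matches:
            if word not in recognized:
                recognized.append(word)
    # Pair each recognized word with its dictionary entry for presentation.
    return [(word, dictionary[word]) for word in recognized]
```

Called with the sequences `[["c", "a", "t"], ["c", "a", "p"]]` and a dictionary containing "cat", "cap", and "dog", this returns the entries for "cat" and "cap" and leaves out "dog".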

    Fast frequency-domain pitch estimation

    Publication No.: AU7272901A

    Publication date: 2002-01-30

    Application No.: AU7272901

    Filing date: 2001-07-12

    Applicant: IBM

    Abstract: A method for estimating a pitch frequency of an audio signal includes computing a first transform of the signal to a frequency domain over a first time interval, and computing a second transform of the signal to the frequency domain over a second time interval, which contains the first time interval. A line spectrum of the signal is found, based on the first and second transforms, the spectrum including spectral lines having respective line amplitudes and line frequencies. A utility function that is periodic in the frequencies of the lines in the spectrum is then computed. This function is indicative, for each candidate pitch frequency in a given pitch frequency range, of a compatibility of the spectrum with the candidate pitch frequency. The pitch frequency of the speech signal is estimated responsive to the utility function.
