MULTI-MODAL BROWSING AND METHOD AND SYSTEM FOR EXECUTING CONVERSATION TYPE MARK-UP LANGUAGE

    Publication number: JP2001154852A

    Publication date: 2001-06-08

    Application number: JP2000311661

    Application date: 2000-10-12

    Applicant: IBM

    Abstract: PROBLEM TO BE SOLVED: To provide a new application programming language based on the user's interaction with whatever device the user is using to access information of any type. SOLUTION: In a preferred embodiment, the conversational markup language (CML) is a high-level, XML-based language for expressing the 'dialog' or 'conversation' that a user carries out with a given computing device. An application author can program an application using interaction-based elements called 'conversational gestures'. Various embodiments of a multi-modal browser can also be implemented to support the features of CML through different modality-specific representations, for example an HTML-based graphical user interface (GUI) browser and a VoiceXML-based speech browser.
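The idea of one modality-independent gesture rendered by modality-specific browsers can be illustrated with a minimal sketch. The abstract does not give CML's actual element names, so the `gesture` dictionary, its keys, and both renderer functions below are hypothetical; the output uses only the standard HTML `select`/`option` elements and the VoiceXML 2.0 `field`/`prompt`/`option` elements, and is a fragment rather than a complete page.

```python
from html import escape

def _options(gesture):
    # Both HTML and VoiceXML 2.0 happen to use an <option value="..."> element.
    return "".join(
        f'<option value="{escape(value)}">{escape(label)}</option>'
        for value, label in gesture["choices"]
    )

def render_html(gesture):
    """Render a hypothetical 'select one' gesture as an HTML form control."""
    return (
        f'<label>{escape(gesture["prompt"])} '
        f'<select name="{escape(gesture["name"])}">{_options(gesture)}</select></label>'
    )

def render_voicexml(gesture):
    """Render the same gesture as a VoiceXML 2.0 field with spoken options."""
    return (
        f'<field name="{escape(gesture["name"])}">'
        f'<prompt>{escape(gesture["prompt"])}</prompt>{_options(gesture)}</field>'
    )

# Assumed, illustrative gesture description (not CML syntax).
gesture = {
    "name": "drink",
    "prompt": "Which drink?",
    "choices": [("coffee", "Coffee"), ("tea", "Tea")],
}
print(render_html(gesture))
print(render_voicexml(gesture))
```

The application author writes the gesture once; each browser decides how to present it in its own modality.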

    USER IDENTIFICATION DEVICE AND METHOD TO REJECT ACCESS OR SERVICE TO UNPERMITTED USER

    Publication number: JPH11168561A

    Publication date: 1999-06-22

    Application number: JP19841298

    Application date: 1998-07-14

    Applicant: IBM

    Abstract: PROBLEM TO BE SOLVED: To provide a device and a method that deny users of an exceptional class access to a specific service or feature of a system by means of biometric and non-biometric identification. SOLUTION: The device includes acoustic models, each associated with a system user and corresponding to that user's telephone number, and a speaker identification module that is operatively coupled to a database 12 and that obtains and decodes a voice sample from a potential system user while that user attempts to make a phone call. The speaker identification module compares the decoded voice sample with the pre-stored acoustic model associated with the telephone number dialed by the potential user, and terminates the attempted call when the decoded voice sample substantially matches the stored acoustic model.

    Transcription of speech data with segments from acoustically dissimilar environments
    4.
    Invention publication (lapsed)

    Publication number: EP0788090A3

    Publication date: 1998-08-19

    Application number: EP97300290

    Application date: 1997-01-17

    Applicant: IBM

    CPC classification number: G10L15/20

    Abstract: A technique to improve the recognition accuracy when transcribing speech data that contains data from a wide range of environments. Input data in many situations contains data from a variety of sources in different environments. The classes of data include: clean speech, speech corrupted by noise (e.g., music), non-speech (e.g., pure music with no speech), telephone speech, and the identity of a speaker. A technique is described whereby the different classes of data are first automatically identified, and then each class is transcribed by a system that is built specifically for it. The invention also describes a segmentation algorithm that is based on building an acoustic model that characterizes the data in each class, and then using a dynamic programming algorithm (the Viterbi algorithm) to automatically identify segments that belong to each class. The acoustic models are built in a certain feature space, and the invention also describes different feature spaces for use with different classes.
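The segmentation step can be sketched as a Viterbi pass over per-frame class scores. This is a minimal illustration, not the patented method: it assumes each class's acoustic model has already produced a log-likelihood for every frame, and it uses a fixed, hypothetical `switch_penalty` in place of real transition probabilities. The function name and the toy scores are likewise assumptions.

```python
def viterbi_segment(frame_loglik, switch_penalty=5.0):
    """Label each frame with a class and merge runs into segments.

    frame_loglik: list of dicts mapping class name -> log-likelihood of that
    frame under the class's acoustic model (assumed to be precomputed).
    Returns a list of (start, end_exclusive, class_name) tuples.
    """
    classes = list(frame_loglik[0])
    best = {c: frame_loglik[0][c] for c in classes}  # best path score ending in c
    back = []                                        # back-pointers per frame
    for scores in frame_loglik[1:]:
        new_best, pointers = {}, {}
        for c in classes:
            # Staying in the same class is free; changing class costs the penalty.
            def cost(p, c=c):
                return best[p] - (0.0 if p == c else switch_penalty)
            prev = max(classes, key=cost)
            new_best[c] = cost(prev) + scores[c]
            pointers[c] = prev
        back.append(pointers)
        best = new_best
    # Trace the best path backwards from the best final class.
    state = max(classes, key=lambda c: best[c])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    # Merge consecutive frames with the same label into segments.
    segments, start = [], 0
    for i in range(1, len(path) + 1):
        if i == len(path) or path[i] != path[start]:
            segments.append((start, i, path[start]))
            start = i
    return segments

# Toy scores: speech frames, one noisy frame, more speech, then music.
speech = {"speech": -1.0, "music": -4.0}
blip = {"speech": -2.0, "music": -1.0}
music = {"speech": -4.0, "music": -1.0}
frames = [speech] * 3 + [blip] + [speech] * 3 + [music] * 4
print(viterbi_segment(frames))
```

The penalty discourages spurious one-frame segments; each resulting segment could then be passed to a recognizer built for its class.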

    6.
    Invention patent
    Unknown

    Publication number: DE69841527D1

    Publication date: 2010-04-15

    Application number: DE69841527

    Application date: 1998-07-23

    Applicant: IBM

    Abstract: Apparatus for preventing unauthorized use of a voice dialing system and, particularly, a call forwarding feature associated with the system whereby system users may forward a telephone number respectively associated therewith to a remote location in order to receive phone calls at the remote location, comprises: a database for pre-storing telephone numbers of system users and for pre-storing acoustic models respectively representative of speech associated with each system user, the acoustic models respectively corresponding to the telephone numbers; and a speaker identification module operatively coupled to the database for obtaining and decoding a speech sample from a potential system user during the potential user's attempt to make a telephone call, the speaker identification module comparing the decoded speech sample obtained with the pre-stored acoustic model associated with the telephone number dialed by the potential user; whereby if the decoded speech sample substantially matches the pre-stored acoustic model, then the phone call attempted by the potential user is terminated.
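The decision logic described in the abstract (look up the model stored for the dialed number, compare, terminate on a substantial match) can be sketched as follows. The abstract does not say how the acoustic models are represented or scored, so the feature vectors, the cosine-similarity comparison, the `threshold` value, and all function names here are stand-in assumptions.

```python
import math

def cosine_similarity(a, b):
    """Similarity between two feature vectors (a stand-in for model scoring)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def should_terminate_call(models_by_number, dialed_number, sample_features,
                          threshold=0.9):
    """Return True when the caller's sample substantially matches the
    acoustic model pre-stored for the dialed telephone number."""
    model = models_by_number.get(dialed_number)
    if model is None:
        return False  # no model stored for this number, so nothing to compare
    return cosine_similarity(sample_features, model) >= threshold

# Hypothetical database: telephone number -> pre-stored model vector.
models = {"555-0100": [1.0, 0.0, 0.5]}
print(should_terminate_call(models, "555-0100", [0.9, 0.1, 0.5]))
print(should_terminate_call(models, "555-0100", [0.0, 1.0, 0.0]))
```

Note that, as in the abstract, a match causes the call to be terminated rather than allowed.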

    METHOD AND APPARATUS FOR SPEAKER RECOGNITION OVER LARGE POPULATION WITH FAST AND DETAILED MATCHES

    Publication number: HK1015924A1

    Publication date: 1999-10-22

    Application number: HK99100943

    Application date: 1999-03-09

    Applicant: IBM

    Abstract: Fast and detailed match techniques for speaker recognition are combined into a hybrid system in which speakers are associated in groups when potential confusion is detected between a speaker being enrolled and a previously enrolled speaker. Thus the detailed match techniques are invoked only at the potential onset of saturation of the fast match technique while the detailed match is facilitated by limitation of comparisons to the group and the development of speaker-dependent models which principally function to distinguish between members of a group rather than to more fully characterize each speaker. Thus storage and computational requirements are limited and fast and accurate speaker recognition can be extended over populations of speakers which would degrade or saturate fast match systems and degrade performance of detailed match systems.
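The hybrid scheme can be sketched as a two-stage lookup: a cheap fast match over all enrolled speakers, followed by a detailed match only when the fast-match winner belongs to a group of confusable speakers. Everything concrete below is an assumption made for illustration: the class and method names, the use of coarse and fine centroid vectors as the two kinds of model, the squared-distance scoring, and the `confusion_radius` used to detect potential confusion at enrollment.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

class HybridRecognizer:
    def __init__(self, confusion_radius=0.5):
        self.fast = {}       # speaker -> coarse centroid (fast match)
        self.detailed = {}   # speaker -> fine centroid (detailed match)
        self.group_of = {}   # speaker -> shared set of confusable speakers
        self.confusion_radius = confusion_radius

    def enroll(self, name, coarse, fine):
        # Group the new speaker with an enrolled one if their fast-match
        # models are close enough to be confused.
        for other, centroid in self.fast.items():
            if sq_dist(coarse, centroid) <= self.confusion_radius ** 2:
                group = self.group_of.get(other) or {other}
                group.add(name)
                for member in group:
                    self.group_of[member] = group
                break
        self.fast[name] = coarse
        self.detailed[name] = fine

    def identify(self, coarse, fine):
        # Stage 1: fast match against every enrolled speaker.
        best = min(self.fast, key=lambda s: sq_dist(coarse, self.fast[s]))
        group = self.group_of.get(best)
        if group is None:
            return best  # no known confusion, so the fast match is enough
        # Stage 2: detailed match, restricted to the confusable group.
        return min(group, key=lambda s: sq_dist(fine, self.detailed[s]))

r = HybridRecognizer()
r.enroll("alice", [0.0, 0.0], [0.0, 0.0, 1.0])
r.enroll("bob", [0.1, 0.0], [0.1, 0.0, -1.0])   # close to alice: grouped
r.enroll("carol", [5.0, 5.0], [5.0, 5.0, 0.0])  # far away: ungrouped
print(r.identify([0.04, 0.0], [0.04, 0.0, -0.9]))
```

Because the detailed comparison runs only within a group, its cost depends on the group size rather than on the total number of enrolled speakers.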
