METHOD AND DEVICE FOR SIMULTANEOUS VOICE RECOGNITION, SPEAKER SEGMENTATION AND SPEAKER CLASSIFICATION

    Publication number: JP2001060098A

    Publication date: 2001-03-06

    Application number: JP2000188625

    Filing date: 2000-06-23

    Applicant: IBM

    Abstract: PROBLEM TO BE SOLVED: To obtain a method in which audio information from an audio/video source is automatically transcribed and the speakers are identified at the same time, by transcribing the audio source, simultaneously identifying potential segment boundaries, and assigning a speaker label to each identified segment. SOLUTION: The method includes a step in which the audio source is transcribed to generate a text version of the audio information, a step that simultaneously identifies potential segment boundaries, and a step in which a speaker label is assigned to each identified segment. A simultaneous transcription, segmentation and speaker identification process 500 generates, in real time, a transcript of the audio information that indicates the speaker associated with each segment. A segmentation process 600 identifies all frames in which segment boundaries may exist. A speaker identifying process 700 assigns a speaker label to each segment using a database of registered speakers.

    TOUCH GESTURE BASED INTERFACE FOR MOTOR VEHICLE
    Invention application. Status: under examination, published.

    Publication number: WO2006025891A3

    Publication date: 2006-06-15

    Application number: PCT/US2005018005

    Filing date: 2005-05-23

    Applicant: IBM

    Abstract: An improved apparatus and method is provided for operating devices and systems in a motor vehicle, while at the same time reducing vehicle operator distractions. One or more touch sensitive pads (112) are mounted on the steering wheel (114) of the motor vehicle (100), and the vehicle operator (104) touches the pads (112) in a pre-specified synchronized pattern, to perform functions such as controlling operation of the radio or adjusting a window. At least some of the touch patterns used to generate different commands may be selected by the vehicle operator (104). Usefully, the system of touch pad sensors and the signals generated thereby are integrated with speech recognition (304) and/or facial gesture recognition systems (306), so that commands may be generated by synchronized multi-mode inputs.
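
    The mapping from touch patterns to commands, including operator-selected patterns, can be pictured as a lookup table. The following is only a minimal sketch under assumptions: the pad names, the encoding of a pattern as an ordered tuple of pad touches, the command strings, and the `GestureMap` class are all hypothetical and do not come from the application.

```python
from typing import Dict, Optional, Tuple

# A pattern is an ordered tuple of pad identifiers touched in sequence.
# The pad names ("left", "right") are assumptions made for illustration.
Pattern = Tuple[str, ...]


class GestureMap:
    """Hypothetical lookup table from touch patterns to vehicle commands."""

    def __init__(self) -> None:
        # Pre-specified default patterns (illustrative values only).
        self._commands: Dict[Pattern, str] = {
            ("left", "left"): "radio_volume_down",
            ("right", "right"): "radio_volume_up",
            ("left", "right"): "window_down",
        }

    def assign(self, pattern: Pattern, command: str) -> None:
        """Let the operator bind (or rebind) a pattern to a command."""
        self._commands[pattern] = command

    def decode(self, pattern: Pattern) -> Optional[str]:
        """Return the command for a pattern, or None if it is unrecognized."""
        return self._commands.get(pattern)


gestures = GestureMap()
gestures.assign(("right", "left"), "window_up")  # operator-selected pattern
print(gestures.decode(("right", "left")))
print(gestures.decode(("left", "left", "left")))
```

    Returning `None` for an unknown pattern, rather than guessing the nearest match, is a deliberate choice in this sketch: an unrecognized touch sequence should not trigger a vehicle function.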

    PROVISIONING SERVICES USING A CLOUD SERVICES CATALOG
    Invention application. Status: under examination, published.

    Publication number: WO2011067062A2

    Publication date: 2011-06-09

    Application number: PCT/EP2010066763

    Filing date: 2010-11-03

    CPC classification number: G06F9/5072 G06Q10/10

    Abstract: The present invention provides a system and method for provisioning Cloud services by establishing a Cloud services catalog using a Cloud service bus within a Cloud computing environment. In one embodiment, there is a Cloud services catalog manager configured to connect a plurality of Clouds in a Cloud computing environment; maintain a catalog of integrated Cloud services from the plurality of connected Clouds; and display an index of the integrated services on a user interface. Using this system and method will allow for multiple disparate services, offered by different partners, across unrelated, physically distinct Clouds to be presented as an index of integrated services.
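
    The three duties of the catalog manager (connect Clouds, maintain a catalog, display an index) can be sketched as a small class. This is an illustration under assumptions, not the claimed system: the `Service` and `CatalogManager` names, the cloud and partner identifiers, and the index entry format are hypothetical, and the Cloud service bus mentioned in the abstract is not modeled.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Service:
    name: str
    cloud: str    # the connected Cloud that offers the service
    partner: str  # the partner that provides it


class CatalogManager:
    """Hypothetical manager that merges services from several Clouds."""

    def __init__(self) -> None:
        self._catalog: Dict[str, List[Service]] = {}

    def connect(self, cloud: str, services: List[Service]) -> None:
        """Register a Cloud together with the services it exposes."""
        self._catalog[cloud] = list(services)

    def index(self) -> List[str]:
        """Return one sorted index across all connected Clouds."""
        entries = [
            f"{svc.name} ({svc.partner} @ {svc.cloud})"
            for services in self._catalog.values()
            for svc in services
        ]
        return sorted(entries)


manager = CatalogManager()
manager.connect("cloud-a", [Service("storage", "cloud-a", "partner-1")])
manager.connect("cloud-b", [Service("billing", "cloud-b", "partner-2")])
for entry in manager.index():
    print(entry)
```

    The index lists services from both Clouds in one sorted sequence, so a user browsing it does not need to know which Cloud hosts a given service.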

    METHOD AND DEVICE FOR TRACKING SPEAKER IN AUDIO STREAM

    Publication number: JP2001051691A

    Publication date: 2001-02-23

    Application number: JP2000188613

    Filing date: 2000-06-23

    Applicant: IBM

    Abstract: PROBLEM TO BE SOLVED: To provide a method and a device for automatically identifying speakers in an audio (or video) source. SOLUTION: The audio/video source is first processed 300 to identify frames where a segment boundary indicating a speaker change exists, based on the Bayesian information criterion (BIC) model selection criterion. Segments corresponding to the same speaker are then clustered 400, and a cluster identifier is assigned to each of the identified segments. A speaker classification system 100 generates a clustering output file 160 that provides a series of segment numbers (with the start time and end time of each segment) together with the corresponding identified cluster numbers.

    METHOD AND DEVICE FOR RETRIEVING VOICE INFORMATION BY USING CONTENTS INFORMATION AND SPEAKER INFORMATION

    Publication number: JP2000348064A

    Publication date: 2000-12-15

    Application number: JP2000102972

    Filing date: 2000-04-05

    Applicant: IBM

    Abstract: PROBLEM TO BE SOLVED: To retrieve voice information according to both voice content and speaker identity, by comparing a user query with the content index and speaker index of a voice source and identifying the voice information that matches the query. SOLUTION: The user query includes a text string containing one or more keywords and a given speaker identity. The constraints of the query are compared with an indexed voice and/or video database to retrieve the relevant voice and video segments. A voice retrieval system 100 comprises an indexing system 500, which transcribes and indexes voice information, and a content and speaker voice retrieval system 600. In the indexing stage, the indexing system 500 processes the text output by a voice recognition system to perform content indexing and speaker indexing. In the retrieval stage, the content and speaker voice retrieval system 600 matches the query against documents according to voice content and speaker identity, using the content indexes and speaker indexes built in the indexing stage, and returns the relevant documents to the user.
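
    The two-stage flow above (index by content and by speaker, then match a query against both indexes) can be sketched with two inverted indexes. This is a minimal illustration under assumptions: the sample segments, the `build_indexes` and `search` helpers, and the rule that a segment must contain every keyword are hypothetical, and a real system would likely rank results rather than intersect sets.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# (segment id, speaker label, transcript). Illustrative data only.
SEGMENTS: List[Tuple[int, str, str]] = [
    (0, "speaker_a", "the budget vote is scheduled for friday"),
    (1, "speaker_b", "weather will be cloudy on friday"),
    (2, "speaker_a", "thank you and good night"),
]


def build_indexes(segments: List[Tuple[int, str, str]]):
    """Indexing stage: build a content index and a speaker index."""
    content_index: Dict[str, Set[int]] = defaultdict(set)
    speaker_index: Dict[str, Set[int]] = defaultdict(set)
    for seg_id, speaker, text in segments:
        speaker_index[speaker].add(seg_id)
        for word in text.split():
            content_index[word].add(seg_id)
    return content_index, speaker_index


def search(keywords: List[str], speaker: str,
           content_index: Dict[str, Set[int]],
           speaker_index: Dict[str, Set[int]]) -> List[int]:
    """Retrieval stage: segments by `speaker` that contain every keyword."""
    hits = set(speaker_index.get(speaker, set()))
    for word in keywords:
        hits &= content_index.get(word.lower(), set())
    return sorted(hits)


content_index, speaker_index = build_indexes(SEGMENTS)
print(search(["friday"], "speaker_a", content_index, speaker_index))
```

    Both speakers say "friday" in the sample data, so only the speaker constraint separates segment 0 from segment 1.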

    Tracking speakers in an audio stream

    Publication number: GB2351592A

    Publication date: 2001-01-03

    Application number: GB0015194

    Filing date: 2000-06-22

    Applicant: IBM

    Abstract: Audio information is processed to identify potential segment boundaries, corresponding to speaker changes 220. Thereafter, homogeneous segments (generally corresponding to the same speaker) are clustered 230, and a cluster identifier is assigned to each identified segment. A segmentation subroutine identifies potential segment boundaries using the BIC model selection criterion. A window selection scheme considers a relatively small amount of data in areas where new boundaries are very likely to occur, and the window size is increased when boundaries are not very likely to occur. When a segment boundary is found in a window, the next window begins after the detected boundary, using the minimal window size. BIC tests can be eliminated when they correspond to locations where the detection of a boundary is very unlikely.
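
    The window scheme described above can be sketched as follows. This is a simplified illustration, not the patented routine: it assumes one-dimensional features modeled by a single Gaussian, and the constants `MIN_WINDOW`, `GROWTH`, `MARGIN`, the penalty weight `lam`, and the sign convention (a positive score favors a boundary) are all assumptions. The `MARGIN` skip only avoids unstable variance estimates near the window edges and is not the test-elimination rule mentioned in the abstract.

```python
import math
from typing import List, Optional

MIN_WINDOW = 20   # assumed minimal window size, in frames
GROWTH = 20       # assumed growth step when no boundary is found
MARGIN = 5        # skip split points too close to the window edges
VAR_FLOOR = 1e-6  # keeps the log finite for constant data


def log_var(x: List[float]) -> float:
    """Log of the maximum-likelihood variance of x."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return math.log(max(var, VAR_FLOOR))


def delta_bic(x: List[float], i: int, lam: float = 1.0) -> float:
    """BIC of two Gaussians split at i minus BIC of one Gaussian."""
    n = len(x)
    gain = 0.5 * (n * log_var(x) - i * log_var(x[:i]) - (n - i) * log_var(x[i:]))
    penalty = lam * math.log(n)  # two extra parameters: (2 / 2) * log(n)
    return gain - penalty


def best_boundary(x: List[float]) -> Optional[int]:
    """Best split point in the window, or None if no split beats one model."""
    best = max(range(MARGIN, len(x) - MARGIN), key=lambda i: delta_bic(x, i))
    return best if delta_bic(x, best) > 0 else None


def segment(frames: List[float]) -> List[int]:
    """Return boundary positions using a growing analysis window."""
    boundaries: List[int] = []
    start, size = 0, MIN_WINDOW
    while True:
        end = min(start + size, len(frames))
        window = frames[start:end]
        if len(window) < MIN_WINDOW:
            break
        found = best_boundary(window)
        if found is None:
            if end == len(frames):
                break
            size += GROWTH            # boundary unlikely: look at more data
        else:
            boundaries.append(start + found)
            start += found            # next window begins at the boundary
            size = MIN_WINDOW         # and restarts at the minimal size
    return boundaries


# Synthetic stream: 50 frames near 0.0 followed by 50 frames near 5.0.
frames = [0.1 * ((k % 3) - 1) + (0.0 if k < 50 else 5.0) for k in range(100)]
print(segment(frames))
```

    On this synthetic stream the first two windows contain only one source, so the window grows until it spans the change; the search then restarts from the detected boundary with the minimal window.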

    Provisioning of services using a cloud services catalog

    Publication number: DE112010003886T5

    Publication date: 2012-08-09

    Application number: DE112010003886

    Filing date: 2010-11-03

    Applicant: IBM

    Abstract: The present invention provides a system and a method for provisioning cloud services by establishing a cloud services catalog using a cloud service bus within a cloud computing environment. In one embodiment, a cloud services catalog manager is configured to connect a plurality of clouds in the cloud computing environment; to maintain a catalog of integrated cloud services from the plurality of connected clouds; and to display an index of the integrated services on a user interface. With this system and method, multiple disparate services offered by different partners across unconnected, physically separate clouds can be presented as an index of integrated services.

    Methods and apparatus for tracking speakers in an audio stream

    Publication number: GB2351592B

    Publication date: 2003-05-21

    Application number: GB0015194

    Filing date: 2000-06-22

    Applicant: IBM

    Abstract: Speakers are automatically identified in an audio (or video) source. The audio information is processed to identify potential segment boundaries. Homogeneous segments are clustered substantially concurrently with the segmentation routine, and a cluster identifier is assigned to each identified segment. A segmentation subroutine identifies potential segment boundaries using the BIC model selection criterion. A clustering subroutine uses a BIC model selection criterion to assign a cluster identifier to each of the identified segments. If the difference of BIC values for each model is positive, the two clusters are merged.
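
    The merge rule in the last sentence can be sketched as an online assignment step. This is an illustration under assumptions, not the granted claims: features are one-dimensional, each cluster is a single Gaussian, the difference is taken as BIC(one merged model) minus BIC(two separate models), and the `merge_score` and `assign_cluster` helpers are hypothetical.

```python
import math
from typing import List

VAR_FLOOR = 1e-6  # keeps the log finite for constant data


def log_var(x: List[float]) -> float:
    """Log of the maximum-likelihood variance of x."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return math.log(max(var, VAR_FLOOR))


def merge_score(a: List[float], b: List[float], lam: float = 1.0) -> float:
    """BIC of one merged Gaussian minus BIC of two separate Gaussians."""
    n = len(a) + len(b)
    pooled = list(a) + list(b)
    gain = 0.5 * (n * log_var(pooled) - len(a) * log_var(a) - len(b) * log_var(b))
    return lam * math.log(n) - gain  # positive: merging is preferred


def assign_cluster(segment: List[float], clusters: List[List[float]]) -> int:
    """Merge the segment into the best cluster, or open a new cluster."""
    best_id, best_score = None, 0.0
    for cid, data in enumerate(clusters):
        score = merge_score(segment, data)
        if score > best_score:
            best_id, best_score = cid, score
    if best_id is None:
        clusters.append(list(segment))
        return len(clusters) - 1
    clusters[best_id].extend(segment)
    return best_id


def make_segment(center: float, length: int = 12) -> List[float]:
    """Synthetic segment with small deterministic variation around center."""
    return [center + 0.1 * ((k % 3) - 1) for k in range(length)]


clusters: List[List[float]] = []
labels = [assign_cluster(make_segment(c), clusters) for c in (0.0, 5.0, 0.0)]
print(labels)
```

    Assigning each segment as soon as it is available mirrors the abstract's statement that clustering runs substantially concurrently with segmentation; a batch agglomerative pass would be an equally plausible reading.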
