Abstract:
A method and apparatus for removing the effect of background music or noise from speech input to a speech recognizer, so as to improve recognition accuracy, is devised. Samples of pure music or noise related to the background music or noise that corrupts the speech input are used to reduce the effect of the background on speech recognition. The pure music and noise samples can be obtained in a variety of ways. The music- or noise-corrupted speech input is segmented into overlapping segments and then processed in two phases: first, the best-matching pure music or noise segment is aligned with each speech segment; then a linear filter is built for each segment to remove the effect of background music or noise from the speech input, and the overlapping segments are averaged to improve the signal-to-noise ratio. The resulting acoustic output can then be fed to a speech recognizer.
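For illustration, here is a minimal NumPy sketch of the two-phase processing described above. The squared-spectral-distance matching and the Wiener-style per-segment filter are assumptions standing in for whatever alignment metric and linear filter the method actually specifies.

```python
import numpy as np

def denoise(speech, noise_ref, frame=512, hop=256):
    """Hypothetical sketch: align each overlapping speech frame with its
    best-matching pure noise/music frame, apply a per-frame linear
    (Wiener-style) filter, then average the overlapping outputs."""
    window = np.hanning(frame)
    out = np.zeros_like(speech, dtype=float)
    norm = np.zeros_like(speech, dtype=float)
    # Pre-compute magnitude spectra of candidate pure-noise frames.
    noise_specs = [np.abs(np.fft.rfft(noise_ref[i:i + frame] * window))
                   for i in range(0, len(noise_ref) - frame, hop)]
    for start in range(0, len(speech) - frame, hop):
        seg = speech[start:start + frame] * window
        spec = np.fft.rfft(seg)
        mag = np.abs(spec)
        # Phase 1: pick the noise frame whose spectrum best matches
        # (squared spectral distance is an assumed criterion).
        best = min(noise_specs, key=lambda n: np.sum((mag - n) ** 2))
        # Phase 2: linear (spectral-subtraction-style) filter built from
        # the matched pure-noise frame.
        gain = np.maximum(mag**2 - best**2, 0.0) / np.maximum(mag**2, 1e-12)
        clean = np.fft.irfft(gain * spec, n=frame)
        # Overlap-add: average overlapping segments to raise the SNR.
        out[start:start + frame] += clean * window
        norm[start:start + frame] += window ** 2
    return out / np.maximum(norm, 1e-12)
```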
Abstract:
Embodiments of the invention provide a method, system, and computer program product for voiceprint tagging for interactive voice response (IVR) session management. In an embodiment of the invention, a method of voiceprint tagging for IVR session management is provided. The method includes establishing an IVR session for a caller over a network and presenting a portion of the IVR session to the caller over the network. The method also includes storing a voiceprint tag in memory associating a voiceprint of the caller with a portion of the IVR session. Finally, the method includes responding to a premature termination of the IVR session by re-establishing the prematurely terminated IVR session with the caller at the portion of the IVR session indicated by the voiceprint tag of the caller.
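A minimal sketch of the tagging-and-resume behavior, assuming an external speaker-verification routine `match(a, b) -> bool`; all names and fields are illustrative, not the patented implementation:

```python
from dataclasses import dataclass

@dataclass
class VoiceprintTag:
    """Associates a caller's voiceprint with a position in the IVR flow."""
    voiceprint: bytes     # opaque speaker-model blob (assumed format)
    session_portion: str  # identifier of the IVR menu/step reached

class IVRSessionManager:
    def __init__(self, match):
        self.match = match  # assumed speaker-verification routine
        self.tags = []

    def checkpoint(self, voiceprint, portion):
        # Store (or update) the voiceprint tag as the caller progresses.
        self.tags = [t for t in self.tags
                     if not self.match(t.voiceprint, voiceprint)]
        self.tags.append(VoiceprintTag(voiceprint, portion))

    def resume(self, voiceprint):
        # After a premature hang-up, re-establish the session at the
        # portion indicated by the caller's voiceprint tag, if any.
        for t in self.tags:
            if self.match(t.voiceprint, voiceprint):
                return t.session_portion
        return None  # no tag found: start a fresh session
```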
Abstract:
A method for conversational computing includes executing code embodying a conversational virtual machine, registering a plurality of input/output resources with a conversational kernel, providing an interface between a plurality of active applications and the conversational kernel processing input/output data, receiving input queries and input events of a multi-modal dialog across a plurality of user interface modalities of the plurality of active applications, generating output messages and output events of the multi-modal dialog in connection with the plurality of active applications, managing, by the conversational kernel, a context stack associated with the plurality of active applications and the multi-modal dialog to transform the input queries into application calls for the plurality of active applications and convert the output messages into speech, wherein the context stack accumulates a context of each of the plurality of active applications.
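As a rough illustration of the context stack described above, the following sketch keeps one context per registered active application and accumulates query-to-application-call transformations; the structure and field names are assumptions:

```python
class ContextStack:
    """Minimal sketch: one context per registered active application,
    with the most recently active application on top of the stack."""
    def __init__(self):
        self.stack = []  # list of (app_name, context_dict); top is last

    def register(self, app_name):
        self.stack.append((app_name, {"history": []}))

    def accumulate(self, app_name, query, app_call):
        # Record how an input query was transformed into an application
        # call, and promote that application to the top of the stack.
        for i, (name, ctx) in enumerate(self.stack):
            if name == app_name:
                ctx["history"].append((query, app_call))
                self.stack.append(self.stack.pop(i))
                return
        raise KeyError(f"{app_name} is not a registered application")

    def active(self):
        return self.stack[-1][0] if self.stack else None
```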
Abstract:
PROBLEM TO BE SOLVED: To provide a combination of a log-linear model with a multitude of speech features to recognize unknown speech utterances in a speech recognition system. SOLUTION: The speech recognition system models the posterior probability of linguistic units relevant to speech recognition using the log-linear model. The posterior model captures the probability of the linguistic unit given the observed speech features and the parameters of the posterior model. The posterior model may be determined using the probability of the word sequence hypotheses given a multitude of speech features. Log-linear models are used with features derived from sparse or incomplete data. The speech features that are utilized may include asynchronous, overlapping and statistically non-independent speech features. Not all features used in training need to appear in testing/recognition.
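For reference, a log-linear posterior over linguistic units w given observed speech features x and weight parameters λ has the standard textbook form below; the abstract does not spell out the exact parameterization used:

```latex
\[
P(w \mid x; \lambda) =
  \frac{\exp\bigl(\sum_{i} \lambda_i f_i(w, x)\bigr)}
       {\sum_{w'} \exp\bigl(\sum_{i} \lambda_i f_i(w', x)\bigr)}
\]
```

Here the f_i are feature functions over a hypothesis and the observed speech features; because the normalization runs only over competing hypotheses, features may be asynchronous, overlapping, and statistically non-independent, and a feature absent at recognition time simply contributes nothing to the sums.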
Abstract:
A conversational browsing system (10) comprising a conversational browser (11) having a command and control interface (12) for converting speech commands or multi-modal input from I/O resources (27) into navigation requests. The system (10) comprises conversational engines (23) for decoding input commands for interpretation by the command and control interface and for decoding meta-information provided by the CML processor to generate synthesized audio output. The system includes a communication stack (19) for transmitting a navigation request to a content server and receiving a CML file from the content server based on the navigation request. A conversational transcoder (13) transforms presentation material from one modality to a conversational modality. The transcoder (13) includes a functional transcoder (13a) to transform a page of GUI content into a page of CUI (conversational user interface) content and a logical transcoder (13b) to transform the business logic of an application, transaction, or site into an acceptable dialog.
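A hypothetical sketch of how the two transcoders (13a, 13b) might compose; the method names and types are illustrative assumptions:

```python
from abc import ABC, abstractmethod

class FunctionalTranscoder(ABC):
    """Transforms a page of GUI content into a page of CUI content,
    as the functional transcoder (13a) does."""
    @abstractmethod
    def transcode_page(self, gui_page: str) -> str: ...

class LogicalTranscoder(ABC):
    """Transforms the business logic of an application, transaction,
    or site into a dialog, as the logical transcoder (13b) does."""
    @abstractmethod
    def transcode_logic(self, business_logic: dict) -> list: ...

class ConversationalTranscoder:
    """Assumed composition of the functional and logical transcoders."""
    def __init__(self, functional: FunctionalTranscoder,
                 logical: LogicalTranscoder):
        self.functional = functional
        self.logical = logical

    def transform(self, gui_page: str, business_logic: dict):
        return (self.functional.transcode_page(gui_page),
                self.logical.transcode_logic(business_logic))
```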
Abstract:
A conversational computing system that provides a universal coordinated multi-modal conversational user interface (CUI) (10) across a plurality of conversationally aware applications (11) (i.e., applications that "speak" conversational protocols) and conventional applications (12). The conversationally aware applications (11) communicate with a conversational kernel (14) via conversational application APIs (13). The conversational kernel (14) controls the dialog across applications and devices (local and networked) on the basis of their registered conversational capabilities and requirements and provides a unified conversational user interface and conversational services and behaviors. The conversational computing system may be built on top of a conventional operating system and APIs (15) and conventional device hardware (16). The conversational kernel (14) handles all I/O processing and controls conversational engines (18). The conversational kernel (14) converts voice requests into queries and converts outputs and results into spoken messages using conversational engines (18) and conversational arguments (17). The conversational application API (13) conveys all the information for the conversational kernel (14) to transform queries into application calls and conversely convert output into speech, appropriately sorted before being provided to the user.
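The kernel's registration and query-to-call flow might be sketched as follows, with the recognition and synthesis engines modeled as assumed callables rather than the actual conversational engine API:

```python
class ConversationalKernel:
    """Minimal sketch: applications register their conversational
    capabilities; voice requests become application calls, and results
    become spoken messages. The engine hooks are assumptions."""
    def __init__(self, recognize, synthesize):
        self.recognize = recognize    # audio -> text (assumed engine)
        self.synthesize = synthesize  # text -> audio (assumed engine)
        self.apps = {}                # name -> (capabilities, handler)

    def register(self, name, capabilities, handler):
        # Applications register their conversational capabilities
        # and requirements with the kernel.
        self.apps[name] = (capabilities, handler)

    def handle_voice_request(self, audio):
        query = self.recognize(audio)
        # Naive keyword dispatch for illustration; the real kernel
        # arbitrates dialog across applications and devices.
        for name, (caps, handler) in self.apps.items():
            if any(c in query for c in caps):
                result = handler(query)          # query -> application call
                return self.synthesize(str(result))  # result -> speech
        return self.synthesize("No application can handle that request.")
```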
Abstract:
A communications apparatus is configured to bridge modalities and different communications formats. The apparatus may include a bridge that receives an input through a modality gateway and delivers an output through an output channel, and at least one communication engine that translates the input into the output.
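A minimal sketch of the bridge and communication engine, under assumed names and interfaces:

```python
class CommunicationEngine:
    """Assumed translator between two communication formats."""
    def translate(self, payload: bytes,
                  src_format: str, dst_format: str) -> bytes:
        raise NotImplementedError

class Bridge:
    """Receives input through a modality gateway, translates it with a
    communication engine, and delivers the output through an output
    channel. All names here are illustrative assumptions."""
    def __init__(self, engine: CommunicationEngine):
        self.engine = engine

    def relay(self, payload, src_format, dst_format, output_channel):
        translated = self.engine.translate(payload, src_format, dst_format)
        output_channel.send(translated)  # assumed channel interface
```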
Abstract:
A facilities access system comprises a mobile device used to request access; the gathering of multifactor biometric data (such as voiceprint, fingerprint, face, iris, multi-biometrics, etc.) to authenticate the user; one or more fixed sensors located in or near the facility; the validation of the mobile and fixed-sensor data; and the granting of access if validation succeeds. The fixed-sensor data may also validate that the request was made in the vicinity of the facility, either from the content of the access request or by use of an outgoing challenge followed by receipt of a challenge confirmation. The position of the mobile device may be determined in order to select the closest fixed sensor. The authentication process may be conducted on the mobile device or at a remote server, while the fixed sensors may be used to determine whether a user is present at or absent from the facility. A fixed sensor may also confirm the same biometric factor as the mobile device.
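The access-granting flow might be sketched as follows; the callables and the request fields are assumptions for illustration:

```python
def grant_access(request, mobile_auth, nearest_fixed_sensor):
    """Hypothetical sketch of the access decision described above:
    `mobile_auth` verifies the multifactor biometric data gathered on
    the mobile device, and the closest fixed sensor then confirms, via
    a challenge, that the request was made in the vicinity of the
    facility. All names and fields are illustrative assumptions."""
    # 1. Authenticate the user from the mobile device's biometric data
    #    (on the device or at a remote server).
    if not mobile_auth(request.biometrics):
        return False
    # 2. Use the mobile device's position to select the closest fixed
    #    sensor and issue an outgoing challenge.
    sensor = nearest_fixed_sensor(request.position)
    challenge = sensor.issue_challenge()
    # 3. Grant access only if the challenge confirmation is received,
    #    establishing that the user is present at the facility.
    return sensor.await_confirmation(challenge, request.user_id)
```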