-
Publication No.: CA2290185A1
Publication Date: 2000-05-30
Application No.: CA2290185
Filing Date: 1999-11-22
Applicant: IBM
Inventor: BASU SANKAR , MAES STEPHANE H
Abstract: Systems and methods for processing acoustic speech signals which utilize the wavelet transform (and alternatively, the Fourier transform) as a fundamental tool. The method essentially involves "synchrosqueezing" spectral component data obtained by performing a wavelet transform (or Fourier transform) on digitized speech signals. In one aspect, spectral components of the synchrosqueezed plane are dynamically tracked via a K-means clustering algorithm. The amplitude, frequency and bandwidth of each of the components are thus extracted. The cepstrum generated from this information is referred to as "K-mean Wastrum." In another aspect, the result of the K-mean clustering process is further processed to limit the set of primary components to formants. The resulting features are referred to as "formant-based wastrum." Formants are interpolated in unvoiced regions and the contribution of the unvoiced turbulent part of the spectrum is added. This method requires adequate formant tracking. The resulting robust formant extraction has a number of applications in speech processing and analysis, including vocal tract normalization.
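Illustrative note (not part of the patent record): the abstract says spectral components are tracked by K-means clustering and that amplitude, frequency and bandwidth are extracted for each. The following is a minimal sketch of that idea for a single frame, assuming the synchrosqueezed plane has already been reduced to a list of (frequency, amplitude) points. The function name `kmeans_components`, the evenly spaced initialization, and the choice of weighted standard deviation as "bandwidth" are assumptions made for illustration, not the patented procedure.

```python
from math import sqrt

def kmeans_components(freqs, amps, k, iters=20):
    """Amplitude-weighted 1-D K-means over the spectral points of one frame.

    Hypothetical helper: returns one dict per non-empty cluster with its
    center frequency, bandwidth (weighted std. dev.) and total amplitude.
    """
    lo, hi = min(freqs), max(freqs)
    # Assumed deterministic start: centers spread evenly over the frequency range.
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    groups = [[] for _ in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        # Assignment step: each point joins its nearest center.
        for f, a in zip(freqs, amps):
            nearest = min(range(k), key=lambda i: abs(f - centers[i]))
            groups[nearest].append((f, a))
        # Update step: each center moves to the amplitude-weighted mean of its points.
        for i, g in enumerate(groups):
            total = sum(a for _, a in g)
            if total > 0:
                centers[i] = sum(f * a for f, a in g) / total
    components = []
    for c, g in zip(centers, groups):
        total = sum(a for _, a in g)
        if total == 0:
            continue  # skip clusters that attracted no energy
        var = sum(a * (f - c) ** 2 for f, a in g) / total
        components.append({"frequency": c, "bandwidth": sqrt(var), "amplitude": total})
    return components

# Made-up frame with two groups of spectral points (values in Hz).
frame = kmeans_components([490, 500, 510, 1490, 1500, 1510], [1, 2, 1, 1, 2, 1], k=2)
```

Tracking across frames, and the cepstrum ("wastrum") computation the abstract mentions, are outside this sketch.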
-
Publication No.: AT338305T
Publication Date: 2006-09-15
Application No.: AT02747928
Filing Date: 2002-06-20
Applicant: IBM
Inventor: GOPALAKRISHNAN PONANI , MAES STEPHANE H , RAMASWAMY GANESH N
Abstract: A system and method for intelligent caching and network management includes contextual information representing needs of a user. A contextual system determines settings based on the contextual information and determines services and devices available for the user, in accordance with the contextual information. A predictor receives the contextual information, the settings, the services available and the devices available and predicts the needs of the user to make resources available to the user in accordance with predictions.
-
Publication No.: CA2345660C
Publication Date: 2006-01-31
Application No.: CA2345660
Filing Date: 1999-10-01
Applicant: IBM
Inventor: GOPALAKRISHNAN PONANI , MAES STEPHANE H
IPC: G06F3/16 , H04L12/24 , G06F9/44 , G06F9/46 , G06F9/54 , G06F12/00 , G06F15/00 , G06F17/28 , G06F17/30 , G06F40/00 , G10L13/00 , G10L13/08 , G10L15/22 , G10L15/26 , H04L29/02 , H04L29/06 , H04M1/253 , H04M1/27 , H04M1/725 , H04M3/42 , H04M3/44 , H04M3/493 , H04M3/50 , H04M7/00 , H04M11/00
Abstract: A system and method for providing automatic and coordinated sharing of conversational resources, e.g. functions and arguments, between network-connected servers and devices, and their corresponding applications. In one aspect, a system for providing automatic and coordinated sharing of conversational resources comprises: a network comprising a first (100) and second (106) network device; the first (100) and second (106) network device each comprising a set of conversational resources (102, 107), a dialog manager (103, 108) for managing a conversation and executing calls requesting a conversational service, and a communication stack (111, 115) for communicating messages over a network using conversational protocols, wherein the conversational protocols establish coordinated network communication between the dialog managers of the first and second device to automatically share the set of conversational resources of the first and second network device, when necessary, to perform their respective requested conversational service.
-
Publication No.: HK1063371A1
Publication Date: 2004-12-24
Application No.: HK04106079
Filing Date: 2004-08-13
Applicant: IBM
Inventor: MAES STEPHANE H , NETI CHALAPATHY V
IPC: G06F3/00 , G06T1/00 , G06F3/01 , G09G20060101 , G06F3/033 , G06F3/048 , G06F9/45 , G06F17/00 , G06K9/00 , G06T7/00 , G06T7/20 , G10L15/22 , G10L15/24
Abstract: Systems and methods are provided for performing focus detection, referential ambiguity resolution and mood classification in accordance with multi-modal input data, in varying operating conditions, in order to provide an effective conversational computing environment for one or more users.
-
Publication No.: DE69937962D1
Publication Date: 2008-02-21
Application No.: DE69937962
Filing Date: 1999-10-01
Applicant: IBM
Inventor: MAES STEPHANE H , GOPALAKRISHNAN PONANI
IPC: G06F3/16 , G10L15/22 , G06F9/44 , G06F9/46 , G06F9/54 , G06F12/00 , G06F15/00 , G06F17/28 , G06F17/30 , G06F40/00 , G10L13/00 , G10L13/08 , G10L15/26 , H04M1/253 , H04M1/27 , H04M1/725 , H04M3/42 , H04M3/44 , H04M3/493 , H04M3/50 , H04M7/00 , H04M11/00
Abstract: A method for conversational computing includes executing code embodying a conversational virtual machine, registering a plurality of input/output resources with a conversational kernel, providing an interface between a plurality of active applications and the conversational kernel processing input/output data, receiving input queries and input events of a multi-modal dialog across a plurality of user interface modalities of the plurality of active applications, generating output messages and output events of the multi-modal dialog in connection with the plurality of active applications, managing, by the conversational kernel, a context stack associated with the plurality of active applications and the multi-modal dialog to transform the input queries into application calls for the plurality of active applications and convert the output messages into speech, wherein the context stack accumulates a context of each of the plurality of active applications.
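Illustrative note (not part of the patent record): the abstract describes a context stack that accumulates the context of each active application and is used to turn input queries into application calls. A minimal sketch of one way such a stack could behave is given below. The class `ContextStack`, its methods, and the application and call names are all hypothetical and chosen only for illustration; the patent does not specify this interface.

```python
class ContextStack:
    """Hypothetical context stack: the most recently active application is on top."""

    def __init__(self):
        self._stack = []  # list of (app_name, {query: application_call}) entries

    def activate(self, app, commands):
        # Move (or add) the application to the top of the stack.
        self._stack = [entry for entry in self._stack if entry[0] != app]
        self._stack.append((app, dict(commands)))

    def resolve(self, query):
        # Search from the top down for the first application that understands the query.
        for app, commands in reversed(self._stack):
            if query in commands:
                return app, commands[query]
        return None  # no active application can handle this query


stack = ContextStack()
stack.activate("mail", {"open": "mail.open_inbox"})
stack.activate("calendar", {"open": "calendar.open_today", "next": "calendar.next_event"})
first = stack.resolve("open")        # calendar is on top, so it wins
stack.activate("mail", {"open": "mail.open_inbox"})
second = stack.resolve("open")       # mail was re-activated and is now on top
fallback = stack.resolve("next")     # only calendar knows "next"
```

The top-down search is one simple way to resolve an ambiguous query such as "open" in favor of the application the user touched most recently.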
-
Publication No.: CA2290185C
Publication Date: 2005-09-20
Application No.: CA2290185
Filing Date: 1999-11-22
Applicant: IBM
Inventor: BASU SANKAR , MAES STEPHANE H
Abstract: Systems and methods for processing acoustic speech signals which utilize the wavelet transform (and alternatively, the Fourier transform) as a fundamental tool. The method essentially involves "synchrosqueezing" spectral component data obtained by performing a wavelet transform (or Fourier transform) on digitized speech signals. In one aspect, spectral components of the synchrosqueezed plane are dynamically tracked via a K-means clustering algorithm. The amplitude, frequency and bandwidth of each of the components are thus extracted. The cepstrum generated from this information is referred to as "K-mean Wastrum." In another aspect, the result of the K-mean clustering process is further processed to limit the set of primary components to formants. The resulting features are referred to as "formant-based wastrum." Formants are interpolated in unvoiced regions and the contribution of the unvoiced turbulent part of the spectrum is added. This method requires adequate formant tracking. The resulting robust formant extraction has a number of applications in speech processing and analysis, including vocal tract normalization.
-
Publication No.: AU2002318373A1
Publication Date: 2003-01-08
Application No.: AU2002318373
Filing Date: 2002-06-20
Applicant: IBM
Inventor: RAMASWAMY GANESH N , GOPALAKRISHNAN PONANI , MAES STEPHANE H
Abstract: A system and method for intelligent caching and network management includes contextual information representing needs of a user. A contextual system determines settings based on the contextual information and determines services and devices available for the user, in accordance with the contextual information. A predictor receives the contextual information, the settings, the services available and the devices available and predicts the needs of the user to make resources available to the user in accordance with predictions.
-
Publication No.: CA2345661A1
Publication Date: 2000-04-13
Application No.: CA2345661
Filing Date: 1999-10-01
Applicant: IBM
Inventor: NAHAMOO DAVID , SEDIVY JAN , GOPALAKRISHNAN PONANI , LUCAS BRUCE D , MAES STEPHANE H
IPC: G06F3/16 , G06F9/44 , G06F9/46 , G06F9/54 , G06F12/00 , G06F15/00 , G06F17/28 , G06F17/30 , G06F40/00 , G10L13/00 , G10L15/22 , G10L15/26 , H04M1/253 , H04M1/27 , H04M1/725 , H04M3/42 , H04M3/44 , H04M3/493 , H04M3/50 , H04M7/00 , H04M11/00 , G06F15/16
Abstract: A conversational browsing system (10) comprising a conversational browser (11) having a command and control interface (12) for converting speech commands or multi-modal input from I/O resources (27) into navigation request. The system (10) comprises conversational engines (23) for decoding input commands for interpretation by the command and control interface and decoding meta-information provided by the CML processor for generating synthesized audio output. The system includes a communication stack (19) for transmitting the navigation request to a content server and receiving a CML file from the content server based on the navigation request. A conversational transcoder (13) transforms presentation material from one modality to a conversational modality. The transcoder (13) includes a functional transcoder (13a) to transform a page of GUI to a page of CUI (conversational user interface) and a logical transcoder (13b) to transform business logic of an application, transaction or site into an acceptable dialog.