Abstract:
A method and apparatus are provided for determining an instantaneous frequency and an instantaneous bandwidth of a speech resonance of a speech signal. The method includes receiving a speech signal having a real component; filtering the speech signal so as to generate a plurality of filtered signals such that the real component and an imaginary component of the speech signal are reconstructed; and generating a first estimated frequency and a first estimated bandwidth of a speech resonance of the speech signal based on both a first filtered signal of the plurality of filtered signals and a single-lag delay of the first filtered signal.
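As an illustration of how a frequency and a bandwidth can be read from a complex filtered signal and its single-lag delay, the following is a minimal sketch and not the claimed method. It assumes the filtered signal behaves locally like a single complex exponential, so that the ratio of the lag-one product to the signal energy approximates a pole whose angle gives the frequency and whose radius gives the bandwidth. The function names, the windowed averaging, and the pole-radius-to-bandwidth mapping are assumptions introduced for this example.

```python
import cmath
import math


def estimate_resonance(y, fs):
    """Estimate (frequency_hz, bandwidth_hz) from complex samples y.

    Assumption: y is one complex filtered signal (real and imaginary parts
    already reconstructed) dominated by a single resonance.
    """
    # Lag-one product between the signal and its single-sample delay.
    r1 = sum(y[n] * y[n - 1].conjugate() for n in range(1, len(y)))
    # Energy of the delayed signal over the same window.
    r0 = sum(abs(y[n - 1]) ** 2 for n in range(1, len(y)))
    z = r1 / r0  # estimated pole of a first-order complex model
    freq_hz = cmath.phase(z) * fs / (2 * math.pi)
    bw_hz = -math.log(abs(z)) * fs / math.pi
    return freq_hz, bw_hz


def synth_resonance(freq_hz, bw_hz, fs, n):
    """Hypothetical test signal: a decaying complex exponential."""
    pole = cmath.exp(complex(-math.pi * bw_hz / fs, 2 * math.pi * freq_hz / fs))
    return [pole ** k for k in range(n)]
```

Applied to a synthetic resonance at 500 Hz with an 80 Hz bandwidth sampled at 8 kHz, `estimate_resonance(synth_resonance(500.0, 80.0, 8000.0, 64), 8000.0)` recovers both values to within floating-point error. Real speech would need a short sliding window and one such estimate per filter output.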
Abstract:
A system for determining the status of an answered telephone during the course of an outbound telephone call includes an automated telephone calling device and a speech recognition device. The automated telephone calling device places a telephone call to a location having a telephone number at which a target person is listed and, upon the telephone call being answered, initiates a prerecorded greeting that asks for the target person and receives a spoken response from an answering person. The speech recognition device performs a speech recognition analysis on the spoken response to determine a status of the spoken response. If the speech recognition device determines that the answering person is the target person, the speech recognition device initiates a speech recognition application with the target person.
Abstract:
A phoneme estimator in a speech-recognition system includes energy-detect circuitry for detecting the segments of a speech signal that should be analyzed for phoneme content. Speech-element processors then process the speech signal segments, calculating nonlinear representations of the segments. The nonlinear representation data are applied to speech-element modeling circuitry, which reduces the data through speech-element-specific modeling. The reduced data are then subjected to further nonlinear processing. The results of the further nonlinear processing are again applied to speech-element modeling circuitry, producing phoneme isotype estimates. The phoneme isotype estimates are rearranged and consolidated, that is, the estimates are uniformly labeled and duplicate estimates are consolidated, forming estimates of words or phrases containing minimal numbers of phonemes. The estimates may then be compared with stored words or phrases to determine what was spoken.
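The consolidation step described above (uniform labeling followed by merging of duplicate estimates) can be sketched as follows. This is one plausible reading only: the abstract does not say how labels are unified or which duplicates are merged, so the label map and the choice to merge only consecutive duplicates are assumptions made for illustration.

```python
from itertools import groupby

# Hypothetical mapping from variant labels to one uniform label each.
LABEL_MAP = {"ah": "AH", "AH0": "AH", "k": "K", "ae": "AE", "t": "T"}


def consolidate(estimates, label_map=LABEL_MAP):
    """Uniformly label phoneme estimates, then merge consecutive duplicates."""
    # Step 1: uniform labeling; unknown labels fall back to upper case.
    uniform = [label_map.get(e, e.upper()) for e in estimates]
    # Step 2: collapse each run of identical labels into a single estimate.
    return [label for label, _run in groupby(uniform)]
```

For example, `consolidate(["k", "k", "ae", "ae", "ae", "t"])` yields `["K", "AE", "T"]`, a sequence with a minimal number of phonemes that could then be compared with stored words.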
Abstract:
A phonetic data processing system processes phonetic stream data to produce a set of semantic data, using a context-free rich semantic grammar database (RSG DB) that includes a grammar tree, composed of sub-trees, representing words and phrases. A phonetic searcher accepts phonetic estimates and searches the RSG DB to produce a best word list, which is processed by a semantic parser, using the RSG DB, to produce a semantic tree instance that includes all valid interpretations of the phonetic stream. An application accesses a semantic tree evaluator to interpret the semantic tree instance according to a context to produce a final linguistic interpretation of the phonetic stream, which is returned to the application.
Abstract:
A speech recognition system includes a line of service comprising four server objects. A first server object is coupled to a telephone network and receives a voice data message from the telephone network. A second server object has a first connection to the first server object, receives the voice data message from the first server object, and converts the voice data message to a phonetic data message. A third server object has a second connection to the second server object, receives the phonetic data message from the second server object, and converts the phonetic data message to a syntactic data message. A fourth server object has a third connection to the third server object, receives the syntactic data message from the third server object, and converts the syntactic data message to a semantic data message, which is representative of the voice data message. The first, second, third and fourth server objects may be remote with respect to each other, and the first, second and third connections are formed over a first computer network.
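The chain of message conversions in this abstract (voice to phonetic to syntactic to semantic) can be sketched as a sequence of stages, each accepting one message kind and emitting the next. Everything below is hypothetical: the `Message` type, the helper names, and the toy converters are placeholders, and the network connections between remote server objects are modeled as plain in-process function calls.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class Message:
    kind: str      # "voice", "phonetic", "syntactic", or "semantic"
    payload: Any


Stage = Callable[[Message], Message]


def make_server_object(expects: str, produces: str,
                       convert: Callable[[Any], Any]) -> Stage:
    """Build a stage that accepts one message kind and emits another."""
    def handle(msg: Message) -> Message:
        if msg.kind != expects:
            raise ValueError(f"expected {expects!r} message, got {msg.kind!r}")
        return Message(produces, convert(msg.payload))
    return handle


def run_line_of_service(stages: List[Stage], msg: Message) -> Message:
    """Pass a message through each server object in order."""
    for stage in stages:
        msg = stage(msg)
    return msg


# Toy stand-ins for real recognition steps (assumptions, not real models).
LEXICON = {("Y", "EH", "S"): "yes", ("N", "OW"): "no"}
MEANING = {"yes": {"intent": "affirm"}, "no": {"intent": "deny"}}

line_of_service = [
    make_server_object("voice", "voice", lambda audio: audio),  # receive only
    make_server_object("voice", "phonetic", lambda audio: tuple(audio.split())),
    make_server_object("phonetic", "syntactic", lambda phones: LEXICON[phones]),
    make_server_object("syntactic", "semantic", lambda word: MEANING[word]),
]
```

Calling `run_line_of_service(line_of_service, Message("voice", "Y EH S"))` returns a message of kind `"semantic"`. Because each stage checks the kind of its input, a message routed to the wrong server object fails immediately rather than producing a malformed result downstream.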
Abstract:
A system for determining the status of an answered telephone during the course of an outbound telephone call includes an automated telephone calling device and a speech recognition device. The automated telephone calling device places a telephone call to a location having a telephone number at which a target person is listed and, upon the telephone call being answered, initiates a prerecorded greeting that asks for the target person (26) and receives a spoken response from an answering person. The speech recognition device performs a speech recognition analysis on the spoken response to determine a status of the spoken response. If the speech recognition device determines that the answering person is the target person (50), the speech recognition device initiates a speech recognition application with the target person (52).