Abstract:
A method and system for managing a communication session is provided. The communication session is associated with multiple communication devices. The method includes learning (304) a set of derived acoustic features of an audio communication signal that is associated substantially only with one user of a communication device. The method also includes receiving (306) a communication session signal. The communication session signal is an audio signal that includes a combination of audio communication signals. Each audio communication signal of the audio communication signals is associated with a user of a communication device of the multiple communication devices. The method includes modifying (308) the communication session signal based on the set of derived acoustic features.
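As a rough illustration of the learn-then-modify flow above, the following Python sketch (hypothetical names, frame size, feature choice, and threshold; none of these are part of the disclosure) learns simple per-user feature statistics from enrollment audio and then attenuates frames of a mixed session signal that resemble that user:

# Hypothetical sketch of the learn-then-modify flow described in the abstract.
# Feature extraction is simplified to per-frame energy and zero-crossing rate;
# a real system would use richer derived acoustic features.
import numpy as np

FRAME = 160  # assumed 20 ms frames at 8 kHz

def derive_features(signal: np.ndarray) -> np.ndarray:
    """Split a mono signal into frames and compute simple acoustic features."""
    n = len(signal) // FRAME
    frames = signal[: n * FRAME].reshape(n, FRAME)
    energy = np.log1p((frames ** 2).sum(axis=1))
    zcr = (np.diff(np.sign(frames), axis=1) != 0).mean(axis=1)
    return np.stack([energy, zcr], axis=1)

def learn_user_model(enrollment: np.ndarray):
    """Learn (304): mean/std of features from audio of substantially one user."""
    feats = derive_features(enrollment)
    return feats.mean(axis=0), feats.std(axis=0) + 1e-6

def modify_session(session: np.ndarray, model, gain: float = 0.2) -> np.ndarray:
    """Modify (308): attenuate frames whose features match the learned user."""
    mean, std = model
    feats = derive_features(session)
    out = session.astype(float).copy()
    for i, f in enumerate(feats):
        distance = np.linalg.norm((f - mean) / std)
        if distance < 1.5:  # frame resembles the enrolled user
            out[i * FRAME:(i + 1) * FRAME] *= gain
    return out

# Usage: enroll on one user's audio, then suppress that user in the mix.
rng = np.random.default_rng(0)
enrollment = rng.normal(size=8000)
session = rng.normal(size=16000)
cleaned = modify_session(session, learn_user_model(enrollment))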
Abstract:
Techniques and systems for recalling voicemail messages from remote voicemail systems are disclosed. In one embodiment, a method for recalling a voicemail message from a target mailbox can include: accessing a voicemail system by a caller using a device; authenticating the caller using speaker verification; and deleting the voicemail message from the target mailbox. The target mailbox owner can be a member of the voicemail system, while the caller can be a non-member of that voicemail system. The device may be configured to support a telephony user interface (TUI), for example.
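A minimal, hypothetical sketch of the recall flow, assuming an in-memory mailbox and a stubbed speaker-verification step (neither is specified by the abstract):

# Hypothetical sketch of the recall flow: a non-member caller is verified by
# voice and, if verification succeeds, the message they left is deleted from
# the member's mailbox. All names and the verify_speaker stub are illustrative.
from dataclasses import dataclass, field

@dataclass
class Message:
    sender_id: str
    audio: bytes

@dataclass
class Mailbox:
    owner: str
    messages: list = field(default_factory=list)

def verify_speaker(sample: bytes, claimed_sender: str) -> bool:
    """Stand-in for scoring the sample against the claimed sender's voice print."""
    return True  # a real system would return the verification decision

def recall_message(mailbox: Mailbox, caller_id: str, caller_sample: bytes) -> bool:
    """Delete the caller's own message from the target mailbox if verified."""
    if not verify_speaker(caller_sample, caller_id):
        return False
    before = len(mailbox.messages)
    mailbox.messages = [m for m in mailbox.messages if m.sender_id != caller_id]
    return len(mailbox.messages) < before

box = Mailbox(owner="member", messages=[Message("outside-caller", b"\x00")])
print(recall_message(box, "outside-caller", b"\x00"))  # True: message recalled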
Abstract:
One-to-many comparisons of callers' words and/or voice prints with known words and/or voice prints are performed to identify any substantial matches between them. When a customer communicates with a particular entity, such as a customer service center, the system makes a recording of the real-time call including both the customer's and agent's voices. The system segments the recording to extract different words, such as words of anger. The system may also segment at least a portion of the customer's voice to create a tone profile, and it formats the segmented words and tone profiles for network transmission to a server. The server compares the customer's words and/or tone profiles with multiple known words and/or tone profiles stored in a database to determine any substantial matches. The identification of any matches may be used for a variety of purposes, such as providing representative feedback or customer follow-up.
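The one-to-many matching step might be sketched as follows; the watchlist, the tone representation, and the similarity thresholds are illustrative assumptions only:

# Illustrative sketch of the one-to-many comparison step: segmented caller
# words and a simple tone profile are matched against known entries.
# The tone profile here is just a (mean pitch, variance) pair for brevity.
from difflib import SequenceMatcher

KNOWN_WORDS = {"refund", "lawsuit", "cancel"}          # assumed watchlist
KNOWN_TONES = {"angry": (220.0, 40.0), "calm": (150.0, 10.0)}

def match_words(segmented_words, threshold=0.85):
    """Return watchlist words substantially matching any segmented word."""
    hits = set()
    for word in segmented_words:
        for known in KNOWN_WORDS:
            if SequenceMatcher(None, word.lower(), known).ratio() >= threshold:
                hits.add(known)
    return hits

def match_tone(profile, tolerance=25.0):
    """Return the known tone profile closest to the caller's, if close enough."""
    mean, var = profile
    label, (m, v) = min(KNOWN_TONES.items(),
                        key=lambda kv: abs(kv[1][0] - mean) + abs(kv[1][1] - var))
    return label if abs(m - mean) + abs(v - var) <= tolerance else None

print(match_words(["refnd", "please"]))   # {'refund'}
print(match_tone((215.0, 38.0)))          # 'angry'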
Abstract:
An apparatus and method for detecting a fraud or fraud attempt in a captured interaction. The method comprises a selection step, in which interactions suspected of capturing fraud attempts are selected for further analysis and assigned a first fraud probability, and a fraud detection step, in which the voice is scored against one or more voice prints of the same alleged customer or of known fraudsters. The first fraud or fraud attempt probability is combined with the result of the scoring of the fraud detection step to generate a total fraud or fraud attempt probability. If the total fraud or fraud attempt probability exceeds a threshold, a notification is issued. The selection, the scoring, and the combination thereof are performed using user-defined rules and thresholds.
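A minimal sketch of the probability combination and threshold check, assuming a noisy-OR style combination rule (the actual user-defined rules and thresholds are not disclosed in the abstract):

# Minimal sketch of combining the selection-step probability with the
# voice-print score; the combination rule (a noisy-OR of the two values)
# is an assumption, not the patented formula.
def combine_fraud_probability(selection_p: float, voiceprint_score: float) -> float:
    """Combine the first fraud probability with the scoring result."""
    return 1.0 - (1.0 - selection_p) * (1.0 - voiceprint_score)

def check_interaction(selection_p: float, voiceprint_score: float,
                      threshold: float = 0.8) -> bool:
    """Issue a notification when the total fraud probability exceeds the threshold."""
    total = combine_fraud_probability(selection_p, voiceprint_score)
    if total > threshold:
        print(f"ALERT: total fraud probability {total:.2f} exceeds {threshold}")
        return True
    return False

check_interaction(selection_p=0.4, voiceprint_score=0.75)  # triggers an alert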
Abstract:
A method of verifying a user identity using a Web-based multimodal interface can include sending, to a remote computing device, a multimodal markup language document that, when rendered by the remote computing device, queries a user for a user identifier and causes audio of the user's voice to be sent to a multimodal, Web-based application. The user identifier and the audio can be received at about the same time from the client device. The audio can be compared with a voice print associated with the user identifier. The user at the remote computing device can be selectively granted access to the system according to a result obtained from the comparing step.
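The server-side decision could look roughly like the following sketch; the enrollment store and the comparison function are stand-ins, not the patented implementation, and delivery of the multimodal markup document is omitted:

# Hedged sketch of the server-side decision: identifier and audio arrive
# together, the audio is compared with the enrolled voice print, and access
# is granted on a sufficiently high score.
VOICE_PRINTS = {"alice": b"enrolled-print"}   # assumed enrollment store

def compare_voiceprint(audio: bytes, enrolled: bytes) -> float:
    """Placeholder similarity score in [0, 1]."""
    return 1.0 if audio == enrolled else 0.0

def grant_access(user_id: str, audio: bytes, threshold: float = 0.7) -> bool:
    """Receive identifier and audio together; grant access on a matching print."""
    enrolled = VOICE_PRINTS.get(user_id)
    if enrolled is None:
        return False
    return compare_voiceprint(audio, enrolled) >= threshold

print(grant_access("alice", b"enrolled-print"))  # True
print(grant_access("alice", b"someone-else"))    # False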
Abstract:
A system is provided for the automatic detection of fraudulent activity on a transaction network, wherein each transaction has an associated identifier. The system includes voice comparison means (32) for comparing a first sampled voice of a user of a first transaction with a subsequently sampled voice of a user of a subsequent transaction having an identical identifier to that of the first transaction. Control means in the form of a fraud detection engine (26) is provided for determining, from said comparison, a profile of user usage that is representative of a total number of different users of the associated identifier. The fraud detection engine is configured to compare the profile of user usage with a threshold for fraudulent use, and to generate a fraud condition signal (40) in the event that the profile exceeds said threshold. If required, the fraud condition signal may be provided to automatically terminate the fraudulent transaction.
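An illustrative sketch of the usage-profile idea, reducing each sampled voice to a feature vector and the profile to a greedy count of distinct voices per identifier (the actual comparison means and threshold are not specified in the abstract):

# Illustrative sketch: for each transaction identifier, voices of successive
# transactions are compared; if the estimated number of distinct users
# exceeds a threshold, a fraud condition signal is raised.
import numpy as np

def estimate_distinct_users(voice_samples, distance_threshold=1.0) -> int:
    """Greedy count of voices that do not match any earlier voice."""
    centroids = []
    for v in voice_samples:
        v = np.asarray(v, dtype=float)
        if not any(np.linalg.norm(v - c) < distance_threshold for c in centroids):
            centroids.append(v)
    return len(centroids)

def fraud_condition(voice_samples, max_users: int = 2) -> bool:
    """Generate a fraud condition when the usage profile exceeds the threshold."""
    return estimate_distinct_users(voice_samples) > max_users

samples = [[0.1, 0.2], [0.12, 0.19], [3.0, 4.0], [7.0, 1.0]]  # toy voice features
print(fraud_condition(samples))  # True: three distinct users on one identifier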
Abstract:
Human speech is transported through a voice and data converged Internet network to recognize its content, to verify the identity of the speaker, or to verify the content of a spoken phrase, by utilizing the Internet protocol to transmit voice packets. The voice data (4) entered is processed and transmitted in the same way as Internet data packets over converged voice and data IP networks. A voice-enabled application sends a message (5), which is decoded by the speech API (2), and the appropriate control and synchronization information is issued (7) to the data preparation module (9) and to the speech engine (3). Standard voice over IP includes a speech compression algorithm and the use of RTP (Real-time Transport Protocol), enabling additional processing of the human voice anywhere in the network to perform speaker verification, with or without the knowledge of the speaker.
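A small sketch of carrying voice frames in RTP packets so that a node anywhere in the network could tap them for speaker verification; only the fixed 12-byte RTP header (RFC 3550) is built, and the compression codec and verification steps are left out:

# Hedged sketch: packetize voice data as minimal RTP packets.
import struct

def rtp_packet(payload: bytes, seq: int, timestamp: int,
               ssrc: int = 0x1234, payload_type: int = 0) -> bytes:
    """Build a minimal RTP packet: V=2, no padding/extension/CSRC, no marker."""
    header = struct.pack("!BBHII",
                         0x80,            # version 2
                         payload_type,    # e.g. 0 = PCMU
                         seq & 0xFFFF,
                         timestamp & 0xFFFFFFFF,
                         ssrc)
    return header + payload

def packetize(voice: bytes, frame_size: int = 160):
    """Split (compressed) voice data into sequential RTP packets."""
    for i in range(0, len(voice), frame_size):
        yield rtp_packet(voice[i:i + frame_size], seq=i // frame_size, timestamp=i)

packets = list(packetize(b"\x00" * 480))
print(len(packets), len(packets[0]))  # 3 packets, 172 bytes each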
Abstract:
The invention relates to a method for the voice-operated identification of the user of a telecommunication line in a telecommunications network during an interactive communication using a voice-operated conversational system. According to the invention, during a person-to-person and/or person-to-machine conversation, the verbal expressions of a caller, belonging to the group of callers restricted to a telecommunication line, are used to create a reference pattern for that user. A user identification is stored for each reference pattern. Said identification is activated after recognition of the user and, together with the CLI or ANI, is made available to a server with a voice-operated conversational system. Data for this user, which has already been saved under the CLI with the user identification, is determined by the system and made available for the interaction with the client. The inventive method can be advantageously used in voice-operated conversational systems.
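A rough sketch of a line-restricted registry of reference patterns keyed by CLI/ANI; the feature averaging and nearest-neighbour matching are illustrative simplifications, not the described training procedure:

# Rough sketch: each telephone line (CLI/ANI) maps to a small set of
# reference patterns, one per user of that line. Names are illustrative.
import numpy as np

class LineUserRegistry:
    def __init__(self):
        self.lines = {}   # CLI -> list of (user_id, reference_pattern)

    def enroll(self, cli: str, user_id: str, utterance_features):
        """Create a reference pattern for a user of this line from their speech."""
        pattern = np.asarray(utterance_features, dtype=float).mean(axis=0)
        self.lines.setdefault(cli, []).append((user_id, pattern))

    def identify(self, cli: str, utterance_features, max_dist: float = 1.0):
        """Return the user identification for the closest pattern on this line."""
        candidates = self.lines.get(cli, [])
        if not candidates:
            return None
        feats = np.asarray(utterance_features, dtype=float).mean(axis=0)
        user_id, pattern = min(candidates,
                               key=lambda c: np.linalg.norm(feats - c[1]))
        return user_id if np.linalg.norm(feats - pattern) <= max_dist else None

reg = LineUserRegistry()
reg.enroll("+49301234567", "user-A", [[0.1, 0.2], [0.2, 0.1]])
print(reg.identify("+49301234567", [[0.15, 0.15]]))  # 'user-A'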
Abstract:
Authentication of voice message recipient network addresses employs generating (102) and storing (104) a 'network file' that includes 'voice clips' and associated network addresses that are extracted from voice messages received across a network (10) from voice message systems (16, 18). A voice clip is the first one to three seconds of voice extracted from each received voice message. Over time, the network file will grow to contain multiple voice clips and associated network voice message addresses. When a voice message originator subsequently enters a recipient's network address (106), the originating voice message system searches (114) the network file for the network address, retrieves the associated voice clip (116), and plays it for the voice message originator to authenticate the recipient's network address. Voice authentication of a voice message originator entails encoding (134) original voice clips and associated network addresses received from positively identified voice message originators into a 'voice print file'. Thereafter, when a questionable voice message is received (138), the voice message system extracts a new voice clip (142), generates a new voice print (144), and compares it with the original voice print associated with the voice message address (148). If the voice prints are substantially the same, the received voice message is annotated with an 'authenticating' message (150).
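The network-file portion of the scheme might be sketched as below; the clip length, sample rate, and storage layout are assumptions for illustration only:

# Simplified sketch of the network file: voice clips (the first seconds of
# received messages) are stored per network address and replayed to the
# originator to confirm the recipient address.
CLIP_SECONDS = 3
SAMPLE_RATE = 8000  # assumed 8 kHz, 1 byte per sample

network_file = {}   # network address -> voice clip bytes

def store_clip(address: str, message_audio: bytes):
    """Extract the first seconds of a received message and file it by address."""
    network_file[address] = message_audio[: CLIP_SECONDS * SAMPLE_RATE]

def clip_for_recipient(address: str):
    """When an originator enters a recipient address, return the clip to play."""
    return network_file.get(address)

store_clip("mailbox://site-b/204", b"\x01" * 40000)
clip = clip_for_recipient("mailbox://site-b/204")
print(len(clip) if clip else "no clip on file")  # 24000 samples to play back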
Abstract:
A method and apparatus for recording and indexing audio information exchanged during an audio conference call, or video, audio and data information exchanged during a multimedia conference. For a multimedia conference, the method and apparatus utilize the voice-activated switching functionality of a multipoint control unit (MCU) (26) to provide a video signal, which is input to the MCU (26) from the workstation at which an audio signal is detected, to each of the other workstations participating in the conference. A workstation- and/or participant-identifying signal generated by the multipoint control unit (26) is stored, together with or in correspondence with the audio signal and video information, for subsequent ready retrieval of the stored multimedia information. For an audio conference, a computer (32') is connected to an audio bridge (44) for recording the audio information along with an identification signal for correlating each conference participant with that participant's statements.
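A toy sketch of the indexing idea, storing each audio segment together with the identifying signal so that one participant's statements can later be retrieved; the data layout and names are illustrative assumptions:

# Toy sketch: each audio segment is stored with the participant/workstation
# identifier supplied by the bridge or MCU, enabling retrieval by speaker.
from dataclasses import dataclass

@dataclass
class IndexedSegment:
    start_s: float
    end_s: float
    participant_id: str
    audio: bytes

class ConferenceRecord:
    def __init__(self):
        self.segments = []

    def record(self, start_s, end_s, participant_id, audio):
        """Store audio in correspondence with the identifying signal."""
        self.segments.append(IndexedSegment(start_s, end_s, participant_id, audio))

    def statements_by(self, participant_id):
        """Ready retrieval of everything one participant said."""
        return [s for s in self.segments if s.participant_id == participant_id]

rec = ConferenceRecord()
rec.record(0.0, 4.2, "workstation-3", b"...")
rec.record(4.2, 9.0, "workstation-1", b"...")
print(len(rec.statements_by("workstation-3")))  # 1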