SYSTEMS AND METHODS OF SPEAKER-INDEPENDENT EMBEDDING FOR IDENTIFICATION AND VERIFICATION FROM AUDIO

    Publication number: US20210280171A1

    Publication date: 2021-09-09

    Application number: US17192464

    Application date: 2021-03-04

    Abstract: Embodiments described herein provide for audio processing operations that evaluate characteristics of audio signals that are independent of the speaker's voice. A neural network architecture trains and applies discriminatory neural networks tasked with modeling and classifying speaker-independent characteristics. The task-specific models generate or extract feature vectors from input audio data based on the trained embedding extraction models. The embeddings from the task-specific models are concatenated to form a deep-phoneprint (DP) vector for the input audio signal. The DP vector is a low-dimensional representation of each of the speaker-independent characteristics of the audio signal and is applied in various downstream operations.
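The concatenation step described in this abstract can be sketched as follows. This is a minimal illustration only, not the patented implementation: the task names (`device_type`, `carrier`, `codec`), the embedding values, and the helper `build_dp_vector` are all hypothetical.

```python
from typing import Dict, List


def build_dp_vector(task_embeddings: Dict[str, List[float]],
                    task_order: List[str]) -> List[float]:
    """Concatenate per-task embeddings in a fixed order into one vector."""
    dp_vector: List[float] = []
    for task in task_order:
        # A fixed task order keeps each dimension's meaning stable across calls.
        dp_vector.extend(task_embeddings[task])
    return dp_vector


# Hypothetical embeddings, one per speaker-independent characteristic.
embeddings = {
    "device_type": [0.12, -0.40],
    "carrier": [0.88, 0.05, -0.31],
    "codec": [0.27],
}

dp_vector = build_dp_vector(embeddings, ["device_type", "carrier", "codec"])
print(len(dp_vector), dp_vector)
```

The resulting vector has one contiguous slice per task, so a downstream consumer can recover any single task's embedding from its known offset and length.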

    FRAUD IMPORTANCE SYSTEM
    Type: Invention application

    Publication number: US20220006899A1

    Publication date: 2022-01-06

    Application number: US17365970

    Application date: 2021-07-01

    Abstract: Embodiments described herein provide for a fraud detection engine for detecting various types of fraud at a call center and a fraud importance engine for tailoring the fraud detection operations to the relative importance of fraud events. The fraud importance engine determines which fraud events are comparatively more important than others. The fraud detection engine comprises machine-learning models that consume contact data and fraud importance information for various anti-fraud processes. The fraud importance engine calculates importance scores for fraud events based on user-customized attributes, such as fraud type or fraud activity. The fraud importance scores are used in various processes, such as model training, model selection, and selecting weights or hyper-parameters for the ML models, among others. The fraud detection engine uses the importance scores to prioritize fraud alerts for review. The fraud importance engine receives detection feedback indicating which contacts involved false negatives, where fraud events were undetected but should have been detected.
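One way to picture the alert prioritization described in this abstract is the sketch below. It assumes, purely for illustration, that importance is a per-fraud-type weight set by the user and that alerts are ranked by the product of model risk score and importance. The abstract does not specify the scoring formula; the weights, the fraud-type names, and the `FraudAlert` type are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class FraudAlert:
    alert_id: str
    fraud_type: str
    risk_score: float  # hypothetical model output in [0, 1]


# Hypothetical user-customized importance weights per fraud type.
IMPORTANCE_WEIGHTS: Dict[str, float] = {
    "account_takeover": 1.0,
    "social_engineering": 0.6,
    "reconnaissance": 0.3,
}


def importance_score(alert: FraudAlert, default: float = 0.1) -> float:
    """Look up the importance of an alert from its fraud type."""
    return IMPORTANCE_WEIGHTS.get(alert.fraud_type, default)


def prioritize(alerts: List[FraudAlert]) -> List[FraudAlert]:
    """Order alerts for review, most important first (assumed ranking rule)."""
    return sorted(alerts,
                  key=lambda a: a.risk_score * importance_score(a),
                  reverse=True)


alerts = [
    FraudAlert("A1", "reconnaissance", 0.9),
    FraudAlert("A2", "account_takeover", 0.5),
    FraudAlert("A3", "social_engineering", 0.7),
]
review_queue = [a.alert_id for a in prioritize(alerts)]
print(review_queue)
```

In this example the account-takeover alert is reviewed first even though its raw risk score is the lowest, which is the effect of weighting detection output by importance.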

    FRAUD IMPORTANCE SYSTEM
    Type: Invention publication

    Publication number: US20240214490A1

    Publication date: 2024-06-27

    Application number: US18432316

    Application date: 2024-02-05

    CPC classification number: H04M3/2281 G06N20/00 H04L63/1483

    Abstract: Embodiments described herein provide for a fraud detection engine for detecting various types of fraud at a call center and a fraud importance engine for tailoring the fraud detection operations to the relative importance of fraud events. The fraud importance engine determines which fraud events are comparatively more important than others. The fraud detection engine comprises machine-learning models that consume contact data and fraud importance information for various anti-fraud processes. The fraud importance engine calculates importance scores for fraud events based on user-customized attributes, such as fraud type or fraud activity. The fraud importance scores are used in various processes, such as model training, model selection, and selecting weights or hyper-parameters for the ML models, among others. The fraud detection engine uses the importance scores to prioritize fraud alerts for review. The fraud importance engine receives detection feedback indicating which contacts involved false negatives, where fraud events were undetected but should have been detected.

    CROSS-LINGUAL SPEAKER RECOGNITION

    Publication number: US20230137652A1

    Publication date: 2023-05-04

    Application number: US17977521

    Application date: 2022-10-31

    Abstract: Disclosed are systems and methods including computing processes executing machine-learning architectures for voice biometrics, in which the machine-learning architecture implements one or more language compensation functions. Embodiments include an embedding extraction engine (sometimes referred to as an “embedding extractor”) that extracts speaker embeddings and determines a speaker similarity score for determining or verifying the likelihood that speakers in different audio signals are the same speaker. The machine-learning architecture further includes a multi-class language classifier that determines a language likelihood score that indicates the likelihood that a particular audio signal includes a spoken language. The features and functions of the machine-learning architecture described herein may implement the various language compensation techniques to provide more accurate speaker recognition results, regardless of the language spoken by the speaker.
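A speaker similarity score between two embeddings is commonly computed as a cosine similarity; the sketch below assumes that choice, which the abstract does not confirm. The embedding values and the decision threshold are hypothetical, and the language compensation functions themselves are not shown because the abstract does not describe them.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length embeddings."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


# Hypothetical embeddings for an enrolled speaker and an inbound caller.
enrolled = [0.2, 0.9, -0.1]
inbound = [0.25, 0.85, -0.05]

score = cosine_similarity(enrolled, inbound)
same_speaker = score >= 0.8  # hypothetical verification threshold
print(round(score, 4), same_speaker)
```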

    SPEAKER RECOGNITION WITH QUALITY INDICATORS

    Publication number: US20250124945A1

    Publication date: 2025-04-17

    Application number: US18989690

    Application date: 2024-12-20

    Abstract: Embodiments described herein provide for a machine-learning architecture for modeling quality measures for enrollment signals. Modeling these enrollment signals enables the machine-learning architecture to identify deviations from expected or ideal enrollment signals in future test-phase calls. These differences can be used to generate quality measures for the various audio descriptors or characteristics of audio signals. The quality measures can then be fused at the score level with the speaker recognition's embedding comparisons for verifying the speaker. Fusing the quality measures with the similarity scoring essentially calibrates the speaker recognition's outputs based on the realities of what is actually expected for the enrolled caller and what was actually observed for the current inbound caller.
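Score-level fusion of this kind is often done with a weighted sum passed through a logistic function; the sketch below assumes that approach, which the abstract does not specify. The weights, the bias, and the convention that each quality measure is zero for an ideal signal and negative for a degraded one are all hypothetical.

```python
import math
from typing import Sequence


def fuse_scores(similarity: float,
                quality_measures: Sequence[float],
                sim_weight: float = 4.0,
                quality_weights: Sequence[float] = (1.0, 1.0),
                bias: float = -2.0) -> float:
    """Combine a similarity score with quality measures into one value in (0, 1)."""
    if len(quality_measures) != len(quality_weights):
        raise ValueError("one weight is required per quality measure")
    z = bias + sim_weight * similarity
    z += sum(w * q for w, q in zip(quality_weights, quality_measures))
    return 1.0 / (1.0 + math.exp(-z))  # logistic squashing (assumed)


# Same raw similarity, observed under good and poor audio conditions.
good_conditions = fuse_scores(0.7, [0.0, 0.0])
poor_conditions = fuse_scores(0.7, [-1.0, -1.5])
print(round(good_conditions, 3), round(poor_conditions, 3))
```

With identical embedding similarity, the fused score drops when the observed signal deviates from what was expected at enrollment, which is the calibration effect the abstract describes.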

    SPEAKER RECOGNITION WITH QUALITY INDICATORS

    Publication number: US20220059121A1

    Publication date: 2022-02-24

    Application number: US17408281

    Application date: 2021-08-20

    Abstract: Embodiments described herein provide for a machine-learning architecture for modeling quality measures for enrollment signals. Modeling these enrollment signals enables the machine-learning architecture to identify deviations from expected or ideal enrollment signals in future test-phase calls. These differences can be used to generate quality measures for the various audio descriptors or characteristics of audio signals. The quality measures can then be fused at the score level with the speaker recognition's embedding comparisons for verifying the speaker. Fusing the quality measures with the similarity scoring essentially calibrates the speaker recognition's outputs based on the realities of what is actually expected for the enrolled caller and what was actually observed for the current inbound caller.
