SYSTEM AND METHOD FOR COMPUTING AND TRANSMITTING PARAMETERS IN A DISTRIBUTED VOICE RECOGNITION SYSTEM
    Invention application
    Status: under examination (published)

    Publication number: WO02061727A3

    Publication date: 2003-02-27

    Application number: PCT/US0202625

    Application date: 2002-01-29

    Applicant: QUALCOMM INC

    CPC classification number: G10L15/30 G10L15/02

    Abstract: A system and method for extracting acoustic features and speech activity on a device and transmitting them in a distributed voice recognition system. The distributed voice recognition system includes a local VR engine in a subscriber unit (102) and a server VR engine in a server (160). The local VR engine comprises a feature extraction (FE) module (104) that extracts features from a speech signal, and a voice activity detection (VAD) module (106) that detects voice activity within the speech signal. The voice activity signal and the features are downsampled before they are transmitted from the local engine to the server engine. The system includes filters, framing and windowing modules, power spectrum analyzers, a neural network, a nonlinear element, and other components to selectively provide an advanced front end vector, including predetermined portions of the voice activity detection indication and extracted features, from the subscriber unit (102) to the server (160). The indication of detected voice activity is transmitted ahead of the extracted features in order to avoid long recognition delays. The system also includes a module that generates additional feature vectors on the server from the received features using a feed-forward multilayer perceptron (MLP) and provides them to the speech server (160).
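The subscriber-side flow described in the abstract (frame the signal, detect voice activity, extract features, downsample both, and send the voice activity indication ahead of the features) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the application: the energy-threshold detector, the two toy features (log energy and zero-crossing rate), the frame sizes, the downsampling factor of 2, and all function names are hypothetical.

```python
import math
from typing import Iterator, List, Sequence, Tuple


def frame_signal(samples: Sequence[float], frame_len: int, hop: int) -> List[Sequence[float]]:
    """Split the signal into overlapping frames (no windowing in this sketch)."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def log_energy(frame: Sequence[float]) -> float:
    """Log of the mean squared amplitude, floored to avoid log(0)."""
    return math.log(sum(x * x for x in frame) / len(frame) + 1e-10)


def zero_crossing_rate(frame: Sequence[float]) -> float:
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if a * b < 0)
    return crossings / (len(frame) - 1)


def detect_voice_activity(frames, threshold: float) -> List[bool]:
    """Assumed detector: a frame is 'speech' if its log energy exceeds a threshold."""
    return [log_energy(f) > threshold for f in frames]


def extract_features(frames) -> List[List[float]]:
    """Assumed toy feature vector per frame: [log energy, zero-crossing rate]."""
    return [[log_energy(f), zero_crossing_rate(f)] for f in frames]


def downsample(items: list, factor: int) -> list:
    """Keep every `factor`-th item to reduce the transmitted data rate."""
    return items[::factor]


def messages_for_utterance(samples: Sequence[float], frame_len: int = 200,
                           hop: int = 80, threshold: float = -6.0,
                           factor: int = 2) -> Iterator[Tuple[str, list]]:
    """Yield messages in transmission order: VAD flags first, then features."""
    frames = frame_signal(samples, frame_len, hop)
    vad = downsample(detect_voice_activity(frames, threshold), factor)
    feats = downsample(extract_features(frames), factor)
    yield ("vad", vad)            # voice activity indication goes out ahead
    for vec in feats:
        yield ("features", vec)   # feature vectors follow


if __name__ == "__main__":
    # 0.1 s of silence followed by 0.1 s of a 200 Hz tone at 8 kHz sampling.
    tone = [0.5 * math.sin(2 * math.pi * 200 * n / 8000) for n in range(800)]
    for kind, payload in messages_for_utterance([0.0] * 800 + tone):
        print(kind, payload)
```

Sending the compact voice activity flags as their own leading message lets a receiver learn where speech starts and ends before the bulkier feature vectors arrive, which is the delay-reduction idea the abstract states; the server-side MLP stage is not modeled here.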

