Invention Publication
- Patent Title: METHOD AND SYSTEM FOR VISUAL CONTEXT AWARE AUTOMATIC SPEECH RECOGNITION
- Application No.: US18333983
- Application Date: 2023-06-13
- Publication No.: US20240038224A1
- Publication Date: 2024-02-01
- Inventor: CHAYAN SARKAR, PRADIP PRAMANICK, RUCHIRA SINGH
- Applicant: Tata Consultancy Services Limited
- Applicant Address: IN Mumbai
- Assignee: Tata Consultancy Services Limited
- Current Assignee: Tata Consultancy Services Limited
- Current Assignee Address: IN Mumbai
- Priority: IN 2221043394 2022.07.28
- Main IPC: G10L15/16
- IPC: G10L15/16; G10L15/06

Abstract:
Transcript accuracy is of foremost importance in Automatic Speech Recognition (ASR). State-of-the-art systems mostly rely on spelling-correction-based contextual improvement in ASR, which is generally a static-vocabulary-based biasing approach. Embodiments of the present disclosure provide a method and system for visual context aware ASR. The method provides biasing using a shallow fusion biasing approach with a modified beam search decoding technique, which introduces a non-greedy pruning strategy to allow biasing at the sub-word level. The biasing algorithm brings the visual context of the robot into the speech recognizer based on a dynamic biasing vocabulary, improving the transcription accuracy. The dynamic biasing vocabulary, comprising objects in the current environment accompanied by their self and relational attributes, is generated using a bias prediction network that explicitly adds labels to objects, which are detected and captioned via a state-of-the-art dense image captioning network.
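The shallow fusion idea in the abstract can be illustrated with a minimal Python sketch. It is not the claimed method: the function names, the fixed `bonus` value, and the toy per-step scores (which ignore decoding history) are all assumptions made for illustration, and pruning here is plain top-k rather than the non-greedy strategy the abstract refers to. The sketch only shows how a bonus added at the sub-word level can steer beam search toward entries in a biasing vocabulary.

```python
import math

def build_prefixes(bias_words):
    """Collect every sub-word prefix (partial and full) of the biasing entries."""
    prefixes = set()
    for pieces in bias_words:
        for i in range(1, len(pieces) + 1):
            prefixes.add(tuple(pieces[:i]))
    return prefixes

def biased_beam_search(step_log_probs, bias_words, beam_width=2, bonus=1.0):
    """Toy beam search with a shallow-fusion-style bonus on biasing sub-words.

    step_log_probs: list of dicts (sub-word -> log-probability), one per step.
    This stands in for a real recognizer, which would condition on history.
    bias_words: list of sub-word sequences, e.g. [["_c", "up"]].
    """
    prefixes = build_prefixes(bias_words)
    # Each hypothesis is (tokens, score, match); match is the run of sub-words
    # that currently forms a prefix of some biasing entry.
    beams = [((), 0.0, ())]
    for log_probs in step_log_probs:
        candidates = []
        for tokens, score, match in beams:
            for piece, lp in log_probs.items():
                extended = match + (piece,)
                if extended in prefixes:        # continues a partial match
                    new_match, boost = extended, bonus
                elif (piece,) in prefixes:      # starts a new match
                    new_match, boost = (piece,), bonus
                else:                           # no match: no bonus
                    new_match, boost = (), 0.0
                candidates.append((tokens + (piece,), score + lp + boost, new_match))
        # Plain top-k pruning (an assumption; not the patent's pruning strategy).
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return [(list(tokens), score) for tokens, score, _ in beams]

# Hypothetical scores: the recognizer slightly prefers "read cap".
steps = [
    {"_read": math.log(0.6), "_red": math.log(0.4)},
    {"_c": math.log(0.9), "_k": math.log(0.1)},
    {"ap": math.log(0.55), "up": math.log(0.45)},
]
# Hypothetical biasing vocabulary, e.g. derived from a detected "red cup".
bias = [["_red"], ["_c", "up"]]

unbiased_best = biased_beam_search(steps, bias, bonus=0.0)[0][0]
biased_best = biased_beam_search(steps, bias, bonus=1.0)[0][0]
```

With the bonus disabled the top hypothesis follows the raw scores, while with the bonus enabled the sub-words of the biasing entries win. A fuller implementation would typically also withdraw the bonus when a partial match fails to complete; that step is omitted here for brevity.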
Public/Granted literature
- US12334057B2 Method and system for visual context aware automatic speech recognition Public/Granted day: 2025-06-17