- Patent Title: Speech-to-analytics framework with support for large n-gram corpora
- Application No.: US17370441
- Application Date: 2021-07-08
- Publication No.: US11217233B1
- Publication Date: 2022-01-04
- Inventor: Xiaozhuo Cheng, Xu Yang, Xiaolong Li, Biljana Belamaric Wilsey, Haipeng Liu, Jared Peterson
- Applicant: SAS Institute Inc.
- Applicant Address: US NC Cary
- Assignee: SAS Institute Inc.
- Current Assignee: SAS Institute Inc.
- Current Assignee Address: US NC Cary
- Agency: Kacvinsky Daisak Bluni PLLC
- Main IPC: G06N3/02
- IPC: G06N3/02; G06N7/00; G10L15/04; G10L15/16; G10L15/18; G10L15/197; G10L15/22; G10L15/30

Abstract:
An apparatus includes processor(s) to: generate a set of candidate n-grams based on probability distributions from an acoustic model for candidate graphemes of a next word most likely spoken following at least one preceding word spoken within speech audio; provide the set of candidate n-grams to multiple devices; provide, to each node device, an indication of which candidate n-grams are to be searched for within the n-gram corpus by each node device to enable searches for multiple candidate n-grams to be performed, independently and at least partially in parallel, across the node devices; receive, from each node device, an indication of a probability of occurrence of at least one candidate n-gram within the speech audio; based on the received probabilities of occurrence, identify the next word most likely spoken within the speech audio; and add the next word most likely spoken to a transcript of the speech audio.
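
The flow in the abstract (build candidate n-grams, assign them to node devices, search in parallel, pick the most probable next word, extend the transcript) can be illustrated with a small Python sketch. This is not the patented implementation. Everything named here is a hypothetical stand-in: `TOY_CORPUS` replaces a real n-gram corpus, worker threads replace node devices, the round-robin split in `assign_round_robin` is one assumed assignment scheme, and the candidate words are supplied directly instead of being derived from an acoustic model's grapheme probability distributions.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy corpus: n-gram -> probability of occurrence.
TOY_CORPUS = {
    ("the", "quick", "brown"): 0.6,
    ("the", "quick", "crown"): 0.1,
    ("the", "quick", "frown"): 0.05,
}


def assign_round_robin(candidates, num_nodes):
    """Split the candidates into one disjoint subset per node (assumed scheme)."""
    return [candidates[i::num_nodes] for i in range(num_nodes)]


def node_search(corpus, assigned):
    """Stand-in for one node device: look up each assigned n-gram.

    N-grams missing from the corpus get probability 0.0.
    """
    return {ngram: corpus.get(ngram, 0.0) for ngram in assigned}


def next_word(preceding, candidate_words, corpus, num_nodes=2):
    """Return the candidate word whose n-gram has the highest probability."""
    # One candidate n-gram per candidate next word.
    candidates = [tuple(preceding) + (word,) for word in candidate_words]
    # Tell each "node" which candidate n-grams it should search for.
    assignments = assign_round_robin(candidates, num_nodes)
    # Run the per-node searches independently, here on worker threads.
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        results = list(pool.map(lambda a: node_search(corpus, a), assignments))
    # Merge the reported probabilities and keep the best-scoring n-gram.
    probabilities = {}
    for partial in results:
        probabilities.update(partial)
    best_ngram = max(probabilities, key=probabilities.get)
    return best_ngram[-1]


transcript = ["the", "quick"]
transcript.append(next_word(transcript, ["brown", "crown", "frown"], TOY_CORPUS))
print(transcript)  # ['the', 'quick', 'brown']
```

In a real deployment each node would hold or access the corpus itself and report results over a network; threads are used here only to keep the sketch self-contained.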