Invention Grant
- Patent Title: Back-off language model compression
- Patent Title (Chinese): 后退语言模型压缩
- Application No.: US12486358
- Application Date: 2009-06-17
- Publication No.: US08725509B1
- Publication Date: 2014-05-13
- Inventor: Boulos Harb, Ciprian Chelba, Jeffrey A. Dean, Sanjay Ghemawat
- Applicant: Boulos Harb, Ciprian Chelba, Jeffrey A. Dean, Sanjay Ghemawat
- Applicant Address: US CA Mountain View
- Assignee: Google Inc.
- Current Assignee: Google Inc.
- Current Assignee Address: US CA Mountain View
- Agency: Remarck Law Group PLC
- Main IPC: G10L15/00
- IPC: G10L15/00; G10L15/06; G10L15/28; G06F17/21

Abstract:
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, relating to language models stored for digital language processing. In one aspect, a method includes the actions of generating a language model, including: receiving a collection of n-grams from a corpus, each n-gram of the collection having a corresponding first probability of occurring in the corpus, and generating a trie representing the collection of n-grams, the trie being represented using one or more arrays of integers, and compressing an array representation of the trie using block encoding; and using the language model to identify a second probability of a particular string of words occurring.
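
Illustrative sketch (not taken from the patent text): the abstract names two steps, representing the n-gram trie as arrays of integers and compressing an array with block encoding. The Python below shows one plausible reading of those steps under stated assumptions. The function names (build_trie_arrays, block_encode, block_lookup), the per-level words/offsets layout, the block size of 4, and the delta-plus-varint scheme inside each block are all hypothetical choices made for illustration; the claimed encoding may differ. Words are assumed to be already mapped to integer IDs, and probabilities are omitted.

```python
from typing import List, Tuple


def build_trie_arrays(ngrams: List[Tuple[int, ...]]):
    """Flatten a non-empty n-gram collection into per-level integer arrays.

    words[k][i]   : last word ID of the i-th node at depth k + 1
    offsets[k][i] : index in level k + 1 where that node's children start;
                    offsets[k][i + 1] marks where they end
    """
    # Include every prefix so each n-gram's ancestors exist as trie nodes.
    prefixes = sorted({g[:k] for g in ngrams for k in range(1, len(g) + 1)})
    depth = max(len(p) for p in prefixes)
    levels = [[p for p in prefixes if len(p) == k + 1] for k in range(depth)]
    words = [[p[-1] for p in level] for level in levels]
    offsets = []
    for k in range(depth - 1):
        child_count = {p: 0 for p in levels[k]}
        for child in levels[k + 1]:
            child_count[child[:-1]] += 1
        offs = [0]
        for p in levels[k]:
            offs.append(offs[-1] + child_count[p])
        offsets.append(offs)
    return words, offsets


def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer using 7 bits per byte, low bits first."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def block_encode(values: List[int], block_size: int = 4):
    """Encode a non-decreasing array as blocks of (anchor, varint deltas)."""
    anchors, starts, payload = [], [], bytearray()
    for b in range(0, len(values), block_size):
        block = values[b:b + block_size]
        anchors.append(block[0])      # first value of the block, stored raw
        starts.append(len(payload))   # byte position of this block's deltas
        for prev, cur in zip(block, block[1:]):
            payload += encode_varint(cur - prev)
    return anchors, starts, bytes(payload)


def block_lookup(encoded, index: int, block_size: int = 4) -> int:
    """Recover values[index] by decoding only the block that contains it."""
    anchors, starts, payload = encoded
    b, within = divmod(index, block_size)
    value, pos = anchors[b], starts[b]
    for _ in range(within):
        delta, shift = 0, 0
        while True:
            byte = payload[pos]
            pos += 1
            delta |= (byte & 0x7F) << shift
            shift += 7
            if not byte & 0x80:
                break
        value += delta
    return value


# Hypothetical toy collection: word IDs 1..3, n-grams up to order 3.
ngrams = [(1,), (2,), (1, 2), (1, 3), (2, 3), (1, 2, 3)]
words, offsets = build_trie_arrays(ngrams)
encoded_offsets = [block_encode(o) for o in offsets]
```

In this sketch only the offsets arrays are block encoded, because they are non-decreasing and so every delta is non-negative. The fixed block size is what keeps random access cheap: a lookup decodes at most block_size - 1 deltas instead of the whole array.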