Invention Grant
US08572126B2 Systems and methods for optimizing very large n-gram collections for speed and memory
Status: In force
- Patent Title: Systems and methods for optimizing very large n-gram collections for speed and memory
- Application No.: US13168338
- Application Date: 2011-06-24
- Publication No.: US08572126B2
- Publication Date: 2013-10-29
- Inventor: Michael Flor
- Applicant: Michael Flor
- Applicant Address: US NJ Princeton
- Assignee: Educational Testing Service
- Current Assignee: Educational Testing Service
- Current Assignee Address: US NJ Princeton
- Agency: Jones Day
- Main IPC: G06F17/30
- IPC: G06F17/30

Abstract:
A computer memory stores a data structure representing a ternary search tree (TST) that represents multiple word n-grams for a corpus of documents. The data structure includes plural records in a first memory, each record representing a node of the TST and comprising plural fields. At least some n-grams have a sequence of units. The fields include one for identifying a given unit of the sequence for a given node, one reserved for storing payload information for the given node, and plural child fields reserved for storing information for the first, second, and third child nodes of the given node. Each child field stores either a null value indicating the absence of the child node or an identifier identifying a memory location of the child node. For at least one record, at least one of the child fields stores an identifier identifying a memory location in a memory different from the first memory.
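
The record layout described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. Everything named here is an assumption made for illustration: the `Record` and `TST` classes, the `(memory id, index)` tuple used as a child identifier, the use of whole words as units, an integer count as the payload, and the simple rule that new records spill into a second list once the first one reaches `first_capacity`.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical identifier encoding: (memory id, record index in that memory).
Ref = Tuple[int, int]


@dataclass
class Record:
    """One TST node: a unit, a payload slot, and three child fields."""
    unit: str                      # the word this node stands for
    payload: Optional[int] = None  # e.g. an n-gram count; None if no n-gram ends here
    lo: Optional[Ref] = None       # first child: units that sort before `unit`
    eq: Optional[Ref] = None       # second child: next unit of the same n-gram
    hi: Optional[Ref] = None       # third child: units that sort after `unit`


class TST:
    def __init__(self, first_capacity: int) -> None:
        # memories[0] plays the role of the "first memory";
        # memories[1] stands in for a different memory (assumed overflow store).
        self.memories: List[List[Record]] = [[], []]
        self.first_capacity = first_capacity
        self.root: Optional[Ref] = None

    def _alloc(self, unit: str) -> Ref:
        mem = 0 if len(self.memories[0]) < self.first_capacity else 1
        self.memories[mem].append(Record(unit))
        return (mem, len(self.memories[mem]) - 1)

    def _get(self, ref: Ref) -> Record:
        return self.memories[ref[0]][ref[1]]

    def insert(self, ngram: str, payload: int) -> None:
        units = ngram.split()
        if not units:
            raise ValueError("n-gram must contain at least one unit")
        if self.root is None:
            self.root = self._alloc(units[0])
        ref, i = self.root, 0
        while True:
            node = self._get(ref)
            if units[i] < node.unit:
                if node.lo is None:
                    node.lo = self._alloc(units[i])
                ref = node.lo
            elif units[i] > node.unit:
                if node.hi is None:
                    node.hi = self._alloc(units[i])
                ref = node.hi
            else:
                i += 1
                if i == len(units):
                    node.payload = payload  # n-gram ends at this node
                    return
                if node.eq is None:
                    node.eq = self._alloc(units[i])
                ref = node.eq

    def lookup(self, ngram: str) -> Optional[int]:
        units = ngram.split()
        ref, i = self.root, 0
        while ref is not None and units:
            node = self._get(ref)
            if units[i] < node.unit:
                ref = node.lo
            elif units[i] > node.unit:
                ref = node.hi
            else:
                i += 1
                if i == len(units):
                    return node.payload
                ref = node.eq
        return None  # a null child field was reached: n-gram not stored


tst = TST(first_capacity=2)
tst.insert("the cat", 5)
tst.insert("the dog", 3)
print(tst.lookup("the dog"))   # payload stored for this n-gram
print(tst.memories[0][1].hi)   # child identifier pointing into the second memory
```

In this sketch a `None` child field corresponds to the null value in the abstract, and a child identifier whose memory id is not 0 corresponds to a child stored outside the first memory.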
Public/Granted literature
- US20110320498A1 Systems and Methods for Optimizing Very Large N-Gram Collections for Speed and Memory (Public/Granted day: 2011-12-29)