Invention Grant
US07792840B2 Two-level n-gram index structure and methods of index building, query processing and index derivation
Active
- Patent Title: Two-level n-gram index structure and methods of index building, query processing and index derivation
- Application No.: US11501265
- Application Date: 2006-08-09
- Publication No.: US07792840B2
- Publication Date: 2010-09-07
- Inventor: Kyu-Young Whang, Min-Soo Kim, Jae-Gil Lee, Min-Jae Lee
- Applicant: Kyu-Young Whang, Min-Soo Kim, Jae-Gil Lee, Min-Jae Lee
- Applicant Address: KR Daejeon
- Assignee: Korea Advanced Institute of Science and Technology
- Current Assignee: Korea Advanced Institute of Science and Technology
- Current Assignee Address: KR Daejeon
- Agency: Bachman & LaPointe, P.C.
- Priority: KR10-2005-0078687 20050826
- Main IPC: G06F7/00
- IPC: G06F7/00; G06F17/30

Abstract:
Disclosed are a structure of a two-level n-gram inverted index and methods of building the index, processing queries, and deriving the index, which reduce the size of the n-gram inverted index and improve query performance by eliminating the redundancy of position information that exists in the n-gram inverted index. The inverted index of the present invention comprises a back-end inverted index that uses subsequences extracted from documents as terms and a front-end inverted index that uses n-grams extracted from the subsequences as terms. The back-end inverted index uses, as terms, subsequences of a specific length extracted from the documents so that they overlap each other by n−1 (n: the length of the n-gram), and stores position information of the subsequences occurring in the documents in a posting list for each subsequence. The front-end inverted index uses, as terms, n-grams of a specific length extracted from the subsequences using a 1-sliding technique, and stores position information of the n-grams occurring in the subsequences in a posting list for each n-gram.
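The two levels described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names (`build_two_level_index`, `ngram_positions`), the in-memory dictionaries, the integer subsequence ids, and the choice to leave the final subsequence of a document shorter rather than pad it are all assumptions made for illustration. The parameter `m` (subsequence length) is likewise a hypothetical name; `n` is the n-gram length from the abstract.

```python
from collections import defaultdict


def build_two_level_index(docs, m, n):
    """Build a toy two-level n-gram index (illustrative sketch only).

    docs: list of strings; m: subsequence length; n: n-gram length (m >= n).
    Returns (back_end, front_end):
      back_end:  subsequence id -> [(doc id, offset of subsequence in doc)]
      front_end: n-gram -> [(subsequence id, offset of n-gram in subsequence)]
    """
    if m < n:
        raise ValueError("subsequence length m must be at least n")
    step = m - (n - 1)              # consecutive subsequences overlap by n-1
    subseq_ids = {}                 # subsequence text -> integer id (assumed scheme)
    back_end = defaultdict(list)
    front_end = defaultdict(list)

    for doc_id, text in enumerate(docs):
        # Stop once no complete n-gram starts at or after the offset.
        for offset in range(0, len(text) - n + 1, step):
            subseq = text[offset:offset + m]   # the last one may be shorter
            if subseq not in subseq_ids:
                sid = subseq_ids[subseq] = len(subseq_ids)
                # Front-end: 1-sliding n-gram extraction, done once per
                # distinct subsequence.
                for i in range(len(subseq) - n + 1):
                    front_end[subseq[i:i + n]].append((sid, i))
            # Back-end: record where this subsequence occurs in the document.
            back_end[subseq_ids[subseq]].append((doc_id, offset))
    return back_end, front_end


def ngram_positions(back_end, front_end, gram):
    """Join the two levels to recover (doc id, offset) pairs for one n-gram."""
    return sorted(
        (doc_id, sub_offset + gram_offset)
        for sid, gram_offset in front_end.get(gram, [])
        for doc_id, sub_offset in back_end[sid]
    )


docs = ["ABCDDABBCD", "ABCDABCD"]
back_end, front_end = build_two_level_index(docs, m=4, n=2)
print(ngram_positions(back_end, front_end, "AB"))
```

In this sketch the subsequence "ABCD" occurs in both documents, yet its n-grams are stored only once in the front-end index; each occurrence adds a single back-end posting. That is the kind of position-information redundancy the abstract says the two-level structure eliminates.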
Public/Granted literature
- US20070050384A1 Two-level n-gram index structure and methods of index building, query processing and index derivation (Public/Granted day: 2007-03-01)