Invention Grant
- Patent Title: Combined content indexing and data reduction
- Application No.: US11277790
- Application Date: 2006-03-29
- Publication No.: US09772981B2
- Publication Date: 2017-09-26
- Inventor: Roger F. Osmond, Gil Goren
- Applicant: Roger F. Osmond, Gil Goren
- Applicant Address: US MA Hopkinton
- Assignee: EMC IP HOLDING COMPANY LLC
- Current Assignee: EMC IP HOLDING COMPANY LLC
- Current Assignee Address: US MA Hopkinton
- Agency: Anderson Gorecki LLP
- Agent: Holmes Anderson
- Main IPC: G06F7/00
- IPC: G06F7/00; G06F17/00; G06F15/16; G06F17/22; G06F17/30; H03M7/30

Abstract:
Data storage is improved by combining content indexing and data reduction in text-containing files by using common word elimination. Raw data is processed by finding words in selected files, creating an index of found words, and replacing the words in the raw data with pointers to the corresponding words in the index. Each word appears only once in the index. Consequently, the index is relatively small and the procedure is completely reversible. In particular, the index is small relative to other methods because the data is transformed in place, and the transformed data and index are used together to capture the total information about the data.
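The procedure in the abstract (find words, build an index in which each word appears once, replace the words in the raw data with pointers into that index) can be illustrated with a minimal Python sketch. This is not the patented implementation: the tokenizer, the list-based pointer encoding, and the names `TOKEN`, `reduce_text`, and `restore_text` are assumptions made only for illustration.

```python
import re

# Assumption: a "word" is a run of ASCII letters; all other text
# (spaces, digits, punctuation) is kept verbatim as a literal.
TOKEN = re.compile(r"(?P<word>[A-Za-z]+)|(?P<other>[^A-Za-z]+)")


def reduce_text(raw, index, lookup):
    """Replace each word in `raw` with an integer pointer into `index`.

    `index` is a list holding each distinct word exactly once;
    `lookup` maps a word to its position in `index`. Both are updated
    in place so several files can share one index.
    """
    reduced = []
    for match in TOKEN.finditer(raw):
        text = match.group()
        if match.lastgroup == "word":
            if text not in lookup:
                # First occurrence: add the word to the index.
                lookup[text] = len(index)
                index.append(text)
            reduced.append(lookup[text])  # pointer replaces the word
        else:
            reduced.append(text)  # non-word text is stored as-is
    return reduced


def restore_text(reduced, index):
    """Reverse the transformation using the reduced data and the index."""
    return "".join(
        index[item] if isinstance(item, int) else item for item in reduced
    )


if __name__ == "__main__":
    index, lookup = [], {}
    raw = "the cat saw the other cat."
    reduced = reduce_text(raw, index, lookup)
    print(index)
    print(reduced)
    print(restore_text(reduced, index) == raw)
```

In this sketch words are matched case-sensitively, so "The" and "the" get separate index entries; that keeps the round trip exact. Neither the reduced data nor the index is sufficient on its own, which mirrors the abstract's point that the two are used together to capture the total information about the data.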
Public/Granted literature
- US20070233707A1 COMBINED CONTENT INDEXING AND DATA REDUCTION Public/Granted day: 2007-10-04