Invention Grant
- Patent Title: Information uniqueness assessment using string-based collection frequency
- Application No.: US16887046
- Application Date: 2020-05-29
- Publication No.: US11769005B2
- Publication Date: 2023-09-26
- Inventor: Shou-Huey Jiang, Wenjin Liu, Chao Su
- Applicant: EMC IP Holding Company LLC
- Applicant Address: US MA Hopkinton
- Assignee: EMC IP Holding Company LLC
- Current Assignee: EMC IP Holding Company LLC
- Current Assignee Address: US MA Hopkinton
- Agency: Ryan, Mason & Lewis, LLP
- Main IPC: G06F40/205
- IPC: G06F40/205; G06F40/284; G06N5/04; G06F21/62; G06V30/262

Abstract:
Techniques are provided for assessing uniqueness of information using string-based collection frequency techniques. One method comprises obtaining multiple collections of documents from at least one data source; determining a collection frequency for a given character string based on a number of the collections comprising the given character string relative to a total number of the collections; assigning a uniqueness rating to the given character string based at least in part on a comparison of the collection frequency of the given character string to a collection frequency of one or more additional character strings in one or more of the plurality of collections; and performing an automated action using the given character string based on the assigned uniqueness rating. The automated action may comprise protecting the given character string and/or identifying the given character string as important information satisfying one or more importance criteria.
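The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes that a "collection" is a list of document strings, that "comprising the given character string" means case-sensitive substring matching, and that the uniqueness rating is the share of comparison strings with a higher collection frequency. The function names, the 0.5 threshold, and the masking action are all hypothetical choices made for illustration.

```python
from typing import Sequence


def collection_frequency(term: str, collections: Sequence[Sequence[str]]) -> float:
    """Number of collections containing `term`, relative to the total number of collections."""
    if not collections:
        raise ValueError("at least one collection is required")
    # A collection counts once if any of its documents contains the string.
    containing = sum(1 for docs in collections if any(term in doc for doc in docs))
    return containing / len(collections)


def uniqueness_rating(
    term: str, other_terms: Sequence[str], collections: Sequence[Sequence[str]]
) -> float:
    """Assumed rating: fraction of comparison strings found in strictly more collections."""
    if not other_terms:
        raise ValueError("at least one comparison string is required")
    cf = collection_frequency(term, collections)
    higher = sum(1 for t in other_terms if collection_frequency(t, collections) > cf)
    return higher / len(other_terms)


def protect_if_unique(text: str, term: str, rating: float, threshold: float = 0.5) -> str:
    """Hypothetical automated action: mask the string when its rating meets the threshold."""
    if rating >= threshold:
        return text.replace(term, "*" * len(term))
    return text


# Toy data: four collections of short documents (invented for this example).
collections = [
    ["Invoice for ACME-7731 is due Friday.", "Please review the report."],
    ["Please send the report by Monday."],
    ["The report is attached.", "Please confirm receipt."],
    ["Quarterly report draft."],
]

rating = uniqueness_rating("ACME-7731", ["report", "Please"], collections)
masked = protect_if_unique(collections[0][0], "ACME-7731", rating)
```

In this toy data, "ACME-7731" appears in one of four collections while "report" appears in all four, so the identifier receives the higher rating and is masked. A real system would need to decide how strings are extracted from documents and how ratings map to actions; the abstract leaves those choices open.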
Public/Granted literature:
- US20210374336A1 INFORMATION UNIQUENESS ASSESSMENT USING STRING-BASED COLLECTION FREQUENCY, Public/Granted day: 2021-12-02