Invention Grant
- Patent Title: Automatically labeling data using natural language processing
- Application No.: US17666001
- Application Date: 2022-02-07
- Publication No.: US11544795B2
- Publication Date: 2023-01-03
- Inventor: John Wang, Hari Nathan
- Applicant: Futurity Group, Inc.
- Applicant Address: US IL Chicago
- Assignee: Futurity Group, Inc.
- Current Assignee: Futurity Group, Inc.
- Current Assignee Address: US IL Chicago
- Agency: Gardella Grace P.A.
- Main IPC: G06F17/00
- IPC: G06F17/00; G06Q40/08; G06F40/205; G06F40/40; G06F16/35

Abstract:
In an illustrative embodiment, methods and systems for automatically labeling unstructured data include accessing unstructured data representing data entry and analyzing the unstructured data by applying natural language processing to a text component of the unstructured data to obtain a set of term counts of words and/or phrases identified in the text component. Analyzing may include applying at least one clustering algorithm to the set of term counts to determine a term cluster, identifying a preexisting term cluster most closely matching the term cluster, and applying, to the unstructured data, a predefined label corresponding to the preexisting term cluster. The unstructured data may be analyzed to obtain formatting counts of formatting elements, and a formatting cluster may be determined and applied to match to a preexisting formatting cluster, thus deriving a predefined label corresponding to the preexisting formatting cluster.
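The term-count matching described in the abstract can be illustrated with a minimal sketch. It is not the patented method: the clustering step is reduced here to treating the text's term-count vector as a single cluster, and "most closely matching" is assumed to mean highest cosine similarity. The names `PREEXISTING_CLUSTERS`, `term_counts`, `cosine`, and `label_for`, along with the labels and term weights, are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical preexisting term clusters: predefined label -> term-count centroid.
# Labels and weights are invented for this sketch.
PREEXISTING_CLUSTERS = {
    "auto_claim": Counter({"vehicle": 3, "collision": 2, "driver": 2}),
    "property_claim": Counter({"roof": 3, "water": 2, "damage": 2}),
}


def term_counts(text):
    """Count lowercase alphabetic tokens in the text component."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def label_for(text, clusters=PREEXISTING_CLUSTERS):
    """Return the predefined label of the closest preexisting cluster.

    Simplification: the text's own term counts stand in for the
    "term cluster" that a real clustering algorithm would produce.
    """
    counts = term_counts(text)
    return max(clusters, key=lambda label: cosine(counts, clusters[label]))


note = "Driver reported a collision; the vehicle was towed."
print(label_for(note))  # auto_claim
```

The same shape would apply to the formatting branch of the abstract by swapping `term_counts` for a function that counts formatting elements; that function is not shown because the abstract does not say which elements are counted.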
Public/Granted literature
- US20220253942A1: Automatically Labeling Data using Natural Language Processing. Public/Granted day: 2022-08-11