Invention Grant
- Patent Title: Method and system of creating and summarizing unstructured natural language sentence clusters for efficient tagging
- Application No.: US16798277
- Application Date: 2020-02-21
- Publication No.: US11604926B2
- Publication Date: 2023-03-14
- Inventor: Ramaswamy Venkateshwaran, Sridevi Ramaswamy, Priya Rani, Huanchen Li, Ke Chen
- Applicant: Ramaswamy Venkateshwaran, Sridevi Ramaswamy, Priya Rani, Huanchen Li, Ke Chen
- Applicant Address: US CA Dublin; US CA Dublin; US CA Fremont; US CA Hayward; US CA Fremont
- Assignee: Ramaswamy Venkateshwaran, Sridevi Ramaswamy, Priya Rani, Huanchen Li, Ke Chen
- Current Assignee: Ramaswamy Venkateshwaran, Sridevi Ramaswamy, Priya Rani, Huanchen Li, Ke Chen
- Current Assignee Address: US CA Dublin; US CA Dublin; US CA Fremont; US CA Hayward; US CA Fremont
- Main IPC: G06F40/30
- IPC: G06F40/30; G06F16/35; G06F40/117; G06F16/33; G06V10/40; G06V30/10

Abstract:
A computerized method for reducing domain noise and for creating and summarizing clusters of human-written sentences for efficient tagging in natural language processing, the method comprising: receiving a typed, handwritten, or printed text; implementing an optical character recognition (OCR) process on the human-written text to generate a digital version of the human-written text; splitting the digital version of the text into an array of sentences, using a sentence splitter, to generate a split-sentence version; determining a domain of the human-written text; based on the domain, implementing a domain noise reduction process on the split-sentence version; hierarchically clustering the split-sentence version after the domain noise reduction process; and summarizing the clustered sentences and reducing the amount of data to be tagged.
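
The steps listed in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not specify the sentence splitter, the noise list, the similarity measure, the linkage rule, or the summarizer, so everything below is an assumption chosen for brevity. The OCR and domain-determination steps are skipped (the text is assumed to be already digital and the domain is passed in by the caller). The names `DOMAIN_NOISE`, `split_sentences`, `reduce_noise`, `cluster`, `summarize`, and `cluster_and_summarize` are hypothetical, as are the Jaccard similarity, the average-linkage merge rule, and the 0.5 threshold.

```python
import re
from itertools import combinations

# Hypothetical per-domain noise words; the abstract does not enumerate any.
DOMAIN_NOISE = {"support": {"hello", "hi", "please", "thanks", "regards"}}


def split_sentences(text):
    """Naive splitter (assumption): break after '.', '!' or '?' plus whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def reduce_noise(sentence, domain):
    """Lowercase, tokenize, and drop words on the domain's noise list."""
    noise = DOMAIN_NOISE.get(domain, set())
    tokens = re.findall(r"[a-z0-9']+", sentence.lower())
    return frozenset(t for t in tokens if t not in noise)


def jaccard(a, b):
    """Jaccard similarity of two token sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster(token_sets, threshold=0.5):
    """Agglomerative clustering with average linkage (assumed linkage rule).

    Repeatedly merges the most similar pair of clusters and stops when the
    best pair falls below `threshold`. Returns lists of sentence indices.
    """
    clusters = [[i] for i in range(len(token_sets))]

    def similarity(c1, c2):
        total = sum(jaccard(token_sets[i], token_sets[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        pairs = combinations(range(len(clusters)), 2)
        a, b = max(pairs, key=lambda p: similarity(clusters[p[0]], clusters[p[1]]))
        if similarity(clusters[a], clusters[b]) < threshold:
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # safe: combinations guarantees a < b
    return clusters


def summarize(members, token_sets, sentences):
    """Pick the member most similar to the rest as the cluster's representative."""
    best = max(
        members,
        key=lambda i: sum(jaccard(token_sets[i], token_sets[j]) for j in members),
    )
    return sentences[best]


def cluster_and_summarize(text, domain, threshold=0.5):
    """Return one (representative sentence, cluster size) pair per cluster."""
    sentences = split_sentences(text)
    token_sets = [reduce_noise(s, domain) for s in sentences]
    groups = cluster(token_sets, threshold)
    return [(summarize(g, token_sets, sentences), len(g)) for g in groups]


if __name__ == "__main__":
    sample = (
        "Hello, the printer is not working. The printer is not working please. "
        "Thanks, my invoice total is wrong. My invoice total is wrong."
    )
    for representative, size in cluster_and_summarize(sample, "support"):
        print(size, representative)
```

In this sketch a tagger would label one representative sentence per cluster instead of every sentence, which is one way to read "reducing the amount of data to be tagged". Removing the noise words before clustering matters here because greetings and sign-offs would otherwise lower the similarity between sentences that report the same issue.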