Invention Grant
- Patent Title: System and method for zero-shot learning with deep image neural network and natural language processing (NLP) for optical character recognition (OCR)
- Application No.: US17689124
- Application Date: 2022-03-08
- Publication No.: US12131563B2
- Publication Date: 2024-10-29
- Inventor: Tianhao Wu
- Applicant: Singularity Systems Inc.
- Applicant Address: US NJ Princeton
- Assignee: Singularity Systems Inc.
- Current Assignee: Singularity Systems Inc.
- Current Assignee Address: US NJ Princeton
- Agency: Zhong Law, LLC
- Main IPC: G06V30/19
- IPC: G06V30/19 ; G06V10/82

Abstract:
A system and method for constructing a training dataset and training a neural network include obtaining a searchable portable document format (PDF) document, identifying a bounding box defining a region in a background image that is associated with an overlaying text object defined in the PDF document, determining an image crop of the PDF document according to the bounding box, and generating a training data sample for the training dataset, the training data sample comprising a data pair of the image crop and the associated text object.
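The data-pair construction described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. It assumes the page's background image has already been rasterized into a plain 2D list of pixel values and that the overlaying text objects have already been extracted with their bounding boxes in PDF points (origin at the bottom-left of the page). The names `TextObject`, `to_pixel_box`, `crop`, and `build_samples`, and the `scale` parameter (pixels per point), are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical stand-in for a rasterized page: rows of grayscale pixel values.
Image = List[List[int]]


@dataclass
class TextObject:
    """A text object overlaying the page (assumed already extracted).

    bbox is (x0, y0, x1, y1) in PDF points, origin at the bottom-left.
    """
    text: str
    bbox: Tuple[float, float, float, float]


def to_pixel_box(bbox, page_height_pt: float, scale: float):
    """Map a PDF-space box to (left, top, right, bottom) pixel indices.

    The y axis is flipped because image rows count down from the top.
    """
    x0, y0, x1, y1 = bbox
    left = round(x0 * scale)
    right = round(x1 * scale)
    top = round((page_height_pt - y1) * scale)
    bottom = round((page_height_pt - y0) * scale)
    return left, top, right, bottom


def crop(image: Image, box) -> Image:
    """Return the sub-image covered by a (left, top, right, bottom) box."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def build_samples(image: Image, text_objects, page_height_pt: float, scale: float):
    """Pair each image crop with the text object that overlays it."""
    samples = []
    for obj in text_objects:
        patch = crop(image, to_pixel_box(obj.bbox, page_height_pt, scale))
        if patch and patch[0]:  # skip empty (degenerate) crops
            samples.append((patch, obj.text))
    return samples


# Toy page: 4 rows x 6 columns, pixel value encodes (row, column).
image = [[r * 10 + c for c in range(6)] for r in range(4)]
objects = [TextObject("Hi", (1, 1, 3, 3)), TextObject("", (2, 2, 2, 2))]
samples = build_samples(image, objects, page_height_pt=4, scale=1.0)
```

Each element of `samples` is an (image crop, text) pair of the kind the abstract calls a training data sample. In practice the raster and the text objects would come from a PDF library rather than hand-built lists, and the zero-area box in the toy input is included only to show that degenerate crops are skipped.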