System and method for zero-shot learning with deep image neural network and natural language processing (NLP) for optical character recognition (OCR)

Invention Grant

US12131563B2 System and method for zero-shot learning with deep image neural network and natural language processing (NLP) for optical character recognition (OCR) (Status: In force)

Patent Title: System and method for zero-shot learning with deep image neural network and natural language processing (NLP) for optical character recognition (OCR)
Application No.: US17689124

Application Date: 2022-03-08
Publication No.: US12131563B2

Publication Date: 2024-10-29
Inventor: Tianhao Wu
Applicant: Singularity Systems Inc.
Applicant Address: US NJ Princeton
Assignee: Singularity Systems Inc.
Current Assignee: Singularity Systems Inc.
Current Assignee Address: US NJ Princeton
Agency: Zhong Law, LLC
Main IPC: G06V30/19
IPC: G06V30/19; G06V10/82

Abstract:

A system and method for constructing a training dataset and training a neural network include obtaining a searchable portable document format (PDF) document, identifying a bounding box defining a region in a background image that is associated with an overlaying text object defined in the PDF document, determining an image crop of the PDF document according to the bounding box, and generating a training data sample for the training dataset, the training data sample comprising a data pair of the image crop and the associated text object.
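The pairing step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: it assumes the searchable PDF has already been parsed into a rasterized background image and a list of overlaying text objects with pixel-space bounding boxes. The names `TextObject`, `TrainingSample`, `crop_image`, and `build_training_samples` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumption: the page background is a grayscale raster stored row-major,
# so a pixel is addressed as image[y][x].
Image = List[List[int]]


@dataclass(frozen=True)
class TextObject:
    """Hypothetical stand-in for a text object extracted from a searchable PDF."""
    text: str
    bbox: Tuple[int, int, int, int]  # (x0, y0, x1, y1) in pixels, x1/y1 exclusive


@dataclass(frozen=True)
class TrainingSample:
    """One data pair: an image crop and the text associated with it."""
    crop: Image
    text: str


def crop_image(image: Image, bbox: Tuple[int, int, int, int]) -> Image:
    """Return the region of `image` inside `bbox`, clamped to the page bounds."""
    height = len(image)
    width = len(image[0]) if height else 0
    x0, y0, x1, y1 = bbox
    x0, x1 = max(0, x0), min(width, x1)
    y0, y1 = max(0, y0), min(height, y1)
    return [row[x0:x1] for row in image[y0:y1]]


def build_training_samples(
    background: Image, text_objects: List[TextObject]
) -> List[TrainingSample]:
    """Pair each text object with the image crop under its bounding box."""
    samples = []
    for obj in text_objects:
        crop = crop_image(background, obj.bbox)
        # Skip boxes that fall outside the page and objects with no visible text.
        if crop and crop[0] and obj.text.strip():
            samples.append(TrainingSample(crop=crop, text=obj.text))
    return samples


# Usage: a tiny 6x4 synthetic page with one valid text object.
background = [[x + 10 * y for x in range(6)] for y in range(4)]
objects = [
    TextObject("Hi", (1, 1, 4, 3)),
    TextObject("", (0, 0, 2, 2)),           # empty text, skipped
    TextObject("off", (10, 10, 12, 12)),    # outside the page, skipped
]
samples = build_training_samples(background, objects)
```

In practice the background image and the text objects would come from a PDF parsing library; that step is left out here so the sketch stays self-contained.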

Public/Granted literature

US20220284721A1 SYSTEM AND METHOD FOR ZERO-SHOT LEARNING WITH DEEP IMAGE NEURAL NETWORK AND NATURAL LANGUAGE PROCESSING (NLP) FOR OPTICAL CHARACTER RECOGNITION (OCR)
Public/Granted day: 2022-09-08

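IPC Classification:

G	Physics
G06	Computing; calculating or counting
G06V	Image or video recognition or understanding
G06V30/00	Character recognition; recognising digital ink; document-oriented image-based pattern recognition (scanning, transmission or reproduction of documents or the like H04N1/00)
G06V30/10	.Character recognition
G06V30/19	..Recognition using electronic means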