Invention Grant
- Patent Title: Multi-modal learning based intelligent enhancement of post optical character recognition error correction
- Application No.: US17245349
- Application Date: 2021-04-30
- Publication No.: US11842524B2
- Publication Date: 2023-12-12
- Inventor: Rajesh M. Desai, Ayush Utkarsh, Nazrul Islam, Praveen Vyas
- Applicant: International Business Machines Corporation
- Applicant Address: US NY Armonk
- Assignee: International Business Machines Corporation
- Current Assignee: International Business Machines Corporation
- Current Assignee Address: US NY Armonk
- Agent: Stephen J. Walder, Jr.; Matt Zebrec
- Main IPC: G06K9/03
- IPC: G06K9/03 ; G06V10/40 ; G06F40/126 ; G06F40/109 ; G06N20/00 ; G06F40/232 ; G06V30/10 ; G06F18/214 ; G06V30/19 ; G06V30/12 ; G06V10/82 ; G06V30/26 ; G06F18/213 ; G06N3/0464 ; G06N3/0442 ; G06N3/0455 ; G06V10/98 ; G06F40/279 ; G06V30/242

Abstract:
A mechanism is provided for implementing an optical character recognition (OCR) error correction mechanism for correcting OCR errors. Responsive to receiving a document on which OCR has been performed, the mechanism assesses the document to identify a set of OCR errors generated by an OCR engine that performed the OCR using a set of visual embeddings. Responsive to identifying the set of OCR errors, the mechanism analyzes each character of a plurality of sentences within the document to generate a high-dimensional embedding for the characters of the plurality of sentences within the document. The mechanism then linguistically corrects each OCR error in the set of OCR errors. The mechanism utilizes ground truth information and the set of visual embeddings to verify that a character stream is linguistically correct. Responsive to verifying that the character stream is linguistically correct, the mechanism outputs an OCR error corrected document to a user.