Invention Grant
- Patent Title: System and method for extracting structured information from image documents
- Application No.: US16173760
- Application Date: 2018-10-29
- Publication No.: US10853638B2
- Publication Date: 2020-12-01
- Inventor: Abhisek Mukhopadhyay, Shubhashis Sengupta
- Applicant: Accenture Global Solutions Limited
- Applicant Address: IE Dublin
- Assignee: Accenture Global Solutions Limited
- Current Assignee: Accenture Global Solutions Limited
- Current Assignee Address: IE Dublin
- Agency: Plumsea Law Group, LLC
- Priority: IN201841032793 20180831
- Main IPC: G06K9/00
- IPC: G06K9/00; G06Q50/18; G06Q40/08; G06F40/279

Abstract:
A system and method for extracting structured information from image documents is disclosed. An input image document is obtained, and the input image document may be analyzed to determine a skeletal layout of information included in the input image document. A measure of similarity between the determined skeletal layout and each of the document templates may be determined. A document template may be selected as a matched template, based on the determined measure of similarity. Box areas from the input image document may be cropped out, and optical character recognition (OCR) may be performed on the box areas. Obtained recognized text may be automatically processed using directed search to correct errors made by the OCR. Statistical language modeling may be used to classify the input image document into a classification category, and the classified input image document may be processed according to the classification category.
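The template-matching step in the abstract can be illustrated with a minimal sketch. The abstract does not say how the skeletal layout is represented or which similarity measure is used, so everything here is an assumption made for illustration: layouts are modeled as lists of normalized bounding boxes, similarity is the mean best intersection-over-union (IoU) per template box, and the names `iou`, `layout_similarity`, `select_template`, and the `min_score` threshold are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Assumption: a "skeletal layout" is a list of boxes (x0, y0, x1, y1),
# with coordinates normalized to [0, 1]. The patent abstract does not
# specify this representation.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def layout_similarity(layout: List[Box], template: List[Box]) -> float:
    """Mean, over template boxes, of the best IoU with any input box.

    This is an illustrative stand-in for the abstract's unspecified
    "measure of similarity".
    """
    if not layout or not template:
        return 0.0
    return sum(max(iou(t, b) for b in layout) for t in template) / len(template)


def select_template(
    layout: List[Box],
    templates: Dict[str, List[Box]],
    min_score: float = 0.5,  # hypothetical acceptance threshold
) -> Optional[str]:
    """Return the name of the best-scoring template, or None if no
    template reaches min_score."""
    best_name, best_score = None, min_score
    for name, boxes in templates.items():
        score = layout_similarity(layout, boxes)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# Hypothetical templates and input layout, for illustration only.
templates = {
    "claim_form": [(0.1, 0.1, 0.5, 0.2), (0.1, 0.3, 0.9, 0.4)],
    "invoice": [(0.6, 0.6, 0.9, 0.9)],
}
layout = [(0.1, 0.1, 0.5, 0.2), (0.1, 0.3, 0.9, 0.4)]
matched = select_template(layout, templates)
```

In this sketch the boxes of the matched template would then mark the areas to crop and pass to OCR. The later steps (directed-search error correction and statistical language-model classification) are not modeled here.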
Public/Granted literature
- US20200074169A1, System And Method For Extracting Structured Information From Image Documents, Public/Granted day: 2020-03-05