Invention Grant
- Patent Title: Method and system for data extraction from images of semi-structured documents
- Application No.: US14868683
- Application Date: 2015-09-29
- Publication No.: US09754176B2
- Publication Date: 2017-09-05
- Inventor: Mikhail Kostyukov
- Applicant: ABBYY Development LLC
- Applicant Address: RU Moscow
- Assignee: ABBYY PRODUCTION LLC
- Current Assignee: ABBYY PRODUCTION LLC
- Current Assignee Address: RU Moscow
- Agency: Lowenstein Sandler LLP
- Priority: RU2015137956 20150907
- Main IPC: G06K9/18
- IPC: G06K9/18 ; G06K9/46

Abstract:
The present invention is directed to a method of extracting data from fields in an image of a document. In one implementation, a text representation of the image of the document is obtained. A graph is constructed to store the features of the text fragments in the text representation and the links between those fragments. A cascade classification is run to compute the features of the text fragments and of their links. Hypotheses that text fragments belong to fields in the image of the document are generated, and combinations of those hypotheses are formed. One combination is selected, and data is extracted from the fields in the image of the document based on the selected combination.
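
The steps named in the abstract can be sketched as a small Python pipeline. This is an illustrative toy under stated assumptions, not the patented implementation. The `Fragment` class, the two cascade stages, the regex features, the field names `date` and `total`, and the additive scoring are all hypothetical choices made for the example. The abstract does not specify any of them.

```python
import re
from dataclasses import dataclass, field
from itertools import product


@dataclass
class Fragment:
    """One text fragment from the recognized document (hypothetical structure)."""
    text: str
    features: dict = field(default_factory=dict)


def build_graph(fragments):
    """Link each fragment to its right-hand neighbour; each link holds its own features."""
    return {(i, i + 1): {} for i in range(len(fragments) - 1)}


def stage_shape(fragments, links):
    """Cascade stage 1 (assumed): cheap per-fragment shape features."""
    for f in fragments:
        f.features["looks_like_date"] = bool(re.fullmatch(r"\d{4}-\d{2}-\d{2}", f.text))
        f.features["looks_like_amount"] = bool(re.fullmatch(r"\d+\.\d{2}", f.text))


def stage_context(fragments, links):
    """Cascade stage 2 (assumed): link features that reuse stage 1 results."""
    for (i, j), feats in links.items():
        right = fragments[j].features
        feats["label_then_value"] = fragments[i].text.endswith(":") and (
            right["looks_like_date"] or right["looks_like_amount"]
        )


def generate_hypotheses(fragments, links):
    """Return (field_name, fragment_index, score) tuples."""
    hyps = []
    for i, f in enumerate(fragments):
        # A value that directly follows a label-like fragment scores higher.
        bonus = 0.5 if i > 0 and links[(i - 1, i)]["label_then_value"] else 0.0
        if f.features["looks_like_date"]:
            hyps.append(("date", i, 0.5 + bonus))
        if f.features["looks_like_amount"]:
            hyps.append(("total", i, 0.5 + bonus))
    return hyps


def select_combination(hyps):
    """Try one hypothesis per field and keep the highest-scoring consistent set."""
    by_field = {}
    for h in hyps:
        by_field.setdefault(h[0], []).append(h)
    best, best_score = (), float("-inf")
    for combo in product(*by_field.values()):
        indices = [h[1] for h in combo]
        if len(set(indices)) != len(indices):
            continue  # a fragment may fill at most one field
        score = sum(h[2] for h in combo)
        if score > best_score:
            best, best_score = combo, score
    return best


def extract_fields(texts):
    """Run the whole sketch: graph, cascade, hypotheses, selection, extraction."""
    fragments = [Fragment(t) for t in texts]
    links = build_graph(fragments)
    for stage in (stage_shape, stage_context):
        stage(fragments, links)
    combo = select_combination(generate_hypotheses(fragments, links))
    return {name: fragments[i].text for name, i, _ in combo}
```

Calling `extract_fields(["Date:", "2015-09-29", "Total:", "42.00", "17.50"])` in this sketch prefers the amount that follows the `Total:` label over the unlabeled one. Exhaustive enumeration with `itertools.product` is used only because the example is tiny. A real system would need a more scalable search over hypothesis combinations.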
Public/Granted literature
- US20170068866A1 METHOD AND SYSTEM FOR DATA EXTRACTION FROM IMAGES OF SEMI-STRUCTURED DOCUMENTS; Public/Granted day: 2017-03-09