Invention Grant
- Patent Title: System to extract information from documents
- Application No.: US16794266
- Application Date: 2020-02-19
- Publication No.: US11379690B2
- Publication Date: 2022-07-05
- Inventor: Vikas Kumar
- Applicant: Infrrd Inc
- Applicant Address: US CA San Jose
- Assignee: Infrrd Inc
- Current Assignee: Infrrd Inc
- Current Assignee Address: US CA San Jose
- Main IPC: G06K9/00
- IPC: G06K9/00 ; G06K9/62 ; G06N20/00 ; G06N5/04 ; G06F40/30 ; G06F40/284 ; G06F40/117 ; G06V10/56 ; G06V30/10

Abstract:
A method of training a system to extract information from documents comprises feeding a digital form of training documents to an OCR module, which identifies multiple logical blocks in the documents and the text present in the logical blocks. One or more tags for the whole of the document, the logical blocks and word tokens on the document are received by a tagging module. A text input comprising the text identified in the document and the tags for the whole of the document are received by a machine learning module. A first image of the document with the layout of one or more of the identified blocks superimposed, and the tags of the logical blocks in the document are received by the machine learning module, wherein the received text input, first image and tags for the logical blocks correspond to a plurality of the training documents.
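The data flow in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `LogicalBlock`, `TrainingDocument`, `layout_image` and `build_training_example` are hypothetical, the OCR and tagging results are hard-coded, word-token tags are omitted, and the "first image" is simplified to block outlines drawn on a blank grid rather than superimposed on a real page image.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class LogicalBlock:
    """One block as an OCR step might report it (hypothetical structure)."""
    box: Tuple[int, int, int, int]  # (left, top, right, bottom); right/bottom exclusive
    text: str
    tag: str = ""  # block-level tag supplied by the tagging step


@dataclass
class TrainingDocument:
    """A training document after OCR and tagging (hypothetical structure)."""
    width: int
    height: int
    blocks: List[LogicalBlock]
    document_tags: List[str] = field(default_factory=list)


def layout_image(doc: TrainingDocument) -> List[List[int]]:
    """Draw each block outline on a blank grid (1 = outline, 0 = background).

    Assumption: a real system would superimpose the layout on the page image;
    this sketch keeps only the layout itself.
    """
    grid = [[0] * doc.width for _ in range(doc.height)]
    for block in doc.blocks:
        left, top, right, bottom = block.box
        for x in range(left, right):
            grid[top][x] = 1
            grid[bottom - 1][x] = 1
        for y in range(top, bottom):
            grid[y][left] = 1
            grid[y][right - 1] = 1
    return grid


def build_training_example(doc: TrainingDocument) -> Dict[str, object]:
    """Bundle the two input groups named in the abstract for one document."""
    return {
        # Group 1: text identified in the document plus whole-document tags.
        "text_input": "\n".join(block.text for block in doc.blocks),
        "document_tags": list(doc.document_tags),
        # Group 2: layout image plus the tags of the logical blocks.
        "first_image": layout_image(doc),
        "block_tags": [block.tag for block in doc.blocks],
    }


# Made-up sample document, for illustration only.
doc = TrainingDocument(
    width=8,
    height=6,
    blocks=[
        LogicalBlock(box=(0, 0, 8, 2), text="ACME Corp", tag="header"),
        LogicalBlock(box=(1, 3, 5, 6), text="Total: 42.00", tag="total"),
    ],
    document_tags=["invoice"],
)
example = build_training_example(doc)
```

Under these assumptions, a training set would be a list of such examples, one per training document, matching the abstract's requirement that the inputs correspond to a plurality of training documents.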
Public/Granted literature
- US20200184267A1 SYSTEM TO EXTRACT INFORMATION FROM DOCUMENTS Public/Granted day: 2020-06-11