Extracting data from electronic documents
Abstract:
A structured data processing system includes hardware processors and a memory in communication with the hardware processors. The memory stores a data structure and an execution environment. The data structure includes an electronic document. The execution environment includes a data extraction solver configured to perform operations including identifying a particular page of the electronic document; performing an optical character recognition (OCR) on the page to determine a plurality of alphanumeric text strings on the page; determining a type of the page; determining a layout of the page; determining at least one table on the page based at least in part on the determined type of the page and the determined layout of the page; and extracting a plurality of data from the determined table on the page. The execution environment also includes a user interface module that generates a user interface that renders graphical representations of the extracted data; and a transmission module that transmits data that represents the graphical representations.
Public/Granted literature
Information query
Patent Agency Ranking
0/0