System and method for extracting tabular data from electronic document
Abstract:
Disclosed is a system for extracting tabular data from an electronic document, the system having a data processing arrangement comprising: a tabular data detection module that is operable to: (i) receive the electronic document; (ii) determine a location of the tabular data within the electronic document; and (iii) extract an image of the tabular data from the electronic document; and a tabular data extraction module that receives the extracted image of the tabular data from the tabular data detection module, wherein the tabular data extraction module is operable to: (i) convert the received image of the tabular data into a greyscale image; (ii) extract a grid structure from the greyscale image; (iii) remove the grid structure from the greyscale image; (iv) determine a position for placement of horizontal and vertical lines in the greyscale image; (v) generate the horizontal and vertical lines on the greyscale image; (vi) perform optical character recognition of text associated with the tabular data from the received image; and (vii) extract the tabular data by combining information of the grid structure with the text, to generate the tabular data.
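Steps (i) to (iii) of the tabular data extraction module can be illustrated with a minimal Python sketch. The abstract does not state how the grid structure is detected, so the approach here is an assumption: a row or column is treated as a grid line when nearly all of its pixels are dark. The function names (`to_greyscale`, `find_grid_lines`, `remove_grid_lines`), the darkness threshold, and the coverage fraction are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (red, green, blue), each 0..255
Grey = List[List[int]]         # rows of greyscale values, 0 = black, 255 = white


def to_greyscale(image: List[List[Pixel]]) -> Grey:
    """Step (i): convert an RGB image to greyscale using BT.601 luma weights."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in image
    ]


def find_grid_lines(
    grey: Grey, dark: int = 128, min_fraction: float = 0.9
) -> Tuple[List[int], List[int]]:
    """Step (ii), assumed heuristic: a row or column is a grid line when at
    least `min_fraction` of its pixels are darker than `dark`."""
    height, width = len(grey), len(grey[0])
    rows = [
        y for y in range(height)
        if sum(v < dark for v in grey[y]) >= min_fraction * width
    ]
    cols = [
        x for x in range(width)
        if sum(grey[y][x] < dark for y in range(height)) >= min_fraction * height
    ]
    return rows, cols


def remove_grid_lines(
    grey: Grey, rows: List[int], cols: List[int], background: int = 255
) -> Grey:
    """Step (iii): return a copy with the detected rows and columns painted
    in the background colour. Text pixels that lie on a line are erased too."""
    cleaned = [row[:] for row in grey]
    for y in rows:
        cleaned[y] = [background] * len(cleaned[y])
    for x in cols:
        for row in cleaned:
            row[x] = background
    return cleaned
```

The detected row and column indices stand in for the "information of grid structure" that step (vii) combines with the recognized text. A practical implementation would more likely use morphological operations from an image-processing library, and would need to tolerate skewed or broken lines, which this sketch does not.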