Automatic delineation and extraction of tabular data using machine learning

    Publication No.: GB2605052A

    Publication date: 2022-09-21

    Application No.: GB202207244

    Filing date: 2020-10-20

    Applicant: IBM

    Abstract: A computer-implemented method for using a machine learning model (122) to automatically extract tabular data from an image includes receiving a set of images of tabular data and a set of markup data corresponding respectively to the images of tabular data. The method further includes training a first neural network to delineate the tabular data into cells (440) using the markup data, and training a second neural network to determine the content of the cells (440) in the tabular data using the markup data. The method further includes, upon receiving an input image (112) containing first tabular data without any markup data, generating an electronic output corresponding to the first tabular data by determining the structure of the first tabular data using the first neural network and extracting the content of the first tabular data using the second neural network.
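
The inference step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the two trained neural networks are represented by plain callables, and every name here (`Cell`, `extract_table`, `stub_structure`, `stub_content`) is hypothetical. Training on the markup data is not shown. The stubs only stand in for the networks so that the control flow can be run.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Assumption: an image is a grid of grayscale pixel values.
Image = List[List[int]]


@dataclass(frozen=True)
class Cell:
    """One delineated table cell (hypothetical representation)."""
    row: int
    col: int
    box: Tuple[int, int, int, int]  # (top, left, bottom, right), end-exclusive


# Stand-ins for the two trained networks named in the abstract.
StructureModel = Callable[[Image], List[Cell]]   # first network: delineate cells
ContentModel = Callable[[Image, Cell], str]      # second network: read cell content


def extract_table(image: Image,
                  structure_model: StructureModel,
                  content_model: ContentModel) -> List[List[str]]:
    """Determine the table structure first, then fill each cell with its content."""
    cells = structure_model(image)
    if not cells:
        return []
    n_rows = max(c.row for c in cells) + 1
    n_cols = max(c.col for c in cells) + 1
    table = [["" for _ in range(n_cols)] for _ in range(n_rows)]
    for cell in cells:
        table[cell.row][cell.col] = content_model(image, cell)
    return table


def stub_structure(image: Image) -> List[Cell]:
    """Placeholder for the first network: split the image into a 2x2 grid."""
    h, w = len(image), len(image[0])
    mh, mw = h // 2, w // 2
    return [
        Cell(0, 0, (0, 0, mh, mw)), Cell(0, 1, (0, mw, mh, w)),
        Cell(1, 0, (mh, 0, h, mw)), Cell(1, 1, (mh, mw, h, w)),
    ]


def stub_content(image: Image, cell: Cell) -> str:
    """Placeholder for the second network: report the pixel sum of the cell."""
    top, left, bottom, right = cell.box
    total = sum(image[r][c] for r in range(top, bottom) for c in range(left, right))
    return str(total)


if __name__ == "__main__":
    print(extract_table([[1, 2], [3, 4]], stub_structure, stub_content))
```

Keeping the structure model and the content model behind separate interfaces mirrors the two-network split in the abstract: either one could be retrained or replaced without changing how the output table is assembled.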
