Invention Grant
- Patent Title: Table recognition in portable document format documents
- Application No.: US16050803
- Application Date: 2018-07-31
- Publication No.: US11200413B2
- Publication Date: 2021-12-14
- Inventor: Douglas Ronald Burdick, Wei Cheng, Alexandre Evfimievski, Marina Danilevsky Hailpern, Rajasekar Krishnamurthy, Shajith Ikbal Mohamed, Prithviraj Sen, Shivakumar Vaithyanathan
- Applicant: International Business Machines Corporation
- Applicant Address: US NY Armonk
- Assignee: International Business Machines Corporation
- Current Assignee: International Business Machines Corporation
- Current Assignee Address: US NY Armonk
- Agency: Ryan, Mason & Lewis, LLP
- Main IPC: G06K9/00
- IPC: G06K9/00; G06F40/177; G06F40/284

Abstract:
Methods, systems, and computer program products for table recognition in PDF documents are provided herein. A computer-implemented method includes discretizing one or more contiguous areas of a PDF document; identifying one or more white-space separator lines within the one or more discretized contiguous areas of the PDF document; detecting one or more candidate table regions within the one or more discretized contiguous areas of the PDF document by clustering the one or more white-space separator lines into one or more grids; and outputting at least one of the candidate table regions as a finalized table in accordance with scores assigned to each of the one or more candidate table regions based on (i) border information and (ii) cell structure information.
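
The steps named in the abstract can be illustrated with a toy sketch. This is not the patented method: it assumes text fragments are already available as bounding boxes, treats only fully blank rows and columns as white-space separator lines, treats all separators in one area as a single grid instead of clustering them, and scores a candidate only by the fraction of non-empty cells (border information is ignored). All function names and the `cell` parameter are hypothetical.

```python
import math
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1) of one text fragment


def discretize(boxes: List[Box], width: float, height: float,
               cell: float = 1.0) -> List[List[bool]]:
    """Rasterize text boxes onto a boolean occupancy grid (True = text)."""
    n_cols = math.ceil(width / cell)
    n_rows = math.ceil(height / cell)
    grid = [[False] * n_cols for _ in range(n_rows)]
    for x0, y0, x1, y1 in boxes:
        for r in range(max(0, math.floor(y0 / cell)),
                       min(n_rows, math.ceil(y1 / cell))):
            for c in range(max(0, math.floor(x0 / cell)),
                           min(n_cols, math.ceil(x1 / cell))):
                grid[r][c] = True
    return grid


def true_runs(flags: List[bool]) -> List[Tuple[int, int]]:
    """Return (start, end) pairs, end exclusive, for maximal runs of True."""
    runs, start = [], None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(flags)))
    return runs


def separators(grid: List[List[bool]]):
    """Simplification: a separator is a run of fully blank rows or columns."""
    row_blank = [not any(row) for row in grid]
    col_blank = [not any(row[c] for row in grid) for c in range(len(grid[0]))]
    return true_runs(row_blank), true_runs(col_blank)


def score_candidate(grid: List[List[bool]]) -> float:
    """Hypothetical score: share of grid cells that contain any text."""
    row_bands = true_runs([any(row) for row in grid])
    col_bands = true_runs([any(row[c] for row in grid)
                           for c in range(len(grid[0]))])
    if len(row_bands) < 2 or len(col_bands) < 2:
        return 0.0  # fewer than two rows or columns: not table-like
    filled = sum(
        any(grid[r][c] for r in range(r0, r1) for c in range(c0, c1))
        for r0, r1 in row_bands
        for c0, c1 in col_bands
    )
    return filled / (len(row_bands) * len(col_bands))
```

For example, four boxes laid out as a 2x2 block, `[(0, 0, 3, 1), (5, 0, 8, 1), (0, 2, 3, 3), (5, 2, 8, 3)]` on an 8 by 3 area, give one blank row and one blank column band as separators, and every resulting cell contains text.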
Public/Granted literature
- US20200042785A1 Table Recognition in Portable Document Format Documents Public/Granted day: 2020-02-06