Invention Grant
- Patent Title: Extracting information from tables embedded within documents
- Application No.: US15594762
- Application Date: 2017-05-15
- Publication No.: US10706218B2
- Publication Date: 2020-07-07
- Inventor: David Richard Milward, Himanshu Agrawal, James Robert Walton Cormack, Francisco Nuno Quintiliano Mendonca Carapeto Costa
- Applicant: Linguamatics Ltd.
- Applicant Address: GB Cambridge
- Assignee: Linguamatics Ltd.
- Current Assignee: Linguamatics Ltd.
- Current Assignee Address: GB Cambridge
- Agency: Maldjian Law Group LLC
- Agent: John Maldjian
- Main IPC: G06F16/00
- IPC: G06F16/00; G06F40/14; G06F16/84; G06F40/18; G06F40/154; G06F40/166; G06F40/177

Abstract:
Much valuable information in documents is presented within tables. However, the information within tables is hard to extract automatically with high accuracy due to the wide variety and low quality of typical tables found in electronic documents. Information extraction technology can provide a method of extracting information from heterogeneous tables by recognizing tables, the header cells, and cells that are merged or should be merged, creating a richer representation of table structure and providing a convenient way of linking cells to their row and column headers. Use of this richer representation allows a few extraction patterns to successfully pull out information from a wide variety of differently formatted tables.
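
The idea of linking cells to their row and column headers can be illustrated with a minimal Python sketch. It is not the patented method: the input format (`(text, rowspan, colspan)` tuples), the function names `expand_grid` and `link_headers`, and the sample table are all hypothetical, and the number of header rows and columns is assumed to be known in advance rather than recognized automatically.

```python
from typing import Dict, List, Tuple

# Hypothetical input format: each cell is (text, rowspan, colspan).
RawCell = Tuple[str, int, int]


def expand_grid(rows: List[List[RawCell]]) -> List[List[str]]:
    """Copy each merged cell's text into every grid position it spans."""
    grid: Dict[Tuple[int, int], str] = {}
    for r, row in enumerate(rows):
        c = 0
        for text, rowspan, colspan in row:
            # Skip positions already filled by a cell spanning down from above.
            while (r, c) in grid:
                c += 1
            for dr in range(rowspan):
                for dc in range(colspan):
                    grid[(r + dr, c + dc)] = text
            c += colspan
    n_rows = max(r for r, _ in grid) + 1
    n_cols = max(c for _, c in grid) + 1
    return [[grid.get((r, c), "") for c in range(n_cols)] for r in range(n_rows)]


def link_headers(grid: List[List[str]], n_header_rows: int, n_header_cols: int):
    """Attach the row and column headers to every data cell.

    Assumption: the caller already knows how many header rows and
    header columns the table has.
    """
    records = []
    for r in range(n_header_rows, len(grid)):
        for c in range(n_header_cols, len(grid[r])):
            records.append({
                "value": grid[r][c],
                "row_headers": [grid[r][h] for h in range(n_header_cols)],
                "col_headers": [grid[h][c] for h in range(n_header_rows)],
            })
    return records


# Made-up table with a merged two-row header and a merged two-column header.
rows = [
    [("Drug", 2, 1), ("Dose (mg)", 1, 2)],
    [("Low", 1, 1), ("High", 1, 1)],
    [("Aspirin", 1, 1), ("75", 1, 1), ("300", 1, 1)],
]
grid = expand_grid(rows)
records = link_headers(grid, n_header_rows=2, n_header_cols=1)
for record in records:
    print(record)
```

Once every data cell carries its full header path, a single pattern such as "value whose column headers include Dose (mg)" can apply to tables with different layouts, which is the kind of benefit the abstract attributes to the richer representation.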
Public/Granted literature
- US20170329749A1 EXTRACTING INFORMATION FROM TABLES EMBEDDED WITHIN DOCUMENTS (Public/Granted day: 2017-11-16)