Invention Grant
- Patent Title: Techniques for image content extraction
- Application No.: US17889801
- Application Date: 2022-08-17
- Publication No.: US11704785B2
- Publication Date: 2023-07-18
- Inventor: David James Wheaton; Stuart Dakari Cooke, III; William Robert Nadolski
- Applicant: SAS Institute Inc.
- Applicant Address: US NC Cary
- Assignee: SAS INSTITUTE INC.
- Current Assignee: SAS INSTITUTE INC.
- Current Assignee Address: US NC Cary
- Agency: KDB Firm PLLC
- Main IPC: G06K9/00
- IPC: G06K9/00; G06T7/00; G06F16/81; G06F16/93; G06F40/284; G06F40/186; G06F40/169; G06F3/04842; G06V10/40; G06K9/62; G06V30/10; G06V30/24; G06V30/418

Abstract:
Embodiments are directed to techniques for image content extraction. Some embodiments include extracting contextually structured data from document images, for instance by automatically identifying document layout, document data, document metadata, and/or correlations therebetween in a document image. Some embodiments utilize breakpoints to enable the system to match different documents with internal variations to a common template. Several embodiments include extracting contextually structured data from table images, such as gridded and non-gridded tables. Many embodiments are directed to generating and utilizing a document template database for automatically extracting document image contents into a contextually structured format. Several embodiments are directed to automatically identifying and associating document metadata with corresponding document data in a document image to generate a machine-facilitated annotation of the document image. In some embodiments, the machine-facilitated annotation may be used to generate a template for the template database.
Public/Granted literature
- US20220392047A1 TECHNIQUES FOR IMAGE CONTENT EXTRACTION (Public/Granted day: 2022-12-08)