Techniques for image content extraction

Invention Grant

US11704785B2 Techniques for image content extraction (Active)

Patent Title: Techniques for image content extraction
Application No.: US17889801

Application Date: 2022-08-17
Publication No.: US11704785B2

Publication Date: 2023-07-18
Inventor: David James Wheaton; Stuart Dakari Cooke, III; William Robert Nadolski
Applicant: SAS Institute Inc.
Applicant Address: US NC Cary
Assignee: SAS INSTITUTE INC.
Current Assignee: SAS INSTITUTE INC.
Current Assignee Address: US NC Cary
Agency: KDB Firm PLLC
Main IPC: G06K9/00
IPC: G06K9/00; G06T7/00; G06F16/81; G06F16/93; G06F40/284; G06F40/186; G06F40/169; G06F3/04842; G06V10/40; G06K9/62; G06V30/10; G06V30/24; G06V30/418

Abstract:

Embodiments are directed to techniques for image content extraction. Some embodiments include extracting contextually structured data from document images, for example by automatically identifying document layout, document data, document metadata, and/or correlations among them in a document image. Some embodiments use breakpoints to enable the system to match different documents with internal variations to a common template. Several embodiments include extracting contextually structured data from table images, such as gridded and non-gridded tables. Many embodiments are directed to generating and using a document template database for automatically extracting document image contents into a contextually structured format. Several embodiments are directed to automatically identifying and associating document metadata with corresponding document data in a document image to generate a machine-facilitated annotation of the document image. In some embodiments, the machine-facilitated annotation may be used to generate a template for the template database.
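
To make the idea of extracting document contents against a template more concrete, here is a minimal sketch. It is not the method claimed in the patent, and it does not model breakpoints or table handling. It only assumes that a template can be represented as named rectangular regions and that recognized words arrive with center coordinates. The names `Box`, `Word`, and `extract_fields`, the template layout, and the sample values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in pixel coordinates (hypothetical template region)."""
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x: float, y: float) -> bool:
        return self.left <= x <= self.right and self.top <= y <= self.bottom


@dataclass(frozen=True)
class Word:
    """One recognized word, located by the center of its bounding box."""
    text: str
    x: float
    y: float


def extract_fields(words, template):
    """Map each template field name to the words whose centers fall in its region."""
    result = {}
    for name, region in template.items():
        hits = [w for w in words if region.contains(w.x, w.y)]
        # Approximate reading order: top to bottom, then left to right.
        hits.sort(key=lambda w: (w.y, w.x))
        result[name] = " ".join(w.text for w in hits)
    return result


# Hypothetical template: field name -> region where the value is expected.
template = {
    "invoice_number": Box(100, 0, 200, 20),
    "total": Box(100, 40, 200, 60),
}

# Hypothetical OCR output for one document image.
words = [
    Word("Invoice", 30, 10), Word("No.", 70, 10), Word("A-1027", 150, 10),
    Word("Total", 30, 50), Word("42.50", 150, 50), Word("USD", 180, 50),
]

print(extract_fields(words, template))
```

In this sketch the field names play the role of document metadata and the words inside each region play the role of document data, so the returned dictionary is one simple form of a contextually structured result.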

Public/Granted literature

US20220392047A1 TECHNIQUES FOR IMAGE CONTENT EXTRACTION Public/Granted day: 2022-12-08

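IPC Classification:

G	Physics
G06	Computing; calculating or counting
G06K	Graphical data reading (image or video recognition or understanding G06V); presentation of data; record carriers; handling record carriers
G06K9/00	Methods or arrangements for recognising patterns (methods or arrangements for graph-reading or for converting the pattern of mechanical parameters, e.g. force or presence, into electrical signals G06K11/00) (image or video recognition or understanding G06V) (speech recognition G10L15/00)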