Document information extraction system using sequenced comparators

Invention Grant

US11657101B2 Document information extraction system using sequenced comparators (in force)

Patent Title: Document information extraction system using sequenced comparators
Application No.: US16740754

Application Date: 2020-01-13
Publication No.: US11657101B2

Publication Date: 2023-05-23
Inventor: Prabhdeep Singh Walia, Vikas Kushwaha
Applicant: Goldman Sachs & Co. LLC
Applicant Address: US NY New York
Assignee: Goldman Sachs & Co. LLC
Current Assignee: Goldman Sachs & Co. LLC
Current Assignee Address: US NY New York
Agency: Fenwick & West LLP
Main IPC: G06F16/93
IPC: G06F16/93; G06F16/22; G06F16/28; G06F16/904; G06F40/103

Abstract:

A document information extraction system determines a structure of an electronic document based on characteristics of the document's constituent elements. The system segments the document to generate elements with each element having similar characteristics. Elements may be clustered to assist in determining the document structure. The system determines directional relationships between elements (e.g., above, below, etc.). The system then employs a master comparator to determine familial relationships between adjacent elements. The master comparator includes a set of unit comparators and each unit comparator compares a specific characteristic between two elements. The master comparator sequentially applies the unit comparators to determine the familial relationship based on the comparisons. The system outputs a document hierarchy tree reflecting the determined familial relationships. The hierarchy tree represents the structure of the document.
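
The sequencing idea in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the `Element` fields (`font_size`, `bold`, `indent`), the three example unit comparators, the rule that the first unit comparator to reach a decision wins, the `SIBLING` fallback, and the stack-based tree builder that treats the elements as a simple reading-order list.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List, Optional


class Relation(Enum):
    CHILD = "child"      # second element is nested under the first
    SIBLING = "sibling"  # both elements sit at the same level
    PARENT = "parent"    # second element ranks above the first


@dataclass
class Element:
    # Hypothetical characteristics; the abstract does not enumerate them.
    text: str
    font_size: float
    bold: bool = False
    indent: float = 0.0


# A unit comparator checks one characteristic and returns a Relation,
# or None when that characteristic does not separate the two elements.
UnitComparator = Callable[[Element, Element], Optional[Relation]]


def compare_font_size(a: Element, b: Element) -> Optional[Relation]:
    if a.font_size > b.font_size:
        return Relation.CHILD
    if a.font_size < b.font_size:
        return Relation.PARENT
    return None


def compare_bold(a: Element, b: Element) -> Optional[Relation]:
    if a.bold and not b.bold:
        return Relation.CHILD
    if b.bold and not a.bold:
        return Relation.PARENT
    return None


def compare_indent(a: Element, b: Element) -> Optional[Relation]:
    if b.indent > a.indent:
        return Relation.CHILD
    if b.indent < a.indent:
        return Relation.PARENT
    return None


def master_comparator(a: Element, b: Element,
                      units: List[UnitComparator]) -> Relation:
    """Apply unit comparators in order; the first decisive one wins (assumed rule)."""
    for unit in units:
        result = unit(a, b)
        if result is not None:
            return result
    return Relation.SIBLING  # no characteristic distinguishes the pair


@dataclass
class Node:
    element: Optional[Element]  # None marks the synthetic document root
    children: List["Node"] = field(default_factory=list)


def build_hierarchy(elements: List[Element],
                    units: List[UnitComparator]) -> Node:
    """Build a hierarchy tree from elements given in reading order."""
    root = Node(None)
    stack = [root]  # path from the root to the most recently placed node
    for el in elements:
        # Climb until the node on top of the stack is a parent of `el`.
        while len(stack) > 1 and master_comparator(
                stack[-1].element, el, units) is not Relation.CHILD:
            stack.pop()
        node = Node(el)
        stack[-1].children.append(node)
        stack.append(node)
    return root


UNITS = [compare_font_size, compare_bold, compare_indent]
```

Calling `build_hierarchy(elements, UNITS)` on a list of `Element` objects returns the root `Node` of the tree. The order of `UNITS` acts as a priority ranking in this sketch: a later comparator is consulted only when every earlier one returns `None`.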

Published/granted literature

US20210216595A1 Document information extraction system using sequenced comparators. Publication date: 2021-07-15

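IPC classification:

G	Physics
G06	Computing; calculating or counting
G06F	Electric digital data processing (computer systems based on specific computational models are classified in G06N)
G06F16/00	Information retrieval; database structures therefor; file system structures therefor
G06F16/90	. Details of database functions independent of the retrieved data types
G06F16/93	.. Document management systems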