Invention Grant
- Patent Title: Automated document extraction and classification
- Application No.: US16054781
- Application Date: 2018-08-03
- Publication No.: US10977291B2
- Publication Date: 2021-04-13
- Inventor: Ronnie Douglas Douthit, Deepankar Mohapatra, Ram Mohan Shamanna, Chiranjeev Jagannadha Reddy, Yexin Huang, Trichur Shivaramakrishnan Subramanian, Chinnadurai Duraisami, Karpaga Ganesh Patchirajan, Amar J. Mattey
- Applicant: Ronnie Douglas Douthit, Deepankar Mohapatra, Ram Mohan Shamanna, Chiranjeev Jagannadha Reddy, Yexin Huang, Trichur Shivaramakrishnan Subramanian, Chinnadurai Duraisami, Karpaga Ganesh Patchirajan, Amar J. Mattey
- Applicant Address: US TX Frisco; US TX The Colony; US TX Frisco; US TX Frisco; US TX Plano; US TX McKinney; US TX Plano; US TX Plano; US TX Frisco
- Assignee: Ronnie Douglas Douthit, Deepankar Mohapatra, Ram Mohan Shamanna, Chiranjeev Jagannadha Reddy, Yexin Huang, Trichur Shivaramakrishnan Subramanian, Chinnadurai Duraisami, Karpaga Ganesh Patchirajan, Amar J. Mattey
- Current Assignee: Ronnie Douglas Douthit, Deepankar Mohapatra, Ram Mohan Shamanna, Chiranjeev Jagannadha Reddy, Yexin Huang, Trichur Shivaramakrishnan Subramanian, Chinnadurai Duraisami, Karpaga Ganesh Patchirajan, Amar J. Mattey
- Current Assignee Address: US TX Frisco; US TX The Colony; US TX Frisco; US TX Frisco; US TX Plano; US TX McKinney; US TX Plano; US TX Plano; US TX Frisco
- Agency: Ferguson Braswell Fraser Kubasta PC
- Main IPC: G06F7/00
- IPC: G06F7/00 ; G06F16/35 ; G06N5/02 ; G06Q40/00 ; G06F16/93

Abstract:
A method includes receiving a source file containing a plurality of documents which, to a computer, are initially indistinguishable from each other. A first classification stage is applied to the source file, using convolutional neural network image classification, to identify source documents in the plurality of documents and to produce a partially parsed file having a plurality of identified source documents. The partially parsed file includes sub-images corresponding to the plurality of identified source documents. A second classification stage, including a natural language processing artificial intelligence, is applied to sets of text in bounding boxes of the sub-images, each set of text corresponding to one of the sub-images, to classify each of the plurality of identified source documents as a corresponding sub-type of document. A parsed file having a plurality of identified sub-types of documents is produced. The parsed file is then processed further by computer.
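The two-stage flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `SubImage`, `ParsedDocument`, `classify_source_file`, `toy_image_stage`, and `toy_text_stage` are hypothetical, the page representation and the labels are made up for illustration, and both stages are simple stand-ins (a lookup and a keyword rule) for the convolutional neural network and the natural language processing model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

BoundingBox = Tuple[int, int, int, int]  # (left, top, right, bottom), assumed pixel units


@dataclass
class SubImage:
    """One identified source document inside the source file (stage 1 output)."""
    page: int
    box: BoundingBox
    doc_type: str  # coarse label assigned by the image stage
    text: str      # text found inside the bounding box


@dataclass
class ParsedDocument:
    """A sub-image together with its sub-type (stage 2 output)."""
    sub_image: SubImage
    sub_type: str


Page = List[Dict]  # hypothetical page format: a list of region dictionaries
ImageStage = Callable[[int, Page], List[SubImage]]
TextStage = Callable[[SubImage], str]


def classify_source_file(pages: List[Page],
                         image_stage: ImageStage,
                         text_stage: TextStage) -> List[ParsedDocument]:
    # Stage 1: find source documents on each page (the "partially parsed file").
    partially_parsed: List[SubImage] = []
    for page_no, page in enumerate(pages):
        partially_parsed.extend(image_stage(page_no, page))
    # Stage 2: assign a sub-type to each sub-image from its text (the "parsed file").
    return [ParsedDocument(sub, text_stage(sub)) for sub in partially_parsed]


def toy_image_stage(page_no: int, page: Page) -> List[SubImage]:
    # Stand-in for the CNN: each region already carries its box and coarse label.
    return [SubImage(page_no, r["box"], r["doc_type"], r["text"]) for r in page]


def toy_text_stage(sub: SubImage) -> str:
    # Stand-in for the NLP model: keyword matching on the extracted text.
    text = sub.text.lower()
    if "w-2" in text:
        return "W-2"
    if "1099" in text:
        return "1099"
    return "unknown"


# Made-up input: two pages, each containing one region.
pages = [
    [{"box": (0, 0, 100, 50), "doc_type": "form", "text": "Form W-2 Wage and Tax Statement"}],
    [{"box": (0, 0, 100, 50), "doc_type": "form", "text": "Form 1099-INT Interest Income"}],
]
parsed = classify_source_file(pages, toy_image_stage, toy_text_stage)
print([(d.sub_image.page, d.sub_type) for d in parsed])
```

Passing the two stages in as callables keeps the pipeline shape separate from the models, so a real image classifier or text classifier could be substituted without changing `classify_source_file`.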
Public/Granted literature
- US20200042645A1 AUTOMATED DOCUMENT EXTRACTION AND CLASSIFICATION Public/Granted day: 2020-02-06