-
Publication number: US20190089856A1
Publication date: 2019-03-21
Application number: US16191632
Filing date: 2018-11-15
Applicant: INTUIT INC.
Inventor: Vijay S. YELLAPRAGADA , Peijun CHIANG , Daniel LEE , Jason HALL , Shailesh SOLIWAL
Abstract: Aspects of the present disclosure provide methods and apparatuses for processing a digital image of a document, for example, to determine whether the document is a long document. An exemplary method generally includes obtaining a plurality of digital images of the document, segmenting at least a first digital image of the plurality of images into pixels associated with a foreground of the first digital image and pixels associated with a background of the first digital image, detecting a plurality of contours in the segmented first digital image, deciding, for each detected contour of the plurality of contours, whether that contour is an open contour or a closed contour, and determining that one or more sides of the document are out of bounds based, at least in part, on the decisions.
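The open/closed contour decision described in this abstract can be illustrated with a minimal sketch. It is not the patented method: the closure test (end points within a small tolerance of each other), the border test, and the names `is_closed` and `out_of_bounds_sides` are all assumptions made for illustration, and contour detection itself is assumed to have happened upstream.

```python
from typing import List, Set, Tuple

Point = Tuple[int, int]  # (x, y) pixel coordinates


def is_closed(contour: List[Point], tol: int = 1) -> bool:
    """Assumed rule: a contour is closed if its end points (nearly) meet."""
    if len(contour) < 3:
        return False
    (x0, y0), (x1, y1) = contour[0], contour[-1]
    return abs(x0 - x1) <= tol and abs(y0 - y1) <= tol


def out_of_bounds_sides(contours: List[List[Point]], width: int, height: int) -> Set[str]:
    """Report image sides where an open contour ends on the image border."""
    sides: Set[str] = set()
    for contour in contours:
        if not contour or is_closed(contour):
            continue  # closed contours lie fully inside the frame
        for x, y in (contour[0], contour[-1]):
            if x <= 0:
                sides.add("left")
            if x >= width - 1:
                sides.add("right")
            if y <= 0:
                sides.add("top")
            if y >= height - 1:
                sides.add("bottom")
    return sides
```

Under these assumptions, a document outline that runs off the bottom of the frame yields an open contour whose two end points sit on the bottom border, which is one plausible signal that the document is too long for a single image.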
-
Publication number: US20180096200A1
Publication date: 2018-04-05
Application number: US15285552
Filing date: 2016-10-05
Applicant: INTUIT INC.
Inventor: Eugene KRIVOPALTSEV , Sreeneel K. MADDIKA , Vijay S. YELLAPRAGADA
CPC classification number: G06K9/00442 , G06F3/0481 , G06F17/214 , G06K9/00463 , G06K9/4604 , G06K9/52 , G06K9/6256 , G06K2209/01 , G06T7/0042 , G06T7/0085 , G06T7/60 , G06T11/60
Abstract: Systems of the present disclosure generate accurate training data for optical character recognition (OCR). Systems disclosed herein generate images of a text passage as displayed piecemeal in a user interface (UI) element rendered in a selected font type and size, determine accurate dimensions and locations of bounding boxes for each character pictured in the images, stitch together a training image by concatenating the images, and associate the training image, the bounding box dimensions and locations, and the text passage together in a collection of training data. The collection of training data also includes a computer-readable master copy of the text passage with newline characters inserted therein.
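The stitching step in this abstract can be sketched as bookkeeping over bounding boxes. The sketch below assumes the image pieces are stacked vertically, which the abstract does not specify, and it tracks only box coordinates and text rather than pixel data. `Piece` and `stitch` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height) of one character


@dataclass
class Piece:
    text: str          # text shown in this rendered piece
    height: int        # pixel height of the piece image
    boxes: List[Box]   # character boxes relative to the piece


def stitch(pieces: List[Piece]) -> Tuple[str, List[Box], int]:
    """Concatenate pieces top to bottom (an assumption) and shift their boxes."""
    y_offset = 0
    all_boxes: List[Box] = []
    for piece in pieces:
        all_boxes.extend((x, y + y_offset, w, h) for x, y, w, h in piece.boxes)
        y_offset += piece.height
    # Master copy of the passage with a newline between pieces.
    master_text = "\n".join(piece.text for piece in pieces)
    return master_text, all_boxes, y_offset
```

The returned tuple holds the master text, the boxes in stitched-image coordinates, and the total height of the stitched image.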
-
Publication number: US20180032804A1
Publication date: 2018-02-01
Application number: US15221971
Filing date: 2016-07-28
Applicant: INTUIT INC.
Inventor: Vijay S. YELLAPRAGADA , Peijun CHIANG , Sreeneel MADDIKA
CPC classification number: G06K9/00442 , G06F17/30011 , G06F17/30265 , G06K9/00483 , G06K9/46 , G06K9/6201 , G06K2009/4666 , G06T3/40 , G06T7/0042 , G06T2207/30176
Abstract: Techniques are disclosed for performing optical character recognition (OCR) by identifying a template based on a hash of a document. One embodiment includes a method for identifying a template associated with an image. The method includes receiving a digital image, a portion of the image depicting a first document, and extracting the portion of the image. The method further includes scaling the portion of the image and generating a first hash from the scaled image. The method further includes comparing the first hash to a set of hashes, each corresponding to a template. The method further includes selecting a first template as corresponding to the first document based on comparing the first hash to the set of hashes and extracting one or more sections of the portion of the image based on the selected first template. The method further includes performing OCR on the extracted one or more sections.
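The scale, hash, and compare steps in this abstract can be illustrated with a small sketch. The abstract does not name a hash function; an average hash compared by Hamming distance is swapped in here purely as an assumption, and images are modeled as plain lists of grayscale rows. All function names are hypothetical.

```python
from typing import Dict, List

Image = List[List[int]]  # grayscale rows, values 0-255


def scale(img: Image, size: int = 8) -> Image:
    """Nearest-neighbour rescale to size x size."""
    h, w = len(img), len(img[0])
    return [[img[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]


def average_hash(img: Image, size: int = 8) -> int:
    """One bit per scaled pixel: 1 if brighter than the mean, else 0."""
    pixels = [p for row in scale(img, size) for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def select_template(doc_hash: int, template_hashes: Dict[str, int]) -> str:
    """Pick the template whose hash is closest to the document hash."""
    return min(template_hashes,
               key=lambda name: hamming(doc_hash, template_hashes[name]))
```

Once a template is selected, its known field locations would determine which sections of the image to crop and pass to OCR; that step is outside this sketch.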
-
Publication number: US20200244831A1
Publication date: 2020-07-30
Application number: US16850530
Filing date: 2020-04-16
Applicant: INTUIT INC.
Inventor: Vijay S. YELLAPRAGADA , Daniel LEE , Jason HALL , Shailesh SOLIWAL
Abstract: Aspects of the present disclosure provide methods and apparatuses for processing a digital image of a document, for example, to determine whether the document is a long document. An exemplary method generally includes obtaining a plurality of digital images of the document, segmenting at least a first digital image of the plurality of images into pixels associated with a foreground of the first digital image and pixels associated with a background of the first digital image, detecting a plurality of contours in the segmented first digital image, deciding, for each detected contour of the plurality of contours, whether that contour is an open contour or a closed contour, and determining that one or more sides of the document are out of bounds based, at least in part, on the decisions.
-
Publication number: US20180365488A1
Publication date: 2018-12-20
Application number: US16112190
Filing date: 2018-08-24
Applicant: INTUIT INC.
Inventor: Eugene KRIVOPALTSEV , Sreeneel K. MADDIKA , Vijay S. YELLAPRAGADA
IPC: G06K9/00 , G06T11/60 , G06T7/60 , G06K9/46 , G06T7/70 , G06K9/62 , G06K9/52 , G06T7/73 , G06F3/0481 , G06F17/21
CPC classification number: G06K9/00442 , G06F3/0481 , G06F17/214 , G06K9/00463 , G06K9/4604 , G06K9/52 , G06K9/6255 , G06K9/6256 , G06K2209/01 , G06T7/13 , G06T7/60 , G06T7/70 , G06T7/73 , G06T11/60
Abstract: Systems of the present disclosure generate accurate training data for optical character recognition (OCR). Systems disclosed herein generate images of a text passage as displayed piecemeal in a user interface (UI) element rendered in a selected font type and size, determine accurate dimensions and locations of bounding boxes for each character pictured in the images, stitch together a training image by concatenating the images, and associate the training image, the bounding box dimensions and locations, and the text passage together in a collection of training data. The collection of training data also includes a computer-readable master copy of the text passage with newline characters inserted therein.
-
Publication number: US20180365487A1
Publication date: 2018-12-20
Application number: US16111121
Filing date: 2018-08-23
Applicant: INTUIT INC.
Inventor: Eugene KRIVOPALTSEV , Sreeneel K. MADDIKA , Vijay S. YELLAPRAGADA
IPC: G06K9/00 , G06T11/60 , G06T7/60 , G06K9/46 , G06T7/70 , G06K9/62 , G06K9/52 , G06T7/73 , G06F3/0481 , G06F17/21
CPC classification number: G06K9/00442 , G06F3/0481 , G06F17/214 , G06K9/00463 , G06K9/4604 , G06K9/52 , G06K9/6255 , G06K9/6256 , G06K2209/01 , G06T7/13 , G06T7/60 , G06T7/70 , G06T7/73 , G06T11/60
Abstract: Systems of the present disclosure generate accurate training data for optical character recognition (OCR). Systems disclosed herein generate images of a text passage as displayed piecemeal in a user interface (UI) element rendered in a selected font type and size, determine accurate dimensions and locations of bounding boxes for each character pictured in the images, stitch together a training image by concatenating the images, and associate the training image, the bounding box dimensions and locations, and the text passage together in a collection of training data. The collection of training data also includes a computer-readable master copy of the text passage with newline characters inserted therein.
-
Publication number: US20180082146A1
Publication date: 2018-03-22
Application number: US15271562
Filing date: 2016-09-21
Applicant: INTUIT INC.
Inventor: Eugene KRIVOPALTSEV , Sreeneel K. MADDIKA , Vijay S. YELLAPRAGADA
CPC classification number: G06K9/4671 , G06K9/00442 , G06K2209/01
Abstract: The present disclosure includes techniques for selecting a candidate presentation style for individual documents for inclusion in an aggregate training data set for a document type that may be used to train an OCR processing engine prior to identifying text in an image of a document of the document type. In one embodiment, text input corresponding to a text sample in a document is received, and an image of the text sample in the document is received. For each of a plurality of candidate presentation styles, an OCR processing engine is trained using a training data set corresponding to the given candidate presentation style, and the OCR processing engine is used, as trained, to identify text in the received image. The OCR processing results for each candidate presentation style are compared to the received text input. A candidate presentation style for the document is selected based on the comparisons.
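The selection loop in this abstract (run OCR once per candidate style, compare each result to the known text, keep the best) can be sketched as follows. Training and running an OCR engine are out of scope here, so each candidate is represented by a hypothetical callable that returns the text recognized by an engine trained for that style. The similarity measure, `difflib.SequenceMatcher.ratio()` from the Python standard library, is an assumption; the abstract does not say how results are compared.

```python
from difflib import SequenceMatcher
from typing import Callable, Dict


def select_style(expected_text: str,
                 ocr_by_style: Dict[str, Callable[[], str]]) -> str:
    """Return the style whose OCR output best matches the expected text.

    ocr_by_style maps a style name to a hypothetical callable that returns
    the text recognized by an engine trained on that style's data.
    """
    scores = {
        style: SequenceMatcher(None, expected_text, run_ocr()).ratio()
        for style, run_ocr in ocr_by_style.items()
    }
    return max(scores, key=scores.get)


# Usage with stand-in OCR results for two made-up style names.
candidates = {
    "serif_10pt": lambda: "Tota1: 4Z.00",
    "mono_12pt": lambda: "Total: 42.00",
}
best = select_style("Total: 42.00", candidates)
```

A character-level edit distance would serve equally well as the comparison; the ratio is used only because it needs no third-party dependency.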