Publication number: US20180032804A1
Publication date: 2018-02-01
Application number: US15221971
Filing date: 2016-07-28
Applicant: INTUIT INC.
Inventor: Vijay S. YELLAPRAGADA, Peijun CHIANG, Sreeneel MADDIKA
CPC classification number: G06K9/00442, G06F17/30011, G06F17/30265, G06K9/00483, G06K9/46, G06K9/6201, G06K2009/4666, G06T3/40, G06T7/0042, G06T2207/30176
Abstract: Techniques are disclosed for performing optical character recognition (OCR) by identifying a template based on a hash of a document. One embodiment includes a method for identifying a template associated with an image. The method includes receiving a digital image, a portion of the image depicting a first document, and extracting the portion of the image. The method further includes scaling the portion of the image and generating a first hash from the scaled image. The method further includes comparing the first hash to a set of hashes, each corresponding to a template. The method further includes selecting a first template as corresponding to the first document based on comparing the first hash to the set of hashes and extracting one or more sections of the portion of the image based on the selected first template. The method further includes performing OCR on the extracted one or more sections.
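The scale, hash, and compare steps of the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation: the abstract does not name a hash function or a comparison rule, so the average hash, the Hamming-distance comparison, the 8x8 scaled size, and the `max_distance` threshold are all assumptions, and every function name is hypothetical. Images are plain lists of grayscale rows so the example needs no external library; extracting the document portion and running OCR are left out.

```python
from typing import Dict, List, Optional

Image = List[List[int]]  # grayscale pixels (0-255), row-major


def scale_nearest(img: Image, size: int = 8) -> Image:
    """Scale an image to size x size with nearest-neighbour sampling."""
    h, w = len(img), len(img[0])
    return [[img[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]


def average_hash(img: Image, size: int = 8) -> int:
    """Assumed hash: one bit per scaled pixel, set when it exceeds the mean."""
    pixels = [p for row in scale_nearest(img, size) for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def select_template(doc_hash: int,
                    template_hashes: Dict[str, int],
                    max_distance: int = 10) -> Optional[str]:
    """Return the name of the closest template, or None if none is close enough."""
    best = min(template_hashes.items(),
               key=lambda kv: hamming(doc_hash, kv[1]),
               default=None)
    if best is None or hamming(doc_hash, best[1]) > max_distance:
        return None
    return best[0]


# Two synthetic 16x16 "templates" (hypothetical layouts):
# form_a is dark on top and bright below; form_b is dark on the left, bright on the right.
form_a = [[0] * 16 for _ in range(8)] + [[255] * 16 for _ in range(8)]
form_b = [[0] * 8 + [255] * 8 for _ in range(16)]
template_hashes = {"form_a": average_hash(form_a),
                   "form_b": average_hash(form_b)}

# A captured document resembling form_a, with one altered pixel.
document = [row[:] for row in form_a]
document[0][0] = 200

matched = select_template(average_hash(document), template_hashes)
```

In a fuller pipeline, the matched template would then supply the regions of the document image to crop and pass to an OCR engine. A distance threshold is used here so that a document unlike any known template yields no match rather than a wrong one; that choice is also an assumption of this sketch.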