Method and system for optical character recognition of series of images
Abstract:
Systems and methods for performing OCR of a series of images depicting text symbols. An example method comprises: receiving, by a processing device, a current image of a series of images of an original document, wherein the current image at least partially overlaps with a previous image of the series of images; performing optical character recognition (OCR) of the current image to produce an OCR text and a corresponding text layout; associating at least part of the OCR text with a first cluster of a plurality of clusters of symbol sequences associated with one or more previously received images of the series of images; identifying a first string representing the first cluster of symbol sequences based on a first subset of images of the series of images; identifying a first template field of a document template corresponding to the first cluster based on the first string representing the first cluster and the text layout of the current image; identifying, for the first cluster, a second-level median string based on one or more parameters of the first template field; and producing, using the second-level median string, a resulting OCR text representing at least a portion of the first template field of the original document.
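The abstract does not define how the representative strings are computed. As a minimal illustrative sketch only, the following assumes that a cluster's representative is its set median under Levenshtein distance (the member string with the smallest total edit distance to the others), and that a second-level median is the set median of the representatives obtained from successive subsets of images. The function names, the fixed `window` grouping, and the sample data are hypothetical and are not taken from the patent; template field matching is omitted.

```python
from typing import Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b, using a rolling row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def set_median(strings: Sequence[str]) -> str:
    """Return the member with the smallest summed distance to all members."""
    if not strings:
        raise ValueError("cluster must not be empty")
    return min(strings, key=lambda s: sum(edit_distance(s, t) for t in strings))


def second_level_median(cluster: Sequence[str], window: int = 3) -> str:
    """Assumed scheme: take a median per subset of frames, then a median of those."""
    first_level = [set_median(cluster[i:i + window])
                   for i in range(0, len(cluster), window)]
    return set_median(first_level)


# Hypothetical OCR readings of one field across six overlapping frames.
readings = ["Inv0ice", "Invoice", "Invoice", "lnvoice", "Invoice", "Invoice"]
result = second_level_median(readings, window=3)
```

With these sample readings, each three-frame subset yields "Invoice" as its median, so the second-level result is also "Invoice"; isolated misreadings such as "Inv0ice" are outvoted by the other frames.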