Optical character recognition of series of images
Abstract:
Systems and methods for performing optical character recognition (OCR) are disclosed. An example method may include receiving a current image that overlaps with a previous image of a series of images of an original document; performing OCR of the current image to produce an OCR text; identifying a plurality of textual artifacts in the images, each represented by a sequence of symbols having a frequency of occurrence within the OCR text falling below a threshold frequency; identifying corresponding base points that are each associated with one of the textual artifacts; identifying parameters of a coordinate transformation converting coordinates of the previous image into coordinates of the current image; associating part of the OCR text with a cluster of symbol sequences, the symbol sequences being produced by processing previously received images; identifying a median string representing the cluster; and producing a resulting OCR text representing a portion of the original document.
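Two of the steps above, selecting low-frequency symbol sequences and choosing a median string for a cluster, can be illustrated with a minimal Python sketch. The abstract does not define how sequences are delimited, which distance metric is used, or how the median is computed, so the following are assumptions made only for illustration: symbol sequences are whitespace-delimited words, the distance is Levenshtein edit distance, and the median is restricted to members of the cluster (a "set median"). The function names `rare_sequences`, `levenshtein`, and `set_median` are hypothetical and do not come from the disclosure.

```python
from collections import Counter


def rare_sequences(ocr_text: str, threshold: float) -> list[str]:
    """Return words whose relative frequency in ocr_text is below threshold.

    Assumption: a "sequence of symbols" is a whitespace-delimited word.
    """
    words = ocr_text.split()
    if not words:
        return []
    counts = Counter(words)
    total = len(words)
    # Counter preserves first-occurrence order, so the result does too.
    return [w for w, c in counts.items() if c / total < threshold]


def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insert, delete, substitute cost 1)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def set_median(cluster: list[str]) -> str:
    """Pick the cluster member with the smallest total distance to all members.

    Assumption: the median is chosen from the cluster itself, not synthesized.
    """
    return min(cluster, key=lambda s: sum(levenshtein(s, t) for t in cluster))


if __name__ == "__main__":
    # Words that are rare in the text are candidate textual artifacts.
    print(rare_sequences("the cat the dog the bird", threshold=0.2))
    # Several OCR readings of the same text fragment from overlapping images.
    print(set_median(["recognition", "recogmtion", "recognition", "recoqnition"]))
```

Restricting the median to cluster members keeps the sketch short; a generalized median string, which need not appear in the cluster, would require a more involved search.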