System and method for automatic detection and verification of optical character recognition data
Abstract:
Systems and methods for automatically verifying optical character recognition (OCR) detected text of a native electronic document having an image layer comprising a matrix of pixels and a text layer comprising a sequence of characters. The method includes determining a location of OCR-detected text in the text layer of the native electronic document based on a pixel-based coordinate location of the OCR-detected text in the image layer of the native electronic document. The method also includes applying the location of the OCR-detected text to the text layer of the native electronic document to detect text in the text layer corresponding to the OCR-detected text. The method also includes rendering only the detected text in the text layer as an output when the OCR-detected text does not match the detected text in the text layer, to improve accuracy of the output text.
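The verification flow described in the abstract can be illustrated with a minimal sketch. It is not the claimed implementation. It assumes the text layer is available as a list of characters, each with a bounding box already expressed in the image layer's pixel coordinates. The names `TextLayerChar`, `center_inside`, `text_layer_text_at`, and `verify_ocr_text` are hypothetical, as is the fallback to the OCR text when the text layer has no characters at that location.

```python
from dataclasses import dataclass
from typing import List, Tuple

# (left, top, right, bottom) in image-layer pixel coordinates (assumed convention)
Box = Tuple[float, float, float, float]


@dataclass
class TextLayerChar:
    """Hypothetical record: one text-layer character and its pixel-space box."""
    char: str
    box: Box


def center_inside(inner: Box, outer: Box) -> bool:
    """True if the center point of `inner` lies within `outer`."""
    cx = (inner[0] + inner[2]) / 2
    cy = (inner[1] + inner[3]) / 2
    return outer[0] <= cx <= outer[2] and outer[1] <= cy <= outer[3]


def text_layer_text_at(text_layer: List[TextLayerChar], region: Box) -> str:
    """Collect, in sequence order, the text-layer characters located in `region`."""
    return "".join(c.char for c in text_layer if center_inside(c.box, region))


def verify_ocr_text(ocr_text: str, ocr_box: Box,
                    text_layer: List[TextLayerChar]) -> str:
    """Return the text to output for one OCR detection.

    If the text layer holds characters at the OCR location and they differ
    from the OCR result, only the text-layer text is returned.
    """
    layer_text = text_layer_text_at(text_layer, ocr_box)
    if not layer_text:
        # Assumption: with nothing to verify against, keep the OCR result.
        return ocr_text
    if ocr_text == layer_text:
        return ocr_text
    return layer_text


# Example: OCR misreads the digit zero as the letter "O".
layer = [
    TextLayerChar("1", (10, 10, 20, 30)),
    TextLayerChar("0", (20, 10, 30, 30)),
    TextLayerChar("0", (30, 10, 40, 30)),
    TextLayerChar("X", (200, 10, 210, 30)),  # outside the OCR region
]
result = verify_ocr_text("1O0", (8, 8, 42, 32), layer)
print(result)  # prints "100"
```

Testing whether each character's center falls inside the OCR box is one simple way to map the pixel-based location onto the text layer. An overlap-ratio test would be a reasonable alternative where boxes are loosely aligned.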