Invention Grant
US08208726B2 Method and system for optical character recognition using image clustering
In force
- Patent Title: Method and system for optical character recognition using image clustering
- Application No.: US12841839
- Application Date: 2010-07-22
- Publication No.: US08208726B2
- Publication Date: 2012-06-26
- Inventor: Kave Eshghi, George Forman, Prakash Reddy
- Applicant: Kave Eshghi, George Forman, Prakash Reddy
- Applicant Address: US TX Houston
- Assignee: Hewlett-Packard Development Company, L.P.
- Current Assignee: Hewlett-Packard Development Company, L.P.
- Current Assignee Address: US TX Houston
- Main IPC: G06K9/34
- IPC: G06K9/34

Abstract:
The present disclosure provides a computer-implemented method of translating an image-based electronic document into a text-based electronic document. The method includes electronically scanning an image-based document to determine positions of word images in the image-based document. The method also includes extracting the word images from the image-based document and storing the word images to an electronic storage device. The method also includes grouping a subset of the word images into a word cluster based on a similarity of the word images, wherein the word images in the word cluster correspond to a same actual word. The method also includes generating a character-encoded transcription for the word cluster based on the word images in the word cluster. The method also includes adding the character-encoded transcription to a text-based electronic document at locations corresponding to the positions of the word images in the image-based document.
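The flow described in the abstract (extract word images, cluster similar ones, transcribe each cluster once, write the transcription back at every member's position) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: the abstract does not say how similarity is measured or how clusters are formed, so the Hamming distance, the greedy clustering, the `WordImage` type, the integer `position`, and the `recognize` callback are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Toy stand-in for a word image: rows of '0'/'1' pixels (assumption).
Bitmap = Tuple[str, ...]


@dataclass
class WordImage:
    position: int   # reading-order index of the word on the page (assumption)
    bitmap: Bitmap


def hamming(a: Bitmap, b: Bitmap) -> int:
    """Count differing pixels; differently shaped bitmaps count as maximally far apart."""
    if len(a) != len(b) or any(len(x) != len(y) for x, y in zip(a, b)):
        return max(sum(len(r) for r in a), sum(len(r) for r in b))
    return sum(p != q for x, y in zip(a, b) for p, q in zip(x, y))


def cluster_words(words: List[WordImage], max_distance: int) -> List[List[WordImage]]:
    """Greedy clustering: a word joins the first cluster whose first member is close enough."""
    clusters: List[List[WordImage]] = []
    for word in words:
        for cluster in clusters:
            if hamming(cluster[0].bitmap, word.bitmap) <= max_distance:
                cluster.append(word)
                break
        else:
            clusters.append([word])
    return clusters


def transcribe_document(
    words: List[WordImage],
    recognize: Callable[[Bitmap], str],  # hypothetical per-cluster recognizer
    max_distance: int = 1,
) -> str:
    """Transcribe each cluster once, then place the text at every member's position."""
    text_at: Dict[int, str] = {}
    for cluster in cluster_words(words, max_distance):
        transcription = recognize(cluster[0].bitmap)  # one call per cluster
        for word in cluster:
            text_at[word.position] = transcription
    return " ".join(text_at[i] for i in sorted(text_at))
```

In this sketch the recognizer runs once per cluster rather than once per word image, so a noisy copy of a word inherits the transcription of the cluster it falls into. A real system would presumably use all images in a cluster, not just the first, when generating the transcription.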
Public/Granted literature
- US20120020561A1 METHOD AND SYSTEM FOR OPTICAL CHARACTER RECOGNITION USING IMAGE CLUSTERING Public/Granted day: 2012-01-26