Invention Grant
- Patent Title: Automatic Arabic text image optical character recognition method
- Application No.: US12382916
- Application Date: 2009-03-26
- Publication No.: US08150160B2
- Publication Date: 2012-04-03
- Inventor: Husni A. Al-Muhtaseb, Sabri A. Mahmoud, Rami Qahwaji
- Applicant: Husni A. Al-Muhtaseb, Sabri A. Mahmoud, Rami Qahwaji
- Applicant Address: SA Dhahran
- Assignee: King Fahd University of Petroleum & Minerals
- Current Assignee: King Fahd University of Petroleum & Minerals
- Current Assignee Address: SA Dhahran
- Agent: Richard C. Litman
- Main IPC: G06K9/18
- IPC: G06K9/18; G06K9/46; G06K9/66

Abstract:
The automatic Arabic text image optical character recognition method includes training a text recognition system on printed Arabic text, using the resulting models to classify previously unseen scanned Arabic text, and generating the corresponding textual information. Scanned images of Arabic text and copies of minimal Arabic text are used in the training sessions. Each page is segmented into lines. Features of each line are extracted and input to a Hidden Markov Model (HMM). The features of all training data are used. The HMM runs training algorithms to produce codebook and language models. In the classification stage, new Arabic text is input in scanned form and passed through line segmentation, where the lines are extracted. In the feature stage, line features are extracted and input to the classification stage, which generates the corresponding Arabic text.
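The line segmentation and feature extraction steps described in the abstract can be illustrated with a minimal sketch. The abstract does not say how lines are segmented or which features are computed, so everything below is an assumption made for illustration: a horizontal projection profile is used to find text lines in a binary page image, and a per-column ink count, read right to left, stands in for the line features. The names `segment_lines` and `line_features` are hypothetical, and the HMM training and classification stages are not shown.

```python
from typing import List, Tuple

# Assumed representation: a binary page image, 1 = ink, 0 = background.
Image = List[List[int]]


def segment_lines(page: Image, min_ink: int = 1) -> List[Tuple[int, int]]:
    """Return (top, bottom) row ranges, bottom exclusive, one per text line.

    Assumption: a row belongs to a text line when it holds at least
    `min_ink` ink pixels (horizontal projection profile).
    """
    lines: List[Tuple[int, int]] = []
    start = None
    for y, row in enumerate(page):
        has_ink = sum(row) >= min_ink
        if has_ink and start is None:
            start = y                      # a text line begins
        elif not has_ink and start is not None:
            lines.append((start, y))       # a blank row ends the line
            start = None
    if start is not None:                  # line runs to the page bottom
        lines.append((start, len(page)))
    return lines


def line_features(page: Image, top: int, bottom: int) -> List[int]:
    """Ink count per column of one line, read right to left.

    Assumption: right-to-left order mirrors Arabic reading direction;
    a real system would likely use richer features per window.
    """
    width = len(page[0]) if page else 0
    return [
        sum(page[y][x] for y in range(top, bottom))
        for x in range(width - 1, -1, -1)
    ]
```

Under these assumptions, each line's feature sequence would be the observation sequence passed to the HMM, both during training and during classification.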
Public/Granted literature
- US20100246963A1 Automatic arabic text image optical character recognition method. Public/Granted day: 2010-09-30