-
Publication No.: DE69529015T2
Publication date: 2003-10-09
Application No.: DE69529015
Filing date: 1995-04-04
Applicant: IBM
Inventor: CASEY RICHARD G , TAKAHASHI HIROYASU
Abstract: An automated optical character recognition method is provided for use in conjunction with a programmable digital processing device. The method inputs a sequence of values representing one or more characters in an array of characters to be optically recognized. The values define one or more dimensional characteristics of the characters. From the input values, a standard dimensional value is determined from a frequency distribution of a selected one of the character dimensional characteristics. For each of the input characters, a set of normalized values is determined from the standard dimensional value. The normalized values correspond to the one or more character dimensional characteristics. Optical character recognition is thereafter performed using the normalized values.
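The normalization step described in this abstract can be sketched as follows. This is a minimal illustration only: it assumes the selected dimensional characteristic is character height, that the "standard dimensional value" is the most frequent height, and that normalization is a simple division. The abstract states none of these choices explicitly, and the function names are hypothetical.

```python
from collections import Counter

def standard_height(heights):
    """Pick the most frequent height from the frequency distribution.

    Assumption: ties are resolved in favour of the smaller height.
    """
    counts = Counter(heights)
    best = max(counts.values())
    return min(h for h, c in counts.items() if c == best)

def normalize(chars, std):
    """Scale each (width, height) pair by the standard value."""
    return [(w / std, h / std) for w, h in chars]

# Hypothetical measurements: (width, height) of five segmented characters.
chars = [(10, 20), (12, 20), (8, 14), (11, 20), (9, 21)]
std = standard_height([h for _, h in chars])
normalized = normalize(chars, std)
```

A recognizer would then compare the normalized values, rather than raw pixel sizes, against its reference data, which makes the comparison independent of the font size on the page.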
-
Publication No.: JP2001109844A
Publication date: 2001-04-20
Application No.: JP28517799
Filing date: 1999-10-06
Applicant: IBM
Inventor: TAKAHASHI HIROYASU
Abstract: PROBLEM TO BE SOLVED: To extract character strings such as handwritten addresses quickly and with high precision while avoiding a complicated integration process over all pixels. SOLUTION: A connected component (CC) detection part 13 detects connected components (CC) formed of connected black pixels in a binarized image. A character-size connected component (CharCC) extraction part 14 extracts, from the detected components, those that are neither too large nor too small. A lateral expansion part 21 and a longitudinal expansion part 22 expand each extracted character-size component in an assumed character string direction and contract it at right angles to that direction. Long, thin connected component (LongCC) extraction parts 25 and 26 then merge the expanded and contracted components along the assumed character string direction to extract long, thin connected components, and a character string selection part 30 selects the target character string for image recognition according to the extracted long, thin components.
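The first two stages of this pipeline (component detection and the character-size filter) can be sketched as below. This is an illustration under stated assumptions, not the patented implementation: the image is a list of 0/1 rows, connectivity is 4-neighbour, and "neither too large nor too small" is tested on the longer side of the bounding box. The function names and thresholds are hypothetical, and the expansion, contraction, and merging stages are not shown.

```python
from collections import deque

def connected_components(image):
    """Return bounding boxes (x0, y0, x1, y1) of 4-connected black regions."""
    h, w = len(image), len(image[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not image[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(x, y)])
            x0 = x1 = x
            y0 = y1 = y
            while queue:  # breadth-first flood fill of one component
                cx, cy = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if 0 <= nx < w and 0 <= ny < h and image[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes

def character_sized(boxes, min_size, max_size):
    """Keep boxes whose longer side lies within [min_size, max_size]."""
    kept = []
    for x0, y0, x1, y1 in boxes:
        longer = max(x1 - x0 + 1, y1 - y0 + 1)
        if min_size <= longer <= max_size:
            kept.append((x0, y0, x1, y1))
    return kept

image = [
    [1, 1, 0, 0, 0, 0, 1],
    [1, 1, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1],
]
boxes = connected_components(image)
char_boxes = character_sized(boxes, min_size=2, max_size=4)
```

In this toy image the two single-pixel specks are rejected as too small and the full-width bar as too large, leaving only the 2x2 block as a character-size candidate.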
-
Publication No.: JP2000322510A
Publication date: 2000-11-24
Application No.: JP11931599
Filing date: 1999-04-27
Applicant: IBM
Inventor: KATO MAKOTO , TAKAHASHI HIROYASU
Abstract: PROBLEM TO BE SOLVED: To remove ruled lines at high speed, and to restore at high speed the character parts erased along with them, by detecting the black runs of lateral ruled lines and storing them as a run-length table for each longitudinal position. SOLUTION: Black runs are detected in an image area, and for each longitudinal position of the detected runs a run-length table is prepared, consisting of a lateral starting point and a length from that starting point (S1). Black runs whose length is equal to or greater than a threshold value are selected and removed from the image (S2). Residual noise left as traces of the ruled line is erased (S3). Components that may have been split longitudinally by the removal are simply joined (S4). The erased black runs are reproduced on an image of the same size as the input image (S5). A pixel-wise AND is taken so that only the erased character parts that overlapped the ruled line remain (S6). The ruled-line-removed image and the erased black pixels of the character parts are ORed pixel by pixel and combined, restoring the parts that were erased together with the ruled line (S7).
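Steps S1 and S2 can be sketched as follows. This is a minimal illustration, assuming the image is a list of 0/1 rows and that a run is removed when its length reaches a caller-supplied threshold. The function names are hypothetical, and steps S3 to S7 (noise erasure, rejoining, and restoration of character strokes) are not shown.

```python
def black_runs(row):
    """Return (start, length) for each horizontal run of black pixels."""
    runs, start = [], None
    for x, pixel in enumerate(row):
        if pixel and start is None:
            start = x
        elif not pixel and start is not None:
            runs.append((start, x - start))
            start = None
    if start is not None:  # run that reaches the right edge
        runs.append((start, len(row) - start))
    return runs

def remove_long_runs(image, min_length):
    """Build the per-row run-length table (S1) and erase long runs (S2).

    Returns the table, the cleaned image, and a same-size image holding
    only the erased pixels, which later steps would need for restoration.
    """
    table = {y: black_runs(row) for y, row in enumerate(image)}
    cleaned = [list(row) for row in image]
    removed = [[0] * len(row) for row in image]
    for y, runs in table.items():
        for start, length in runs:
            if length >= min_length:
                for x in range(start, start + length):
                    cleaned[y][x] = 0
                    removed[y][x] = 1
    return table, cleaned, removed

# A ruled line (middle row) crossed by two vertical strokes.
image = [
    [0, 1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 0],
]
table, cleaned, removed = remove_long_runs(image, min_length=5)
```

Note that erasing the middle row also cuts both vertical strokes in two, which is exactly the damage that steps S4 to S7 of the abstract set out to repair.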
-
Publication No.: JPH10162099A
Publication date: 1998-06-19
Application No.: JP31724296
Filing date: 1996-11-28
Applicant: IBM
Inventor: TAKAHASHI HIROYASU , ARIMA TOSHIMICHI
Abstract: PROBLEM TO BE SOLVED: To provide a high-speed image processing system with a simple user interface that can automatically determine rectangular objects such as character frames, page marks, and position correction marks with only a mouse click. SOLUTION: The scanned image of a document containing a black frame 500 is displayed on a display. For each recognition field, the character frame at the left end is clicked with the mouse, and then the character frame at the right end of the same field is clicked. The field is thereby designated automatically. At that time, a field position/size decision program scans the image leftward, rightward, upward, and downward from the two clicked points, and a walk on the black frame is selected. Furthermore, a rectangle is set between the two character frames and a histogram is taken. The number of character frames and the thickness of the black line between character frames are detected automatically. In a document with a page mark and a position correction mark, the kind of mark is detected automatically, from the fact that its vicinity is black and from the height of the mark, simply by clicking inside the mark with the mouse.
-
Publication No.: JPH09237338A
Publication date: 1997-09-09
Application No.: JP4230296
Filing date: 1996-02-29
Applicant: IBM
Inventor: NEMOTO NAOYUKI , TAKAHASHI HIROYASU
Abstract: PROBLEM TO BE SOLVED: To speed up OCR processing by executing the smoothing operation for n lines (line conversion) using only shift, NOT, OR, and AND operations. SOLUTION: A bit pattern is input to a working array (line conversion), and the two lines at the top and bottom of the array are blanked. A loop variable (i) is incremented one by one from 2 up to (n+1), and the preceding line (i-1), current line (i), and succeeding line (i+1) of the working array are each assigned to a working variable. The working variables are then shifted right and left by one bit each, and the shifted values are assigned to separate working variables. For each of four patterns A to D, the working variables that are not to be shifted on the current line are held as they are, the bits of the working variables corresponding to the positions of black points are inverted, and the bitwise OR of all eight working variables is taken. The bitwise AND of this OR result and a result working variable is then taken and assigned to the result working variable, and finally the result working variable is assigned to the i-th line of the working array.
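The general idea of smoothing whole rows at once with bitwise operations can be sketched as follows. This is a simplified stand-in, not the patented procedure: it removes isolated black pixels only, does not reproduce patterns A to D or the NOT step, and writes its results to a new list rather than back into the working array. Each row is assumed to be a Python integer whose bits are pixels, and the function name is hypothetical.

```python
def remove_isolated(rows, width):
    """Clear black pixels that have no black pixel among their 8 neighbours.

    rows  : list of ints, one per scan line, bit set = black pixel
    width : number of valid bits per row
    """
    mask = (1 << width) - 1
    padded = [0] + list(rows) + [0]  # blank line above and below
    out = []
    for i in range(1, len(padded) - 1):
        prev, cur, nxt = padded[i - 1], padded[i], padded[i + 1]
        neighbours = prev | nxt                    # directly above and below
        for line in (prev, cur, nxt):
            neighbours |= (line << 1) & mask       # neighbours on one side
            neighbours |= line >> 1                # neighbours on the other
        out.append(cur & neighbours)               # keep only supported pixels
    return out

rows = [0b00000, 0b00100, 0b00000, 0b11000]
smoothed = remove_isolated(rows, width=5)
```

Because every operation acts on a whole row, the cost per line is a fixed handful of word operations instead of a per-pixel loop, which is the source of the speed-up the abstract aims at.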
-
Publication No.: JPS58201184A
Publication date: 1983-11-22
Application No.: JP8414982
Filing date: 1982-05-20
Applicant: IBM
Inventor: TAKAHASHI HIROYASU
-
Publication No.: JP2000113111A
Publication date: 2000-04-21
Application No.: JP28102298
Filing date: 1998-10-02
Applicant: IBM
Inventor: TAKAHASHI HIROYASU
Abstract: PROBLEM TO BE SOLVED: To derive stable feature values that are not affected by the contour shape of a figure, by dividing the contour into a group of strongly curved parts and a group of weakly curved parts according to the degree (sharpness) of the projecting and recessed parts on the contour, and switching to a different definition of the projecting and recessed direction (contour direction) when performing the calculation. SOLUTION: The device traces the continuous outer and inner contours of a character image and produces a list of contour points for each traced contour. It then produces a list 331 of sharpness values by determining, for each contour point, a value corresponding to the angle of curvature between the orientation N contour points before it and the orientation N contour points after it. Each point is classified by sharpness into one of five kinds: strongly recessed part, weakly recessed part, linear/reverse point, weakly projecting part, and strongly projecting part. For a strongly projecting or strongly recessed part, the contour direction is derived by a normal direction calculation; for a linear/reverse point, a tangential direction calculation is performed. Also, the midpoint of the straight line connecting the points N before and N after is connected to the contour point of interest, and calculation is performed with that line as the contour direction.
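The sharpness measure and five-way classification can be sketched as below. This is an illustration under several assumptions the abstract does not confirm: sharpness is taken as the signed turning angle between the incoming and outgoing directions, the contour is closed and traversed counterclockwise in y-up coordinates (so a positive angle means a projecting part), and the two thresholds are arbitrary. The function names are hypothetical.

```python
import math

def sharpness(contour, i, n):
    """Signed turning angle in degrees at contour[i].

    Compares the direction from the point n steps back with the direction
    to the point n steps ahead. Indices wrap because the contour is closed.
    """
    px, py = contour[(i - n) % len(contour)]
    cx, cy = contour[i]
    qx, qy = contour[(i + n) % len(contour)]
    ax, ay = cx - px, cy - py          # incoming direction
    bx, by = qx - cx, qy - cy          # outgoing direction
    cross = ax * by - ay * bx
    dot = ax * bx + ay * by
    return math.degrees(math.atan2(cross, dot))

def classify(angle, weak=20.0, strong=60.0):
    """Map a sharpness value to one of the five kinds (thresholds assumed)."""
    if angle >= strong:
        return "strongly projecting"
    if angle >= weak:
        return "weakly projecting"
    if angle > -weak:
        return "linear/reverse point"
    if angle > -strong:
        return "weakly recessed"
    return "strongly recessed"

# Counterclockwise square; index 2 is a corner, index 1 lies on an edge.
square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
corner = classify(sharpness(square, 2, 1))
edge = classify(sharpness(square, 1, 1))
```

A feature extractor would then choose how to compute the contour direction at each point according to this class, which is the switching the abstract describes.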
-
Publication No.: GB2355100B
Publication date: 2003-10-15
Application No.: GB0024221
Filing date: 2000-10-04
Applicant: IBM
Inventor: TAKAHASHI HIROYASU
Abstract: An image processing apparatus comprises: a binarization unit 12 for obtaining a binary image for the image entered by an image input unit 10; a connected component detector 14 for detecting the obtained connected components; a comparator 16 for comparing the size of the detected connected components with a predetermined threshold size; a mesh image forming unit 18 for dividing the image entered by the image input unit 10 into mesh images having a predetermined size; a corresponding mesh image detector 19 for detecting, from the mesh images, a mesh image that corresponds to a connected component that is determined by the comparator 16 to occupy a range within the threshold size; a specific area extraction unit 22 for extracting a specific area in accordance with the connection state of the corresponding mesh image that is detected; and an image recognition unit 23 for recognizing an image that is located in the extracted specific area.
-
Publication No.: DE69529015D1
Publication date: 2003-01-16
Application No.: DE69529015
Filing date: 1995-04-04
Applicant: IBM
Inventor: CASEY RICHARD G , TAKAHASHI HIROYASU
Abstract: An automated optical character recognition method is provided for use in conjunction with a programmable digital processing device. The method inputs a sequence of values representing one or more characters in an array of characters to be optically recognized. The values define one or more dimensional characteristics of the characters. From the input values, a standard dimensional value is determined from a frequency distribution of a selected one of the character dimensional characteristics. For each of the input characters, a set of normalized values is determined from the standard dimensional value. The normalized values correspond to the one or more character dimensional characteristics. Optical character recognition is thereafter performed using the normalized values.
-
Publication No.: CA1299292C
Publication date: 1992-04-21
Application No.: CA575692
Filing date: 1988-08-25
Applicant: IBM
Inventor: TAKAHASHI HIROYASU , YAMASHITA AKIO
Abstract: CHARACTER RECOGNITION APPARATUS. A recognition dictionary with a tree structure is created using multi-valued features with maximum and minimum values, such as white run length, rather than binary feature values such as black or white pels, so that the dictionary can be created through very simple operations with fewer samples. In addition, recognition errors due to erroneous branching can be detected while searching the dictionary tree by using a ternary tree instead of a binary tree, improving the efficiency of recognition. Combining the tree-structured dictionary technique with a pattern matching technique also means that the former can efficiently narrow down the candidate character categories and the latter can then order the candidates. Because high speed and a high recognition ratio can be achieved and the processing is simple, a practical system can be implemented on a general-purpose personal computer in software only, without special hardware. Additionally, it is easy to improve the recognition ratio through adaptation, because additions or modifications to the recognition dictionary can be made very easily. JA9-82-002
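One way to read the ternary-tree search is sketched below. This is an interpretation, not the patented structure: each node is assumed to store the minimum and maximum feature value observed for the samples on each of its two normal branches, and a value that falls in neither range takes a third, rejecting branch. The class and function names, the feature layout, and the example ranges are all hypothetical, and the subsequent pattern matching that orders the candidates is not shown.

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class Leaf:
    candidates: list          # candidate character categories

@dataclass
class Node:
    feature: int              # index of the multi-valued feature to test
    low: tuple                # (min, max) seen for samples on the low branch
    high: tuple               # (min, max) seen for samples on the high branch
    low_child: Union["Node", Leaf]
    high_child: Union["Node", Leaf]

def search(tree, features):
    """Walk the tree; return candidates, or None on the reject branch."""
    node = tree
    while isinstance(node, Node):
        value = features[node.feature]
        if node.low[0] <= value <= node.low[1]:
            node = node.low_child
        elif node.high[0] <= value <= node.high[1]:
            node = node.high_child
        else:
            return None       # third branch: value fits neither range
    return node.candidates

# Feature 0 might be a white run length measured on one scan line.
tree = Node(0, (0, 3), (6, 9), Leaf(["1", "l"]), Leaf(["0", "O"]))
narrow = search(tree, [2])
wide = search(tree, [8])
rejected = search(tree, [5])
```

With a plain binary tree the value 5 would be forced down one branch and could silently produce a wrong candidate list. The third branch lets the search report that the input matches neither group.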