BINARIZING METHOD FOR OPTICAL CHARACTER RECOGNITION SYSTEM

    Publication No.: JP2000011089A

    Publication date: 2000-01-14

    Application No.: JP12965499

    Filing date: 1999-05-11

    Applicant: IBM

    Abstract: PROBLEM TO BE SOLVED: To estimate a relative threshold corresponding to the intensity difference between text and background in an OCR system. SOLUTION: Text pixels are determined according to whether the differences between the value of a pixel 10 and the values of plural pixels separated from the pixel 10 by a prescribed distance are larger than a relative threshold corresponding to the intensity difference between text and background. The image is subsampled at a rate corresponding to two pixels to detect kernels of text, and image pixels are binarized, using the estimated threshold, only in tiles whose sides are several stroke widths long and which contain a text kernel. In determining a text pixel, the method examines which of the differences between the value of the pixel under analysis and the values of the two pixels located where a circle 12, centered on that pixel and having a radius equal to the stroke width W, intersects the row line, the column line and the two lines at a 45-degree angle is larger than the relative threshold.

    2.
    Invention patent
    Unknown

    Publication No.: DE10010621B4

    Publication date: 2006-08-24

    Application No.: DE10010621

    Filing date: 2000-03-03

    Applicant: IBM

    Abstract: A method for locating a structured field in a gray-scale image of an object, including choosing a plurality of anchor points in the image, each anchor point having a gray-scale value associated therewith. For each anchor point there is determined a horizontal variation dependent on a difference between the gray-scale value of the anchor point and the gray-scale value of a horizontally neighboring anchor point, and there is also determined a vertical variation dependent on a difference between the gray-scale value of the anchor point and the gray-scale value of a vertically neighboring anchor point. Those anchor points whose vertical and horizontal variations obey a first or a second predefined condition are defined as vertically or horizontally dominant respectively. One or more kernels are defined in the image, each such kernel comprising a group of anchor points in predetermined mutual proximity and satisfying a third predefined condition relating the number of vertically-dominant and horizontally-dominant anchor points in the group. The structured field in the image is located using one or more kernels.
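
    As an illustration only, the anchor-point classification described in this abstract might be sketched as follows. The abstract does not state the "first" and "second" predefined conditions, so the `ratio` and `min_var` tests, the function name and the grid spacing `step` are all hypothetical placeholders, not the patented conditions.

```python
def classify_anchor_points(gray, step=1, ratio=2.0, min_var=10):
    """Label anchor points as vertically ("V") or horizontally ("H") dominant.

    gray is a 2D list of gray-scale values; anchor points are taken every
    `step` pixels. The dominance conditions below are assumptions made for
    illustration: one variation must reach `min_var` and exceed `ratio`
    times the other.
    """
    labels = {}
    for r in range(0, len(gray) - step, step):
        for c in range(0, len(gray[0]) - step, step):
            # Variation relative to the horizontally neighboring anchor point.
            h_var = abs(gray[r][c] - gray[r][c + step])
            # Variation relative to the vertically neighboring anchor point.
            v_var = abs(gray[r][c] - gray[r + step][c])
            if v_var >= min_var and v_var > ratio * h_var:
                labels[(r, c)] = "V"
            elif h_var >= min_var and h_var > ratio * v_var:
                labels[(r, c)] = "H"
    return labels
```

    In this sketch, anchor points that meet neither condition are simply left unlabeled; grouping labeled points into kernels would be a separate step.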

    3.
    Invention patent
    Unknown

    Publication No.: DE69302003D1

    Publication date: 1996-05-02

    Application No.: DE69302003

    Filing date: 1993-12-01

    Applicant: IBM

    Abstract: A data entry system generates an electronically stored coded representation of a character sequence from one or more electronically stored document images, comprising optical character recognition logic (90) for generating, from the document image or images, character data specifying one of a plurality of possible character values for corresponding segments of the document images; characterised by interactive display apparatus comprising: means (110) for generating and sequentially displaying one or more types of composite image, each composite image comprising segments of the document image or images arranged according to the character data; and a correction mechanism responsive to a user input operation to enable the operator to correct the character data associated with displayed segments.

    4.
    Invention patent
    Unknown

    Publication No.: DE69822608T2

    Publication date: 2005-01-05

    Application No.: DE69822608

    Filing date: 1998-05-28

    Applicant: IBM

    Abstract: A binarization method for use in an OCR system, consisting in determining text pixels by checking, for each pixel, that the difference between its value and the values of a plurality of pixels located at a predetermined distance from it is greater than a relative threshold corresponding to the difference in intensity between the text and the background of the image; subsampling the image at a rate corresponding to at least two pixels in order to detect kernels of text; and binarizing the image pixels only in tiles whose sides are several stroke widths long and which contain text kernels, using in each tile an absolute threshold estimated in that tile. The step of determining text pixels consists, for each analyzed pixel, in checking that either one of the differences between the value of the analyzed pixel and the values of the two pixels located at the intersections of a circle (12), centered at the location of the analyzed pixel and having a radius equal to the stroke width, with each of the row line, the column line and both lines at an angle of 45 degrees, is greater than the relative threshold.
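
    As an illustration only, the text-pixel test described in this abstract might be sketched as follows. The sketch assumes dark text on a brighter background, takes the difference as neighbor minus analyzed pixel, and reads the abstract as accepting a pixel when any one of the eight circle points is brighter by more than the relative threshold. The function name, the rounding of the diagonal offsets and the handling of image borders are assumptions, not details taken from the patent.

```python
import math


def is_text_pixel(img, r, c, stroke_width, rel_threshold):
    """Return True if pixel (r, c) passes the relative-threshold test.

    img is a 2D list of gray values. The eight comparison pixels lie where a
    circle of radius stroke_width around (r, c) meets the row line, the
    column line and the two 45-degree diagonals.
    """
    h, w = len(img), len(img[0])
    # Row/column offset of a diagonal point on the circle (rounded to a pixel).
    d = round(stroke_width / math.sqrt(2))
    # One point per line; the opposite point is obtained by negating it.
    offsets = [(0, stroke_width), (stroke_width, 0), (d, d), (d, -d)]
    for dr, dc in offsets:
        for sign in (1, -1):
            rr, cc = r + sign * dr, c + sign * dc
            # Assumption: points outside the image are skipped.
            if 0 <= rr < h and 0 <= cc < w:
                if img[rr][cc] - img[r][c] > rel_threshold:
                    return True
    return False
```

    The subsampling and per-tile absolute thresholding steps are not covered by this sketch.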

    5.
    Invention patent
    Unknown

    Publication No.: DE69521934D1

    Publication date: 2001-09-06

    Application No.: DE69521934

    Filing date: 1995-05-17

    Applicant: IBM

    Inventor: WALACH EUGENE

    Abstract: A method is disclosed for sorting a set of mail items according to a predefined delivery sequence, the method comprising the steps of: generating for each of a first subset of the mail items a first sequence number according to the position of their respective destination addresses in the delivery sequence; sorting, using a sorting machine, the first subset into batches according to the first sequence number disregarding a number N of the most significant digits thereof; associating with each of a second subset of the mail items, one of the first sequence numbers corresponding to the destination addresses of the mail items in the first subset between which their respective destination addresses lie in the delivery sequence; generating for each of the second subset, a second sequence number according to the position of their respective destination addresses in the delivery sequence among the destination addresses of mail items in the second subset associated with the same first sequence number; sorting, using a sorting machine, the second subset into batches according to the second sequence number and the first sequence number disregarding N of the most significant digits of the first sequence number; interleaving the batches of mail items from the first subset and from the second subset; and sorting the mail items according to the N most significant digits of the first sequence numbers. In this way, all the mail is sorted in sequence, but sorting of the mail can begin prior to all the mail being physically present at the sorter or its location in the sorting scheme being known.

    Localization of address blocks in gray scale images

    Publication No.: DE10010621A1

    Publication date: 2000-09-21

    Application No.: DE10010621

    Filing date: 2000-03-03

    Applicant: IBM

    Abstract: Selects a number of anchor points in the image, each assigned a gray-scale value. Determines a horizontal and a vertical deviation for each point, dependent on the difference between the gray-scale value of the anchor point and that of the neighboring horizontal or vertical anchor point. Defines an anchor point as dominant if it satisfies a predefined condition. In a first processing stage, a general search for regions of interest (ROI) is carried out. Dominance is determined (52) by selecting a first number of anchor points, each having a corresponding intensity. For each anchor point, the horizontal and vertical deviations are calculated to determine the vertically and horizontally dominant points, which are used to determine (54) probable text-containing positions in the image. A second processing stage carries out a second iteration on the text regions of interest found in the first stage, to improve recognition of text ROIs, delete falsely labelled ROIs and assign an order of precedence to the text ROIs.

    7.
    Invention patent
    Unknown

    Publication No.: DE69109842T2

    Publication date: 1995-12-14

    Application No.: DE69109842

    Filing date: 1991-12-11

    Applicant: IBM

    Inventor: WALACH EUGENE

    Abstract: An image processing system including attenuation estimation logic 6 for estimating local attenuation coefficient values for segments of a scanned image from the amplitude values of scanned image pixels within said segments has micro-segmentation logic 3 for determining average amplitude values for micro-segments of the image, the micro-segments being smaller than the segments of the image and the average amplitude values of the micro-segments being used for estimating the attenuation coefficient values for the segments of the image. The invention enables individual features to be more clearly identified in the attenuation map than with prior image processing systems without the micro-segmentation logic, with the result that it is easier for the operator of the system to interpret the images.

    8.
    Invention patent
    Unknown

    Publication No.: DE69822608D1

    Publication date: 2004-04-29

    Application No.: DE69822608

    Filing date: 1998-05-28

    Applicant: IBM

    Abstract: A binarization method for use in an OCR system, consisting in determining text pixels by checking, for each pixel, that the difference between its value and the values of a plurality of pixels located at a predetermined distance from it is greater than a relative threshold corresponding to the difference in intensity between the text and the background of the image; subsampling the image at a rate corresponding to at least two pixels in order to detect kernels of text; and binarizing the image pixels only in tiles whose sides are several stroke widths long and which contain text kernels, using in each tile an absolute threshold estimated in that tile. The step of determining text pixels consists, for each analyzed pixel, in checking that either one of the differences between the value of the analyzed pixel and the values of the two pixels located at the intersections of a circle (12), centered at the location of the analyzed pixel and having a radius equal to the stroke width, with each of the row line, the column line and both lines at an angle of 45 degrees, is greater than the relative threshold.

    9.
    Invention patent
    Unknown

    Publication No.: DE69109842D1

    Publication date: 1995-06-22

    Application No.: DE69109842

    Filing date: 1991-12-11

    Applicant: IBM

    Inventor: WALACH EUGENE

    Abstract: An image processing system including attenuation estimation logic 6 for estimating local attenuation coefficient values for segments of a scanned image from the amplitude values of scanned image pixels within said segments has micro-segmentation logic 3 for determining average amplitude values for micro-segments of the image, the micro-segments being smaller than the segments of the image and the average amplitude values of the micro-segments being used for estimating the attenuation coefficient values for the segments of the image. The invention enables individual features to be more clearly identified in the attenuation map than with prior image processing systems without the micro-segmentation logic, with the result that it is easier for the operator of the system to interpret the images.

    10.
    Invention patent
    Unknown

    Publication No.: DE69521934T2

    Publication date: 2002-04-04

    Application No.: DE69521934

    Filing date: 1995-05-17

    Applicant: IBM

    Inventor: WALACH EUGENE

    Abstract: A method is disclosed for sorting a set of mail items according to a predefined delivery sequence, the method comprising the steps of: generating for each of a first subset of the mail items a first sequence number according to the position of their respective destination addresses in the delivery sequence; sorting, using a sorting machine, the first subset into batches according to the first sequence number disregarding a number N of the most significant digits thereof; associating with each of a second subset of the mail items, one of the first sequence numbers corresponding to the destination addresses of the mail items in the first subset between which their respective destination addresses lie in the delivery sequence; generating for each of the second subset, a second sequence number according to the position of their respective destination addresses in the delivery sequence among the destination addresses of mail items in the second subset associated with the same first sequence number; sorting, using a sorting machine, the second subset into batches according to the second sequence number and the first sequence number disregarding N of the most significant digits of the first sequence number; interleaving the batches of mail items from the first subset and from the second subset; and sorting the mail items according to the N most significant digits of the first sequence numbers. In this way, all the mail is sorted in sequence, but sorting of the mail can begin prior to all the mail being physically present at the sorter or its location in the sorting scheme being known.
