-
Publication No.: CA1050167A
Publication Date: 1979-03-06
Application No.: CA209648
Application Date: 1974-09-19
Applicant: IBM
Inventor: CHAIRES ANNE M , CICONTE JEAN M , HILLIARD JOHN J , ROSENBAUM WALTER S
Abstract: An online numeric discriminator is disclosed which performs the decision making process between strings of characters coming from a dual output optical character recognition (OCR) system for use in text processing or mail processing applications. The dual output OCR uses separate recognition processes for alphabetic and numeric characters and attempts to recognize each character independently as both an alphabetic and a numeric character. The alphabetic interpretation of the scanned word is output as an alphabetic subfield on a first output line and the numeric interpretation of the scanned word is output as a numeric subfield on a second output line from the OCR. The Bayesian online numeric discriminator then analyzes the two character streams by calculating a first conditional probability that the OCR perceived the alphabetic subfield given that a numeric subfield was actually scanned and a second conditional probability that the OCR perceived the numeric subfield given that an alphabetic subfield was actually scanned. These first and second conditional probabilities are then compared. If the conditional probability that the OCR read the alphabetic subfield given that the numeric subfield was actually scanned is larger than the conditional probability that the OCR read the numeric subfield given that the alphabetic subfield was actually scanned, then the numeric subfield is selected by the discriminator as the most probable interpretation of the word scanned by the OCR.
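The comparison described in this abstract can be illustrated with a minimal Python sketch. The confusion tables, the `FLOOR` value, the per-character independence assumption (multiplying character probabilities), and the function name `discriminate` are all hypothetical; the abstract does not give the actual probability values or how subfield probabilities are composed.

```python
from math import prod

# Hypothetical confusion probabilities (illustrative values only).
# P_ALPHA_GIVEN_NUM[(a, n)]: chance the alphabetic channel reports `a`
# when the numeric character `n` was actually scanned.
P_ALPHA_GIVEN_NUM = {("O", "0"): 0.6, ("I", "1"): 0.5, ("S", "5"): 0.4,
                     ("B", "8"): 0.3, ("G", "6"): 0.05}
# P_NUM_GIVEN_ALPHA[(n, a)]: chance the numeric channel reports `n`
# when the alphabetic character `a` was actually scanned.
P_NUM_GIVEN_ALPHA = {("0", "O"): 0.5, ("1", "I"): 0.4, ("5", "S"): 0.1,
                     ("8", "B"): 0.1, ("6", "G"): 0.3}
FLOOR = 1e-4  # assumed fallback for pairs missing from the tables


def discriminate(alpha_subfield, numeric_subfield):
    """Return 'numeric' or 'alphabetic' for one dual-output OCR word."""
    if len(alpha_subfield) != len(numeric_subfield):
        raise ValueError("subfields must have the same length")
    pairs = list(zip(alpha_subfield, numeric_subfield))
    # First conditional probability: alphabetic subfield perceived,
    # given that the numeric subfield was actually scanned.
    p_alpha_given_num = prod(P_ALPHA_GIVEN_NUM.get((a, n), FLOOR) for a, n in pairs)
    # Second conditional probability: numeric subfield perceived,
    # given that the alphabetic subfield was actually scanned.
    p_num_given_alpha = prod(P_NUM_GIVEN_ALPHA.get((n, a), FLOOR) for a, n in pairs)
    return "numeric" if p_alpha_given_num > p_num_given_alpha else "alphabetic"
```

With these illustrative tables, `discriminate("SOB", "508")` selects the numeric subfield, while `discriminate("GO", "60")` selects the alphabetic one.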
-
Publication No.: FR2336743A1
Publication Date: 1977-07-22
Application No.: FR7636143
Application Date: 1976-11-24
Applicant: IBM
Inventor: HILLIARD JOHN J , MULLAN PHILIP J , ROSENBAUM WALTER S
Abstract: The print convention apparatus and method disclosed herein effects a decision making process with respect to a determination as to whether an alphabetic character field output from an optical character reader (OCR) is related to the OCR scan of an upper case or a lower case inscription on the document scanned. The alphabetic character field (e.g., a word) is comprised of one or a series of alphabetic characters which represent the OCR's interpretation of characters printed on the scanned document. Each word output by the OCR corresponds to a field (i.e., word) of characters imprinted on the scanned document. The electrical signals representative of the upper and lower case alphabetic characters and rejects including conflicts outputted from the OCR are applied to a character occurrence probability storage apparatus which contains precomputed empirical probabilities therein that: (1) a given character recognition is the result of the scan of an upper case character; and (2) a given character recognition is the result of the scan of a lower case character. In addition, the storage apparatus includes probability values for character conflicts and rejects. As the series of alphabetic character signals from the OCR output are applied character-by-character to the character occurrence probability storage apparatus (e.g., a read-only store), a running sum of the respective probabilities for the upper case and lower case print conventions is developed so that, following the input of the final character, reject or conflict within a word to the aforesaid apparatus, an appropriate upper or lower case determination can be made for all of the characters within the word. This determination corresponds with the print convention of the word inscribed on the scanned document. 
An upper or lower case flag is generated in accordance with the print convention determination and associated with the alphabetic character word output from the print convention apparatus for further text processing. In one embodiment of the invention the probability for each OCR output alphabetic character being an upper or lower case character is stored in respective upper and lower case character occurrence probability storage devices after having been precomputed as the product of two probability factors; i.e., (1) a first probability factor with respect to the likelihood that the OCR recognition resulted from the scan of an upper or lower case character, and (2) a second probability factor with respect to the likelihood of a given character occurring in a specified language (e.g., English) document. In another embodiment of the invention, the character occurrence probability storage devices are functionally replaced by a read-only store having an address position for each upper and lower case alphabetic character output by the OCR including conflicts and rejects, and a precomputed numerical probability value associated with each address position to represent the quotient of: (1) the probability that a given character is related to an upper case print convention; and (2) the probability that the same character is related to a lower case print convention.
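A minimal Python sketch of the running-sum decision follows. It assumes the running sums accumulate log-probabilities (so that the sum corresponds to a product of per-character probabilities); the abstract does not state this, and the table `CASE_PROBS`, the `DEFAULT` entry, the `*` reject symbol, and the function name `print_convention` are hypothetical.

```python
import math

# Hypothetical store: for each OCR output symbol, a pair
# (probability under upper-case printing, probability under lower-case printing).
# "*" stands in for a reject; all values are illustrative only.
CASE_PROBS = {
    "C": (0.03, 0.01), "c": (0.005, 0.03),
    "O": (0.07, 0.02), "o": (0.01, 0.07),
    "S": (0.06, 0.015), "s": (0.01, 0.06),
    "T": (0.09, 0.002), "t": (0.002, 0.09),
    "*": (0.02, 0.02),
}
DEFAULT = (0.01, 0.01)  # assumed entry for symbols missing from the table


def print_convention(word):
    """Return the 'upper' or 'lower' flag for one OCR output word."""
    upper_sum = 0.0
    lower_sum = 0.0
    for symbol in word:  # character-by-character, as the word arrives
        p_upper, p_lower = CASE_PROBS.get(symbol, DEFAULT)
        upper_sum += math.log(p_upper)
        lower_sum += math.log(p_lower)
    # The decision is made only after the final character of the word.
    return "upper" if upper_sum > lower_sum else "lower"
```

The quotient embodiment could be sketched the same way by storing `log(p_upper / p_lower)` per symbol and testing the sign of a single running sum.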
-
Publication No.: DK0424728T4
Publication Date: 1998-09-14
Application No.: DK90119392
Application Date: 1990-10-10
Applicant: IBM
Inventor: ROSENBAUM WALTER S , HILLIARD JOHN J
Abstract: The invention is characterized as a data processing architecture and method for multi-stage processing of mail, using knowledge based techniques. The system includes OCR-scanning a multipart address field of a mail piece at a sending location, the address field including at least two portions, a first stage routing portion (destination city, state, country, zip code) and a second stage routing portion (destination street address, building floor, corporate addressee internal routing). At the sending location, the image of the entire address field is captured by an OCR head and stored in memory. A serial number is printed on the mail piece. The first routing portion is then converted into sorting signals to sort the mail piece to a truck at the sending location. While the mail piece is in transit on the truck, the knowledge processor completes its analysis and is able to transmit by electronic communications link to the destination location, the information that the mail piece is on its way and the second stage routing information needed to automatically sort and deliver the mail piece to its corporate addressee.
-
Publication No.: DE69213532D1
Publication Date: 1996-10-17
Application No.: DE69213532
Application Date: 1992-03-25
Applicant: IBM
Inventor: ROSENBAUM WALTER S , CARRIS BARR T , ANKERSTJERNE ANKER
Abstract: A system and method are disclosed for enabling the technique of deferred processing of OCR scanned mail to be compatible with existing techniques for mechanical sortation of mail that use standard sort barcode formats which are common to a given destination postal system. This enables deferred OCR processed mail to be sorted on an unsegregated basis along with other types of mail which have not been processed by the deferred OCR technique. This allows the OCR encoded mail to be processed along with other types of encoded mail bearing a standard sort barcode that has been imprinted using prior technology such as OCR or manual code desks.
-
Publication No.: DE69016572T2
Publication Date: 1995-08-10
Application No.: DE69016572
Application Date: 1990-10-10
Applicant: IBM
Inventor: ROSENBAUM WALTER S , HILLIARD JOHN J
Abstract: The invention is characterized as a data processing architecture and method for multi-stage processing of mail, using knowledge based techniques. The system includes OCR-scanning a multipart address field of a mail piece at a sending location, the address field including at least two portions, a first stage routing portion (destination city, state, country, zip code) and a second stage routing portion (destination street address, building floor, corporate addressee internal routing). At the sending location, the image of the entire address field is captured by an OCR head and stored in memory. A serial number is printed on the mail piece. The first routing portion is then converted into sorting signals to sort the mail piece to a truck at the sending location. While the mail piece is in transit on the truck, the knowledge processor completes its analysis and is able to transmit by electronic communications link to the destination location, the information that the mail piece is on its way and the second stage routing information needed to automatically sort and deliver the mail piece to its corporate addressee.
-
Publication No.: DK0424728T3
Publication Date: 1995-06-26
Application No.: DK90119392
Application Date: 1990-10-10
Applicant: IBM
Inventor: ROSENBAUM WALTER S , HILLIARD JOHN J
Abstract: The invention is characterized as a data processing architecture and method for multi-stage processing of mail, using knowledge based techniques. The system includes OCR-scanning a multipart address field of a mail piece at a sending location, the address field including at least two portions, a first stage routing portion (destination city, state, country, zip code) and a second stage routing portion (destination street address, building floor, corporate addressee internal routing). At the sending location, the image of the entire address field is captured by an OCR head and stored in memory. A serial number is printed on the mail piece. The first routing portion is then converted into sorting signals to sort the mail piece to a truck at the sending location. While the mail piece is in transit on the truck, the knowledge processor completes its analysis and is able to transmit by electronic communications link to the destination location, the information that the mail piece is on its way and the second stage routing information needed to automatically sort and deliver the mail piece to its corporate addressee.
-
Publication No.: CA1092244A
Publication Date: 1980-12-23
Application No.: CA288124
Application Date: 1977-10-04
Applicant: IBM
Inventor: KOLPEK ROBERT A , MACDUFFEE DAVID L , ROSENBAUM WALTER S
IPC: B41J5/30 , B41J7/96 , G06F3/09 , G06F3/12 , G06F11/00 , G06F17/22 , G06F17/27 , G06K9/72 , G06K15/00 , G06F5/00
Abstract: SYSTEM FOR AUTOMATICALLY PROOFREADING A DOCUMENT: Spelling errors in a word processing system are detected and presented to the operator for correction at the end of a document page. A dictionary memory contains representations of the correct spellings for words most frequently used. As each word is typed, it is stored in a word queue where it is compared to the contents of the dictionary memory. If the comparison finds no match, then the word and its location on the page are stored in an error memory. When an end of page indicator is set, the printer automatically repositions the print head at the ending character of the first word in the error list. When the operator keys in the correct spelling, the printer is caused to remove the misspelled word from the page and type the correct spelling. The corresponding word in the error memory is also corrected. As each misspelled word in the error memory is corrected, the remainder of the memory is scanned and repetitions of the same spelling error are automatically corrected.
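The error-memory flow in this abstract can be sketched in Python as below. The function names, the use of a Python set for the dictionary memory, case-insensitive matching, and the `ask_operator` callback (standing in for the operator keying a correction) are assumptions made for illustration; printer head positioning is not modeled.

```python
def find_errors(page_words, dictionary):
    """Build the error memory: (position, word) for every word not in the dictionary."""
    return [(pos, word) for pos, word in enumerate(page_words)
            if word.lower() not in dictionary]


def correct_page(page_words, dictionary, ask_operator):
    """Apply operator corrections at end of page, reusing each one for repeated errors."""
    words = list(page_words)
    error_memory = find_errors(words, dictionary)
    corrections = {}  # misspelling -> spelling keyed in by the operator
    for pos, bad in error_memory:
        if bad not in corrections:
            # The operator is prompted only the first time a misspelling is met.
            corrections[bad] = ask_operator(bad)
        # Repetitions of the same error are corrected without prompting again.
        words[pos] = corrections[bad]
    return words
```

For example, with a dictionary of `{"the", "cat", "sat"}` and the page `["the", "czt", "sat", "czt"]`, the operator would be asked about `czt` once and both occurrences would be replaced.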
-
Publication No.: CA1066423A
Publication Date: 1979-11-13
Application No.: CA255922
Application Date: 1976-06-29
Applicant: IBM
Inventor: ROSENBAUM WALTER S
IPC: B41J5/30 , B41B27/36 , G06F3/12 , G06F7/00 , G06F40/191 , G06K15/06 , G09G5/36 , G06F5/00 , G06K15/18
Abstract: APPARATUS FOR AUTOMATIC HYPHENATION: Apparatus is disclosed for automatic hyphenation of input words from a word processing system. The apparatus includes a digital reference matrix memory containing a vector representation of all legal hyphenations for each dictionary word in the form of a calculated magnitude and associated unique vector angles. The vector magnitude constitutes the address data for accessing the memory. When an input word is received for hyphenation, a hyphen is added to the word and its magnitude is calculated. The memory is accessed for an address which equals the calculated magnitude. If the address is not found a signal is generated indicating that the word cannot be legally hyphenated. If the address is found, then the corresponding angles, representing legal hyphenations of the input word, are compared with test words generated by sequentially inserting hyphens in the input word. All equal compares are flagged and the corresponding hyphenated input words are gated onto the output line.
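The lookup procedure can be sketched in Python under stated assumptions. The abstract does not define how magnitude and angle are computed, so the two functions below are stand-ins: an order-independent sum for the magnitude (so the hyphen position does not change it) and a position-weighted sum for the angle. A Python dict plays the role of the matrix memory, and all names are hypothetical.

```python
def magnitude(word):
    # Stand-in: order-independent, so every hyphen position gives the same value.
    return sum(ord(ch) ** 2 for ch in word)


def angle(word):
    # Stand-in: position-weighted, so it changes with the hyphen position.
    return sum((i + 1) * ord(ch) for i, ch in enumerate(word))


def build_matrix(legal_hyphenations):
    """Map each magnitude (the address) to the set of angles of legal hyphenations."""
    matrix = {}
    for hyphenated in legal_hyphenations:
        matrix.setdefault(magnitude(hyphenated), set()).add(angle(hyphenated))
    return matrix


def hyphenate(word, matrix):
    """Return legal hyphenations of `word`, or None if its address is not found."""
    angles = matrix.get(magnitude(word + "-"))
    if angles is None:
        return None  # signal: the word cannot be legally hyphenated
    # Test words are generated by inserting a hyphen at each interior position.
    test_words = [word[:i] + "-" + word[i:] for i in range(1, len(word))]
    return [t for t in test_words if angle(t) in angles]
```

With `matrix = build_matrix(["ta-ble", "hy-phen"])`, calling `hyphenate("table", matrix)` yields the single stored hyphenation, and a word whose address is absent returns `None`. These stand-in functions can collide for different words made of the same letters, which the real vector scheme presumably handles differently.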
-
Publication No.: CA1066422A
Publication Date: 1979-11-13
Application No.: CA255923
Application Date: 1976-06-29
Applicant: IBM
Inventor: ROSENBAUM WALTER S
Abstract: DIGITAL REFERENCE MATRIX FOR WORD VERIFICATION: A digital reference matrix apparatus is disclosed for verifying input alpha words from a keyboard, character recognition machine, or voice analyzer as valid linguistic expressions. The organization of the digital reference matrix is based upon the character transfer function of the input apparatus. The digital reference matrix contains a vector representation for each dictionary word in the form of a calculated vector magnitude and unique vector angle. The set of magnitudes and angles is stored in the digital reference matrix using a form of run length coding by storing a single magnitude pointer followed by the chain of unique angles for words having the same magnitude. The vector magnitude so calculated constitutes the address data for accessing the digital reference matrix. When an input word is received for verification, the word's magnitude and angle attributes are calculated and the digital reference matrix is accessed at the magnitude of the input word and the corresponding angles are searched for a match. An output signal is generated indicating whether or not the input word is valid. The organization of the digital reference matrix minimizes the size of the array needed for accurate word verification representation through the use of the combination of digital angle representation and run length compaction of the magnitude/angle verification syntax.
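The run-length organization (one magnitude pointer followed by the chain of angles sharing it) can be sketched in Python as follows. The `word_magnitude` and `word_angle` functions are stand-ins, since the abstract ties the real ones to the character transfer function of the input apparatus without defining them; the tagged flat list and all names are assumptions for illustration.

```python
def word_magnitude(word):
    # Stand-in: order-independent sum, shared by words with the same letters.
    return sum(ord(ch) ** 2 for ch in word)


def word_angle(word):
    # Stand-in: position-weighted sum, distinguishing letter order.
    return sum((i + 1) * ord(ch) for i, ch in enumerate(word))


def build_reference_matrix(dictionary_words):
    """Flat store: ('M', magnitude) followed by ('A', angle) for each word sharing it."""
    groups = {}
    for word in dictionary_words:
        groups.setdefault(word_magnitude(word), set()).add(word_angle(word))
    matrix = []
    for mag in sorted(groups):
        matrix.append(("M", mag))  # a single magnitude pointer per group
        matrix.extend(("A", ang) for ang in sorted(groups[mag]))
    return matrix


def is_valid(word, matrix):
    """Access the matrix at the word's magnitude and search its angle chain."""
    mag, ang = word_magnitude(word), word_angle(word)
    in_chain = False
    for kind, value in matrix:
        if kind == "M":
            if in_chain:
                return False  # reached the next magnitude: chain exhausted
            in_chain = (value == mag)
        elif in_chain and value == ang:
            return True
    return False
```

Storing the dictionary `["cat", "act", "dog"]` this way takes five entries instead of six, because `cat` and `act` share one magnitude pointer under these stand-in functions; the saving grows as more words share a magnitude.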
-
Publication No.: FR2395841A1
Publication Date: 1979-01-26
Application No.: FR7814657
Application Date: 1978-05-09
Applicant: IBM
Inventor: KETTLER HOWARD G , KOLPEK ROBERT A , ROSENBAUM WALTER S
IPC: B41J19/32 , B41J19/58 , B41J19/64 , B41J25/12 , B41J25/22 , G06F3/12 , G06F17/21 , G06K15/00 , G06K15/08
Abstract: The aesthetic characteristics of adjacent characters are used to enhance the quality of output in a proportional spacing printer and to provide right margin justification for composing. Spacing between characters is determined on the basis of the character being printed and the preceding character already printed on the page. An intercharacter displacement memory contains a list of ideal spacing for all combinations of characters to be printed. As each character is typed, it and the previously stored preceding character address the intercharacter displacement memory. The output of the intercharacter displacement memory is the ideal value of escapement for this combination of characters and font style. The printer positions the print head prior to printing the next character, rather than positioning the print head after the previous character is printed. Line ending decisions for composing are eliminated during initial and final typing of a document by adding to the intercharacter displacement memory recommendations for altering the ideal spacing between characters, where aesthetically possible, to eliminate the need for line ending hyphenation. During initial keying, escapements for adjacent pairs of characters are totaled in a memory for ideal, shortest (tight), and longest (loose) recommended escapements. The line is automatically terminated within the justification range by a carrier return function based on the escapement totals and the selected right margin. Final playout of the page from memory alters the intercharacter escapements from the ideal values to either longer or shorter escapements depending on whether the line is to be lengthened or shortened.
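A small Python sketch can illustrate the pair-addressed displacement memory and the running escapement totals. The table contents, the escapement units, the `DEFAULT` entry, and the function names are hypothetical; the abstract specifies only that each character pair yields ideal, tight, and loose escapements that are totaled per line.

```python
# Hypothetical intercharacter displacement memory:
# (previous char, current char) -> (ideal, tight, loose) escapement units.
DISPLACEMENT = {
    ("A", "V"): (4, 3, 5),
    ("V", "A"): (4, 3, 5),
    ("T", "o"): (5, 4, 6),
}
DEFAULT = (6, 5, 7)  # assumed entry for pairs missing from the table


def line_totals(text):
    """Total the ideal, tight, and loose escapements over adjacent character pairs."""
    ideal = tight = loose = 0
    for prev, cur in zip(text, text[1:]):
        i, t, l = DISPLACEMENT.get((prev, cur), DEFAULT)
        ideal += i
        tight += t
        loose += l
    return ideal, tight, loose


def can_justify(text, margin):
    """True if the line can reach `margin` by tightening or loosening its spacing."""
    _, tight, loose = line_totals(text)
    return tight <= margin <= loose
```

For instance, `line_totals("AVA")` gives `(8, 6, 10)` with this table, so a right margin of 9 units falls within the justification range while 11 does not.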