Binary reference matrix for a character recognition machine
    1.
    发明授权
    Binary reference matrix for a character recognition machine 失效
    用于字符识别机的二进制参考矩阵

    公开(公告)号:US3925761A

    公开(公告)日:1975-12-09

    申请号:US49425174

    申请日:1974-08-02

    Applicant: IBM

    CPC classification number: G06K9/72 G06F17/28 G06K2209/01 G10L15/00

    Abstract: A binary reference matrix apparatus is diclosed for verifying input alpha words from a character recognition machine as valid linguistic expressions. The organization of the binary reference matrix is based upon the character transfer function of the character recognition machine. The alphabetic character stream for each word scanned by the character recognition machine, is mapped into a vector representation through the assignment of a unique numeric value for each letter in the alphabet. The vector magnitude and angle so calculated constitute the address data for accessing the binary reference matrix. The point accessed in the matrix will have a binary value of 1 if the scanned word is valid and will have a binary value of 0 if the scanned word is invalid. The organization of the binary reference matrix minimizes the size of the array needed for accurate verification by choosing numerical values for the alphabetic characters in an inverse proportion to the characters read reliability in the character recognition machine, as determined by the empirical measurement of the character recognition machine, character transfer function.

    Abstract translation: 二进制参考矩阵装置被用于验证来自字符识别机器的输入字词作为有效的语言表达。 二进制参考矩阵的组织基于字符识别机的字符传递函数。 由字符识别机扫描的每个字的字母字符流通过为字母表中的每个字母分配唯一的数值而映射到向量表示。 如此计算的矢量幅度和角度构成用于访问二进制参考矩阵的地址数据。 如果扫描字有效,矩阵中访问的点将具有1的二进制值,如果扫描字无效,则其二进制值为0。 二进制参考矩阵的组织通过字符识别机器中字符读取可靠性的倒数与字母字符反比例选择数字值来最小化精确验证所需的数组的大小,由字符识别的经验测量确定 机器,字符传递功能。

    4.
    发明专利
    未知

    公开(公告)号:DE2654815A1

    公开(公告)日:1978-01-26

    申请号:DE2654815

    申请日:1976-12-03

    Applicant: IBM

    Abstract: The print convention apparatus and method disclosed herein effects a decision making process with respect to a determination as to whether an alphabetic character field output from an optical character reader (OCR) is related to the OCR scan of an upper case or a lower case inscription on the document scanned. The alphabetic character field (e.g., a word) is comprised of one or a series of alphabetic characters which represent the OCR's interpretation of characters printed on the scanned document. Each word output by the OCR corresponds to a field (i.e., word) of characters imprinted on the scanned document. The electrical signals representative of the upper and lower case alphabetic characters and rejects including conflicts outputted from the OCR are applied to a character occurrence probability storage apparatus which contains precomputed empirical probabilities therein that: (1) a given character recognition is the result of the scan of an upper case character; and (2) a given character recognition is the result of the scan of a lower case character. In addition, the storage apparatus includes probability values for character conflicts and rejects. As the series of alphabetic character signals from the OCR output are applied character-by-character to the character occurrence probability storage apparatus (e.g., a read-only store), a running sum of the respective probabilities for the upper case and lower case print conventions is developed so that, following the input of the final character, reject or conflict within a word to the aforesaid apparatus, an appropriate upper or lower case determination can be made for all of the characters within the word. This determination corresponds with the print convention of the word inscribed on the scanned document. A corresponding upper or lower case flag is correspondingly generated with the print convention determination, and associated with the alphabetic character word output from the print convention apparatus for further text processing. In one embodiment of the invention the probability for each OCR output alphabetic character being an upper or lower case character is stored in respective upper and lower case character occurrence probability storage devices after having been precomputed as the product of two probability factors; i.e., (1) a first probability factor with respect to the likelihood that the OCR recognition resulted from the scan of an upper or lower case character, and (2) a second probability factor with respect to the likelihood of a given character occurring in a specified language (e.g., English) document. In another embodiment of the invention, the character occurrence probability storage devices are functionally replaced by a read-only store having an address position for each upper and lower case alphabetic character outputted by the OCR including conflicts and rejects, and a precomputed numerical probability value associated with each address position to represent the quotient of: (1) the probability that a given character is related to an upper case print convention; and (2) the probability that the same character is related to a lower case print convention.

    5.
    发明专利
    未知

    公开(公告)号:DE2513566A1

    公开(公告)日:1976-02-19

    申请号:DE2513566

    申请日:1975-03-27

    Applicant: IBM

    Abstract: A binary reference matrix apparatus is diclosed for verifying input alpha words from a character recognition machine as valid linguistic expressions. The organization of the binary reference matrix is based upon the character transfer function of the character recognition machine. The alphabetic character stream for each word scanned by the character recognition machine, is mapped into a vector representation through the assignment of a unique numeric value for each letter in the alphabet. The vector magnitude and angle so calculated constitute the address data for accessing the binary reference matrix. The point accessed in the matrix will have a binary value of 1 if the scanned word is valid and will have a binary value of 0 if the scanned word is invalid. The organization of the binary reference matrix minimizes the size of the array needed for accurate verification by choosing numerical values for the alphabetic characters in an inverse proportion to the characters read reliability in the character recognition machine, as determined by the empirical measurement of the character recognition machine, character transfer function.

    BINARY REFERENCE MATRIXES
    6.
    发明专利

    公开(公告)号:AU8100375A

    公开(公告)日:1976-11-11

    申请号:AU8100375

    申请日:1975-05-09

    Applicant: IBM

    Abstract: A binary reference matrix apparatus is diclosed for verifying input alpha words from a character recognition machine as valid linguistic expressions. The organization of the binary reference matrix is based upon the character transfer function of the character recognition machine. The alphabetic character stream for each word scanned by the character recognition machine, is mapped into a vector representation through the assignment of a unique numeric value for each letter in the alphabet. The vector magnitude and angle so calculated constitute the address data for accessing the binary reference matrix. The point accessed in the matrix will have a binary value of 1 if the scanned word is valid and will have a binary value of 0 if the scanned word is invalid. The organization of the binary reference matrix minimizes the size of the array needed for accurate verification by choosing numerical values for the alphabetic characters in an inverse proportion to the characters read reliability in the character recognition machine, as determined by the empirical measurement of the character recognition machine, character transfer function.

    7.
    发明专利
    未知

    公开(公告)号:DE2541204A1

    公开(公告)日:1976-04-15

    申请号:DE2541204

    申请日:1975-09-16

    Applicant: IBM

    Abstract: A cluster storage apparatus is disclosed for outputting groups of valid alpha words as potential candidates for the correct form of an alpha word misrecognized by a character recognition machine. Groups of alpha words are arranged in the cluster storage apparatus such that adjacent locations contain alpha words having similar character recognition misread propensities. Alpha words which have been determined to be misrecognized, are input to the cluster storage apparatus. Numerical values assigned to the characters of which the input word is composed, are used to calculate the address of that group of valid alpha words having similar character recognition misread propensities. The cluster storage apparatus then outputs the accessed groups of alpha words for subsequent processing. The organization of the cluster storage apparatus minimizes the difference in address between alpha words with similar character recognition misread propensities by assigning high numeric values to highly reliable characters, as determined by measuring the character transfer function of the character recognition machine.

    8.
    发明专利
    未知

    公开(公告)号:DE2435889A1

    公开(公告)日:1975-10-16

    申请号:DE2435889

    申请日:1974-07-25

    Applicant: IBM

    Abstract: An online numeric discriminator is disclosed which performs the decision making process between strings of characters coming from a dual output optical character recognition system for use in text processing or mail processing applications. The dual output OCR uses separate recognition processes for alphabetic and numeric characters and attempts to recognize each character independently as both an alphabetic and a numeric character. The alphabetic interpretation of the scanned word is outputted as an alphabetic subfield on a first output line and the numeric interpretation of the scanned word is outputted as a numeric subfield on a second output line from the OCR. The bayesian online numeric discriminator then analyzes the two character streams by calculating a first conditional probability that the OCR perceived the alphabetic subfield given that a numeric subfield was actually scanned and a second conditional probability that the OCR perceived the numeric subfield given that an alphabetic subfield was actually scanned. These first and second conditional probabilities are then compared. If the conditional probability that the OCR read the alphabetic subfield given that the numeric subfield was actually scanned, is larger than the conditional probability that the OCR read the numeric subfield given that the alphabetic subfield was actually scanned, then the numeric subfield is selected by the discriminator as the most probable interpretation of the word scanned by the OCR.

Patent Agency Ranking