Bayesian online numeric discriminant
    1.
    Granted invention patent
    Bayesian online numeric discriminant (Expired)

    Publication number: US3839702A

    Publication date: 1974-10-01

    Application number: US40952473

    Filing date: 1973-10-25

    Applicant: IBM

    CPC classification number: G06K9/72 G06K2209/01

    Abstract: An online numeric discriminator is disclosed which performs the decision making process between strings of characters coming from a dual output optical character recognition (OCR) system for use in text processing or mail processing applications. The dual output OCR uses separate recognition processes for alphabetic and numeric characters and attempts to recognize each character independently as both an alphabetic and a numeric character. The alphabetic interpretation of the scanned word is output as an alphabetic subfield on a first output line, and the numeric interpretation of the scanned word is output as a numeric subfield on a second output line from the OCR. The Bayesian online numeric discriminator then analyzes the two character streams by calculating a first conditional probability that the OCR perceived the alphabetic subfield given that a numeric subfield was actually scanned, and a second conditional probability that the OCR perceived the numeric subfield given that an alphabetic subfield was actually scanned. These first and second conditional probabilities are then compared. If the conditional probability that the OCR read the alphabetic subfield given that the numeric subfield was actually scanned is larger than the conditional probability that the OCR read the numeric subfield given that the alphabetic subfield was actually scanned, then the numeric subfield is selected by the discriminator as the most probable interpretation of the word scanned by the OCR.
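The comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented circuit: it assumes the two subfields have equal length, that characters are misread independently (so each subfield probability is a product of per-character probabilities), and that the per-character confusion probabilities are available as lookup tables. The table values, the `FLOOR` constant, and the function name `discriminate` are all hypothetical.

```python
from math import prod

# Hypothetical confusion probabilities (illustrative values only, not from the patent).
# P_ALPHA_GIVEN_NUM[(a, n)]: chance the alphabetic channel reports `a`
# when the numeric character `n` was actually scanned.
P_ALPHA_GIVEN_NUM = {
    ("O", "0"): 0.60, ("I", "1"): 0.55, ("S", "5"): 0.40,
    ("B", "8"): 0.35, ("G", "6"): 0.05,
}
# P_NUM_GIVEN_ALPHA[(n, a)]: chance the numeric channel reports `n`
# when the alphabetic character `a` was actually scanned.
P_NUM_GIVEN_ALPHA = {
    ("0", "O"): 0.50, ("1", "I"): 0.45, ("5", "S"): 0.30,
    ("8", "B"): 0.25, ("6", "G"): 0.30,
}
FLOOR = 1e-4  # assumed probability for character pairs missing from the tables


def discriminate(alpha_subfield: str, numeric_subfield: str) -> str:
    """Return "numeric" or "alphabetic" for one dual-output OCR word."""
    if len(alpha_subfield) != len(numeric_subfield):
        raise ValueError("subfields must have the same length")
    pairs = list(zip(alpha_subfield, numeric_subfield))
    # First conditional probability: alphabetic subfield perceived,
    # given that the numeric subfield was actually scanned.
    p_alpha_given_numeric = prod(P_ALPHA_GIVEN_NUM.get((a, n), FLOOR) for a, n in pairs)
    # Second conditional probability: numeric subfield perceived,
    # given that the alphabetic subfield was actually scanned.
    p_numeric_given_alpha = prod(P_NUM_GIVEN_ALPHA.get((n, a), FLOOR) for a, n in pairs)
    # Decision rule from the abstract: numeric wins only if the first is larger.
    return "numeric" if p_alpha_given_numeric > p_numeric_given_alpha else "alphabetic"


print(discriminate("SOB", "508"))  # numeric
print(discriminate("G", "6"))      # alphabetic
```

In this sketch a tie falls to the alphabetic reading, because the abstract only states what happens when the first probability is strictly larger.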


    3.
    Invention patent
    Unknown

    Publication number: BR7504944A

    Publication date: 1976-07-27

    Application number: BR7504944

    Filing date: 1975-08-01

    Applicant: IBM

    Abstract: A binary reference matrix apparatus is disclosed for verifying input alpha words from a character recognition machine as valid linguistic expressions. The organization of the binary reference matrix is based upon the character transfer function of the character recognition machine. The alphabetic character stream for each word scanned by the character recognition machine is mapped into a vector representation through the assignment of a unique numeric value to each letter in the alphabet. The vector magnitude and angle so calculated constitute the address data for accessing the binary reference matrix. The point accessed in the matrix will have a binary value of 1 if the scanned word is valid and a binary value of 0 if the scanned word is invalid. The organization of the binary reference matrix minimizes the size of the array needed for accurate verification by choosing numerical values for the alphabetic characters in inverse proportion to each character's read reliability in the character recognition machine, as determined by empirical measurement of the machine's character transfer function.
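The magnitude-and-angle addressing in this abstract can be illustrated with a small sketch. Several details are assumptions, since the abstract does not give them: the magnitude is taken as the Euclidean norm of the letter-value vector, the angle is measured against the all-ones vector of the same length, both are quantized into bins, and a Python set of occupied addresses stands in for the 1-valued cells of the binary matrix. The letter values here are simply alphabet positions, a placeholder for the reliability-derived values the patent describes. All function names are hypothetical.

```python
import math
import string

# Placeholder letter values (A=1 ... Z=26). The patent derives its values from
# measured read reliability, which is not available here.
LETTER_VALUE = {ch: i + 1 for i, ch in enumerate(string.ascii_uppercase)}


def address(word: str, mag_step: float = 1.0, angle_step: float = 0.5) -> tuple:
    """Map a word to a quantized (magnitude, angle) address."""
    if not word:
        raise ValueError("word must not be empty")
    values = [LETTER_VALUE[ch] for ch in word.upper()]
    magnitude = math.sqrt(sum(v * v for v in values))
    # Angle between the word vector and the all-ones vector (an assumption).
    cos_theta = sum(values) / (magnitude * math.sqrt(len(values)))
    angle = math.degrees(math.acos(min(1.0, cos_theta)))
    return (int(magnitude / mag_step), int(angle / angle_step))


def build_matrix(valid_words) -> set:
    """Collect the addresses whose matrix cell would hold a 1."""
    return {address(w) for w in valid_words}


def is_valid(word: str, matrix: set) -> bool:
    """A word is accepted when its address points at a 1-valued cell."""
    return address(word) in matrix


MATRIX = build_matrix(["CORN", "WHEAT"])
print(is_valid("CORN", MATRIX))  # True
print(is_valid("CORM", MATRIX))  # False
```

One limitation of this simplified addressing is that anagrams share an address, so `NORC` would be accepted alongside `CORN`. A lookup of this kind trades some false acceptances for a compact table.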

    4.
    Invention patent
    Unknown

    Publication number: BR7506545A

    Publication date: 1976-08-17

    Application number: BR7506545

    Filing date: 1975-10-07

    Applicant: IBM

    Abstract: A cluster storage apparatus is disclosed for outputting groups of valid alpha words as potential candidates for the correct form of an alpha word misrecognized by a character recognition machine. Groups of alpha words are arranged in the cluster storage apparatus such that adjacent locations contain alpha words having similar character recognition misread propensities. Alpha words which have been determined to be misrecognized are input to the cluster storage apparatus. Numerical values assigned to the characters of which the input word is composed are used to calculate the address of that group of valid alpha words having similar character recognition misread propensities. The cluster storage apparatus then outputs the accessed groups of alpha words for subsequent processing. The organization of the cluster storage apparatus minimizes the difference in address between alpha words with similar character recognition misread propensities by assigning high numeric values to highly reliable characters, as determined by measuring the character transfer function of the character recognition machine.
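A rough sketch of the cluster lookup idea follows. It assumes the address of a word is the sum of its letter values and that candidates are fetched from every address within a small radius of the misread word's address. The letter values are hypothetical: they only place a few commonly confused letters (O/Q/D, I/L) close together, and they do not reproduce the reliability-based assignment described in the abstract. The names `address`, `build_clusters`, and `lookup` are likewise illustrative.

```python
import string
from collections import defaultdict

# Hypothetical letter values: widely spaced by default, with assumed
# confusable letters moved next to each other.
LETTER_VALUE = {ch: 10 * (i + 1) for i, ch in enumerate(string.ascii_uppercase)}
LETTER_VALUE.update({
    "Q": LETTER_VALUE["O"] + 1,
    "D": LETTER_VALUE["O"] + 2,
    "L": LETTER_VALUE["I"] + 1,
})


def address(word: str) -> int:
    """Assumed addressing rule: sum of the letter values."""
    return sum(LETTER_VALUE[ch] for ch in word.upper())


def build_clusters(valid_words) -> dict:
    """Group valid words by address so similar misreads land nearby."""
    clusters = defaultdict(list)
    for word in valid_words:
        clusters[address(word)].append(word)
    return clusters


def lookup(misread: str, clusters: dict, radius: int = 3) -> list:
    """Return valid words stored within `radius` of the misread word's address."""
    center = address(misread)
    candidates = []
    for addr in range(center - radius, center + radius + 1):
        candidates.extend(clusters.get(addr, []))
    return sorted(candidates)


CLUSTERS = build_clusters(["COLD", "CORD", "BOLD", "MAIL"])
print(lookup("CQLD", CLUSTERS))  # ['COLD']
print(lookup("COID", CLUSTERS))  # ['COLD']
```

The returned candidates would then go to a later stage that picks the most likely correction, which the abstract refers to only as subsequent processing.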
