WORD VECTORIZATION ANGLE HASHING METHOD AND APPARATUS

    Publication No.: CA1157563A

    Publication date: 1983-11-22

    Application No.: CA378661

    Filing date: 1981-05-29

    Applicant: IBM

    Inventor: GLICKMAN DAVID

    Abstract: A method and apparatus for vectorizing text words for compact storage and spelling verification in a miniprocessor system without the use of complex mathematical functions. A binary storage table contains a plurality of addressable binary numbers. Each character in an input word is converted into a numerical weighting value. The numerical weighting values for the characters in a word are used to index into a magnitude weighting table. The selected magnitude weights are summed to produce a vector magnitude representation for the input word. The numerical weighting values are also used to cumulatively access the binary storage table. The values output from the binary storage table are modulo-2 added and accumulated to produce a vector angle representation for the input word. The calculated magnitude and angle values are used to compactly store a dictionary memory of correctly spelled words. Words subsequently input for spelling verification are similarly converted to vector magnitude and angle representations for comparison to the stored dictionary. AT9-80-003
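The magnitude and angle computation described in this abstract can be sketched in Python as follows. This is a minimal illustration rather than the patented implementation: the table size, the table contents, the a=1 ... z=26 weighting, and the names `char_weight`, `vectorize`, `build_dictionary` and `is_known` are all assumptions made for the example.

```python
TABLE_SIZE = 256  # assumed size of the binary storage table

# Assumed table contents. Multiplying by an odd constant modulo 2**16 gives
# every address a distinct 16-bit entry.
BINARY_TABLE = [(i * 40503) & 0xFFFF for i in range(TABLE_SIZE)]

# Assumed magnitude weights, indexed by weighting value 1..26 (index 0 unused).
MAGNITUDE_TABLE = [0] + [(w * 37) % 101 + 1 for w in range(1, 27)]


def char_weight(ch):
    """Convert a letter to a numerical weighting value (a=1 ... z=26, assumed)."""
    if not (ch.isascii() and ch.isalpha()):
        raise ValueError(f"unsupported character: {ch!r}")
    return ord(ch.lower()) - ord("a") + 1


def vectorize(word):
    """Return (magnitude, angle) using only table lookups, additions and XORs."""
    magnitude = 0
    angle = 0
    address = 0
    for ch in word:
        weight = char_weight(ch)
        magnitude += MAGNITUDE_TABLE[weight]       # sum the selected magnitude weights
        address = (address + weight) % TABLE_SIZE  # cumulative access into the table
        angle ^= BINARY_TABLE[address]             # modulo-2 addition is a bitwise XOR
    return magnitude, angle


def build_dictionary(words):
    """Store each correctly spelled word as a compact (magnitude, angle) pair."""
    return {vectorize(word) for word in words}


def is_known(word, dictionary):
    """Verify spelling by comparing the word's vector to the stored dictionary."""
    return vectorize(word) in dictionary
```

In this sketch the magnitude is independent of letter order, so "word" and "wrod" share a magnitude, while the cumulative table addresses depend on order, so the two words produce different angles. Distinct words can still collide on both values, which is the usual trade-off of a hashed dictionary.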

    ALPHA CONTENT MATCH PRESCAN METHOD FOR AUTOMATIC SPELLING ERROR CORRECTIONS

    Publication No.: CA1153471A

    Publication date: 1983-09-06

    Application No.: CA362811

    Filing date: 1980-10-20

    Applicant: IBM

    Abstract: A system for reducing the computation required to match a misspelled word against various candidates from a dictionary to find one or more words that represent the best match to the misspelled word. The major facility offered is the ability to computationally discern the degree of apparent match between a given target word and words that do not perfectly match it, without requiring the computationally tedious procedure of character-by-character positional matching, which necessitates shifting and realignment to accommodate differences between the candidate and target words due to character differences or added and dropped syllables. The system includes a method for storing and retrieving words from the dictionary based on their likelihood of being the correct version of a misspelled word, and then reviewing those words further using the Prescan Alpha Content Match to reduce the number of candidates that must then be examined in a high resolution positional match to find the candidate(s) that match the misspelled word with the greatest character affinity. The Prescan Alpha Content Match reduces the number of candidates in contention so as to make a high resolution match computationally feasible on a real-time basis. AT9-79-027

    ALPHA CONTENT PRESCAN METHOD FOR AUTOMATIC SPELLING ERROR CORRECTION

    Publication No.: DE3175443D1

    Publication date: 1986-11-13

    Application No.: DE3175443

    Filing date: 1981-02-05

    Applicant: IBM

    Abstract: Method and system for reducing the computation required to match a misspelled word against various candidates from a dictionary to find one or more words that represent the best match to the misspelled word. The method consists in comparing (steps 20-24) bit masks whose bits are set to reflect the presence or absence of specific characters or character combinations, without regard to position, in the misspelled word and in each of the dictionary candidate words. Then (steps 25-27) a candidate word is dismissed from additional processing if there is not a predetermined percentage of bit mask match between the masks of the misspelled word and the candidate word.
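A bit-mask prescan of this kind might look like the sketch below. It is illustrative only: the abstract does not define the mask layout or the match percentage, so the one-bit-per-letter mask, the shared-bits-over-all-set-bits ratio, the 0.6 threshold, and the function names are assumptions. Character combinations, which the abstract also mentions, are left out.

```python
def letter_mask(word):
    """Set bit i when the i-th letter of the alphabet occurs anywhere in the word."""
    mask = 0
    for ch in word.lower():
        if "a" <= ch <= "z":
            mask |= 1 << (ord(ch) - ord("a"))
    return mask


def mask_match(word_a, word_b):
    """Fraction of set bits that the two masks share (assumed definition)."""
    mask_a, mask_b = letter_mask(word_a), letter_mask(word_b)
    union = mask_a | mask_b
    if union == 0:
        return 0.0
    return bin(mask_a & mask_b).count("1") / bin(union).count("1")


def prescan(misspelled, candidates, threshold=0.6):
    """Dismiss candidates whose mask match falls below the threshold."""
    return [c for c in candidates if mask_match(misspelled, c) >= threshold]


# "banana" shares no letters with "recieve" and is dropped before any
# positional comparison would be attempted.
survivors = prescan("recieve", ["receive", "deceive", "banana"])
```

Because each comparison is a single AND, OR and two bit counts, the filter costs far less than aligning the words character by character.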

    ALPHA CONTENT MATCH PRESCAN METHOD AND SYSTEM FOR AUTOMATIC SPELLING ERROR CORRECTION

    Publication No.: DE3071473D1

    Publication date: 1986-04-10

    Application No.: DE3071473

    Filing date: 1980-12-04

    Applicant: IBM

    Abstract: Method and system for reducing the computation required to match a misspelled word against various candidates from a dictionary to find one or more words that represent the best match to the misspelled word. The method consists in inventorying (steps 20-27), without regard to position, the respective characters in the misspelled word and in each of the dictionary candidate words. Then (steps 28-31) a candidate word is dismissed from additional processing if there is not a predetermined percentage match between its character content and that of the misspelled word. Such a prescan alpha content match reduces the number of candidates in contention so as to make a high resolution match computationally feasible on a real-time basis.
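The character inventory described here counts occurrences rather than mere presence, which a multiset models directly. The sketch below is one possible reading, not the patent's procedure: the match percentage (shared character count divided by the longer word's length), the 0.75 threshold, and the function names are assumptions.

```python
from collections import Counter


def content_match(word_a, word_b):
    """Share of characters the two words have in common, ignoring position."""
    a, b = word_a.lower(), word_b.lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    # Counter & Counter keeps the minimum count of each character.
    shared = sum((Counter(a) & Counter(b)).values())
    return shared / longest


def content_prescan(misspelled, candidates, threshold=0.75):
    """Keep only candidates whose character content is close enough."""
    return [c for c in candidates if content_match(misspelled, c) >= threshold]


# A transposition such as "recieve" has exactly the same inventory as "receive".
kept = content_prescan("helo", ["hello", "help", "world"])
```

A transposed pair scores a perfect match under this measure, which is why the surviving candidates still need the high resolution positional match that the abstract mentions.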

    OFFICE CORRESPONDENCE STORAGE AND RETRIEVAL SYSTEM

    Publication No.: CA1241122A

    Publication date: 1988-08-23

    Application No.: CA363345

    Filing date: 1980-10-27

    Applicant: IBM

    Abstract: A system that intelligently abstracts and archives a document for storage and interprets a free form user retrieval query to recall the document from the storage file. The system includes a method for automatically selecting keywords from the document using a partial speech directory. A method is given for weighing the importance or centrality of each keyword with respect to the document of its origin. Using the same logic paths, the system takes a free form query that describes the document in the same manner that it would have to be described to a secretary to "find" it in a filing cabinet, automatically determines the key matching terms, and finds the archived document(s) with the greatest affinity.
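The keyword selection, weighting and query matching outlined in this abstract can be approximated with a short sketch. Everything specific in it is an assumption for illustration: a small `FUNCTION_WORDS` set stands in for the directory the abstract refers to, relative frequency stands in for the centrality weighting, and affinity is taken as the sum of the document weights of the query's keywords.

```python
import re
from collections import Counter

# Hypothetical stand-in for the directory used to reject non-keywords.
FUNCTION_WORDS = {
    "a", "an", "the", "of", "to", "in", "on", "for", "and", "or", "is",
    "was", "that", "it", "with", "from", "about", "me", "find",
}


def keywords(text):
    """Select keywords and weight each by its relative frequency in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in FUNCTION_WORDS)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}


def affinity(query, doc_keywords):
    """Sum the document weights of every keyword found in the query."""
    return sum(doc_keywords.get(w, 0.0) for w in keywords(query))


def best_match(query, archive):
    """Return the name of the archived document with the greatest affinity."""
    return max(archive, key=lambda name: affinity(query, archive[name]))


# The query passes through the same keyword logic as the archived documents.
archive = {
    "budget": keywords("The budget report for the third quarter budget review"),
    "travel": keywords("Travel request for the sales conference in Austin"),
}
found = best_match("find the letter about the quarter budget", archive)
```

Running the query through `keywords` mirrors the abstract's point that documents and queries share the same logic paths. An empty archive makes `max` raise `ValueError`, so a caller would need to handle that case.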

    METHOD AND APPARATUS FOR VECTORIZING TEXT WORDS IN A TEXT PROCESSING SYSTEM

    Publication No.: DE3164082D1

    Publication date: 1984-07-19

    Application No.: DE3164082

    Filing date: 1981-03-19

    Applicant: IBM

    Inventor: GLICKMAN DAVID

    Abstract: Method and apparatus for vectorizing text words for compact storage and spelling verification in a text processing system without the use of complex mathematical functions. Each character in an input word is converted into a numerical value in decode means (7). The numerical values for the characters in a word are used to index into a magnitude table (30), the selected magnitude values being summed in adder (9) to produce a vector magnitude representation for the input word. The numerical values are also used to cumulatively access the binary table (17), the values output from this table being modulo-2 added and accumulated in a XOR (18) to produce a vector angle representation for the input word. The calculated magnitude and angle values are used to compactly store a dictionary memory of correctly spelled words or for comparison to the stored dictionary.
