ALGORITHM FOR CONSTRUCTING TREE STRUCTURED CLASSIFIERS

    公开(公告)号:CA1266527A

    公开(公告)日:1990-03-06

    申请号:CA528724

    申请日:1987-02-02

    Applicant: IBM

    Abstract: Disclosed is a method for assigning features to nodes of a tree structured classifier and for determining terminal nodes in response to a training set of objects, each of such objects being determined by a plurality of features. The method comprises the steps at each node of the tree of: (1) determining a selected characteristic, such as a cost function based on the minimum description length, of the plurality of features unused at prior nodes along the path from the root to the present node; (2) assigning a feature to the node having a preferred value for the selected characteristic relative to the other features; (3) creating child nodes in response to the assigned feature; (4) for each child node, determining the selected characteristic for the plurality of features unused at prior nodes and assigning a feature to the child node having a preferred value for the selected characteristic relative to the other features; (5) generating a combination of the values for the selected characteristics of the assigned features for the child nodes of the node; and (6) classifying the node as a terminal node in response to a comparison of the combination of values for the features assigned to the child nodes and the value for the feature assigned to the node.

    ARITHMETIC CODING ENCODER AND DECODER SYSTEM

    公开(公告)号:CA1291821C

    公开(公告)日:1991-11-05

    申请号:CA544052

    申请日:1987-08-07

    Applicant: IBM

    Abstract: YO986-091 Apparatus and method for compressing and de-compressing binary decision data by arithmetic coding and decoding wherein the estimated probability Qe of the less probable of the two decision events, or outcomes, adapts as decisions are successively encoded. To facilitate coding computations, an augend value A for the current number line interval is held to approximate one by renormalizing A whenever it becomes less than a prescribed minimum AMIN. When A is renormalized, the value of Qe is up-dated. The renormalization of A and up-dating of Qe are preferably based on a single-bit test. Also, each Qe value is preferably specified as a 12-bit value having the least significant bit set to 1 and having no more than four other bits set to 1. The number of Qe values in the 1/4 to 1/2 probability range is enhanced to improve coding efficiency. A decision coding parameter of preferably six bits indicates the sense of the more probable symbol (MPS) in one bit and identifies a corresponding Qe value with the remaining five bits. In addition to probability adaptation, the present invention discloses an allocation of bits in a code stream register in which preferably two spacer bits are inserted between a next byte portion (which contains a byte of data en route to a buffer) and a fractional portion which may be involved in further computation. With the two spacer bits, any code greater than or equal to Hex 'CO' which fol YO986-091 lows a Hex 'FF' byte is illegal for data and therefore provides for an escape from the code stream. The two spacer bits also reduce the number of stuff bits inserted to account for carry or borrow propagation. Encoding and decoding can be performed interchangeably by hardware or software which feature differing coding conventions.

    7.
    发明专利
    未知

    公开(公告)号:FR2382718A1

    公开(公告)日:1978-09-29

    申请号:FR7803456

    申请日:1978-02-01

    Applicant: IBM

    Abstract: There is disclosed a method and means for compacting (encoding) and decompacting (decoding) binary bit strings which avoids the blocking of string elements required by Huffman coding and the ever increasing memory as is the case in simple enumerative coding. The method and means arithmetically encodes successive terms or symbols in a symbol string s=ai aj . . . , in which each new term ak in a source alphabet of N symbols gives rise to a new encoded string C(sak) and a new length indicator L(sak). The method and means comprises the steps of forming L(sak) from the recursion L(sak)=L(s)+l(ak), where l(ak) is a rational approximation of log2 1/p(ak), p(ak) being the a'priori probability of occurrence of ak, and l(ak) being so constrained that the Kraft inequality is satisfied: AND FORMING C(sak) from the recursion C(s)+[Pk-12L(sak)],where: AND WHERE Pk-1 is the cumulative probability of occurrence of an arbitrary ordering of all symbols.

Patent Agency Ranking