METHOD AND APPARATUS FOR DERIVATION OF OPTIMIZATION COUPLING RULE

    Publication No.: JPH09134365A

    Publication date: 1997-05-20

    Application No.: JP28483695

    Filing date: 1995-11-01

    Applicant: IBM

    Abstract: PROBLEM TO BE SOLVED: To detect at high speed the correlation between data having both a numerical attribute and a (0-1) attribute, by dividing the numerical attribute axis into plural sections, counting for every section the number of data it contains and the number of those data whose (0-1) attribute equals 1, and then performing specific processing. SOLUTION: A bucket processing part 1510 divides the numerical attribute axis into plural sections and counts, for every section, the number of data and the number of data whose (0-1) attribute equals 1. A plane constitution processing part 1520 virtually constitutes a plane with a 1st axis corresponding to the total number of data for every section and a 2nd axis corresponding to the total number of data whose (0-1) attribute equals 1 for every section, and virtually plots on the plane the points corresponding to these values for the sections. Furthermore, a largest tilt line extraction part 1530 extracts, from among the pairs of points whose interval along the 1st axis is larger than T·N (T: rate, N: total number of data), the pair whose connecting line has the largest tilt, and outputs the section corresponding to the extracted pair of points.
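
The bucket counting and largest-tilt search described in this abstract can be illustrated with a small Python sketch. It is not the patented procedure: it assumes the plotted points are cumulative totals over the sections (so that the tilt between two points equals the fraction of data with attribute 1 in the sections between them), and it checks every pair by brute force, whereas the abstract claims a high-speed method. The function name and the parameters `edges` and `rate` are hypothetical.

```python
from bisect import bisect_right

def largest_tilt_interval(values, flags, edges, rate):
    """Return (tilt, first_bucket, last_bucket) or None.

    values: numeric attribute of each datum
    flags:  (0-1) attribute of each datum
    edges:  sorted bucket boundaries (assumed given; hypothetical parameter)
    rate:   T in the abstract; intervals must hold more than T*N data
    """
    n_buckets = len(edges) + 1
    u = [0] * n_buckets  # number of data per bucket
    v = [0] * n_buckets  # number of data with flag == 1 per bucket
    for x, f in zip(values, flags):
        b = bisect_right(edges, x)
        u[b] += 1
        v[b] += f

    # Assumption: points are cumulative totals (X[k], Y[k]) over buckets 0..k-1.
    X, Y = [0], [0]
    for ui, vi in zip(u, v):
        X.append(X[-1] + ui)
        Y.append(Y[-1] + vi)
    n = X[-1]

    best = None
    for s in range(n_buckets):
        for t in range(s + 1, n_buckets + 1):
            dx = X[t] - X[s]
            if dx <= rate * n:  # interval along the 1st axis must exceed T*N
                continue
            tilt = (Y[t] - Y[s]) / dx
            if best is None or tilt > best[0]:
                best = (tilt, s, t - 1)
    return best
```

With eight data split into four equal buckets, `largest_tilt_interval([1, 2, 3, 4, 5, 6, 7, 8], [0, 0, 1, 1, 1, 0, 0, 0], [2.5, 4.5, 6.5], 0.25)` selects buckets 1 to 2, where three of the four data have attribute 1.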

    METHOD AND DEVICE FOR GENERATING REGRESSION TREE

    Publication No.: JPH1115831A

    Publication date: 1999-01-22

    Application No.: JP16128097

    Filing date: 1997-06-18

    Applicant: IBM

    Abstract: PROBLEM TO BE SOLVED: To construct a more preferable tree by using two numeric attributes in a division rule: a region of buckets that minimizes the mean square error of the target numeric attribute is cut out, and nodes are generated for the data inside the cut region and for the data outside it. SOLUTION: A plane corresponding to the two predicate numeric attributes is constituted and divided into a mesh (step 122). For each mesh element, the number of tuples in the data set D that belong to it and the sum of the target numeric attribute of those tuples are stored. The form of the area R to be cut from the plane is designated (step 124); the available forms are x-monotone, orthogonal projection and base monotone. Then, while a probing parameter θ is changed, the area R that maximizes the interclass variance is cut from the plane (step 126). The area R is set to be the division rule R.
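
The per-mesh statistics and the interclass variance mentioned in this abstract can be sketched in Python as follows. Several things here are assumptions rather than content of the abstract: the mesh is taken to be equal-width, the interclass variance is taken in its usual form (the size-weighted squared deviations of the inside and outside means from the overall mean), and the region is simply supplied as a set of cells. The search over region shapes with the probing parameter θ is not reproduced. All function and parameter names are hypothetical.

```python
def mesh_stats(rows, n, x_range, y_range):
    """Build n-by-n grids of tuple counts and target sums.

    rows: iterable of (x, y, target) tuples.
    x_range, y_range: (min, max) of each predicate attribute (assumed known).
    """
    (x0, x1), (y0, y1) = x_range, y_range
    count = [[0] * n for _ in range(n)]
    total = [[0.0] * n for _ in range(n)]
    for x, y, z in rows:
        # Equal-width meshing is an assumption; clamp the max value into the last cell.
        i = min(int((x - x0) / (x1 - x0) * n), n - 1)
        j = min(int((y - y0) / (y1 - y0) * n), n - 1)
        count[i][j] += 1
        total[i][j] += z
    return count, total

def interclass_variance(count, total, region):
    """Interclass variance of splitting the data into `region` and its complement.

    region: set of (i, j) mesh cells.
    """
    n_all = sum(map(sum, count))
    s_all = sum(map(sum, total))
    n_in = sum(count[i][j] for i, j in region)
    s_in = sum(total[i][j] for i, j in region)
    n_out = n_all - n_in
    if n_in == 0 or n_out == 0:
        return 0.0  # a one-sided split separates nothing
    mean = s_all / n_all
    mean_in = s_in / n_in
    mean_out = (s_all - s_in) / n_out
    return n_in * (mean_in - mean) ** 2 + n_out * (mean_out - mean) ** 2
```

Because the total variance of the target is fixed for a given data set, maximizing this quantity is equivalent to minimizing the mean square error of predicting each side by its own mean, which is why the abstract can state the goal both ways.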

    METHOD AND DEVICE FOR DERIVING INTER-DATA CONNECTION RULE, METHOD AND DEVICE FOR CUTTING ORTHOGONAL PROJECTION AREA

    Publication No.: JPH10240747A

    Publication date: 1998-09-11

    Application No.: JP3460597

    Filing date: 1997-02-19

    Applicant: IBM

    Abstract: PROBLEM TO BE SOLVED: To make it easy to find a connection rule between data by composing a plane from two numeric attributes of the analyzed object and cutting out an orthogonal projection area with respect to the true/false attribute under specific conditions. SOLUTION: For a database whose data include two kinds of numeric attribute and one kind of true/false attribute, a plane having two axes corresponding to the two numeric attributes is divided into N×N pixels, and for each pixel (row i, column j) the number u(i, j) of data belonging to it and the number v(i, j) of those data whose true/false attribute is true are stored. Then a specific condition θ is inputted, and an orthogonal projection area S of pixels maximizing equation I is cut out of the plane. Cutting out an area of orthogonal projection shape makes the connection rule easy for people to grasp. Lastly, the data included in the cut orthogonal projection area S are outputted.

    METHOD AND DEVICE FOR DERIVING CONNECTION RULE BETWEEN DATA

    Publication No.: JPH09179883A

    Publication date: 1997-07-11

    Application No.: JP4466096

    Filing date: 1996-03-01

    Applicant: IBM

    Abstract: PROBLEM TO BE SOLVED: To make it possible to find the correlation between data which have two numeric attributes and a true/false attribute. SOLUTION: A plane is first constituted from the two numeric attributes and divided into meshes (buckets), and for each mesh the number of data in it and the number of those data whose attribute is true are counted. This plane can be regarded as a plane image in which the number of data corresponds to the gray level and the number of data having the true attribute corresponds to the saturation. Then a permissible image, which is an area that is convex along one axis of the plane, is cut out under specific conditions, and a part where the correlation of data is strong is found. When the cut permissible image meets the conditions of a support maximization rule, etc., the area is shown to a user. Further, necessary attributes of the data included in the area are extracted from a database as needed.
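
As an illustration of how a cut area could be scored against rule conditions, the following Python sketch computes the support and confidence of a region from the per-mesh counts. The definitions are the usual association-rule ones and are assumptions here (support as the fraction of all data inside the region, confidence as the fraction of those data whose attribute is true); the abstract does not define them. Representing the region as one contiguous span of rows per column is also an assumption about what "convex along one axis" means, and the names `u`, `v` and `column_spans` are hypothetical.

```python
def region_support_confidence(u, v, column_spans):
    """Score a region of the mesh.

    u[row][col]: number of data in each mesh element.
    v[row][col]: number of those data whose true/false attribute is true.
    column_spans: dict mapping column -> (low_row, high_row), inclusive.
    """
    total = sum(map(sum, u))
    hits = 0   # data inside the region
    trues = 0  # data inside the region with a true attribute
    for col, (lo, hi) in column_spans.items():
        for row in range(lo, hi + 1):
            hits += u[row][col]
            trues += v[row][col]
    support = hits / total if total else 0.0
    confidence = trues / hits if hits else 0.0
    return support, confidence
```

A support maximization rule in this sense would look for the region with the largest support among those whose confidence stays above a chosen threshold.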

    METHOD AND APPARATUS FOR RETRIEVAL OF DATABASE

    Publication No.: JPH09134363A

    Publication date: 1997-05-20

    Application No.: JP28541695

    Filing date: 1995-11-01

    Applicant: IBM

    Abstract: PROBLEM TO BE SOLVED: To derive connection rules that hold between a numerical attribute and a (0-1) attribute. SOLUTION: The numerical attribute is divided into plural sections (buckets), and each datum is put into a bucket according to the value of its numerical attribute. Then, for every bucket, the number of data it contains and the number of those data whose (0-1) attribute equals 1 are counted. Next, s satisfying the condition of the formula is detected, which identifies the starting sections. In this expression, ui is the number of data included in a certain section and vi is the number of data in that section whose (0-1) attribute equals 1. The ending section corresponding to each starting section is then detected, so that the largest section having a certainty higher than a prescribed level α is found. Among the detected pairs of starting and ending sections, the pair including the largest number of customers is defined as the answer. Then the necessary data attributes are taken out of the data included in the answer section.
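
The goal stated in this abstract, the largest run of buckets whose certainty reaches a level α, can be illustrated with a brute-force Python sketch. It does not use the starting-section condition from the abstract, whose formula is not reproduced there; it simply tries every pair of starting and ending buckets. It also treats "certainty" as the ratio of the summed vi to the summed ui and accepts a ratio of at least α, both of which are assumptions. The function name is hypothetical.

```python
def largest_confident_range(u, v, alpha):
    """Return (data_count, start_bucket, end_bucket) or None.

    u[i]: number of data in bucket i.
    v[i]: number of data in bucket i whose (0-1) attribute equals 1.
    alpha: required certainty level (ratio of v to u over the range).
    """
    best = None
    for s in range(len(u)):
        cnt = hit = 0
        for t in range(s, len(u)):
            cnt += u[t]
            hit += v[t]
            # Compare without dividing so empty ranges are handled safely.
            if cnt > 0 and hit >= alpha * cnt:
                if best is None or cnt > best[0]:
                    best = (cnt, s, t)
    return best
```

For bucket counts `u = [2, 2, 2, 2]` and `v = [0, 2, 1, 0]`, a level of 0.75 selects buckets 1 to 2, which hold four data of which three have attribute 1.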
