METHOD AND DEVICE FOR GENERATING REGRESSION TREE

    Publication No.: JPH1115831A

    Publication date: 1999-01-22

    Application No.: JP16128097

    Filing date: 1997-06-18

    Applicant: IBM

    Abstract: PROBLEM TO BE SOLVED: To construct a better tree by using two numeric attributes in a division rule: the region of buckets that minimizes the mean squared error of the target numeric attribute is cut out, and nodes are generated for the data inside the cut region and for the data outside it. SOLUTION: A plane corresponding to the two predicate numeric attributes is constructed and divided into a mesh (step 122). For each mesh element, the number of tuples in data set D that belong to that element and the sum of the target numeric attribute over those tuples are stored. The shape of the region R to be cut from the plane is designated (step 124). The shapes of the cut regions R are x-monotone, an orthogonal projection and base monotone. Then a probing parameter θ is varied and the region R that maximizes the interclass variance is cut from the plane (step 126). The region R is used as the division rule R.
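    The per-mesh data and the score described in this abstract can be illustrated with a minimal Python sketch. It assumes the usual two-class form of interclass variance (inside R versus outside R); the function names `build_mesh` and `interclass_variance` are hypothetical, and the search over region shapes with the probing parameter θ is not reproduced here.

```python
def build_mesh(tuples, nx, ny, x_range, y_range):
    """Store, per mesh element, the tuple count and the sum of the target attribute.

    tuples: iterable of (x, y, target); x_range / y_range: (low, high) bounds.
    Equal-width meshing is an assumption made for this sketch.
    """
    (x_lo, x_hi), (y_lo, y_hi) = x_range, y_range
    count = [[0] * ny for _ in range(nx)]
    total = [[0.0] * ny for _ in range(nx)]
    for x, y, target in tuples:
        # Clamp so that values on the upper bound fall into the last element.
        i = min(int((x - x_lo) / (x_hi - x_lo) * nx), nx - 1)
        j = min(int((y - y_lo) / (y_hi - y_lo) * ny), ny - 1)
        count[i][j] += 1
        total[i][j] += target
    return count, total


def interclass_variance(count, total, region):
    """Score a candidate region given as a set of (i, j) mesh indices.

    Uses n_in * (mean_in - mean)^2 + n_out * (mean_out - mean)^2, which is
    an assumed formulation; only counts and sums per element are needed.
    """
    n_all = sum(map(sum, count))
    s_all = sum(map(sum, total))
    n_in = sum(count[i][j] for i, j in region)
    s_in = sum(total[i][j] for i, j in region)
    n_out, s_out = n_all - n_in, s_all - s_in
    if n_in == 0 or n_out == 0:
        return 0.0  # a split that leaves one side empty separates nothing
    mean = s_all / n_all
    return (n_in * (s_in / n_in - mean) ** 2
            + n_out * (s_out / n_out - mean) ** 2)


data = [(0.1, 0.1, 10), (0.2, 0.2, 10), (0.9, 0.9, 0), (0.8, 0.8, 0)]
count, total = build_mesh(data, 2, 2, (0.0, 1.0), (0.0, 1.0))
score = interclass_variance(count, total, {(0, 0)})
```

    Because the total variance of the target is fixed, a region with a larger interclass variance leaves a smaller within-class squared error, which is why the two criteria in the abstract point to the same region.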

    METHOD AND DEVICE FOR DERIVING CONNECTION RULE BETWEEN DATA

    Publication No.: JPH09179883A

    Publication date: 1997-07-11

    Application No.: JP4466096

    Filing date: 1996-03-01

    Applicant: IBM

    Abstract: PROBLEM TO BE SOLVED: To make it possible to find correlations among data that have two numeric attributes and a true/false attribute. SOLUTION: A plane is first constructed from the two numeric attributes and divided into meshes (buckets), and for each mesh the number of data and the number of data whose true/false attribute is true are counted. This plane can be regarded as a plane image in which the number of data corresponds to the gray level and the number of data with a true attribute corresponds to the saturation. Then a permissible image, which is a region convex with respect to one axis of the plane, is cut out under specific conditions, and a part where the correlation of the data is strong is found. When the cut permissible region meets conditions such as a support maximization rule, the region is shown to the user. Further, necessary attributes of the data included in the region are extracted from the database as needed.

    REGION CALCULATING METHOD, SPATIAL DATA MINING DEVICE, MAP INFORMATION DISPLAY DEVICE, SPATIAL DATA MINING SYSTEM AND STORAGE MEDIUM

    Publication No.: JP2001337956A

    Publication date: 2001-12-07

    Application No.: JP2000153087

    Filing date: 2000-05-24

    Applicant: IBM

    Abstract: PROBLEM TO BE SOLVED: To provide advanced spatial data mining by optimizing a region while taking spatial continuity into account. SOLUTION: This region calculating method derives a two-dimensional correlation rule from a database containing spatial information, such as dwelling space, and applies it to a map. The method includes a step of defining an objective function that does not contain the regional information requested as output, for deriving the two-dimensional correlation rule; a step (S101) of bucketing a region on the map into a pixel grid of a designated size; a step (S103) of accumulating data from the database for each bucket; a step (S104) of calculating the region that optimizes the defined objective function; a step (S106) of extracting the entities on the map that correspond to the calculated region; and a step of outputting the region applied to the map according to the extracted entities.

    METHOD AND APPARATUS FOR DERIVATION OF OPTIMIZATION COUPLING RULE

    Publication No.: JPH09134365A

    Publication date: 1997-05-20

    Application No.: JP28483695

    Filing date: 1995-11-01

    Applicant: IBM

    Abstract: PROBLEM TO BE SOLVED: To detect at high speed the correlation between data having both numerical and (0-1) attributes by dividing a numerical attribute axis into plural sections, counting for every section the number of data it contains and the number of those data whose (0-1) attribute equals 1, and then performing specific processing. SOLUTION: A bucket processing part 1510 divides the numerical attribute axis corresponding to the numerical attribute into plural sections and, for every section, counts the number of data and the number of data whose (0-1) attribute equals 1. A plane constitution processing part 1520 virtually constructs a plane with a 1st axis corresponding to the cumulative number of data up to each section and a 2nd axis corresponding to the cumulative number of data whose (0-1) attribute equals 1 up to each section. The part 1520 then virtually plots on the plane the point corresponding to these values for each section. Furthermore, a largest tilt line extraction part 1530 extracts, from among the pairs of points whose interval along the 1st axis is larger than T·N (T: rate, N: total number of data), the pair whose connecting line has the largest tilt, and outputs the section between the extracted pair of points.
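    The plotting and largest-tilt extraction can be sketched in Python as follows. This is a plain quadratic search written for illustration, not the high-speed procedure the abstract refers to; the name `max_confidence_range` is hypothetical, and the sketch accepts intervals of width at least T·N, which is an assumption about how the boundary case is treated.

```python
def max_confidence_range(u, v, min_rate):
    """Find the run of consecutive sections whose connecting line is steepest.

    u[i]: number of data in section i
    v[i]: number of data in section i whose (0-1) attribute equals 1
    min_rate: the rate T; only runs holding at least T * N data qualify
              (boundary handling is an assumption of this sketch).
    Returns (slope, first_section, last_section) or None.
    """
    n_total = sum(u)
    # Cumulative points: xs[k] = data in sections 0..k-1, ys[k] = hits there.
    xs, ys = [0], [0]
    for ui, vi in zip(u, v):
        xs.append(xs[-1] + ui)
        ys.append(ys[-1] + vi)

    best = None
    for s in range(len(u)):
        for t in range(s + 1, len(u) + 1):
            width = xs[t] - xs[s]
            if width == 0 or width < min_rate * n_total:
                continue
            # The slope between two cumulative points is the fraction of
            # data with attribute 1 inside sections s..t-1.
            slope = (ys[t] - ys[s]) / width
            if best is None or slope > best[0]:
                best = (slope, s, t - 1)
    return best


result = max_confidence_range([10, 10, 10, 10], [1, 8, 9, 2], 0.5)
```

    In this example the steepest qualifying line spans sections 1 and 2, where 17 of 20 data have the attribute equal to 1.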

    SYSTEM AND METHOD FOR EVALUATING EVALUATION ITEM OF EVALUATED OBJECT HAVING TEMPORAL VARIATION, AND RECORDING MEDIUM

    Publication No.: JP2000348015A

    Publication date: 2000-12-15

    Application No.: JP16023499

    Filing date: 1999-06-07

    Applicant: IBM JAPAN

    Abstract: PROBLEM TO BE SOLVED: To evaluate an object company at a proper point in time by basing the company evaluation on two inputs: the evaluation output calculated with a static model, and dynamic data. SOLUTION: A calculation part 230 for evaluation data by the static model calculates the grading numeral of the object company to be evaluated by applying the data of the object company from an input part 240 to the static model, and outputs the result 234. For the subsequent generation of a dynamic model, the data of the respective sample companies are also applied to the static model to calculate and output their respective grading numerals 232. A dynamic model generation part 250 builds an evaluation model for the sample companies on the basis of the evaluation output 232 of the static model and the input 210 of the dynamic data. A calculation part 260 for evaluation data by the dynamic model applies the dynamic data of the object company from the input part 240 to the dynamic model, together with the output 234 of the static model, to calculate a rating variation value of the object company, and outputs its evaluation data 270.

    METHOD AND APPARATUS FOR RETRIEVAL OF DATABASE

    Publication No.: JPH09134363A

    Publication date: 1997-05-20

    Application No.: JP28541695

    Filing date: 1995-11-01

    Applicant: IBM

    Abstract: PROBLEM TO BE SOLVED: To derive connection rules that hold between a numerical attribute and a (0-1) attribute. SOLUTION: A numerical attribute is divided into plural sections (buckets), and each datum is put into a bucket according to the value of its numerical attribute. Then the number of data contained in every bucket and the number of those data whose (0-1) attribute equals 1 are counted. Then, (s) satisfying the condition of the formula is found, which identifies the starting section. In this expression, ui is the number of data included in a certain section and vi is the number of data in that section whose (0-1) attribute equals 1. Then the ending section corresponding to each starting section is detected, so that the largest section having a certainty higher than a prescribed level α is found. Among the detected pairs of starting and ending sections, the pair that includes the largest number of customers is taken as the answer. Then the necessary data attributes are taken out of the data included in the answer section.
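    The goal stated here, the largest run of sections whose certainty reaches the level α, can be illustrated by exhaustive search. The sketch below does not use the starting-section condition from the abstract, whose formula is not reproduced in this listing; the name `max_support_range` is hypothetical, and treating "higher than α" as "at least α" is an assumption.

```python
def max_support_range(u, v, alpha):
    """Return (size, first_section, last_section) for the largest run of
    consecutive sections whose certainty is at least alpha, or None.

    u[i]: number of data in section i
    v[i]: number of data in section i whose (0-1) attribute equals 1
    Certainty of a run = sum of v over the run / sum of u over the run.
    """
    best = None
    for s in range(len(u)):
        n = hits = 0
        for t in range(s, len(u)):
            n += u[t]
            hits += v[t]
            # Compare hits >= alpha * n to avoid dividing by an empty run.
            if n > 0 and hits >= alpha * n and (best is None or n > best[0]):
                best = (n, s, t)
    return best


loose = max_support_range([10, 10, 10, 10], [1, 8, 9, 0], 0.5)
strict = max_support_range([10, 10, 10, 10], [1, 8, 9, 0], 0.75)
```

    Raising α shrinks the answer: at 0.5 a run of three sections qualifies, while at 0.75 only the two densest sections do. When several runs tie on size, this sketch keeps the first one found.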

    METHOD AND DEVICE FOR MINING SPACE DATA AND RECORDING MEDIUM

    Publication No.: JP2001318938A

    Publication date: 2001-11-16

    Application No.: JP2000135928

    Filing date: 2000-05-09

    Applicant: IBM

    Abstract: PROBLEM TO BE SOLVED: To provide a space data mining method that finds the distance and azimuth themselves that optimize an objective required in many analytical operations, without fixing the distance and azimuth in advance, and derives a space correlation rule. SOLUTION: The space data mining device, which calculates an optimum distance from a database including space information such as addresses, is provided with an input means for inputting an objective function necessary for distance optimization; an intermediate table preparation part 30 for generating an intermediate table by calculating the distance between a start point and a question point on the basis of start point set data and question point set data stored in the database; and an optimum distance calculation part 39 for calculating, on the basis of the intermediate table generated by the preparation part 30, the distance that optimizes the value of the objective function inputted by the input means.

    METHOD AND DEVICE FOR CALCULATING PROBABILITY OF DEFAULT ON OBLIGATION

    Publication No.: JP2000259719A

    Publication date: 2000-09-22

    Application No.: JP5988599

    Filing date: 1999-03-08

    Applicant: IBM

    Abstract: PROBLEM TO BE SOLVED: To provide a method for calculating the probability of bankruptcy that yields the actual probability regardless of the ratio of bankruptcies to non-bankruptcies in the learning data. SOLUTION: The method for calculating the probability that a company defaults on an obligation includes (a) a step of inputting the financial data of a plurality of companies, (b) a step of preparing a decision tree from the financial data, (c) a step of applying the financial data of an objective company to the decision tree, and (d) a step of calculating the probability of default on an obligation from the result of applying the decision tree to the objective company. In addition, step (d) uses Bayes' theorem.
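    The abstract states only that step (d) uses Bayes' theorem. One plausible reading, sketched below in Python, treats the leaf reached by the objective company as the evidence and replaces the training-set ratio with a real-world prior. The function name `default_probability`, its parameters, and this particular formulation are assumptions made for illustration, not the formula from the patent.

```python
def default_probability(leaf_default, leaf_ok, train_default, train_ok,
                        prior_default):
    """Posterior probability of default for a company landing in one leaf.

    leaf_default / leaf_ok: training companies in the leaf that did / did
        not default.
    train_default / train_ok: totals of each class in the training data.
    prior_default: assumed real-world default rate (hypothetical input).
    """
    # Likelihood of reaching this leaf within each class. These ratios do
    # not depend on how many companies of each class were sampled.
    like_default = leaf_default / train_default
    like_ok = leaf_ok / train_ok
    evidence = (prior_default * like_default
                + (1 - prior_default) * like_ok)
    if evidence == 0:
        raise ValueError("leaf contains no training companies")
    # Bayes' theorem: P(default | leaf) = P(leaf | default) P(default) / P(leaf)
    return prior_default * like_default / evidence


balanced = default_probability(40, 10, 100, 100, 0.5)
realistic = default_probability(40, 10, 100, 100, 0.02)
```

    With a balanced training set the leaf looks like an 80% default risk, but under an assumed 2% real-world default rate the same leaf corresponds to roughly 7.5%, which shows why the training ratio should not be read as a probability directly.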

    METHOD AND DEVICE FOR DETERMINING RULES IN DATA BASE

    Publication No.: JPH11345124A

    Publication date: 1999-12-14

    Application No.: JP14979098

    Filing date: 1998-05-29

    Applicant: IBM JAPAN

    Abstract: PROBLEM TO BE SOLVED: To segment a region with a smooth boundary from a plane spanned by two axes corresponding to two numerical predicate attributes of data, and to use the region for estimating the target attribute of data. SOLUTION: A rule for predicting the target attribute value of data in a database is determined in relation to the target attribute of the data. The plane has two axes corresponding to the first and second numerical predicate attributes of the data in the database and is divided into N×M buckets. The method is provided with a step of storing, for each bucket, a value concerning the data belonging to that bucket; a region segmenting step of segmenting from the plane a region of buckets that satisfies prescribed conditions; a step of smoothing the boundary of the segmented region; and a rule determining step of determining, from the smoothed region, the rule for predicting the target attribute value of given data.
