Abstract:
A system and a method for diagnosis and clinical test selection using case-based machine learning inference are provided to supply accurate diagnostic information by using an NNDT(Neural Network Decision Tree) learner and a random forest learner at the same time. The system comprises a patient case database(200), an input device(100), and a machine learning classification device(400) made up of one or more machine learning devices(410). A machine learning trainer(300) determines the important test items for each disease and stores them in a test items database(600). A diagnoser(500), which includes a test item selector(520), derives the main test item selection result for each disease. An output device(700) outputs the test item selection result and the disease determination.
Abstract:
A method for searching for a classifier gene set in a microarray dataset is provided to stably select the classifier gene set from microarray datasets with various characteristics by minimizing the typical problems of such datasets, including small sample sizes, the presence of outliers, and unequal distribution of data across classes. The method comprises the steps of: (a) discretizing the expression value data of the microarray dataset to produce a discretized gene expression profile(S100); (b) filtering the genes by keeping those whose gene-class association value, calculated from the discretized gene expression profile by using Fisher's exact test, is less than or equal to a predetermined value, and removing those whose gene-class association value is higher than the predetermined value(S200); (c) initializing the classifier gene set by selecting the gene with the smallest gene-class association value from the filtered genes(S300); (d) selecting the gene with the smallest value obtained by dividing its gene-class association value by the expression-pattern overlap value between the filtered genes, also calculated by Fisher's exact test, and adding the selected gene to the initialized classifier gene set(S400); and (e) evaluating the sample classification error of the classifier gene set formed in step (d) and determining whether an additional gene is to be added to the classifier gene set(S500).
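Steps (a) to (d) can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not state: expression values are discretized to high/low around each gene's median, the gene-class association value and the overlap value are both taken to be two-sided Fisher's exact test p-values, the overlap against several already-selected genes is taken as the smallest pairwise p-value, and step (e) is replaced by a simple cap on the set size (`max_genes`). All function and parameter names are hypothetical.

```python
from math import comb
from statistics import median


def fisher_exact_p(xs, ys):
    """Two-sided Fisher's exact test p-value for two equal-length 0/1 sequences."""
    a = sum(1 for x, y in zip(xs, ys) if x == 1 and y == 1)
    b = sum(1 for x, y in zip(xs, ys) if x == 1 and y == 0)
    c = sum(1 for x, y in zip(xs, ys) if x == 0 and y == 1)
    d = sum(1 for x, y in zip(xs, ys) if x == 0 and y == 0)
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(k):
        # Hypergeometric probability of k in the top-left cell, margins fixed.
        return comb(col1, k) * comb(n - col1, row1 - k) / comb(n, row1)

    observed = prob(a)
    lo, hi = max(0, row1 + col1 - n), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= observed * (1 + 1e-9))


def discretize(values):
    """Step (a), assumed rule: 1 if above the gene's median, else 0."""
    m = median(values)
    return [1 if v > m else 0 for v in values]


def select_genes(expression, labels, p_cutoff=0.05, max_genes=3):
    profiles = {g: discretize(v) for g, v in expression.items()}        # (a)
    assoc = {g: fisher_exact_p(p, labels) for g, p in profiles.items()}
    candidates = {g for g, p in assoc.items() if p <= p_cutoff}         # (b)
    if not candidates:
        return []
    first = min(candidates, key=lambda g: (assoc[g], g))                # (c)
    selected = [first]
    candidates.remove(first)
    while candidates and len(selected) < max_genes:                     # (d)
        def score(g):
            # Assumed aggregation: the most redundant pairing decides the overlap.
            overlap = min(fisher_exact_p(profiles[g], profiles[s]) for s in selected)
            return assoc[g] / max(overlap, 1e-300)
        best = min(candidates, key=lambda g: (score(g), g))
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy data: 8 samples of class 1 followed by 8 samples of class 0.
labels = [1] * 8 + [0] * 8
patterns = {
    "geneA": [0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
    "geneB": [0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0],  # copy of geneA
    "geneC": [1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1],
    "geneD": [1, 0] * 8,                                         # unrelated to class
}
expression = {g: [10.0 if bit else 1.0 for bit in p] for g, p in patterns.items()}
print(select_genes(expression, labels, max_genes=2))  # ['geneA', 'geneC']
```

In this toy run geneD is removed by the filter, and geneC is preferred over geneB because geneB duplicates the expression pattern of the gene already selected, which makes its overlap p-value very small and its ratio large.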
Abstract:
A semantic information-based grid management system supporting grid computing and a method thereof are provided to infer the program suited to the requested work and its resource requirements, and to generate a grid work detail file for assigning grid resources based on the inferred information, by constructing application information in the form of an ontology and placing the management system between a user and the grid middleware. An ontology data manager(140) defines an ontology representing the information for analysis in an application field and stores ontology data collected from an information provider in an ontology repository(130). A grid resource manager(180) collects the grid resource information of each grid resource and stores it in a grid resource table(190). Grid middleware(160) assigns a program to each grid resource according to the grid work details and transfers the program execution result to the user. An inference engine(120) determines the optimal program for performing the analysis based on the ontology data and the grid resource information, and stores a list of the grid resources that will run the optimal program. A grid work detail generator(150) creates the grid work details by using the optimal program and the grid resource list.
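The interplay of the inference engine(120) and the grid work detail generator(150) might look roughly like the sketch below. Everything in it is an assumption made for illustration: the ontology data is reduced to a flat list of `Program` records, "optimal" is taken to mean "runnable on the largest number of grid resources", and the work detail is a plain dictionary rather than the format of any real grid middleware.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Program:
    # Hypothetical stand-in for one program entry in the ontology repository.
    name: str
    analysis: str
    min_memory_gb: int
    min_cpus: int


@dataclass(frozen=True)
class Resource:
    # Hypothetical stand-in for one row of the grid resource table.
    host: str
    memory_gb: int
    cpus: int


def infer(analysis, programs, resources):
    """Pick the program for `analysis` that the most resources can run.

    Returns (program, hosts); program is None when nothing fits.
    """
    best, best_hosts = None, []
    for prog in programs:
        if prog.analysis != analysis:
            continue
        hosts = [r.host for r in resources
                 if r.memory_gb >= prog.min_memory_gb and r.cpus >= prog.min_cpus]
        if len(hosts) > len(best_hosts):
            best, best_hosts = prog, hosts
    return best, best_hosts


def make_work_detail(program, hosts, input_file):
    """Build a work detail record (assumed layout, not a real middleware format)."""
    return {"executable": program.name, "arguments": [input_file], "hosts": hosts}


programs = [
    Program("aligner_small", "alignment", min_memory_gb=4, min_cpus=2),
    Program("aligner_large", "alignment", min_memory_gb=64, min_cpus=16),
    Program("tree_builder", "phylogeny", min_memory_gb=8, min_cpus=4),
]
resources = [
    Resource("node1", memory_gb=8, cpus=4),
    Resource("node2", memory_gb=128, cpus=32),
    Resource("node3", memory_gb=2, cpus=1),
]

program, hosts = infer("alignment", programs, resources)
print(make_work_detail(program, hosts, "input.fasta"))
```

A real inference engine would reason over ontology relations rather than compare two integers, so the matching rule here only marks where that reasoning would plug in.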
Abstract:
A method for bottom-up detection of protein modifications using a mass shift list table, and a program storage device, are provided to consider all possible modifications rather than only specific ones, and to detect protein modifications quickly and accurately by using a tandem mass spectrometry (MS/MS) ion search method in bottom-up protein analysis. The mass shift list table is made by comparing the masses of fragment ions measured in an MS/MS experiment with the theoretical masses of those fragment ions(S100). The position at which a protein modification occurs is found by searching the generated list in the sequence order of the protein(S200). A group of candidate protein modifications that may be present within the error range is collected by searching a protein modification database using the mass shift caused by the modification and the characteristics of the amino acid at the modified position(S300).
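A minimal sketch of steps S100 to S300 follows. It assumes a complete, ordered series of singly charged b-ions (real spectra have missing and noisy peaks), a fixed absolute mass tolerance, and a tiny in-memory table in place of a protein modification database. The function names and the `MOD_DB` layout are hypothetical.

```python
# Monoisotopic residue masses (Da) for the few amino acids used in this toy example.
RESIDUE_MASS = {"G": 57.02146, "A": 71.03711, "S": 87.03203,
                "K": 128.09496, "M": 131.04049}
PROTON = 1.00728  # mass added for a singly charged b-ion

# Hypothetical stand-in for a protein modification database:
# (name, mass shift in Da, residues on which the modification can occur).
MOD_DB = [
    ("phosphorylation", 79.96633, "STY"),
    ("acetylation", 42.01057, "K"),
    ("oxidation", 15.99491, "M"),
]


def theoretical_b_ions(peptide):
    """Cumulative b-ion masses b1..bn for an unmodified peptide."""
    masses, total = [], PROTON
    for aa in peptide:
        total += RESIDUE_MASS[aa]
        masses.append(total)
    return masses


def mass_shift_table(peptide, observed_b):
    """S100: observed minus theoretical mass for each fragment ion."""
    return [obs - theo for obs, theo in zip(observed_b, theoretical_b_ions(peptide))]


def find_modified_positions(shifts, tol=0.01):
    """S200: walk the table in sequence order; a jump in the shift marks a position."""
    hits, previous = [], 0.0
    for i, shift in enumerate(shifts):
        delta = shift - previous
        if abs(delta) > tol:
            hits.append((i, delta))
        previous = shift
    return hits


def candidate_modifications(peptide, observed_b, tol=0.01):
    """S300: match each jump against the database by mass and residue type."""
    shifts = mass_shift_table(peptide, observed_b)
    return [(name, pos, peptide[pos])
            for pos, delta in find_modified_positions(shifts, tol)
            for name, mass, residues in MOD_DB
            if abs(delta - mass) <= tol and peptide[pos] in residues]


# Toy spectrum: the serine at index 2 carries a phosphate group.
peptide = "GASKM"
theo = theoretical_b_ions(peptide)
observed = [m + (79.96633 if i >= 2 else 0.0) for i, m in enumerate(theo)]
print(candidate_modifications(peptide, observed))  # [('phosphorylation', 2, 'S')]
```

Because b-ion masses are cumulative, every fragment that contains the modified residue carries the same shift, so the first fragment at which the shift changes identifies the modified position.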
Abstract:
A method for aligning genome sequences in a grid computing environment and a program storage medium are provided to make efficient use, under the grid computing environment, of existing sequence alignment programs that have difficulty comparing a large quantity of genome sequences because of limited computing resources. The first and second genome sequences are cut into fragments with a specific overlapping section, chosen based on the sizes of the first and second sequences and the computational algorithm used to align them, and the repeated sequences in the cut fragments are marked(S100). The first and second genome sequence fragments are distributed to the computers of the grid computing environment and aligned by using the sequence alignment program(S200). The generated alignment results are combined and only the statistically meaningful sequence information is extracted(S300).
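The fragment, distribute, and merge flow of S100 to S300 can be sketched as follows. This is an illustration under stated assumptions, not the patented procedure: a local thread pool stands in for the grid, the hypothetical `align_pair` counts shared k-mers instead of calling a real sequence alignment program, a fixed score cutoff stands in for a statistical significance test, and the marking of repeated sequences is omitted.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def fragment(seq, size, overlap):
    """S100: cut seq into windows of `size` that overlap by `overlap` characters.

    Returns a list of (offset, fragment) pairs.
    """
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    step = size - overlap
    frags = []
    for start in range(0, len(seq), step):
        frags.append((start, seq[start:start + size]))
        if start + size >= len(seq):
            break  # the last window already reaches the end of the sequence
    return frags


def align_pair(job, k=4):
    """Stand-in for an alignment program: number of k-mers of b that occur in a."""
    (off_a, a), (off_b, b) = job
    kmers_a = {a[i:i + k] for i in range(len(a) - k + 1)}
    shared = sum(1 for i in range(len(b) - k + 1) if b[i:i + k] in kmers_a)
    return off_a, off_b, shared


def grid_align(seq_a, seq_b, size=12, overlap=4, min_score=3, workers=4):
    # S200: every fragment pair is an independent job, so it can run anywhere.
    jobs = list(product(fragment(seq_a, size, overlap),
                        fragment(seq_b, size, overlap)))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(align_pair, jobs))
    # S300: combine the results and keep only hits above the (assumed) cutoff.
    return sorted(r for r in results if r[2] >= min_score)


print(fragment("ABCDEFGHIJ", size=4, overlap=2))
# [(0, 'ABCD'), (2, 'CDEF'), (4, 'EFGH'), (6, 'GHIJ')]
```

The overlap matters because a match that straddles a cut point would otherwise be split between two fragments and could be missed by both jobs.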