METHOD AND ARRANGEMENT FOR HANDLING DATA SETS, DATA PROCESSING PROGRAM AND COMPUTER PROGRAM PRODUCT
    Patent application (status: under examination, published)

    Publication No.: WO2012034733A3

    Publication date: 2012-11-08

    Application No.: PCT/EP2011062074

    Filing date: 2011-07-14

    CPC classification number: G06F17/10 G06F17/30569 G06F17/30598

    Abstract: An improved method for handling data sets (12, 14) is disclosed. The method comprises the steps of: providing a first characteristic (20.1) associated with a first data set (12) and at least one of the following: a single data value (12') and a second characteristic (20.2) associated with a second data set (14); the provided characteristics allowing feasible comparison of the first data set (12), the second data set (14) and the single data value (12'), and calculating at least one of the following: similarity of the first data set (12) with the second data set (14) based on the first and second characteristics (20.1, 20.2), similarity of the first data set (12) with the single data value (12') based on the first characteristic (20.1) and the single data value (12'), confidence indicating how well the first characteristic reflects properties of the first data set (12) based on the first characteristic, and confidence indicating how well the similarity of the first data set with the single data value (12') reflects properties of the single data value based on the first characteristic and the single data value (12').


    System and method for data quality monitoring

    Publication No.: DE102012210794A1

    Publication date: 2013-02-07

    Application No.: DE102012210794

    Filing date: 2012-06-26

    Applicant: IBM

    Abstract: Data quality monitoring refers to measuring the data quality of loaded data with respect to a predefined data quality metric. The data quality is measured by applying a logical calculus defined in quality rules to the loaded data. The data quality measurement is performed using at least one of the following: delta changes of the loaded data and delta changes of the quality rules.
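
The delta-based measurement described in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the class name `IncrementalQualityMonitor`, its methods, and the choice of metric (share of passed rule checks) are assumptions made for illustration. The point it shows is that a data delta re-evaluates only the changed rows, and a rule delta re-evaluates only the changed rule.

```python
class IncrementalQualityMonitor:
    """Hypothetical sketch: tracks rule violations and updates them from deltas."""

    def __init__(self, rules):
        # rules: mapping of rule name -> predicate(row) returning True if the row passes
        self.rules = dict(rules)
        self.rows = {}
        self.violations = {name: set() for name in self.rules}

    def apply_data_delta(self, upserts, deletes=()):
        """Re-check only the rows that were inserted, updated, or deleted."""
        for row_id in deletes:
            self.rows.pop(row_id, None)
            for failed in self.violations.values():
                failed.discard(row_id)
        for row_id, row in upserts.items():
            self.rows[row_id] = row
            for name, rule in self.rules.items():
                if rule(row):
                    self.violations[name].discard(row_id)
                else:
                    self.violations[name].add(row_id)

    def apply_rule_delta(self, name, rule):
        """Re-check all rows, but only against the rule that was added or changed."""
        self.rules[name] = rule
        self.violations[name] = {
            row_id for row_id, row in self.rows.items() if not rule(row)
        }

    def quality(self):
        """Assumed metric: fraction of (row, rule) checks that pass."""
        checks = len(self.rows) * len(self.rules)
        if checks == 0:
            return 1.0
        failed = sum(len(ids) for ids in self.violations.values())
        return 1 - failed / checks


monitor = IncrementalQualityMonitor({"age_non_negative": lambda r: r["age"] >= 0})
monitor.apply_data_delta({1: {"age": 5}, 2: {"age": -1}})
first = monitor.quality()
monitor.apply_data_delta({2: {"age": 3}})          # data delta: only row 2 is re-checked
second = monitor.quality()
monitor.apply_rule_delta("adult", lambda r: r["age"] >= 18)  # rule delta
third = monitor.quality()
```

Keeping the set of violating row ids per rule is what makes both kinds of delta cheap in this sketch; a real system would persist that state rather than hold it in memory.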

    Generation of analysis reports using trusted and public distributed file systems

    Publication No.: GB2523761A

    Publication date: 2015-09-09

    Application No.: GB201403752

    Filing date: 2014-03-04

    Applicant: IBM

    Abstract: A data processing system for: receiving an analysis request comprising multiple data analysis commands to generate an analysis report; dividing the commands into private analysis commands and public analysis commands; sending the private analysis commands to a trusted distributed file system; sending a portion of the public analysis commands to a public distributed file system; sending the remainder of the public analysis commands to the trusted distributed file system; and generating the analysis report using public analysis results from the public distributed file system and trusted analysis results from the trusted distributed file system.

    A method for a logging process in a data storage system

    Publication No.: GB2516872A

    Publication date: 2015-02-11

    Application No.: GB201313863

    Filing date: 2013-08-02

    Applicant: IBM

    Abstract: A method for a logging process in a data storage system (10C) including a set of storage tiers (115), each storage tier of the set of storage tiers (115) having different performance characteristics (e.g. error rate, communication rate, power consumption, delay time). The set of storage tiers (115) is divided into subsets (115A, 115B, 115C, 115D) using the performance characteristics. The logging process is initialized for creating a separate log file (121A, 121B, 121C, 121D) for each of the subsets of storage tiers (115A, 115B, 115C, 115D) for maintaining a history of data changes in the subset of storage tiers, thereby creating a plurality of log files (121); in response to a change in data stored in at least one storage tier of a subset of storage tiers (115), generating one or more log records comprising information about the change, and writing the one or more log records into the respective log files (121A, 121B, 121C, 121D). Such log files may be used during backup and restoration.
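
As a rough illustration of the logging scheme in this abstract, the Python sketch below groups tiers into subsets by one performance characteristic and keeps one log per subset. Everything named here is an assumption for illustration: the `group_tiers` helper, the `TieredLogger` class, the single delay threshold, and the use of in-memory lists in place of log files.

```python
def group_tiers(tier_delays_ms, threshold_ms):
    """Assign each tier to a subset based on its delay time (assumed criterion)."""
    return {
        tier: ("fast" if delay <= threshold_ms else "slow")
        for tier, delay in tier_delays_ms.items()
    }


class TieredLogger:
    """Hypothetical sketch: one separate log per subset of storage tiers."""

    def __init__(self, tier_to_subset):
        self.tier_to_subset = dict(tier_to_subset)
        # In-memory lists stand in for the separate log files.
        self.logs = {subset: [] for subset in set(self.tier_to_subset.values())}

    def record_change(self, tier, key, old_value, new_value):
        """Write a log record about the change into the log of the tier's subset."""
        subset = self.tier_to_subset[tier]
        record = {"tier": tier, "key": key, "old": old_value, "new": new_value}
        self.logs[subset].append(record)
        return record


subsets = group_tiers({"ssd": 1, "hdd": 10, "tape": 5000}, threshold_ms=20)
logger = TieredLogger(subsets)
logger.record_change("ssd", "block-7", b"a", b"b")
logger.record_change("tape", "block-9", b"x", b"y")
```

Because each subset has its own log, a restore of the slow tiers in this sketch would only need to replay `logger.logs["slow"]`.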

    Discovering composite keys
    Invention patent

    Publication No.: GB2505183A

    Publication date: 2014-02-26

    Application No.: GB201214851

    Filing date: 2012-08-21

    Applicant: IBM

    Abstract: A computer-implemented method for detecting one or more multi-column composite key column sets, the method comprising: accessing (102) a plurality of first columns (P1-P3); selecting (104) two or more of the first columns for use as a current set (218) of candidate columns; determining (106), by comparing object identifiers stored in association with parameter values of the candidate columns with each other, if for the current set of candidate columns at least one tuple (219) of parameter values exists whose parameter values are respectively stored in association with two or more shared ones of the object identifiers; in case said at least one tuple does not exist, identifying (110) the current candidate column set as a multi-column composite key column set; otherwise, replacing (112) the second candidate column by another selected one of the first columns or adding said other selected one of the first columns to the candidate column set.
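
The duplicate-tuple test at the core of this abstract can be sketched in Python as follows. This is a simplified illustration, not the claimed procedure: the function names are hypothetical, the per-column index (value to set of object identifiers) is an assumed storage layout, and candidate sets are enumerated exhaustively by size instead of being grown by replacing or adding one column at a time. Unique single columns are not treated specially.

```python
from itertools import combinations


def build_index(rows, column):
    """Map each value of `column` to the set of object identifiers that carry it."""
    index = {}
    for object_id, row in rows.items():
        index.setdefault(row[column], set()).add(object_id)
    return index


def has_shared_tuple(indexes, candidate):
    """True if some tuple of values over `candidate` is shared by 2+ object ids."""
    # Start with the groups of object ids that share a value in the first column,
    # then keep narrowing them by intersecting with the groups of each next column.
    groups = [ids for ids in indexes[candidate[0]].values() if len(ids) > 1]
    for column in candidate[1:]:
        groups = [
            group & ids
            for group in groups
            for ids in indexes[column].values()
            if len(group & ids) > 1
        ]
    return bool(groups)


def find_composite_keys(rows, columns, max_size=3):
    """Return minimal column sets (size >= 2) with no duplicated value tuple."""
    indexes = {column: build_index(rows, column) for column in columns}
    found = []
    for size in range(2, max_size + 1):
        for candidate in combinations(columns, size):
            if any(set(key) <= set(candidate) for key in found):
                continue  # a subset is already a key, so this set is not minimal
            if not has_shared_tuple(indexes, candidate):
                found.append(candidate)
    return found


rows = {
    1: {"order": "A", "line": 1, "sku": "x"},
    2: {"order": "A", "line": 2, "sku": "x"},
    3: {"order": "B", "line": 1, "sku": "y"},
}
keys = find_composite_keys(rows, ["order", "line", "sku"])
```

Here `("order", "sku")` is rejected because objects 1 and 2 share the tuple `("A", "x")`, while the other two pairs have no shared tuple and are reported as composite keys.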

    Method and system for accessing a set of data tables in a source database

    Publication No.: GB2517787A

    Publication date: 2015-03-04

    Application No.: GB201315611

    Filing date: 2013-09-03

    Applicant: IBM

    Abstract: A method for accessing a set of data tables in a source database (117), the method comprising: providing a set of table categories for tables in the source database; providing a set of metrics (such as read access rates, number of records, number of primary keys), each metric comprising a respective characteristic metric for each table category; for each table of the set of data tables: evaluating the set of metrics, analyzing the evaluated set of metrics, and categorizing the table into one of the set of table categories using the result of the analysis; outputting information indicative of the table category of each table of the set of tables; in response to the outputting, receiving a request to select data tables of the set of data tables according to a part of the table categories for data processing; and selecting a subset of data tables of the set of data tables using the table categories for performing the data processing (e.g. ETL or data mining) on the subset of data tables.
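
One way to picture the categorization and selection steps in this abstract is the Python sketch below. The abstract does not say how the evaluated metrics are analyzed, so the nearest-profile rule used here (sum of relative differences to each category's characteristic metric values) is purely an assumption, as are the function names, category names, and metric names.

```python
def categorize(table_metrics, category_profiles):
    """Pick the category whose characteristic metric values are closest (assumed rule)."""
    def distance(profile):
        return sum(
            abs(table_metrics[name] - value) / max(abs(value), 1)
            for name, value in profile.items()
        )
    return min(category_profiles, key=lambda cat: distance(category_profiles[cat]))


def select_tables(categories_by_table, wanted_categories):
    """Select the subset of tables whose category was requested for processing."""
    return sorted(
        table for table, cat in categories_by_table.items() if cat in wanted_categories
    )


# Hypothetical characteristic metric values per table category.
profiles = {
    "transactional": {"read_rate": 1000, "row_count": 1_000_000},
    "reference": {"read_rate": 10, "row_count": 100},
}
# Hypothetical evaluated metrics per table.
tables = {
    "orders": {"read_rate": 800, "row_count": 900_000},
    "countries": {"read_rate": 5, "row_count": 200},
}
categories = {name: categorize(metrics, profiles) for name, metrics in tables.items()}
selected = select_tables(categories, {"transactional"})
```

In this sketch the `categories` mapping plays the role of the output shown to the user, and `select_tables` corresponds to the later request that restricts ETL or data mining to some categories.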

    Backup management for a plurality of logical partitions

    Publication No.: GB2515537A

    Publication date: 2014-12-31

    Application No.: GB201311435

    Filing date: 2013-06-27

    Applicant: IBM

    Abstract: A method for managing backups comprises the provision of a computer system with main memory and a plurality of logical partitions (LPARs), each assigned a respective first portion of memory and each running at least one application that consumes a fraction of that first memory portion. A second portion of memory, not overlapping with the first portions, is used as global memory and, for each LPAR, stores images of the first memory portion consumed by the application on that logical partition. The application may be a database management program, whilst images may be created by copy-on-write, split-mirror or redirect-on-write. The image may be a complete image of the assigned first memory portion. Memory elements may be dynamically reallocated to resize the global memory and/or a first memory portion; and sub-portions of global memory may be dynamically resized according to requirement predictions.

    Detecting Multi-Column Composite Key Column Sets

    Publication No.: DE102013215530A1

    Publication date: 2014-02-27

    Application No.: DE102013215530

    Filing date: 2013-08-07

    Applicant: IBM

    Abstract: The invention relates to a computer-implemented method for detecting one or more multi-column composite key column sets, the method comprising: a) accessing (102) a plurality of first columns (P1 to P3); b) selecting (104) two or more of the first columns for use as a current set (218) of candidate columns; c) determining (106), by comparing with each other object identifiers stored in association with parameter values of the candidate columns, whether for the current set of candidate columns at least one tuple (219) of parameter values exists whose parameter values are each stored in association with two or more shared ones of the object identifiers; d1) if the at least one tuple does not exist, identifying (110) the current candidate column set as a multi-column composite key column set; d2) otherwise, replacing (112) the second candidate column with another selected one of the first columns, or adding the other selected one of the first columns to the candidate column set.
