Invention Grant
- Patent Title: Dataset relevance estimation in storage systems
- Application No.: US15660434
- Application Date: 2017-07-26
- Publication No.: US10592147B2
- Publication Date: 2020-03-17
- Inventor: Giovanni Cherubini, Mark A. Lantz, Taras Lehinevych, Vinodh Venkatesan
- Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
- Applicant Address: US NY Armonk
- Assignee: International Business Machines Corporation
- Current Assignee: International Business Machines Corporation
- Current Assignee Address: US NY Armonk
- Agent: Anthony M. Pallone
- Main IPC: G06F16/30
- IPC: G06F16/30; G06F3/06

Abstract:
The invention is notably directed to computer-implemented methods and systems for managing datasets in a storage system. In such systems, it is assumed that a (typically small) subset of datasets are labeled with respect to their relevance, so as to be associated with respective relevance values. Essentially, the present methods determine, for each unlabeled dataset of the datasets, a respective probability distribution over a set of relevance values. From this probability distribution, a corresponding relevance value can be obtained. This probability distribution is computed based on distances (or similarities), in terms of metadata values, between said each unlabeled dataset and the labeled datasets. Based on their associated relevance values, datasets can then be efficiently managed in a storage system.
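The estimation step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the distance measure or how distances are turned into probabilities, so everything below is an assumption made for illustration: metadata values are encoded as numeric tuples, distance is Euclidean, and each labeled dataset contributes a Gaussian-kernel weight to its own relevance value. The names `relevance_distribution`, `estimate_relevance`, and `bandwidth` are hypothetical and do not come from the patent.

```python
import math
from collections import defaultdict


def relevance_distribution(unlabeled, labeled, bandwidth=1.0):
    """Return {relevance_value: probability} for one unlabeled dataset.

    unlabeled: tuple of numeric metadata values (assumed encoding).
    labeled:   list of (metadata_tuple, relevance_value) pairs.
    bandwidth: kernel width (hypothetical tuning parameter).
    """
    if not labeled:
        raise ValueError("at least one labeled dataset is required")

    weights = defaultdict(float)
    for metadata, relevance in labeled:
        # Assumption: Euclidean distance between metadata vectors.
        dist = math.dist(unlabeled, metadata)
        # Assumption: closer labeled datasets get exponentially more weight.
        weights[relevance] += math.exp(-((dist / bandwidth) ** 2))

    total = sum(weights.values())
    if total == 0.0:
        # All weights underflowed to zero: fall back to a uniform distribution.
        values = {relevance for _, relevance in labeled}
        return {value: 1.0 / len(values) for value in values}

    # Normalize so the weights form a probability distribution.
    return {value: weight / total for value, weight in weights.items()}


def estimate_relevance(distribution):
    """Pick the most probable relevance value from the distribution."""
    return max(distribution, key=distribution.get)


# Usage: two labeled datasets near the query carry relevance 1, one far away carries 3.
labeled = [((0.0, 0.0), 1), ((0.1, 0.0), 1), ((5.0, 5.0), 3)]
dist = relevance_distribution((0.0, 0.1), labeled)
best = estimate_relevance(dist)
```

Taking the most probable value is only one way to derive a relevance value from the distribution; an expected value over the relevance values would be another, and the abstract leaves this choice open.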
Public/Granted literature
- US20190034083A1 DATASET RELEVANCE ESTIMATION IN STORAGE SYSTEMS Public/Granted day: 2019-01-31