Machine learning to improve caching efficiency in a storage system
Abstract:
A system and method improve caching efficiency in a data storage system by performing machine learning processes on metadata relating to extents of data blocks, rather than individual blocks themselves. Thus, once the storage devices are divided into extents, various metadata regarding access to the blocks within each extent are aggregated, and per-extent features are extracted. These features are used to train a data regression model that is subsequently used to infer a most likely “hotness” value for each extent at a future time. These predicted values, which may be further classified as e.g. “hot”, “warm”, and “cold” using thresholds, are used to implement the cache replacement policy. Embodiments scale to large and multi-layered caches, and may avoid common caching problems like thrashing, by adjusting the extent size. Policy goal functions may be optimized by dynamically adjusting the classification thresholds.
Public/Granted literature
Information query
Patent Agency Ranking
0/0