Abstract:
The present invention generally provides a method for grid storage including balancing read and write requests from applications across a first group of nodes in a grid storage system for avoiding hot spots and optimizing performance through smart caching; balancing storage capacity across a second group of nodes in the grid storage system, nodes in the first and second groups being at least one of hardware interchangeable online, capable of being added to change performance or capacity of the grid storage system, and capable of being removed to change performance or capacity of the grid storage system; and self-managing of the first and second groups of nodes for providing at least one of scalability, self-healing after failure of components in the grid storage system, non-disruptive upgrades to the grid storage system, and elimination of duplicate data on an object or sub-object level in the grid storage system.
Abstract:
Systems and methods are provided for data management and data processing. For example, various embodiments may include systems and methods in which relatively larger groups of data are selected with comparable or better selection results (e.g., high data redundancy elimination and/or average chunk size). In various embodiments, the systems and methods may include, for example, a data group, block, or chunk combining technique and/or a data group, block, or chunk splitting technique. Various embodiments may include a first standard or typical data grouping, blocking, or chunking technique and/or a data group, block, or chunk combining technique and/or a data group, block, or chunk splitting technique. Exemplary systems and methods may relate to data hashing and/or data elimination. Embodiments may include a look-ahead buffer and may determine whether to emit small chunks or large chunks based on characteristics of the underlying data and/or the particular application of the invention (e.g., backup).
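The choice between small and large chunks can be illustrated with a minimal sketch. It assumes the stream has already been cut into small chunks, and it uses one simple policy that is not taken from the abstract: if none of the small chunks in the look-ahead buffer has been seen before, they are combined into one large chunk; otherwise a single small chunk is emitted. The names `emit_chunks`, `seen`, and `lookahead` are hypothetical.

```python
import hashlib


def digest(chunk: bytes) -> str:
    """Content fingerprint used to recognise data that was seen before."""
    return hashlib.sha256(chunk).hexdigest()


def emit_chunks(small_chunks, seen, lookahead=4):
    """Emit large chunks over new data and small chunks near known data.

    `seen` is a hypothetical set of digests of content already known to
    the store; it is updated in place as chunks are emitted.
    """
    emitted = []
    i = 0
    while i < len(small_chunks):
        window = small_chunks[i:i + lookahead]  # the look-ahead buffer
        is_full = len(window) == lookahead
        if is_full and all(digest(c) not in seen for c in window):
            # Nothing in the buffer is a duplicate: combine into one large chunk.
            large = b"".join(window)
            emitted.append(large)
            seen.add(digest(large))
            seen.update(digest(c) for c in window)
            i += lookahead
        else:
            # A duplicate is nearby (or the stream is ending): stay fine-grained.
            small = small_chunks[i]
            emitted.append(small)
            seen.add(digest(small))
            i += 1
    return emitted
```

A real system would derive the small chunks from content-defined boundaries rather than take them as given, and would tune the policy to the workload (for example, backup streams).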
Abstract:
A system and method for storing data in a content-addressable system is provided. The system includes a content-addressable storage system and a persistent cache. The persistent cache includes a temporary address generator that is configured to generate a temporary address which is associated with data to be stored in the persistent cache, and a non-content-addressable storage system configured to store and retrieve data in the persistent cache using the temporary address. The persistent cache further comprises an address translator configured to map a temporary address associated with the data in the non-content-addressable storage system to a content address associated with the data in the content-addressable storage system.
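The relationship between the three parts of the persistent cache can be sketched as follows. This is an in-memory illustration under stated assumptions, not the patented design: the class and method names are hypothetical, SHA-256 stands in for whatever content-address function the system uses, and nothing here is actually persisted to disk.

```python
import hashlib
import itertools


class ContentAddressableStore:
    """Toy CAS: a blob's address is the SHA-256 hex digest of its bytes."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        address = hashlib.sha256(data).hexdigest()
        self._blobs[address] = data
        return address

    def get(self, address: str) -> bytes:
        return self._blobs[address]


class PersistentCache:
    """Holds data under temporary addresses until it is flushed to the CAS."""

    def __init__(self, cas: ContentAddressableStore):
        self._cas = cas
        self._counter = itertools.count(1)  # temporary address generator
        self._staged = {}                   # non-content-addressable storage
        self._translation = {}              # temporary -> content address

    def write(self, data: bytes) -> str:
        temp = f"tmp-{next(self._counter)}"
        self._staged[temp] = data
        return temp

    def flush(self, temp: str) -> str:
        """Move staged data into the CAS and record the address mapping."""
        if temp in self._translation:
            return self._translation[temp]
        content_address = self._cas.put(self._staged.pop(temp))
        self._translation[temp] = content_address
        return content_address

    def read(self, temp: str) -> bytes:
        """Serve reads by temporary address, before or after the flush."""
        if temp in self._staged:
            return self._staged[temp]
        return self._cas.get(self._translation[temp])
```

The point of the translator is that a caller can keep using the temporary address it was handed at write time, even after the data has moved into the content-addressable store.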
Abstract:
A fixed prefix peer to peer network has a number of physical nodes. The nodes are logically divided into a number of storage slots. Blocks of data are erasure coded into original and redundant data fragments and the resultant fragments of data are stored in slots on separate physical nodes such that no physical node has more than one original and/or redundant fragment. The storage locations of all of the fragments are organized into a logical virtual node (e.g., a supernode). Thus, the supernode and the original block of data can be recovered even if some of the physical nodes are lost.
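The fragment placement rule can be sketched with the simplest possible erasure code. The example below is an assumption-laden illustration: it uses a single XOR parity fragment (so it tolerates only one lost node), whereas the abstract does not say which code is used or how many redundant fragments exist. The function names and the dictionary that stands in for the supernode's fragment locations are hypothetical.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encode(block: bytes, k: int) -> list[bytes]:
    """Split a block into k original fragments plus one XOR parity fragment."""
    size = -(-len(block) // k)  # ceiling division
    padded = block.ljust(size * k, b"\0")
    originals = [padded[i * size:(i + 1) * size] for i in range(k)]
    parity = reduce(xor_bytes, originals)
    return originals + [parity]


def place(fragments, nodes):
    """Assign each fragment to a different physical node (one per node)."""
    if len(nodes) < len(fragments):
        raise ValueError("need at least one physical node per fragment")
    return {node: frag for node, frag in zip(nodes, fragments)}


def recover(placement, nodes, k, length):
    """Rebuild the block when at most one fragment-holding node is lost."""
    frags = [placement.get(node) for node in nodes[:k + 1]]
    missing = [i for i, frag in enumerate(frags) if frag is None]
    if len(missing) > 1:
        raise ValueError("single parity tolerates only one lost fragment")
    if missing:
        # XOR of all surviving fragments reproduces the missing one.
        survivors = [frag for frag in frags if frag is not None]
        frags[missing[0]] = reduce(xor_bytes, survivors)
    return b"".join(frags[:k])[:length]
```

Because every fragment sits on a different node, losing one node removes at most one fragment, which is exactly the failure this code can repair. A production system would use a code with several redundant fragments, such as Reed-Solomon, to survive multiple node losses.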
Abstract:
An architecture for a peer-to-peer network is disclosed which advantageously is able to maintain short fixed path length routing as the network grows.
Abstract:
A system and method are disclosed for improving the efficiency of a storage system. At least one application-oriented property is associated with data to be stored on a storage system. Based on the at least one application-oriented property, a manner of implementing at least one caching function in the storage system is determined. Data placement and data movement are controlled in the storage system to implement the at least one caching function.
Abstract:
Information, such as files received from a client, is stored in a storage system, such as a content-addressable storage system. A file server receives data from a client and chunks the data into blocks of data. The file server also generates metadata for use in forming a data structure. The blocks of data are stored in a block store, and copies of the data blocks and the metadata are locally cached at the file server. A commit server retrieves the metadata. In at least one embodiment, the metadata is retrieved from an update log shared between the file server and the commit server. Based on the retrieved metadata, the commit server generates a version of a data structure. The data structure is then stored at the block store.
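The division of work between the file server and the commit server can be sketched as two functions that share an update log. Everything named here is hypothetical: `BlockStore`, the fixed `block_size` chunking, the list used as the log, and the JSON directory that stands in for the data structure. The local cache at the file server is omitted for brevity.

```python
import hashlib
import json


class BlockStore:
    """Toy content-addressed block store keyed by SHA-256 digest."""

    def __init__(self):
        self.blocks = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        self.blocks[key] = data
        return key


def file_server_write(store, update_log, name, data, block_size=4):
    """Chunk the data, store the blocks, and log metadata for later commit."""
    keys = [
        store.put(data[i:i + block_size])
        for i in range(0, len(data), block_size)
    ]
    update_log.append({"name": name, "blocks": keys, "size": len(data)})


def commit(store, update_log):
    """Build one version of the directory from logged metadata and store it."""
    directory = {}
    for entry in update_log:
        directory[entry["name"]] = {
            "blocks": entry["blocks"],
            "size": entry["size"],
        }
    update_log.clear()
    # The structure itself becomes a block; its key identifies this version.
    return store.put(json.dumps(directory, sort_keys=True).encode())
```

Separating the two roles lets the file server acknowledge writes as soon as blocks and log entries are recorded, while the structure that ties them together is built afterwards.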
Abstract:
Systems and methods for data management and data processing are provided. Embodiments may include systems and methods relating to fast data selection with reasonably high quality results, and may include a faster data selection function and a slower data selection function. Various embodiments may include systems and methods relating to data hashing and/or data redundancy identification and elimination for a data set or a string of data. In some embodiments, a first selection function is used to pre-select boundary points or data blocks/windows from a data set or data stream, and a second selection function is used to refine the boundary points or data blocks/windows. The second selection function may be better at determining the best places for boundary points or data blocks/windows in the data set or data stream. In various embodiments, data may be processed by a first, faster hash function and a second, slower, more discriminating hash function.
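The two-stage selection can be sketched as a cheap filter followed by an expensive check that runs only on the survivors. The specific functions are assumptions chosen for brevity: a running byte sum over a sliding window plays the fast function, and SHA-256 of the same window plays the slower, more discriminating one. The window size, the masks, and all function names are hypothetical.

```python
import hashlib

WINDOW = 8  # bytes examined at each candidate position (assumed value)


def fast_candidates(data: bytes, mask: int = 0x07) -> list[int]:
    """Cheap pre-selection using a running byte sum over a sliding window."""
    candidates = []
    if len(data) < WINDOW:
        return candidates
    rolling = sum(data[:WINDOW])
    for end in range(WINDOW, len(data) + 1):
        if rolling & mask == 0:
            candidates.append(end)
        if end < len(data):
            # Slide the window one byte: add the new byte, drop the oldest.
            rolling += data[end] - data[end - WINDOW]
    return candidates


def slow_accepts(window: bytes, mask: int = 0x03) -> bool:
    """Stronger test, evaluated only at pre-selected positions."""
    return hashlib.sha256(window).digest()[-1] & mask == 0


def select_boundaries(data: bytes) -> list[int]:
    """Keep only the fast candidates that the slow function also accepts."""
    return [
        end
        for end in fast_candidates(data)
        if slow_accepts(data[end - WINDOW:end])
    ]
```

The saving comes from the update step in `fast_candidates`, which costs a constant amount per byte, while the cryptographic hash is computed only for the fraction of positions that pass the first test.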