Abstract:
The disclosed techniques include generation of a single index table when backing up data in a first backup format to a backup storage system that uses a second backup format. Using the single index table, a query for a data item can be answered by searching the single index table. The single index table avoids having to search through multiple index tables, each corresponding to a different backup format that may be used for backing up the searched data item.
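As an illustration only, the single index table can be pictured as one mapping from an item identifier to a record of where the item lives and which format it arrived in. The following is a minimal sketch under that assumption; the names `IndexEntry`, `SingleIndex`, `record`, and `lookup`, as well as the format and location labels, are hypothetical and do not come from the disclosure.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class IndexEntry:
    """Hypothetical record: the source format of an item and where it is stored."""
    source_format: str  # format the data arrived in (assumed label)
    location: str       # position inside the backup store (assumed label)


class SingleIndex:
    """One table covering items from every source format."""

    def __init__(self) -> None:
        self._table: Dict[str, IndexEntry] = {}

    def record(self, item_id: str, source_format: str, location: str) -> None:
        # Would be called while data in the first format is written
        # to the backup store that uses the second format.
        self._table[item_id] = IndexEntry(source_format, location)

    def lookup(self, item_id: str) -> Optional[IndexEntry]:
        # One dictionary lookup answers the query; no per-format tables are scanned.
        return self._table.get(item_id)


index = SingleIndex()
index.record("report.docx", "format_a", "chunk-17")
index.record("mail-0042", "format_b", "chunk-03")
```

A query such as `index.lookup("mail-0042")` consults only this one table, regardless of which format the item was originally backed up in.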
Abstract:
Data storage operations, including content-indexing, containerized deduplication, and policy-driven storage, are performed within a cloud environment. The systems support a variety of clients and cloud storage sites that may connect to the system in a cloud environment that requires data transfer over wide area networks, such as the Internet, which may have appreciable latency and/or packet loss, using various network protocols, including HTTP and FTP. Methods are disclosed for content indexing data stored within a cloud environment to facilitate later searching, including collaborative searching. Methods are also disclosed for performing containerized deduplication to reduce the strain on a system namespace, effectuate cost savings, etc. Methods are disclosed for identifying suitable storage locations, including suitable cloud storage sites, for data files subject to a storage policy. Further, systems and methods for providing a cloud gateway and a scalable data object store within a cloud environment are disclosed, along with other features.
Abstract:
A system according to certain embodiments associates a signature value corresponding to a data block with one or more data blocks and a reference to the data block to form a signature/data word corresponding to the data block. The system further logically organizes the signature/data words into a plurality of files each comprising at least one signature/data word such that the signature values are embedded in the respective file. The system according to certain embodiments reads a previously stored signature value corresponding to a respective data block for sending from a backup storage system having at least one memory device to a secondary storage system. Based on an indication as to whether the data block is already stored on the secondary storage system, the system reads the data block from the at least one memory device for sending to the secondary storage system if the data block does not exist on the secondary storage system, wherein the signature value and not the data block is read from the at least one memory device if the data block exists on the secondary storage system.
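The read path described above can be sketched roughly as follows. This is a hedged illustration, not the claimed implementation: SHA-256 as the signature function, the in-memory `stored` list of signature/block pairs, and the `replicate` helper are all assumptions introduced for the example.

```python
import hashlib
from typing import Dict, List, Tuple


def signature(block: bytes) -> str:
    # SHA-256 is an assumption; the abstract does not name a hash function.
    return hashlib.sha256(block).hexdigest()


def replicate(stored: List[Tuple[str, bytes]], secondary: Dict[str, bytes]) -> int:
    """Send blocks to `secondary`, touching block data only when it is missing there.

    `stored` pairs each previously computed signature with its block, standing in
    for the signature/data words kept on the backup storage system.
    Returns the number of blocks whose data was actually read and sent.
    """
    sent = 0
    for sig, block in stored:
        if sig in secondary:
            continue             # block already exists remotely: signature only
        secondary[sig] = block   # block missing remotely: read and send the data
        sent += 1
    return sent


blocks = [b"alpha", b"beta", b"alpha"]
stored = [(signature(b), b) for b in blocks]
secondary: Dict[str, bytes] = {signature(b"beta"): b"beta"}
sent = replicate(stored, secondary)
```

In this run only the first `b"alpha"` block is transferred; the `b"beta"` block and the repeated `b"alpha"` block are resolved from their signatures alone.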
Abstract:
A stand-alone, network-accessible data storage device, such as a filer or NAS device, is capable of transferring data objects based on portions of the data objects. The device transfers portions of files, folders, and other data objects from a data store within the device to external secondary storage based on certain criteria, such as time-based criteria, age-based criteria, and so on. A portion may be one or more blocks of a data object, or one or more chunks of a data object, or other segments that combine to form or store a data object. For example, the device identifies one or more blocks of a data object that satisfy certain criteria, and migrates the identified blocks to external storage, thereby freeing up storage space within the device. The device may determine that a certain number of blocks of a file have not been modified or called by a file system in a certain time period, and migrate these blocks to secondary storage.
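A block-level, age-based migration of this kind might look like the sketch below. It assumes each block carries a last-access timestamp in seconds and that both stores are plain dictionaries; `Block`, `migrate_cold_blocks`, and the `max_idle` threshold are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Block:
    block_id: int
    last_access: float  # seconds; the unit is an assumption
    data: bytes


def migrate_cold_blocks(primary: Dict[int, Block],
                        secondary: Dict[int, bytes],
                        now: float,
                        max_idle: float) -> List[int]:
    """Move blocks idle for longer than `max_idle` from primary to secondary."""
    # Collect the ids first so the dictionary is not mutated while iterating.
    cold = [b.block_id for b in primary.values() if now - b.last_access > max_idle]
    for block_id in cold:
        secondary[block_id] = primary.pop(block_id).data  # frees space on the device
    return sorted(cold)


primary = {
    0: Block(0, 100.0, b"a"),
    1: Block(1, 900.0, b"b"),
    2: Block(2, 50.0, b"c"),
}
secondary: Dict[int, bytes] = {}
moved = migrate_cold_blocks(primary, secondary, now=1000.0, max_idle=500.0)
```

Only part of the object leaves the device here: block 1 was accessed recently and stays in the primary store, while blocks 0 and 2 are migrated.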
Abstract:
Various systems and methods may be used for performing data storage operations, including content-indexing, containerized deduplication, and policy-driven storage, within a cloud environment. The systems support a variety of clients and cloud storage sites that may connect to the system in a cloud environment that requires data transfer over wide area networks, such as the Internet, which may have appreciable latency and/or packet loss, using various network protocols, including HTTP and FTP. Methods for content indexing data stored within a cloud environment may facilitate later searching, including collaborative searching. Methods for performing containerized deduplication may reduce the strain on a system namespace, effectuate cost savings, etc. Methods may identify suitable storage locations, including suitable cloud storage sites, for data files subject to a storage policy. Further, the systems and methods may be used for providing a cloud gateway and a scalable data object store within a cloud environment.
Abstract:
An information management system according to certain aspects may determine whether snapshot operations will work prior to executing them. The system may check various factors or parameters relating to a snapshot storage policy to verify whether the storage policy will work at runtime without actually executing the policy. Some examples of factors can include: availability of primary storage devices for which a snapshot should be obtained, availability of secondary storage devices, license availability for snapshot software, user credentials for connecting to primary and/or secondary storage devices, available storage capacity, connectivity to storage devices, etc. The system may also check whether a particular system configuration is supported in connection with snapshot operations. The result of the determination can be provided in the form of a report summarizing any problems found with the snapshot storage policy. The report can include recommended courses of action or solutions for resolving any identified issues.
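One way to picture such a pre-execution check is a list of small validators run against a description of the environment, none of which takes a snapshot. The sketch below is an assumption-laden illustration: the `env` dictionary keys, the three check functions, and `validate_policy` are hypothetical and cover only a few of the factors listed above.

```python
from typing import Any, Callable, Dict, List, Optional

# Each check inspects a plain dict describing the environment and returns
# None on success, or a problem description with a suggested fix.
Check = Callable[[Dict[str, Any]], Optional[str]]


def check_primary_reachable(env: Dict[str, Any]) -> Optional[str]:
    if env.get("primary_reachable"):
        return None
    return "Primary storage device unreachable: verify connectivity and credentials."


def check_license(env: Dict[str, Any]) -> Optional[str]:
    if env.get("licenses_free", 0) > 0:
        return None
    return "No snapshot software license available: release or add a license."


def check_capacity(env: Dict[str, Any]) -> Optional[str]:
    if env.get("free_bytes", 0) >= env.get("required_bytes", 0):
        return None
    return "Insufficient storage capacity: free space or choose another target."


def validate_policy(env: Dict[str, Any], checks: List[Check]) -> List[str]:
    """Run every check without executing the policy; return the problems found."""
    return [problem for check in checks if (problem := check(env)) is not None]


env = {"primary_reachable": True, "licenses_free": 0,
       "free_bytes": 10_000, "required_bytes": 50_000}
report = validate_policy(env, [check_primary_reachable, check_license, check_capacity])
```

The resulting `report` lists the license and capacity problems together with a recommended action for each, while the reachable primary device produces no entry.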
Abstract:
Described in detail herein are systems and methods for single instancing blocks of data in a data storage system. For example, the data storage system may include multiple computing devices (e.g., client computing devices) that store primary data. The data storage system may also include a secondary storage computing device, a single instance database, and one or more storage devices that store copies of the primary data (e.g., secondary copies, tertiary copies, etc.). The secondary storage computing device receives blocks of data from the computing devices and accesses the single instance database to determine whether the blocks of data are unique (meaning that no instances of the blocks of data are stored on the storage devices). If a block of data is unique, the single instance database stores it on a storage device. If not, the secondary storage computing device can avoid storing the block of data on the storage devices.
Abstract:
A high-availability, distributed, deduplicated storage system according to certain embodiments is arranged to include multiple deduplication database media agents. The deduplication database media agents store signatures of data blocks stored in secondary storage. In addition, the deduplication database media agents are configured as failover deduplication database media agents in the event that one of the deduplication database media agents becomes unavailable.
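The failover behavior can be illustrated with a small pool of agents that each hold a copy of the signature table. The abstract does not say how signatures reach the failover agents; writing each signature to every available agent is an assumption made for this sketch, and `DedupAgent` and `AgentPool` are hypothetical names.

```python
from typing import Dict, List, Optional


class DedupAgent:
    """Stand-in for a deduplication database media agent."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.available = True
        self.signatures: Dict[str, str] = {}  # signature -> block location


class AgentPool:
    def __init__(self, agents: List[DedupAgent]) -> None:
        self.agents = agents

    def _active(self) -> DedupAgent:
        # The first available agent serves requests; later ones act as failover.
        for agent in self.agents:
            if agent.available:
                return agent
        raise RuntimeError("no deduplication database media agent is available")

    def store(self, sig: str, location: str) -> None:
        # Assumption: signatures are copied to every available agent.
        for agent in self.agents:
            if agent.available:
                agent.signatures[sig] = location

    def lookup(self, sig: str) -> Optional[str]:
        return self._active().signatures.get(sig)


pool = AgentPool([DedupAgent("agent-a"), DedupAgent("agent-b")])
pool.store("sig-1", "secondary:/blocks/0001")
pool.agents[0].available = False          # simulate an outage of the first agent
location = pool.lookup("sig-1")           # answered by the failover agent
```

After the first agent is marked unavailable, the lookup is served by `agent-b` and still resolves the signature.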
Abstract:
Data storage systems monitor the performance of data storage operations on a granular level and compile the information for presenting to a user. The system measures the time of execution for individual granular stages of the storage operation and, in response to the monitoring results, automatically adjusts parameters to optimize performance. Further, the system performs a performance test by simulating the data storage operation, but may not actually write the data to the secondary storage medium.
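Per-stage timing of a simulated operation can be sketched as below. The three stages (read, compress, write), the `StageTimer` helper, and the `dry_run` flag are assumptions for the example; the abstract does not specify which stages are measured or how the write is skipped.

```python
import time
import zlib
from contextlib import contextmanager
from typing import Dict, Iterator, List


class StageTimer:
    """Accumulates wall-clock time per named stage."""

    def __init__(self) -> None:
        self.durations: Dict[str, float] = {}

    @contextmanager
    def stage(self, name: str) -> Iterator[None]:
        start = time.perf_counter()
        try:
            yield
        finally:
            elapsed = time.perf_counter() - start
            self.durations[name] = self.durations.get(name, 0.0) + elapsed


def simulated_backup(data: bytes, timer: StageTimer,
                     sink: List[bytes], dry_run: bool = True) -> int:
    """Run the stages of a backup; with dry_run=True nothing is written to `sink`."""
    with timer.stage("read"):
        payload = bytes(data)
    with timer.stage("compress"):
        compressed = zlib.compress(payload)
    with timer.stage("write"):
        if not dry_run:
            sink.append(compressed)
    return len(compressed)


timer = StageTimer()
sink: List[bytes] = []
size = simulated_backup(b"x" * 100_000, timer, sink)
slowest = max(timer.durations, key=timer.durations.get)
```

The `slowest` stage name is the kind of result a tuning step could act on, for example by changing a compression level; that adjustment logic is left out of the sketch.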