Abstract:
The disclosure herein describes enhancing data durability of a base component using a delta component. A delta component is generated based on the base component becoming unavailable. The delta component is configured to include unwritten storage space with an address space matching the base component and a tracking bitmap associated with data blocks of the address space of the delta component. Write operations targeted for the base component are routed to the delta component. Based on the routed write operations, bits associated with data blocks affected by the write operations are changed in the tracking bitmap. Based on the base component becoming available, data blocks affected by routed write operations are identified based on the tracking bitmap and the identified data blocks are synchronized from the delta component to the base component. The delta component is then removed.
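The delta-component life cycle described above can be sketched in a few lines. This is a minimal illustration, not the disclosed implementation: the `DeltaComponent` class, the `sync_to_base` helper, and the use of a Python list as the base component are all hypothetical stand-ins.

```python
class DeltaComponent:
    """Hypothetical in-memory stand-in for a delta component."""

    def __init__(self, num_blocks):
        # Same address space as the base component, but nothing written yet.
        self.blocks = {}                     # block index -> data (sparse)
        self.bitmap = [False] * num_blocks   # tracking bitmap, one bit per block

    def write(self, index, data):
        # A write routed here while the base component is unavailable.
        self.blocks[index] = data
        self.bitmap[index] = True

    def changed_blocks(self):
        return [i for i, dirty in enumerate(self.bitmap) if dirty]


def sync_to_base(delta, base):
    """Copy only the blocks flagged in the tracking bitmap to the base."""
    for i in delta.changed_blocks():
        base[i] = delta.blocks[i]
    return base


def demo_delta():
    base = [b"a", b"b", b"c", b"d"]
    delta = DeltaComponent(len(base))    # base became unavailable
    delta.write(1, b"B")
    delta.write(3, b"D")
    synced = sync_to_base(delta, base)   # base became available again
    return delta.changed_blocks(), synced
```

Only the flagged blocks are copied, so the cost of the synchronization scales with the number of writes received during the outage rather than with the size of the component.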
Abstract:
In a storage cluster having nodes, blocks of a logical storage space of a storage object are allocated flexibly by a parent node to component nodes that are backed by physical storage. The method includes maintaining a first allocation map for the parent node, and second and third allocation maps for the first and second component nodes, respectively, executing a first write operation on the first component node and updating the second allocation map to indicate that the first block is a written block, and upon detecting that the first component node is offline, executing a second write operation that targets a second block of the logical storage space, which is allocated to the first component node, on the second component node and updating the third allocation map to indicate that the second block is a written block.
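A rough sketch of the redirect-on-offline behavior follows. The class names, the per-node `written` set standing in for an allocation map, and the symmetric fallback rule are assumptions made for illustration; the abstract only describes the case where the first component node is offline.

```python
class ComponentNode:
    """Hypothetical component node backed by physical storage."""

    def __init__(self, name):
        self.name = name
        self.online = True
        self.data = {}
        self.written = set()   # this node's allocation map: blocks written here

    def write(self, block, payload):
        self.data[block] = payload
        self.written.add(block)


class ParentNode:
    """Hypothetical parent node that allocates logical blocks to components."""

    def __init__(self, first, second):
        self.first = first
        self.second = second
        self.allocation = {}   # parent's allocation map: block -> allocated node

    def allocate(self, block, node):
        self.allocation[block] = node

    def write(self, block, payload):
        owner = self.allocation[block]
        if owner.online:
            target = owner
        else:
            # Assumed fallback: redirect to the other component node.
            target = self.second if owner is self.first else self.first
        if not target.online:
            raise RuntimeError("no online component node for block %d" % block)
        target.write(block, payload)
        return target.name
```

After a redirected write, the block appears as written only in the map of the node that actually received it, which is what lets a later repair step find it.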
Abstract:
Hybrid synchronization using a shadow component includes detecting a first component of a plurality of mirrored components of a distributed data object becoming unavailable. The mirrored components include a delta component (a special shadow component) and a regular mirror (shadow) component. The delta component indicates a shorter history of changes to data blocks of a log-structured file system (LFS) than is indicated by the regular mirror component. During the unavailability of the first component, at least one write I/O is committed by the delta component. The commit is tracked by the delta component in a first tracking bitmap associated with the delta component. Based at least on detecting the first component becoming available, the first component is synchronized with data from the delta component, based at least on changed data blocks indicated in the first tracking bitmap.
Abstract:
A method for compressing data is provided. The method includes receiving a block of data to store on at least one physical disk and determining whether to store the data in a data log as uncompressed or compressed data based on a determined size of the resulting compressed data. When the method determines to store the data as compressed, it compresses the data and stores the compressed data in at least one sector in the data log. Otherwise, the method stores the data, uncompressed, in a plurality of sectors in the data log. The method generates one or more state bits indicating (i) whether the data is stored as uncompressed or compressed, and (ii) if the data is stored as compressed, a size of the compressed data. The method then stores the one or more state bits in an entry of a logical map table associated with an LBA that corresponds to the data block.
Abstract:
Exemplary methods, apparatuses, and systems include a replica node storing a component of a storage object detecting that a primary coordinator for the storage object component is no longer available to serve as primary coordinator. The replica node is within a cluster of nodes storing components of the storage object. In response to detecting that the primary coordinator is no longer available, the replica node updates a first metadata entry indicating that a secondary coordinator for the storage object component is unhealthy. The replica node rejects connection requests from the secondary coordinator in response to the first metadata entry indicating that the secondary coordinator for the storage object component is unhealthy.
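The replica-side behavior can be sketched as a small state machine. The `ReplicaNode` class, its method names, and the use of a set as the metadata store are hypothetical; failure detection itself is outside the sketch.

```python
class ReplicaNode:
    """Hypothetical replica node holding one component of a storage object."""

    def __init__(self):
        # Metadata entries: coordinators currently marked unhealthy.
        self.unhealthy = set()

    def on_primary_unavailable(self, secondary_id):
        # Called once the replica detects that the primary coordinator is gone.
        self.unhealthy.add(secondary_id)

    def accept_connection(self, coordinator_id):
        # Reject connection requests from any coordinator marked unhealthy.
        return coordinator_id not in self.unhealthy
```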
Abstract:
Examples perform input/output (I/O) requests, issued by a plurality of clients to an owner node, in a virtual storage area network (vSAN) environment. I/O requests are guaranteed, as all I/O requests are performed during non-overlapping, exclusive sessions between one client at a time and the owner node. The owner node rejects requests for simultaneous sessions, and duplicate sessions are prevented by requiring that a client refresh its memory state after termination of a previous session.
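A minimal sketch of the exclusive-session rule is shown here. The `OwnerNode` class, the `refreshed` flag, and the boolean return values are assumptions chosen for illustration; how a client actually refreshes its memory state is not modeled.

```python
class OwnerNode:
    """Hypothetical owner node that serves one client session at a time."""

    def __init__(self):
        self.active = None           # client holding the current session, if any
        self.needs_refresh = set()   # clients whose previous session has ended

    def open_session(self, client, refreshed=False):
        if self.active is not None:
            return False             # reject simultaneous sessions
        if client in self.needs_refresh and not refreshed:
            return False             # stale client must refresh its state first
        self.needs_refresh.discard(client)
        self.active = client
        return True

    def close_session(self, client):
        if self.active == client:
            self.active = None
            self.needs_refresh.add(client)
```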
Abstract:
An example method of resynchronizing a first replica of an object and a second replica of the object in an object storage system includes: determining, by storage software in response to the second replica transitioning from failed to available, a stale sequence number for the second replica, the storage software having associated the stale sequence number with the second replica when the second replica failed; querying, by the storage software, block-level metadata for the object using the stale sequence number, the block-level metadata relating logical blocks of the object with sequence numbers for operations on the object; determining, by the storage software as a result of the querying, a set of the logical blocks each related to a sequence number that is the same as or after the stale sequence number; and copying, by the storage software, data of the set of logical blocks from the first replica to the second replica.
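The query-then-copy step can be sketched as below. The dictionary mapping logical block to last-write sequence number and the list-based replicas are hypothetical simplifications of the block-level metadata and the replicas.

```python
def blocks_to_resync(block_seq, stale_seq):
    """Return logical blocks whose sequence number is at or after stale_seq."""
    return sorted(b for b, seq in block_seq.items() if seq >= stale_seq)


def resync(first, second, block_seq, stale_seq):
    """Copy the selected blocks from the healthy replica to the recovered one."""
    for b in blocks_to_resync(block_seq, stale_seq):
        second[b] = first[b]
    return second
```

For example, with `block_seq = {0: 3, 1: 7, 2: 9}` and a stale sequence number of 7, only blocks 1 and 2 are copied.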
Abstract:
Efficient scheduling of IOs in a computing system using dynamic bandwidth regulation includes building a shared regulator to limit the total IOPS scheduled across all IO classes at any given time. Reserved regulators may be used to place limits on the IOPS scheduled for a particular IO class at any given time. An outstanding IO window may also limit the overall number of outstanding IOs, and/or the bytes of outstanding IOs at any particular time. A first stage of IO scheduling may involve enforcing the reserved regulators to limit the IOPS scheduled for particular IO classes. A second stage of IO scheduling may involve enforcing the shared regulator to limit the total IOPS scheduled for all IO classes.
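The two stages can be illustrated with a single scheduling tick. This is a simplified sketch: the function name, the per-tick integer budgets, the first-come ordering across classes, and the count-only treatment of the outstanding IO window are all assumptions.

```python
def schedule_tick(pending, reserved_limits, shared_limit, window_free):
    """Decide how many queued IOs of each class to schedule in one tick.

    pending:         IO class -> number of queued IOs
    reserved_limits: IO class -> per-class cap (reserved regulator)
    shared_limit:    cap on the total across all classes (shared regulator)
    window_free:     free slots left in the outstanding IO window
    """
    scheduled = {}
    budget = min(shared_limit, window_free)
    for io_class, queued in pending.items():
        # Stage 1: the reserved regulator caps this class.
        allowed = min(queued, reserved_limits.get(io_class, queued))
        # Stage 2: the shared regulator and window cap the running total.
        granted = min(allowed, budget)
        scheduled[io_class] = granted
        budget -= granted
    return scheduled
```

With `pending = {"vm": 10, "resync": 10}`, reserved limits of 6 and 4, and a shared limit of 8, the first class is granted 6 IOs and the second only the remaining 2.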
Abstract:
Embodiments of the disclosure provide techniques for partitioning a resource object into multiple resource components of a cluster of host computer nodes in a distributed resources system. The distributed resources system translates high-level policy requirements into a resource configuration that the system accommodates. The system determines an allocation based on the policy requirements and identifies resource configurations that are available. Upon selecting a resource configuration, the distributed resources system assigns the allocation and associated values to the selected configuration and publishes the new configuration to other host computer nodes in the cluster.