Abstract:
A request is received to remove duplicate data. A log data container associated with a storage volume in a storage server is accessed. The log data container includes a plurality of entries. Each entry is identified by an extent identifier in a data structures stored in a volume associated with the storage server. For each entry in the log data container, a determination is made if the entry matches another entry in the log data container. If the entry matches another entry in the log data container, a determination is made of a donor extent and a recipient extent. If an external reference count associated with the recipient extent equals a first predetermined value, block sharing is performed for the donor extent and the recipient extent. A determination is made if the reference count of the donor extent equals a second predetermined value. If the reference count of the donor extent equals the second predetermined value, the donor extent is freed.
Abstract:
An extent-based storage architecture is implemented by a storage server receiving a read request for an extent from a client, wherein the extent includes a group of contiguous blocks and the read request includes a file block number. The storage server retrieves an extent identifier from a first sorted data structure, wherein the storage server uses the received file block number to traverse the first sorted data structure to the extent identifier. The storage server retrieves a reference to the extent from a second sorted data structure, wherein the storage server uses the retrieved extent identifier to traverse the second sorted data structure to the reference, and wherein the second sorted data structure is global across a plurality of volumes. The storage server retrieves the extent from a storage device using the reference and returns the extent to the client.
Abstract:
An extent-based storage architecture is implemented by a storage server receiving a read request for an extent from a client, wherein the extent includes a group of contiguous blocks and the read request includes a file block number. The storage server retrieves an extent identifier from a first sorted data structure, wherein the storage server uses the received file block number to traverse the first sorted data structure to the extent identifier. The storage server retrieves a reference to the extent from a second sorted data structure, wherein the storage server uses the retrieved extent identifier to traverse the second sorted data structure to the reference, and wherein the second sorted data structure is global across a plurality of volumes. The storage server retrieves the extent from a storage device using the reference and returns the extent to the client.
Abstract:
A technique to name data is disclosed to allow preservation of storage efficiency over a link between a source and a destination in a replication relationship as well as in storage at the destination. The technique allows the source to send named data to the destination once and refer to it by name multiple times in the future, without having to resend the data. The technique also allows the transmission of data extents to be decoupled from the logical containers that refer to the data extents. Additionally, the technique allows a replication system to accommodate different extent sizes between replication source and destination while preserving storage efficiency.
Abstract:
A technique to name data is disclosed to allow preservation of storage efficiency over a link between a source and a destination in a replication relationship as well as in storage at the destination. The technique allows the source to send named data to the destination once and refer to it by name multiple times in the future, without having to resend the data. The technique also allows the transmission of data extents to be decoupled from the logical containers that refer to the data extents. Additionally, the technique allows a replication system to accommodate different extent sizes between replication source and destination while preserving storage efficiency.
Abstract:
A request is received to remove duplicate data. A log data container associated with a storage volume in a storage server is accessed. The log data container includes a plurality of entries. Each entry is identified by an extent identifier in a data structures stored in a volume associated with the storage server. For each entry in the log data container, a determination is made if the entry matches another entry in the log data container. If the entry matches another entry in the log data container, a determination is made of a donor extent and a recipient extent. If an external reference count associated with the recipient extent equals a first predetermined value, block sharing is performed for the donor extent and the recipient extent. A determination is made if the reference count of the donor extent equals a second predetermined value. If the reference count of the donor extent equals the second predetermined value, the donor extent is freed.