Abstract:
A request to perform a write operation on a file stored in a distributed file system that includes a first storage server, a second storage server, and an arbiter system may be received. A determination may be made as to whether one of the first or second storage servers is available to perform the write operation while the other is not. A determination may also be made as to whether the arbiter system is available to record the write operation. In response to identifying that one of the storage servers and the arbiter system are available and that the other storage server is unavailable, the write operation may be performed on the file in view of write operation data that indicates whether the available storage server is consistent with the arbiter system.
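The availability check described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `may_write`, the `consistent_with_arbiter` mapping, the rule that an inconsistent survivor blocks the write, and the handling of the case where both storage servers are up are all hypothetical and not taken from the abstract.

```python
def may_write(first_up, second_up, arbiter_up, consistent_with_arbiter):
    """Decide whether a write may proceed (illustrative sketch only).

    consistent_with_arbiter is a hypothetical stand-in for the "write
    operation data": it maps "first"/"second" to True when that storage
    server is recorded as consistent with the arbiter.
    """
    if first_up and second_up:
        # Assumption: with both storage servers available, the write proceeds.
        return True
    if not arbiter_up:
        # The arbiter must be available to record the write.
        return False
    if first_up != second_up:
        # Exactly one storage server is available alongside the arbiter.
        survivor = "first" if first_up else "second"
        # Assumption: the write is allowed only if the survivor is consistent.
        return consistent_with_arbiter[survivor]
    # Neither storage server is available.
    return False
```

For example, `may_write(True, False, True, {"first": True, "second": False})` allows the write, while the same call with the arbiter down does not.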
Abstract:
A server computer system performs a first set of operations for a first transaction that pertains to data stored in a file system. The server computer system delays a second set of operations for the first transaction and identifies a second transaction pertaining to the same data. In response to identifying the second transaction, the server computer system cancels the second set of operations for the first transaction and cancels a first set of operations for the second transaction.
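One way to picture the cancellation described above is the small sketch below. The class `TransactionCoalescer`, its method names, and the log strings are hypothetical, and the sketch assumes that all transactions target the same data.

```python
class TransactionCoalescer:
    """Illustrative sketch: delay a transaction's second set of operations
    so it can be cancelled if another transaction on the same data arrives."""

    def __init__(self):
        self.log = []
        self.second_set_pending = False

    def begin(self, name):
        if self.second_set_pending:
            # A delayed second set exists: cancel it, and also cancel
            # (skip) the first set of the new transaction.
            self.second_set_pending = False
            self.log.append(f"{name}: first set cancelled")
        else:
            self.log.append(f"{name}: first set performed")

    def finish(self, name):
        # Delay the second set instead of running it right away.
        self.second_set_pending = True
        self.log.append(f"{name}: second set delayed")

    def flush(self):
        # No further transaction arrived: run the delayed second set.
        if self.second_set_pending:
            self.second_set_pending = False
            self.log.append("second set performed")
```

With two back-to-back transactions, only one first set and one second set are actually performed.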
Abstract:
A first storage server of a file system receives a request to perform an operation on a data file. The operation is to be performed on a first replica of the data file stored at the first storage server and a second replica of the data file stored at a second storage server of the file system. The first storage server configures first metadata associated with a first index file to indicate that the operation is to be performed on a first portion of the first replica. The first storage server determines that the second replica is in an outdated state, indicating that the operation on the second replica has not been performed by the second storage server. In response to the second replica being in the outdated state, the first storage server updates a first portion of the second replica that is identified in view of the first metadata and corresponds to the first portion of the first replica.
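The portion-level update described above can be sketched as follows. This is a minimal illustration, assuming the index metadata is modeled as a set of block numbers (`dirty_blocks`) and that both replicas are in-memory byte arrays of equal length; none of these names come from the abstract.

```python
def heal_outdated_replica(source, target, dirty_blocks, block_size):
    """Copy only the blocks marked in the (hypothetical) index metadata
    from the up-to-date replica to the outdated one."""
    for block in sorted(dirty_blocks):
        start = block * block_size
        end = start + block_size
        # Overwrite just this portion of the outdated replica.
        target[start:end] = source[start:end]
    # All marked portions are now in sync, so clear the markers.
    dirty_blocks.clear()
```

Tracking portions rather than whole files means that only the affected blocks need to be copied when the second replica catches up.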
Abstract:
Techniques for pro-active self-healing in a distributed file system are disclosed herein. In accordance with one embodiment, a method is provided. The method comprises, prior to detecting an access request by a client application to an image on a storage server, identifying, by a self-healing daemon executed by a processing device, a first region of the image comprising stale data. A partial lock on the image is acquired. The partial lock prevents access to the first region of the image. Responsive to acquiring the partial lock, the self-healing daemon provides access to a second region of the image comprising data other than the stale data.
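A partial lock of this kind might be modeled as a set of locked byte ranges, as in the sketch below. The class `RegionLocks` and its methods are hypothetical, and ranges are assumed to be half-open `(start, end)` offsets into the image.

```python
class RegionLocks:
    """Illustrative sketch of partial (byte-range) locks on one image."""

    def __init__(self):
        self._locked = []  # list of half-open (start, end) ranges

    def _overlaps(self, start, end):
        return any(s < end and start < e for s, e in self._locked)

    def acquire(self, start, end):
        # Refuse the lock if it overlaps a range that is already locked.
        if self._overlaps(start, end):
            return False
        self._locked.append((start, end))
        return True

    def release(self, start, end):
        self._locked.remove((start, end))

    def can_access(self, start, end):
        # Access is allowed only outside every locked range.
        return not self._overlaps(start, end)
```

While the stale region is locked for healing, requests to any other region of the image can still be served.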
Abstract:
An outcast index in a distributed file system is described. A first server can receive an indication that a first replica stored on the first server is to be modified in view of a second replica stored on a second server. The first replica and the second replica are replicas of a same file. The first server updates metadata associated with the first replica to indicate an outcast state of the first replica. The first server receives an indication that the modification of the first replica is complete. The first server updates the metadata associated with the first replica to remove the outcast state of the first replica.
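The outcast lifecycle described above can be illustrated with the sketch below. Replicas are modeled as plain dictionaries with hypothetical `data` and `outcast` keys, and the helper `eligible_sources` reflects an assumption that an outcast replica should not be used as a source of current data.

```python
def begin_modification(replica):
    # Mark the replica as outcast before it is modified from its peer.
    replica["outcast"] = True

def complete_modification(replica, peer):
    # Bring the replica in line with its peer, then clear the outcast state.
    replica["data"] = peer["data"]
    replica["outcast"] = False

def eligible_sources(replicas):
    # Assumption: only replicas that are not outcast may serve as sources.
    return [r for r in replicas if not r["outcast"]]
```

If the modification is interrupted between the two steps, the outcast marker remains in the metadata, so the half-updated replica is not mistaken for a good copy.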
Abstract:
A first server identifies a second server connecting to a cluster of servers in a file system. The first server examines a file in a replication directory hierarchy in the second server. The file has not been accessed by a client application. The first server determines, prior to the file being accessed by the client application, that the file on the second server has stale data and overwrites the stale data in the file on the second server with current data.
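The scan performed when a server joins can be sketched as follows. This is a simplified illustration: each server's replication directory is modeled as a dictionary mapping a path to a `(version, data)` pair, and comparing version counters is an assumed way of detecting stale data that the abstract does not specify.

```python
def proactive_heal(current, joining):
    """Overwrite stale files on the joining server before any client
    accesses them. Returns the list of paths that were healed."""
    healed = []
    for path, (version, _data) in list(joining.items()):
        # Assumption: a lower version number means the copy is stale.
        if path in current and current[path][0] > version:
            joining[path] = current[path]
            healed.append(path)
    return healed
```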
Abstract:
A request to perform a write operation on a file stored in a distributed file system may be received. A determination may be made as to whether a quorum of servers of the distributed file system is satisfied. The servers of the quorum may be used to perform the write operation or to record the write operation. The write operation may be performed on the file in view of determining that the quorum has been satisfied.
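As a minimal sketch of the quorum check, the function below assumes a simple majority rule. The abstract does not state how the quorum is defined, so the rule and the name `quorum_met` are illustrative assumptions.

```python
def quorum_met(available, total):
    """Return True when more than half of the servers are available.

    Assumption: quorum means a strict majority of all servers, counting
    both servers that perform the write and servers that only record it.
    """
    return available > total // 2
```

Under this rule, two of three servers satisfy the quorum, while one of two does not.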
Abstract:
A server computer system identifies change operations for an object in a file system. The object can be a file or a directory. The change operations can include a change to a local copy of the object and one or more remote copies of the object. The server computer system determines that one of the change operations is unsuccessful and creates tracking data that identifies the object that is associated with at least one change operation that is unsuccessful.
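The tracking step described above can be sketched as follows. The function `apply_change`, the callables standing in for each copy, and the use of `OSError` to signal an unsuccessful change are hypothetical choices made for illustration.

```python
def apply_change(object_id, copies, change, tracking):
    """Apply a change to every copy of an object and record failures.

    copies maps a copy name (e.g. "local", "remote") to a callable that
    applies the change and raises OSError on failure. tracking is a dict
    that receives an entry for the object if any change was unsuccessful.
    """
    failed = []
    for name, apply in copies.items():
        try:
            apply(change)
        except OSError:
            failed.append(name)
    if failed:
        # Tracking data identifies the object with unsuccessful changes.
        tracking[object_id] = failed
    return not failed
```

A later repair pass could consult `tracking` to find only the objects that need attention instead of scanning the whole file system.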