Abstract:
Embodiments of the present invention relate to asynchronously replicating data in a distributed computing environment. To achieve asynchronous replication, data received at a primary data store may be annotated with information, such as an identifier of the data. The annotated data may then be communicated to a secondary data store, which may then write the data and annotated information to one or more logs for eventual replay and committal at the secondary data store. The primary data store may communicate an acknowledgment of success in committing the data at the primary data store as well as of success in writing the data to the secondary data store. Additional embodiments may include committing the data at the secondary data store in response to receiving an instruction that authorizes committal of data through a identifier.
Abstract:
Embodiments of the present invention relate to synchronously replicating data in a distributed computing environment. To achieve synchronous replication both an eventual consistency approach and a strong consistency approach are contemplated. Received data may be written to a log of a primary data store for eventual committal. The data may then be annotated with a record, such as a unique identifier, which facilitates the replay of the data at a secondary data store. Upon receiving an acknowledgment that the secondary data store has written the data to a log, the primary data store may commit the data and communicate an acknowledgment of success back to the client. In a strong consistency approach, the primary data store may wait to send an acknowledgement of success to the client until it receives an acknowledgment that the secondary has not only written, but also committed, the data.
Abstract:
Partition management for a scalable, structured storage system is provided. The storage system provides storage represented by one or more tables, each of which includes rows that represent data entities. A table is partitioned into a number of partitions, each partition including a contiguous range of rows. The partitions are served by table servers and managed by a table master. Load distribution information for the table servers and partitions is tracked, and the table master determines to split and/or merge partitions based on the load distribution information.
Abstract:
Embodiments of the present invention relate to synchronously replicating data in a distributed computing environment. To achieve synchronous replication both an eventual consistency approach and a strong consistency approach are contemplated. Received data may be written to a log of a primary data store for eventual committal. The data may then be annotated with a record, such as a unique identifier, which facilitates the replay of the data at a secondary data store. Upon receiving an acknowledgment that the secondary data store has written the data to a log, the primary data store may commit the data and communicate an acknowledgment of success back to the client. In a strong consistency approach, the primary data store may wait to send an acknowledgement of success to the client until it receives an acknowledgment that the secondary has not only written, but also committed, the data.
Abstract:
Se proporciona manejo de división para un sistema de almacenamiento escalable, estructurado. El sistema de almacenamiento proporciona almacenamiento representado por una o más tablas, cada una de las cuales incluye filas que representan entidades de datos. Una tabla se divide en un número divisiones, cada división incluyendo una variedad de filas contiguas. Las divisiones son servidas por servidores de tabla y se manejan por un maestro de tabla. La información de distribución de carga para los servidores de tabla y divisiones se rastrea, y el maestro de tabla determina dividir y/o fusionar divisiones basándose en la información de distribución de carga.
Abstract:
Partition management for a scalable, structured storage system is provided. The storage system provides storage represented by one or more tables, each of which includes rows that represent data entities. A table is partitioned into a number of partitions, each partition including a contiguous range of rows. The partitions are served by table servers and managed by a table master. Load distribution information for the table servers and partitions is tracked, and the table master determines to split and/or merge partitions based on the load distribution information.