Checkpointing in distributed streaming platform for real-time applications
Abstract:
Software receives a data stream for an application running on a distributed streaming platform over a networked cluster of servers. The software converts the data into a plurality of data tuples structured according to a schema. The software repeatedly emits a plurality of the data tuples as a streaming window, which is separated from other streaming windows by a leading control tuple associated with an ordinal identifier for the streaming window. The streaming window is an ordered sequence of tuples that is associated with a recovery policy. The software then emits a checkpointing tuple after a plurality of streaming windows. The checkpointing tuple causes checkpointing of an instance of an operator for the application when the checkpointing tuple is received by the instance. Each of the operations is executed by one or more processors in real time or near real time rather than offline.
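The flow described in the abstract can be illustrated with a minimal single-process sketch. It is not the claimed implementation: the names `DataTuple`, `BeginWindow`, `Checkpoint`, `emit_stream`, and `CountingOperator` are hypothetical, windows are assumed to be bounded by a fixed tuple count (the abstract does not say how a window is bounded), and an in-memory list stands in for whatever durable store a real platform would use. The recovery policy and the networked cluster are omitted.

```python
from dataclasses import dataclass
from typing import Any, Iterable, Iterator, List, Tuple, Union


@dataclass(frozen=True)
class DataTuple:
    payload: Any  # one record, already converted to the application's schema


@dataclass(frozen=True)
class BeginWindow:
    window_id: int  # ordinal identifier carried by the leading control tuple


@dataclass(frozen=True)
class Checkpoint:
    window_id: int  # last window completed before this checkpointing tuple


StreamItem = Union[DataTuple, BeginWindow, Checkpoint]


def emit_stream(records: Iterable[Any],
                window_size: int = 2,
                checkpoint_every: int = 2) -> Iterator[StreamItem]:
    """Emit windows of data tuples, each led by a control tuple, and a
    checkpointing tuple after every `checkpoint_every` full windows.
    Both parameters are assumptions made for this sketch."""
    window_id = 0
    in_window = 0
    for record in records:
        if in_window == 0:
            window_id += 1
            yield BeginWindow(window_id)  # separates this window from the last
        yield DataTuple(record)
        in_window += 1
        if in_window == window_size:
            in_window = 0
            if window_id % checkpoint_every == 0:
                yield Checkpoint(window_id)


class CountingOperator:
    """Toy operator instance whose only state is a running tuple count."""

    def __init__(self) -> None:
        self.count = 0
        self.current_window = None
        # In-memory stand-in for durable checkpoint storage (assumption).
        self.snapshots: List[Tuple[int, int]] = []

    def process(self, item: StreamItem) -> None:
        if isinstance(item, BeginWindow):
            self.current_window = item.window_id
        elif isinstance(item, DataTuple):
            self.count += 1
        elif isinstance(item, Checkpoint):
            # Receiving the checkpointing tuple triggers the state snapshot.
            self.snapshots.append((item.window_id, self.count))


if __name__ == "__main__":
    op = CountingOperator()
    for item in emit_stream(range(8)):
        op.process(item)
    print(op.snapshots)
```

Because the checkpointing tuple travels in-band with the data, the operator instance snapshots its state at a window boundary without any out-of-band coordination; in this sketch each snapshot records the window identifier alongside the state so that a restart could resume from the following window.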