Certifying events in a streaming pipeline
Abstract:
Disclosed are embodiments for providing batch performance using a stream processor. In one embodiment, a method is disclosed comprising detecting a real close of books (COB) of a data transport, the real COB associated with a set of raw events transmitted over the data transport, flushing a stream processor in response to detecting the real COB, and retrieving a set of processed events from a distributed file system after the flushing is complete. A fact COB computation is then performed on the set of processed events and the set of raw events, the fact COB computation outputting a number of missing events, each missing event representing a raw event that is not present in the set of processed events. The processed events are then certified upon determining that the number of missing events is below a threshold.
Public/Granted literature
Information query
Patent Agency Ranking
0/0