- Patent Title: Time-sliced approximate data structure for storing group statistics
- Application No.: US16936013
- Application Date: 2020-07-22
- Publication No.: US11886413B1
- Publication Date: 2024-01-30
- Inventor: Miguel Angel Casanova, David Christopher Tracey
- Applicant: Rapid7, Inc.
- Applicant Address: US MA Boston
- Assignee: Rapid7, Inc.
- Current Assignee: Rapid7, Inc.
- Current Assignee Address: US MA Boston
- Agent: Ashwin Anand
- Main IPC: G06F16/22
- IPC: G06F16/22; G06F16/2458; G06F16/2455; G06F16/28

Abstract:
Systems and methods are disclosed to implement a bounded group by query system that computes approximate time-sliced statistics for groups of records in a dataset according to a group by query. In embodiments, a single pass scan of the dataset is performed to accumulate exact results for a maximum number of groups in a result grouping structure (RGS) and approximate results for additional groups in an approximate result grouping structure (ARGS). RGSs and ARGSs are accumulated by a set of accumulator nodes and provided to an aggregator node, which combines the received structures to generate exact or approximate statistical results for at least a subset of the groups in the dataset. Advantageously, the disclosed query system is able to produce approximate results for at least some of the groups in a single pass of the dataset using size-bounded data structures, without predetermining the actual number of groups in the dataset.
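
The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract does not specify how the ARGS is laid out, so this sketch assumes a fixed array of hash buckets holding per-time-slice counts. The names `MAX_GROUPS`, `NUM_BUCKETS`, `NUM_SLICES`, `bucket_of`, `accumulate`, and `aggregate` are all hypothetical, and the only statistic tracked is a record count per time slice.

```python
import zlib

# All three sizes are illustrative assumptions, not values from the patent.
MAX_GROUPS = 2    # cap on groups tracked exactly per accumulator (RGS size)
NUM_BUCKETS = 4   # fixed number of buckets in the approximate structure (ARGS)
NUM_SLICES = 3    # number of time slices per group


def bucket_of(key):
    # Stable hash so that every accumulator maps a key to the same bucket.
    return zlib.crc32(key.encode("utf-8")) % NUM_BUCKETS


def accumulate(records):
    """Single pass over (group_key, time_slice) records on one accumulator."""
    rgs = {}                                               # exact counts
    args = [[0] * NUM_SLICES for _ in range(NUM_BUCKETS)]  # approximate counts
    for key, time_slice in records:
        if key not in rgs and len(rgs) < MAX_GROUPS:
            rgs[key] = [0] * NUM_SLICES      # room left: track this group exactly
        if key in rgs:
            rgs[key][time_slice] += 1
        else:
            # RGS is full: fold the record into a shared, size-bounded bucket.
            args[bucket_of(key)][time_slice] += 1
    return rgs, args


def aggregate(partials):
    """Combine (rgs, args) pairs; report groups seen exactly at least once."""
    results = {}
    keys = sorted({key for rgs, _ in partials for key in rgs})
    for key in keys:
        totals = [0] * NUM_SLICES
        exact = True
        for rgs, args in partials:
            if key in rgs:
                part = rgs[key]
            else:
                # The bucket may mix several groups, so this is an upper bound.
                part = args[bucket_of(key)]
                if any(part):
                    exact = False
            totals = [a + b for a, b in zip(totals, part)]
        results[key] = (totals, exact)
    return results


if __name__ == "__main__":
    node1 = accumulate([("a", 0), ("b", 0), ("a", 1), ("c", 2)])
    node2 = accumulate([("c", 0), ("c", 1), ("a", 2)])
    combined = aggregate([node1, node2])
    print(combined["a"])  # ([1, 1, 1], True)
    print(combined["c"])  # ([1, 1, 1], False)
```

In this sketch, group "c" overflows the RGS on the first accumulator, so its combined result is flagged as approximate. Both structures have a fixed size no matter how many distinct groups the scan encounters, which matches the size-bounded property the abstract claims.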