STATE REBALANCING IN STRUCTURED STREAMING

    Publication number: US20250061132A1

    Publication date: 2025-02-20

    Application number: US18822023

    Filing date: 2024-08-30

    Abstract: A data processing service performs a rebalancing process for rebalancing stateful tasks on a cluster computing system. In one instance, the method for rebalancing stateful tasks is performed such that the per-operator partitions are spread across available executors of a cluster of the cluster computing system with respect to one or more statistics of the tasks. In one instance, the method for rebalancing stateful tasks is also performed such that the total number of stateful tasks are balanced per executor as long as this rebalancing does not imbalance the per-operator placements. In this way, the processing of stateful tasks can be spread across multiple executors in a relatively uniform manner, even though there may be an upfront cost of breaking the local caching on an executor.
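
The two goals in this abstract (spread each operator's partitions across executors first, then even out the total task count per executor without disturbing that spread) can be illustrated with a small greedy sketch. This is not the patented method. The function name `rebalance`, the `(operator, partition) -> executor` mapping, and the use of plain partition counts as the only statistic are assumptions made for illustration, and every executor named in the mapping is assumed to be in the `executors` list.

```python
def rebalance(placement, executors):
    """Return a new {(operator, partition): executor} mapping.

    Hypothetical sketch: the only statistic used is the partition count.
    """
    new = dict(placement)  # leave the caller's mapping untouched
    operators = sorted({op for op, _ in new})

    def op_counts(op):
        # Number of partitions of `op` currently placed on each executor.
        counts = {e: 0 for e in executors}
        for (o, _), e in new.items():
            if o == op:
                counts[e] += 1
        return counts

    def move_one(op, src, dst):
        # Move the lowest-numbered partition of `op` from src to dst.
        key = min(k for k, e in new.items() if k[0] == op and e == src)
        new[key] = dst

    # Pass 1: spread each operator's partitions so that no two executors
    # differ by more than one partition of that operator.
    for op in operators:
        while True:
            counts = op_counts(op)
            src = max(executors, key=lambda e: counts[e])
            dst = min(executors, key=lambda e: counts[e])
            if counts[src] - counts[dst] <= 1:
                break
            move_one(op, src, dst)

    # Pass 2: even out total tasks per executor. A partition is moved only
    # for an operator where src holds more than dst, so the two executors
    # swap counts and the per-operator spread from pass 1 is preserved.
    while True:
        totals = {e: 0 for e in executors}
        for e in new.values():
            totals[e] += 1
        src = max(executors, key=lambda e: totals[e])
        dst = min(executors, key=lambda e: totals[e])
        if totals[src] - totals[dst] <= 1:
            break
        op = next(o for o in operators if op_counts(o)[src] > op_counts(o)[dst])
        move_one(op, src, dst)

    return new
```

Every move in this sketch corresponds to the upfront cost the abstract mentions: the receiving executor has no locally cached state for the moved partition and would have to load it before processing.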

    PIPELINED EXECUTION OF DATABASE QUERIES PROCESSING STREAMING DATA

    Publication number: US20250165477A1

    Publication date: 2025-05-22

    Application number: US18511902

    Filing date: 2023-11-16

    Abstract: A database system performs pipelined execution of queries that process batches of streaming data. The database system compiles a database query to generate an execution plan and determines a set of stages based on the execution plan. The database query processes streaming data comprising batches. A scheduler schedules pipelined execution stages of the database query. Accordingly, the database system performs execution of a particular stage processing a batch of the streaming data in parallel with subsequent stages of the database query processing previous batches of the streaming data. The system further maintains watermarks for different stages of the database query.
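
The pipelining idea in this abstract, where one stage works on a batch while later stages are still processing earlier batches, can be sketched with one thread per stage and a queue between stages. This is an illustrative sketch only, not the system described in the application. The function `run_pipeline`, the use of Python threads, and the simplification of a per-stage watermark to "the id of the last batch the stage finished" are all assumptions.

```python
import queue
import threading


def run_pipeline(batches, stages):
    """Run each stage in its own thread.

    Returns (results, watermarks), where watermarks[i] is the id of the
    last batch that stage i finished (a simplified, hypothetical watermark).
    """
    sentinel = object()  # marks the end of the stream
    queues = [queue.Queue() for _ in range(len(stages) + 1)]
    watermarks = [None] * len(stages)

    def worker(i, fn):
        while True:
            item = queues[i].get()
            if item is sentinel:
                queues[i + 1].put(sentinel)  # propagate end-of-stream
                return
            batch_id, data = item
            # Hand the result downstream; the next stage can start on it
            # while this stage picks up the following batch.
            queues[i + 1].put((batch_id, fn(data)))
            watermarks[i] = batch_id

    threads = [
        threading.Thread(target=worker, args=(i, fn))
        for i, fn in enumerate(stages)
    ]
    for t in threads:
        t.start()
    for batch_id, data in enumerate(batches):
        queues[0].put((batch_id, data))
    queues[0].put(sentinel)
    for t in threads:
        t.join()

    # Drain the final queue; order is preserved because each stage is a
    # single thread reading from a FIFO queue.
    results = []
    while True:
        item = queues[-1].get()
        if item is sentinel:
            break
        results.append(item[1])
    return results, watermarks
```

A call such as `run_pipeline(batches, [parse, aggregate])`, with `parse` and `aggregate` standing in for any two per-batch functions, lets `parse` begin on batch 1 as soon as it has handed batch 0 to `aggregate`.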

    State rebalancing in structured streaming

    Publication number: US12099525B2

    Publication date: 2024-09-24

    Application number: US18219314

    Filing date: 2023-07-07

    CPC classification number: G06F16/278 G06F16/24568

    Abstract: A data processing service performs a rebalancing process for rebalancing stateful tasks on a cluster computing system. In one instance, the method for rebalancing stateful tasks is performed such that the per-operator partitions are spread across available executors of a cluster of the cluster computing system with respect to one or more statistics of the tasks. In one instance, the method for rebalancing stateful tasks is also performed such that the total number of stateful tasks are balanced per executor as long as this rebalancing does not imbalance the per-operator placements. In this way, the processing of stateful tasks can be spread across multiple executors in a relatively uniform manner, even though there may be an upfront cost of breaking the local caching on an executor.

    STATE REBALANCING IN STRUCTURED STREAMING
    Invention publication

    Publication number: US20240202211A1

    Publication date: 2024-06-20

    Application number: US18219314

    Filing date: 2023-07-07

    CPC classification number: G06F16/278 G06F16/24568

    Abstract: A data processing service performs a rebalancing process for rebalancing stateful tasks on a cluster computing system. In one instance, the method for rebalancing stateful tasks is performed such that the per-operator partitions are spread across available executors of a cluster of the cluster computing system with respect to one or more statistics of the tasks. In one instance, the method for rebalancing stateful tasks is also performed such that the total number of stateful tasks are balanced per executor as long as this rebalancing does not imbalance the per-operator placements. In this way, the processing of stateful tasks can be spread across multiple executors in a relatively uniform manner, even though there may be an upfront cost of breaking the local caching on an executor.
