Tail latency-based job offloading in load-balanced groups
Abstract:
The type of a request that is currently being processed at a system is determined. A distribution is selected from a set of processing time distributions, the distribution forming a model that is applicable to the type. A threshold point is computed for the model. A processing time that exceeds the processing time at the threshold point is regarded as exhibiting tail latency. Tail latency includes a delay in processing of the request that is due to a reason other than a utilization of a resource of the system exceeding a threshold utilization and other than a size of a queue in the system exceeding a threshold size. When an evaluation is made that the request will experience tail latency during processing at the system, the processing of the request at the system is aborted and the request is offloaded for processing at a peer system in a load-balanced group of systems.
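The decision flow described in the abstract can be illustrated with a minimal sketch. It is not the claimed implementation: the per-type sample data, the 95th-percentile threshold point, the utilization and queue limits, the peer names, and every function name below are assumptions chosen for illustration. The sketch also assumes that the evaluation is made by comparing the elapsed processing time against the threshold point, and that a delay counts as tail latency only when neither resource utilization nor queue size explains it.

```python
from dataclasses import dataclass
from statistics import quantiles


@dataclass
class SystemState:
    utilization: float  # fraction of the resource in use, 0.0 to 1.0
    queue_size: int     # number of requests waiting in the queue


# Hypothetical processing time distributions (milliseconds), one per request type.
MODELS = {
    "read": [10, 12, 11, 13, 12, 14, 11, 12, 13, 50],
    "write": [30, 32, 35, 31, 33, 36, 34, 32, 31, 120],
}


def threshold_point(samples, percentile=95):
    """Return the processing time at the given percentile of the model."""
    # quantiles(n=100) yields 99 cut points; index p - 1 is the p-th percentile.
    cuts = quantiles(samples, n=100, method="inclusive")
    return cuts[percentile - 1]


def exhibits_tail_latency(req_type, elapsed_ms, state,
                          max_utilization=0.8, max_queue=100):
    """True when the request is slow for a reason other than overload."""
    model = MODELS[req_type]  # select the distribution for this type
    too_slow = elapsed_ms > threshold_point(model)
    overloaded = (state.utilization > max_utilization
                  or state.queue_size > max_queue)
    return too_slow and not overloaded


def handle(req_type, elapsed_ms, state, local, peers):
    """Return the name of the system that should process the request."""
    if exhibits_tail_latency(req_type, elapsed_ms, state):
        # Abort locally and offload to another member of the group.
        for peer in peers:
            if peer != local:
                return peer
    return local


if __name__ == "__main__":
    state = SystemState(utilization=0.3, queue_size=5)
    group = ["node-a", "node-b", "node-c"]
    print(handle("read", 40, state, "node-a", group))
    print(handle("read", 12, state, "node-a", group))
```

In this sketch a slow request on an overloaded system is not offloaded, because its delay is attributed to utilization or queueing rather than to tail latency; a real system would presumably handle that case through its ordinary load-balancing path.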