Dynamic kernel slicing for VGPU sharing in serverless computing systems
Abstract:
Various examples are disclosed for dynamic kernel slicing for virtual graphics processing unit (vGPU) sharing in serverless computing systems. A computing device is configured to provide a serverless computing service, receive a request for execution of program code in the serverless computing service in which a plurality of virtual graphics processing units (vGPUs) are used in the execution of the program code, determine a slice size to partition a compute kernel of the program code into a plurality of sub-kernels for concurrent execution by the vGPUs, the slice size being determined for individual ones of the sub-kernels based on an optimization function that considers a load on a GPU, determine an execution schedule for executing the individual ones of the sub-kernels on the vGPUs in accordance with a scheduling policy, and execute the sub-kernels on the vGPUs as partitioned in accordance with the execution schedule.
Information query
Patent Agency Ranking
0/0