-
Publication No.: US20250147766A1
Publication Date: 2025-05-08
Application No.: US18500879
Filing Date: 2023-11-02
Applicant: QUALCOMM Incorporated
IPC: G06F9/38
Abstract: Aspects of the disclosure are directed to processor prefetching. In accordance with one aspect, processor prefetching includes determining if an exposed latency parameter is greater than zero; determining if a loop is present in a program code; determining if access overlap across fibers is true; and performing one of the following: a) insert a prefetch every quantity K fibers at a prefetch distance D; b) insert a prefetch every quantity K fibers with a maximum of quantity N iterations; or c) insert a prefetch per fiber with a maximum of quantity N iterations ahead; and wherein, the prefetch distance D is equal to an exposed latency parameter divided by an average instruction latency parameter, the quantity K is equal to a cache line size divided by a per fiber access size, and the quantity N is equal to an exposed latency parameter divided by an average iteration latency parameter.
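The three quantities defined at the end of the abstract can be worked through numerically. The following is a minimal sketch, not the applicant's implementation: the function name, the parameter names, the units (cycles and bytes), and the choice to round D and N up are all assumptions made for illustration. The abstract does not state how the loop and access-overlap checks map onto options a), b), and c), so that selection is left out.

```python
import math


def prefetch_parameters(exposed_latency, avg_instruction_latency,
                        avg_iteration_latency, cache_line_size,
                        per_fiber_access_size):
    """Return D, K and N as defined in the abstract (hypothetical helper).

    Latencies are assumed to be in cycles and sizes in bytes.
    """
    # The abstract only considers prefetching when exposed latency is > 0.
    if exposed_latency <= 0:
        return None

    # D = exposed latency / average instruction latency.
    # Rounding up is an assumption; the abstract gives only the ratio.
    d = math.ceil(exposed_latency / avg_instruction_latency)

    # K = cache line size / per-fiber access size
    # (how many fibers share one cache line).
    k = cache_line_size // per_fiber_access_size

    # N = exposed latency / average iteration latency (rounded up, assumed).
    n = math.ceil(exposed_latency / avg_iteration_latency)

    return {"D": d, "K": k, "N": n}


# Illustrative numbers only; they are not taken from the application.
params = prefetch_parameters(
    exposed_latency=400,
    avg_instruction_latency=4,
    avg_iteration_latency=50,
    cache_line_size=64,
    per_fiber_access_size=4,
)
print(params)
```

With these made-up inputs, one prefetch would cover 16 fibers, since sixteen 4-byte accesses fill a 64-byte cache line.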
-
Publication No.: US20250165301A1
Publication Date: 2025-05-22
Application No.: US18512330
Filing Date: 2023-11-17
Applicant: QUALCOMM Incorporated
Inventor: Krishna Raju VEGIRAJU, Siva Rama Krishna Reddy BOGI REDDY, Adarsh GOLIKERI, Hongqiang WANG
IPC: G06F9/50, G06F16/901
Abstract: Certain aspects provide techniques and apparatuses for efficient operation of a machine learning model in a heterogeneous computing environment. An example method includes partitioning a graph representing a machine learning model into a plurality of subgraphs. Each subgraph generally represents a portion of the machine learning model. For each subgraph, a plurality of execution paths are simulated based on permutations of using different processing unit types to execute portions of the subgraph and starting with each input source processing unit type selected from the different processing unit types, and an execution path having a lowest cost is selected from the plurality of execution paths. The machine learning model is implemented based on the selected execution path for each subgraph.
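The per-subgraph search described in the abstract can be illustrated with a small brute-force sketch. Everything here is assumed for illustration: the function name `cheapest_path`, the flat per-operation cost table, the single fixed transfer cost charged whenever consecutive portions run on different unit types, and the treatment of a subgraph as a linear chain of operations. The abstract speaks of "permutations" of processing unit types; this sketch enumerates every assignment of a unit type to each operation, which is one possible reading.

```python
from itertools import product


def cheapest_path(ops, units, exec_cost, transfer_cost):
    """Enumerate unit assignments for one subgraph and return the cheapest.

    ops           -- ordered operation names in the subgraph (assumed linear)
    units         -- processing unit types, e.g. ["CPU", "NPU"]
    exec_cost     -- dict mapping (op, unit) to a hypothetical execution cost
    transfer_cost -- hypothetical cost of moving data between unit types
    Returns (cost, source_unit, assignment).
    """
    best = None
    # Start from each possible input source processing unit type.
    for source in units:
        # Try every assignment of a unit type to each operation.
        for assignment in product(units, repeat=len(ops)):
            cost = 0
            prev = source
            for op, unit in zip(ops, assignment):
                if unit != prev:
                    cost += transfer_cost  # data moves to another unit type
                cost += exec_cost[(op, unit)]
                prev = unit
            if best is None or cost < best[0]:
                best = (cost, source, assignment)
    return best


# Illustrative costs only; they are not taken from the application.
costs = {
    ("conv", "CPU"): 10, ("conv", "NPU"): 2,
    ("softmax", "CPU"): 1, ("softmax", "NPU"): 5,
}
print(cheapest_path(["conv", "softmax"], ["CPU", "NPU"], costs, transfer_cost=3))
```

Exhaustive enumeration grows exponentially with the number of operations, which is one plausible reason to partition the graph into subgraphs first, as the abstract describes.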
-