-
Publication No.: US20190325314A1
Publication Date: 2019-10-24
Application No.: US16456863
Filing Date: 2019-06-28
Applicant: Intel Corporation
Inventor: Mikael Bourges-Sevenier, Adam Herr, Sridhar Sharma, Derek Gerstmann, Todd Anderson, Justin Gottschlich
Abstract: Methods, apparatus, systems and articles of manufacture to optimize execution of a machine learning model are disclosed. An example apparatus includes a quantizer to quantize a layer of a model based on an execution constraint, the layer of the model represented by a matrix. A packer is to pack the quantized layer of the matrix to create a packed layer represented by a packed matrix, the packed matrix having non-zero values of the matrix grouped together along at least one of a row or a column of the matrix. A blocker is to block the packed layer into a blocked layer by dividing the non-zero values in the packed matrix into blocks. A fuser is to fuse the blocked layer into a pipeline. A packager is to package the pipeline into a binary.
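Illustrative sketch (not from the patent text): the first three stages named in the abstract, quantizing, packing and blocking, can be pictured on a small matrix. Everything below is an assumption made for illustration: the function names, the symmetric integer quantization scheme, the use of a bit width as the "execution constraint", the row-wise packing that keeps original column indices, and the fixed block size. The fusing and packaging stages are not sketched.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def quantize(layer: Matrix, bits: int = 8) -> Tuple[List[List[int]], float]:
    """Uniform symmetric quantization to signed integers (assumed scheme).

    The bit width stands in for the abstract's "execution constraint".
    """
    max_abs = max((abs(v) for row in layer for v in row), default=0.0)
    qmax = 2 ** (bits - 1) - 1
    scale = max_abs / qmax if max_abs else 1.0
    return [[round(v / scale) for v in row] for row in layer], scale


def pack(q: List[List[int]]) -> Tuple[List[List[int]], List[List[int]]]:
    """Group each row's non-zero values together, remembering their columns."""
    values, cols = [], []
    for row in q:
        nonzero = [(c, v) for c, v in enumerate(row) if v != 0]
        cols.append([c for c, _ in nonzero])
        values.append([v for _, v in nonzero])
    return values, cols


def block(values: List[List[int]], block_size: int = 2) -> List[List[List[int]]]:
    """Divide each packed row of non-zero values into fixed-size blocks."""
    return [
        [row[i:i + block_size] for i in range(0, len(row), block_size)]
        for row in values
    ]


# Hypothetical layer weights; the small value 0.3 quantizes to zero.
layer = [
    [0.0, 127.0, 0.0, -64.0],
    [0.3, 0.0, 0.0, 0.0],
    [32.0, 0.0, 5.0, 16.0],
]
q, scale = quantize(layer)
values, cols = pack(q)
blocks = block(values, block_size=2)
```

In this sketch, quantization turns small weights into zeros, packing then moves the remaining non-zero values next to each other, and blocking splits them into units that a later stage could schedule together.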
-
12.
Publication No.: US20190317740A1
Publication Date: 2019-10-17
Application No.: US16455379
Filing Date: 2019-06-27
Applicant: Intel Corporation
Inventor: Adam Herr, Derek Gerstmann, Justin Gottschlich, Mikael Bourges-Sevenier, Sridhar Sharma
Abstract: Methods, apparatus, systems and articles of manufacture are disclosed for runtime scheduling of software executing on a heterogeneous system. An example apparatus includes in response to a variant compiler to generate a representation of an algorithm in a domain-specific language (DSL), a compilation auto-scheduler to generate a schedule based on configurations for processing elements of the heterogeneous system, the processing elements including at least a first and a second processing element, the variant compiler to compile variant binaries based on the schedule, each of the variant binaries associated with the algorithm in the DSL, the variant binaries including a first variant binary corresponding to the first processing element and a second variant binary corresponding to the second processing element, and an application compiler to generate a fat binary including a runtime scheduler to select one or more of the variant binaries to execute a workload based on the schedule.
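Illustrative sketch (not from the patent text): the idea of a fat binary that carries one variant per processing element, plus a runtime scheduler that picks among them, can be pictured as follows. The class name `FatBinary`, the variant functions, the element names "cpu" and "gpu", and the size-threshold selection policy are all hypothetical; the abstract only says that selection is based on the schedule and does not specify a policy.

```python
from typing import Callable, Dict, List, Set, Tuple

Variant = Callable[[List[int]], int]


def sum_on_cpu(data: List[int]) -> int:
    """Stand-in for a variant binary compiled for a first processing element."""
    return sum(data)


def sum_on_gpu(data: List[int]) -> int:
    """Stand-in for a variant binary compiled for a second processing element."""
    return sum(data)


class FatBinary:
    """Bundles per-element variants with a tiny runtime scheduler (illustrative)."""

    def __init__(self, variants: Dict[str, Variant], schedule: Dict[str, int]):
        self.variants = variants
        # Assumed schedule format: the minimum workload size at which
        # each processing element becomes eligible.
        self.schedule = schedule

    def select(self, workload_size: int, available: Set[str]) -> str:
        """Pick the eligible, available element with the highest threshold."""
        candidates = [
            name for name in self.variants
            if name in available and workload_size >= self.schedule[name]
        ]
        if not candidates:
            raise RuntimeError("no variant binary can run this workload")
        return max(candidates, key=lambda name: self.schedule[name])

    def execute(self, data: List[int], available: Set[str]) -> Tuple[str, int]:
        """Run the workload on the selected variant and report which one ran."""
        name = self.select(len(data), available)
        return name, self.variants[name](data)


fat = FatBinary(
    variants={"cpu": sum_on_cpu, "gpu": sum_on_gpu},
    schedule={"cpu": 0, "gpu": 1000},
)
small = fat.execute([1, 2, 3], available={"cpu", "gpu"})
large = fat.execute(list(range(2000)), available={"cpu", "gpu"})
fallback = fat.execute(list(range(2000)), available={"cpu"})
```

Under these assumptions a small workload stays on the "cpu" variant, a large one moves to the "gpu" variant, and the same large workload falls back to "cpu" when the second element is unavailable.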
-