-
1.
Publication No.: US20240037378A1
Publication Date: 2024-02-01
Application No.: US18255391
Filing Date: 2020-12-24
Applicant: Intel Corporation
Inventor: Guokai Ma , Jiong Gong , Dhiraj Kalamkar , Rachitha Prem Seelin , Hongzhen Liu , Akshay Jain , Liangang Zhang
Abstract: Systems, apparatuses and methods may provide for technology that identifies an embedding table associated with a neural network. The neural network is associated with a plurality of compute nodes. The technology further identifies a number of entries of the embedding table, and determines whether to process gradients associated with the embedding table as dense gradients or sparse gradients based on the number of entries.
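The decision described in this abstract is essentially a branch on table size, which the short sketch below illustrates with PyTorch's nn.Embedding. This is only an illustration of the dense-versus-sparse gradient choice, not the patented technology; the threshold constant and the make_embedding helper are assumptions introduced here.

```python
# Minimal sketch (not the patented implementation): choose dense vs. sparse
# gradient handling for an embedding table based on its number of entries.
# SPARSE_THRESHOLD and make_embedding are illustrative assumptions.
import torch
import torch.nn as nn

SPARSE_THRESHOLD = 100_000  # assumed cutoff; the abstract leaves the criterion abstract

def make_embedding(num_entries: int, dim: int) -> nn.Embedding:
    """Use sparse gradients for large tables, dense gradients for small ones."""
    use_sparse = num_entries >= SPARSE_THRESHOLD
    return nn.Embedding(num_entries, dim, sparse=use_sparse)

# Small table: dense gradients are cheap and suit allreduce across compute nodes.
small = make_embedding(10_000, 64)
# Large table: sparse gradients touch only the rows seen in the batch.
large = make_embedding(10_000_000, 64)

ids = torch.randint(0, 10_000, (32,))
loss = small(ids).sum() + large(ids).sum()
loss.backward()
print(type(small.weight.grad), large.weight.grad.is_sparse)  # dense tensor, True
```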
-
2.
Publication No.: US12229554B2
Publication Date: 2025-02-18
Application No.: US17463405
Filing Date: 2021-08-31
Applicant: Intel Corporation
Inventor: Alexander Heinecke , Menachem Adelman , Robert Valentine , Zeev Sperber , Amit Gradstein , Mark Charney , Evangelos Georganas , Dhiraj Kalamkar , Christopher Hughes , Cristina Anderson
Abstract: Techniques for performing a BF16 FMA in response to an instruction are described. In some examples, an instruction has fields for an opcode, an identification of a location of a packed data source/destination operand (a first source), an identification of a location of a second packed data source operand, and an identification of a location of a third packed data source operand, wherein the opcode is to indicate operand ordering and that execution circuitry is to, per data element position, perform a BF16 fused multiply-accumulate operation using the first, second, and third source operands and store a result in a corresponding data element position of the source/destination operand.
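A rough way to see the described operation is to emulate it per data element position, as in the sketch below. This is an emulation, not the instruction itself; the bfloat16 truncation helper, the float32 intermediate precision, and the final rounding back to bfloat16 are assumptions made for illustration, and real hardware would apply its own rounding behavior.

```python
# Minimal sketch (an emulation, not the described instruction): per data element
# position, dst[i] = bf16(dst[i] + a[i] * b[i]), with inputs reduced to
# bfloat16 precision. Truncation is used instead of round-to-nearest-even.
import numpy as np

def to_bf16(x: np.ndarray) -> np.ndarray:
    """Reduce float32 values to bfloat16 precision by truncating the low 16 bits."""
    bits = x.astype(np.float32).view(np.uint32)
    return (bits & 0xFFFF0000).view(np.float32)

def bf16_fma(dst: np.ndarray, a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Fused multiply-accumulate per element, result stored at bfloat16 precision."""
    return to_bf16(dst + to_bf16(a) * to_bf16(b))

a = np.random.rand(16).astype(np.float32)
b = np.random.rand(16).astype(np.float32)
dst = np.zeros(16, dtype=np.float32)
print(bf16_fma(dst, a, b))
```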
-