-
Publication No.: AU2021251304A1
Publication Date: 2022-09-15
Application No.: AU2021251304
Application Date: 2021-01-28
Applicant: IBM
Inventor: SAWADA JUN, MODHA DHARMENDRA, CASSIDY ANDREW STEPHEN, ARTHUR JOHN VERNON, NAYAK TAPAN, ORTEGA OTERO CARLOS, TABA BRIAN SEISHO, AKOPYAN FILIPP, DATTA PALLAB
Abstract: Neural inference chips for computing neural activations are provided. In various embodiments, a neural inference chip comprises at least one neural core, a memory array, an instruction buffer, and an instruction memory. The instruction buffer has a position corresponding to each of a plurality of elements of the memory array. The instruction memory provides at least one instruction to the instruction buffer. The instruction buffer advances the at least one instruction between positions in the instruction buffer. The instruction buffer provides the at least one instruction to at least one of the plurality of elements of the memory array from its associated position in the instruction buffer when the memory of the at least one of the plurality of elements contains data associated with the at least one instruction. Each element of the memory array provides a data block from its memory to its horizontal buffer in response to the arrival of an associated instruction from the instruction buffer. The horizontal buffer of each element of the memory array provides a data block to the horizontal buffer of another of the elements of the memory array or to the at least one neural core.
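The data flow described in this abstract can be sketched as a small cycle-by-cycle simulation. This is an illustrative model only, not the patented design: the names `Instruction`, `target`, `address`, and `run` are hypothetical, and the sketch assumes a one-dimensional chain in which both the instruction buffer and the horizontal buffers advance one position per step, with the last horizontal buffer feeding a single neural core.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Instruction:
    target: int   # hypothetical: index of the memory element holding the data
    address: int  # hypothetical: location of the data block in that element


def run(memories: List[List[str]], program: List[Instruction]) -> List[str]:
    """Return the data blocks in the order the (assumed single) core receives them."""
    n = len(memories)
    ibuf: List[Optional[Instruction]] = [None] * n  # one position per memory element
    hbuf: List[Optional[str]] = [None] * n          # one horizontal buffer per element
    pending = list(program)                         # stands in for the instruction memory
    core_inputs: List[str] = []

    def busy(buf) -> bool:
        return any(slot is not None for slot in buf)

    while pending or busy(ibuf) or busy(hbuf):
        # Horizontal buffers pass blocks along the chain; the last one feeds the core.
        if hbuf[-1] is not None:
            core_inputs.append(hbuf[-1])
        hbuf = [None] + hbuf[:-1]

        # Instructions advance one position; a new one enters at position 0.
        # An instruction that no element claims simply falls off the end.
        ibuf = [pending.pop(0) if pending else None] + ibuf[:-1]

        # An element acts only on an instruction whose data its memory holds.
        for i, instr in enumerate(ibuf):
            if instr is not None and instr.target == i:
                assert hbuf[i] is None, "horizontal buffer collision"
                hbuf[i] = memories[i][instr.address]
                ibuf[i] = None  # instruction consumed by this element

    return core_inputs


memories = [["a0", "a1"], ["b0", "b1"], ["c0", "c1"]]
program = [Instruction(2, 0), Instruction(0, 1), Instruction(1, 1)]
blocks = run(memories, program)
```

Under these assumptions, instructions and data blocks move in the same direction at the same rate, so each block reaches the core a fixed number of steps after its instruction is issued and the blocks arrive in program order, regardless of which element held the data.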
-
Publication No.: AU2020369825A1
Publication Date: 2022-03-31
Application No.: AU2020369825
Application Date: 2020-10-01
Applicant: IBM
Inventor: CASSIDY ANDREW, AKOPYAN FILIPP, APPUSWAMY RATHINAKUMAR, ARTHUR JOHN, DATTA PALLAB, DEBOLE MICHAEL, ESSER STEVE, FLICKNER MYRON, MODHA DHARMENDRA, ORTEGA OTERO CARLOS, SAWADA JUN
Abstract: Three-dimensional neural inference processing units are provided. A first tier comprises a plurality of neural cores. Each core comprises a neural computation unit. The neural computation unit is adapted to apply a plurality of synaptic weights to a plurality of input activations to produce a plurality of output activations. A second tier comprises a first neural network model memory adapted to store the plurality of synaptic weights. A communication network is operatively coupled to the first neural network model memory and to each of the plurality of neural cores, and adapted to provide the synaptic weights from the first neural network model memory to each of the plurality of neural cores.
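The division of roles in this abstract (a model memory tier that stores synaptic weights, a network that delivers them, and cores that apply them to input activations) can be illustrated with a minimal functional sketch. The class and function names here (`ModelMemory`, `NeuralCore`, `distribute`) are hypothetical, and the ReLU nonlinearity is an assumption; the abstract does not specify how output activations are derived beyond applying weights to inputs.

```python
from typing import List, Optional

Matrix = List[List[float]]


class ModelMemory:
    """Stands in for the second tier: holds the synaptic weights."""

    def __init__(self, weights: Matrix) -> None:
        self.weights = weights


class NeuralCore:
    """Stands in for one first-tier core with its neural computation unit."""

    def __init__(self) -> None:
        self.weights: Optional[Matrix] = None

    def load(self, weights: Matrix) -> None:
        self.weights = weights

    def compute(self, activations: List[float]) -> List[float]:
        if self.weights is None:
            raise RuntimeError("weights have not been delivered to this core")
        # Weighted sum per output, followed by ReLU (the nonlinearity is assumed).
        return [
            max(0.0, sum(w * a for w, a in zip(row, activations)))
            for row in self.weights
        ]


def distribute(memory: ModelMemory, cores: List[NeuralCore]) -> None:
    """Stands in for the communication network: every core receives the weights."""
    for core in cores:
        core.load(memory.weights)


memory = ModelMemory([[1.0, -1.0], [0.5, 0.5]])
cores = [NeuralCore(), NeuralCore()]
distribute(memory, cores)

# Each core applies the same weights to its own input activations.
out0 = cores[0].compute([2.0, 1.0])
out1 = cores[1].compute([1.0, 3.0])
```

The sketch captures only the logical relationship between the tiers; it says nothing about the physical stacking or interconnect that the three-dimensional arrangement refers to.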
-