-
Publication No.: US11270197B2
Publication date: 2022-03-08
Application No.: US16672918
Filing date: 2019-11-04
Applicant: NVIDIA Corp.
Inventor: Yakun Shao, Rangharajan Venkatesan, Miaorong Wang, Daniel Smith, William James Dally, Joel Emer, Stephen W. Keckler, Brucek Khailany
Abstract: A distributed deep neural net (DNN) utilizing a distributed, tile-based architecture includes multiple chips, each with a central processing element, a global memory buffer, and a plurality of additional processing elements. Each additional processing element includes a weight buffer, an activation buffer, and vector multiply-accumulate units to combine, in parallel, the weight values and the activation values using stationary data flows.
-
Publication No.: US20220076110A1
Publication date: 2022-03-10
Application No.: US17530852
Filing date: 2021-11-19
Applicant: NVIDIA Corp.
Inventor: Yakun Shao, Rangharajan Venkatesan, Miaorong Wang, Daniel Smith, William James Dally, Joel Emer, Stephen W. Keckler, Brucek Khailany
Abstract: A distributed deep neural net (DNN) utilizing a distributed, tile-based architecture includes multiple chips, each with a central processing element, a global memory buffer, and a plurality of additional processing elements. Each additional processing element includes a weight buffer, an activation buffer, and vector multiply-accumulate units to combine, in parallel, the weight values and the activation values using stationary data flows.
-
Publication No.: US20200293867A1
Publication date: 2020-09-17
Application No.: US16672918
Filing date: 2019-11-04
Applicant: NVIDIA Corp.
Inventor: Yakun Shao, Rangharajan Venkatesan, Miaorong Wang, Daniel Smith, William James Dally, Joel Emer, Stephen W. Keckler, Brucek Khailany
Abstract: A distributed deep neural net (DNN) utilizing a distributed, tile-based architecture includes multiple chips, each with a central processing element, a global memory buffer, and a plurality of additional processing elements. Each additional processing element includes a weight buffer, an activation buffer, and vector multiply-accumulate units to combine, in parallel, the weight values and the activation values using stationary data flows.
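
The processing element described in the abstracts above can be illustrated with a small software model. This is only a sketch under stated assumptions: it models a weight-stationary data flow (one of several possible stationary data flows), the names ProcessingElement, run_layer, and num_pes are hypothetical, and nothing here is taken from the patent claims or from NVIDIA hardware. Parallel vector multiply-accumulate units are emulated with ordinary loops.

```python
# Illustrative sketch only: all class, function, and parameter names are
# hypothetical and are not taken from the patent text.


class ProcessingElement:
    """One tile: a weight buffer, an activation buffer, and vector MAC lanes."""

    def __init__(self, weights):
        # Weight-stationary assumption: weights are loaded once and stay
        # resident while activations stream through.
        self.weight_buffer = [list(row) for row in weights]
        self.activation_buffer = []

    def load_activations(self, activations):
        # Overwrite the activation buffer with the next input vector.
        self.activation_buffer = list(activations)

    def compute(self):
        # Each row models one vector multiply-accumulate unit. Hardware
        # would run the rows in parallel; this model runs them in sequence.
        return [
            sum(w * a for w, a in zip(row, self.activation_buffer))
            for row in self.weight_buffer
        ]


def run_layer(weights, activation_stream, num_pes):
    """Split output rows across PEs, then stream activation vectors through."""
    rows_per_pe = -(-len(weights) // num_pes)  # ceiling division
    pes = [
        ProcessingElement(weights[i:i + rows_per_pe])
        for i in range(0, len(weights), rows_per_pe)
    ]
    outputs = []
    for activations in activation_stream:
        result = []
        for pe in pes:
            # Models a broadcast of the same activations to every PE.
            pe.load_activations(activations)
            result.extend(pe.compute())
        outputs.append(result)
    return outputs
```

Calling run_layer with a 3x2 weight matrix, two activation vectors, and num_pes=2 assigns two output rows to the first processing element and one to the second. The weights are written once per element and reused for every activation vector, which is the property a weight-stationary data flow is meant to exploit.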
-