-
Publication No.: US10387319B2
Publication Date: 2019-08-20
Application No.: US15640534
Application Date: 2017-07-01
Applicant: Intel Corporation
Inventor: Michael C. Adler , Chiachen Chou , Neal C. Crago , Kermin Fleming , Kent D. Glossop , Aamer Jaleel , Pratik M. Marolia , Simon C. Steely, Jr. , Samantika S. Sury
IPC: G06F12/0802 , G06F15/00 , G06F12/0862 , H03K19/177 , G06F15/78 , G11C8/12 , G06F17/50 , G06F15/80
Abstract: Systems, methods, and apparatuses relating to a configurable spatial accelerator are described. In one embodiment, a processor includes a plurality of processing elements; and an interconnect network between the plurality of processing elements to receive an input of a dataflow graph comprising a plurality of nodes, wherein the dataflow graph is to be overlaid into the interconnect network and the plurality of processing elements with each node represented as a dataflow operator in the plurality of processing elements, and the plurality of processing elements is to perform an operation when an incoming operand set arrives at the plurality of processing elements. The processor also includes a streamer element to prefetch the incoming operand set from two or more levels of a memory system.
-
Publication No.: US10469397B2
Publication Date: 2019-11-05
Application No.: US15640540
Application Date: 2017-07-01
Applicant: Intel Corporation
Inventor: Kermin Fleming , Kent D. Glossop , Simon C. Steely, Jr.
IPC: H04L12/721 , H04L12/801 , H04L12/863 , H04L12/935 , H04L12/937
Abstract: Systems, methods, and apparatuses relating to configurable network-based dataflow operator circuits are described. In one embodiment, a processor includes a spatial array of processing elements, and a packet switched communications network to route data within the spatial array between processing elements according to a dataflow graph to perform a first dataflow operation of the dataflow graph, wherein the packet switched communications network further comprises a plurality of network dataflow endpoint circuits to perform a second dataflow operation of the dataflow graph.
-
Publication No.: US20190004878A1
Publication Date: 2019-01-03
Application No.: US15640542
Application Date: 2017-07-01
Applicant: Intel Corporation
Inventor: Michael C. Adler , Kermin Fleming , Kent D. Glossop , Simon C. Steely, Jr.
Abstract: Systems, methods, and apparatuses relating to a configurable spatial accelerator are described. In one embodiment, a processor includes a plurality of processing elements; and an interconnect network between the plurality of processing elements to receive an input of two dataflow graphs each comprising a plurality of nodes, wherein a first dataflow graph and a second dataflow graph are to be overlaid into a first and second portion, respectively, of the interconnect network and a first and second subset, respectively, of the plurality of processing elements with each node represented as a dataflow operator in the plurality of processing elements, and the first and second subsets of the plurality of processing elements are to perform a first and second operation, respectively, when incoming first and second, respectively, operand sets arrive at the plurality of processing elements.
-
Publication No.: US10445234B2
Publication Date: 2019-10-15
Application No.: US15640533
Application Date: 2017-07-01
Applicant: Intel Corporation
Inventor: Kermin Fleming , Kent D. Glossop , Simon C. Steely, Jr. , Samantika S. Sury
IPC: G06F12/0802 , H03K19/177 , G06F17/50 , G11C7/10 , G06F15/78 , G06F15/80 , G11C8/12
Abstract: Systems, methods, and apparatuses relating to a configurable spatial accelerator are described. In an embodiment, a processor includes a plurality of processing elements; and an interconnect network between the plurality of processing elements to receive an input of a dataflow graph comprising a plurality of nodes, wherein the dataflow graph is to be overlaid into the interconnect network and the plurality of processing elements with each node represented as a dataflow operator in the plurality of processing elements, and the plurality of processing elements are to perform an atomic operation when an incoming operand set arrives at the plurality of processing elements.
-
Publication No.: US10416999B2
Publication Date: 2019-09-17
Application No.: US15396395
Application Date: 2016-12-30
Applicant: Intel Corporation
Inventor: Kermin Fleming , Kent D. Glossop , Simon C. Steely, Jr.
Abstract: Systems, methods, and apparatuses relating to a configurable spatial accelerator are described. In one embodiment, a processor includes a core with a decoder to decode an instruction into a decoded instruction and an execution unit to execute the decoded instruction to perform a first operation; a plurality of processing elements; and an interconnect network between the plurality of processing elements to receive an input of a dataflow graph comprising a plurality of nodes, wherein the dataflow graph is to be overlaid into the interconnect network and the plurality of processing elements with each node represented as a dataflow operator in the plurality of processing elements, and the plurality of processing elements are to perform a second operation by a respective, incoming operand set arriving at each of the dataflow operators of the plurality of processing elements.
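The firing behavior this abstract describes, where each dataflow operator performs its operation once its own incoming operand set has arrived, can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware: the class `DataflowOperator`, the functions `run` and `example`, the per-port FIFO queues standing in for interconnect channels, and the sample graph `(a + b) * c` are all hypothetical names and choices introduced here for illustration.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, Dict, List, Tuple


@dataclass
class DataflowOperator:
    """One graph node mapped onto a (hypothetical) processing element."""
    name: str
    fn: Callable[..., int]
    num_inputs: int
    # One FIFO per input port; these stand in for interconnect channels.
    inputs: List[Deque[int]] = field(init=False, default_factory=list)
    # (consumer, input port) pairs that receive this operator's result.
    consumers: List[Tuple["DataflowOperator", int]] = field(default_factory=list)

    def __post_init__(self) -> None:
        self.inputs = [deque() for _ in range(self.num_inputs)]

    def ready(self) -> bool:
        # Firing rule: there is at least one port and every port holds a token.
        return bool(self.inputs) and all(self.inputs)

    def fire(self) -> int:
        # Consume one token per port, compute, and forward the result.
        operands = [port.popleft() for port in self.inputs]
        result = self.fn(*operands)
        for consumer, port in self.consumers:
            consumer.inputs[port].append(result)
        return result


def run(operators: List[DataflowOperator]) -> Dict[str, int]:
    """Fire ready operators until none is ready; return each one's last result."""
    results: Dict[str, int] = {}
    progress = True
    while progress:
        progress = False
        for op in operators:
            if op.ready():
                results[op.name] = op.fire()
                progress = True
    return results


def example() -> Dict[str, int]:
    # Graph for (a + b) * c with a=2, b=3, c=4.
    add = DataflowOperator("add", lambda x, y: x + y, 2)
    mul = DataflowOperator("mul", lambda x, y: x * y, 2)
    add.consumers.append((mul, 0))
    add.inputs[0].append(2)
    add.inputs[1].append(3)
    mul.inputs[1].append(4)
    # mul is listed first on purpose: it cannot fire until add produces a token.
    return run([mul, add])
```

Calling `example()` returns a dictionary with the result of each operator. Execution order here is driven by operand availability rather than by the order in which the operators are listed, which is the property the sketch is meant to show.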
-
Publication No.: US20190095383A1
Publication Date: 2019-03-28
Application No.: US15719281
Application Date: 2017-09-28
Applicant: Intel Corporation
Inventor: Kermin Fleming , Simon C. Steely, Jr. , Kent D. Glossop
IPC: G06F15/80
Abstract: Systems, methods, and apparatuses relating to debugging a configurable spatial accelerator are described. In one embodiment, a processor includes a plurality of processing elements and an interconnect network between the plurality of processing elements to receive an input of a dataflow graph comprising a plurality of nodes, wherein the dataflow graph is to be overlaid into the interconnect network and the plurality of processing elements with each node represented as a dataflow operator in the plurality of processing elements, and the plurality of processing elements are to perform an operation by a respective, incoming operand set arriving at each of the dataflow operators of the plurality of processing elements. At least a first of the plurality of processing elements is to enter a halted state in response to being represented as a first of the plurality of dataflow operators.
-
Publication No.: US11086816B2
Publication Date: 2021-08-10
Application No.: US15719281
Application Date: 2017-09-28
Applicant: Intel Corporation
Inventor: Kermin Fleming , Simon C. Steely, Jr. , Kent D. Glossop
IPC: G06F15/80
Abstract: Systems, methods, and apparatuses relating to debugging a configurable spatial accelerator are described. In one embodiment, a processor includes a plurality of processing elements and an interconnect network between the plurality of processing elements to receive an input of a dataflow graph comprising a plurality of nodes, wherein the dataflow graph is to be overlaid into the interconnect network and the plurality of processing elements with each node represented as a dataflow operator in the plurality of processing elements, and the plurality of processing elements are to perform an operation by a respective, incoming operand set arriving at each of the dataflow operators of the plurality of processing elements. At least a first of the plurality of processing elements is to enter a halted state in response to being represented as a first of the plurality of dataflow operators.
-
Publication No.: US10515046B2
Publication Date: 2019-12-24
Application No.: US15640543
Application Date: 2017-07-01
Applicant: Intel Corporation
Inventor: Kermin Fleming , Kent D. Glossop , Simon C. Steely, Jr.
Abstract: Systems, methods, and apparatuses relating to a configurable spatial accelerator are described. In one embodiment, a processor includes a synchronizer circuit coupled between an interconnect network of a first tile and an interconnect network of a second tile and comprising storage to store data to be sent between the interconnect network of the first tile and the interconnect network of the second tile, the synchronizer circuit to convert the data from the storage between a first voltage or a first frequency of the first tile and a second voltage or a second frequency of the second tile to generate converted data, and send the converted data between the interconnect network of the first tile and the interconnect network of the second tile.
-
Publication No.: US20190005161A1
Publication Date: 2019-01-03
Application No.: US15640535
Application Date: 2017-07-01
Applicant: Intel Corporation
Inventor: Kermin Fleming , Kent D. Glossop , Simon C. Steely, Jr. , Ping Tak Peter Tang
IPC: G06F17/50 , G06F15/78 , G06F12/0802
CPC classification number: G06F17/505 , G06F12/0802 , G06F12/0862 , G06F12/0888 , G06F12/0895 , G06F15/7867 , G06F15/8015 , G11C8/12 , H03K19/17736 , H03K19/17756 , H03K19/17764 , H03K19/17776 , H03K19/1778
Abstract: Systems, methods, and apparatuses relating to a configurable spatial accelerator are described. In one embodiment, a processor includes a plurality of processing elements; and an interconnect network between the plurality of processing elements to receive an input of a dataflow graph comprising a plurality of nodes, wherein the dataflow graph is to be overlaid into the interconnect network and the plurality of processing elements with each node represented as a dataflow operator in the plurality of processing elements, and the plurality of processing elements is to perform an operation when an incoming operand set arrives at the plurality of processing elements. At least one of the plurality of processing elements includes a plurality of control inputs.
-
Publication No.: US20190004955A1
Publication Date: 2019-01-03
Application No.: US15640534
Application Date: 2017-07-01
Applicant: Intel Corporation
Inventor: Michael C. Adler , Chiachen Chou , Neal C. Crago , Kermin Fleming , Kent D. Glossop , Aamer Jaleel , Pratik M. Marolia , Simon C. Steely, Jr. , Samantika S. Sury
IPC: G06F12/0862 , G06F12/0802 , H03K19/177 , G06F15/78
CPC classification number: G06F12/0862 , G06F12/0802 , G06F15/7867 , G06F15/8015 , G06F17/505 , G06F2212/6026 , G11C8/12 , H03K19/17736 , H03K19/17756 , H03K19/1776 , H03K19/17764 , H03K19/17776 , H03K19/1778
Abstract: Systems, methods, and apparatuses relating to a configurable spatial accelerator are described. In one embodiment, a processor includes a plurality of processing elements; and an interconnect network between the plurality of processing elements to receive an input of a dataflow graph comprising a plurality of nodes, wherein the dataflow graph is to be overlaid into the interconnect network and the plurality of processing elements with each node represented as a dataflow operator in the plurality of processing elements, and the plurality of processing elements is to perform an operation when an incoming operand set arrives at the plurality of processing elements. The processor also includes a streamer element to prefetch the incoming operand set from two or more levels of a memory system.