Patent search ap:("INTEL CORPORATION") AND inv:"TOLL Page Bret"

11.

Invention publication
DEEP LEARNING IMPLEMENTATIONS USING SYSTOLIC ARRAYS AND FUSED OPERATIONS [Under examination, published]

Publication (announcement) No.: EP3798928A1

Publication (announcement) date: 2021-03-31

Application No.: EP20179527.5

Application date: 2020-06-11

Applicant: INTEL Corporation

Inventor: RASH, William; MAIYURAN, Subramaniam; GEORGE, Varghese; TOLL, Bret; SANKARAN, Rajesh; CHAPPELL, Robert; PAL, Supratim; HEINECKE, Alexander F.; OULD-AHMED-VALL, Elmoustapha; CHEN, Gang

IPC: G06N3/063, G06N3/04, G06F9/30, G06F17/16

Abstract: Disclosed embodiments relate to deep learning implementations using systolic arrays and fused operations. In one example, a processor includes fetch and decode circuitry to fetch and decode an instruction having fields to specify an opcode and locations of a destination and N source matrices, the opcode indicating the processor is to load the N source matrices from memory, perform N convolutions on the N source matrices to generate N feature maps, and store results of the N convolutions in registers to be passed to an activation layer, wherein the processor is to perform the N convolutions and the activation layer with at most one memory load of each of the N source matrices. The processor further includes scheduling circuitry to schedule execution of the instruction and execution circuitry to execute the instruction as per the opcode.

12.

Invention publication
SYSTEMS FOR PERFORMING INSTRUCTIONS TO QUICKLY CONVERT AND USE TILES AS 1D VECTORS [Under examination, published]

Publication (announcement) No.: EP3629154A3

Publication (announcement) date: 2020-05-06

Application No.: EP19182737.7

Application date: 2019-06-26

Applicant: INTEL Corporation

Inventor: TOLL, Bret; HUGHES, Christopher J.; BAUM, Dan; OULD-AHMED-VALL, Elmoustapha; SADE, Raanan; VALENTINE, Robert; CHARNEY, Mark J.; HEINECKE, Alexander F.

IPC: G06F9/30

Abstract: Disclosed embodiments relate to systems for performing instructions to quickly convert and use matrices (tiles) as one-dimensional vectors. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode, locations of a two-dimensional (2D) matrix and a one-dimensional (1D) vector, and a group of elements comprising one of a row, part of a row, multiple rows, a column, part of a column, multiple columns, and a rectangular sub-tile of the specified 2D matrix, and wherein the opcode is to indicate a move of the specified group between the 2D matrix and the 1D vector, decode circuitry to decode the fetched instruction; and execution circuitry, responsive to the decoded instruction, when the opcode specifies a move from 1D, to move contents of the specified 1D vector to the specified group of elements.
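The move between a 2D tile and a 1D vector described in this abstract can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the tile is modeled as a list of rows, only the whole-row and whole-column cases are shown, and the function names `move_vector_to_row` and `move_column_to_vector` are hypothetical, not taken from the patent.

```python
def move_vector_to_row(tile, row, vector):
    """Move from 1D: copy a 1D vector into one full row of a 2D tile (in place)."""
    if len(vector) != len(tile[row]):
        raise ValueError("vector length must match the tile row length")
    tile[row][:] = vector


def move_column_to_vector(tile, col):
    """Move to 1D: read one full column of a 2D tile out as a 1D vector."""
    return [r[col] for r in tile]


# Hypothetical 2x3 tile used only for demonstration.
tile = [[0, 0, 0], [0, 0, 0]]
move_vector_to_row(tile, 1, [7, 8, 9])
column = move_column_to_vector(tile, 2)
```

The other groups listed in the abstract (part of a row, multiple rows, a rectangular sub-tile) would follow the same pattern with different index ranges.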

13.

Invention publication
SYSTEMS AND METHODS FOR IMPLEMENTING CHAINED TILE OPERATIONS [Under examination, published]

Publication (announcement) No.: EP3547120A1

Publication (announcement) date: 2019-10-02

Application No.: EP19157043.1

Application date: 2019-02-13

Applicant: INTEL Corporation

Inventor: HUGHES, Christopher J.; HEINECKE, Alexander F.; VALENTINE, Robert; TOLL, Bret; CORBAL, Jesus; OULD-AHMED-VALL, Elmoustapha

IPC: G06F9/38, G06F15/78, G06F9/30

Abstract: Disclosed embodiments relate to systems and methods for implementing chained tile operations. In one example, a processor includes fetch circuitry to fetch one or more instructions until a plurality of instructions has been fetched, each instruction to specify source and destination tile operands, decode circuitry to decode the fetched instructions, and execution circuitry, responsive to the decoded instructions, to: identify first and second decoded instructions belonging to a chain of instructions, dynamically select and configure a SIMD path comprising first and second processing engines (PE) to execute the first and second decoded instructions, and set aside the specified destination of the first decoded instruction, and instead route a result of the first decoded instruction from the first PE to be used by the second PE to perform the second decoded instruction.

14.

Invention publication
MULTIPLE REGISTER MEMORY ACCESS INSTRUCTIONS, PROCESSORS, METHODS, AND SYSTEMS [Under examination, published]

Publication (announcement) No.: EP3014416A1

Publication (announcement) date: 2016-05-04

Application No.: EP14817022.8

Application date: 2014-06-26

Applicant: Intel Corporation

Inventor: HINTON, Glenn; TOLL, Bret; SINGHAL, Ronak

IPC: G06F9/06, G06F12/08

CPC classification number: G11C7/1036, G06F9/30043, G06F9/30109, G06F9/30163

Abstract: A processor includes N-bit registers and a decode unit to receive a multiple register memory access instruction. The multiple register memory access instruction is to indicate a memory location and a register. The processor includes a memory access unit coupled with the decode unit and with the N-bit registers. The memory access unit is to perform a multiple register memory access operation in response to the multiple register memory access instruction. The operation is to involve N-bit data, in each of the N-bit registers comprising the indicated register. The operation is also to involve different corresponding N-bit portions of an MxN-bit line of memory corresponding to the indicated memory location. A total number of bits of the N-bit data in the N-bit registers to be involved in the multiple register memory access operation is to amount to at least half of the MxN-bits of the line of memory.

15.

Invention publication
METHOD AND APPARATUS FOR DISABLING A CLOCK SIGNAL WITHIN A MULTITHREADED PROCESSOR [In force]

Publication (announcement) No.: EP1236107A2

Publication (announcement) date: 2002-09-04

Application No.: EP00970828.0

Application date: 2000-10-11

Applicant: INTEL CORPORATION

Inventor: RODGERS, Dion; TOLL, Bret; WOOD, Amiee

IPC: G06F9/46

CPC classification number: G06F1/3203, G06F1/3237, G06F1/3287, G06F9/384, G06F9/3851, Y02D10/128, Y02D10/171, Y02D50/20

Abstract: A method includes maintaining an indication of a pending event with respect to each of a number of threads supported within a multithreaded processor. An indication is also maintained of an active or inactive state for each of the multiple threads. A clock disable condition is detected. This clock disable condition may be indicated by the absence of pending events with respect to each of the multiple threads and an inactive state for each of the multiple threads. A clock signal, if enabled, is then disabled with respect to at least one functional unit within the multithreaded processor responsive to the detection of the clock disable condition.
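The clock disable condition in this abstract reduces to a simple predicate over per-thread state, sketched below in Python. This is a software illustration only, since the patent describes hardware; the `ThreadState` class and both function names are hypothetical, and the sketch assumes the condition is exactly "no pending events and all threads inactive", which the abstract presents as one possible indication.

```python
from dataclasses import dataclass


@dataclass
class ThreadState:
    pending_event: bool  # an event is pending for this thread
    active: bool         # the thread is in the active state


def clock_disable_condition(threads):
    """True when no thread has a pending event and every thread is inactive."""
    return all(not t.pending_event and not t.active for t in threads)


def update_clock(clock_enabled, threads):
    """Return the new clock-enable state for one functional unit."""
    if clock_enabled and clock_disable_condition(threads):
        return False  # disable the clock signal for this unit
    return clock_enabled


# Two idle threads with nothing pending: the clock can be disabled.
idle = [ThreadState(False, False), ThreadState(False, False)]
# One thread still has a pending event: the clock stays enabled.
busy = [ThreadState(True, False), ThreadState(False, False)]
```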

16.

Invention publication
SYSTEMS AND METHODS FOR PERFORMING HORIZONTAL TILE OPERATIONS [Under examination, published]

Publication (announcement) No.: EP3623940A3

Publication (announcement) date: 2020-05-06

Application No.: EP19183497.7

Application date: 2019-06-28

Applicant: Intel Corporation

Inventor: HUGHES, Christopher J.; TOLL, Bret; BAUM, Dan; OULD-AHMED-VALL, Elmoustapha; SADE, Raanan; VALENTINE, Robert; CHARNEY, Mark J.; HEINECKE, Alexander F.

IPC: G06F9/30, G06F9/38

Abstract: Disclosed embodiments relate to systems and methods for performing instructions specifying horizontal tile operations. In one example, a processor includes fetch circuitry to fetch an instruction specifying a horizontal tile operation, a location of a M by N source matrix comprising K groups of elements, and locations of K destinations, wherein each of the K groups of elements comprises the same number of elements, decode circuitry to decode the fetched instruction, and execution circuitry to respond to the decoded instruction by generating K results, each result being generated by performing the specified horizontal tile operation across every element of a corresponding group of the K groups, and writing each generated result to a corresponding location of the K specified destination locations.
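A horizontal tile operation as described in this abstract reduces each of K equal-sized groups of a matrix to a single result. The Python sketch below assumes the groups are either the rows or the columns of the source matrix, which is one possible grouping and not necessarily the only one the patent covers; `horizontal_tile_op` and its `group` parameter are hypothetical names.

```python
from functools import reduce
import operator


def horizontal_tile_op(matrix, op, group="row"):
    """Reduce every group (row or column) of the matrix with a binary operation.

    Returns K results, one per group, in group order.
    """
    if group == "row":
        groups = matrix
    elif group == "column":
        groups = [list(col) for col in zip(*matrix)]
    else:
        raise ValueError("group must be 'row' or 'column'")
    return [reduce(op, g) for g in groups]


# Hypothetical 2x3 source matrix.
source = [[1, 2, 3], [4, 5, 6]]
row_sums = horizontal_tile_op(source, operator.add)            # K = 2 results
col_maxes = horizontal_tile_op(source, max, group="column")    # K = 3 results
```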

17.

Invention publication
SYSTEMS AND METHODS FOR PERFORMING INSTRUCTIONS TO TRANSFORM MATRICES INTO ROW-INTERLEAVED FORMAT [Under examination, published]

Publication (announcement) No.: EP3629158A2

Publication (announcement) date: 2020-04-01

Application No.: EP19183078.5

Application date: 2019-06-27

Applicant: INTEL Corporation

Inventor: SADE, Raanan; VALENTINE, Robert; TOLL, Bret; HUGHES, Christopher J.; HEINECKE, Alexander F.; OULD-AHMED-VALL, ElMoustapha; CHARNEY, Mark J.

IPC: G06F9/30

Abstract: Disclosed embodiments relate to systems and methods for performing instructions to transform matrices into a row-interleaved format. In one example, a processor includes fetch and decode circuitry to fetch and decode an instruction having fields to specify an opcode and locations of source and destination matrices, wherein the opcode indicates that the processor is to transform the specified source matrix into the specified destination matrix having the row-interleaved format; and execution circuitry to respond to the decoded instruction by transforming the specified source matrix into the specified RowInt-formatted destination matrix by interleaving J elements of each J-element sub-column of the specified source matrix in either row-major or column-major order into a K-wide submatrix of the specified destination matrix, the K-wide submatrix having K columns and enough rows to hold the J elements.

18.

Invention publication
SYSTEMS FOR PERFORMING INSTRUCTIONS TO QUICKLY CONVERT AND USE TILES AS 1D VECTORS [Under examination, published]

Publication (announcement) No.: EP3629154A2

Publication (announcement) date: 2020-04-01

Application No.: EP19182737.7

Application date: 2019-06-26

Applicant: INTEL Corporation

Inventor: TOLL, Bret; HUGHES, Christopher J.; BAUM, Dan; OULD-AHMED-VALL, Elmoustapha; SADE, Raanan; VALENTINE, Robert; CHARNEY, Mark J.; HEINECKE, Alexander F.

IPC: G06F9/30

Abstract: Disclosed embodiments relate to systems for performing instructions to quickly convert and use matrices (tiles) as one-dimensional vectors. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode, locations of a two-dimensional (2D) matrix and a one-dimensional (1D) vector, and a group of elements comprising one of a row, part of a row, multiple rows, a column, part of a column, multiple columns, and a rectangular sub-tile of the specified 2D matrix, and wherein the opcode is to indicate a move of the specified group between the 2D matrix and the 1D vector, decode circuitry to decode the fetched instruction; and execution circuitry, responsive to the decoded instruction, when the opcode specifies a move from 1D, to move contents of the specified 1D vector to the specified group of elements.

19.

Invention publication
SYSTEMS AND METHODS FOR PERFORMING MATRIX COMPRESS AND DECOMPRESS INSTRUCTIONS [Under examination, published]

Publication (announcement) No.: EP3629153A2

Publication (announcement) date: 2020-04-01

Application No.: EP19182736.9

Application date: 2019-06-26

Applicant: INTEL Corporation

Inventor: BAUM, Dan; ESPIG, Michael; GUILFORD, James; FEGHALI, Wajdi K.; SADE, Raanan; HUGHES, Christopher J.; VALENTINE, Robert; TOLL, Bret; OULD-AHMED-VALL, Elmoustapha; CHARNEY, Mark J.; GOPAL, Vinodh; ZOHAR, Ronen; HEINECKE, Alexander F.

IPC: G06F9/30

Abstract: Disclosed embodiments relate to matrix compress/decompress instructions. In one example, a processor includes fetch circuitry to fetch a compress instruction having a format with fields to specify an opcode and locations of decompressed source and compressed destination matrices, decode circuitry to decode the fetched compress instructions, and execution circuitry, responsive to the decoded compress instruction, to: generate a compressed result according to a compress algorithm by compressing the specified decompressed source matrix by either packing non-zero-valued elements together and storing the matrix position of each non-zero-valued element in a header, or using fewer bits to represent one or more elements and using the header to identify matrix elements being represented by fewer bits; and store the compressed result to the specified compressed destination matrix.
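The first compression option named in this abstract, packing non-zero elements together and recording each one's matrix position in a header, can be sketched in Python as follows. The header layout used here (a list of `(row, column)` pairs) and the function names `compress` and `decompress` are assumptions made for illustration; the abstract does not specify an encoding.

```python
def compress(matrix):
    """Pack non-zero elements and record their (row, col) positions in a header."""
    header = []  # assumed layout: one (row, col) pair per packed value
    values = []  # non-zero elements, packed in row-major order
    for r, row in enumerate(matrix):
        for c, v in enumerate(row):
            if v != 0:
                header.append((r, c))
                values.append(v)
    return header, values


def decompress(header, values, rows, cols):
    """Rebuild the full matrix, filling unlisted positions with zero."""
    out = [[0] * cols for _ in range(rows)]
    for (r, c), v in zip(header, values):
        out[r][c] = v
    return out


# Hypothetical sparse 2x3 matrix used only for demonstration.
sparse = [[0, 5, 0], [7, 0, 0]]
header, values = compress(sparse)
restored = decompress(header, values, rows=2, cols=3)
```

The second option in the abstract, representing some elements with fewer bits and flagging them in the header, is not shown here.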

20.

Invention publication
SYSTEMS AND METHODS FOR PERFORMING INSTRUCTIONS SPECIFYING TERNARY TILE LOGIC OPERATIONS [Under examination, published]

Publication (announcement) No.: EP3623941A2

Publication (announcement) date: 2020-03-18

Application No.: EP19183501.6

Application date: 2019-06-28

Applicant: INTEL Corporation

Inventor: OULD-AHMED-VALL, Elmoustapha; HUGHES, Christopher J.; TOLL, Bret; BAUM, Dan; SADE, Raanan; VALENTINE, Robert; CHARNEY, Mark J.; HEINECKE, Alexander F.

IPC: G06F9/30, G06F9/38, G06F17/16

Abstract: Disclosed embodiments relate to systems and methods for performing instructions specifying ternary tile operations. In one example, a processor includes fetch and decode circuitry to fetch and decode an instruction specifying a ternary tile operation, and locations of destination and first, second, and third source matrices, each of the matrices having M rows by N columns; and execution circuitry to respond to the decoded instruction by, for each equal-sized group of K elements of the specified first, second, and third source matrices, generate K results by performing the ternary tile operation in parallel on K corresponding elements of the specified first, second, and third source matrices, and store each of the K results to a corresponding element of the specified destination matrix, wherein corresponding elements of the specified source and destination matrices occupy a same relative position within their associated matrix.
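The element-wise behavior described in this abstract, where each destination element is computed from the three source elements at the same relative position, can be illustrated with a short Python sketch. The function name `ternary_tile_op` is hypothetical, the example operation `(a & b) | c` is chosen arbitrarily, and the sketch processes elements sequentially, whereas the abstract describes groups of K elements handled in parallel.

```python
def ternary_tile_op(a, b, c, op):
    """Apply a three-operand operation to corresponding elements of three M x N matrices."""
    rows, cols = len(a), len(a[0])
    for m in (b, c):
        if len(m) != rows or any(len(r) != cols for r in m):
            raise ValueError("all source matrices must be M rows by N columns")
    return [[op(a[i][j], b[i][j], c[i][j]) for j in range(cols)]
            for i in range(rows)]


# Hypothetical 1x2 source matrices holding small bit patterns.
src1 = [[0b1100, 0b1010]]
src2 = [[0b1010, 0b0110]]
src3 = [[0b0001, 0b0000]]
dest = ternary_tile_op(src1, src2, src3, lambda x, y, z: (x & y) | z)
```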
