-
1.
Publication number: US11113053B2
Publication date: 2021-09-07
Application number: US16579394
Filing date: 2019-09-23
Applicant: Intel Corporation
Inventor: Asit K. Mishra , Edward T. Grochowski , Jonathan D. Pearce , Deborah T. Marr , Ehud Cohen , Elmoustapha Ould-Ahmed-Vall , Jesus Corbal San Adrian , Robert Valentine , Mark J. Charney , Christopher J. Hughes , Milind B. Girkar
IPC: G06F9/30
Abstract: A processor includes a decode unit to decode an instruction that is to indicate a first source packed data operand that is to include at least four data elements, to indicate a second source packed data operand that is to include at least four data elements, and to indicate one or more destination storage locations. The execution unit, in response to the instruction, is to store at least one result mask operand in the destination storage location(s). The at least one result mask operand is to include a different mask element for each corresponding data element in one of the first and second source packed data operands in a same relative position. Each mask element is to indicate whether the corresponding data element in said one of the source packed data operands equals any of the data elements in the other of the source packed data operands.
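The mask semantics in this abstract can be illustrated with a short scalar sketch. This is only a software model under stated assumptions, not the patented hardware: the function name `any_equal_mask` and the use of Python lists for packed operands are hypothetical, and a real instruction would produce the mask in a single operation rather than in a loop.

```python
def any_equal_mask(first, second):
    """Return one mask element per element of `first`.

    A mask element is 1 when the element of `first` in the same relative
    position equals any element of `second`, otherwise 0.
    (Hypothetical helper; lists stand in for packed data operands.)
    """
    if len(first) < 4 or len(second) < 4:
        # The abstract requires at least four data elements per source operand.
        raise ValueError("each source operand needs at least four data elements")
    return [1 if any(x == y for y in second) else 0 for x in first]


first = [3, 7, 1, 9]
second = [9, 2, 3, 4]

# One result mask per source operand, as the abstract allows.
mask_for_first = any_equal_mask(first, second)   # [1, 0, 0, 1]
mask_for_second = any_equal_mask(second, first)  # [1, 0, 1, 0]
```

Computing both masks corresponds to the case where the destination holds more than one result mask operand; calling the function once models the single-mask case.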
-
2.
Publication number: US20180004510A1
Publication date: 2018-01-04
Application number: US15201442
Filing date: 2016-07-02
Applicant: Intel Corporation
Inventor: Edward T. Grochowski , Asit K. Mishra , Robert Valentine , Mark J. Charney , Simon C. Steely, Jr.
CPC classification number: G06F9/3001 , G06F9/30036 , G06F9/30145 , G06F9/3861 , G06F9/3865
Abstract: A processor of an aspect includes a decode unit to decode a matrix multiplication instruction. The matrix multiplication instruction is to indicate a first memory location of a first source matrix, is to indicate a second memory location of a second source matrix, and is to indicate a third memory location where a result matrix is to be stored. The processor also includes an execution unit coupled with the decode unit. The execution unit, in response to the matrix multiplication instruction, is to multiply a portion of the first and second source matrices prior to an interruption, and store a completion progress indicator in response to the interruption. The completion progress indicator to indicate an amount of progress in multiplying the first and second source matrices, and storing corresponding result data to the third memory location, that is to have been completed prior to the interruption.
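The idea of a completion progress indicator can be sketched in software as follows. This is a minimal model under assumptions, not the claimed implementation: the function name `matmul_resumable`, the `budget` parameter (which simulates an interruption after a fixed number of result elements), and the choice of a flat result-element count as the progress value are all hypothetical.

```python
def matmul_resumable(a, b, c, progress=0, budget=None):
    """Multiply a (n x k) by b (k x m) into c, resuming at `progress`.

    `progress` counts result elements already computed and stored, in
    row-major order. `budget` (hypothetical) is the number of elements
    to compute before a simulated interruption; None means run to the end.
    Returns the updated progress value.
    """
    n, k, m = len(a), len(b), len(b[0])
    total = n * m
    done = 0
    while progress < total:
        if budget is not None and done == budget:
            break  # simulated interruption: stop and report progress
        i, j = divmod(progress, m)
        c[i][j] = sum(a[i][p] * b[p][j] for p in range(k))
        progress += 1
        done += 1
    return progress


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
c = [[0, 0], [0, 0]]

saved = matmul_resumable(a, b, c, budget=3)     # interrupted after 3 elements
saved = matmul_resumable(a, b, c, progress=saved)  # resumes and finishes
```

After the first call `saved` is 3 and `c` holds three finished elements; the second call computes only the remaining element, so no completed work is repeated.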
-
3.
Publication number: US20170185413A1
Publication date: 2017-06-29
Application number: US14998151
Filing date: 2015-12-23
Applicant: Intel Corporation
Inventor: Asit K. Mishra , Kshitij A. Doshi , Elmoustapha Ould-Ahmed-Vall , Jesus Corbal , Deborah T. Marr
CPC classification number: G06F9/3887 , G06F9/30 , G06F9/30032 , G06F9/30036 , G06F9/30098
Abstract: Single Instruction, Multiple Data (SIMD) technologies are described. A processor may include a first register to receive a plurality of source elements and a second register. The processor may receive a conjugate permute index at a third register. The conjugate permute index has elements, each of which corresponds to one of the source elements. The processor then stores each of the source elements to a position in the second register based on a select element corresponding to the source element.
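One plausible reading of this abstract is a scatter-style permute, where each index element names the destination position of its source element. The sketch below models that reading only; the function name `conjugate_permute`, the list representation of registers, and the handling of unwritten positions (left as `None`) are assumptions, and the abstract does not say what happens when two index elements select the same position.

```python
def conjugate_permute(source, index):
    """Store source[i] at destination position index[i].

    Hypothetical model: lists stand in for the first, second, and third
    registers. If two index elements are equal, the later write wins
    here, which is an assumption and not stated in the abstract.
    """
    if len(source) != len(index):
        raise ValueError("index needs one element per source element")
    dest = [None] * len(source)
    for element, position in zip(source, index):
        dest[position] = element
    return dest


result = conjugate_permute(["a", "b", "c", "d"], [2, 0, 3, 1])
# result == ["b", "d", "a", "c"]
```

This differs from a conventional gather-style permute, where the index names the source position of each destination element.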
-
4.
Publication number: US12050912B2
Publication date: 2024-07-30
Application number: US18220225
Filing date: 2023-07-10
Applicant: Intel Corporation
Inventor: Edward T. Grochowski , Asit K. Mishra , Robert Valentine , Mark J. Charney , Simon C. Steely, Jr.
CPC classification number: G06F9/3001 , G06F9/30036 , G06F9/30145 , G06F9/3861 , G06F9/3865
Abstract: A processor of an aspect includes a decode unit to decode a matrix multiplication instruction. The matrix multiplication instruction is to indicate a first memory location of a first source matrix, is to indicate a second memory location of a second source matrix, and is to indicate a third memory location where a result matrix is to be stored. The processor also includes an execution unit coupled with the decode unit. The execution unit, in response to the matrix multiplication instruction, is to multiply a portion of the first and second source matrices prior to an interruption, and store a completion progress indicator in response to the interruption. The completion progress indicator to indicate an amount of progress in multiplying the first and second source matrices, and storing corresponding result data to the third memory location, that is to have been completed prior to the interruption.
-
5.
Publication number: US11048508B2
Publication date: 2021-06-29
Application number: US16398200
Filing date: 2019-04-29
Applicant: Intel Corporation
Inventor: Edward T. Grochowski , Asit K. Mishra , Robert Valentine , Mark J. Charney , Simon C. Steely, Jr.
Abstract: A processor of an aspect includes a decode unit to decode a matrix multiplication instruction. The matrix multiplication instruction is to indicate a first memory location of a first source matrix, is to indicate a second memory location of a second source matrix, and is to indicate a third memory location where a result matrix is to be stored. The processor also includes an execution unit coupled with the decode unit. The execution unit, in response to the matrix multiplication instruction, is to multiply a portion of the first and second source matrices prior to an interruption, and store a completion progress indicator in response to the interruption. The completion progress indicator to indicate an amount of progress in multiplying the first and second source matrices, and storing corresponding result data to the third memory location, that is to have been completed prior to the interruption.
-
6.
Publication number: US20170090924A1
Publication date: 2017-03-30
Application number: US14866921
Filing date: 2015-09-26
Applicant: Intel Corporation
Inventor: Asit K. Mishra , Edward T. Grochowski , Jonathan D. Pearce , Deborah T. Marr , Ehud Cohen , Elmoustapha Ould-Ahmed-Vall , Jesus Corbal San Adrian , Robert Valentine , Mark J. Charney , Christopher J. Hughes , Milind B. Girkar
IPC: G06F9/30
Abstract: A processor includes a decode unit to decode an instruction that is to indicate a first source packed data operand that is to include at least four data elements, to indicate a second source packed data operand that is to include at least four data elements, and to indicate one or more destination storage locations. The execution unit, in response to the instruction, is to store at least one result mask operand in the destination storage location(s). The at least one result mask operand is to include a different mask element for each corresponding data element in one of the first and second source packed data operands in a same relative position. Each mask element is to indicate whether the corresponding data element in said one of the source packed data operands equals any of the data elements in the other of the source packed data operands.
-
7.
Publication number: US10423411B2
Publication date: 2019-09-24
Application number: US14866921
Filing date: 2015-09-26
Applicant: Intel Corporation
Inventor: Asit K. Mishra , Edward T. Grochowski , Jonathan D. Pearce , Deborah T. Marr , Ehud Cohen , Elmoustapha Ould-Ahmed-Vall , Jesus Corbal San Adrian , Robert Valentine , Mark J. Charney , Christopher J. Hughes , Milind B. Girkar
IPC: G06F9/30
Abstract: A processor includes a decode unit to decode an instruction that is to indicate a first source packed data operand that is to include at least four data elements, to indicate a second source packed data operand that is to include at least four data elements, and to indicate one or more destination storage locations. The execution unit, in response to the instruction, is to store at least one result mask operand in the destination storage location(s). The at least one result mask operand is to include a different mask element for each corresponding data element in one of the first and second source packed data operands in a same relative position. Each mask element is to indicate whether the corresponding data element in said one of the source packed data operands equals any of the data elements in the other of the source packed data operands.
-
8.
Publication number: US10275243B2
Publication date: 2019-04-30
Application number: US15201442
Filing date: 2016-07-02
Applicant: Intel Corporation
Inventor: Edward T. Grochowski , Asit K. Mishra , Robert Valentine , Mark J. Charney , Simon C. Steely, Jr.
Abstract: A processor of an aspect includes a decode unit to decode a matrix multiplication instruction. The matrix multiplication instruction is to indicate a first memory location of a first source matrix, is to indicate a second memory location of a second source matrix, and is to indicate a third memory location where a result matrix is to be stored. The processor also includes an execution unit coupled with the decode unit. The execution unit, in response to the matrix multiplication instruction, is to multiply a portion of the first and second source matrices prior to an interruption, and store a completion progress indicator in response to the interruption. The completion progress indicator to indicate an amount of progress in multiplying the first and second source matrices, and storing corresponding result data to the third memory location, that is to have been completed prior to the interruption.
-
9.
Publication number: US20180173437A1
Publication date: 2018-06-21
Application number: US15384178
Filing date: 2016-12-19
Applicant: Intel Corporation
Inventor: Asit K. Mishra , Deborah T. Marr , Edward T. Grochowski
IPC: G06F3/06
CPC classification number: G06F3/0614 , G06F3/0646 , G06F3/0683 , G06F9/3001 , G06F9/30036 , G06F9/3877 , G06F2212/1016
Abstract: First elements of a dense vector to be multiplied with first elements of a first row of a sparse array may be determined. The determined first elements of the dense vector may be written into a memory. A dot product for the first elements of the sparse array and the first elements of the dense vector may be calculated in a plurality of increments by multiplying a subset of the first elements of the sparse array and a corresponding subset of the first elements of the dense vector. A sequence number may be updated after each increment is completed to identify a column number and/or a row number of the sparse array for which the dot product calculations have been completed.
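The incremental dot product described here can be sketched as below. This is an illustrative model under assumptions, not the claimed method: the function name `incremental_dot`, the `chunk` size, the representation of a sparse row as parallel `values` and `columns` lists, and the choice to record the last completed column number as the sequence number are all hypothetical.

```python
def incremental_dot(values, columns, dense, chunk=2):
    """Dot product of one sparse row with a dense vector, in increments.

    values  : nonzero entries of the sparse row
    columns : column number of each nonzero entry
    dense   : the dense vector
    chunk   : hypothetical number of entries handled per increment

    Returns (dot_product, history), where history lists the sequence
    number (last completed column) after each increment.
    """
    # Determine the dense elements that pair with the row's nonzeros and
    # copy them into a contiguous buffer, modelling the write to memory.
    gathered = [dense[c] for c in columns]

    total = 0
    history = []
    for start in range(0, len(values), chunk):
        end = min(start + chunk, len(values))
        total += sum(v * g for v, g in zip(values[start:end], gathered[start:end]))
        history.append(columns[end - 1])  # sequence number after this increment
    return total, history


dot, seq = incremental_dot([2, 3, 4], [0, 2, 5], [1, 0, 2, 0, 0, 3])
# dot == 20, seq == [2, 5]
```

Because the sequence number is updated after every increment, a computation that stops partway could in principle resume from the recorded column instead of starting over.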
-
10.
Publication number: US20170185415A1
Publication date: 2017-06-29
Application number: US14757609
Filing date: 2015-12-23
Applicant: Intel Corporation
Inventor: Asit K. Mishra , Kshitij A. Doshi , Elmoustapha Ould-Ahmed-Vall , Deborah T. Marr
CPC classification number: G06F9/3889 , G06F9/30 , G06F9/30032 , G06F9/30036 , G06F9/30043 , G06F9/30112 , G06F9/3861 , G06F9/3887
Abstract: A processor comprises a first register to store a plurality of data items at a plurality of positions within the first register, a second register, and an execution unit, operatively coupled to the first register and the second register, the execution unit comprising a logic circuit implementing a sort instruction for sorting the plurality of data items stored in the first register in an order of data item values, and storing, in the second register, a plurality of indices, wherein each index identifies a position associated with a data item stored in the first register prior to the sorting.
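The result of such a sort instruction resembles what is often called an argsort: the sorted values plus, for each one, its position before sorting. The sketch below is a software analogy only; the function name `sort_with_indices`, ascending order, and stable tie-breaking are assumptions that the abstract does not specify.

```python
def sort_with_indices(items):
    """Return (sorted_items, indices).

    indices[k] is the position that sorted_items[k] occupied in `items`
    before sorting. Ascending order and stable ties are assumptions.
    """
    order = sorted(range(len(items)), key=lambda i: items[i])
    return [items[i] for i in order], order


sorted_items, indices = sort_with_indices([30, 10, 20])
# sorted_items == [10, 20, 30], indices == [1, 2, 0]
```

Keeping the original positions lets the same reordering be applied afterwards to other data associated with each item.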
-