-
公开(公告)号:US20190171515A1
公开(公告)日:2019-06-06
申请号:US15831195
申请日:2017-12-04
Applicant: Intel Corporation
Inventor: Zeev Sperber , Stanislav Shwartsman , Jared W. Stark, IV , Lihu Rappoport , Igor Yanover , George Leifman
CPC classification number: G06F11/0793 , G06F9/30043 , G06F9/30058 , G06F9/3802 , G06F9/3855 , G06F11/0721 , G06F12/0215 , G06F12/0253 , G06F2212/654 , G06F2212/702
Abstract: A method for handling load faults in an out-of-order processor is described. The method includes detecting, by a memory ordering buffer of the out-of-order processor, a load fault corresponding to a load instruction that was executed out-of-order by the out-of-order processor; determining, by the memory ordering buffer, whether instant reclamation is available for resolving the load fault of the load instruction; and performing, in response to determining that instant reclamation is available for resolving the load fault of the load instruction, instant reclamation to re-fetch the load instruction for execution prior to attempting to retire the load instruction.
-
公开(公告)号:US10303605B2
公开(公告)日:2019-05-28
申请号:US15214895
申请日:2016-07-20
Applicant: INTEL CORPORATION
Inventor: Raanan Sade , Joseph Nuzman , Stanislav Shwartsman , Igor Yanover , Liron Zur
IPC: G06F12/00 , G06F13/00 , G06F12/0815 , G06F12/0893
Abstract: An example system on a chip (SoC) includes a processor, a cache, and a main memory. The SoC can include a first memory to store data in a memory line, wherein the memory line is set to an invalid state. The processor can include a processor coupled to the first memory. The processor can determine that a data size of a first data set received from an application is within a data size range. The processor can determine that an aggregate data size of the first data set and a second data set received from the application is at least a same data size as data size of the memory line. The processor can perform an invalid-to-modify (I2M) operation to change the memory line from the invalid state to a modified state. The processor can write the first data set and the second data set to the memory line.
-
公开(公告)号:US20180210842A1
公开(公告)日:2018-07-26
申请号:US15416549
申请日:2017-01-26
Applicant: Intel Corporation
Inventor: Joseph Nuzman , Raanan Sade , Igor Yanover , Ron Gabor , Amit Gradstein
IPC: G06F12/1036
CPC classification number: G06F12/1036 , G06F12/1027 , G06F2212/1016 , G06F2212/657 , G06F2212/683 , G06F2212/684
Abstract: A processing device including a linear address transformation circuit to determine that a metadata value stored in a portion of a linear address falls within a pre-defined metadata range. The metadata value corresponds to a plurality of metadata bits. The linear address transformation circuit to replace each of the plurality of the metadata bits with a constant value.
-
公开(公告)号:US20160103785A1
公开(公告)日:2016-04-14
申请号:US14881111
申请日:2015-10-12
Applicant: Intel Corporation
Inventor: Zeev Sperber , Robert Valentine , Guy Patkin , Stanislav Shwartsman , Shlomo Raikin , Igor Yanover , Gal Ofir
CPC classification number: G06F15/8007 , G06F9/30018 , G06F9/30036 , G06F9/30043 , G06F9/30145 , G06F9/345 , G06F9/3887
Abstract: Methods and apparatus are disclosed for using an index array and finite state machine for scatter/gather operations. Embodiment of apparatus may comprise: decode logic to decode a scatter/gather instruction and generate a set of micro-operations, and an index array to hold a set of indices and a corresponding set of mask elements. A finite state machine facilitates the gather operation. Address generation logic generates an address from an index of the set of indices for at least each of the corresponding mask elements having a first value. An address is accessed to load a corresponding data element if the mask element had the first value. The data element is written at an in-register position in a destination vector register according to a respective in-register position the index. Values of corresponding mask elements are changed from the first value to a second value responsive to completion of their respective loads.
-
公开(公告)号:US11915000B2
公开(公告)日:2024-02-27
申请号:US18160600
申请日:2023-01-27
Applicant: Intel Corporation
Inventor: Ahmad Yasin , Raanan Sade , Liron Zur , Igor Yanover , Joseph Nuzman
CPC classification number: G06F9/30145 , G06F9/30098 , G06F9/544 , G06F9/546 , G06F11/3037 , G06F11/348
Abstract: Systems, methods, and apparatuses relating to circuitry to precisely monitor memory store accesses are described. In one embodiment, a system includes a memory, a hardware processor core comprising a decoder to decode an instruction into a decoded instruction, an execution circuit to execute the decoded instruction to produce a resultant, a store buffer, and a retirement circuit to retire the instruction when a store request for the resultant from the execution circuit is queued into the store buffer for storage into the memory, and a performance monitoring circuit to mark the retired instruction for monitoring of post-retirement performance information between being queued in the store buffer and being stored in the memory, enable a store fence after the retired instruction to be inserted that causes previous store requests to complete within the memory, and on detection of completion of the store request for the instruction in the memory, store the post-retirement performance information in storage of the performance monitoring circuit.
-
公开(公告)号:US11693785B2
公开(公告)日:2023-07-04
申请号:US16728527
申请日:2019-12-27
Applicant: Intel Corporation
Inventor: Ron Gabor , Enrico Perla , Raanan Sade , Igor Yanover , Tomer Stark , Joseph Nuzman
IPC: G06F12/00 , G06F12/0895 , G06F12/1081 , G06F12/1009 , G06F12/0811 , G06F12/14 , G06F9/30 , G06F11/30
CPC classification number: G06F12/0895 , G06F9/30043 , G06F9/30101 , G06F11/3037 , G06F12/0811 , G06F12/1009 , G06F12/1081 , G06F12/1441 , G06F12/1466 , G06F2212/7207
Abstract: An apparatus and method for tagged memory management. For example, one embodiment of a processor comprises: execution circuitry to execute instructions and process data, at least one instruction to generate a system memory access request having a first address pointer; and address translation circuitry to determine whether to translate the first address pointer with or without metadata processing, wherein if the first address pointer is to be translated with metadata processing, the address translation circuitry to: perform a lookup in a memory metadata table to identify a memory metadata value, determine a pointer metadata value associated with the first address pointer, and compare the memory metadata value with the pointer metadata value, the comparison to generate a validation of the memory access request or a fault condition, wherein if the comparison results in a validation of the memory access request, then accessing a set of one or more address translation tables to translate the first address pointer to a first physical address and to return the first physical address responsive to the memory access request.
-
公开(公告)号:US11544062B2
公开(公告)日:2023-01-03
申请号:US16833596
申请日:2020-03-28
Applicant: Intel Corporation
Inventor: Raanan Sade , Igor Yanover , Stanislav Shwartsman , Muhammad Taher , David Zysman , Liron Zur , Yiftach Gilad
Abstract: An apparatus and method for pairing store operations. For example, one embodiment of a processor comprises: a grouping eligibility checker to evaluate a plurality of store instructions based on a set of grouping rules to determine whether two or more of the plurality of store instructions are eligible for grouping; and a dispatcher to simultaneously dispatch a first group of store instructions of the plurality of store instructions determined to be eligible for grouping by the grouping eligibility checker.
-
公开(公告)号:US11288069B2
公开(公告)日:2022-03-29
申请号:US16487755
申请日:2017-07-01
Applicant: Intel Corporation
Inventor: Robert Valentine , Menachem Adelman , Elmoustapha Ould-Ahmed-Vall , Bret L. Toll , Milind B. Girkar , Zeev Sperber , Mark J. Charney , Rinat Rappoport , Jesus Corbal , Stanislav Shwartsman , Igor Yanover , Alexander F. Heinecke , Barukh Ziv , Dan Baum , Yuri Gebil
Abstract: Embodiments detailed herein relate to matrix operations. In particular, the loading of a matrix (tile) from memory. For example, support for a loading instruction is described in at least a form of decode circuitry to decode an instruction having fields for an opcode, a source matrix operand identifier, and destination memory information, and execution circuitry to execute the decoded instruction to store each data element of configured rows of the identified source matrix operand to memory based on the destination memory information
-
公开(公告)号:US20210271305A1
公开(公告)日:2021-09-02
申请号:US17183518
申请日:2021-02-24
Applicant: Intel Corporation
Inventor: Alexander Gendler , Igor Yanover , Gavri Berger , Edo Hachamo , Elkana Korem , Hanan Shomroni , Daniela Kaufman , Lev Makovsky , Haim Granot
IPC: G06F1/324 , G06F1/3206
Abstract: In an embodiment, a processor includes processing cores to execute instructions; and throttling logic. The throttling logic is to: determine an average capacitance score for execution events in a sliding window; perform frequency throttling when the average capacitance score exceeds a throttling threshold; determine a count of frequency throttling instances; and in response to a determination that the count of frequency throttling instances exceeds a maximum throttling value, increase the throttling threshold and concurrently reduce a baseline frequency. Other embodiments are described and claimed.
-
公开(公告)号:US11086623B2
公开(公告)日:2021-08-10
申请号:US16487787
申请日:2017-07-01
Applicant: Intel Corporation
Inventor: Robert Valentine , Zeev Sperber , Mark J. Charney , Bret L. Toll , Rinat Rappoport , Stanislav Shwartsman , Dan Baum , Igor Yanover , Elmoustapha Ould-Ahmed-Vall , Menachem Adelman , Jesus Corbal , Yuri Gebil , Simon Rubanovich
Abstract: Embodiments detailed herein relate to matrix operations. In particular, matrix (tile) multiply accumulate and negated matrix (tile) multiply accumulate are discussed. For example, in some embodiments decode circuitry to decode an instruction having fields for an opcode, an identifier for a first source matrix operand, an identifier of a second source matrix operand, and an identifier for a source/destination matrix operand; and execution circuitry to execute the decoded instruction to multiply the identified first source matrix operand by the identified second source matrix operand, add a result of the multiplication to the identified source/destination matrix operand, and store a result of the addition in the identified source/destination matrix operand and zero unconfigured columns of identified source/destination matrix operand are detailed.
-
-
-
-
-
-
-
-
-