-
Publication No.: US11989555B2
Publication Date: 2024-05-21
Application No.: US15638120
Filing Date: 2017-06-29
Applicant: Intel Corporation
Inventor: Doddaballapur N. Jayasimha, Jonas Svennebring, Samantika S. Sury, Christopher J. Hughes, Jong Soo Park, Lingxiang Xiang
CPC classification number: G06F9/3004, G06F9/3001, G06F9/30185, G06F9/3836, G06F9/46, G06F13/28
Abstract: Disclosed embodiments relate to atomic memory operations. In one example, a method of executing an instruction atomically and with weak order includes: fetching, by fetch circuitry, the instruction from code storage, the instruction including an opcode, a source identifier, and a destination identifier; decoding, by decode circuitry, the fetched instruction; selecting, by a scheduling circuit, an execution circuit among multiple circuits in a system; scheduling, by the scheduling circuit, execution of the decoded instruction out of order with respect to other instructions, with an order selected to optimize at least one of latency, throughput, power, and performance; and executing the decoded instruction, by the execution circuit, to: atomically read a datum from a location identified by the destination identifier; perform an operation on the datum as specified by the opcode, the operation to use a source operand identified by the source identifier; and write a result back to the location.
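The read, operate, write-back sequence in this abstract can be illustrated with a toy software model. This is only a sketch under stated assumptions, not the patented hardware: the class `AtomicMemory`, the opcode names in `OPS`, and the helper `run_in_order` are hypothetical, and a `threading.Lock` stands in for whatever mechanism the execution circuit uses to guarantee atomicity. The example also shows why weak ordering can be acceptable for such operations: when the operation is commutative (as addition is), executing the instructions in a different order yields the same final value.

```python
import threading


class AtomicMemory:
    """Toy model of memory that supports atomic read-modify-write operations."""

    # Hypothetical opcode table; these names are illustrative, not from the source.
    OPS = {
        "add": lambda datum, src: datum + src,
        "and": lambda datum, src: datum & src,
        "or": lambda datum, src: datum | src,
        "max": max,
    }

    def __init__(self):
        self._cells = {}
        self._lock = threading.Lock()  # stands in for hardware atomicity

    def execute(self, opcode, dest, src):
        """Atomically read the datum at dest, apply opcode with src, write back."""
        with self._lock:
            datum = self._cells.get(dest, 0)        # read
            result = self.OPS[opcode](datum, src)   # operate
            self._cells[dest] = result              # write back
        return result

    def load(self, dest):
        with self._lock:
            return self._cells.get(dest, 0)


def run_in_order(instructions, dest):
    """Execute (opcode, dest, src) tuples in the given order; return the final value."""
    mem = AtomicMemory()
    for opcode, target, src in instructions:
        mem.execute(opcode, target, src)
    return mem.load(dest)


program = [("add", 0x40, 1), ("add", 0x40, 5), ("add", 0x40, 10)]
in_order = run_in_order(program, 0x40)
reordered = run_in_order(list(reversed(program)), 0x40)
print(in_order, reordered)
```

Mixing non-commuting opcodes on the same location (for example `add` and `and`) would make the final value order-dependent, so a model like this is only meaningful where the software tolerates any execution order.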
-
Publication No.: US20240362021A1
Publication Date: 2024-10-31
Application No.: US18670427
Filing Date: 2024-05-21
Applicant: Intel Corporation
Inventor: Doddaballapur N. Jayasimha, Jonas Svennebring, Samantika S. Sury, Christopher J. Hughes, Jong Soo Park, Lingxiang Xiang
CPC classification number: G06F9/3004, G06F9/3001, G06F9/30185, G06F9/3836, G06F9/46, G06F13/28
Abstract: Disclosed embodiments relate to atomic memory operations. In one example, a method of executing an instruction atomically and with weak order includes: fetching, by fetch circuitry, the instruction from code storage, the instruction including an opcode, a source identifier, and a destination identifier; decoding, by decode circuitry, the fetched instruction; selecting, by a scheduling circuit, an execution circuit among multiple circuits in a system; scheduling, by the scheduling circuit, execution of the decoded instruction out of order with respect to other instructions, with an order selected to optimize at least one of latency, throughput, power, and performance; and executing the decoded instruction, by the execution circuit, to: atomically read a datum from a location identified by the destination identifier; perform an operation on the datum as specified by the opcode, the operation to use a source operand identified by the source identifier; and write a result back to the location.
-
Publication No.: US20240152448A1
Publication Date: 2024-05-09
Application No.: US18284265
Filing Date: 2021-06-21
Applicant: Intel Corporation
Inventor: Zhe Wang, Lingxiang Xiang, Christopher J. Hughes
IPC: G06F12/02, G06F12/0811
CPC classification number: G06F12/023, G06F12/0811
Abstract: An embodiment of an integrated circuit may comprise circuitry communicatively coupled to two or more sub-non-uniform memory access clusters (SNCs) to allocate a specified memory space in the two or more SNCs in accordance with a SNC memory allocation policy indicated from a request to initialize the specified memory space. An embodiment of an apparatus may comprise decode circuitry to decode a single instruction, the single instruction to include a field for an opcode, and execution circuitry to execute the decoded instruction according to the opcode to provide an indicated SNC memory allocation policy (e.g., a SNC policy hint). Other embodiments are disclosed and claimed.
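As a rough illustration of allocating a memory space across SNCs according to a policy carried in the request, the sketch below maps pages to SNC indices. The abstract does not name any concrete policies, so `SncPolicy.INTERLEAVE`, `SncPolicy.BLOCKED`, and the function `allocate` are assumptions made purely for illustration.

```python
from enum import Enum


class SncPolicy(Enum):
    """Hypothetical policy hints; the source does not enumerate actual policies."""
    INTERLEAVE = "interleave"  # round-robin pages across SNCs
    BLOCKED = "blocked"        # contiguous chunk of pages per SNC


def allocate(num_pages, num_sncs, policy):
    """Return a list whose i-th entry is the SNC index assigned to page i."""
    if num_sncs < 2:
        raise ValueError("the abstract describes two or more SNCs")
    if num_pages < 0:
        raise ValueError("num_pages must be non-negative")
    if policy is SncPolicy.INTERLEAVE:
        return [page % num_sncs for page in range(num_pages)]
    if policy is SncPolicy.BLOCKED:
        return [page * num_sncs // num_pages for page in range(num_pages)]
    raise ValueError(f"unknown policy: {policy!r}")


interleaved = allocate(4, 2, SncPolicy.INTERLEAVE)
blocked = allocate(4, 2, SncPolicy.BLOCKED)
print(interleaved, blocked)
```

In this model the policy is simply an argument of the initialization request; the abstract additionally describes a single instruction whose execution provides the policy hint, which is not modeled here.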
-
Publication No.: US20240303195A1
Publication Date: 2024-09-12
Application No.: US18562743
Filing Date: 2021-12-15
Applicant: Intel Corporation
Inventor: Keqiang Wu, Lingxiang Xiang, Heidi Pan, Christopher J. Hughes, Zhe Wang
IPC: G06F12/0831, G06F12/084, G06F12/0891
CPC classification number: G06F12/0835, G06F12/084, G06F12/0891
Abstract: In one embodiment, a processor includes interconnect circuitry, processing circuitry, a first cache, and cache controller circuitry. The interconnect circuitry communicates over a processor interconnect with a second processor that includes a second cache. The processing circuitry generates a memory read request for a corresponding memory address of a memory. Based on the memory read request, the cache controller circuitry detects a cache miss in the first cache, which indicates that the first cache does not contain a valid copy of data for the corresponding memory address. Based on the cache miss, the cache controller circuitry requests the data from the second cache or the memory based on a current bandwidth utilization of the processor interconnect.
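The miss-handling decision described in this abstract can be sketched as follows. The abstract says only that the choice between the second cache and memory depends on current interconnect bandwidth utilization; the direction of the decision (prefer memory when the interconnect is busy), the `threshold` value, and the names `choose_source` and `handle_read` are assumptions. Coherence is ignored entirely: the model assumes memory always holds a valid copy.

```python
def choose_source(interconnect_utilization, threshold=0.8):
    """Pick where to request data after a miss in the first cache.

    Assumption: at or above the threshold the interconnect is considered
    congested, so the request goes to memory instead of the second cache.
    """
    if not 0.0 <= interconnect_utilization <= 1.0:
        raise ValueError("utilization must be between 0.0 and 1.0")
    return "memory" if interconnect_utilization >= threshold else "second_cache"


def handle_read(address, first_cache, second_cache, memory,
                interconnect_utilization, threshold=0.8):
    """Return (data, source) for a read, filling the first cache on a miss."""
    if address in first_cache:
        return first_cache[address], "first_cache"
    # Cache miss: the first cache holds no valid copy for this address.
    source = choose_source(interconnect_utilization, threshold)
    if source == "second_cache" and address in second_cache:
        data = second_cache[address]
    else:
        data, source = memory[address], "memory"
    first_cache[address] = data
    return data, source


memory = {0x100: "A"}
second_cache = {0x100: "A"}
quiet = handle_read(0x100, {}, second_cache, memory, interconnect_utilization=0.2)
busy = handle_read(0x100, {}, second_cache, memory, interconnect_utilization=0.95)
print(quiet, busy)
```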
-
Publication No.: US20220206945A1
Publication Date: 2022-06-30
Application No.: US17134254
Filing Date: 2020-12-25
Applicant: Intel Corporation
Inventor: Carl J. Beckmann, Samantika S. Sury, Christopher J. Hughes, Lingxiang Xiang, Rahul Agrawal
IPC: G06F12/0811, G06F12/0817, G06F12/0862, G06F12/084
Abstract: Disclosed embodiments relate to atomic memory operations. In one example, an apparatus includes multiple processor cores, a cache hierarchy, a local execution unit, a remote execution unit, and an adaptive remote atomic operation unit. The cache hierarchy includes a local cache at a first level and a shared cache at a second level. The local execution unit is to perform an atomic operation at the first level if the local cache is storing a cache line including data for the atomic operation. The remote execution unit is to perform the atomic operation at the second level. The adaptive remote atomic operation unit is to determine whether to perform the atomic operation at the first level or at the second level and whether to copy the cache line from the shared cache to the local cache.
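The two decisions attributed to the adaptive unit (which level executes the atomic operation, and whether to copy the line into the local cache) can be sketched with a simple counter heuristic. The abstract does not disclose the actual decision rule, so the promotion threshold `promote_after`, the class `AdaptiveRaoUnit`, and its single-core, addition-only model are all assumptions.

```python
from collections import Counter


class AdaptiveRaoUnit:
    """Toy single-core model of an adaptive remote atomic operation unit."""

    def __init__(self, promote_after=3):
        self.local_cache = {}         # first level: line -> value
        self.shared_cache = {}        # second level: line -> value
        self.remote_hits = Counter()  # atomics seen per line while not local
        self.promote_after = promote_after  # hypothetical threshold

    def atomic_add(self, line, value):
        """Perform an atomic add and report where it was executed."""
        if line in self.local_cache:
            # Line is already local: execute at the first level.
            self.local_cache[line] += value
            return "local"
        self.remote_hits[line] += 1
        if self.remote_hits[line] >= self.promote_after:
            # Assumed heuristic: the line is used often enough to be worth
            # copying. The shared copy is dropped so this model keeps a
            # single up-to-date value per line.
            self.local_cache[line] = self.shared_cache.pop(line, 0) + value
            del self.remote_hits[line]
            return "local_after_copy"
        # Otherwise execute remotely at the shared cache, without copying.
        self.shared_cache[line] = self.shared_cache.get(line, 0) + value
        return "remote"


unit = AdaptiveRaoUnit(promote_after=3)
decisions = [unit.atomic_add(0x80, 1) for _ in range(4)]
print(decisions, unit.local_cache[0x80])
```

A real design would also have to weigh contention from other cores, which is one reason to keep a heavily shared line at the second level; this sketch leaves that out.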
-
Publication No.: US12216579B2
Publication Date: 2025-02-04
Application No.: US17134254
Filing Date: 2020-12-25
Applicant: Intel Corporation
Inventor: Carl J. Beckmann, Samantika S. Sury, Christopher J. Hughes, Lingxiang Xiang, Rahul Agrawal
IPC: G06F12/0811, G06F12/0817, G06F12/084, G06F12/0862
Abstract: Disclosed embodiments relate to atomic memory operations. In one example, an apparatus includes multiple processor cores, a cache hierarchy, a local execution unit, a remote execution unit, and an adaptive remote atomic operation unit. The cache hierarchy includes a local cache at a first level and a shared cache at a second level. The local execution unit is to perform an atomic operation at the first level if the local cache is storing a cache line including data for the atomic operation. The remote execution unit is to perform the atomic operation at the second level. The adaptive remote atomic operation unit is to determine whether to perform the atomic operation at the first level or at the second level and whether to copy the cache line from the shared cache to the local cache.
-
Publication No.: US12210446B2
Publication Date: 2025-01-28
Application No.: US18284265
Filing Date: 2021-06-21
Applicant: Intel Corporation
Inventor: Zhe Wang, Lingxiang Xiang, Christopher J. Hughes
IPC: G06F12/00, G06F9/30, G06F12/02, G06F12/0811
Abstract: An embodiment of an integrated circuit may comprise circuitry communicatively coupled to two or more sub-non-uniform memory access clusters (SNCs) to allocate a specified memory space in the two or more SNCs in accordance with a SNC memory allocation policy indicated from a request to initialize the specified memory space. An embodiment of an apparatus may comprise decode circuitry to decode a single instruction, the single instruction to include a field for an opcode, and execution circuitry to execute the decoded instruction according to the opcode to provide an indicated SNC memory allocation policy (e.g., a SNC policy hint). Other embodiments are disclosed and claimed.
-
Publication No.: US20190004810A1
Publication Date: 2019-01-03
Application No.: US15638120
Filing Date: 2017-06-29
Applicant: Intel Corporation
Inventor: Doddaballapur N. Jayasimha, Jonas Svennebring, Samantika S. Sury, Christopher J. Hughes, Jong Soo Park, Lingxiang Xiang
IPC: G06F9/38, G06F12/0893, G06F9/26, G06F13/28
Abstract: Disclosed embodiments relate to atomic memory operations. In one example, a method of executing an instruction atomically and with weak order includes: fetching, by fetch circuitry, the instruction from code storage, the instruction including an opcode, a source identifier, and a destination identifier; decoding, by decode circuitry, the fetched instruction; selecting, by a scheduling circuit, an execution circuit among multiple circuits in a system; scheduling, by the scheduling circuit, execution of the decoded instruction out of order with respect to other instructions, with an order selected to optimize at least one of latency, throughput, power, and performance; and executing the decoded instruction, by the execution circuit, to: atomically read a datum from a location identified by the destination identifier; perform an operation on the datum as specified by the opcode, the operation to use a source operand identified by the source identifier; and write a result back to the location.