-
Publication No.: US20210357214A1
Publication date: 2021-11-18
Application No.: US17334901
Filing date: 2021-05-31
Applicant: Intel Corporation
Inventor: Michael Mishaeli, Jason W. Brandt, Gilbert Neiger, Asit K. Mallick, Rajesh M. Sankaran, Raghunandan Makaram, Benjamin C. Chaffin, James B. Crossland, H. Peter Anvin
Abstract: A processor of an aspect includes a decode unit to decode a user-level suspend thread instruction that is to indicate a first alternate state. The processor also includes an execution unit coupled with the decode unit. The execution unit is to perform the instruction at a user privilege level. The execution unit in response to the instruction, is to: (a) suspend execution of a user-level thread, from which the instruction is to have been received; (b) transition a logical processor, on which the user-level thread was to have been running, to the indicated first alternate state; and (c) resume the execution of the user-level thread, when the logical processor is in the indicated first alternate state, with a latency that is to be less than half a latency that execution of a thread can be resumed when the logical processor is in a halt processor power state.
-
Publication No.: US20170228233A1
Publication date: 2017-08-10
Application No.: US15019112
Filing date: 2016-02-09
Applicant: INTEL CORPORATION
Inventor: Michael Mishaeli, Jason W. Brandt, Gilbert Neiger, Asit K. Mallick, Rajesh M. Sankaran, Raghunandan Makaram, Benjamin C. Chaffin, James B. Crossland, H. Peter Anvin
CPC classification number: G06F9/3009, G06F9/3004, G06F9/30076, G06F9/3851, G06F9/52, G06F13/4068
Abstract: A processor of an aspect includes a decode unit to decode a user-level suspend thread instruction that is to indicate a first alternate state. The processor also includes an execution unit coupled with the decode unit. The execution unit is to perform the instruction at a user privilege level. The execution unit in response to the instruction, is to: (a) suspend execution of a user-level thread, from which the instruction is to have been received; (b) transition a logical processor, on which the user-level thread was to have been running, to the indicated first alternate state; and (c) resume the execution of the user-level thread, when the logical processor is in the indicated first alternate state, with a latency that is to be less than half a latency that execution of a thread can be resumed when the logical processor is in a halt processor power state.
-
Publication No.: US20170185458A1
Publication date: 2017-06-29
Application No.: US14998217
Filing date: 2015-12-24
Applicant: Intel Corporation
Inventor: Benjamin C. Chaffin, Robert J. Kyanko, Avinash Sodani
IPC: G06F9/52
CPC classification number: G06F9/52, G06F12/0806, G06F2201/885, G06F2209/521
Abstract: Instructions and logic provide user-level thread synchronization with MONITOR and MWAIT instructions. One or more model specific registers (MSRs) in a processor may be configured in a first execution state to specify support of a user-level thread synchronization architecture. Embodiments include multiple hardware threads or processing cores, corresponding monitored address state storage to store a last monitored address for each of a plurality of execution threads that issues a MONITOR request, cache memory to record MONITOR requests and associated states for addresses of memory storage locations, and responsive to receipt of an MWAIT request for the address, to record an associated wait-to-trigger state of monitored addresses for execution cores associated with an MWAIT request; wherein the execution core is to transition a requesting thread to an optimized sleep state responsive to the receipt of said MWAIT request when said one or more MSRs are configured in the first execution state.
-
Publication No.: US11023233B2
Publication date: 2021-06-01
Application No.: US15019112
Filing date: 2016-02-09
Applicant: INTEL CORPORATION
Inventor: Michael Mishaeli, Jason W. Brandt, Gilbert Neiger, Asit K. Mallick, Rajesh M. Sankaran, Raghunandan Makaram, Benjamin C. Chaffin, James B. Crossland, H. Peter Anvin
Abstract: A processor of an aspect includes a decode unit to decode a user-level suspend thread instruction that is to indicate a first alternate state. The processor also includes an execution unit coupled with the decode unit. The execution unit is to perform the instruction at a user privilege level. The execution unit in response to the instruction, is to: (a) suspend execution of a user-level thread, from which the instruction is to have been received; (b) transition a logical processor, on which the user-level thread was to have been running, to the indicated first alternate state; and (c) resume the execution of the user-level thread, when the logical processor is in the indicated first alternate state, with a latency that is to be less than half a latency that execution of a thread can be resumed when the logical processor is in a halt processor power state.
-
Publication No.: US09898351B2
Publication date: 2018-02-20
Application No.: US14998217
Filing date: 2015-12-24
Applicant: Intel Corporation
Inventor: Benjamin C. Chaffin, Robert J. Kyanko, Avinash Sodani
IPC: G06F9/52, G06F12/0806
CPC classification number: G06F9/52, G06F12/0806, G06F2201/885, G06F2209/521
Abstract: Instructions and logic provide user-level thread synchronization with MONITOR and MWAIT instructions. One or more model specific registers (MSRs) in a processor may be configured in a first execution state to specify support of a user-level thread synchronization architecture. Embodiments include multiple hardware threads or processing cores, corresponding monitored address state storage to store a last monitored address for each of a plurality of execution threads that issues a MONITOR request, cache memory to record MONITOR requests and associated states for addresses of memory storage locations, and responsive to receipt of an MWAIT request for the address, to record an associated wait-to-trigger state of monitored addresses for execution cores associated with an MWAIT request; wherein the execution core is to transition a requesting thread to an optimized sleep state responsive to the receipt of said MWAIT request when said one or more MSRs are configured in the first execution state.
-
Publication No.: US12020031B2
Publication date: 2024-06-25
Application No.: US17334901
Filing date: 2021-05-31
Applicant: Intel Corporation
Inventor: Michael Mishaeli, Jason W. Brandt, Gilbert Neiger, Asit K. Mallick, Rajesh M. Sankaran, Raghunandan Makaram, Benjamin C. Chaffin, James B. Crossland, H. Peter Anvin
CPC classification number: G06F9/3009, G06F9/3004, G06F9/30076, G06F9/3851, G06F9/485, G06F13/4068
Abstract: A processor of an aspect includes a decode unit to decode a user-level suspend thread instruction that is to indicate a first alternate state. The processor also includes an execution unit coupled with the decode unit. The execution unit is to perform the instruction at a user privilege level. The execution unit in response to the instruction, is to: (a) suspend execution of a user-level thread, from which the instruction is to have been received; (b) transition a logical processor, on which the user-level thread was to have been running, to the indicated first alternate state; and (c) resume the execution of the user-level thread, when the logical processor is in the indicated first alternate state, with a latency that is to be less than half a latency that execution of a thread can be resumed when the logical processor is in a halt processor power state.
-
Publication No.: US09886396B2
Publication date: 2018-02-06
Application No.: US14581285
Filing date: 2014-12-23
Applicant: Intel Corporation
Inventor: Roger Gramunt, Rammohan Padmanabhan, Ramon Matas, Neal S. Moyer, Benjamin C. Chaffin, Avinash Sodani, Alexey P. Suprun, Vikram S. Sundaram, Chung-Lun Chan, Gerardo A. Fernandez, Julio Gago, Michael S. Yang, Aditya Kesiraju
CPC classification number: G06F12/122, G06F9/384, G06F9/3851, G06F9/3855, G06F9/3859, G06F9/4806, G06F2212/62
Abstract: In one embodiment, a processor includes a frontend unit having an instruction decoder to receive and to decode instructions of a plurality of threads, an execution unit coupled to the instruction decoder to receive and execute the decoded instructions, and an instruction retirement unit having a retirement logic to receive the instructions from the execution unit and to retire the instructions associated with one or more of the threads that have an instruction or an event pending to be retired. The instruction retirement unit includes a thread arbitration logic to select one of the threads at a time and to dispatch the selected thread to the retirement logic for retirement processing.
-
Publication No.: US09715432B2
Publication date: 2017-07-25
Application No.: US14581859
Filing date: 2014-12-23
Applicant: INTEL CORPORATION
Inventor: Ramon Matas, Roger Gramunt, Chung-Lun Chan, Benjamin C. Chaffin, Aditya Kesiraju, Jonathan C. Hall, Jesus Corbal
CPC classification number: G06F11/141, G06F9/30036, G06F9/30072, G06F9/38, G06F9/3859, G06F9/3865
Abstract: Exemplary aspects are directed toward resolving fault suppression in hardware, which at the same time does not incur a performance hit. For example, when multiple instructions are executing simultaneously, a mask can specify which elements need not be executed. If the mask is disabled, those elements do not need to be executed. A determination is then made as to whether a fault happens in one of the elements that have been disabled. If there is a fault in one of the elements that has been disabled, a state machine re-fetches the instructions in a special mode. More specifically, the state machine determines if the fault is on a disabled element, and if the fault is on a disabled element, then the state machine specifies that the fault should be ignored. If during the first execution there was no mask, if there is an error present during execution, then the element is re-run with the mask to see if the error is a “real” fault.
-
Publication No.: US10175986B2
Publication date: 2019-01-08
Application No.: US15589510
Filing date: 2017-05-08
Applicant: Intel Corporation
Inventor: Roger Gramunt, Ramon Matas, Benjamin C. Chaffin, Neal S. Moyer, Rammohan Padmanabhan, Alexey P. Suprun, Matthew G. Smith
Abstract: A processor includes a logic for stateless capture of data linear addresses (DLA) during precise event based sampling (PEBS) for an out-of-order execution engine. The engine may include a PEBS unit with logic to increment a counter each time an instance of a designated micro-op is retired from a reorder buffer, capture output DLA referenced by an instance of the micro-op that executes after the counter overflows, set a captured bit associated with a reorder buffer identifier for the instance of the micro-op, and store a PEBS record in a debug storage when the instance of the micro-op is retired from the reorder buffer. The designated micro-op references a DLA of a memory accessible to the processor.
-
Publication No.: US09804842B2
Publication date: 2017-10-31
Application No.: US14581535
Filing date: 2014-12-23
Applicant: Intel Corporation
Inventor: Jesus Corbal San Adrian, Dennis R. Bradford, Benjamin C. Chaffin, Taraneh Bahrami, Jonathan C. Hall, Thomas B. Maciukenas, Roger Gramunt, Rohan Sharma
CPC classification number: G06F9/30036, G06F9/30018, G06F9/30032, G06F9/30072, G06F9/30101, G06F15/8084
Abstract: An apparatus and method for efficiently managing the architectural state of a processor. For example, one embodiment of a processor comprises: a source mask register to be logically subdivided into at least a first portion to store a usable portion of a mask value and a second portion to store an indication of whether the usable portion of the mask value has been updated; a control register to store an unusable portion of the mask value; architectural state management logic to read the indication to determine whether the mask value has been updated prior to performing a store operation, wherein if the mask value has been updated, then the architectural state management logic is to read the usable portion of the mask value from the first portion of the source mask register and zero out bits of the unusable portion of the mask value to generate a final mask value to be saved to memory, and wherein if the mask value has not been updated, then the architectural state management logic is to concatenate the usable portion of the mask value with the unusable portion of the mask value read from the control register to generate a final mask value to be saved to memory.