-
Publication No.: US10687127B2
Publication Date: 2020-06-16
Application No.: US15395174
Filing Date: 2016-12-30
Applicant: Intel Corporation
Inventor: Johan G. Van De Groenendaal , Mrittika Ganguli , Ahmad Yasin
IPC: H04L29/08 , H04L12/24 , H04L12/26 , H04Q11/00 , H03M7/30 , H03M7/40 , G06F16/901 , G06F3/06 , H04L12/811 , G06F1/20 , G11C7/10 , H05K7/14 , G06F1/18 , G06F13/40 , H05K5/02 , G08C17/02 , H04L12/851 , G06F9/50 , H04L12/911 , G06F12/109 , H04L29/06 , G11C14/00 , G11C5/02 , G11C11/56 , G02B6/44 , G06F8/65 , G06F12/14 , G06F13/16 , H04B10/25 , G06F9/4401 , G02B6/38 , G02B6/42 , B25J15/00 , B65G1/04 , H05K7/20 , H04L12/931 , H04L12/939 , H04W4/02 , H04L12/751 , G06F13/42 , H05K1/18 , G05D23/19 , G05D23/20 , H04L12/927 , H05K1/02 , H04L12/781 , H04Q1/04 , G06F12/0893 , H05K13/04 , G11C5/06 , G06F11/14 , G06F11/34 , G06F12/0862 , G06F15/80 , H04L12/919 , G06F12/10 , G06Q10/06 , G07C5/00 , H04L12/28 , H04L29/12 , H04L9/06 , H04L9/14 , H04L9/32 , H04L12/933 , H04L12/947 , G06F9/30 , G06F9/38 , G06F9/54 , H04L12/10 , H04W4/80 , G06Q10/08 , G06Q10/00 , G06Q50/04
Abstract: Technologies for managing the efficiency of workload execution in a managed node include a managed node that includes one or more processors that each include multiple cores. The managed node is to execute threads of workloads assigned to the managed node, generate telemetry data indicative of an efficiency of execution of the threads, determine, as a function of the telemetry data, an adjustment to a configuration of the threads among the cores to increase the efficiency of the execution of the threads, and apply the determined adjustment. Other embodiments are also described and claimed.
-
Publication No.: US10592244B2
Publication Date: 2020-03-17
Application No.: US15423143
Filing Date: 2017-02-02
Applicant: INTEL CORPORATION
Inventor: Michael W. Chynoweth , Jonathan D. Combs , Joseph K. Olivas , Beeman C. Strong , Rajshree A. Chabukswar , Ahmad Yasin , Jason W. Brandt , Ofer Levy , John M. Esper , Andreas Kleen , Christopher M. Chrulski
IPC: G06F9/30
Abstract: An example processor includes a decoder, an execution circuit, a counter, and a last branch recorder (LBR) register. The decoder may decode a branch instruction for a program. The execution circuit may be coupled to the decoder, where the execution circuit may execute the branch instruction. The counter may be coupled to the execution circuit, where the counter may store a cycle count. The LBR register may be coupled to the execution circuit, where the LBR register may include a counter field to store a first value of the counter when the branch instruction is executed and a type field to store type information indicating a type of the branch instruction.
-
Publication No.: US20190056939A1
Publication Date: 2019-02-21
Application No.: US15918927
Filing Date: 2018-03-12
Applicant: Intel Corporation
Inventor: Ahmad Yasin
Abstract: A processor includes a front end, an execution unit, a retirement stage, a counter, and a performance monitoring unit. The front end includes logic to receive an event instruction to enable supervision of a front end event that will delay execution of instructions. The execution unit includes logic to set a register with parameters for supervision of the front end event. The front end further includes logic to receive a candidate instruction and match the candidate instruction to the front end event. The counter includes logic to generate the front end event upon retirement of the candidate instruction.
-
Publication No.: US20180253370A1
Publication Date: 2018-09-06
Application No.: US15972390
Filing Date: 2018-05-07
Applicant: INTEL CORPORATION
Inventor: Matthew C. Merten , Beeman C. Strong , Michael W. Chynoweth , Grant G. Zhou , Andreas Kleen , Kimberly C. Weier , Angela D. Schmid , Stanislav Bratanov , Seth Abraham , Jason W. Brandt , Ahmad Yasin
CPC classification number: G06F11/3636 , G06F9/45558 , G06F2009/45591 , H04L41/0613 , H04L43/04
Abstract: A processor is to execute and retire instructions for a virtual machine. A reload register coupled to the core is to store a reload value. A performance monitoring counter (PMC) register is coupled to the reload register, and an event-based sampler is operatively coupled to the reload register and the PMC register. The event-based sampler includes circuitry to load the reload value into the PMC register and increment the PMC register after detecting each occurrence of an event of a certain type as a result of execution of the instructions. Upon detecting an occurrence of the event after the PMC register reaches a predetermined trigger value, the event-based sampler is to execute microcode to generate field data for elements within a sampling record, wherein the field data relates to a current processor state of execution, and reload the reload value from the reload register into the PMC register.
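The reload-and-sample loop in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented circuitry: the class name `EventSampler`, the default trigger value of 0, and the dictionary used as a "sampling record" are all hypothetical choices made for illustration.

```python
class EventSampler:
    """Toy model of a counter that is reloaded after each sample."""

    def __init__(self, reload_value, trigger_value=0):
        # Assumption: the counter is preloaded with a negative value and
        # counts up to a trigger value of 0.
        self.reload_value = reload_value
        self.trigger_value = trigger_value
        self.counter = reload_value   # stands in for the PMC register
        self.records = []             # stands in for the sampling records

    def on_event(self, state):
        """Handle one occurrence of the monitored event."""
        if self.counter >= self.trigger_value:
            # The counter already reached the trigger value, so this
            # occurrence is sampled and the counter is reloaded.
            self.records.append(dict(state))
            self.counter = self.reload_value
        else:
            self.counter += 1


sampler = EventSampler(reload_value=-3)
for ip in range(1, 13):
    sampler.on_event({"ip": ip})
```

With a reload value of -3, three events bring the counter to the trigger value and the fourth is recorded, so this model samples every fourth event.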
-
Publication No.: US20180027066A1
Publication Date: 2018-01-25
Application No.: US15395174
Filing Date: 2016-12-30
Applicant: Intel Corporation
Inventor: Johan G. Van De Groenendaal , Mrittika Ganguli , Ahmad Yasin
Abstract: Technologies for managing the efficiency of workload execution in a managed node include a managed node that includes one or more processors that each include multiple cores. The managed node is to execute threads of workloads assigned to the managed node, generate telemetry data indicative of an efficiency of execution of the threads, determine, as a function of the telemetry data, an adjustment to a configuration of the threads among the cores to increase the efficiency of the execution of the threads, and apply the determined adjustment. Other embodiments are also described and claimed.
-
Publication No.: US20180004532A1
Publication Date: 2018-01-04
Application No.: US15200326
Filing Date: 2016-07-01
Applicant: Intel Corporation
Inventor: Ahmad Yasin , Eti Pardo-Fridman , Ofer Levy
IPC: G06F9/38 , G06F9/30 , G06F12/0875
Abstract: A processor includes a front end including circuitry to decode an instruction from an instruction stream and a core including circuitry to process the instruction. The core includes an execution pipeline, a dynamic core frequency logic unit, and a counter compensation logic unit. The execution pipeline includes circuitry to execute the instruction. The dynamic core frequency logic unit includes circuitry to squash a clock of the core to reduce a core frequency. The clock may not be visible to software. The counter compensation logic unit includes circuitry to adjust a performance counter increment associated with a performance counter based on at least the dynamic core frequency logic unit circuitry to squash a clock of the core to reduce a core frequency.
-
Publication No.: US20180004521A1
Publication Date: 2018-01-04
Application No.: US15200676
Filing Date: 2016-07-01
Applicant: Intel Corporation
Inventor: Andreas Kleen , Raanan Sade , Ahmad Yasin , Ravi Rajwar , Robert S. Chappell , Roman Dementiev
IPC: G06F9/30 , G06F12/0815 , G06F3/06 , G06F12/084
CPC classification number: G06F9/30043 , G06F3/0619 , G06F3/0653 , G06F3/0656 , G06F3/0673 , G06F9/3004 , G06F9/30145 , G06F9/3834 , G06F11/30 , G06F12/0815 , G06F12/084
Abstract: A method of analyzing aborts of transactional execution transactions. Starting a transactional execution transaction with a first logical processor. Performing, with a second logical processor, store to memory instructions, while the first logical processor is performing the transactional execution transaction. Capturing memory addresses of, and instruction pointer values associated with, at least a sample of the store to memory instructions. Performing, with the second logical processor, a first store to memory instruction to a first memory address, which is to cause the transactional execution transaction to abort. Capturing the first memory address. Determining an instruction pointer value associated with the first store to memory instruction by correlating at least the captured first memory address with the captured memory addresses of said at least the sample of the store to memory instructions.
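The correlation step described in this abstract can be sketched as a lookup from sampled store addresses to instruction pointers. This is a minimal illustration, not the claimed method: the function name `correlate_abort` is hypothetical, and matching at a 64-byte cache-line granularity is an assumption that the abstract does not state.

```python
from collections import defaultdict


def correlate_abort(samples, abort_address, line_size=64):
    """Return the sampled instruction pointers whose store address falls
    in the same line as the address that caused the abort.

    samples: iterable of (store_address, instruction_pointer) pairs.
    line_size: assumed matching granularity in bytes (hypothetical).
    """
    index = defaultdict(set)
    for address, ip in samples:
        index[address // line_size].add(ip)
    return sorted(index.get(abort_address // line_size, set()))


samples = [(0x1000, 0x400100), (0x1008, 0x400200), (0x2000, 0x400300)]
suspects = correlate_abort(samples, 0x1010)
```

Because only a sample of the stores is captured, an empty result means no sampled store matched, not that no store conflicted.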
-
Publication No.: US20170371769A1
Publication Date: 2017-12-28
Application No.: US15194881
Filing Date: 2016-06-28
Applicant: INTEL CORPORATION
Inventor: Matthew C. Merten , Beeman C. Strong , Michael W. Chynoweth , Grant G. Zhou , Andreas Kleen , Kimberly C. Weier , Angela D. Schmid , Stanislav Bratanov , Seth Abraham , Jason W. Brandt , Ahmad Yasin
CPC classification number: G06F11/3636 , G06F9/45558 , G06F2009/45591 , H04L41/0613 , H04L43/04
Abstract: A core includes a memory buffer and executes an instruction within a virtual machine. A processor tracer captures trace data and formats the trace data as trace data packets. An event-based sampler generates field data for a sampling record in response to occurrence of an event of a certain type as a result of execution of the instruction. The processor tracer, upon receipt of the field data: formats the field data into elements of the sampling record as a group of record packets; inserts the group of record packets between the trace data packets as a combined packet stream; and stores the combined packet stream in the memory buffer as a series of output pages. The core, when in guest profiling mode, executes a virtual machine monitor to map output pages of the memory buffer to host physical pages of main memory using multilevel page tables.
-
Publication No.: US09829957B2
Publication Date: 2017-11-28
Application No.: US14225960
Filing Date: 2014-03-26
Applicant: Intel Corporation
Inventor: Ahmad Yasin , Nir Rosenzweig , Eliezer Weissmann , Efraim Efi Rotem
IPC: G06F1/32
CPC classification number: G06F1/3243 , G06F1/324 , Y02D10/126 , Y02D10/152
Abstract: A processing device implementing performance scalability prediction is disclosed. A processing device of the disclosure includes a first counter to increment with each cycle of the processing device in which threads of the processing device are active. The processing device further includes a second counter to increment with each cycle of the processing device in which execution units of the processing device are stalled for one of the threads, and an access request from the one of the threads to memory external to the processing device is pending.
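One way the two counters in this abstract might feed a scalability estimate is sketched below. The abstract only describes the counters; the prediction formula here is an assumption introduced for illustration (memory-stalled cycles are treated as unaffected by core frequency, the remaining cycles as scaling with it), and the function names are hypothetical.

```python
def memory_stall_fraction(active_cycles, stalled_cycles):
    """Share of active cycles spent stalled on external memory."""
    if active_cycles <= 0:
        raise ValueError("active_cycles must be positive")
    return stalled_cycles / active_cycles


def predicted_speedup(active_cycles, stalled_cycles, freq_ratio):
    """Estimated speedup when core frequency is multiplied by freq_ratio.

    Assumed model (not from the source): stalled time does not shrink
    with frequency, while the rest of the time shrinks by freq_ratio.
    """
    s = memory_stall_fraction(active_cycles, stalled_cycles)
    return 1.0 / (s + (1.0 - s) / freq_ratio)


half_stalled = predicted_speedup(1000, 500, 2.0)
```

Under this assumed model, a workload that is stalled on memory for half of its active cycles gains about 1.33x from doubling the frequency, while a workload that never stalls gains the full 2x.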
-
Publication No.: US09690588B2
Publication Date: 2017-06-27
Application No.: US15155204
Filing Date: 2016-05-16
Applicant: Intel Corporation
Inventor: Ahmad Yasin , Michael W. Chynoweth , Ofer Levy , Jason W. Brandt , Angela Schmid
CPC classification number: G06F9/3806 , G06F9/30058 , G06F9/30098 , G06F11/3419 , G06F11/348 , G06F2201/865 , G06F2201/88
Abstract: A processing device implementing an elapsed cycle timer in last branch records (LBRs) is disclosed. A processing device of the disclosure includes a last branch record (LBR) counter to iterate with each cycle of the processing device. The processing device further includes at least one register communicably coupled to the LBR counter, the at least one register to provide an LBR structure comprising a plurality of LBR entries. An LBR entry of the plurality of LBR entries includes an address instruction pointer (IP) of a branch instruction executed by the processing device, an address IP of a target of the branch instruction, and an elapsed time field that stores a value of the LBR counter in response to creation of the LBR entry.
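The LBR entry layout in this abstract (branch IP, target IP, elapsed-time field) can be modeled in a few lines. This is a software sketch only: the class name `LastBranchRecords`, the default depth of 32, and the choice to restart the counter whenever an entry is created are assumptions, not details taken from the abstract.

```python
from collections import deque


class LastBranchRecords:
    """Toy model of a fixed-depth LBR stack with an elapsed-cycle field."""

    def __init__(self, depth=32):
        # deque(maxlen=...) drops the oldest entry once the stack is full.
        self.entries = deque(maxlen=depth)
        self.cycle_counter = 0

    def tick(self, cycles=1):
        """Advance the LBR counter by a number of processor cycles."""
        self.cycle_counter += cycles

    def record_branch(self, from_ip, to_ip):
        """Create an entry holding the current counter value."""
        self.entries.append((from_ip, to_ip, self.cycle_counter))
        # Assumption: the counter restarts at each new entry, so the
        # stored value is the time since the previous branch.
        self.cycle_counter = 0


lbr = LastBranchRecords(depth=2)
lbr.tick(5)
lbr.record_branch(0x10, 0x20)
lbr.tick(3)
lbr.record_branch(0x24, 0x40)
lbr.tick(7)
lbr.record_branch(0x44, 0x10)
```

With a depth of 2, the first branch is evicted and the two remaining entries carry elapsed counts of 3 and 7 cycles.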