-
Publication No.: US11507412B2
Publication date: 2022-11-22
Application No.: US16861082
Application date: 2020-04-28
Applicant: Intel Corporation
Inventor: Keqiang Wu, Jiwei Lu, Koichi Yamada, Yong-Fong Lee
IPC: G06F9/46
Abstract: A disclosed example apparatus includes memory; and processor circuitry to: identify a lock-protected section of instructions in the memory; replace lock/unlock instructions with transactional lock acquire and transactional lock release instructions to form a transactional process; and execute the transactional process in a speculative execution.
-
Publication No.: US20240303195A1
Publication date: 2024-09-12
Application No.: US18562743
Application date: 2021-12-15
Applicant: Intel Corporation
Inventor: Keqiang Wu, Lingxiang Xiang, Heidi Pan, Christopher J. Hughes, Zhe Wang
IPC: G06F12/0831, G06F12/084, G06F12/0891
CPC classification number: G06F12/0835, G06F12/084, G06F12/0891
Abstract: In one embodiment, a processor includes interconnect circuitry, processing circuitry, a first cache, and cache controller circuitry. The interconnect circuitry communicates over a processor interconnect with a second processor that includes a second cache. The processing circuitry generates a memory read request for a corresponding memory address of a memory. Based on the memory read request, the cache controller circuitry detects a cache miss in the first cache, which indicates that the first cache does not contain a valid copy of data for the corresponding memory address. Based on the cache miss, the cache controller circuitry requests the data from the second cache or the memory based on a current bandwidth utilization of the processor interconnect.
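Illustrative sketch (not from the patent): the abstract says only that the source of the data is chosen based on current interconnect bandwidth utilization. The snippet below assumes one possible policy, where a busy interconnect steers the request to memory and an idle one steers it to the second processor's cache. The names `Source` and `choose_source` and the 0.8 threshold are hypothetical.

```python
from enum import Enum


class Source(Enum):
    REMOTE_CACHE = "remote_cache"  # the second processor's cache
    MEMORY = "memory"


def choose_source(link_utilization: float, threshold: float = 0.8) -> Source:
    """Pick where to request a missed cache line from.

    link_utilization is the fraction of interconnect bandwidth in use (0.0 to 1.0).
    Assumed policy: avoid the interconnect once utilization reaches the threshold.
    """
    if not 0.0 <= link_utilization <= 1.0:
        raise ValueError("link_utilization must be between 0.0 and 1.0")
    if link_utilization >= threshold:
        return Source.MEMORY
    return Source.REMOTE_CACHE
```

For example, `choose_source(0.9)` selects memory under this assumed policy, while `choose_source(0.2)` selects the remote cache.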
-
Publication No.: US20190041943A1
Publication date: 2019-02-07
Application No.: US15867880
Application date: 2018-01-11
Applicant: Intel Corporation
Inventor: Keqiang Wu, Yong-fong Lee, Krishnaswamy Viswanathan, Emad Guirguis
CPC classification number: G06F1/324, G06F1/28, G06F1/3206
Abstract: Systems, apparatuses and methods may provide for technology that determines a first real-time correlation between a power consumption of a processor and an operating frequency of the processor, determines a second real-time correlation between a performance level of the processor and the operating frequency of the processor, and sets the operating frequency of the processor to a value based on the first and second real-time correlations. In one example, the performance level or performance per watt of the processor decreases at one or more operating frequencies greater than the value.
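Illustrative sketch (not from the patent): one way to read the abstract's example is to pick the operating frequency at which performance per watt peaks, so that higher frequencies give a lower ratio. The function name `pick_frequency` and the sample format are hypothetical, and the measurements are assumed to be already collected.

```python
def pick_frequency(samples):
    """Return the frequency with the highest performance per watt.

    samples is a list of (frequency_ghz, power_watts, performance) tuples,
    standing in for the two real-time correlations described in the abstract.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    if any(power <= 0 for _, power, _ in samples):
        raise ValueError("power must be positive")
    # Compare samples by performance divided by power.
    best = max(samples, key=lambda s: s[2] / s[1])
    return best[0]
```

With samples `[(1.0, 10, 100), (2.0, 20, 220), (3.0, 45, 240)]` the ratios are 10, 11 and about 5.3, so the sketch selects 2.0 GHz.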
-
Publication No.: US09411363B2
Publication date: 2016-08-09
Application No.: US14565512
Application date: 2014-12-10
Applicant: Intel Corporation
Inventor: Keqiang Wu, Jiwei Lu, Yong-Fong Lee
CPC classification number: G06F9/30065, G06F8/52, G06F8/70, G06F9/30087, G06F11/3024, G06F11/3409, G06F11/3466, G06F2201/81, G06F2201/825, G06F2201/865, G06F2201/88
Abstract: One embodiment provides an apparatus. The apparatus includes a processor, a chipset, a memory to store a process, and logic. The processor includes one or more core(s) and is to execute the process. The logic is to acquire performance monitoring data in response to a platform processor utilization parameter (PUP) greater than a detection utilization threshold (UT), identify a spin loop based, at least in part, on at least one of a detected hot function and/or a detected hot loop, modify the identified spin loop using binary translation to create a modified process portion, and implement redirection from the identified spin loop to the modified process portion.
-
Publication No.: US12026106B2
Publication date: 2024-07-02
Application No.: US17802117
Application date: 2020-03-30
Applicant: INTEL CORPORATION
Inventor: Keqiang Wu, Zhidong Yu, Cheng Xu, Samuel Ortiz, Weiting Chen
CPC classification number: G06F13/1678, G06F13/1621, G06F13/4004
Abstract: The present disclosure provides an interconnect for a non-uniform memory architecture platform to provide remote access where data can dynamically and adaptively be compressed and decompressed at the interconnect link. A requesting interconnect link can add a delay before transmitting requested data onto an interconnect bus, compress the data before transmission, or packetize and compress data before transmission. Likewise, a remote interconnect link can decompress request data.
-
Publication No.: US09760404B2
Publication date: 2017-09-12
Application No.: US14842359
Application date: 2015-09-01
Applicant: Intel Corporation
Inventor: Keqiang Wu, Kingsum Chow, Ying C. Feng, Khun Ban
CPC classification number: G06F9/5011, G06F1/32, G06F11/3006, G06F11/3024, G06F11/3409, G06F11/3442, G06F11/3466, G06F11/3495, G06F2201/81, G06F2201/865, G06F2201/88
Abstract: Generally, this disclosure provides systems, devices, methods and computer readable media for dynamic tuning of multiprocessor and multicore computing systems to improve application performance and scalability. A system may include a number of processing units (CPUs) and profiling circuitry configured to detect the existence of a scalability problem associated with the execution of an application on CPUs and to determine if the scalability problem is associated with an access contention or a resource constraint. The system may also include scheduling circuitry configured to bind the application to a subset of the total number of CPUs if the scalability problem is associated with access contention.
-
Publication No.: US09223699B2
Publication date: 2015-12-29
Application No.: US13837069
Application date: 2013-03-15
Applicant: Intel Corporation
Inventor: Keqiang Wu, Kingsum Chow, Yong-Fong Lee
IPC: G06F12/08
CPC classification number: G06F12/0802, G06F12/0253, G06F12/0815, G06F12/0837
Abstract: Methods and apparatus to provide cache management in managed runtime environments are described. In one embodiment, a controller comprises logic to determine an update frequency for an object in the runtime environment and to assign the object to an unshared cache line when the update frequency exceeds an update frequency threshold. Other embodiments are also described.
-
Publication No.: US10761586B2
Publication date: 2020-09-01
Application No.: US15867880
Application date: 2018-01-11
Applicant: Intel Corporation
Inventor: Keqiang Wu, Yong-fong Lee, Krishnaswamy Viswanathan, Emad Guirguis
IPC: G06F1/32, G06F1/324, G06F1/28, G06F1/3206
Abstract: Systems, apparatuses and methods may provide for technology that determines a first real-time correlation between a power consumption of a processor and an operating frequency of the processor, determines a second real-time correlation between a performance level of the processor and the operating frequency of the processor, and sets the operating frequency of the processor to a value based on the first and second real-time correlations. In one example, the performance level or performance per watt of the processor decreases at one or more operating frequencies greater than the value.
-
Publication No.: US10452443B2
Publication date: 2019-10-22
Application No.: US15670525
Application date: 2017-08-07
Applicant: Intel Corporation
Inventor: Keqiang Wu, Kingsum Chow, Ying Feng, Khun Ban
Abstract: Generally, this disclosure provides systems, devices, methods and computer readable media for dynamic tuning of multiprocessor and multicore computing systems to improve application performance and scalability. A system may include a number of processing units (CPUs) and profiling circuitry configured to detect the existence of a scalability problem associated with the execution of an application on CPUs and to determine if the scalability problem is associated with an access contention or a resource constraint. The system may also include scheduling circuitry configured to bind the application to a subset of the total number of CPUs if the scalability problem is associated with access contention.
-
Publication No.: US09954744B2
Publication date: 2018-04-24
Application No.: US14842438
Application date: 2015-09-01
Applicant: Intel Corporation
Inventor: Keqiang Wu, Kingsum Chow, Ying Feng, Khun Ban, Zhidong Yu
IPC: H04L12/26
CPC classification number: H04L43/04, H04L43/024, H04L43/062, H04L43/067, H04L43/0817, H04L43/16
Abstract: Generally, this disclosure provides systems, devices, methods and computer readable media for estimation of application execution performance variations on a processor, without a priori knowledge of the application. A system may include network traffic data collection circuitry configured to sample a first network traffic statistic, from a network interface circuit associated with the processor, at a first sampling time interval during the application execution. The network traffic data collection circuitry may also be configured to sample a second network traffic statistic from the network interface circuit at a second sampling time interval during the application execution. The system may further include performance analysis circuitry configured to calculate a ratio of the first network traffic statistic to the second network traffic statistic and to estimate the application execution performance variation from the first sampling time interval to the second sampling time interval, wherein the estimation is proportional to the calculated ratio.
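Illustrative sketch (not from the patent): the abstract states that the estimated performance variation is proportional to the ratio of the first network traffic statistic to the second. The function name `estimate_variation` and the proportionality constant `scale` are hypothetical, and the statistic is assumed to be a simple count such as packets per interval.

```python
def estimate_variation(first_stat: float, second_stat: float, scale: float = 1.0) -> float:
    """Estimate performance variation between two sampling intervals.

    first_stat and second_stat are the same network traffic statistic sampled
    at the first and second intervals. The estimate is scale * (first / second).
    """
    if second_stat == 0:
        raise ValueError("second_stat must be non-zero")
    ratio = first_stat / second_stat
    return scale * ratio
```

For instance, 1200 packets in the first interval and 1000 in the second give a ratio of 1.2 when `scale` is left at 1.0.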