Abstract:
A method performed by a multi-core processor is described. The method includes, while a core is executing program code, reading a dirty cache line from the core's last-level cache and sending the dirty cache line from the core for storage external to the core, where the dirty cache line has been neither evicted from the cache nor requested by another core or processor.
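As a rough illustration, here is a minimal C sketch of the idea in software terms (the cache_line structure, the llc array, and writeback_external are our inventions, not the claimed hardware interface): while the core keeps running, dirty lines are copied out to external storage and marked clean, without being evicted or handed to another core.

#include <stdbool.h>
#include <stdio.h>

#define LLC_LINES 8

/* Simulated last-level-cache line: valid/dirty state plus payload. */
struct cache_line {
    bool valid;
    bool dirty;
    unsigned long tag;
    unsigned char data[64];
};

static struct cache_line llc[LLC_LINES];

/* Stand-in for sending a line to storage outside the core. */
static void writeback_external(const struct cache_line *line)
{
    printf("writeback tag=%lx\n", line->tag);
}

/* Proactive writeback: copy dirty lines out while the core keeps running.
 * Each line stays valid in the cache; only its dirty bit is cleared, so
 * it is neither evicted nor shared with another core. */
static void proactive_writeback(void)
{
    for (int i = 0; i < LLC_LINES; i++) {
        if (llc[i].valid && llc[i].dirty) {
            writeback_external(&llc[i]);
            llc[i].dirty = false;   /* now clean; a later eviction is silent */
        }
    }
}

int main(void)
{
    llc[3] = (struct cache_line){ .valid = true, .dirty = true, .tag = 0x1234 };
    proactive_writeback();
    return 0;
}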
Abstract:
In an embodiment, a processor includes logic to cause at least one core to operate with a power control cycle including a plurality of on times and a plurality of off times according to an ON-OFF keying protocol, where the on and off times vary depending on whether and when an interrupt is incurred. Other embodiments are described and claimed.
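One way to read this, sketched in C with made-up names and time units (interrupt_pending, on_time, and off_time are illustrative assumptions, not the claimed logic): the core alternates on and off periods, and an interrupt cuts the current off period short, with the skipped off time folded back into the next on period.

#include <stdbool.h>
#include <stdio.h>

/* Illustrative ON-OFF keying cycle: the core alternates on and off
 * periods; an interrupt ends the off period early and stretches the
 * following on period. All durations are arbitrary ticks. */
static bool interrupt_pending(int tick)
{
    return tick == 2;   /* pretend an interrupt arrives at tick 2 */
}

int main(void)
{
    int on_time = 4, off_time = 4;

    for (int cycle = 0; cycle < 3; cycle++) {
        printf("cycle %d: on for %d ticks\n", cycle, on_time);

        int off_spent = 0;
        for (int t = 0; t < off_time; t++) {
            if (interrupt_pending(t)) {
                on_time += off_time - t;   /* give back the skipped off time */
                break;
            }
            off_spent++;
        }
        printf("cycle %d: off for %d ticks\n", cycle, off_spent);
    }
    return 0;
}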
Abstract:
Technologies for aggregation-based message processing include multiple computing nodes in communication over a network. A computing node receives a message from a remote computing node, increments an event counter in response to receiving the message, determines whether an event trigger is satisfied in response to incrementing the counter, and writes a completion event to an event queue if the event trigger is satisfied. An application of the computing node monitors the event queue for the completion event. The application may be executed by a processor core of the computing node, and the other operations may be performed by a host fabric interface of the computing node. The computing node may be a target node and count one-sided messages received from an initiator node, or the computing node may be an initiator node and count acknowledgement messages received from a target node. Other embodiments are described and claimed.
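A minimal C sketch of the counting path, assuming invented names (event_counter, TRIGGER_COUNT, write_completion_event) and a simple fixed-threshold trigger: each received message bumps the counter, and only when the trigger condition holds is a completion event posted to the queue the application polls.

#include <stdio.h>

#define QUEUE_LEN 16
#define TRIGGER_COUNT 4   /* fire a completion event every 4 messages */

/* Illustrative host-fabric-interface state: an event counter plus a
 * completion-event queue polled by the application. */
static unsigned long event_counter;
static int queue[QUEUE_LEN];
static int queue_head, queue_tail;

static void write_completion_event(int event)
{
    queue[queue_tail] = event;
    queue_tail = (queue_tail + 1) % QUEUE_LEN;
}

/* Called per received message: count it, test the trigger, and post a
 * completion event only when the threshold is met. */
static void on_message_received(void)
{
    event_counter++;
    if (event_counter % TRIGGER_COUNT == 0)
        write_completion_event((int)event_counter);
}

int main(void)
{
    for (int msg = 0; msg < 10; msg++)
        on_message_received();

    /* Application side: drain the completion events. */
    while (queue_head != queue_tail) {
        printf("completion after %d messages\n", queue[queue_head]);
        queue_head = (queue_head + 1) % QUEUE_LEN;
    }
    return 0;
}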
Abstract:
Methods and apparatus for reduced network load with receiver-managed offset (RMO) PUT or GET messages. An RMO PUT message including an RMO key, data, and a length is sent from an initiator to a target, where the RMO key is extracted by a Network Interface Controller (NIC), SmartNIC, or Infrastructure Processing Unit and used to identify an address or address offset of a memory buffer in a target memory at which to write the data. An RMO GET message is sent from an initiator to a target and includes an RMO key, a source buffer on the target, and a length. The target processes the RMO GET, reads the length of data from its source buffer, and returns a message to the initiator including the RMO key, the read data, and the length. The RMO key is extracted and used to identify an address or address offset of a memory buffer in a memory on the initiator in which to write the read data.
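A hedged C sketch of target-side PUT handling, under the assumption that the RMO key simply indexes a table of registered buffers (the rmo_put structure, the key-to-buffer table, and handle_rmo_put are our names, not the wire format): the sender carries no address, and the receiver resolves the key to a buffer locally.

#include <stdio.h>
#include <string.h>

#define MAX_KEYS 8
#define BUF_SIZE 256

/* Illustrative RMO PUT message: key, payload, and length only. */
struct rmo_put {
    unsigned key;
    const unsigned char *data;
    size_t length;
};

/* Receiver-managed mapping from RMO key to a registered buffer. */
static unsigned char buffers[MAX_KEYS][BUF_SIZE];

static void handle_rmo_put(const struct rmo_put *msg)
{
    if (msg->key >= MAX_KEYS || msg->length > BUF_SIZE)
        return;                       /* unknown key or oversized payload */
    /* NIC-side write: key resolves to the buffer address locally. */
    memcpy(buffers[msg->key], msg->data, msg->length);
}

int main(void)
{
    const unsigned char payload[] = "hello";
    struct rmo_put msg = { .key = 3, .data = payload, .length = sizeof payload };
    handle_rmo_put(&msg);
    printf("buffer 3: %s\n", buffers[3]);
    return 0;
}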
Abstract:
Examples described herein relate to a computing system supporting custom page-size ranges, allowing an application to map contiguous memory regions instead of many smaller pages. An application can request a custom range size. An operating system can allocate a contiguous physical memory region to a virtual address range by specifying custom range sizes that are larger or smaller than the general page sizes. Virtual-to-physical address translation can occur using address range circuitry and a translation lookaside buffer in parallel. The address range circuitry can determine whether a custom entry is available to identify a physical address translation for the virtual address. Physical address translation can be performed by transforming the virtual address in some examples.
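A C sketch of the range lookup, assuming a range entry stores a virtual base, a custom size, and a physical base (the range_entry layout and range_translate are illustrative): because the span is contiguous, translation is an address transform rather than a per-page walk, and a miss falls back to the conventional TLB path.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative custom-range entry: maps a whole contiguous virtual span
 * to contiguous physical memory. */
struct range_entry {
    bool valid;
    uint64_t va_base;
    uint64_t size;       /* custom size, not tied to 4K/2M/1G pages */
    uint64_t pa_base;
};

static struct range_entry range_table[4];

static bool range_translate(uint64_t va, uint64_t *pa)
{
    for (int i = 0; i < 4; i++) {
        const struct range_entry *e = &range_table[i];
        if (e->valid && va >= e->va_base && va < e->va_base + e->size) {
            *pa = e->pa_base + (va - e->va_base);   /* address transform */
            return true;
        }
    }
    return false;   /* fall back to the conventional TLB/page walk */
}

int main(void)
{
    range_table[0] = (struct range_entry){
        .valid = true, .va_base = 0x10000000, .size = 3u << 20, /* 3 MiB */
        .pa_base = 0x80000000,
    };

    uint64_t pa;
    if (range_translate(0x10000040, &pa))
        printf("pa = 0x%llx\n", (unsigned long long)pa);
    return 0;
}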
Abstract:
Various different embodiments of the invention are described, including: (1) a method and apparatus for intelligently allocating threads within a binary translation system; (2) data cache way prediction guided by binary translation code-morphing software; (3) fast interpreter hardware support on the data side; (4) out-of-order retirement; (5) decoupled load retirement in an atomic OOO processor; (6) handling transactional and atomic memory in an out-of-order binary-translation-based processor; and (7) speculative memory management in a binary-translation-based out-of-order processor.
Abstract:
In an example, there is disclosed a compute node, comprising: first one or more logic elements comprising a data producer engine to produce a datum; and a host fabric interface to communicatively couple the compute node to a fabric, the host fabric interface comprising second one or more logic elements comprising a data pulling engine, the data pulling engine to: publish the datum as available; receive a pull request for the datum, the pull request comprising a node identifier for a data consumer; and send the datum to the data consumer via the fabric. There is also disclosed a method of providing a data pulling engine.
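A minimal C sketch of the pull flow, with invented structures (pull_request, send_to_node) standing in for the fabric: the engine publishes that the datum is available, then answers each pull request by sending the datum to the node named in the request, rather than pushing it unsolicited.

#include <stdbool.h>
#include <stdio.h>

/* Illustrative pull request: carries the consumer's node identifier. */
struct pull_request {
    int consumer_node_id;
};

static bool datum_published;
static int datum = 42;

/* Stand-in for a fabric send to a specific node. */
static void send_to_node(int node_id, int value)
{
    printf("send %d to node %d over fabric\n", value, node_id);
}

static void publish_datum(void)
{
    datum_published = true;   /* advertise availability on the fabric */
}

static void handle_pull_request(const struct pull_request *req)
{
    if (datum_published)
        send_to_node(req->consumer_node_id, datum);
}

int main(void)
{
    publish_datum();
    struct pull_request req = { .consumer_node_id = 7 };
    handle_pull_request(&req);
    return 0;
}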
Abstract:
Technologies for monitoring communication performance of a high performance computing (HPC) network include a performance probing engine of a source endpoint node of the HPC network. The performance probing engine is configured to generate a probe request that includes a timestamp of the probe request and transmit the probe request to a destination endpoint node of the HPC network communicatively coupled to the source endpoint node via the HPC network. The performance probing engine is additionally configured to receive a probe response from the destination endpoint node via the HPC network and to generate another timestamp that corresponds to the probe request having been received. Further, the performance probing engine is configured to determine a round-trip latency as a function of the probe request and probe response timestamps. Other embodiments are described and claimed.
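A C sketch of the timestamp arithmetic, using clock_gettime(CLOCK_MONOTONIC, ...) as a stand-in for the probing engine's clock and eliding the probe exchange itself: stamp the request on send, stamp again when the response arrives, and take the difference as the round-trip latency.

#include <stdio.h>
#include <time.h>

/* Monotonic clock read, as a stand-in for the probing engine's clock. */
static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

int main(void)
{
    double t_request = now_seconds();   /* timestamp carried in the probe */

    /* ... probe request travels to the destination; response returns ... */

    double t_response = now_seconds();  /* timestamp at response receipt */
    printf("round-trip latency: %.9f s\n", t_response - t_request);
    return 0;
}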