Patent search ap:("Intel Corporation") AND inv:"Adrian C. Moga" Page 1

1.

Granted invention patent
Integrated three-dimensional (3D) DRAM cache (status: in force)

Publication number: US12271306B2

Publication date: 2025-04-08

Application number: US17214835

Application date: 2021-03-27

Applicant: Intel Corporation

Inventor: Wilfred Gomes, Adrian C. Moga, Abhishek Sharma

IPC: G06F12/00, G06F12/0802

Abstract: Three-dimensional (3D) DRAM integrated in the same package as compute logic enables forming high-density caches. In one example, an integrated 3D DRAM includes a large on-die cache (such as a level 4 (L4) cache), a large on-die memory-side cache, or both an L4 cache and a memory-side cache. One or more tag caches cache recently accessed tags from the L4 cache, the memory-side cache, or both. A cache controller in the compute logic is to receive a request from one of the processor cores to access an address and compare tags in the tag cache with the address. In response to a hit in the tag cache, the cache controller accesses data from the cache at a location indicated by an entry in the tag cache, without performing a tag lookup in the cache.
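
The tag-cache lookup path described in this abstract can be illustrated with a minimal sketch. This is not the patented implementation: the direct-mapped organization, the class names DramCache and CacheController, and the tag_lookups counter are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DramCache:
    """Direct-mapped stand-in for the large in-package DRAM cache (hypothetical model)."""
    num_sets: int
    tags: dict = field(default_factory=dict)  # set index -> tag held in the DRAM cache
    data: dict = field(default_factory=dict)  # set index -> cached data
    tag_lookups: int = 0                      # counts tag lookups performed in the DRAM cache

    def lookup_tag(self, index):
        self.tag_lookups += 1
        return self.tags.get(index)


class CacheController:
    """Hypothetical controller that keeps a small cache of recently accessed tags."""

    def __init__(self, cache):
        self.cache = cache
        self.tag_cache = {}  # set index -> tag

    def _split(self, address):
        return address % self.cache.num_sets, address // self.cache.num_sets

    def read(self, address):
        index, tag = self._split(address)
        if self.tag_cache.get(index) == tag:
            # Tag-cache hit: read the data location directly, no DRAM tag lookup.
            return self.cache.data[index]
        if self.cache.lookup_tag(index) == tag:
            self.tag_cache[index] = tag  # remember the tag for later accesses
            return self.cache.data[index]
        return None  # miss in the DRAM cache; the fill path is omitted here

    def fill(self, address, value):
        index, tag = self._split(address)
        self.cache.tags[index] = tag
        self.cache.data[index] = value
        self.tag_cache.pop(index, None)  # drop any stale tag-cache entry
```

In this sketch, a second read of the same address is served through the tag cache, so the tag_lookups counter does not increase.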

2.

Granted invention patent
Inclusive and non-inclusive tracking of local cache lines to avoid near memory reads on cache line memory writes into a two level system memory (status: in force)

Publication number: US09418009B2

Publication date: 2016-08-16

Application number: US14142045

Application date: 2013-12-27

Applicant: Intel Corporation

Inventor: Adrian C. Moga, Vedaraman Geetha, Bahaa Fahim, Robert G. Blankenship, Yen-Cheng Liu, Jeffrey D. Chamberlain, Stephen R. Van Doren

IPC: G06F12/00, G06F13/00, G06F13/28, G06F12/08

CPC classification number: G06F12/0811, G06F12/0888

Abstract: A processor may include a memory controller to interface with a system memory having a near memory and a far memory. The processor may include logic circuitry to cause the memory controller to determine whether a write request is generated remotely or locally and: (i) when the write request is generated remotely, to instruct the memory controller to perform a read of near memory before performing the write; (ii) when the write request is generated locally and a cache line targeted by the write request is in the inclusive state, to instruct the memory controller to perform the write without performing a read of near memory; and (iii) when the write request is generated locally and the cache line targeted by the write request is in the non-inclusive state, to instruct the memory controller to read near memory before performing the write.
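
The three cases in this abstract reduce to a small decision function. The sketch below is only an illustration of that decision; the enum and function names (Origin, LineState, must_read_near_memory) are hypothetical and do not come from the patent.

```python
from enum import Enum


class Origin(Enum):
    LOCAL = "local"
    REMOTE = "remote"


class LineState(Enum):
    INCLUSIVE = "inclusive"
    NON_INCLUSIVE = "non-inclusive"


def must_read_near_memory(origin, line_state):
    """Return True if near memory has to be read before the write is performed."""
    if origin is Origin.REMOTE:
        # Remotely generated writes always read near memory first.
        return True
    # Locally generated writes skip the read only when the line is tracked as inclusive.
    return line_state is not LineState.INCLUSIVE
```

Only the local, inclusive case avoids the near-memory read, which is the saving the title refers to.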

3.

Invention patent application
Virtual Shared Cache Mechanism in a Processing Device (status: in force)

Publication number: US20160077970A1

Publication date: 2016-03-17

Application number: US14484642

Application date: 2014-09-12

Applicant: Intel Corporation

Inventor: Yen-Cheng Liu, Aamer Jaleel, Bongjin Jung, Zeshan A. Chishti, Adrian C. Moga, Eric Delano, Ren Wang

IPC: G06F12/08

CPC classification number: G06F12/084, G06F12/0811, G06F12/0831, G06F12/0842, G06F12/0846, G06F2212/1024

Abstract: In accordance with embodiments disclosed herein, there are provided systems and methods for providing a virtual shared cache mechanism. A processing device includes a plurality of clusters allocated into a virtual private shared cache. Each of the clusters includes a plurality of cores and a plurality of cache slices co-located within the plurality of cores. The processing device also includes a virtual shared cache including the plurality of clusters such that the cache data in the plurality of cache slices is shared among the plurality of clusters.

4.

Granted invention patent
Cache coherency apparatus and method minimizing memory writeback operations (status: in force)

Publication number: US09436605B2

Publication date: 2016-09-06

Application number: US14136131

Application date: 2013-12-20

Applicant: Intel Corporation

Inventor: Jeffrey D. Chamberlain, Vedaraman Geetha, Robert G. Blankenship, Yen-Cheng Liu, Adrian C. Moga, Herbert H. Hum, Sailesh Kottapalli

IPC: G06F12/08

CPC classification number: G06F12/0817, G06F12/0815

Abstract: An apparatus and method for reducing or eliminating writeback operations. For example, one embodiment of a method comprises: detecting a first operation associated with a cache line at a first requestor cache; detecting that the cache line exists in a first cache in a modified (M) state; forwarding the cache line from the first cache to the first requestor cache and storing the cache line in the first requestor cache in a second modified (M′) state; detecting a second operation associated with the cache line at a second requestor; responsively forwarding the cache line from the first requestor cache to the second requestor cache and storing the cache line in the second requestor cache in an owned (O) state if the cache line has not been modified in the first requestor cache; and setting the cache line to a shared (S) state in the first requestor cache.
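
The two forwarding steps in this abstract can be traced with a toy state table. This is a sketch under stated assumptions, not the claimed protocol: the forward function, the cache names used with it, and the choice to mark the original holder invalid after the first forward are all hypothetical, since the abstract does not say what state the first cache is left in.

```python
M, M_PRIME, O, S, I = "M", "M'", "O", "S", "I"


def forward(states, source, requestor, modified_in_source=False):
    """Forward a line from `source` to `requestor`, updating per-cache states in place.

    Neither transition modelled here writes the line back to memory.
    """
    if states[source] == M:
        # First operation: the requestor receives the line in the second modified state.
        states[requestor] = M_PRIME
        states[source] = I  # assumption: not stated in the abstract
    elif states[source] == M_PRIME and not modified_in_source:
        # Second operation: the new requestor becomes owner, the forwarder keeps a shared copy.
        states[requestor] = O
        states[source] = S
    else:
        raise NotImplementedError("case not described in the abstract")
    return states
```

Starting from one cache holding the line in M, two successive forwards leave the second requestor in O and the first requestor in S.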

5.

Granted invention patent
Allocation and write policy for a glueless area-efficient directory cache for hotly contested cache lines (status: in force)

Publication number: US08631210B2

Publication date: 2014-01-14

Application number: US13786305

Application date: 2013-03-05

Applicant: Intel Corporation

Inventor: Adrian C. Moga, Malcolm Mandviwalla, Vedaraman Geetha, Herbert H. Hum

IPC: G06F12/00

CPC classification number: G06F12/0831, G06F12/082, G06F2212/2542

Abstract: Methods and apparatus relating to allocation and/or write policy for a glueless area-efficient directory cache for hotly contested cache lines are described. In one embodiment, a directory cache stores data corresponding to a caching status of a cache line. The caching status of the cache line is stored for each of a plurality of caching agents in the system. A write-on-allocate policy is used for the directory cache by using a special state (e.g., snoop-all state) that indicates one or more snoops are to be broadcast to all agents in the system. Other embodiments are also disclosed.
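
As a rough illustration of what the special snoop-all state means for snoop routing, the sketch below resolves a directory entry to a set of snoop targets. The SNOOP_ALL sentinel, the snoop_targets function, and the decision to exclude the requester are assumptions made for this example; the abstract does not describe the entry encoding or the allocation path.

```python
SNOOP_ALL = object()  # sentinel standing in for the special snoop-all state


def snoop_targets(entry, all_agents, requester):
    """Return the agents to snoop for a line, given its directory entry."""
    if entry is SNOOP_ALL:
        targets = set(all_agents)  # broadcast to every agent in the system
    else:
        targets = set(entry)       # only the agents recorded as caching the line
    targets.discard(requester)     # assumption: the requester is not snooped
    return targets
```

A precise entry limits snoops to the recorded agents, while the special state falls back to a broadcast.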

6.

Granted invention patent
Processor and method implementing a cacheline demote machine instruction (status: in force)

Publication number: US11513957B2

Publication date: 2022-11-29

Application number: US17027248

Application date: 2020-09-21

Applicant: Intel Corporation

Inventor: Ren Wang, Andrew J. Herdrich, Yen-cheng Liu, Herbert H. Hum, Jong Soo Park, Christopher J. Hughes, Namakkal N. Venkatesan, Adrian C. Moga, Aamer Jaleel, Zeshan A. Chishti, Mesut A. Ergin, Jr-shian Tsai, Alexander W. Min, Tsung-yuan C. Tai, Christian Maciocco, Rajesh Sankaran

IPC: G06F12/0842, G06F12/0893, G06F12/109, G06F12/0813, G06F12/0831, G06F9/455

Abstract: Methods and apparatus implementing Hardware/Software co-optimization to improve performance and energy for inter-VM communication for NFVs and other producer-consumer workloads. The apparatus includes multi-core processors with multi-level cache hierarchies including an L1 and L2 cache for each core and a shared last-level cache (LLC). One or more machine-level instructions are provided for proactively demoting cachelines from lower cache levels to higher cache levels, including demoting cachelines from L1/L2 caches to an LLC. Techniques are also provided for implementing hardware/software co-optimization in multi-socket NUMA architecture systems, wherein cachelines may be selectively demoted and pushed to an LLC in a remote socket. In addition, techniques are disclosed for implementing early snooping in multi-socket systems to reduce latency when accessing cachelines on remote sockets.
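
The effect of demoting a cacheline in a producer-consumer exchange can be shown with a toy single-socket model. Everything in this sketch is an assumption made for illustration: the Hierarchy class, the merging of each core's L1 and L2 into one private set, and the labels returned by read_source are not taken from the patent.

```python
class Hierarchy:
    """Toy model: one private L1/L2 set per core plus a shared LLC."""

    def __init__(self, num_cores):
        self.private = [set() for _ in range(num_cores)]
        self.llc = set()

    def write(self, core, line):
        # The producer's write leaves the line in its private caches.
        self.private[core].add(line)

    def demote(self, core, line):
        # Proactively move the line from the core's L1/L2 to the shared LLC.
        if line in self.private[core]:
            self.private[core].remove(line)
            self.llc.add(line)

    def read_source(self, core, line):
        """Report where a read by `core` would find the line."""
        if line in self.private[core]:
            return "private"
        if line in self.llc:
            return "llc"
        for other, lines in enumerate(self.private):
            if other != core and line in lines:
                return "snoop"  # must be fetched from another core's private cache
        return "memory"
```

Without the demote, the consumer's read has to snoop the producer's private cache; after it, the same read is served from the LLC.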

7.

Granted invention patent
Processors having virtually clustered cores and cache slices (status: in force)

Publication number: US10725920B2

Publication date: 2020-07-28

Application number: US15947831

Application date: 2018-04-08

Applicant: Intel Corporation

Inventor: Herbert H. Hum, Brinda Ganesh, James R. Vash, Ganesh Kumar, Leena K. Puthiyedath, Scott J. Erlanger, Eric J. Dehaemer, Adrian C. Moga, Michelle M. Sebot, Richard L. Carlson, David Bubien, Eric Delano

IPC: G06F12/00, G06F12/0831, G06F12/0811, G06F12/084

Abstract: A processor of an aspect includes a plurality of logical processors each having one or more corresponding lower level caches. A shared higher level cache is shared by the plurality of logical processors. The shared higher level cache includes a distributed cache slice for each of the logical processors. The processor includes logic to direct an access that misses in one or more lower level caches of a corresponding logical processor to a subset of the distributed cache slices in a virtual cluster that corresponds to the logical processor. Other processors, methods, and systems are also disclosed.
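
One way to picture directing a lower-level miss to a subset of slices is the sketch below. It assumes one slice per logical processor (as the abstract states), contiguous clusters of equal size, and a simple modulo hash on the line address; the function name select_slice and its parameters are hypothetical, and the abstract does not specify how a slice is actually chosen.

```python
def select_slice(core, address, cores_per_cluster, line_bytes=64):
    """Pick the cache slice for a miss, restricted to the requesting core's virtual cluster."""
    cluster = core // cores_per_cluster              # virtual cluster of the requesting core
    first_slice = cluster * cores_per_cluster        # first slice belonging to that cluster
    offset = (address // line_bytes) % cores_per_cluster  # assumed modulo hash on the line address
    return first_slice + offset
```

For core 5 with four cores per cluster, every address maps to one of slices 4 through 7, so the access never leaves the core's own cluster.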

8.

Granted invention patent
Virtual shared cache mechanism in a processing device (status: in force)

Publication number: US09792212B2

Publication date: 2017-10-17

Application number: US14484642

Application date: 2014-09-12

Applicant: Intel Corporation

Inventor: Yen-Cheng Liu, Aamer Jaleel, Bongjin Jung, Zeshan A. Chishti, Adrian C. Moga, Eric Delano, Ren Wang

IPC: G06F12/00, G06F13/00, G06F12/084, G06F12/0811, G06F12/0842, G06F12/0846, G06F12/0831

CPC classification number: G06F12/084, G06F12/0811, G06F12/0831, G06F12/0842, G06F12/0846, G06F2212/1024

Abstract: In accordance with embodiments disclosed herein, there are provided systems and methods for providing a virtual shared cache mechanism. A processing device includes a plurality of clusters allocated into a virtual private shared cache. Each of the clusters includes a plurality of cores and a plurality of cache slices co-located within the plurality of cores. The processing device also includes a virtual shared cache including the plurality of clusters such that the cache data in the plurality of cache slices is shared among the plurality of clusters.

9.

Granted invention patent
Processors having virtually clustered cores and cache slices (status: in force)

Publication number: US10073779B2

Publication date: 2018-09-11

Application number: US13729579

Application date: 2012-12-28

Applicant: Intel Corporation

Inventor: Herbert H. Hum, Brinda Ganesh, James R. Vash, Ganesh Kumar, Leena K. Puthiyedath, Scott J. Erlanger, Eric J. Dehaemer, Adrian C. Moga, Michelle M. Sebot, Richard L. Carlson, David Bubien, Eric Delano

IPC: G06F12/00, G06F12/0831, G06F12/0811, G06F12/084

CPC classification number: G06F12/0831, G06F12/0811, G06F12/084

Abstract: A processor of an aspect includes a plurality of logical processors each having one or more corresponding lower level caches. A shared higher level cache is shared by the plurality of logical processors. The shared higher level cache includes a distributed cache slice for each of the logical processors. The processor includes logic to direct an access that misses in one or more lower level caches of a corresponding logical processor to a subset of the distributed cache slices in a virtual cluster that corresponds to the logical processor. Other processors, methods, and systems are also disclosed.

10.

Invention patent application
HARDWARE/SOFTWARE CO-OPTIMIZATION TO IMPROVE PERFORMANCE AND ENERGY FOR INTER-VM COMMUNICATION FOR NFVS AND OTHER PRODUCER-CONSUMER WORKLOADS (status: under examination, published)

Publication number: US20160188474A1

Publication date: 2016-06-30

Application number: US14583389

Application date: 2014-12-26

Applicant: Intel Corporation

Inventor: Ren Wang, Andrew J. Herdrich, Yen-cheng Liu, Herbert H. Hum, Jong Soo Park, Christopher J. Hughes, Namakkal N. Venkatesan, Adrian C. Moga, Aamer Jaleel, Zeshan A. Chishti, Mesut A. Ergin, Jr-shian Tsai, Alexander W. Min, Tsung-yuan C. Tai, Christian Maciocco, Rajesh Sankaran

IPC: G06F12/08

CPC classification number: G06F12/0842, G06F9/45558, G06F12/0813, G06F12/0833, G06F12/0893, G06F12/109, G06F2009/45595, G06F2212/1021, G06F2212/283, G06F2212/62, Y02D10/13

Abstract: Methods and apparatus implementing Hardware/Software co-optimization to improve performance and energy for inter-VM communication for NFVs and other producer-consumer workloads. The apparatus includes multi-core processors with multi-level cache hierarchies including an L1 and L2 cache for each core and a shared last-level cache (LLC). One or more machine-level instructions are provided for proactively demoting cachelines from lower cache levels to higher cache levels, including demoting cachelines from L1/L2 caches to an LLC. Techniques are also provided for implementing hardware/software co-optimization in multi-socket NUMA architecture systems, wherein cachelines may be selectively demoted and pushed to an LLC in a remote socket. In addition, techniques are disclosed for implementing early snooping in multi-socket systems to reduce latency when accessing cachelines on remote sockets.
