-
Publication No.: US20200310993A1
Publication date: 2020-10-01
Application No.: US16370587
Filing date: 2019-03-29
Applicant: INTEL CORPORATION
Inventor: Sanjay Kumar, David Koufaty, Philip Lantz, Pratik Marolia, Rajesh Sankaran, Koen Koning
IPC: G06F13/16, G06F12/1027, G06F3/06
Abstract: The present disclosure is directed to systems and methods sharing memory circuitry between processor memory circuitry and accelerator memory circuitry in each of a plurality of peer-to-peer connected accelerator units. Each of the accelerator units includes physical-to-virtual address translation circuitry and migration circuitry. The physical-to-virtual address translation circuitry in each accelerator unit includes pages for each of at least some of the plurality of accelerator units. The migration circuitry causes the transfer of data between the processor memory circuitry and the accelerator memory circuitry in each of the plurality of accelerator circuits. The migration circuitry migrates and evicts data to/from accelerator memory circuitry based on statistical information associated with accesses to at least one of: processor memory circuitry or accelerator memory circuitry in one or more peer accelerator circuits. Thus, the processor memory circuitry and accelerator memory circuitry may be dynamically allocated to advantageously minimize system latency attributable to data access operations.
-
Publication No.: US12229069B2
Publication date: 2025-02-18
Application No.: US17083200
Filing date: 2020-10-28
Applicant: Intel Corporation
Inventor: Pratik Marolia, Andrew Herdrich, Rajesh Sankaran, Rahul Pal, David Puffer, Sayantan Sur, Ajaya Durg
Abstract: Methods and apparatus for an accelerator controller hub (ACH). The ACH may be a stand-alone component or integrated on-die or on package in an accelerator such as a GPU. The ACH may include a host device link (HDL) interface, one or more Peripheral Component Interconnect Express (PCIe) interfaces, one or more high performance accelerator link (HPAL) interfaces, and a router, operatively coupled to each of the HDL interface, the one or more PCIe interfaces, and the one or more HPAL interfaces. The HDL interface is configured to be coupled to a host CPU via an HDL link and the one or more HPAL interfaces are configured to be coupled to one or more HPALs that are used to access high performance accelerator fabrics (HPAFs) such as NVlink fabrics and CCIX (Cache Coherent Interconnect for Accelerators) fabrics. Platforms including ACHs or accelerators with integrated ACHs support RDMA transfers using RDMA semantics to enable transfers between accelerator memory on initiators and targets without CPU involvement.
-
Publication No.: US20240330053A1
Publication date: 2024-10-03
Application No.: US18194408
Filing date: 2023-03-31
Applicant: Intel Corporation
Inventor: Andrew J. Herdrich, Philip Abraham, Priya Autee, Stephen Van Doren, Yen-Cheng Liu, Rajesh Sankaran, Kameswar Subramaniam, Ritesh Parikh
CPC classification number: G06F9/5016, G06F9/3009, G06F9/5044
Abstract: Techniques for region-aware memory bandwidth allocation control are described. In an embodiment, an apparatus includes a processing core and control circuitry. The processing core is to execute a plurality of threads. The control circuitry is to control use of memory bandwidth per memory region and per thread.
-
Publication No.: US11907744B2
Publication date: 2024-02-20
Application No.: US16911445
Filing date: 2020-06-25
Applicant: Intel Corporation
Inventor: Utkarsh Y. Kakaiya, Sanjay K. Kumar, Philip Lantz, Gilbert Neiger, Rajesh Sankaran, Vedvyas Shanbhogue
CPC classification number: G06F9/45558, G06F9/30098, G06F9/5005, G06F9/546, G06F2009/4557, G06F2009/45579
Abstract: In one embodiment, a processor comprises: a first configuration register to store quality of service (QoS) information for a process address space identifier (PASID) value associated with a first process; and an execution circuit coupled to the first configuration register, where the execution circuit, in response to a first instruction, is to obtain command data from a first location identified in a source operand of the first instruction, insert the QoS information and the PASID value into the command data, and send a request comprising the command data to a device coupled to the processor, to enable the device to use the QoS information of a plurality of requests to manage sharing between a plurality of processes. Other embodiments are described and claimed.
-
Publication No.: US20230185603A1
Publication date: 2023-06-15
Application No.: US17551166
Filing date: 2021-12-14
Applicant: Intel Corporation
Inventor: Saurabh Gayen, Philip Lantz, Narayan Ranganathan, Dhananjay Joshi, Rajesh Sankaran, Utkarsh Kakaiya
CPC classification number: G06F9/4881, G06Q30/0283
Abstract: Methods and apparatus relating to dynamic capability discovery and enforcement for accelerators and devices in multi-tenant systems are described. In an embodiment, a hardware accelerator device advertises one or more available operations and/or capabilities of the hardware accelerator device to one or more tenants. Logic circuitry controls access to the one or more available operations and/or capabilities of the one or more work queues on a per-tenant basis. Other embodiments are also disclosed and claimed.
-
Publication No.: US20220147393A1
Publication date: 2022-05-12
Application No.: US17212977
Filing date: 2021-03-25
Applicant: Intel Corporation
Inventor: Rajesh Sankaran, Gilbert Neiger, Vedvyas Shanbhogue, David Koufaty
Abstract: An embodiment of an apparatus comprises decode circuitry to decode a single instruction, the single instruction to include a field for an identifier of a first source operand, a field for an identifier of a destination operand, and a field for an opcode, the opcode to indicate execution circuitry is to program a user timer, and execution circuitry to execute the decoded instruction according to the opcode to retrieve timer program information from a location indicated by the first source operand, and program a user timer indicated by the destination operand based on the retrieved timer program information. Other embodiments are disclosed and claimed.
-
Publication No.: US11169929B2
Publication date: 2021-11-09
Application No.: US15958591
Filing date: 2018-04-20
Applicant: Intel Corporation
Inventor: Rupin Vakharwala, Amin Firoozshahian, Stephen Van Doren, Rajesh Sankaran, Mahesh Madhav, Omid Azizi, Andreas Kleen, Mahesh Maddury, Ashok Raj
IPC: G06F12/1009, G06F3/06, G06F12/0862, G06F9/38, G06F12/1081, G06F12/1045, G06F12/1027
Abstract: A processing device includes a core to execute instructions, and memory management circuitry coupled to memory, the core, and an I/O device that supports page faults. The memory management circuitry includes an express invalidations circuitry, and a page translation permission circuitry. The memory management circuitry is to, while the core is executing the instructions, receive a command to pause communication between the I/O device and the memory. In response to receiving the command to pause the communication, modify permissions of page translations by the page translation permission circuitry and transmit an invalidation request, by the express invalidations circuitry to the I/O device, to cause cached page translations in the I/O device to be invalidated.
-
Publication No.: US11055147B2
Publication date: 2021-07-06
Application No.: US16351396
Filing date: 2019-03-12
Applicant: Intel Corporation
Inventor: Utkarsh Y. Kakaiya, Rajesh Sankaran, Sanjay Kumar, Kun Tian, Philip Lantz
Abstract: Techniques for scalable virtualization of an Input/Output (I/O) device are described. An electronic device composes a virtual device comprising one or more assignable interface (AI) instances of a plurality of AI instances of a hosting function exposed by the I/O device. The electronic device emulates device resources of the I/O device via the virtual device. The electronic device intercepts a request from the guest pertaining to the virtual device, and determines whether the request from the guest is a fast-path operation to be passed directly to one of the one or more AI instances of the I/O device or a slow-path operation that is to be at least partially serviced via software executed by the electronic device. For a slow-path operation, the electronic device services the request at least partially via the software executed by the electronic device.
-
Publication No.: US20200019515A1
Publication date: 2020-01-16
Application No.: US16582956
Filing date: 2019-09-25
Applicant: Intel Corporation
Inventor: David Koufaty, Rajesh Sankaran, Anna Trikalinou, Rupin Vakharwala
IPC: G06F12/14, G06F12/0862, G06F12/1009, G06F13/16, G06F13/42
Abstract: Embodiments are directed to providing a secure address translation service. An embodiment of a system includes DRAM for storage of data, an IOMMU coupled to the DRAM, and a host-to-device link to couple the IOMMU with one or more devices and to operate as a translation agent on behalf of one or more devices in connection with memory operations relating to the DRAM, including receiving a translated request from a discrete device via the host-to-device link specifying a memory operation and a physical address within the DRAM pertaining to the memory operation, determining page access permissions assigned to a context of the discrete device for a physical page of the DRAM within which the physical address resides, allowing the memory operation to proceed when the page access permissions permit the memory operation, and blocking the memory operation when the page access permissions do not permit the memory operation.
-
Publication No.: US10228981B2
Publication date: 2019-03-12
Application No.: US15584979
Filing date: 2017-05-02
Applicant: Intel Corporation
Inventor: Utkarsh Y. Kakaiya, Rajesh Sankaran, Sanjay Kumar, Kun Tian, Philip Lantz
Abstract: Techniques for scalable virtualization of an Input/Output (I/O) device are described. An electronic device composes a virtual device comprising one or more assignable interface (AI) instances of a plurality of AI instances of a hosting function exposed by the I/O device. The electronic device emulates device resources of the I/O device via the virtual device. The electronic device intercepts a request from the guest pertaining to the virtual device, and determines whether the request from the guest is a fast-path operation to be passed directly to one of the one or more AI instances of the I/O device or a slow-path operation that is to be at least partially serviced via software executed by the electronic device. For a slow-path operation, the electronic device services the request at least partially via the software executed by the electronic device.
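Illustrative note: the fast-path/slow-path split described in the abstract above can be pictured with a small Python sketch. Everything named here (AssignableInterface, VirtualDevice, the operation names, and classifying requests by name) is a hypothetical stand-in chosen for illustration; the abstract does not define an API or say how requests are classified.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AssignableInterface:
    """Hypothetical stand-in for one AI instance exposed by the I/O device."""
    ai_id: int
    submitted: List[str] = field(default_factory=list)

    def submit(self, payload: str) -> str:
        # Fast path: the request reaches the device interface directly.
        self.submitted.append(payload)
        return f"AI{self.ai_id} handled {payload}"


@dataclass
class VirtualDevice:
    """Hypothetical virtual device composed from one or more AI instances."""
    interfaces: List[AssignableInterface]
    # Assumption: requests are classified by operation name.
    fast_path_ops: frozenset = frozenset({"submit_work"})
    emulated_state: Dict[str, int] = field(default_factory=dict)

    def handle(self, op: str, payload: str, ai_index: int = 0) -> str:
        """Intercept a guest request and route it to the fast or slow path."""
        if op in self.fast_path_ops:
            return self.interfaces[ai_index].submit(payload)
        return self._slow_path(op)

    def _slow_path(self, op: str) -> str:
        # Slow path: serviced (at least partially) in software; here we
        # only count how often each emulated operation was requested.
        self.emulated_state[op] = self.emulated_state.get(op, 0) + 1
        return f"software emulated {op}"
```

In this sketch a guest's "submit_work" request goes straight to the chosen AI instance, while any other operation (for example a configuration write) is absorbed by the software emulation layer.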
-