-
Publication No.: US12014442B2
Publication date: 2024-06-18
Application No.: US16721450
Application date: 2019-12-19
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Rex Eldon McCrary
CPC classification number: G06T1/20 , G06F9/3838 , G06F9/4881
Abstract: A primary processing unit includes queues configured to store commands prior to execution in corresponding pipelines. The primary processing unit also includes a first table configured to store entries indicating dependencies between commands that are to be executed on different ones of a plurality of processing units that include the primary processing unit and one or more secondary processing units. The primary processing unit also includes a scheduler configured to release commands in response to resolution of the dependencies. In some cases, a first one of the secondary processing units schedules the first command for execution in response to resolution of a dependency on a second command executing in a second one of the secondary processing units. The second one of the secondary processing units notifies the primary processing unit in response to completing execution of the second command.
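The release-on-resolution behavior described in this abstract can be illustrated with a minimal Python sketch. It is a toy software model, not the patented hardware: the class name `DependencyScheduler`, its methods, and the command labels are all hypothetical, and a plain dictionary stands in for the dependency table.

```python
from collections import deque


class DependencyScheduler:
    """Toy model: hold queued commands until their dependencies resolve."""

    def __init__(self):
        self.queue = deque()    # commands waiting for release, in arrival order
        self.table = {}         # command -> set of commands it still waits on
        self.completed = set()  # commands reported as finished
        self.released = []      # commands released for execution, in order

    def submit(self, command, depends_on=()):
        # Record only the dependencies that have not completed yet.
        self.table[command] = set(depends_on) - self.completed
        self.queue.append(command)
        self._release_ready()

    def notify_complete(self, finished):
        # A processing unit reports completion; clear matching table entries.
        self.completed.add(finished)
        for pending in self.table.values():
            pending.discard(finished)
        self._release_ready()

    def _release_ready(self):
        still_waiting = deque()
        for command in self.queue:
            if self.table[command]:
                still_waiting.append(command)
            else:
                self.released.append(command)
                del self.table[command]
        self.queue = still_waiting
```

In this sketch a command submitted with `depends_on=["copy"]` stays queued until `notify_complete("copy")` is called, which loosely mirrors a secondary unit notifying the primary unit after finishing a command.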
-
Publication No.: US20240193847A1
Publication date: 2024-06-13
Application No.: US18076496
Application date: 2022-12-07
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Yusuke Tokuyoshi
Abstract: A processor shares path tracing data across sampling locations to amortize computations across space and time. The processor maps a group of sampling locations of a frame that are adjacent to each other to a reservoir. Each reservoir is associated with a ray that intersects subsets of path space such as a pixel. The processor resamples the reservoirs based on a similarity of probability density functions (PDFs) between pixels to select a set of samples mapped to the reservoir. The processor then performs resampling of the selected set of samples to obtain a representative light sample to determine a value for each pixel and renders the frame based on the values of the pixels.
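For readers unfamiliar with reservoirs, the following Python sketch shows generic weighted reservoir sampling, the standard streaming technique that reservoir-based resampling builds on. It is not the method claimed in this application: it omits the PDF-similarity test between pixels, and the `Reservoir` class and its fields are hypothetical names chosen for illustration.

```python
import random


class Reservoir:
    """Keeps one sample from a stream, chosen in proportion to its weight."""

    def __init__(self):
        self.sample = None      # currently selected candidate
        self.weight_sum = 0.0   # total weight of all candidates seen
        self.count = 0          # number of candidates seen

    def update(self, candidate, weight, rng):
        self.count += 1
        self.weight_sum += weight
        # Replace the kept sample with probability weight / weight_sum.
        if weight > 0 and rng.random() * self.weight_sum < weight:
            self.sample = candidate

    def merge(self, other, rng):
        # Treat the other reservoir's kept sample as a single candidate
        # that carries the other reservoir's total weight.
        total = self.count + other.count
        if other.sample is not None:
            self.update(other.sample, other.weight_sum, rng)
        self.count = total
```

Merging lets neighboring sampling locations share candidates without revisiting each one, which is the sense in which work is amortized across space and time.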
-
Publication No.: US20240193844A1
Publication date: 2024-06-13
Application No.: US18077424
Application date: 2022-12-08
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Mark Fowler , Samuel Naffziger , Michael Mantor , Mark Leather
CPC classification number: G06T15/005 , G06F9/3802
Abstract: A graphics processing unit (GPU) of a processing system is partitioned into multiple dies (referred to as GPU chiplets) that are configurable to collectively function and interface with an application as a single GPU in a first mode and as multiple GPUs in a second mode. By dividing the GPU into multiple GPU chiplets, the processing system flexibly and cost-effectively configures an amount of active GPU physical resources based on an operating mode. In addition, a configurable number of GPU chiplets are assembled into a single GPU, such that multiple different GPUs having different numbers of GPU chiplets can be assembled using a small number of tape-outs and a multiple-die GPU can be constructed out of GPU chiplets that implement varying generations of technology.
-
Publication No.: US20240193016A1
Publication date: 2024-06-13
Application No.: US18064170
Application date: 2022-12-09
Applicant: Advanced Micro Devices, Inc. , ATI Technologies ULC
Inventor: AnZhong Huang , Zhengsan Jian , Yinan Jiang
CPC classification number: G06F9/544 , G06F12/023
Abstract: An apparatus and method for efficiently executing multiple processes by reducing an amount of memory usage of the processes. In various implementations, a computing system includes a first processor and a second processor that support parallel data applications stored on a remote server that provides cloud computing services to multiple users. The first processor creates multiple processes, referred to as “instances” in parallel computing platforms, for a particular application as users request to execute the application. When the first processor detects a function call of the application within a particular instance, the first processor searches for shareable data objects to be used by the second processor when executing the first instance of the function call, and frees data storage allocated to data objects that are already shared by one or more instances. Therefore, an amount of memory allocated for the multiple instances of the application is reduced.
-
Publication No.: US12008378B2
Publication date: 2024-06-11
Application No.: US18132879
Application date: 2023-04-10
Applicant: Advanced Micro Devices, Inc.
Inventor: Varun Agrawal , Yasuko Eckert
IPC: G06F9/38 , G06F9/30 , G06F12/0815 , G06F13/16
CPC classification number: G06F9/3895 , G06F9/30036 , G06F9/30105 , G06F12/0815 , G06F13/1668
Abstract: A parallel processing (PP) level coherence directory, also referred to as a Processing In-Memory Probe Filter (PimPF), is added to a coherence directory controller. When the coherence directory controller receives a broadcast PIM command from a host, or a PIM command that is directed to multiple memory banks in parallel, the PimPF accelerates processing of the PIM command by maintaining a directory for cache coherence that is separate from existing system level directories in the coherence directory controller. The PimPF maintains a directory according to address signatures that define the memory addresses affected by a broadcast PIM command. Two implementations are described: a lightweight implementation that accelerates PIM loads into registers, and a heavyweight implementation that accelerates both PIM loads into registers and PIM stores into memory.
-
Publication No.: US11996848B1
Publication date: 2024-05-28
Application No.: US17979622
Application date: 2022-11-02
Applicant: Advanced Micro Devices, Inc.
Inventor: Aaron D. Willey , Karthik Gopalakrishnan
Abstract: The disclosed computer-implemented method includes providing, by a reference clock circuit, a clock signal for a clock-triggered element triggered by the clock signal and modulating, by a frequency modulation circuit, a frequency of the clock signal. The method also includes inserting, by a phase compensation circuit, a phase compensation offset to the modulated clock signal in a manner that compensates for a phase error produced by modulating the frequency of the clock signal. Various other methods, systems, and computer-readable media are also disclosed.
-
Publication No.: US11996166B2
Publication date: 2024-05-28
Application No.: US16556139
Application date: 2019-08-29
Applicant: Advanced Micro Devices, Inc.
Inventor: Fataneh Ghodrat , Tien E. Wei
IPC: G06F12/00 , G06F9/38 , G06F12/02 , G11C8/12 , G11C11/4074
CPC classification number: G11C8/12 , G06F9/3804 , G06F12/0246 , G11C11/4074 , G06F2212/1028
Abstract: A technique for processing computer instructions is provided. The technique includes obtaining information for an instruction state memory entry for an instruction; identifying, for the instruction state memory entry, a slot in an instruction state memory having selectably powered rows and blocks, based on clustering criteria; and placing the instruction state memory entry into the identified slot.
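The slot-selection step can be sketched in Python under one loudly stated assumption: the abstract does not spell out the clustering criteria, so this example assumes the criterion is "prefer rows that already hold entries", which keeps unused rows free to stay powered down. The function `place_entry` and the list-of-lists layout are hypothetical.

```python
def place_entry(rows, entry):
    """Place `entry` in the first free slot, preferring rows already in use.

    `rows` is a list of lists; None marks a free slot. A row counts as
    powered when at least one of its slots is occupied (an assumption).
    Returns (row, column) of the chosen slot, or None if memory is full.
    """
    powered = [r for r in range(len(rows))
               if any(slot is not None for slot in rows[r])]
    unpowered = [r for r in range(len(rows)) if r not in powered]
    # Cluster new entries into powered rows before waking an unused row.
    for r in powered + unpowered:
        for c, slot in enumerate(rows[r]):
            if slot is None:
                rows[r][c] = entry
                return r, c
    return None
```

With `rows = [[None, None], ["a", None]]`, the first placement lands in row 1 beside the existing entry, and row 0 is touched only once row 1 is full.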
-
Publication No.: US11995351B2
Publication date: 2024-05-28
Application No.: US17515976
Application date: 2021-11-01
Applicant: ADVANCED MICRO DEVICES, INC. , ATI TECHNOLOGIES ULC
Inventor: Joseph L Greathouse , Sean Keely , Alan D. Smith , Anthony Asaro , Ling-Ling Wang , Milind N Nemlekar , Hari Thangirala , Felix Kuehling
CPC classification number: G06F3/0659 , G06F3/061 , G06F3/0679 , G06F13/28
Abstract: A method for hardware management of DMA transfer commands includes accessing, by a first DMA engine, a DMA transfer command and determining a first portion of a data transfer requested by the DMA transfer command. Transfer of a first portion of the data transfer by the first DMA engine is initiated based at least in part on the DMA transfer command. Similarly, a second portion of the data transfer by a second DMA engine is initiated based at least in part on the DMA transfer command. After transferring the first portion and the second portion of the data transfer, an indication is generated that signals completion of the data transfer requested by the DMA transfer command.
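The split-and-complete flow in this abstract can be modeled with a short Python sketch. It runs the "engines" sequentially in software and assumes an even split of the transfer; the real partitioning policy is not stated in the abstract, and the function names `split_transfer` and `run_transfer` are hypothetical.

```python
def split_transfer(length, engines=2):
    """Divide a transfer of `length` bytes into one (offset, size) per engine."""
    base, extra = divmod(length, engines)
    portions, offset = [], 0
    for i in range(engines):
        size = base + (1 if i < extra else 0)
        portions.append((offset, size))
        offset += size
    return portions


def run_transfer(src, dst, engines=2):
    """Copy src into dst portion by portion; return True once all are done."""
    pending = engines
    for offset, size in split_transfer(len(src), engines):
        # Each loop iteration stands in for one DMA engine moving its portion.
        dst[offset:offset + size] = src[offset:offset + size]
        pending -= 1
    # The completion indication is raised only after every portion finishes.
    return pending == 0
```

For example, `split_transfer(10, 2)` yields two five-byte portions, and `run_transfer` reports completion only after both have been copied into a destination `bytearray` of the same length as the source.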
-
Publication No.: US11995008B2
Publication date: 2024-05-28
Application No.: US17354806
Application date: 2021-06-22
Applicant: Advanced Micro Devices, Inc.
Inventor: Guanhao Shen , Ravindra Nath Bhargava , James R. Magro , Kedarnath Balakrishnan
IPC: G06F13/16 , G11C11/4063
CPC classification number: G06F13/1642 , G11C11/4063
Abstract: A memory controller includes a command queue having an input for receiving memory access commands for a memory channel, and a number of entries for holding a predetermined number of memory access commands, and an arbiter that selects memory commands from the command queue for dispatch to one of a persistent memory and a DRAM memory coupled to the memory channel. The arbiter includes a first-tier sub-arbiter circuit coupled to the command queue for selecting candidate commands from among DRAM commands and persistent memory commands, and a second-tier sub-arbiter circuit coupled to the first-tier sub-arbiter circuit for receiving the candidate commands and selecting at least one command from among the candidate commands.
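The two-tier selection can be sketched in Python as follows. The abstract does not say how either tier ranks commands, so this example assumes an oldest-first policy at both tiers; the `Command` tuple, its fields, and the function names are hypothetical.

```python
from collections import namedtuple

# age: cycles spent in the queue; kind: "dram" or "pm" (persistent memory).
Command = namedtuple("Command", ["age", "kind", "addr"])


def first_tier(queue):
    """Pick one candidate per memory type (assumed policy: the oldest)."""
    candidates = {}
    for cmd in queue:
        best = candidates.get(cmd.kind)
        if best is None or cmd.age > best.age:
            candidates[cmd.kind] = cmd
    return list(candidates.values())


def second_tier(candidates):
    """Choose the command to dispatch from the first-tier candidates."""
    return max(candidates, key=lambda c: c.age, default=None)
```

Separating the tiers this way lets the first tier apply type-specific rules to DRAM and persistent-memory commands while the second tier makes the final choice among a much smaller set.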
-
Publication No.: US20240168513A1
Publication date: 2024-05-23
Application No.: US17990566
Application date: 2022-11-18
Applicant: Advanced Micro Devices, Inc.
Inventor: Nehal Patel
IPC: G06F1/10
CPC classification number: G06F1/10
Abstract: A disclosed technique includes clock gating a plurality of data elements of a first clock domain of a scan dump network; outputting data from a plurality of data elements of a second clock domain of the scan dump network; clock gating the plurality of data elements of the second clock domain; and outputting data from the plurality of data elements of the first clock domain.
-