Patent search ap:("Advanced Micro Devices Page Inc.") AND inv:"Alexander Lyashevsky"

1.

Patent application
LOW POWER AND LOW LATENCY GPU COPROCESSOR FOR PERSISTENT COMPUTING (in force)

Publication number: US20210201439A1

Publication date: 2021-07-01

Application number: US17181300

Application date: 2021-02-22

Applicant: Advanced Micro Devices, Inc.

Inventor: Jiasheng Chen, Timour Paltashev, Alexander Lyashevsky, Carl Kittredge Wakeland, Michael J. Mantor

IPC: G06T1/20, G06F9/54, G06F9/38, G06T1/60

Abstract: Systems, apparatuses, and methods for implementing a graphics processing unit (GPU) coprocessor are disclosed. The GPU coprocessor includes a SIMD unit with the ability to self-schedule sub-wave procedures based on input data flow events. A host processor sends messages targeting the GPU coprocessor to a queue. In response to detecting a first message in the queue, the GPU coprocessor schedules a first sub-task for execution. The GPU coprocessor includes an inter-lane crossbar and intra-lane biased indexing mechanism for a vector general purpose register (VGPR) file. The VGPR file is split into two files. The first VGPR file is a larger register file with one read port and one write port. The second VGPR file is a smaller register file with multiple read ports and one write port. The second VGPR introduces the ability to co-issue more than one instruction per clock cycle.
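
The message-driven scheduling in this abstract can be illustrated with a small host-side analogy. The sketch below is not the patented hardware design: it only models a persistent worker that waits on a queue and dispatches a sub-task when a message arrives. The names `run_coprocessor`, `HANDLERS`, `demo`, the message types, and the `None` shutdown sentinel are all assumptions made for illustration.

```python
import queue
import threading


def run_coprocessor(messages, handlers):
    """Persistent loop: block on the queue, run the sub-task named by each message."""
    results = []
    while True:
        msg = messages.get()      # blocks until the host enqueues a message
        if msg is None:           # hypothetical shutdown sentinel, not from the abstract
            break
        kind, payload = msg
        results.append(handlers[kind](payload))
    return results


# Hypothetical sub-tasks keyed by message type.
HANDLERS = {
    "scale": lambda xs: [2 * x for x in xs],
    "sum": lambda xs: sum(xs),
}


def demo():
    messages = queue.Queue()
    out = []
    worker = threading.Thread(
        target=lambda: out.extend(run_coprocessor(messages, HANDLERS))
    )
    worker.start()                        # the worker stays resident between messages
    messages.put(("scale", [1, 2, 3]))    # the host sends messages to the queue
    messages.put(("sum", [1, 2, 3]))
    messages.put(None)
    worker.join()
    return out
```

Calling `demo()` returns the sub-task results in the order the messages were enqueued, because a single worker drains the queue in FIFO order.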

2.

Patent application
Software Only Inter-Compute Unit Redundant Multithreading for GPUs (in force)

Publication number: US20140373028A1

Publication date: 2014-12-18

Application number: US13920524

Application date: 2013-06-18

Applicant: Advanced Micro Devices, Inc.

Inventor: Alexander Lyashevsky, Sudhanva Gurumurthi, Vilas Sridharan

IPC: G06F9/52

CPC classification number: G06F11/00, G06F9/3851, G06F9/52, G06F11/1608, G06F11/1633, G06F11/1637, G06F11/1654, G06F11/1687, G06F11/202, G06F11/2038, G06F11/2041, G06F11/2048, G06F2201/825, G06F2201/83

Abstract: A system, method and computer program product to execute a first and a second work-group, and compare the signature variables of the first work-group to the signature variables of the second work-group via a synchronization mechanism. The first and the second work-group are mapped to an identifier via software. This mapping ensures that the first and second work-groups execute exactly the same data for exactly the same code without changes to the underlying hardware. By executing the first and second work-groups independently, the underlying computation of the first and second work-groups can be verified. Moreover, system performance is not substantially affected because the execution results of the first and second work-groups are compared only at specified comparison points.
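
The mapping-and-compare idea in this abstract can be sketched in a few lines. This is a sequential CPU analogy under stated assumptions, not the patent's implementation: pairing consecutive physical work-groups via `physical_id // 2`, the sum-of-squares "signature", and the `faulty_physical_id` fault injection are all hypothetical choices made for illustration.

```python
def logical_group_id(physical_id):
    """Map two consecutive physical work-groups to one logical identifier (assumed scheme)."""
    return physical_id // 2


def run_work_group(logical_id, data, fault=None):
    """Compute a signature for the chunk owned by this logical work-group."""
    signature = sum(x * x for x in data[logical_id])
    if fault is not None:
        signature ^= fault        # simulate a bit flip in one redundant copy
    return signature


def redundant_execute(data, faulty_physical_id=None):
    """Run every logical work-group twice and return the ids whose signatures disagree."""
    signatures = {}
    for physical_id in range(2 * len(data)):
        lid = logical_group_id(physical_id)
        fault = 1 if physical_id == faulty_physical_id else None
        signatures.setdefault(lid, []).append(run_work_group(lid, data, fault))
    # Comparison point: results are checked only here, not after every operation.
    return [lid for lid, (a, b) in sorted(signatures.items()) if a != b]
```

With `data = [[1, 2], [3, 4]]`, a fault-free run reports no mismatches, while injecting a fault into physical work-group 3 flags logical work-group 1.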


3.

Granted patent
Per-block sort for performance enhancement of parallel processors (in force)

Publication number: US09740511B2

Publication date: 2017-08-22

Application number: US14730499

Application date: 2015-06-04

Applicant: Advanced Micro Devices, Inc.

Inventor: Alexander Lyashevsky

IPC: G06F9/50, G06F9/445, G06F11/34, G06F11/30

CPC classification number: G06F9/44505, G06F9/50, G06F11/3024, G06F11/3404, G06F11/3409, G06F11/3419, G06F11/3452, Y02D10/43

Abstract: A method of enhancing performance of an application executing in a parallel processor and a system for executing the method are disclosed. A block size for input to the application is determined. Input is partitioned into blocks having the block size. Input within each block is sorted. The application is executed with the sorted input.
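
The partition-then-sort steps in this abstract can be sketched as follows. This is a minimal illustration, assuming the block size has already been chosen; how the patent determines the block size is not stated in the abstract. The function name `per_block_sort` and its `key` parameter are hypothetical.

```python
def per_block_sort(values, block_size, key=None):
    """Partition `values` into blocks of `block_size` and sort within each block only."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    out = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]   # the last block may be shorter
        out.extend(sorted(block, key=key))
    return out
```

For example, `per_block_sort([5, 1, 4, 2, 9, 0, 3], 3)` sorts the blocks `[5, 1, 4]`, `[2, 9, 0]`, and `[3]` independently. Elements never cross block boundaries, so the result is ordered within each block but not globally.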

4.

Patent application
PER-BLOCK SORT FOR PERFORMANCE ENHANCEMENT OF PARALLEL PROCESSORS (in force)

Publication number: US20160357580A1

Publication date: 2016-12-08

Application number: US14730499

Application date: 2015-06-04

Applicant: Advanced Micro Devices, Inc.

Inventor: Alexander Lyashevsky

IPC: G06F9/445, G06F11/30, G06F11/34

CPC classification number: G06F9/44505, G06F9/50, G06F11/3024, G06F11/3404, G06F11/3409, G06F11/3419, G06F11/3452, Y02D10/43

Abstract: A method of enhancing performance of an application executing in a parallel processor and a system for executing the method are disclosed. A block size for input to the application is determined. Input is partitioned into blocks having the block size. Input within each block is sorted. The application is executed with the sorted input.


5.

Patent application
SYSTEM AND METHOD FOR PROVIDING LOW LATENCY TO APPLICATIONS USING HETEROGENEOUS PROCESSORS (in force)

Publication number: US20130328891A1

Publication date: 2013-12-12

Application number: US13912438

Application date: 2013-06-07

Applicant: ADVANCED MICRO DEVICES, INC.

Inventor: Alexander Lyashevsky

IPC: G06T1/00

CPC classification number: G06T1/00, G06F9/5033, G06F17/30519, G06T15/005, G09G5/363

Abstract: Methods, apparatuses, and computer readable media are disclosed for responding to requests. A method of responding to requests may include receiving requests comprising callback functions. The one or more requests may be received in a first memory associated with processors of a first type, which may be CPUs. The requests may be moved to a second memory. The second memory may be associated with processors of a second type, which may be GPUs. GPU threads may process the requests to determine a result for the requests, when a number of the requests is at least a threshold number. The method may include moving the results to the first memory. The method may include the CPUs executing the one or more callback functions with the corresponding result. A GPU persistent thread may check the number of requests to determine when a threshold number of requests is reached.
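
The request flow in this abstract (collect requests with callbacks, process them in a batch once a threshold is reached, then invoke each callback with its result) can be sketched on the CPU alone. The class `RequestBatcher`, its methods, and the `process_batch` stand-in for the GPU step are assumptions made for illustration; real memory transfers and GPU threads are not modeled.

```python
class RequestBatcher:
    """Collect (payload, callback) requests and process them in threshold-sized batches."""

    def __init__(self, threshold, process_batch):
        self.threshold = threshold
        self.process_batch = process_batch   # stand-in for the GPU processing step
        self.pending = []                    # plays the role of the "first memory"

    def submit(self, payload, callback):
        self.pending.append((payload, callback))

    def poll(self):
        """One iteration of the persistent check; returns how many requests were handled."""
        if len(self.pending) < self.threshold:
            return 0                                  # not enough requests yet
        batch, self.pending = self.pending, []
        payloads = [p for p, _ in batch]              # "move" requests to the second memory
        results = self.process_batch(payloads)        # batch processing
        for (_, callback), result in zip(batch, results):
            callback(result)                          # run each callback with its result
        return len(batch)
```

A caller would invoke `poll()` repeatedly, which mirrors the persistent thread that checks the request count. Requests submitted below the threshold simply wait for a later poll.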


6.

Granted patent
Low power and low latency GPU coprocessor for persistent computing (in force)

Publication number: US10929944B2

Publication date: 2021-02-23

Application number: US15360057

Application date: 2016-11-23

Applicant: Advanced Micro Devices, Inc.

Inventor: Jiasheng Chen, Timour Paltashev, Alexander Lyashevsky, Carl Kittredge Wakeland, Michael J. Mantor

IPC: G06T1/20, G06F9/54, G06F9/38, G06T1/60

Abstract: Systems, apparatuses, and methods for implementing a graphics processing unit (GPU) coprocessor are disclosed. The GPU coprocessor includes a SIMD unit with the ability to self-schedule sub-wave procedures based on input data flow events. A host processor sends messages targeting the GPU coprocessor to a queue. In response to detecting a first message in the queue, the GPU coprocessor schedules a first sub-task for execution. The GPU coprocessor includes an inter-lane crossbar and intra-lane biased indexing mechanism for a vector general purpose register (VGPR) file. The VGPR file is split into two files. The first VGPR file is a larger register file with one read port and one write port. The second VGPR file is a smaller register file with multiple read ports and one write port. The second VGPR introduces the ability to co-issue more than one instruction per clock cycle.

7.

Patent application
LOW POWER AND LOW LATENCY GPU COPROCESSOR FOR PERSISTENT COMPUTING (under examination, published)

Publication number: US20180144435A1

Publication date: 2018-05-24

Application number: US15360057

Application date: 2016-11-23

Applicant: Advanced Micro Devices, Inc.

Inventor: Jiasheng Chen, Timour Paltashev, Alexander Lyashevsky, Carl Kittredge Wakeland, Michael J. Mantor

IPC: G06T1/20, G06T1/60

CPC classification number: G06T1/20, G06F9/3887, G06F9/542, G06F2009/3883, G06F2209/548, G06T1/60

Abstract: Systems, apparatuses, and methods for implementing a graphics processing unit (GPU) coprocessor are disclosed. The GPU coprocessor includes a SIMD unit with the ability to self-schedule sub-wave procedures based on input data flow events. A host processor sends messages targeting the GPU coprocessor to a queue. In response to detecting a first message in the queue, the GPU coprocessor schedules a first sub-task for execution. The GPU coprocessor includes an inter-lane crossbar and intra-lane biased indexing mechanism for a vector general purpose register (VGPR) file. The VGPR file is split into two files. The first VGPR file is a larger register file with one read port and one write port. The second VGPR file is a smaller register file with multiple read ports and one write port. The second VGPR introduces the ability to co-issue more than one instruction per clock cycle.

8.

Granted patent
System and method for providing low latency to applications using heterogeneous processors (in force)

Publication number: US09495718B2

Publication date: 2016-11-15

Application number: US13912438

Application date: 2013-06-07

Applicant: ADVANCED MICRO DEVICES, INC.

Inventor: Alexander Lyashevsky

IPC: G09G5/36, G06T1/20, G06T15/00, G06T1/00, G06F9/50, G06F17/30

CPC classification number: G06T1/00, G06F9/5033, G06F17/30519, G06T15/005, G09G5/363

Abstract: Methods, apparatuses, and computer readable media are disclosed for responding to requests. A method of responding to requests may include receiving requests comprising callback functions. The one or more requests may be received in a first memory associated with processors of a first type, which may be CPUs. The requests may be moved to a second memory. The second memory may be associated with processors of a second type, which may be GPUs. GPU threads may process the requests to determine a result for the requests, when a number of the requests is at least a threshold number. The method may include moving the results to the first memory. The method may include the CPUs executing the one or more callback functions with the corresponding result. A GPU persistent thread may check the number of requests to determine when a threshold number of requests is reached.


9.

Patent application
Software Only Intra-Compute Unit Redundant Multithreading for GPUs (in force)

Publication number: US20140368513A1

Publication date: 2014-12-18

Application number: US13920574

Application date: 2013-06-18

Applicant: Advanced Micro Devices, Inc.

Inventor: Alexander Lyashevsky, Sudhanva Gurumurthi, Vilas Sridharan

IPC: G06T1/20

CPC classification number: G06F11/00, G06F9/3851, G06F9/52, G06F11/1608, G06F11/1633, G06F11/1637, G06F11/1654, G06F11/1687, G06F11/202, G06F11/2038, G06F11/2041, G06F11/2048, G06F2201/825, G06F2201/83

Abstract: A system, method and computer program product to execute a first and a second work-item, and compare the signature variable of the first work-item to the signature variable of the second work-item. The first and the second work-items are mapped to an identifier via software. This mapping ensures that the first and second work-items execute exactly the same data for exactly the same code without changes to the underlying hardware. By executing the first and second work-items independently, the underlying computation of the first and second work-item can be verified. Moreover, system performance is not substantially affected because the execution results of the first and second work-items are compared only at specified comparison points.


10.

Granted patent
Software only inter-compute unit redundant multithreading for GPUs (in force)

Publication number: US09274904B2

Publication date: 2016-03-01

Application number: US13920524

Application date: 2013-06-18

Applicant: Advanced Micro Devices, Inc.

Inventor: Alexander Lyashevsky, Sudhanva Gurumurthi, Vilas Sridharan

IPC: G06F9/46, G06F11/00, G06F11/20, G06F9/52, G06F11/16

CPC classification number: G06F11/00, G06F9/3851, G06F9/52, G06F11/1608, G06F11/1633, G06F11/1637, G06F11/1654, G06F11/1687, G06F11/202, G06F11/2038, G06F11/2041, G06F11/2048, G06F2201/825, G06F2201/83

Abstract: A system, method and computer program product to execute a first and a second work-group, and compare the signature variables of the first work-group to the signature variables of the second work-group via a synchronization mechanism. The first and the second work-group are mapped to an identifier via software. This mapping ensures that the first and second work-groups execute exactly the same data for exactly the same code without changes to the underlying hardware. By executing the first and second work-groups independently, the underlying computation of the first and second work-groups can be verified. Moreover, system performance is not substantially affected because the execution results of the first and second work-groups are compared only at specified comparison points.

