Patent search ap:("Intel Corporation") AND inv:"James Valerio" Page 4

31.

Invention application

MAINTAINING HIGH TEMPORAL CACHE LOCALITY BETWEEN INDEPENDENT THREADS HAVING THE SAME ACCESS PATTERN

Status: Under examination (published)

Publication (announcement) number: US20190324757A1

Publication (announcement) date: 2019-10-24

Application number: US15957695

Application date: 2018-04-19

Applicant: Intel Corporation

Inventor: James Valerio, Ben Ashbaugh, Pradeep Ramani, Rebecca David, Sabareesh Ganapathy, Hashem Hashemi

IPC: G06F9/38, G06F9/30

Abstract: Embodiments described herein provide techniques to maintain high temporal cache locality between independent threads having the same or similar memory access pattern. One embodiment provides a graphics processing unit comprising an instruction execution pipeline including hardware execution logic and a thread dispatcher to process a set of commands for execution and distribute multiple groups of hardware threads to the hardware execution logic to execute the set of commands. The thread dispatcher can be configured to concurrently distribute a first group of the multiple groups of hardware threads to the hardware execution logic and withhold distribution of additional hardware threads for the set of commands until after the first group completes execution.
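As a rough illustration of the dispatch policy this abstract describes, the Python sketch below models a dispatcher that releases one group of threads and withholds the remaining threads until that group has finished. The names `dispatch_in_groups`, `group_size`, and `run_group` are hypothetical and do not come from the patent. The blocking callback is a software stand-in for hardware that would execute the threads of a group concurrently.

```python
from typing import Callable, List, Sequence


def dispatch_in_groups(
    threads: Sequence[int],
    group_size: int,
    run_group: Callable[[List[int]], None],
) -> List[List[int]]:
    """Dispatch thread IDs one group at a time.

    Assumption: run_group returns only after every thread in the group
    has completed, so the next group is withheld until then.
    """
    if group_size <= 0:
        raise ValueError("group_size must be positive")

    dispatched: List[List[int]] = []
    for start in range(0, len(threads), group_size):
        group = list(threads[start:start + group_size])
        run_group(group)          # blocks until the whole group is done
        dispatched.append(group)  # only now may the next group be released
    return dispatched
```

Calling `dispatch_in_groups(list(range(5)), 2, print)` would hand the threads over as the groups `[0, 1]`, `[2, 3]`, and `[4]`, in that order. Threads that run close together in time and share an access pattern are more likely to find each other's data still in cache, which is the locality effect the abstract targets.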

32.

Invention application

MICROCONTROLLER-BASED FLEXIBLE THREAD SCHEDULING LAUNCHING IN COMPUTING ENVIRONMENTS

Status: Under examination (published)

Publication (announcement) number: US20190205163A1

Publication (announcement) date: 2019-07-04

Application number: US15860708

Application date: 2018-01-03

Applicant: Intel Corporation

Inventor: Kiran C. Veernapu, Kamlesh Pillai, James Valerio, Joydeep Ray, Abhishek Appu

IPC: G06F9/48, G06F9/54, G06F9/22, G06T1/20

CPC classification number: G06F9/4881, G06F9/22, G06F9/54, G06T1/20, G06T15/005

Abstract: A mechanism is described to facilitate microcontroller-based flexible thread scheduling launching in computing environments. An apparatus of embodiments, as described herein, includes facilitating a graphics processor hosting a microcontroller having a thread scheduling unit, and detection and observation logic to detect a scheduling algorithm associated with an application at the apparatus. The apparatus may further include reading and dispatching logic to facilitate the microcontroller to prepare a flexible dispatch routine based on the scheduling algorithm. The apparatus may further include scheduling and launching logic to facilitate the thread scheduling unit to dynamically schedule and launch threads based on the flexible dispatch routine, where the threads are hosted by the graphics processor.

33.

Invention application

MULTIPLE REGISTER ALLOCATION SIZES FOR GPU HARDWARE THREADS

Status: In force

Publication (announcement) number: US20250147762A1

Publication (announcement) date: 2025-05-08

Application number: US18504407

Application date: 2023-11-08

Applicant: Intel Corporation

Inventor: Vasanth Ranganathan, Gang Chen, Supratim Pal, Jorge Eduardo Parra Osorio, Arthur Hunter, Boris Kuznetsov, Deepak N K, Siva Kumar Seemakurthi, James Valerio, Shubham Dinesh Chavan, Abhishek Kumar Singh, Samir Pandya, Sandeep Tippannanavar Niranjan, Alan Curtis, Jain Philip, Maltesh Kulkarni, Fangwen Fu, John Wiegert, Brent Schwartz

IPC: G06F9/30, G06T15/00

Abstract: Described herein is a graphics processor having processing resources with configurable thread and register configurations. Program code can configure a number of registers and accumulators that will be used by hardware threads during execution of the program code by the graphics processor. Processing resources within the graphics processor can be configured to assign different numbers of registers and accumulators to hardware threads based on the configuration requested by program code to be executed by the processing resource.

34.

Invention application

INSTRUCTION ENCODING TO IMPLEMENT INCREASED REGISTER CAPACITY PER THREAD

Status: In force

Publication (announcement) number: US20250068423A1

Publication (announcement) date: 2025-02-27

Application number: US18453861

Application date: 2023-08-22

Applicant: Intel Corporation

Inventor: Jorge Eduardo Parra Osorio, Jiasheng Chen, Supratim Pal, Vasanth Ranganathan, Guei-Yuan Lueh, James Valerio, Pradeep Golconda, Brent Schwartz, Fangwen Fu, Sabareesh Ganapathy, Peter Caday, Wei-Yu Chen, Po-Yu Chen, Timothy Bauer, Maxim Kazakov, Stanley Gambarin, Samir Pandya

IPC: G06F9/30, G06F9/38

Abstract: Described herein is a graphics processor comprising first circuitry configured to execute a decoded instruction and second circuitry configured to decode an instruction into the decoded instruction. The second circuitry is configured to determine a number of registers within a register file that are available to a thread of the processing resource and decode the instruction based on that number of registers.

35.

Invention application

32-BIT CHANNEL-ALIGNED INTEGER MULTIPLICATION VIA MULTIPLE MULTIPLIERS PER-CHANNEL

Status: In force

Publication (announcement) number: US20250037347A1

Publication (announcement) date: 2025-01-30

Application number: US18358297

Application date: 2023-07-25

Applicant: Intel Corporation

Inventor: Jiasheng Chen, Supratim Pal, Kevin Hurd, Jorge E. Parra Osorio, Christopher Spencer, Takashi Nakagawa, Guei-Yuan Lueh, Pradeep K. Golconda, James Valerio, Mukundan Swaminathan, Nicholas Murphy, Clifford Gibson, Li-An Tang, Fangwen Fu, Kaiyu Chen, Buqi Cheng

IPC: G06T15/00, G06F9/30

Abstract: Described herein is a graphics processor comprising an instruction cache and a plurality of processing elements coupled with the instruction cache. The plurality of processing elements include functional units configured to provide an integer pipeline to execute instructions to perform operations on integer data elements. The integer pipeline includes a first multiplier and a second multiplier, the first multiplier and the second multiplier configured to execute operations for a single instruction.

36.

Granted invention patent

Barrier state save and restore for preemption in a graphics environment

Status: In force

Publication (announcement) number: US12164952B2

Publication (announcement) date: 2024-12-10

Application number: US17358882

Application date: 2021-06-25

Applicant: Intel Corporation

Inventor: Vasanth Ranganathan, James Valerio, Joydeep Ray, Abhishek R. Appu, Alan Curtis, Prathamesh Raghunath Shinde, Brandon Fliflet, Ben J. Ashbaugh, John Wiegert

IPC: G06F9/48, G06F9/38, G06T1/20

Abstract: An apparatus to facilitate barrier state save and restore for preemption in a graphics environment is disclosed. The apparatus includes processing resources to execute a plurality of execution threads that are comprised in a thread group (TG) and mid-thread preemption barrier save and restore hardware circuitry to: initiate an exception handling routine in response to a mid-thread preemption event, the exception handling routine to cause a barrier signaling event to be issued; receive indication of a valid designated thread status for a thread of a thread group (TG) in response to the barrier signaling event; and in response to receiving the indication of the valid designated thread status for the thread of the TG, cause, by the thread of the TG having the valid designated thread status, a barrier save routine and a barrier restore routine to be initiated for named barriers of the TG.

37.

Granted invention patent

Multi-tile memory management

Status: In force

Publication (announcement) number: US12099461B2

Publication (announcement) date: 2024-09-24

Application number: US17431034

Application date: 2020-03-14

Applicant: Intel Corporation

Inventor: Abhishek R. Appu, Altug Koker, Aravindh Anantaraman, Elmoustapha Ould-Ahmed-Vall, Valentin Andrei, Nicolas Galoppo Von Borries, Varghese George, Mike Macpherson, Subramaniam Maiyuran, Joydeep Ray, Lakshminarayanan Striramassarma, Scott Janus, Brent Insko, Vasanth Ranganathan, Kamal Sinha, Arthur Hunter, Prasoonkumar Surti, David Puffer, James Valerio, Ankur N. Shah

IPC: G06F16/00, G06F7/544, G06F7/575, G06F7/58, G06F9/30, G06F9/38, G06F9/50, G06F12/02, G06F12/06, G06F12/0802, G06F12/0804, G06F12/0811, G06F12/0862, G06F12/0866, G06F12/0871, G06F12/0875, G06F12/0882, G06F12/0888, G06F12/0891, G06F12/0893, G06F12/0895, G06F12/0897, G06F12/1009, G06F12/128, G06F15/78, G06F15/80, G06F17/16, G06F17/18, G06T1/20, G06T1/60, H03M7/46, G06N3/08, G06T15/06

CPC classification number: G06F15/7839, G06F7/5443, G06F7/575, G06F7/588, G06F9/3001, G06F9/30014, G06F9/30036, G06F9/3004, G06F9/30043, G06F9/30047, G06F9/30065, G06F9/30079, G06F9/3887, G06F9/5011, G06F9/5077, G06F12/0215, G06F12/0238, G06F12/0246, G06F12/0607, G06F12/0802, G06F12/0804, G06F12/0811, G06F12/0862, G06F12/0866, G06F12/0871, G06F12/0875, G06F12/0882, G06F12/0888, G06F12/0891, G06F12/0893, G06F12/0895, G06F12/0897, G06F12/1009, G06F12/128, G06F15/8046, G06F17/16, G06F17/18, G06T1/20, G06T1/60, H03M7/46, G06F9/3802, G06F9/3818, G06F9/3867, G06F2212/1008, G06F2212/1021, G06F2212/1044, G06F2212/302, G06F2212/401, G06F2212/455, G06F2212/60, G06N3/08, G06T15/06

Abstract: Methods and apparatus relating to techniques for multi-tile memory management are described. In an example, an apparatus comprises a cache memory, a high-bandwidth memory, a shader core communicatively coupled to the cache memory and comprising a processing element to decompress a first data element extracted from an in-memory database in the cache memory and having a first bit length to generate a second data element having a second bit length, greater than the first bit length, and an arithmetic logic unit (ALU) to compare the data element to a target value provided in a query of the in-memory database. Other embodiments are also disclosed and claimed.

38.

Invention publication

DATA MULTICAST IN COMPUTE CORE CLUSTERS

Status: Under examination (published)

Publication (announcement) number: US20240220254A1

Publication (announcement) date: 2024-07-04

Application number: US18148997

Application date: 2022-12-30

Applicant: Intel Corporation

Inventor: Chunhui Mei, Yongsheng Liu, John A. Wiegert, Vasanth Ranganathan, Ben J. Ashbaugh, Fangwen Fu, Hong Jiang, Guei-Yuan Lueh, James Valerio, Alan M. Curtis, Maxim Kazakov

IPC: G06F9/30, G06F9/38, G06F9/50, G06F9/54

CPC classification number: G06F9/30087, G06F9/3877, G06F9/5072, G06F9/544

Abstract: Data multicast in compute core clusters is described. An example of an apparatus includes one or more processors including at least a first processor, the first processor including one or more clusters of cores and a memory, wherein each cluster of cores includes multiple cores, each core including one or more processing resources, shared memory, and broadcast circuitry; and wherein a first core in a first cluster of cores is to request a data element, determine whether any additional cores in the first cluster require the data element, and, upon determining that one or more additional cores in the first cluster require the data element, broadcast the data element to the one or more additional cores via interconnects between the broadcast circuitry of the cores of the first core cluster.

39.

Invention publication

FORWARD PROGRESS GUARANTEE USING SINGLE-LEVEL SYNCHRONIZATION AT INDIVIDUAL THREAD GRANULARITY

Status: Under examination (published)

Publication (announcement) number: US20230153176A1

Publication (announcement) date: 2023-05-18

Application number: US17528386

Application date: 2021-11-17

Applicant: Intel Corporation

Inventor: Chunhui Mei, James Valerio, Supratim Pal, Guei-Yuan Lueh, Hong Jiang

IPC: G06F9/52, G06F9/48

CPC classification number: G06F9/522, G06F9/48

Abstract: An apparatus to facilitate forward progress guarantee using single-level synchronization at individual thread granularity is disclosed. The apparatus includes a processor comprising a barrier synchronization hardware circuitry to assign a set of global named barrier identifiers (IDs) to individual execution threads of a plurality of execution threads and synchronize execution of the individual execution threads on a single level via the set of global named barrier IDs; and a plurality of processing resources to execute the plurality of execution threads and comprising divergent barrier scheduling hardware circuitry to facilitate execution flow switching from a first divergent branch executed by a first thread to a second divergent branch executed by a second thread, the execution flow switching performed responsive to the first thread stalling to wait on a named barrier of the set of global named barrier IDs.

40.

Invention application

IMMEDIATE OFFSET OF LOAD STORE AND ATOMIC INSTRUCTIONS

Status: In force

Publication (announcement) number: US20230090973A1

Publication (announcement) date: 2023-03-23

Application number: US17480528

Application date: 2021-09-21

Applicant: Intel Corporation

Inventor: Joydeep Ray, Abhishek R. Appu, Timothy R. Bauer, James Valerio, Weiyu Chen, Subramaniam Maiyuran, Prasoonkumar Surti, Karthik Vaidyanathan, Carsten Benthin, Sven Woop, Jiasheng Chen

IPC: G06F9/30, G06F12/02, G06F13/16

Abstract: One embodiment provides a graphics processor including a processing resource including a register file, memory, a cache memory, and load/store/cache circuitry to process load, store, and prefetch messages from the processing resource. The circuitry includes support for an immediate address offset that will be used to adjust the address supplied for a memory access to be requested by the circuitry. Including support for the immediate address offset removes the need to execute additional instructions to adjust the address to be accessed prior to execution of the memory access instruction.
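The address adjustment described in this abstract can be pictured with a small Python sketch, in which the memory access itself adds an immediate offset to the supplied base address, so no separate add instruction is needed beforehand. The function name `effective_address`, the `address_bits` parameter, and the wrap-around behaviour are assumptions made for illustration and are not taken from the patent.

```python
def effective_address(base: int, imm_offset: int, address_bits: int = 64) -> int:
    """Return the address a load/store would access.

    Assumption: the immediate offset is signed and the sum wraps
    around at the address width, as in typical address arithmetic.
    """
    mask = (1 << address_bits) - 1
    return (base + imm_offset) & mask
```

For example, `effective_address(0x1000, 0x10)` evaluates to `0x1010`, and a negative offset such as `-0x10` yields `0xFF0`.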
