-
Publication number: US11768687B2
Publication date: 2023-09-26
Application number: US17848559
Filing date: 2022-06-24
Applicant: Intel Corporation
Inventor: Balaji Vembu , Abhishek R. Appu , Joydeep Ray , Altug Koker
IPC: G06T1/20 , G06F9/50 , G06F9/48 , G06F9/38 , G06F9/46 , G06F9/52 , G06F9/54 , G06F15/16 , G06F15/76 , G06F12/0897 , G06F12/0866 , G06F12/0842 , G06T1/60
CPC classification number: G06F9/3851 , G06F9/46 , G06F9/4843 , G06F9/4881 , G06F9/5027 , G06F9/522 , G06F9/545 , G06F12/0842 , G06F12/0866 , G06F12/0897 , G06F15/16 , G06F15/76 , G06T1/20 , G06T1/60 , G06F2209/5018 , G06T2200/28
Abstract: An apparatus to facilitate thread scheduling is disclosed. The apparatus includes logic to store barrier usage data based on a magnitude of barrier messages in an application kernel and a scheduler to schedule execution of threads across a plurality of multiprocessors based on the barrier usage data.
-
Publication number: US11762696B2
Publication date: 2023-09-19
Application number: US17520583
Filing date: 2021-11-05
Applicant: Intel Corporation
Inventor: Abhishek R Appu , Altug Koker , Balaji Vembu , Joydeep Ray , Kamal Sinha , Prasoonkumar Surti , Kiran C. Veernapu , Subramaniam Maiyuran , Sanjeev S. Jahagirdar , Eric J. Asperheim , Guei-Yuan Lueh , David Puffer , Wenyin Fu , Nikos Kaburlasos , Bhushan M. Borole , Josh B. Mastronarde , Linda L. Hurd , Travis T. Schluessler , Tomasz Janczak , Abhishek Venkatesh , Kai Xiao , Slawomir Grajewski
CPC classification number: G06F9/5016 , G06F1/329 , G06F9/4893 , G06F9/5044 , G06T1/20 , G06T1/60 , G06T15/005 , G06T2200/28 , Y02D10/00
Abstract: In an example, an apparatus comprises a plurality of execution units comprising at least a first type of execution unit and a second type of execution unit and logic, at least partially including hardware logic, to analyze a workload and assign the workload to one of the first type of execution unit or the second type of execution unit. Other embodiments are also disclosed and claimed.
-
Publication number: US11756247B2
Publication date: 2023-09-12
Application number: US17194819
Filing date: 2021-03-08
Applicant: Intel Corporation
Inventor: Deepak S. Vembar , Atsuo Kuwahara , Chandrasekaran Sakthivel , Radhakrishnan Venkataraman , Brent E. Insko , Anupreet S. Kalra , Hugues Labbe , Abhishek R. Appu , Ankur N. Shah , Joydeep Ray , ElMoustapha Ould-Ahmed-Vall , James M. Holland
IPC: G09G5/00 , G06T11/60 , G06T9/00 , H04N19/124 , H04N19/167 , H04N19/17 , H04N19/436 , H04N19/503
CPC classification number: G06T11/60 , G06T9/00 , H04N19/124 , H04N19/167 , H04N19/17 , H04N19/436 , H04N19/503
Abstract: An embodiment of a graphics apparatus may include a focus identifier to identify a focus area, and a color compressor to selectively compress color data based on the identified focus area. Another embodiment of a graphics apparatus may include a motion detector to detect motion of a real object, a motion predictor to predict a motion of the real object, and an object placer to place a virtual object relative to the real object based on the predicted motion of the real object. Another embodiment of a graphics apparatus may include a frame divider to divide a frame into viewports, a viewport prioritizer to prioritize the viewports, a renderer to render a viewport of the frame in order in accordance with the viewport priorities, and a viewport transmitter to transmit a completed rendered viewport. Other embodiments are disclosed and claimed.
-
Publication number: US20230281898A1
Publication date: 2023-09-07
Application number: US18299748
Filing date: 2023-04-13
Applicant: Intel Corporation
Inventor: Joydeep Ray
CPC classification number: G06T11/60 , G06F3/1438 , G06T7/254 , G06T1/20 , G09G5/39 , G06T15/005 , G06T15/00 , G06T2200/28 , G09G2360/08 , G09G2360/06 , G09G2340/0435 , G09G2352/00 , G09G2360/121 , G09G5/001
Abstract: Video or graphics, received by a render engine within a graphics processing unit, may be segmented into a region of interest such as foreground and a region of less interest such as background. In other embodiments, an object of interest may be segmented from the rest of the depiction in a case of a video game or graphics processing workload. Each of the segmented portions of a frame may themselves make up a separate surface which is sent separately from the render engine to the display engine of a graphics processing unit. In one embodiment, the display engine combines the two surfaces and sends them over a display link to a display panel. The display controller in the display panel displays the combined frame. The combined frame is stored in a buffer and refreshed periodically. In accordance with another embodiment, video or graphics may be segmented by a render engine into regions of interest or objects of interest and objects not of interest and again each of the separate regions or objects may be transferred to the display engine as a separate surface. Then the display engine may transfer the separate surfaces to a display controller of a display panel over a display link. At the display panel, a separate frame buffer may be used for each of the separate surfaces.
-
Publication number: US11748302B2
Publication date: 2023-09-05
Application number: US17561427
Filing date: 2021-12-23
Applicant: Intel Corporation
Inventor: Altug Koker , Prasoonkumar Surti , David Puffer , Subramaniam Maiyuran , Guei-Yuan Lueh , Abhishek R. Appu , Joydeep Ray , Balaji Vembu , Tomer Bar-On , Andrew T. Lauritzen , Hugues Labbe , John G. Gierach , Gabor Liktor
IPC: G06F16/13 , G06F9/38 , G06F9/30 , G06F16/11 , G06F16/172 , G06F9/46 , G06F12/1036 , G06F12/1045 , G06F12/0831
CPC classification number: G06F16/13 , G06F9/30 , G06F9/38 , G06F9/3836 , G06F9/461 , G06F16/113 , G06F16/172 , G06F12/0831 , G06F12/1036 , G06F12/1045 , G06F2201/84
Abstract: In an example, an apparatus comprises a plurality of execution units, a first shared memory communicatively coupled to the plurality of execution units, wherein the first shared memory is shared by the plurality of execution units, and a copy engine to copy context state data from at least a first of the plurality of execution units to the first shared memory. Other embodiments are also disclosed and claimed.
-
Publication number: US11727246B2
Publication date: 2023-08-15
Application number: US16283021
Filing date: 2019-02-22
Applicant: Intel Corporation
Inventor: Liwei Ma , Elmoustapha Ould-Ahmed-Vall , Barath Lakshmanan , Ben J. Ashbaugh , Jingyi Jin , Jeremy Bottleson , Mike B. Macpherson , Kevin Nealis , Dhawal Srivastava , Joydeep Ray , Ping T. Tang , Michael S. Strickland , Xiaoming Chen , Anbang Yao , Tatiana Shpeisman , Altug Koker , Abhishek R. Appu
Abstract: Embodiments provide systems and methods which facilitate optimization of a convolutional neural network (CNN). One embodiment provides for a non-transitory machine-readable medium storing instructions that cause one or more processors to perform operations comprising processing a trained convolutional neural network (CNN) to generate a processed CNN, the trained CNN having weights in a floating-point format. Processing the trained CNN includes quantizing the weights in the floating-point format to generate weights in an integer format. Quantizing the weights includes generating a quantization table to enable non-uniform quantization of the weights and quantizing the weights from the floating-point format to the integer format using the quantization table. The operations additionally comprise performing an inference operation utilizing the processed CNN with the integer format weights.
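The table-based, non-uniform quantization described in this abstract can be illustrated with a minimal Python sketch. The abstract does not say how the quantization table is built; choosing table entries at evenly spaced quantiles of the weights is an assumption made here for illustration, and the names `build_quantization_table`, `quantize`, and `dequantize` are hypothetical.

```python
from bisect import bisect_left


def build_quantization_table(weights, bits=4):
    """Build a non-uniform table of up to 2**bits float levels.

    Assumption: levels sit at the midpoints of equal-population buckets,
    so dense regions of the weight distribution get more levels.
    """
    ordered = sorted(weights)
    levels = 2 ** bits
    n = len(ordered)
    table = []
    for i in range(levels):
        pos = min(n - 1, int((i + 0.5) * n / levels))
        table.append(ordered[pos])
    # Remove duplicates so the table is strictly increasing.
    return sorted(set(table))


def quantize(weights, table):
    """Map each floating-point weight to the integer index of its nearest table entry."""
    codes = []
    for w in weights:
        j = bisect_left(table, w)
        if j == 0:
            codes.append(0)
        elif j == len(table):
            codes.append(len(table) - 1)
        else:
            # Pick whichever neighbouring level is closer (ties go to the lower level).
            codes.append(j if table[j] - w < w - table[j - 1] else j - 1)
    return codes


def dequantize(codes, table):
    """Recover approximate floating-point weights by table lookup."""
    return [table[c] for c in codes]
```

In this sketch the integer codes are what an inference step would store and operate on, and the table is consulted only when a floating-point value is needed.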
-
Publication number: US11726826B2
Publication date: 2023-08-15
Application number: US17339184
Filing date: 2021-06-04
Applicant: Intel Corporation
Inventor: James Valerio , Vasanth Ranganathan , Joydeep Ray , Rahul A. Kulkarni , Abhishek R. Appu , Jeffery S. Boles , Hema C. Nalluri
CPC classification number: G06F9/5038 , G06F9/3822 , G06F9/3867 , G06F9/4881 , G06F9/5066 , G06T1/20
Abstract: Examples described here can be used to allocate commands from multiple sources for execution by one or more segments of a processing device. For example, a processing device can be segmented into multiple portions, with each portion allocated to process commands from a particular source. In the event a single source provides commands, the entire processing device (all segments) can be allocated to process commands from the single source. When a second source provides commands, some segments can be allocated to perform commands from the first source and other segments can be allocated to perform commands from the second source. Accordingly, commands from multiple applications can be executed by a processing unit at the same time.
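The allocation behaviour in this abstract (all segments to a single source, a split once a second source appears) can be sketched as follows. The abstract does not specify the split policy; the even, contiguous split below and the function name `allocate_segments` are assumptions for illustration only.

```python
def allocate_segments(num_segments, sources):
    """Return {source: [segment indices]} for the currently active sources.

    Assumptions: source names are distinct, and segments are divided as
    evenly as possible into contiguous ranges, with earlier sources
    receiving any remainder.
    """
    if not sources:
        return {}
    base, extra = divmod(num_segments, len(sources))
    allocation = {}
    start = 0
    for i, src in enumerate(sources):
        count = base + (1 if i < extra else 0)
        allocation[src] = list(range(start, start + count))
        start += count
    return allocation
```

Calling `allocate_segments(4, ["app_a"])` gives every segment to the one source, while adding `"app_b"` to the list divides the four segments between the two.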
-
Publication number: US11699254B2
Publication date: 2023-07-11
Application number: US17524130
Filing date: 2021-11-11
Applicant: Intel Corporation
Inventor: Joydeep Ray
CPC classification number: G06T11/60 , G06F3/1438 , G06T1/20 , G06T7/254 , G06T15/00 , G06T15/005 , G09G5/39 , G06T2200/28 , G09G5/001 , G09G2340/0435 , G09G2352/00 , G09G2360/06 , G09G2360/08 , G09G2360/121
Abstract: Video or graphics, received by a render engine within a graphics processing unit, may be segmented into a region of interest such as foreground and a region of less interest such as background. In other embodiments, an object of interest may be segmented from the rest of the depiction in a case of a video game or graphics processing workload. Each of the segmented portions of a frame may themselves make up a separate surface which is sent separately from the render engine to the display engine of a graphics processing unit. In one embodiment, the display engine combines the two surfaces and sends them over a display link to a display panel. The display controller in the display panel displays the combined frame. The combined frame is stored in a buffer and refreshed periodically. In accordance with another embodiment, video or graphics may be segmented by a render engine into regions of interest or objects of interest and objects not of interest and again each of the separate regions or objects may be transferred to the display engine as a separate surface. Then the display engine may transfer the separate surfaces to a display controller of a display panel over a display link. At the display panel, a separate frame buffer may be used for each of the separate surfaces.
-
Publication number: US11676239B2
Publication date: 2023-06-13
Application number: US17303654
Filing date: 2021-06-03
Applicant: Intel Corporation
Inventor: Joydeep Ray , Scott Janus , Varghese George , Subramaniam Maiyuran , Altug Koker , Abhishek Appu , Prasoonkumar Surti , Vasanth Ranganathan , Andrei Valentin , Ashutosh Garg , Yoav Harel , Arthur Hunter, Jr. , SungYe Kim , Mike Macpherson , Elmoustapha Ould-Ahmed-Vall , William Sadler , Lakshminarayanan Striramassarma , Vikranth Vemulapalli
IPC: G06T1/20 , G06F9/50 , G06F12/0806 , G06F15/80 , G06F17/16 , G06F7/544 , G06N3/04 , G06N3/08 , G06N3/084 , G06N3/048
CPC classification number: G06T1/20 , G06F7/5443 , G06F9/5027 , G06F12/0806 , G06F15/8046 , G06F17/16 , G06N3/048 , G06N3/08 , G06N3/084
Abstract: Embodiments described herein include software, firmware, and hardware logic that provides techniques to perform arithmetic on sparse data via a systolic processing unit. Embodiments described herein provide techniques to skip computational operations for zero-filled matrices and sub-matrices. Embodiments additionally provide techniques to maintain data compression through to a processing unit. Embodiments additionally provide an architecture for a sparse aware logic unit.
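The idea of skipping computation for zero-filled sub-matrices can be illustrated in software with a blocked matrix multiply. This is only an analogy under stated assumptions (square matrices whose size is a multiple of the block size, plain Python lists); it does not model the systolic hardware or the compression scheme, and `block_is_zero` and `sparse_block_matmul` are hypothetical names.

```python
def block_is_zero(m, r0, c0, size):
    """Return True if the size x size block of m starting at (r0, c0) is all zeros."""
    return all(m[r][c] == 0
               for r in range(r0, r0 + size)
               for c in range(c0, c0 + size))


def sparse_block_matmul(a, b, block=2):
    """Multiply square matrices a and b block by block, skipping zero blocks.

    Returns (product, skipped), where skipped counts the block products
    that were never computed because one operand block was all zeros.
    """
    n = len(a)
    out = [[0] * n for _ in range(n)]
    skipped = 0
    for i0 in range(0, n, block):
        for k0 in range(0, n, block):
            if block_is_zero(a, i0, k0, block):
                # This block of a would feed one product per column block of b.
                skipped += n // block
                continue
            for j0 in range(0, n, block):
                if block_is_zero(b, k0, j0, block):
                    skipped += 1
                    continue
                for i in range(i0, i0 + block):
                    for k in range(k0, k0 + block):
                        aik = a[i][k]
                        for j in range(j0, j0 + block):
                            out[i][j] += aik * b[k][j]
    return out, skipped
```

Skipped blocks contribute nothing to the result, so the product matches a dense multiply while doing less arithmetic on sparse inputs.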
-
Publication number: US11675597B2
Publication date: 2023-06-13
Application number: US17900230
Filing date: 2022-08-31
Applicant: Intel Corporation
Inventor: Balaji Vembu , Abhishek R. Appu , Joydeep Ray , Altug Koker
IPC: G06F9/38 , G06F12/0842 , G06F9/52 , G06F9/46 , G06T1/20 , G06F9/48 , G06F9/54 , G06F15/16 , G06F9/50 , G06F15/76 , G06F12/0897 , G06F12/0866 , G06T1/60
CPC classification number: G06F9/3851 , G06F9/46 , G06F9/4843 , G06F9/4881 , G06F9/5027 , G06F9/522 , G06F9/545 , G06F12/0842 , G06F12/0866 , G06F12/0897 , G06F15/16 , G06F15/76 , G06T1/20 , G06T1/60 , G06F2209/5018 , G06T2200/28
Abstract: An apparatus to facilitate thread scheduling is disclosed. In one embodiment the apparatus includes a processor comprising a plurality of multiprocessors comprising single-instruction multiple thread (SIMT) execution circuitry to simultaneously execute multiple threads, a shared local memory to be shared by the multiple threads, and scheduling hardware logic to schedule the multiple threads in a thread group for execution across the plurality of multiprocessors in accordance with barrier data. The instructions of the multiple threads are to produce shared data to be stored in the shared local memory when executed by the plurality of multiprocessors, wherein additional instructions of at least a first thread of the multiple threads are to use the shared data, and wherein, in accordance with the barrier data, the first thread is to wait for other threads of the multiple threads to finish producing the shared data before executing the additional instructions.
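The barrier semantics in this abstract (threads produce shared data, and a first thread waits for the others before using it) can be shown with a small software analogy using Python's standard `threading.Barrier`. This sketch does not model the scheduling hardware or the barrier data itself; the function name `run_thread_group` and the per-thread values are hypothetical.

```python
import threading


def run_thread_group(num_threads=4):
    """Run a thread group in which thread 0 consumes data produced by all threads."""
    shared = [0] * num_threads               # stands in for the shared local memory
    barrier = threading.Barrier(num_threads)
    result = {}

    def worker(tid):
        # Each thread produces its part of the shared data.
        shared[tid] = (tid + 1) ** 2
        # No thread passes this point until every thread has produced.
        barrier.wait()
        # Additional instructions of the first thread use the shared data.
        if tid == 0:
            result["total"] = sum(shared)

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result["total"]
```

Without the `barrier.wait()` call, thread 0 could sum the list before the other threads had written their entries.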