-
公开(公告)号:US20200258263A1
公开(公告)日:2020-08-13
申请号:US16750819
申请日:2020-01-23
Applicant: Intel Corporation
Inventor: Joydeep Ray , Ben Ashbaugh , Prasoonkumar Surti , Pradeep Ramani , Rama Harihara , Jerin C. Justin , Jing Huang , Xiaoming Cui , Timothy B. Costa , Ting Gong , Elmoustapha Ould-ahmed-vall , Kumar Balasubramanian , Anil Thomas , Oguz H. Elibol , Jayaram Bobba , Guozhong Zhuang , Bhavani Subramanian , Gokce Keskin , Chandrasekaran Sakthivel , Rajesh Poornachandran
Abstract: Embodiments are generally directed to compression in machine learning and deep learning processing. An embodiment of an apparatus for compression of untyped data includes a graphical processing unit (GPU) including a data compression pipeline, the data compression pipeline including a data port coupled with one or more shader cores, wherein the data port is to allow transfer of untyped data without format conversion, and a 3D compression/decompression unit to provide for compression of untyped data to be stored to a memory subsystem and decompression of untyped data from the memory subsystem.
-
公开(公告)号:US10726517B2
公开(公告)日:2020-07-28
申请号:US16293044
申请日:2019-03-05
Applicant: Intel Corporation
Inventor: Altug Koker , Ingo Wald , David Puffer , Subramaniam M. Maiyuran , Prasoonkumar Surti , Balaji Vembu , Guei-Yuan Lueh , Murali Ramadoss , Abhishek R. Appu , Joydeep Ray
Abstract: One embodiment provides for a parallel processor comprising a processing array within the parallel processor, the processing array including multiple compute blocks, each compute block including multiple processing clusters configured for parallel operation, wherein each of the multiple compute blocks is independently preemptable. In one embodiment a preemption hint can be generated for source code during compilation to enable a compute unit to determine an efficient point for preemption.
-
公开(公告)号:US20200210238A1
公开(公告)日:2020-07-02
申请号:US16726341
申请日:2019-12-24
Applicant: Intel Corporation
Inventor: Abhishek R Appu , Altug Koker , Balaji Vembu , Joydeep Ray , Kamal Sinha , Prasoonkumar Surti , Kiran C. Veernapu , Subramaniam Maiyuran , Sanjeev S. Jahagirdar , Eric J. Asperheim , Guei-Yuan Lueh , David Puffer , Wenyin Fu , Nikos Kaburlasos , Bhushan M. Borole , Josh B. Mastronarde , Linda L. Hurd , Travis T. Schluessler , Tomasz Janczak , Abhishek Venkatesh , Kai Xiao , Slawomir Grajewski
Abstract: In an example, an apparatus comprises a plurality of execution units comprising at least a first type of execution unit and a second type of execution unit and logic, at least partially including hardware logic, to analyze a workload and assign the workload to one of the first type of execution unit or the second type of execution unit. Other embodiments are also disclosed and claimed.
-
公开(公告)号:US10699465B1
公开(公告)日:2020-06-30
申请号:US16235893
申请日:2018-12-28
Applicant: Intel Corporation
Inventor: Prasoonkumar Surti , Carsten Benthin , Karthik Vaidyanathan , Philip Laws , Scott Janus , Sven Woop
Abstract: Cluster of acceleration engines to accelerate intersections. For example, one embodiment of an apparatus comprises: a set of graphics cores to execute a first set of instructions of a primary graphics thread; a scalar cluster comprising a plurality of scalar execution engines; and a communication fabric interconnecting the set of graphics cores and the scalar cluster; the set of graphics cores to offload execution of a second set of instructions associated with ray traversal and/or intersection operations to the scalar cluster; the scalar cluster comprising a plurality of local memories, each local memory associated with one of the scalar execution engines, wherein each local memory is to store a portion of a hierarchical acceleration data structure required by an associated scalar execution engine to execute one or more of the second set of instructions; the plurality of scalar execution engines to store results of the execution of the second set of instructions in a memory accessible by the set of graphics cores; wherein the set of graphics cores are to process the results within the primary graphics thread.
-
公开(公告)号:US20200111454A1
公开(公告)日:2020-04-09
申请号:US16599175
申请日:2019-10-11
Applicant: Intel Corporation
Inventor: Joydeep Ray , Altug Koker , Balaji Vembu , Murali Ramadoss , Guei-Yuan Lueh , James A. Valerio , Prasoonkumar Surti , Abhishek R. Appu , Vasanth Ranganathan , Kalyan K. Bhiravabhatla , Arthur D. Hunter, JR. , Wei-Yu Chen , Subramaniam M. Maiyuran
IPC: G09G5/36 , G09G5/00 , G06F9/46 , G06F12/0875
Abstract: A mechanism is described for facilitating using of a shared local memory for register spilling/filling relating to graphics processors at computing devices. A method of embodiments, as described herein, includes reserving one or more spaces of a shared local memory (SLM) to perform one or more of spilling and filling relating to registers associated with a graphics processor of a computing device.
-
76.
公开(公告)号:US10573066B2
公开(公告)日:2020-02-25
申请号:US16215850
申请日:2018-12-11
Applicant: Intel Corporation
Inventor: Prasoonkumar Surti , Karthik Vaidyanathan , Murali Ramadoss , Michael Apodaca , Abhishek Venkatesh , Joydeep Ray , Abhishek R. Appu
Abstract: Systems, apparatuses and methods may provide away to render edges of an object defined by multiple tessellation triangles. More particularly, systems, apparatuses and methods may provide a way to perform anti-aliasing at the edges of the object based on a coarse pixel rate, where the coarse pixels may be based on a coarse Z value indicate a resolution or granularity of detail of the coarse pixel. The systems, apparatuses and methods may use a shader dispatch engine to dispatch raster rules to a pixel shader to direct the pixel shader to include, in a tile and/or tessellation triangle, one more finer coarse pixels based on a percent of coverage provided by a finer coarse pixel of a tessellation triangle at or along the edge of the object.
-
公开(公告)号:US10573055B2
公开(公告)日:2020-02-25
申请号:US15693084
申请日:2017-08-31
Applicant: Intel Corporation
Inventor: John G. Gierach , Darrel K. Palke , Travis T. Schluessler , Prasoonkumar Surti
Abstract: An apparatus and method for programmable depth stencil pipeline stage and shading. For example, one embodiment of a graphics processing apparatus comprises: a rasterizer to generate a plurality of pixel blocks, one or more of which overlap one or more primitives; programmable depth stencil circuitry to perform depth stencil tests on the pixels which overlap the one or more primitives to identify pixels which pass the depth stencil tests; and thread dispatch circuitry to dispatch pixel shader threads to perform pixel shading operations on those pixels which pass the depth stencil tests, the thread dispatch circuitry including thread dispatch recombine logic to combine pixels which have passed the depth stencil test from multiple pixel blocks into a set of pixel shader threads to be executed concurrently on single instruction multiple data (SIMD) hardware.
-
公开(公告)号:US10565676B2
公开(公告)日:2020-02-18
申请号:US15488547
申请日:2017-04-17
Applicant: Intel Corporation
Inventor: Adam T. Lake , Guei-Yuan Lueh , Balaji Vembu , Murali Ramadoss , Prasoonkumar Surti , Abhishek R. Appu , Altug Koker , Subramaniam M. Maiyuran , Eric C. Samson , David J. Cowperthwaite , Zhi Wang , Kun Tian , David Puffer , Brian T. Lewis
Abstract: An apparatus to facilitate data prefetching is disclosed. The apparatus includes a memory, one or more execution units (EUs) to execute a plurality of processing threads and prefetch logic to prefetch pages of data from the memory to assist in the execution of the plurality of processing threads.
-
公开(公告)号:US20200045348A1
公开(公告)日:2020-02-06
申请号:US16050475
申请日:2018-07-31
Applicant: Intel Corporation
Inventor: Jill Boyce , Scott Janus , Prasoonkumar Surti , Stanley Baran , Michael Apodaca , Srikanth Potluri , Hugues Labbe , Jong Dae Oh , Gokcen Cilingir , Archie Sharma , Jeffrey Tripp , Jason Ross , Barnan Das
IPC: H04N21/235 , H04N21/435 , H04N21/845 , H04N21/2662 , G06T15/00 , G06T15/80 , G06T15/50
Abstract: An apparatus to facilitate processing video bit stream data is disclosed. The apparatus includes one or more processors to decode point cloud data, reconstruct the decoded point cloud data and fill one or more holes in reconstructed point cloud frame data using patch metadata included in the decoded point cloud data and a memory communicatively coupled to the one or more processors.
-
公开(公告)号:US20200045343A1
公开(公告)日:2020-02-06
申请号:US16050391
申请日:2018-07-31
Applicant: Intel Corporation
Inventor: Jill Boyce , Scott Janus , Itay Kaufman , Archie Sharma , Stanley Baran , Michael Apodaca , Prasoonkumar Surti , Srikanth Potluri , Barnan Das , Hugues Labbe , Jong Dae Oh , Gokcen Cilingir , Maria Bortman , Tzach Ashkenazi , Jonathan Distler , Atul Divekar , Mayuresh M. Varerkar , Narayan Biswal , Nilesh V. Shah , Atsuo Kuwahara , Kai Xiao , Jason Tanner , Jeffrey Tripp
Abstract: An apparatus to facilitate processing video bit stream data is disclosed. The apparatus includes one or more processors to encode surface normals data with point cloud geometry data included in the video bit stream data for reconstruction of objects within the video bit stream data based on the surface normals data and a memory communicatively coupled to the one or more processors.
-
-
-
-
-
-
-
-
-