-
61.
Publication No.: US20200327396A1
Publication Date: 2020-10-15
Application No.: US16913370
Filing Date: 2020-06-26
Applicant: Intel Corporation
Inventor: Anirud Thyagharajan , Prashant Laddha , Om Omer , Sreenivas Subramoney
Abstract: Exemplary embodiments maintain spatial locality of the data being processed by a sparse CNN by reordering the data. The reordering may be performed on individual data elements and on groups of co-located data elements referred to herein as "chunks". Thus, the data may be reordered into chunks, where each chunk contains data for spatially co-located data elements, and chunks may in turn be organized so that spatially adjacent chunks are stored together. The use of chunks reduces the need to re-fetch data during processing. Chunk sizes may be chosen based on the memory constraints of the processing logic (e.g., cache sizes).
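A minimal sketch of the chunk reordering idea in the abstract: sparse coordinates are sorted so that elements falling in the same tile are stored contiguously, and tiles are themselves ordered so neighbouring tiles stay adjacent. The chunk size and the row-major tile ordering are assumptions for illustration; the patent leaves them to the implementation.

```python
def reorder_into_chunks(points, chunk=4):
    """Sort sparse (y, x) coordinates so that elements in the same
    chunk-by-chunk tile are stored contiguously, and tiles themselves
    are ordered row-major to keep neighbouring tiles adjacent."""
    return sorted(points, key=lambda p: (p[0] // chunk, p[1] // chunk, p[0], p[1]))

pts = [(0, 9), (1, 1), (0, 0), (9, 0), (1, 8)]
ordered = reorder_into_chunks(pts, chunk=4)
# (0,0) and (1,1) share tile (0,0); (0,9) and (1,8) share tile (0,2)
```

Grouping by tile first, then by coordinate within the tile, is what keeps a chunk's data resident in cache while it is processed.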
-
62.
Publication No.: US10754655B2
Publication Date: 2020-08-25
Application No.: US16021838
Filing Date: 2018-06-28
Applicant: Intel Corporation
Inventor: Adarsh Chauhan , Hong Wang , Jayesh Gaur , Zeev Sperber , Sumeet Bandishte , Lihu Rappoport , Stanislav Shwartsman , Kamil Garifullin , Sreenivas Subramoney , Adi Yoaz
Abstract: A processing device includes a branch IP table and branch predication circuitry coupled to the branch IP table. The branch predication circuitry is to: determine a dynamic convergence point in a conditional branch of a set of instructions; store the dynamic convergence point in the branch IP table; fetch a first and second speculative path of the conditional branch; while determining which of the first speculative path and the second speculative path is a taken path of the conditional branch and determining whether a dynamic convergence point is fetched corresponding to the stored dynamic convergence point, stall scheduling of instructions of the first speculative path and the second speculative path; and in response to determining that one of the first speculative path and the second speculative path is the taken path and the fetched dynamic convergence point corresponds to the stored convergence point, resume scheduling of the instructions of the taken path.
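A toy model of the mechanism the abstract describes: both speculative paths are fetched but held back from scheduling, and once the branch resolves and the fetched convergence point matches the stored one, only the taken path is released. Class and method names are illustrative; real hardware does this per fetch cycle, not in one call.

```python
class DualPathFrontend:
    """Toy dual-path fetch model: fetch both sides of a branch, stall
    scheduling, then release only the taken path on resolution."""
    def __init__(self):
        self.branch_ip_table = {}   # branch IP -> stored convergence IP
        self.stalled = []           # fetched instructions awaiting scheduling

    def on_branch(self, branch_ip, convergence_ip, taken_path, not_taken_path):
        self.branch_ip_table[branch_ip] = convergence_ip
        # Fetch BOTH speculative paths, but hold them back from scheduling.
        self.stalled = ([("T", ip) for ip in taken_path] +
                        [("N", ip) for ip in not_taken_path])

    def resolve(self, branch_ip, taken, fetched_convergence_ip):
        # Resume scheduling only once the fetched convergence point
        # matches the one stored in the branch IP table.
        assert fetched_convergence_ip == self.branch_ip_table[branch_ip]
        tag = "T" if taken else "N"
        scheduled = [ip for (t, ip) in self.stalled if t == tag]
        self.stalled = []
        return scheduled

fe = DualPathFrontend()
fe.on_branch(0x10, 0x40, taken_path=[0x14, 0x18], not_taken_path=[0x20, 0x24])
out = fe.resolve(0x10, taken=True, fetched_convergence_ip=0x40)
```

Stalling (rather than discarding) the wrong path is what lets the frontend avoid a re-fetch once the convergence point is reached.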
-
63.
Publication No.: US20200226203A1
Publication Date: 2020-07-16
Application No.: US16833210
Filing Date: 2020-03-27
Applicant: Intel Corporation
Inventor: Biji George , Om Ji Omer , Dipan Kumar Mandal , Cormac Brick , Lance Hacking , Sreenivas Subramoney , Belliappa Kuttanna
IPC: G06F17/16
Abstract: A disclosed apparatus to multiply matrices includes a compute engine. The compute engine includes multipliers in a two-dimensional array that has a plurality of array locations defined by columns and rows. The apparatus also includes a plurality of adders arranged in columns. A broadcast interconnect between a cache and the multipliers broadcasts a first set of operand data elements to multipliers in the rows of the array. A unicast interconnect unicasts a second set of operands between a data buffer and the multipliers. The multipliers multiply the operands to generate a plurality of outputs, and the adders add the outputs generated by the multipliers.
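The dataflow in the abstract can be sketched as follows, under an assumed mapping: each element of a row of A is broadcast along one row of the multiplier array, the matching elements of B are unicast to individual array locations, and the column adders reduce the products. The mapping is one plausible reading, not the patent's exact microarchitecture.

```python
def array_matmul(A, B):
    """Model of a broadcast/unicast multiplier array: for each row a of A,
    a[k] is broadcast to row k of the array, B[k][c] is unicast to
    array location (k, c), and the column adders reduce the products."""
    inner, cols = len(B), len(B[0])
    out = []
    for a in A:  # one vector-matrix pass through the array per row of A
        # products[k][c] models the multiplier at array location (k, c)
        products = [[a[k] * B[k][c] for c in range(cols)] for k in range(inner)]
        # column adders sum down each column of the array
        out.append([sum(products[k][c] for k in range(inner)) for c in range(cols)])
    return out

C = array_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

The broadcast saves cache bandwidth because each A element is read once and reused across a whole row of multipliers.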
-
64.
Publication No.: US10713053B2
Publication Date: 2020-07-14
Application No.: US16024808
Filing Date: 2018-06-30
Applicant: Intel Corporation
Inventor: Rahul Bera , Anant Vithal Nori , Sreenivas Subramoney , Hong Wang
IPC: G06F9/38 , G06F12/0875 , G06F12/0862 , G06F12/084
Abstract: An apparatus and method for adaptive spatial accelerated prefetching. For example, one embodiment of an apparatus comprises: execution circuitry to execute instructions and process data; a Level 2 (L2) cache to store at least a portion of the data; and a prefetcher to prefetch data from a memory subsystem to the L2 cache in anticipation of the data being needed by the execution circuitry to execute one or more of the instructions, the prefetcher comprising a buffer to store one or more prefetched memory pages or portions thereof, and signature data indicating detected patterns of access to the one or more prefetched memory pages; wherein the prefetcher is to prefetch one or more cache lines based on the signature data.
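A simplified sketch of signature-driven spatial prefetching: the prefetcher records which cache-line offsets were touched within a page, keys that pattern by the trigger offset, and replays it when the same trigger offset appears in a new page. Real signatures compress the access pattern (e.g., as bit vectors or delta histories); the literal offset list here is an assumption for clarity.

```python
from collections import defaultdict

class SpatialPrefetcher:
    """Toy signature-based spatial prefetcher: learn per-page access
    patterns, replay them on a matching trigger access in a new page."""
    def __init__(self):
        self.signatures = {}                  # trigger offset -> learned offsets
        self.page_accesses = defaultdict(list)

    def access(self, page, offset):
        hits = self.page_accesses[page]
        if not hits and offset in self.signatures:
            # First access to this page matches a learned trigger:
            # prefetch the rest of the remembered pattern.
            prefetches = [o for o in self.signatures[offset] if o != offset]
        else:
            prefetches = []
        hits.append(offset)
        return prefetches                     # offsets to prefetch in this page

    def retire_page(self, page):
        hits = self.page_accesses.pop(page)
        if hits:
            self.signatures[hits[0]] = hits   # key signature by trigger offset

pf = SpatialPrefetcher()
for off in (3, 4, 7):          # train on page 0
    pf.access(0, off)
pf.retire_page(0)
predicted = pf.access(1, 3)    # same trigger offset seen on a new page
```

Keying the signature by the trigger offset is what lets one learned page teach the prefetcher about every later page with a similar access footprint.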
-
65.
Publication No.: US10430198B2
Publication Date: 2019-10-01
Application No.: US15870595
Filing Date: 2018-01-12
Applicant: Intel Corporation
Inventor: Saurabh Gupta , Rahul Pal , Niranjan Soundararajan , Ragavendra Natarajan , Sreenivas Subramoney
IPC: G06F9/38
Abstract: One embodiment provides an apparatus. The apparatus includes a store direct dependent (SDD) branch prediction circuitry and an SDD management circuitry. The store direct dependent (SDD) branch prediction circuitry is to store an SDD branch table. The SDD branch table is to store at least one record. Each record includes a branch instruction pointer (IP) field, a load IP field, a store IP field, a comparison info field and at least one of a store value field and/or a predicted outcome field. The SDD management circuitry is to populate the SDD branch table at runtime and to override a baseline branch prediction associated with an incoming branch IP with an SDD branch prediction, if the SDD branch table contains a first record populated with the incoming branch IP and at least one of a store value and/or an SDD predicted outcome.
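The SDD idea can be sketched as follows: when a branch's outcome is a direct comparison against a recently stored value, the stored value and comparison info let the predictor compute the outcome instead of guessing. Table fields and comparison operators here are illustrative stand-ins for the record fields the abstract names.

```python
class SDDPredictor:
    """Toy store-direct-dependent branch predictor: if the branch directly
    compares a recently stored value, compute the outcome from that value
    and override the baseline prediction."""
    def __init__(self):
        self.table = {}   # branch IP -> (cmp_op, cmp_operand, store_value)

    def train(self, branch_ip, cmp_op, cmp_operand, store_value):
        # Populated at runtime, as the abstract's management circuitry does.
        self.table[branch_ip] = (cmp_op, cmp_operand, store_value)

    def predict(self, branch_ip, baseline):
        rec = self.table.get(branch_ip)
        if rec is None:
            return baseline            # no SDD record: keep baseline prediction
        op, operand, value = rec
        if op == "eq":
            return value == operand    # outcome follows directly from the store
        if op == "lt":
            return value < operand
        return baseline

p = SDDPredictor()
p.train(0x100, "lt", 10, 3)            # branch tests (stored value < 10)
pred = p.predict(0x100, baseline=False)
```

The override is safe precisely because the outcome is computed, not statistically predicted, whenever the store value is known.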
-
Publication No.: US10423422B2
Publication Date: 2019-09-24
Application No.: US15383832
Filing Date: 2016-12-19
Applicant: Intel Corporation
Inventor: Niranjan K. Soundararajan , Sreenivas Subramoney , Rahul Pal , Ragavendra Natarajan
IPC: G06F9/38
Abstract: A processor may include a baseline branch predictor and an empirical branch bias override circuit. The baseline branch predictor may receive a branch instruction associated with a given address identifier, and generate, based on a global branch history, an initial prediction of a branch direction for the instruction. The empirical branch bias override circuit may determine, dependent on a direction of an observed branch direction bias in executed branch instruction instances associated with the address identifier, whether the initial prediction should be overridden, may determine, in response to determining that the initial prediction should be overridden, a final prediction that matches the observed branch direction bias, or may determine, in response to determining that the initial prediction should not be overridden, a final prediction that matches the initial prediction. The predictor may update an entry in the global branch history reflecting the resolved branch direction for the instruction following its execution.
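A minimal sketch of the override logic: per-address outcome counts are kept, and once a branch shows a strong empirical bias, the biased direction replaces the baseline prediction. The bias threshold and minimum sample count are assumptions; the patent does not fix them.

```python
from collections import defaultdict

class BiasOverride:
    """Toy empirical branch-bias override: count observed outcomes per
    branch address and override the baseline once a strong bias emerges."""
    def __init__(self, threshold=0.9, min_samples=8):
        self.taken = defaultdict(int)
        self.total = defaultdict(int)
        self.threshold = threshold
        self.min_samples = min_samples

    def update(self, addr, taken):
        self.total[addr] += 1
        self.taken[addr] += int(taken)

    def predict(self, addr, baseline):
        n = self.total[addr]
        if n < self.min_samples:
            return baseline            # too few samples: trust the baseline
        ratio = self.taken[addr] / n
        if ratio >= self.threshold:
            return True                # strongly taken-biased: override
        if ratio <= 1 - self.threshold:
            return False               # strongly not-taken-biased: override
        return baseline

b = BiasOverride()
for _ in range(10):
    b.update(0x40, taken=True)
final = b.predict(0x40, baseline=False)
```

The point of the override is that a heavily biased branch needs no global-history correlation at all, freeing the baseline predictor's capacity for harder branches.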
-
Publication No.: US20190205135A1
Publication Date: 2019-07-04
Application No.: US15861370
Filing Date: 2018-01-03
Applicant: Intel Corporation
Inventor: Anant Vithal Nori , Sreenivas Subramoney , Shankar Balachandran , Hong Wang
CPC classification number: G06F9/30047 , G06F9/3814 , G06F13/1673 , G06F2213/0064
Abstract: Implementations of the disclosure implement timely and context triggered (TACT) prefetching that targets particular load IPs in a program contributing to a threshold amount of the long latency accesses. A processing device comprising an execution unit and a prefetcher circuit communicably coupled to the execution unit is provided. The prefetcher circuit is to detect a memory request for a target instruction pointer (IP) in a program to be executed by the execution unit. A trigger IP is identified to initiate a prefetch operation of memory data for the target IP. Thereupon, an association is determined between memory addresses of the trigger IP and the target IP. The association comprises a series of offsets representing a path between the trigger IP and an instance of the target IP in memory. Based on the association, an offset from the memory address of the trigger IP to prefetch the memory data is produced.
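A toy rendering of the trigger/target association: a trigger IP's access launches prefetches by walking a learned series of offsets from the trigger's address toward the target IP's data. The concrete offsets and the chain-walk are illustrative; the patent's association is learned from observed address streams.

```python
class TACTPrefetcher:
    """Toy timely-and-context-triggered prefetcher: an access by a trigger
    IP launches prefetches at a learned series of offsets from its address."""
    def __init__(self):
        self.assoc = {}   # trigger IP -> (target IP, [offsets])

    def learn(self, trigger_ip, target_ip, offsets):
        self.assoc[trigger_ip] = (target_ip, offsets)

    def on_access(self, ip, addr):
        if ip not in self.assoc:
            return []
        _, offsets = self.assoc[ip]
        # Walk the offset chain from the trigger's address; each step
        # yields one address to prefetch ahead of the target IP's demand.
        prefetches, a = [], addr
        for off in offsets:
            a += off
            prefetches.append(a)
        return prefetches

pf = TACTPrefetcher()
pf.learn(trigger_ip=0x10, target_ip=0x80, offsets=[64, 64, 128])
addrs = pf.on_access(0x10, addr=0x1000)
```

Anchoring the prefetch on an earlier trigger IP, rather than on the target IP itself, is what buys the timeliness in the name.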
-
Publication No.: US10318834B2
Publication Date: 2019-06-11
Application No.: US15582945
Filing Date: 2017-05-01
Applicant: Intel Corporation
Inventor: Gurpreet S. Kalsi , Om J. Omer , Biji George , Gopi Neela , Dipan Kumar Mandal , Sreenivas Subramoney
Abstract: One embodiment provides an image processing circuitry. The image processing circuitry includes a feature extraction circuitry and an optimization circuitry. The feature extraction circuitry is to determine a feature descriptor based, at least in part, on a feature point location and a corresponding scale. The optimization circuitry is to optimize an operation of the feature extraction circuitry. Each optimization is to accelerate the operation of the feature extraction circuitry, reduce its power consumption, and/or reduce the system memory bandwidth it uses.
-
69.
Publication No.: US10191689B2
Publication Date: 2019-01-29
Application No.: US15393998
Filing Date: 2016-12-29
Applicant: Intel Corporation
Inventor: Sriseshan Srikanth , Lavanya Subramanian , Sreenivas Subramoney
IPC: G06F3/06
Abstract: Systems for page management using local page information are disclosed. The system may include a processor, including a memory controller, and a memory, including a row buffer. The memory controller may include circuitry to determine that a page stored in the row buffer has been idle for a time exceeding a predetermined threshold, determine whether the page is exempt from idle page closures, and, based on a determination that the page is exempt, refrain from closing the page. Associated methods are also disclosed.
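The controller's decision reduces to a small policy check, sketched below. The time units and the exemption flag's origin (the abstract's "local page information") are assumptions for illustration.

```python
def should_close_page(idle_ns, threshold_ns, exempt):
    """Toy row-buffer policy from the abstract: close an open page once it
    has been idle past the threshold, unless it is exempt from idle
    closures (e.g., flagged by local page information)."""
    return idle_ns > threshold_ns and not exempt

close = should_close_page(idle_ns=500, threshold_ns=200, exempt=False)  # True
keep = should_close_page(idle_ns=500, threshold_ns=200, exempt=True)    # False
```

The exemption is the interesting part: pages known (locally) to be re-opened soon stay open past the idle timeout, avoiding a needless precharge/activate cycle.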
-
70.
Publication No.: US20180285268A1
Publication Date: 2018-10-04
Application No.: US15475197
Filing Date: 2017-03-31
Applicant: Intel Corporation
Inventor: Kunal Kishore Korgaonkar , Ishwar S. Bhati , Huichu Liu , Jayesh Gaur , Sasikanth Manipatruni , Sreenivas Subramoney , Tanay Karnik , Hong Wang , Ian A. Young
IPC: G06F12/0811 , G06F12/0808 , G06F12/1045 , G06F13/40
Abstract: In one embodiment, a processor comprises a processing core, a last level cache (LLC), and a mid-level cache. The mid-level cache is to determine that an idle indicator has been set, wherein the idle indicator is set based on an amount of activity at the LLC, and based on the determination that the idle indicator has been set, identify a first cache line to be evicted from a first set of cache lines of the mid-level cache and send a request to write the first cache line to the LLC.
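A minimal model of the idle-triggered writeback the abstract describes: when the LLC's idle indicator is set, the mid-level cache proactively picks a victim and requests a write to the LLC. The LRU-head victim choice and single-set layout are assumptions for the sketch.

```python
class MidLevelCache:
    """Toy model of idle-triggered early writeback: when the LLC idle
    indicator is set, evict a victim line and send it to the LLC."""
    def __init__(self):
        self.sets = {0: ["lineA", "lineB"]}   # set index -> LRU-ordered lines
        self.llc_writes = []                  # requests sent to the LLC

    def tick(self, idle_indicator):
        if not idle_indicator:
            return None                       # LLC busy: do nothing early
        lines = self.sets[0]
        if not lines:
            return None
        victim = lines.pop(0)                 # assume LRU head is the victim
        self.llc_writes.append(victim)        # request write of victim to LLC
        return victim

mlc = MidLevelCache()
v = mlc.tick(idle_indicator=True)
```

Draining dirty lines while the LLC is idle converts future on-demand eviction latency into background work.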
-