Systems and methods for accelerating memory transfers and computation efficiency using a computation-informed partitioning of an on-chip data buffer and implementing computation-aware data transfer operations to the on-chip data buffer
Abstract:
Systems and methods for implementing accelerated memory transfers in an integrated circuit includes configuring a region of memory of an on-chip data buffer based on a neural network computation graph, wherein configuring the region of memory includes: partitioning the region of memory of the on-chip data buffer to include a first distinct sub-region of memory and a second distinct sub-region of memory; initializing a plurality of distinct memory transfer operations from the off-chip main memory to the on-chip data buffer; executing a first set of memory transfer operations that includes writing a first set of computational components to the first distinct sub-region of memory, and while executing, using the integrated circuit, a leading computation based on the first set of computational components, executing a second set of memory transfer operations to the second distinct sub-region of memory for an impending computation.
Information query
Patent Agency Ranking
0/0