Invention Grant
US08554820B2 Optimized corner turns for local storage and bandwidth reduction (in force)

Optimized corner turns for local storage and bandwidth reduction
Abstract:
A block matrix multiplication mechanism is provided for reversing the visitation order of blocks at corner turns when performing a block matrix multiplication operation in a data processing system. By reversing the visitation order, the mechanism eliminates a block load at each corner turn. In accordance with the illustrative embodiment, such a corner turn is referred to as a “bounce” corner turn and results in a serpentine processing order of the matrix blocks. The mechanism allows the data processing system to perform a block matrix multiplication operation with a maximum of three block transfers per time step. The mechanism therefore reduces the maximum throughput (peak block-transfer bandwidth) required and increases performance. In addition, the mechanism reduces the number of multi-buffered local store buffers needed.
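The effect of the "bounce" corner turn on the visitation order can be illustrated with a minimal sketch. It is not the patented implementation: the function names and the cost model are assumptions made for illustration. The model assumes each result block C[i][j] needs one operand block from row i of A and one from column j of B, that the operand blocks used for the previous result block are still in local store, and it ignores transfers of C itself.

```python
def raster_order(rows, cols):
    """Visit result blocks row by row, always left to right (hypothetical baseline)."""
    return [(i, j) for i in range(rows) for j in range(cols)]


def serpentine_order(rows, cols):
    """Visit result blocks row by row, reversing direction at each row end ("bounce")."""
    order = []
    for i in range(rows):
        js = range(cols) if i % 2 == 0 else range(cols - 1, -1, -1)
        order.extend((i, j) for j in js)
    return order


def operand_loads_per_step(order):
    """Count operand blocks that must be newly loaded at each step.

    Assumed model: a step reuses the A operand if the row index is unchanged
    and the B operand if the column index is unchanged.
    """
    loads = []
    prev = None
    for i, j in order:
        if prev is None:
            loads.append(2)  # first step loads both operands
        else:
            loads.append(int(i != prev[0]) + int(j != prev[1]))
        prev = (i, j)
    return loads


if __name__ == "__main__":
    for name, make_order in (("raster", raster_order), ("serpentine", serpentine_order)):
        loads = operand_loads_per_step(make_order(3, 3))
        print(name, loads, "peak after first step:", max(loads[1:]))
```

Under this simplified model, a raster order changes both the row and the column index at every row end, so two operand blocks must be loaded in that step. The serpentine order changes only the row index there, which removes one block load at each corner turn and lowers the peak number of loads per step.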