Hardware unit for performing matrix multiplication with clock gating
Abstract:
Hardware units and methods for performing matrix multiplication via a multi-stage pipeline wherein the storage elements associated with one or more stages of the pipeline are clock gated based on the data elements and/or portions thereof that known to have a zero value (or can be treated as having a zero value). In some cases, the storage elements may be clock gated on a per data element basis based on whether the data element has a zero value (or can be treated as having a zero value). In other cases, the storage elements may be clock gated on a partial element basis based on the bit width of the data elements. For example, if bit width of the data elements is less than a maximum bit width for the data elements then a portion of the bits related to that data element can be treated as having a zero value and a portion of the storage elements associated with that data element may not be clocked. In yet other cases the storage elements may be clock gated on both a per element and a partial element basis.
Information query
Patent Agency Ranking
0/0