Abstract:
A data cache memory coupled to a processor architecture including a plurality of processor clusters (Cluster0, ..., Cluster3) is adapted to operate simultaneously on scalar and vectorial data. This is achieved by providing in the data cache memory data locations for storing data to be processed by the architecture, and by accessing those data locations either in a scalar mode or in a vectorial mode, preferably by explicitly mapping which locations of the cache memory are regarded as scalar and which are regarded as vectorial.
Abstract:
A processor architecture (10), e.g. for multimedia applications, includes a plurality of processor clusters (18a, 18b) that provide a vectorial data processing capability. The processing elements in the processor clusters (18a, 18b) are configured to process both data with a given bit length N and data with bit lengths N/2, N/4, and so on, according to a Single Instruction Multiple Data (SIMD) paradigm. A load unit (26) loads into the processor clusters (18a, 18b) the data to be processed, in the form of sets of more significant bits and less significant bits of operands to be processed according to a same instruction. An intercluster datapath (28) exchanges and/or merges data between the processor clusters (18a, 18b). The intercluster datapath (28) is scalable to activate selected ones of the processor clusters (18a, 18b), whereby the architecture (10) is adapted to operate simultaneously on SIMD, scalar and vectorial data. Preferably, the instruction subsystem (12) has instruction parallelism capability and the intercluster datapath (28) is configured for performing operations on, e.g., 2*N-bit data. Preferably, a data cache memory (34) is provided which is accessible either in a scalar mode or in a vectorial mode.