Abstract:
Tile based interleaving and de-interleaving of row-column interleaved data is described. In one example, the de-interleaving is divided into two memory transfer stages, the first from an on-chip memory to a DRAM and the second from the DRAM to an on-chip memory. Each stage operates on part of a row-column interleaved block of data and re-orders the data items, such that the output of the second stage comprises de-interleaved data. In the first stage, data items are read from the on-chip memory according to a non-linear sequence of memory read addresses and written to the DRAM. In the second stage, data items are read from the DRAM according to bursts of linear address sequences which make efficient use of the DRAM interface and written back to on-chip memory according to a non-linear sequence of memory write addresses.
Abstract:
Tile based interleaving and de-interleaving of row-column interleaved data is described. In one example, the de-interleaving is divided into two memory transfer stages, the first from an on-chip memory to a DRAM and the second from the DRAM to an on-chip memory. Each stage operates on part of a row-column interleaved block of data and re-orders the data items, such that the output of the second stage comprises de-interleaved data. In the first stage, data items are read from the on-chip memory according to a non-linear sequence of memory read addresses and written to the DRAM. In the second stage, data items are read from the DRAM according to bursts of linear address sequences which make efficient use of the DRAM interface and written back to on-chip memory according to a non-linear sequence of memory write addresses.
Abstract:
Tile based interleaving and de-interleaving of row-column interleaved data is described. In one example, the de-interleaving is divided into two memory transfer stages, the first from an on-chip memory to a DRAM and the second from the DRAM to an on-chip memory. Each stage operates on part of a row-column interleaved block of data and re-orders the data items, such that the output of the second stage comprises de-interleaved data. In the first stage, data items are read from the on-chip memory according to a non-linear sequence of memory read addresses and written to the DRAM. In the second stage, data items are read from the DRAM according to bursts of linear address sequences which make efficient use of the DRAM interface and written back to on-chip memory according to a non-linear sequence of memory write addresses.
Abstract:
A technique for transferring data in a digital signal processing system is described. In one example, the digital signal processing system comprises a number of fixed function accelerators, each connected to a memory access controller and each configured to read data from a memory device, perform one or more operations on the data, and write data to the memory device. To avoid hardwiring the fixed function accelerators together, and to provide a configurable digital signal processing system, a multi-threaded processor controls the transfer of data between the fixed function accelerators and the memory. Each processor thread is allocated to a memory access channel, and the threads are configured to detect an occurrence of an event and, responsive to this, control the memory access controller to enable a selected fixed function accelerator to read data from or write data to the memory device via its memory access channel.
Abstract:
A SIMD processing module is provided, comprising multiple vector processing units (“VUs”), which can be used to execute an instruction on respective parts (or “subvectors”) within a vector. A control unit determines a vector position indication for each of the VUs to indicate which part of the vector that VU is to execute the instruction on. Therefore, the vector is conceptually divided into subvectors with the respective VUs executing the instruction on the respective subvectors in parallel. Each VU can then execute the instruction as intended, but only on a subsection of the whole vector. This allows an instruction that is written for execution on an n-way VU to be executed by multiple n-way VUs, each starting at different points of the vector, such that the instruction can be executed on more than n of the data items of the vector in parallel.
Abstract:
Methods and apparatus for efficient demapping of constellations are described. In an embodiment, these methods may be implemented within a digital communications receiver, such as a Digital Terrestrial Television receiver. The method reduces the number of distance metric calculations which are required to calculate soft information in the demapper by locating the closest constellation point to the received symbol. This closest constellation point is identified based on a comparison of distance metrics which are calculated parallel to either the I- or Q-axis. The number of distance metric calculations may be reduced still further by identifying a local minimum constellation point for each bit in the received symbol and these constellation points are identified using a similar method to the closest constellation point. Where the system uses rotated constellations, the received symbol may be unrotated before any constellation points are identified.
Abstract:
Tile based interleaving and de-interleaving of row-column interleaved data is described. In one example, the de-interleaving is divided into two memory transfer stages, the first from an on-chip memory to a DRAM and the second from the DRAM to an on-chip memory. Each stage operates on part of a row-column interleaved block of data and re-orders the data items, such that the output of the second stage comprises de-interleaved data. In the first stage, data items are read from the on-chip memory according to a non-linear sequence of memory read addresses and written to the DRAM. In the second stage, data items are read from the DRAM according to bursts of linear address sequences which make efficient use of the DRAM interface and written back to on-chip memory according to a non-linear sequence of memory write addresses.
Abstract:
Tile based interleaving and de-interleaving of row-column interleaved data is described. In one example, the de-interleaving is divided into two memory transfer stages, the first from an on-chip memory to a DRAM and the second from the DRAM to an on-chip memory. Each stage operates on part of a row-column interleaved block of data and re-orders the data items, such that the output of the second stage comprises de-interleaved data. In the first stage, data items are read from the on-chip memory according to a non-linear sequence of memory read addresses and written to the DRAM. In the second stage, data items are read from the DRAM according to bursts of linear address sequences which make efficient use of the DRAM interface and written back to on-chip memory according to a non-linear sequence of memory write addresses.
Abstract:
Tile based interleaving and de-interleaving of row-column interleaved data is described. In one example, the de-interleaving is divided into two memory transfer stages, the first from an on-chip memory to a DRAM and the second from the DRAM to an on-chip memory. Each stage operates on part of a row-column interleaved block of data and re-orders the data items, such that the output of the second stage comprises de-interleaved data. In the first stage, data items are read from the on-chip memory according to a non-linear sequence of memory read addresses and written to the DRAM. In the second stage, data items are read from the DRAM according to bursts of linear address sequences which make efficient use of the DRAM interface and written back to on-chip memory according to a non-linear sequence of memory write addresses.
Abstract:
A technique for transferring data in a digital signal processing system is described. In one example, the digital signal processing system comprises a number of fixed function accelerators, each connected to a memory access controller and each configured to read data from a memory device, perform one or more operations on the data, and write data to the memory device. To avoid hardwiring the fixed function accelerators together, and to provide a configurable digital signal processing system, a multi-threaded processor controls the transfer of data between the fixed function accelerators and the memory. Each processor thread is allocated to a memory access channel, and the threads are configured to detect an occurrence of an event and, responsive to this, control the memory access controller to enable a selected fixed function accelerator to read data from or write data to the memory device via its memory access channel.