Abstract:
The present invention provides a scalable method for implementing FFT/IFFT computations in multiprocessor architectures. For an implementation using P processing elements, the method improves throughput by eliminating the need for inter-processor communication after the first log₂ P stages have been computed. Each butterfly of the first log₂ P stages is computed either on a single processor or on each of the P processors simultaneously. The butterflies of all subsequent stages are then distributed among the P processors such that each chain of cascaded butterflies, that is, those butterflies whose inputs and outputs are connected together, is processed by the same processor. The invention also provides a system for obtaining such a scalable implementation of FFT/IFFT computations in multiprocessor architectures.
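The partitioning described above can be illustrated with a minimal sketch. It assumes a radix-2 decimation-in-frequency FFT and simulates the P processors sequentially in one Python process; the abstract does not state which FFT variant is used, and the names `dif_stage`, `bit_reverse` and `partitioned_fft` are hypothetical. After the first log₂ P stages, each contiguous block of N/P samples forms an independent chain of cascaded butterflies, so each simulated processor works only on its own private copy.

```python
import cmath


def dif_stage(x, start, size):
    """Apply one radix-2 DIF butterfly stage in place to x[start:start+size]."""
    half = size // 2
    for k in range(half):
        w = cmath.exp(-2j * cmath.pi * k / size)  # twiddle factor
        a = x[start + k]
        b = x[start + k + half]
        x[start + k] = a + b
        x[start + k + half] = (a - b) * w


def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of the integer i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r


def partitioned_fft(signal, num_procs):
    """FFT of `signal`, split across `num_procs` simulated processors.

    Assumption: len(signal) and num_procs are powers of two and
    num_procs <= len(signal).
    """
    n = len(signal)
    if n & (n - 1) or num_procs & (num_procs - 1) or not 0 < num_procs <= n:
        raise ValueError("sizes must be powers of two with num_procs <= n")
    x = [complex(v) for v in signal]
    block = n // num_procs

    # First log2(P) stages: butterflies span data owned by different processors.
    size = n
    while size > block:
        for start in range(0, n, size):
            dif_stage(x, start, size)
        size //= 2

    # Remaining stages: each processor handles one block with no communication.
    for p in range(num_procs):
        local = x[p * block:(p + 1) * block]  # private copy, stands in for local memory
        size = block
        while size >= 2:
            for start in range(0, block, size):
                dif_stage(local, start, size)
            size //= 2
        x[p * block:(p + 1) * block] = local

    # DIF leaves the results in bit-reversed order; restore natural order.
    bits = n.bit_length() - 1
    out = [0j] * n
    for i in range(n):
        out[bit_reverse(i, bits)] = x[i]
    return out
```

Calling `partitioned_fft(samples, 4)` on a power-of-two-length list should give the same spectrum as a direct DFT. In a real multiprocessor system the second loop would run on P separate processors at once.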
Abstract:
An improved FFT/IFFT processor comprising computation means capable of processing butterfly operations, storage means for storing the operands of butterfly operations, and a mechanism for storing the operands of multiple consecutive butterfly operations in contiguous storage locations, wherein the computation means is capable of simultaneously accessing and processing said multiple butterfly operations.
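One possible reading of the contiguous operand storage is sketched below for a single radix-2 decimation-in-frequency stage. The interleaved layout, the group size, and the names `interleave_operands` and `process_packed_stage` are assumptions made for illustration; the abstract does not specify the actual storage scheme. The operands of butterfly k are placed at positions 2k and 2k+1, so one contiguous fetch supplies the operands of several consecutive butterflies.

```python
import cmath


def interleave_operands(x):
    """Lay out one stage's inputs so butterfly k's operands sit at 2k and 2k+1."""
    half = len(x) // 2
    packed = []
    for k in range(half):
        packed.append(x[k])         # first operand of butterfly k
        packed.append(x[k + half])  # second operand of butterfly k
    return packed


def process_packed_stage(packed, group):
    """Run one DIF stage, fetching `group` butterflies' operands per access.

    Returns the stage output in the usual order and the number of
    contiguous fetches that were needed.
    """
    size = len(packed)
    half = size // 2
    upper = [0j] * half
    lower = [0j] * half
    fetches = 0
    for first in range(0, half, group):
        # One contiguous access covers the operands of `group` butterflies.
        chunk = packed[2 * first:2 * (first + group)]
        fetches += 1
        for j in range(len(chunk) // 2):
            k = first + j
            a, b = chunk[2 * j], chunk[2 * j + 1]
            w = cmath.exp(-2j * cmath.pi * k / size)  # twiddle factor
            upper[k] = a + b
            lower[k] = (a - b) * w
    return upper + lower, fetches
```

With eight inputs and `group=2`, the four butterflies of the stage are served by two fetches instead of eight separate operand reads. Hardware with a wide memory port could then process each fetched group in parallel.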
Abstract:
A macro-block level parallel implementation of a video decoder in a parallel processing environment, comprising: a Variable Length Decoding (VLD) block to decode the encoded Discrete Cosine Transform (DCT) coefficients; a master node which receives said decoded DCT coefficients; and a plurality of slave nodes/processors for parallel implementation of the Inverse Discrete Cosine Transform (IDCT) and motion compensation at the macro-block level.
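The master/slave division of work can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: a thread pool stands in for the slave nodes, each macro-block is reduced to a single 8x8 block, the prediction block is assumed to have been fetched already using the motion vector, and the names `idct_8x8`, `slave_decode` and `master_decode` are hypothetical.

```python
import math
from concurrent.futures import ThreadPoolExecutor

N = 8  # block size used by the 8x8 DCT


def idct_8x8(coeffs):
    """Direct (unoptimised) 2-D inverse DCT of one 8x8 coefficient block."""
    def scale(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for x in range(N):
        for y in range(N):
            total = 0.0
            for u in range(N):
                for v in range(N):
                    total += (scale(u) * scale(v) * coeffs[u][v]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            out[x][y] = total / 4
    return out


def slave_decode(task):
    """Slave work item: IDCT of the residual, then motion compensation."""
    coeffs, prediction = task
    residual = idct_8x8(coeffs)
    # Add the motion-compensated prediction and clip to the 8-bit pixel range.
    return [[max(0, min(255, round(residual[r][c] + prediction[r][c])))
             for c in range(N)] for r in range(N)]


def master_decode(tasks, num_slaves=4):
    """Master: hand VLD-decoded blocks to the slaves, collect results in order."""
    with ThreadPoolExecutor(max_workers=num_slaves) as pool:
        return list(pool.map(slave_decode, tasks))
```

Each task is a pair `(coeffs, prediction)` of 8x8 lists. `master_decode` returns the reconstructed blocks in the order the tasks were submitted, which lets the master place them back into the frame.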