Abstract:
The present invention provides a scalable method for implementing FFT/IFFT computations in multiprocessor architectures that improves throughput by eliminating the need for inter-processor communication after the first "log₂ P" stages have been computed, for an implementation using "P" processing elements. The method comprises computing each butterfly of the first "log₂ P" stages either on a single processor or on each of the "P" processors simultaneously, and distributing the computation of the butterflies in all subsequent stages among the "P" processors such that each chain of cascaded butterflies, consisting of those butterflies whose inputs and outputs are connected together, is processed by the same processor. The invention also provides a system for obtaining such a scalable implementation of FFT/IFFT computations in multiprocessor architectures, with the same elimination of inter-processor communication after the first "log₂ P" stages.
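The partitioning described above can be illustrated with a minimal sketch. It assumes a radix-2 decimation-in-frequency FFT, which the abstract does not specify, and it simulates the "P" processors with an ordinary loop rather than real parallel hardware. The names `dif_stage`, `bit_reverse`, and `partitioned_fft` are hypothetical and chosen only for this illustration. Under these assumptions, once the first log₂ P stages are done, every remaining chain of connected butterflies lies inside one contiguous block of N/P samples, so each block can be finished by one processor without exchanging data.

```python
import cmath


def dif_stage(x, lo, hi, m):
    """Apply one radix-2 DIF stage with group size m to x[lo:hi], in place."""
    half = m // 2
    for k in range(lo, hi, m):
        for j in range(half):
            w = cmath.exp(-2j * cmath.pi * j / m)  # twiddle factor for this butterfly
            a, b = x[k + j], x[k + j + half]
            x[k + j] = a + b
            x[k + j + half] = (a - b) * w


def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r


def partitioned_fft(x, P):
    """FFT of x (length a power of two) split across P simulated processors.

    Assumption: P is a power of two and P <= len(x).
    """
    n = len(x)
    x = [complex(v) for v in x]
    bits = n.bit_length() - 1
    block = n // P

    # First log2(P) stages: butterflies span more than one block, so these
    # are the only stages that need data from several blocks.
    m = n
    while m > block:
        dif_stage(x, 0, n, m)
        m //= 2

    # Remaining stages: processor p reads and writes only its own block,
    # so no inter-processor communication is needed here.
    for p in range(P):
        lo, hi = p * block, (p + 1) * block
        mm = block
        while mm >= 2:
            dif_stage(x, lo, hi, mm)
            mm //= 2

    # An in-place DIF FFT leaves its output in bit-reversed order.
    return [x[bit_reverse(i, bits)] for i in range(n)]
```

Calling `partitioned_fft(samples, 4)` on a 16-point input, for example, runs two shared stages and then lets each of the four simulated processors complete a 4-point sub-transform on its own block. In a real system the second loop would be dispatched to separate processing elements; that mapping is left out of this sketch.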