Abstract:
This invention relates to a linearly scalable method for computing a Fast Fourier Transform (FFT) or Inverse Fast Fourier Transform (IFFT) in a multiprocessing system using a decimation-in-time approach. Linear scalability means that as the number of processors increases by a factor P, the number of computation cycles falls by the same factor P. The invention comprises computing the first two stages of an N-point FFT/IFFT as a single radix-4 butterfly computation while implementing the remaining (log2(N) - 2) stages as radix-2 operations; fusing the three main nested loops of each radix-2 butterfly stage into a single radix-2 butterfly computation loop; and distributing the butterflies in each stage so that each processor computes an equal number of complete butterfly calculations, thereby eliminating data interdependency within the stage.
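The three elements named above can be sketched in a few lines of Python. This is an illustrative reading of the abstract, not the patented implementation: the function names, the bit-reversed input ordering, the contiguous split of butterflies among processors, and the requirement that the processor count divide N/4 are all assumptions, and the "processors" are simulated by an ordinary loop rather than real threads.

```python
import cmath

def bit_reverse_permute(x):
    """Reorder x (length a power of two) into bit-reversed index order."""
    n = len(x)
    bits = n.bit_length() - 1
    return [x[int(format(i, "0{}b".format(bits))[::-1], 2)] for i in range(n)]

def fft_partitioned(x, num_procs):
    """Forward DIT FFT; each stage's butterflies are split evenly over num_procs."""
    n = len(x)
    if n < 4 or n & (n - 1):
        raise ValueError("length must be a power of two, at least 4")
    if (n // 4) % num_procs:
        raise ValueError("num_procs must divide N/4 (assumption of this sketch)")
    a = bit_reverse_permute([complex(v) for v in x])

    # Stages 1 and 2 merged into one radix-4 butterfly per group of four points.
    per_proc = (n // 4) // num_procs
    for p in range(num_procs):  # each p stands in for one processor
        for q in range(p * per_proc, (p + 1) * per_proc):
            i = 4 * q
            b0, b1 = a[i] + a[i + 1], a[i] - a[i + 1]
            b2, b3 = a[i + 2] + a[i + 3], a[i + 2] - a[i + 3]
            a[i], a[i + 2] = b0 + b2, b0 - b2
            a[i + 1], a[i + 3] = b1 - 1j * b3, b1 + 1j * b3

    # Remaining radix-2 stages: one fused loop over the N/2 butterflies.
    half = 4
    per_proc = (n // 2) // num_procs
    while half < n:
        for p in range(num_procs):
            for b in range(p * per_proc, (p + 1) * per_proc):
                k = b % half                       # position inside the group
                top = (b // half) * 2 * half + k   # upper butterfly input
                bot = top + half                   # lower butterfly input
                t = cmath.exp(-2j * cmath.pi * k / (2 * half)) * a[bot]
                a[top], a[bot] = a[top] + t, a[top] - t
        half *= 2
    return a
```

Because every butterfly in a stage reads and writes its own pair (or quadruple) of points, the per-processor ranges never overlap, which is the property the abstract relies on for removing interdependency within a stage.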
Abstract:
A processing system for accessing first and second data types. The first data type is data supplied from a peripheral and the second data type is randomly accessible data held in a data memory. The processing system comprises a processor for executing instructions; a stream register unit connected to supply data from the peripheral to the processor; a FIFO connected to receive data from the peripheral and connected to the stream register unit by a communication path, along which the said data can be supplied from the FIFO to the stream register unit; and a memory bus connected between the data memory and the processor, across which the processor can access the randomly accessible data.
Abstract:
A semiconductor integrated circuit (210) for use in direct memory access (DMA) has three sources (214, 215, 216) which communicate with a bus (230) through a bus interface (220). A DMA access signal generator (290) is coupled to the bus interface (220) and asserts a DMA access output signal at DMA access signal pins (296, 396) whenever any of the sources requires a DMA access. The need for separate DMA access signal pins for each of the three sources is thereby avoided. With targets on two separate integrated circuits (212, 312), a single DMA access pin (396) can be used for the two targets (248, 349); chip select signals at chip select pins (506, 516) on the source integrated circuit (210) indicate which of the two targets the DMA access is intended for.
Abstract:
The invention provides circuitry for carrying out at least one of a square root operation and a division operation. The circuitry comprises a carry save adder, and a carry propagate adder part. The carry save adder and the carry propagate adder part are arranged in parallel.
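For readers unfamiliar with the two adder types, the following Python sketch shows what each one does on plain non-negative integers. It illustrates the general techniques only; it does not model the parallel arrangement claimed above or any square root or division datapath, and the function names are hypothetical.

```python
def carry_save_add(a, b, c):
    """Reduce three non-negative integers to a (sum, carry) pair without rippling."""
    partial_sum = a ^ b ^ c                      # per-bit sum, carries ignored
    carry = ((a & b) | (a & c) | (b & c)) << 1   # per-bit majority, shifted left
    return partial_sum, carry

def carry_propagate_add(x, y):
    """Add two non-negative integers by propagating carries until none remain."""
    while y:
        x, y = x ^ y, (x & y) << 1
    return x
```

The carry-save step has a delay that does not depend on word length, while the carry-propagate step resolves the redundant (sum, carry) pair into a single value; `carry_propagate_add(*carry_save_add(a, b, c))` equals `a + b + c`.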
Abstract:
The invention provides circuitry for carrying out an arithmetic operation requiring a plurality of iterations. The circuitry comprises N sets of iteration circuitry arranged one after the other so that at least one of the sets of iteration circuitry receives an output from a preceding one of the sets of iteration circuitry. Each of the sets of iteration circuitry comprises at least one adder part, wherein a full adder is provided by at least one part in one of the sets of iteration circuitry and a second part in a succeeding one of the sets of iteration circuitry.
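The idea of chaining identical iteration stages, each fed by the one before it, can be sketched with restoring division, chosen here purely as a familiar example of an iterative arithmetic operation. The abstract does not name this algorithm, the function names are hypothetical, and the sharing of a full adder between neighbouring stages is not modeled.

```python
def division_stage(remainder, quotient, dividend_bit, divisor):
    """One restoring-division iteration: shift in a bit, then try to subtract."""
    remainder = (remainder << 1) | dividend_bit
    if remainder >= divisor:
        return remainder - divisor, (quotient << 1) | 1
    return remainder, quotient << 1

def divide_unrolled(dividend, divisor, width):
    """Chain `width` stages; each stage consumes the previous stage's output."""
    if divisor <= 0 or dividend < 0 or dividend >> width:
        raise ValueError("need divisor > 0 and 0 <= dividend < 2**width")
    remainder, quotient = 0, 0
    for i in range(width - 1, -1, -1):   # most significant dividend bit first
        bit = (dividend >> i) & 1
        remainder, quotient = division_stage(remainder, quotient, bit, divisor)
    return quotient, remainder
```

In hardware, the loop body would be replicated `width` times and wired in series, which is the "one after the other" arrangement the abstract describes.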
Abstract:
A structure for simplifying the programmable memory-to-logic interface in FPGAs is proposed. The interface isolates the general-purpose routing architecture used for intra-PLB (programmable logic block) routing from the RAM address, data and control lines. The programmable logic blocks and the input-output resources of the FPGA access the embedded memory or RAM using dedicated direct interconnects. Most of these direct interconnects originate from programmable logic blocks in the vicinity of the RAM. The rest run between the input-output (IO) pads/routing and the RAM blocks. A dedicated bus-routing architecture is provided to combine the memories so that they emulate larger RAM blocks. This bus routing is devoted to interconnection among RAM blocks and is isolated from the PLB routing resources.
Abstract:
A digital frequency divider has a single circulating shift register loaded with a bit sequence of variable length and having two adjacent outputs (A, B) such that one output is equal to the other delayed by one clock period. The outputs (A, B) are passed to a multiplexer (6) via further logic, the multiplexer selecting one of two inputs (X, Y) depending on whether a clock is high or low. Program logic (40) is provided so that the circuit is configurable for odd, even or half-integer division by detecting changes in the bit sequence between 0 and 1 and selectively "deleting" the first half clock cycle when a change is detected. This allows even, odd or half-integer clock division with an "even" mark-space ratio.
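A small behavioural model in Python can make the half-cycle "deletion" concrete. Everything specific here is an assumption, since the abstract does not define the "further logic": the model takes X = A AND B when deletion is enabled (X = A otherwise), takes Y = A, treats the clock-high phase as the first half of each cycle, and covers only the even and odd cases. The function name and parameters are hypothetical.

```python
def divider_waveform(sequence, cycles, delete_on_rise):
    """Return the output sampled every half clock cycle (clock high, then low)."""
    n = len(sequence)
    samples = []
    for t in range(cycles):
        a = sequence[t % n]          # tap A of the circulating register
        b = sequence[(t - 1) % n]    # tap B: A delayed by one clock period
        # Assumed logic: when enabled, suppress the first half cycle after 0 -> 1.
        x = (a & b) if delete_on_rise else a
        y = a
        samples.extend([x, y])       # multiplexer: X while clock high, Y while low
    return samples

# Divide by 3: the register holds 1,1,0 and the first half cycle of each high run is removed.
odd = divider_waveform([1, 1, 0], cycles=3, delete_on_rise=True)
# Divide by 4: the register holds 1,1,0,0 and no deletion is needed.
even = divider_waveform([1, 1, 0, 0], cycles=4, delete_on_rise=False)
```

Under these assumptions the divide-by-3 output is high for three of every six half cycles, giving an equal mark-space ratio even though the division ratio is odd.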
Abstract:
This invention relates to a synthesizable, synchronous static RAM comprising custom-built memory cells and a semi-custom IO/precharge section in the form of a bit slice, a semi-custom decoder connected to said bit slice, and a semi-custom control clock generation section, which is connected to said semi-custom decoder and IO section. The arrangement provides high-speed access, easy testability and asynchronous initialization capabilities while reducing design time, in a size that is significantly smaller than existing semi-custom or standard-cell-based memory designs.