Abstract:
A vector-matrix multiplier unit fully utilizes a 128x128b data path for operand sizes from 8 to 128b and operand types including signed, unsigned or complex, and fixed-, floating-point, polynomial, or Galois-field while maintaining full internal precision. The present disclosure provides a system and method for improving the performance of general-purpose processor, by implementing a functional unit that computes the product of a matrix operand with a vector operand, producing a vector result. The functional unit fully utilizes the entire resources of a 128b by 128b multipliers regardless of the operand size, as the number of elements of the matrix and vector operands increase as operand size is reduced. The unit performs both fixed-point and floating-point multiplications and additions with the highest-possible intermediate accuracy with modest resources.
Abstract:
A method and data processing system for transferring data between the system and a memory system using more than one byte ordering convention by incorporating byte order information into instruction codes. The byte order information is coupled to a control unit along with other information characterizing the data transfer operation. In response to the byte order information and the data transfer operation information, the control unit generates a control signal that is coupled to a BPU. The control signal causes the BPU to rearrange the order of bytes in the data being transferred when the byte order information indicates a first byte ordering format. When the byte order information indicates a second byte ordering format, the BPU does not change the order of the bytes in the data being transferred.
Abstract:
A general purpose, programmable media processor (12) for processing and transmitting a media data streams. The media processor (12) incorporates an execution unit (100) that maintains substantially peak data throughout of media data streams. The execution unit (100) includes a dynamically partionable multi-precision arithmetic unit (102), programmable switch (104) and programmable extended mathematical element (106). A high bandwidth external interface (124) supplies media data streams at substantially peak rates to a general purpose register file (110) and the execution unit. A memory management unit, and instruction and data cache/buffers (118, 120). The general purpose, programmable media processor (12) is disposed in a network fabric consisting of fiber optic cable, coaxial cable and twisted pair wires to transmit, process and receive single or unified media data streams.
Abstract:
The present invention encompasses techniques for reducing digital noise in integrated circuits and circuit assemblies, particularly dense mixed-signal integrated circuits, based upon shaping the noise from the digital circuit and concentrating it in a single, or a small number, of parts of the frequency spectrum. Generally, the presence of noise in the analog circuit is less important at certain frequencies, and therefore the spectral peak or peaks from the digital circuit can be carefully placed to result in little or no interference. As an example, a radio receiver might be designed such that the peaks of the digital noise lie between received channels, outside the band edges of each.
Abstract:
In a lithographical tool utilizing off-axis illumination, masks to provide increased depth of focus and minimize CD differences between certain features are disclosed. A first mask for reducing proximity effects between isolated and densely packed features and increasing depth of focus (DOF) of isolated features is disclosed. The first mask comprises additional lines (214) referred to as scattering bars, disposed next to isolated edges. The bars are spaced a distance from isolated edges such that isolated and densely packed edge gradients substantially match so that proximity effects become negligible. The width of the bars is set so that a maximum DOF range for the isolated feature is achieved. A second mask, that is effective with quadrupole illumination only, is also disclosed. This mask "boosts" intensity levels and consequently DOF ranges for smaller square contacts so that they approximate intensity levels and DOF ranges of larger elongated contacts. Increasing the intensity levels in smaller contacts reduces critical dimension differences between variably sized contact patterns when transferred to a resist layer. The second mask comprises additional openings, referred to as anti-scattering bars, disposed about the square contact openings. The amount of separation between the edge of the smaller contact and the anti-scattering bars determines the amount of increased intensity. The width of the anti-scattering bars determines the amount of increase in DOF range. Both scattering bar and anti-scattering bars are designed to have widths significantly less than the resolution of the exposure tool so that they do not produce a pattern during exposure of photoresist.
Abstract:
The present invention describes a bias potential distribution system which provides bias potentials to MOS devices while ensuring the devices' operating conditions remain constant over temperature, process, and power supply fluctuations. Further, bias potentials are generated at one main location within the logic circuit and then distributed throughout the logic circuit to all of the MOS devices or to bias voltage conversion circuits.
Abstract:
The present invention provides a system and method for improving the performance of general-purpose processors by implementing a functional unit that computes the product of a matrix operand with a vector operand, producing a vector result. The functional unit fully utilizes the entire resources of a 128b by 128b multipliers regardsless of the operand size, as the number of elements of the matrix and vector operands increase as operand size is reduced. The unit performs both fixed-point and floating-point multiplications and additions with the highest-possible intermediate accuracy with modest resources.
Abstract:
A floating-point instruction having incorporated floating point information and a method and system for implementing the floating-point instruction in a computer system is described. The floating point information indicates whether an exception trap should occur and the type of rounding to be performed upon "inexact" arithmetic results. The floating point information further indicates whether other floating-point exception traps should occur. This information allows dynamic (e.g. instruction-by-instruction) modification of various operating parameters of the CPU without modifying information in status registers using special instructions or modes, thereby increasing overall CPU performance. The technique is also supported by several mechanisms for providing precise floating-point exceptions.
Abstract:
A method and system for performing arbitrary permutations of sequences of elements. In the general case, the method of the present invention processes the elements to be permuted as a multi-dimensional array, where each element in the array corresponds to one of the elments to be permuted. The permutation is achieved by performing a sequence of sets of permutations, where each set of permutations in the sequence independently permutes the elements within each one-dimensional slice through the array, along some particular dimension of the array. The total number of sets of permutations, or stages, is one less than twice the number of dimensions in the array. An extension to the general method allows some extensions of permutations which involve the copying of individual elements. A system based on the extended general method implements a large class of operations which involve copying and/or permuting elements, where the sequence of elements is a word of data and the elements are bits of data. An efficient control structure for the system permits control signals to be shared across slices of the array. A version of the system based on a two-dimensional array includes three multiplex stages, where the first stage multiplexes along the rows, the second stage multiplexes along the columns, and the third stage multiplexes across the rows once again. Several classes of computer instructions which generally involve the copying and/or permuting of data are also described.
Abstract:
A method for synthesizing correction features for an entire mask pattern that initially divides mask pattern data into tiles of data - each tile representing an overlapping section of the original mask pattern. Each of the tiles of data is sequentially processed through correction feature synthesis phases - each phase synthesizing a different type of correction feature. All of the correction features are synthesized for a given tile before synthesizing the correction features for the next tile. Each correction feature synthesis phase formats the data stored in the tile into a representation that provides information needed to synthesize the correction feature for the given phase. Methods for implementing edge bar and serif correction features synthesis phases are also described. The method for synthesizing external type edge bars is performed by oversizing feature data in the tile by an amount equal to the desired spacing of the external edge bar, formatting the oversized data into an edge representation and expanding each of the edges in the edge representation of the oversized data into edge bars having a predetermined width. Internal type of edge bars for the tile are synthesized by initially inverting feature data and then performing the same steps as for generating the external edge bars. The method for serif synthesis is performed by initially formatting tile data into a vertex representation, eliminating certain of the vertices not requiring serifs, synthesizing a positive serif for each convex corner and a negative vertex for each concave corner, and eliminating any disallowed serifs. Internal bars and negative serifs are "cut-out" of original tile data by performing geometric Boolean operations and external bars and positive serifs are concatenated with the "cut-out" tile data, equivalent to performing a geometric OR operation.