Abstract:
PROBLEM TO BE SOLVED: To provide vector instructions to enable efficient synchronization and parallel reduction operations. SOLUTION: In one embodiment, a processor may include: a vector unit to perform operations on multiple data elements in response to a single instruction; and a control unit coupled to the vector unit to provide the data elements to the vector unit. The control unit enables an atomic SIMD operation to be performed on at least some of the data elements in response to a first vector instruction to be executed under a first mask and a second vector instruction to be executed under a second mask. Other embodiments are described and claimed.
Abstract:
PROBLEM TO BE SOLVED: To provide a texture unit that optimizes texture filtering.SOLUTION: The efficiency of communication between a processor core and a texture unit may be improved by reducing the computational overhead caused by the core in encoding groups of pixels to be textured. A region or group of pixels may be textured as a unit, using a range specifier and one or more anchor pixels to define the group. In some embodiments, processing efficiency of grouped pixels is improved.
Abstract:
PROBLEM TO BE SOLVED: To provide a texture unit that optimizes texture filtering and performs texture filtering faster than a general purpose processor. SOLUTION: A region or group of pixels may be textured as a unit, using a range specifier and one or more anchor pixels to define the group. Processing grouped pixels improves efficiency. COPYRIGHT: (C)2011,JPO&INPIT
Abstract:
Embodiments of systems, apparatuses, and methods for performing an align instruction in a computer processor are described. In some embodiments, the execution of an align instruction causes data elements of two concatenated sources to be selectively stored in a destination.
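The align behavior described above can be illustrated with a minimal scalar sketch. The abstract does not specify element ordering, how the shift amount is encoded, or how a writemask interacts with the destination, so the function name `align`, the `offset` parameter, and the merge-style `mask` handling are all assumptions made for illustration only.

```python
def align(src_hi, src_lo, offset, mask=None, dest=None):
    """Hypothetical model of an align operation (names and semantics assumed).

    Concatenates two equal-length sources (src_lo supplies the low elements),
    skips `offset` elements from the low end, and returns the next n elements.
    If a mask is given, unselected positions keep the value from `dest`.
    """
    n = len(src_lo)
    if len(src_hi) != n:
        raise ValueError("sources must have the same number of elements")
    if not 0 <= offset <= n:
        raise ValueError("offset must be between 0 and the vector length")

    concat = list(src_lo) + list(src_hi)   # element 0 is the lowest element
    window = concat[offset:offset + n]     # the n elements selected by the shift

    if mask is None:
        return window
    if dest is None:
        dest = [0] * n                     # assumed default for unselected positions
    return [w if m else d for w, m, d in zip(window, mask, dest)]


shifted = align([4, 5, 6, 7], [0, 1, 2, 3], 1)
masked = align([4, 5, 6, 7], [0, 1, 2, 3], 1,
               mask=[1, 0, 1, 0], dest=[9, 9, 9, 9])
```

Here `shifted` holds the four elements starting one position into the concatenation, and `masked` shows the same window written only where the assumed mask is set.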
Abstract:
A system and method are configured to detect conflicts when converting scalar processes to parallel processes ("SIMDifying"). Conflicts may be detected for an unordered single index, an ordered single index and/or ordered pairs of indices. Conflicts may be further detected for read-after-write dependencies. Conflict detection is configured to identify operations (i.e., iterations) in a sequence of iterations that may not be done in parallel.
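As a rough illustration of the unordered single-index case, the sketch below compares each lane's index with the indices of all earlier lanes and records matches as a bitmask. The function names and the bitmask output format are assumptions; the abstract does not describe how conflicts are reported, and the ordered and read-after-write cases are not modeled here.

```python
def detect_conflicts(indices):
    """For each lane, return a bitmask of earlier lanes that use the same index.

    A nonzero entry means the lane cannot safely run in parallel with the
    earlier lanes flagged in its bitmask (assumed reporting format).
    """
    conflicts = []
    for i, idx in enumerate(indices):
        bits = 0
        for j in range(i):
            if indices[j] == idx:
                bits |= 1 << j             # lane j touches the same location
        conflicts.append(bits)
    return conflicts


def conflict_free_mask(indices):
    """Lanes with no earlier duplicate can be processed together in one pass."""
    return [c == 0 for c in detect_conflicts(indices)]


lanes = [3, 5, 3, 3]
per_lane = detect_conflicts(lanes)
first_pass = conflict_free_mask(lanes)
```

In this example lanes 0 and 1 are conflict-free, while lanes 2 and 3 reuse index 3 and would have to be deferred to a later pass.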
Abstract:
Embodiments of systems, apparatuses, and methods for performing an expand and/or compress instruction in a computer processor are described. In some embodiments, the execution of an expand instruction selects elements from a source to be sparsely stored in a destination based on values of a writemask, and stores each selected data element of the source as a sparse data element in a destination location, wherein the destination locations correspond to the writemask bit positions that indicate that the corresponding data element of the source is to be stored.
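A minimal scalar model of the two operations may help make the data movement concrete. The expand direction follows the description above; the compress direction is modeled as its inverse, and the choice to leave unselected destination positions unchanged is an assumption, as are the function names.

```python
def expand(src, mask, dest):
    """Place consecutive source elements into the positions where mask is set.

    Positions whose mask bit is clear keep their previous destination value
    (an assumed policy; the abstract does not state it).
    """
    out = list(dest)
    k = 0                                  # next consecutive source element
    for i, m in enumerate(mask):
        if m:
            out[i] = src[k]
            k += 1
    return out


def compress(src, mask, dest):
    """Pack the source elements selected by mask into consecutive positions."""
    out = list(dest)
    k = 0                                  # next consecutive destination slot
    for i, m in enumerate(mask):
        if m:
            out[k] = src[i]
            k += 1
    return out


sparse = expand([10, 20, 30, 40], [0, 1, 0, 1], [0, 0, 0, 0])
packed = compress([10, 20, 30, 40], [0, 1, 0, 1], [0, 0, 0, 0])
```

With the same mask, `expand` scatters the first two source elements to positions 1 and 3, while `compress` gathers the elements at positions 1 and 3 into positions 0 and 1.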
Abstract:
A region or group of pixels may be textured as a unit, using a range specifier and one or more anchor pixels to define the group. In some embodiments, processing grouped pixels improves efficiency.
Abstract:
Methods and apparatuses for error correction are described. An N-bit block of data to be stored in a memory device is received. The memory device neither performs any error correction code (ECC) algorithm nor provides designated ECC storage for the N-bit block of data. Data compression is applied to the N-bit block to generate an M-bit compressed block of data. A K-bit ECC is computed for the M-bit compressed block, wherein M+K is less than or equal to N. The M-bit compressed block and the K-bit ECC are stored together in the memory device.
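The constraint M+K ≤ N can be checked numerically once a particular code is chosen. The abstract does not name an ECC scheme, so the sketch below assumes a Hamming code extended with one overall parity bit (SECDED) purely as an example; the function names are hypothetical.

```python
def secded_check_bits(m):
    """Check bits K for an assumed SECDED Hamming code over m data bits.

    A Hamming code needs the smallest r with 2**r >= m + r + 1; one extra
    overall parity bit adds double-error detection.
    """
    r = 0
    while (1 << r) < m + r + 1:
        r += 1
    return r + 1


def fits_in_block(n, m):
    """Return (fits, k): whether m compressed bits plus k ECC bits fit in n bits."""
    k = secded_check_bits(m)
    return m + k <= n, k


ok, k = fits_in_block(512, 448)            # block compressed from 512 to 448 bits
too_big, k2 = fits_in_block(512, 510)      # compression saved only 2 bits
```

Under this assumed code, a 512-bit block compressed to 448 bits leaves room for the check bits, whereas a block that compresses to only 510 bits does not, so it could not be protected this way.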
Abstract:
In one embodiment, a processor may include a vector unit to perform operations on multiple data elements responsive to a single instruction, and a control unit coupled to the vector unit to provide the data elements to the vector unit, where the control unit is to enable an atomic vector operation to be performed on at least some of the data elements responsive to a first vector instruction to be executed under a first mask and a second vector instruction to be executed under a second mask. Other embodiments are described and claimed.
Abstract:
Embodiments of systems, apparatuses, and methods for performing an align instruction in a computer processor are described. In some embodiments, executing the align instruction causes data elements of two concatenated sources to be selectively stored in a destination.