Abstract:
Programs often require constants that cannot be encoded in a native instruction format, such as 32-bits. To provide an extended constant, an instruction packet is formed with constant extender information and a target instruction. The constant extender information encoded as a constant extender instruction provides a first set of constant bits, such as 26-bits for example, and the target instruction provides a second set of constant bits, such as 6-bits. The first set of constant bits are combined with the second set of constant bits to generate an extended constant for execution of the target instruction. The extended constant may be used as an extended source operand, an extended address for memory access instructions, an extended address for branch type of instructions, and the like. Multiple constant extender instructions may be used together to provide larger constants than can be provided by a single extension instruction.
Abstract:
A system and method to manage a translation lookaside buffer (TLB) is disclosed. In a particular embodiment, a method of managing a first TLB includes in response to starting execution of a memory instruction, setting a first field associated with an entry of the first TLB to indicate use of the entry. The method also includes setting a second field to indicate that the entry in the first TLB matches a corresponding entry in a second TLB.
Abstract:
In a particular embodiment, a circuit device includes a translation look-aside buffer (TLB) configured to receive a virtual address and to translate the virtual address to a physical address of a cache having at least two partitions. The circuit device also includes a control logic circuit adapted to identify a partition replacement policy associated with the identified one of the at least two partitions based on a partition indicator. The control logic circuit controls replacement of data within the cache according to the identified partition replacement policy in response to a cache miss event.
Abstract:
In a particular embodiment, a method is disclosed that includes receiving a first sense output and a second sense output of a sense amplifier at a first tri-state device coupled to a first bus, receiving the first sense output and the second sense output of the sense amplifier at a second tri-state device coupled to a second bus, and selectively activating one of the first tri-state device and the second tri-state device to drive the first bus or the second bus in response to a bus selection input.
Abstract:
Systems and methods of data extraction in a vector processor are disclosed. In a particular embodiment a method of data extraction in a vector processor includes copying at least one data element to a source register of a permutation network. The method includes reordering multiple data elements of the source register, populating a destination register of the permutation network with the reordered data elements, and copying the reordered data elements from the destination register to a memory.
Abstract:
In a particular embodiment, a method is disclosed that includes receiving a first sense output and a second sense output of a sense amplifier at a first tri-state device coupled to a first bus, receiving the first sense output and the second sense output of the sense amplifier at a second tri-state device coupled to a second bus, and selectively activating one of the first tri-state device and the second tri-state device to drive the first bus or the second bus in response to a bus selection input.
Abstract:
An apparatus includes one or more registers configured to store a vector of input values. The apparatus also includes a coefficient determination unit configured to, responsive to execution by a processor of a single instruction, select a plurality of piecewise analysis coefficients. The plurality of piecewise analysis coefficients includes one or more sets of piecewise analysis coefficients, and each set of piecewise analysis coefficients corresponds to an input value of the vector of input values. The apparatus further includes arithmetic logic circuitry configured to, responsive to the execution of at least the single instruction, determine estimated output values of a function based on the plurality of piecewise analysis coefficients and the vector of input values.
Abstract:
Systems and methods relate to a mixed-width single instruction multiple data (SIMD) instruction which has at least a source vector operand comprising data elements of a first bit-width and a destination vector operand comprising data elements of a second bit-width, wherein the second bit-width is either half of or twice the first bit-width. Correspondingly, one of the source or destination vector operands is expressed as a pair of registers, a first register and a second register. The other vector operand is expressed as a single register. Data elements of the first register correspond to even-numbered data elements of the other vector operand expressed as a single register, and data elements of the second register correspond to data elements of the other vector operand expressed as a single register.
Abstract:
In a particular embodiment, a method includes executing a vector instruction at a processor. The vector instruction includes a vector input that includes a plurality of elements. Executing the vector instruction includes providing a first element of the plurality of elements as a first output. Executing the vector instruction further includes performing an arithmetic operation on the first element and a second element of the plurality of elements to provide a second output. Executing the vector instruction further includes storing the first output and the second output in an output vector.