Abstract:
A method includes reading, by a processor, one or more configuration values from a storage device or a memory management unit. The method also includes loading the one or more configuration values into one or more registers of the processor. The one or more registers are useable by the processor to perform address translation.
Abstract:
Systems and methods for branch prediction, including predicting evaluation of a producer instruction (102) such as a compare instruction, by encoding a prediction field (102p) in the producer instruction, and predicting evaluation (107, using 104, 106) of the producer instruction by using the encoded prediction field. A consumer instruction such as a conditional branch instruction predicated on the producer instruction can be speculatively executed based on the predicted evaluation of the producer instruction. The producer instruction is executed in an execution pipeline (112) to determine an actual evaluation (113) of the producer instruction, and if necessary, the prediction field is updated by update logic based on the actual evaluation and the predicted evaluation. The producer instruction can be updated in memory (108) with the updated prediction field.
Abstract:
A method includes executing a branch instruction and determining if a branch is taken. The method further includes evaluating a number of instructions associated with the branch instruction. Upon determining that the branch is taken, the method includes selectively writing an entry into a branch target buffer that corresponds to the taken branch responsive to determining that the number of instructions is less than a threshold.
Abstract:
Systems and method for configuring a page-based memory device without pre-existing dedicated metadata. The method includes reading metadata from a metadata portion of a page of the memory device, and determining a characteristic of the page based on the metadata. The memory device may be configured as a cache. The metadata may include address tags, such that determining the characteristic may include determining if desired information is present in the page, and reading the desired information if it is determined to be present in the page. The metadata may also include error-correcting code (ECC), such that determining the characteristic may include detecting errors present in data stored in the page. The metadata may further include directory information, memory coherency information, or dirty/valid/lock information.
Abstract:
Methods, apparatuses, and computer-readable storage media are disclosed for reducing power by reducing hardware-thread toggling in a multi-threaded processor. In a particular embodiment, a method allocates software threads to hardware threads. A number of software threads to be allocated is identified. It is determined when the number of software threads is less than a number of hardware threads. When the number of software threads is less than the number of hardware threads, at least two of the software threads are allocated to non-sequential hardware threads. A clock signal to be applied to the hardware threads is adjusted responsive to the non-sequential hardware threads allocated.
Abstract:
An apparatus and method to translate virtual addresses to physical addresses in a base plus offset addressing mode are disclosed. In an embodiment, a method includes performing a first translation lookaside buffer (TLB) lookup based on a base address value to retrieve a speculative physical address. While performing the TLB lookup based on the base address value, the base address value is added to an offset value to generate an effective address value. The method also includes performing a comparison of the base address value and the effective address value based on a variable page size to determine whether the speculative physical address corresponds to the effective address.
Abstract:
Systems and methods to perform fast rotation operations are disclosed. In a particular embodiment, a method includes executing a single instruction. The method includes receiving first data indicating a first coordinate and a second coordinate, receiving a first control value that indicates a first rotation value selected from a set of ninety degree multiples, and writing output data corresponding to the first data rotated by the first rotation value.
Abstract:
In a particular embodiment, a method is disclosed that includes receiving an instruction packet including a first instruction and a second instruction that is dependent on the first instruction at a processor having a plurality of parallel execution pipelines, including a first execution pipeline and a second execution pipeline. The method further includes executing in parallel at least a portion of the first instruction and at least a portion of the second instruction. The method also includes selectively committing a second result of executing the at least a portion of the second instruction with the second execution pipeline based on a first result related to execution of the first instruction with the first execution pipeline.
Abstract:
Techniques for the design and use of a digital signal processor, including (but not limited to) for processing transmissions in a communications (e.g., CDMA) system. The method and system improve software instruction debugging operations by capturing real-time information relating to software execution flow and include and instructions and circuitry for operating a core processor process within a core processor. A non-intrusive debugging process operates within a debugging mechanism of a digital signal processor. Non-intrusively monitoring in real time predetermined aspects of software execution occurs with the core processing process and occurs in real-time on the processor. An embedded trace macrocell records selectable aspects of the non-intrusively monitored software execution and generates at least one breakpoint in response to events arising within the selectable aspects of the non-intrusively monitored software execution. The present disclosure controls aspects of the non-intrusive debugging process in response to at least one breakpoint.
Abstract:
An apparatus includes a primary hypervisor that is executable on a first set of processors and a secondary hypervisor that is executable on a second set of processors. The primary hypervisor may define settings of a resource and the secondary hypervisor may use the resource based on the settings defined by the primary hypervisor. For example, the primary hypervisor may program memory address translation mappings for the secondary hypervisor. The primary hypervisor and the secondary hypervisor may include their own schedulers.