Abstract:
A scheduling unit is described for scheduling an execution order of a first instruction of a first type (503) and a second instruction of a second type (506) in an instruction stream where the second instruction precedes the first instruction. The scheduling unit comprises a table that records address component identifiers corresponding to the second instructions (503). An address comparator is coupled to the table. The address comparator compares address component identifiers that correspond to the first instruction with the address component identifiers in the table (504). The scheduling unit schedules the first instruction to be executed ahead of the second instruction when the address component identifiers of the first instruction differ from those in the table (506).
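The table-and-comparator check described above can be sketched as follows. This is a minimal illustration, not the claimed circuit: the class name, the method names, and the choice of a (base register, offset) pair as the "address component identifiers" are all assumptions made for the example.

```python
class SchedulingTable:
    """Records address component identifiers of pending second-type instructions."""

    def __init__(self):
        self.entries = []

    def record(self, components):
        # components: hypothetical (base register, offset) pair
        self.entries.append(components)

    def may_execute_ahead(self, components):
        # Comparator step: the first instruction may be scheduled ahead only
        # if its identifiers differ from every entry recorded in the table.
        return all(components != entry for entry in self.entries)


table = SchedulingTable()
table.record(("r1", 8))
table.record(("r2", 0))
hoist_ok = table.may_execute_ahead(("r3", 8))       # no matching entry
hoist_blocked = table.may_execute_ahead(("r1", 8))  # matches a recorded entry
```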
Abstract:
A thin-film waveguide lens and a wavelength division multiplexer/demultiplexer (1) embodying such a lens. The thin-film waveguide lens comprises a thin-film waveguide with end planes essentially normal to the lens axis. A plano-convex overlay layer integral with the waveguide extends the length of the waveguide along the axis, its profile being selected so as to produce a graded effective refractive index in the lens in order to collimate focussed rays and focus collimated rays entering at one end plane for substantially collimated or focussed arrival at the other end plane. The multiplexer/demultiplexer (1) comprises the above thin-film waveguide lens, with one of the end planes constituting an entrance/exit plane (A A') for receiving optical fibres (F1, F2, F3) in an abutting connection, and the other end plane (B B') bearing a diffraction/reflection grating (5). In accordance with the preferred embodiments, the shape of the plano-convex overlay layer is such as to provide an effective index of refraction profile in the waveguide according to the formula n = n0 sech(gr), where n is the effective refractive index at the distance r from the axis of the waveguide, n0 is the effective refractive index at the waveguide axis, and g is a constant equal to pi/(2f), where f is the focal length of the lens, being essentially the distance between end planes.
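The index profile above can be evaluated numerically. The sketch below reads the constant as g = pi/(2f), which is the dimensionally consistent reading since g must have units of inverse length. The function name and the sample values of n0 and f are illustrative assumptions, not figures from the abstract.

```python
import math


def effective_index(r, n0, f):
    """Effective refractive index n = n0 * sech(g * r), with g = pi / (2 * f)."""
    g = math.pi / (2.0 * f)
    # sech(x) = 1 / cosh(x); Python's math module has no sech function.
    return n0 / math.cosh(g * r)


# Illustrative values only: n0 = 1.5 on the axis, focal length f = 10 (arbitrary units).
on_axis = effective_index(0.0, 1.5, 10.0)
off_axis = effective_index(5.0, 1.5, 10.0)
```

The index is highest on the axis and falls off symmetrically with distance from it, which is what bends off-axis rays back toward the axis.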
Abstract:
A method and an apparatus for translating a virtual address into a physical address in a multiple-region virtual memory environment. In one embodiment, a translation lookaside buffer (TLB) is configured to provide page table entries to build a physical address. The TLB is supplemented with a virtual hash page table (VHPT) to provide TLB entries when TLB misses occur. An alternate software replacement scheme may be utilized on a per-region basis instead of the default page table walk of the VHPT, with a dedicated bit associated with each particular region of the disclosed virtual address space. A VHPT walk is performed only if the particular bit for the particular region and a master enable bit are both enabled. Otherwise, the alternate software replacement routine is performed to provide TLB replacements when TLB misses occur.
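The miss-handling decision described above reduces to a two-bit test. A minimal sketch follows; the function name, the return labels, and the use of a list to hold the per-region bits are assumptions made for illustration.

```python
def select_miss_handler(region, master_enable, region_enable_bits):
    """Choose how a TLB miss in the given region is serviced."""
    # The VHPT walk runs only when BOTH the master enable bit and the
    # bit dedicated to this region are set.
    if master_enable and region_enable_bits[region]:
        return "vhpt_walk"
    # Otherwise the alternate software replacement routine runs.
    return "software_replacement"


# Hypothetical configuration: four regions, walker enabled for regions 0 and 2.
region_bits = [True, False, True, False]
choice_walk = select_miss_handler(0, True, region_bits)
choice_region_off = select_miss_handler(1, True, region_bits)
choice_master_off = select_miss_handler(0, False, region_bits)
```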
Abstract:
The present invention provides a method and apparatus for restoring a predicate register set. One embodiment of the invention includes decoding a first instruction ("Instruction set") which specifies a restoring operation to be performed on a predicate register set. In response to the first instruction, a mask is used to select a plurality of predicate registers that are to be restored ("Mask"). The mask of the present invention consists of a first set of bits, with each bit of the first set of bits corresponding to a register in the predicate register set. When a bit of the first set of bits is set to one, the predicate register corresponding to that bit is restored. In one embodiment, the mask further includes one bit corresponding to a plurality of registers in the predicate register set, wherein when that bit is set to one, the plurality of registers corresponding to that bit are restored ("Restore Unit").
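The masked restore can be sketched as below. The abstract does not give register counts or a bit layout, so the sizes here (64 predicate registers, 16 individually selectable, one group bit covering the rest) are assumptions chosen only to make the example concrete, as are all names.

```python
NUM_PREDICATES = 64   # assumed size of the predicate register set
NUM_INDIVIDUAL = 16   # assumed number of registers with their own mask bit
GROUP_BIT = NUM_INDIVIDUAL  # assumed position of the single group bit


def restore_predicates(current, saved, mask):
    """Return a new register set with the mask-selected registers restored."""
    result = list(current)
    # First set of bits: one mask bit per register.
    for i in range(NUM_INDIVIDUAL):
        if (mask >> i) & 1:
            result[i] = saved[i]
    # One further bit restores a whole group of registers at once.
    if (mask >> GROUP_BIT) & 1:
        result[NUM_INDIVIDUAL:] = saved[NUM_INDIVIDUAL:]
    return result


current = [0] * NUM_PREDICATES
saved = [1] * NUM_PREDICATES
individual = restore_predicates(current, saved, 0b101)        # registers 0 and 2
grouped = restore_predicates(current, saved, 1 << GROUP_BIT)  # registers 16..63
```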
Abstract:
A method and apparatus for handling branch instructions contained within a source program includes applying a set of heuristics to classify each of the branch instructions in the source program as either a hard-to-predict type or a simple type of branch. A system implements a multi-heuristic branch predictor (21) comprising a large, relatively simple branch predictor (23) having many entries, to accommodate the majority of branch instructions encountered in a program, and a second, relatively small, sophisticated branch predictor (24) having a few entries. The sophisticated branch predictor (24) predicts the target addresses of the hard-to-predict branches. By mapping hard-to-predict branches to the sophisticated branch predictor (24), and easy-to-predict branches to the relatively simple branch predictor (23), overall performance is enhanced.
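The routing idea above, where each branch is steered to one of two predictors according to its classification, can be sketched as follows. The abstract does not describe the internals of either predictor, and it states that the sophisticated predictor predicts target addresses. The sketch instead predicts taken/not-taken direction to stay short, using 2-bit saturating counters for the simple predictor and a per-branch history table for the sophisticated one. Both designs, and every name, are assumptions.

```python
class SimplePredictor:
    """Large table of 2-bit saturating counters (assumed design)."""

    def __init__(self, entries=1024):
        self.counters = [1] * entries  # 0-1: predict not taken, 2-3: taken

    def predict(self, pc):
        return self.counters[pc % len(self.counters)] >= 2

    def update(self, pc, taken):
        i = pc % len(self.counters)
        c = self.counters[i]
        self.counters[i] = min(3, c + 1) if taken else max(0, c - 1)


class HistoryPredictor:
    """Small predictor keyed by (pc, recent outcomes) (assumed design)."""

    def __init__(self, history_bits=2):
        self.history_mask = (1 << history_bits) - 1
        self.history = {}
        self.table = {}

    def predict(self, pc):
        return self.table.get((pc, self.history.get(pc, 0)), 1) >= 2

    def update(self, pc, taken):
        h = self.history.get(pc, 0)
        c = self.table.get((pc, h), 1)
        self.table[(pc, h)] = min(3, c + 1) if taken else max(0, c - 1)
        self.history[pc] = ((h << 1) | int(taken)) & self.history_mask


class MultiHeuristicPredictor:
    """Routes each branch to a predictor based on its classification."""

    def __init__(self, hard_pcs):
        self.hard_pcs = set(hard_pcs)  # branches classified hard-to-predict
        self.simple = SimplePredictor()
        self.sophisticated = HistoryPredictor()

    def select(self, pc):
        return self.sophisticated if pc in self.hard_pcs else self.simple

    def predict(self, pc):
        return self.select(pc).predict(pc)

    def update(self, pc, taken):
        self.select(pc).update(pc, taken)


predictor = MultiHeuristicPredictor(hard_pcs=[0x40])
# Train the hard branch on an alternating taken / not-taken pattern.
for step in range(8):
    predictor.update(0x40, step % 2 == 0)
```

In this sketch an alternating branch defeats a lone 2-bit counter but is learned by the history-keyed table, which is the kind of case that motivates reserving the small, sophisticated predictor for hard branches.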
Abstract:
A processor and method that reduce the instruction fetch penalty in the execution of a program sequence of instructions comprise a branch predict instruction that is inserted into the program at a location which precedes the branch. The branch predict instruction has an opcode that specifies a branch as likely to be taken or not taken and that also specifies a target address of the branch. A block of target instructions (18), starting at the target address, is prefetched into the instruction cache of the processor so that the instructions are available for execution prior to the point in the program where the branch is encountered. Also specified by the opcode is an indication of the size of the block of target instructions, and a trace vector of a path in the program sequence that leads to the target from the branch predict instruction for better utilization of the limited memory bandwidth.
Abstract:
According to one aspect of the invention, a machine-readable medium having stored thereon data representing sequences of instructions is described. When executed by a computer system, the sequences of instructions cause the computer system to perform a series of steps. One of these steps involves preloading one of a set of registers with data retrieved from a memory starting at a first address. Another of these steps involves storing memory conflict information representing the first address. This memory conflict information is later used for determining if a memory conflict has occurred. Another of these steps involves storing data at a second address in the memory. Yet another of these steps involves determining if a memory conflict has occurred between the first address and the second address using the previously stored memory conflict information. If a memory conflict occurred between the first and second addresses, then one of the registers is reloaded with the data located at the first address. However, if a memory conflict did not occur between the first and second addresses, then the memory conflict information is left for use during subsequent memory conflict checks. According to one embodiment of the invention, the data is reloaded into a register by causing the computer system to branch to recovery code. According to another embodiment of the invention, the data is reloaded into a register without performing any branch instructions.
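The preload, store, and check steps above can be sketched as a small software model. It treats a conflict as an exact address match, which is an assumption. The abstract does not say how addresses are compared, nor what happens to the conflict information after a reload, so this model simply keeps it. All names are hypothetical.

```python
class ConflictTracker:
    """Toy model of preloading a register and checking for memory conflicts."""

    def __init__(self, memory):
        self.memory = memory      # dict: address -> value
        self.registers = {}       # register name -> value
        self.conflict_info = {}   # register name -> address it was preloaded from

    def preload(self, reg, addr):
        self.registers[reg] = self.memory[addr]
        self.conflict_info[reg] = addr  # remember the first address

    def store(self, addr, value):
        self.memory[addr] = value

    def check(self, reg, store_addr):
        """Reload reg if the store address conflicts with its preload address."""
        if self.conflict_info.get(reg) == store_addr:
            self.registers[reg] = self.memory[store_addr]
            return True
        # No conflict: the stored information stays for later checks.
        return False


tracker = ConflictTracker({0x10: 1, 0x20: 2})
tracker.preload("r1", 0x10)
tracker.store(0x20, 9)
no_conflict = tracker.check("r1", 0x20)
value_after_no_conflict = tracker.registers["r1"]
tracker.store(0x10, 7)
conflict = tracker.check("r1", 0x10)
value_after_conflict = tracker.registers["r1"]
```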
Abstract:
A piece for a three-dimensional maze, wherein at least two of the six faces of a cube are open faces such that a path connecting the two open faces is defined by the flat plates of the cube.
Abstract:
A processor having a large register file (10) utilizes a template field for encoding a set of most useful instruction sequences in a long instruction word format. The instruction set of the processor includes instructions, each of which is one of a plurality of different instruction types. The execution units of the processor are similarly categorized into different types, wherein each instruction type may be executed on one or more of the execution unit types. The instructions are grouped together into 128-bit sized and aligned containers called bundles, with each bundle including a plurality of instruction slots and a template field that specifies the mapping of the instruction slots to the execution unit types.
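The bundle layout can be sketched as bit packing. The abstract gives only the 128-bit container size. The field widths used below (a 5-bit template in the low bits followed by three 41-bit slots) and the two template-to-unit mappings are assumptions for illustration, not values stated in the abstract.

```python
TEMPLATE_BITS = 5   # assumed template field width
SLOT_BITS = 41      # assumed width of each instruction slot
NUM_SLOTS = 3       # assumed slot count: 5 + 3 * 41 = 128 bits

# Illustrative mapping from template value to execution unit type per slot
# (M = memory, I = integer, B = branch).
TEMPLATE_UNITS = {
    0x00: ("M", "I", "I"),
    0x10: ("M", "I", "B"),
}


def pack_bundle(template, slots):
    """Pack a template and instruction slots into one 128-bit integer."""
    word = template & ((1 << TEMPLATE_BITS) - 1)
    for i, slot in enumerate(slots):
        shift = TEMPLATE_BITS + i * SLOT_BITS
        word |= (slot & ((1 << SLOT_BITS) - 1)) << shift
    return word


def unpack_bundle(word):
    """Split a 128-bit bundle back into its template and slots."""
    template = word & ((1 << TEMPLATE_BITS) - 1)
    slots = [
        (word >> (TEMPLATE_BITS + i * SLOT_BITS)) & ((1 << SLOT_BITS) - 1)
        for i in range(NUM_SLOTS)
    ]
    return template, slots


bundle = pack_bundle(0x10, [1, 2, 3])
template, slots = unpack_bundle(bundle)
units = TEMPLATE_UNITS[template]  # which unit type executes each slot
```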