SIMPLE HIGH-PERFORMANCE MEMORY MANAGEMENT UNIT

    Publication number: JP2000276397A

    Publication date: 2000-10-06

    Application number: JP2000061273

    Filing date: 2000-03-06

    Abstract: PROBLEM TO BE SOLVED: To eliminate the complexity of a virtual-memory paging system by comparing a virtual address with a lower limit and signaling an illegal access when the virtual address is less than that limit. SOLUTION: The system determines whether the virtual address 104 lies in the upper or lower part of the virtual address space by checking its most significant bit. The virtual address 104 is compared with the upper limit 110; when the virtual address 104 is larger than the upper limit 110, the system asserts an illegal-access signal. When the virtual address 104 lies in the upper area of the virtual address space, the system adds the virtual address 104 to a base address 112 to generate a physical address 115, and also compares the virtual address 104 with the lower limit 114; when the virtual address 104 is smaller than the lower limit 114, the system asserts an illegal-access signal.
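    The base-and-bounds scheme the abstract describes can be sketched as follows. This is an illustrative model, not the patented circuit: the 32-bit width, the MSB split of the address space, and the function name are assumptions for the example.

```python
# Hedged sketch of base-and-bounds translation: the MSB selects the
# upper or lower region; upper-region addresses are checked against a
# lower limit and relocated by a base, lower-region addresses are
# checked against an upper limit. All widths are assumed 32-bit.

def translate(virtual_addr, upper_limit, lower_limit, base_addr, msb=31):
    """Return (physical_addr, illegal) for one virtual address."""
    in_upper_region = (virtual_addr >> msb) & 1
    if in_upper_region:
        if virtual_addr < lower_limit:
            return None, True              # below lower limit: illegal access
        return (virtual_addr + base_addr) & 0xFFFFFFFF, False
    if virtual_addr > upper_limit:
        return None, True                  # above upper limit: illegal access
    return (virtual_addr + base_addr) & 0xFFFFFFFF, False
```

    Because translation is a compare and an add rather than a page-table walk, the check fits in a single pipeline stage, which is the simplification the patent is after.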

    CENTRAL PROCESSING UNIT OF SUPERSCALAR PROCESSOR

    Publication number: JPH1097424A

    Publication date: 1998-04-14

    Application number: JP15173897

    Filing date: 1997-06-10

    Inventor: TREMBLAY MARC

    Abstract: PROBLEM TO BE SOLVED: To perform resource analysis and related checks across processor cycles with a central processing unit that contains a grouping logic circuit, functional units capable of executing the instructions it dispatches, and pipelined stages, so that resource-allocation checks can extend through more than one processor cycle. SOLUTION: A grouping logic circuit 109 dispatches up to four instructions in every processor cycle. A completion unit 110 retires each instruction when it completes. When the data for a dispatched load instruction are returned from main memory, the CPU 100 stores them in pipelined fashion into the second-level cache memory. A floating-point adder 106 and a floating-point multiplier 107 each have a four-stage pipeline; similarly, a load/store unit 103 has a two-stage pipeline.

    SUPPORT FOR A PLURALITY OF OUTSTANDING REQUESTS TO A PLURALITY OF TARGETS IN A PIPELINED MEMORY SYSTEM

    Publication number: JP2000293436A

    Publication date: 2000-10-20

    Application number: JP2000081045

    Filing date: 2000-03-22

    Abstract: PROBLEM TO BE SOLVED: To overcome the performance limitations of existing memory systems by coordinating a load address buffer, a register file, and the data flow between a first and a second data source so that a plurality of load requests can be outstanding at the same time. SOLUTION: The array 216 of a load buffer 210 has entries (e.g., load address entries 211 to 215) for five load addresses, so the addresses of up to five outstanding load requests can be stored. A circuit in the load buffer 210 operates under the control of an LSU controller 250, which has a separate state machine for each entry in the array 216. The load address buffer, register files 110 and 112, and the data flow between the first and second data sources are controlled so that the plurality of load requests can be outstanding at the same time.
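    The per-entry state machines can be sketched as a small buffer whose slots each track their own lifecycle. This is a behavioral model only; the class name, the state names, and the three-state lifecycle are assumptions, while the five-entry size comes from the abstract.

```python
# Hedged sketch of multiple outstanding loads: each of the five
# entries carries its own state, so new loads can issue while earlier
# ones are still waiting for data to return from memory.

FREE, PENDING, DONE = "free", "pending", "done"

class LoadBuffer:
    def __init__(self, entries=5):
        self.addr = [None] * entries
        self.state = [FREE] * entries

    def issue(self, address):
        """Allocate an entry for a new load; return its index, or None if full."""
        for i, s in enumerate(self.state):
            if s == FREE:
                self.addr[i] = address
                self.state[i] = PENDING
                return i
        return None  # buffer full: the load must stall

    def data_returned(self, index):
        """Memory returned data for a pending load; mark the entry complete."""
        assert self.state[index] == PENDING
        self.state[index] = DONE

    def outstanding(self):
        return sum(s == PENDING for s in self.state)
```

    The point of per-entry state is that returns may arrive in any order: completing entry 3 does not disturb entries still pending.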

    EFFICIENT SUB-INSTRUCTION EMULATION IN VLIW PROCESSOR

    Publication number: JP2000284964A

    Publication date: 2000-10-13

    Application number: JP2000079047

    Filing date: 2000-03-21

    Abstract: PROBLEM TO BE SOLVED: To emulate sub-instructions efficiently by combining the stored results of at least one sub-instruction emulated in software with the results of the remaining sub-instructions executed in hardware. SOLUTION: A VLIW processor 101 detects an exception condition and reports it to a pipeline control unit 108. The contents of exception registers 126, 128, 130 and 132 are combined through OR gates 135, 136, 138, 140 and 144 to form an exception signal 146, which is sent to the pipeline control unit 108. The result of the exception condition is then stored in external memory. The system next stores the pattern of enable signals for all the remaining sub-instructions, which are not emulated, in an instruction breakpoint mask register in the unit 108.

    Software branch prediction filtering for a microprocessor

    Publication number: US6374351B2

    Publication date: 2002-04-16

    Application number: US82952501

    Filing date: 2001-04-10

    Inventor: TREMBLAY MARC

    CPC classification number: G06F9/3844 G06F9/3846

    Abstract: The present invention provides software branch prediction filtering for a microprocessor. In one embodiment, a method for software branch prediction filtering includes determining whether a branch is "easy" to predict, and predicting the branch using software branch prediction if so. Otherwise (i.e., the branch is "hard" to predict), the branch is predicted using hardware branch prediction. Accordingly, the more accurate but space-limited hardware branch prediction resources are conserved for hard-to-predict branches.
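    The filtering idea can be sketched as a front-end check that routes only hard branches to the hardware predictor. The easy/hard classification shown (a compiler-supplied flag) and the 2-bit saturating counter standing in for the hardware predictor are assumptions; the abstract does not fix either mechanism.

```python
# Hedged sketch of branch prediction filtering: easy branches use a
# static software hint; hard branches consume a hardware predictor
# entry (modeled here by a single 2-bit saturating counter).

class TwoBitPredictor:
    def __init__(self):
        self.counter = 2  # start weakly taken

    def predict(self):
        return self.counter >= 2

    def update(self, taken):
        self.counter = min(3, self.counter + 1) if taken else max(0, self.counter - 1)

def predict_branch(branch, hw_predictor):
    """branch = (static_hint, is_easy); easy branches never touch hardware."""
    static_hint, is_easy = branch
    if is_easy:
        return static_hint              # software prediction, no HW entry used
    return hw_predictor.predict()       # HW table reserved for hard branches
```

    Filtering pays off because the hardware table is finite: every easy branch kept out of it leaves a slot for a branch that actually needs history.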

    7.
    Invention patent
    Unknown

    Publication number: AT450001T

    Publication date: 2009-12-15

    Application number: AT05747426

    Filing date: 2005-05-11

    Abstract: One embodiment of the present invention provides a processor which selectively fetches cache lines for store instructions during speculative-execution. During normal execution, the processor issues instructions for execution in program order. Upon encountering an instruction which generates a launch condition, the processor performs a checkpoint and begins the execution of instructions in a speculative-execution mode. Upon encountering a store instruction during the speculative-execution mode, the processor checks an L1 data cache for a matching cache line and checks a store buffer for a store to a matching cache line. If a matching cache line is already present in the L1 data cache or if the store to a matching cache line is already present in the store buffer, the processor suppresses generation of the fetch for the cache line. Otherwise, the processor generates a fetch for the cache line.
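    The suppression check above can be sketched as a two-part lookup before generating a fetch. This is a behavioral model only; the 64-byte line size and the container representations are assumptions.

```python
# Hedged sketch of speculative store fetch suppression: issue a
# cache-line fetch for a store only when neither the L1 data cache nor
# the store buffer already covers that line. Line size assumed 64 B.

LINE = 64

def needs_fetch(store_addr, l1_lines, store_buffer_addrs):
    """Return True only if a fetch for the store's cache line is required."""
    line = store_addr // LINE
    if line in l1_lines:
        return False                      # matching line already in L1
    if any(a // LINE == line for a in store_buffer_addrs):
        return False                      # a buffered store covers the line
    return True                           # otherwise, generate the fetch
```

    The benefit during speculative execution is bandwidth: redundant fetches for lines that will be present anyway are never issued.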

    8.
    Invention patent
    Unknown

    Publication number: DE60039808D1

    Publication date: 2008-09-25

    Application number: DE60039808

    Filing date: 2000-03-13

    Abstract: One embodiment of the present invention provides a system that efficiently emulates sub-instructions in a very long instruction word (VLIW) processor (101). The system operates by receiving (504) an exception condition during execution of a VLIW instruction within a VLIW program. This exception condition indicates that at least one sub-instruction within the VLIW instruction requires emulation in software or software assistance. In processing this exception condition, the system emulates the sub-instructions that require emulation in software and stores the results. The system also selectively executes (520) in hardware any remaining sub-instructions in the VLIW instruction that do not require emulation in software. The system finally combines (524) the results from the sub-instructions emulated in software with the results from the remaining sub-instructions executed in hardware, and resumes execution of the VLIW program.

    9.
    Invention patent
    Unknown

    Publication number: DE69734303D1

    Publication date: 2005-11-10

    Application number: DE69734303

    Filing date: 1997-06-04

    Inventor: TREMBLAY MARC

    Abstract: A pipelined instruction dispatch or grouping circuit allows instruction dispatch decisions to be made over multiple processor cycles. In one embodiment, the grouping circuit performs resource allocation and data dependency checks on an instruction group, based on a state vector which includes representation of source and destination registers of instructions within said instruction group and corresponding state vectors for instruction groups of a number of preceding processor cycles.
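    The state-vector check the abstract describes can be sketched as comparing a candidate group's source registers against the destination registers recorded for groups from preceding cycles. The vector layout (sets of register numbers) and the function name are assumptions for illustration.

```python
# Hedged sketch of grouping-circuit dependency checking: refuse to
# dispatch a group whose sources read a register still being written by
# an instruction group from a preceding, not-yet-retired cycle.

def can_dispatch(group, prior_vectors):
    """group: list of (dests, srcs) register-number sets per instruction.

    prior_vectors: state vectors of recent groups, each with a 'dests' set.
    """
    pending_writes = set()
    for vec in prior_vectors:          # accumulate in-flight destinations
        pending_writes |= vec["dests"]
    for dests, srcs in group:
        if srcs & pending_writes:
            return False               # RAW hazard with an earlier group
    return True
```

    Because the vectors are carried forward cycle to cycle, the decision itself can be pipelined over multiple cycles instead of squeezing every check into one, which is the abstract's point.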

    10.
    Invention patent
    Unknown

    Publication number: DE60009732T2

    Publication date: 2005-04-21

    Application number: DE60009732

    Filing date: 2000-09-29

    Abstract: A method and computer system for resolving simultaneous requests from multiple processing units to load from or store to the same shared resource. When the colliding requests come from two different processing units, the first processing unit is allowed access to the structure in a predetermined number of sequential collisions and the second device is allowed access to the structure in a following number of sequential collisions. The shared resource can be a fill buffer, where a collision involves attempts to simultaneously store in the fill buffer. The shared resource can be a shared write back buffer, where a collision involves attempts to simultaneously store in the shared write back buffer. The shared resource can be a data cache unit, where a collision involves attempts to simultaneously load from a same data space in the data cache unit. A collision can also involve an attempt to load and store from a same resource and in such case the device that attempts to load is favored over the device that attempts to store.
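    The arbitration policy can be sketched as a repeating grant pattern plus the load-over-store rule. The class name and the specific values of the two collision counts are assumptions; the abstract leaves both counts as predetermined parameters.

```python
# Hedged sketch of collision arbitration: on same-type collisions the
# first unit wins for n consecutive collisions, then the second unit
# for the next m; a load colliding with a store always wins.

class Arbiter:
    def __init__(self, n_first, m_second):
        self.pattern = ["unit1"] * n_first + ["unit2"] * m_second
        self.count = 0

    def resolve(self, op1, op2):
        """op1/op2: 'load' or 'store' from unit1/unit2. Return the winner."""
        if op1 != op2:                    # load vs. store on the same resource:
            return "unit1" if op1 == "load" else "unit2"   # the load wins
        winner = self.pattern[self.count % len(self.pattern)]
        self.count += 1
        return winner
```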
