Method and apparatus for fuzzy stride prefetch

    Publication No.: AU2011296479B2

    Publication Date: 2014-12-04

    Application No.: AU2011296479

    Application Date: 2011-08-03

    Applicant: INTEL CORP

    Abstract: In one embodiment, the present invention includes a prefetching engine to detect when data access strides in a memory fall into a range, to compute a predicted next stride, to selectively prefetch a cache line using the predicted next stride, and to dynamically control prefetching. Other embodiments are also described and claimed.
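The abstract describes strides that "fall into a range" rather than matching exactly. A minimal sketch of that idea, assuming a simple tolerance band and byte addresses (class name, tolerance, and line size are illustrative, not from the patent):

```python
# Hypothetical sketch: successive strides need not be exactly equal, only
# within a tolerance band, to trigger a prefetch of the predicted next line.
class FuzzyStridePrefetcher:
    def __init__(self, tolerance=8, line_size=64):
        self.tolerance = tolerance    # allowed stride variation in bytes
        self.line_size = line_size
        self.last_addr = None
        self.last_stride = None

    def access(self, addr):
        """Observe a load address; return a predicted line to prefetch, or None."""
        prefetch = None
        if self.last_addr is not None:
            stride = addr - self.last_addr
            if (self.last_stride is not None
                    and abs(stride - self.last_stride) <= self.tolerance):
                # Strides fall into a range: predict the next stride and
                # prefetch the cache line the predicted address lands in.
                predicted = addr + stride
                prefetch = predicted - (predicted % self.line_size)
            self.last_stride = stride
        self.last_addr = addr
        return prefetch

p = FuzzyStridePrefetcher()
for a in (1000, 1064, 1130, 1196):   # strides 64, 66, 66: a "fuzzy" match
    hint = p.access(a)
# final hint is the line containing 1196 + 66 = 1262, i.e. address 1216
```

A strict stride prefetcher would see 64 then 66 and reset; the tolerance band is what lets slightly irregular access patterns still trigger prefetches.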

    Invention patent; status: unknown

    Publication No.: DE60207222D1

    Publication Date: 2005-12-15

    Application No.: DE60207222

    Application Date: 2002-02-28

    Applicant: INTEL CORP

    Inventor: WU YOUFENG

    Methods and apparatus to manage cache bypassing

    Publication No.: GB2410582A

    Publication Date: 2005-08-03

    Application No.: GB0508442

    Application Date: 2003-09-12

    Applicant: INTEL CORP

    Abstract: Methods and apparatus to manage bypassing of a first cache are disclosed. In one such method, a load instruction having an expected latency greater than or equal to a predetermined threshold is identified. A request is then made to schedule the identified load instruction to have a predetermined latency. The software program is then scheduled. An actual latency associated with the load instruction in the scheduled software program is then compared to the predetermined latency. If the actual latency is greater than or equal to the predetermined latency, the load instruction is marked to bypass the first cache.

    METHODS AND APPARATUS TO MANAGE CACHE BYPASSING

    Publication No.: AU2003288904A1

    Publication Date: 2004-05-13

    Application No.: AU2003288904

    Application Date: 2003-09-12

    Applicant: INTEL CORP

    Abstract: Methods and apparatus to manage bypassing of a first cache are disclosed. In one such method, a load instruction having an expected latency greater than or equal to a predetermined threshold is identified. A request is then made to schedule the identified load instruction to have a predetermined latency. The software program is then scheduled. An actual latency associated with the load instruction in the scheduled software program is then compared to the predetermined latency. If the actual latency is greater than or equal to the predetermined latency, the load instruction is marked to bypass the first cache.
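The two-step test in the abstract (identify loads expected to be slow, then confirm after scheduling that they stayed slow) can be sketched as follows; thresholds, field names, and the dict representation are illustrative assumptions, not the patent's:

```python
# Hedged sketch of the bypass-marking decision: a load bypasses the first
# cache only if it was expected to be slow AND the scheduler actually gave
# it at least the predetermined latency.
THRESHOLD = 7       # cycles: expected latency at/above this is "slow"
PREDETERMINED = 7   # latency the scheduler is asked to honor for such loads

def mark_bypass_loads(loads):
    """Mark loads to bypass the first cache.

    loads: list of dicts with 'expected' (pre-schedule) and 'actual'
    (post-schedule) latencies in cycles; 'actual' stands in for the result
    of the scheduling step, which is outside this sketch.
    """
    for ld in loads:
        ld["bypass_l1"] = (
            ld["expected"] >= THRESHOLD          # step 1: identify slow loads
            and ld["actual"] >= PREDETERMINED    # step 2: schedule kept them slow
        )
    return loads

marked = mark_bypass_loads([
    {"expected": 9, "actual": 8},   # identified, still slow -> bypass
    {"expected": 3, "actual": 2},   # fast load, never identified
    {"expected": 9, "actual": 5},   # identified, but scheduled short -> keep in cache
])
```

The point of the second comparison is that a load the scheduler managed to hide (actual latency below the predetermined value) still benefits from first-cache residency, so only loads that remain slow are marked.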

    Comprehensive redundant load elimination for architectures supporting control and data speculation

    Publication No.: AU2884599A

    Publication Date: 1999-09-27

    Application No.: AU2884599

    Application Date: 1999-02-26

    Applicant: INTEL CORP

    Abstract: In one implementation of the invention, a computer implemented method used in compiling a program includes identifying a covering load, which may be one of a set of covering loads, and a redundant load. The covering load and the redundant load have a first and second load type, respectively. The first and the second load type each may be one of a group of load types including a regular load and at least one speculative-type load. In one implementation, the group of load types includes at least one check-type load. One implementation of the invention is in a machine readable medium.
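The covering/redundant relationship above depends on the load types being compatible. A minimal sketch of one pass over a load sequence, where the compatibility table is a hypothetical stand-in for the patent's actual rules:

```python
# Illustrative sketch: a later load from the same address is redundant if an
# earlier ("covering") load of a compatible type already produced the value.
# Load types follow the abstract: regular, speculative-type, check-type.
COMPATIBLE = {
    ("regular", "regular"), ("regular", "speculative"),
    ("speculative", "speculative"), ("check", "regular"),
}  # (covering_type, redundant_type) pairs; hypothetical, not the patent's table

def eliminate_redundant_loads(loads):
    """loads: ordered list of (addr, load_type). Returns the surviving loads."""
    covering = {}   # addr -> type of the first (covering) load seen
    kept = []
    for addr, ltype in loads:
        prior = covering.get(addr)
        if prior is not None and (prior, ltype) in COMPATIBLE:
            continue                      # redundant: covered by an earlier load
        covering.setdefault(addr, ltype)  # first load of this addr becomes covering
        kept.append((addr, ltype))
    return kept

kept = eliminate_redundant_loads([
    (0x10, "regular"),
    (0x10, "speculative"),   # covered by the regular load above: eliminated
    (0x20, "speculative"),
    (0x10, "check"),         # (regular, check) not compatible here: kept
])
```

The type check is what distinguishes this from plain common-subexpression elimination: on architectures with control and data speculation, a speculative load cannot always stand in for a check-type load, so the pass must consult the type pair before eliminating.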

    ALLOCATION OF ALIAS REGISTERS IN A PIPELINED SCHEDULE
    Invention publication; status: pending (published)

    Publication No.: EP2875427A4

    Publication Date: 2016-07-13

    Application No.: EP13885974

    Application Date: 2013-05-30

    Applicant: INTEL CORP

    Abstract: In an embodiment, a system includes a processor including one or more cores and a plurality of alias registers to store memory range information associated with a plurality of operations of a loop. The memory range information references one or more memory locations within a memory. The system also includes register assignment means for assigning each of the alias registers to a corresponding operation of the loop, where the assignments are made according to a rotation schedule, and one of the alias registers is assigned to a first operation in a first iteration of the loop and to a second operation in a subsequent iteration of the loop. The system also includes the memory coupled to the processor. Other embodiments are described and claimed.
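The rotation schedule in the abstract, under which one alias register serves an operation in one iteration and a different operation in a later iteration, can be sketched as a simple round-robin assignment (function and parameter names are mine, not the patent's):

```python
# Minimal sketch of rotation-schedule assignment: with N alias registers,
# register i is reused every N assignments, so the same register serves a
# first operation in one iteration and a second operation in a later one.
def assign_alias_registers(ops_per_iter, iterations, num_regs):
    """Yield (iteration, op, register) tuples using round-robin rotation."""
    reg = 0
    for it in range(iterations):
        for op in range(ops_per_iter):
            yield it, op, reg
            reg = (reg + 1) % num_regs   # rotate to the next alias register

sched = list(assign_alias_registers(ops_per_iter=3, iterations=2, num_regs=4))
# With 3 ops and 4 registers, register 0 serves op 0 of iteration 0
# and op 1 of iteration 1: the cross-iteration reuse the abstract describes.
```

Because `ops_per_iter` and `num_regs` are coprime-ish here (3 vs 4), the mapping from register to operation shifts each iteration, which is exactly why the abstract notes that one register is assigned to a first operation in one iteration and a second operation in a subsequent one.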

    FLEXIBLE ACCELERATION OF CODE EXECUTION
    Invention publication; status: granted (in force)

    Publication No.: EP2901266A4

    Publication Date: 2016-05-25

    Application No.: EP13841895

    Application Date: 2013-09-26

    Applicant: INTEL CORP

    Abstract: Technologies for performing flexible code acceleration on a computing device includes initializing an accelerator virtual device on the computing device. The computing device allocates memory-mapped input and output (I/O) for the accelerator virtual device and also allocates an accelerator virtual device context for a code to be accelerated. The computing device accesses a bytecode of the code to be accelerated and determines whether the bytecode is an operating system-dependent bytecode. If not, the computing device performs hardware acceleration of the bytecode via the memory-mapped I/O using an internal binary translation module. However, if the bytecode is operating system-dependent, the computing device performs software acceleration of the bytecode.
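The core dispatch decision in the abstract (OS-dependent bytecode falls back to software acceleration; everything else goes to the hardware path via memory-mapped I/O and binary translation) reduces to a small predicate. A hedged sketch, where the marker opcodes and function name are illustrative assumptions:

```python
# Hypothetical sketch of the accelerator's dispatch decision. The opcode
# names below are invented markers for "operating system-dependent" bytecode.
OS_DEPENDENT_OPS = {"syscall", "vmcall"}

def accelerate(bytecode):
    """Return which acceleration path a bytecode sequence would take."""
    if any(op in OS_DEPENDENT_OPS for op in bytecode):
        return "software"   # OS-dependent: software acceleration path
    return "hardware"       # else: memory-mapped I/O + internal binary translation

path = accelerate(["add", "mul"])        # a pure-compute sequence -> "hardware"
```

The split keeps the hardware translator simple: anything that would need operating-system services is diverted before it reaches the binary translation module.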
