Method and apparatus for efficient load processing using buffer
    1.
    Invention patent
    Status: Granted

    Publication number: JP2011129103A

    Publication date: 2011-06-30

    Application number: JP2010248263

    Filing date: 2010-11-05

    Abstract: PROBLEM TO BE SOLVED: To provide a method and apparatus for power and time efficient load handling. SOLUTION: A compiler may identify producer loads, consumer reuse loads, consumer forwarded loads, and producer/consumer hybrid loads. Based on this identification, performance of the load may be efficiently directed to a load value buffer, store buffer, data cache, or elsewhere. Consequently, accesses to cache are reduced, through direct loading from load value buffers and store buffers, thereby efficiently processing the loads. COPYRIGHT: (C)2011,JPO&INPIT

    DYNAMIC DATA SYNCHRONIZATION IN THREAD-LEVEL SPECULATION
    2.
    Invention application
    Status: Under examination (published)

    Publication number: WO2012006030A2

    Publication date: 2012-01-12

    Application number: PCT/US2011042040

    Filing date: 2011-06-27

    Inventor: LIU WEI WU YOUFENG

    Abstract: In one embodiment, the present invention introduces a speculation engine to parallelize serial instructions by creating separate threads from the serial instructions and inserting processor instructions to set a synchronization bit before a dependence source and to clear the synchronization bit after a dependence source, where the synchronization bit is designed to stall a dependence sink from a thread running on a separate core. Other embodiments are described and claimed.
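
The set/clear protocol described in this abstract can be illustrated in software. The following is a minimal sketch, not the patented hardware mechanism: the `SyncBit` class and `run_demo` function are hypothetical names, a condition variable stands in for the processor's synchronization bit, and the bit is set before either thread starts so that the demonstration is deterministic.

```python
import threading


class SyncBit:
    """Software stand-in (assumption) for a hardware synchronization bit."""

    def __init__(self):
        self._is_set = False
        self._cond = threading.Condition()

    def set(self):
        # Corresponds to the instruction inserted before the dependence source.
        with self._cond:
            self._is_set = True

    def clear(self):
        # Corresponds to the instruction inserted after the dependence source.
        with self._cond:
            self._is_set = False
            self._cond.notify_all()

    def stall_while_set(self):
        # The dependence sink waits here for as long as the bit is set.
        with self._cond:
            while self._is_set:
                self._cond.wait()


def run_demo():
    bit = SyncBit()
    shared = {"x": 0}
    seen = []

    def source_thread():
        shared["x"] = 42          # dependence source (a store)
        bit.clear()               # release any stalled sink

    def sink_thread():
        bit.stall_while_set()     # stall until the source has completed
        seen.append(shared["x"])  # dependence sink (a load)

    # Simplification: set the bit before spawning the threads, so the sink
    # can never run ahead of the "set" instruction.
    bit.set()
    sink = threading.Thread(target=sink_thread)
    source = threading.Thread(target=source_thread)
    sink.start()
    source.start()
    sink.join()
    source.join()
    return seen
```

Calling `run_demo()` returns the value observed by the sink; because the sink cannot pass the stall until the source clears the bit, it always reads the stored value rather than the initial one.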

    Dynamic data synchronization in thread-level speculation

    Publication number: AU2011276588A1

    Publication date: 2013-01-10

    Application number: AU2011276588

    Filing date: 2011-06-27

    Applicant: INTEL CORP

    Inventor: LIU WEI WU YOUFENG

    Abstract: In one embodiment, the present invention introduces a speculation engine to parallelize serial instructions by creating separate threads from the serial instructions and inserting processor instructions to set a synchronization bit before a dependence source and to clear the synchronization bit after a dependence source, where the synchronization bit is designed to stall a dependence sink from a thread running on a separate core. Other embodiments are described and claimed.

    METHODS AND APPARATUS TO MANAGE CACHE BYPASSING
    4.
    Invention application
    Status: Under examination (published)

    Publication number: WO2004038583A9

    Publication date: 2005-06-09

    Application number: PCT/US0328783

    Filing date: 2003-09-12

    Applicant: INTEL CORP

    Abstract: Methods and apparatus to manage bypassing of a first cache are disclosed. In one such method, a load instruction having an expected latency greater than or equal to a predetermined threshold is identified. A request is then made to schedule the identified load instruction to have a predetermined latency. The software program is then scheduled. An actual latency associated with the load instruction in the scheduled software program is then compared to the predetermined latency. If the actual latency is greater than or equal to the predetermined latency, the load instruction is marked to bypass the first cache.
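
The steps in this abstract can be sketched as a small compiler pass. This is an illustrative sketch under stated assumptions, not the method as claimed: the `Load` record, its `slack` field, and the toy `schedule` function are hypothetical, and the scheduler is assumed to grant the requested latency only when enough independent work is available to cover it.

```python
from dataclasses import dataclass


@dataclass
class Load:
    name: str
    expected_latency: int   # cycles, e.g. from profiling (assumed input)
    slack: int              # hypothetical: cycles the scheduler can hide
    bypass_first_cache: bool = False


def schedule(load, requested_latency):
    """Toy scheduler (assumption): achieved latency is capped by the slack."""
    return min(requested_latency, load.slack)


def mark_bypass_loads(loads, threshold, predetermined_latency):
    """Mark loads that should bypass the first cache; return their names."""
    for load in loads:
        # Step 1: only consider loads whose expected latency meets the threshold.
        if load.expected_latency < threshold:
            continue
        # Steps 2-3: request the predetermined latency and schedule the load.
        actual = schedule(load, predetermined_latency)
        # Step 4: compare the actual scheduled latency with the request.
        if actual >= predetermined_latency:
            load.bypass_first_cache = True
    return [load.name for load in loads if load.bypass_first_cache]
```

For example, with `threshold=5` and `predetermined_latency=6`, a load with expected latency 8 and slack 10 is marked for bypass, while a load with the same expected latency but slack 3 is not, because the scheduler could not give it the requested latency.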

    APPARATUS, METHOD, AND SYSTEM FOR DYNAMICALLY OPTIMIZING CODE UTILIZING ADJUSTABLE TRANSACTION SIZES BASED ON HARDWARE LIMITATIONS
    5.
    Invention application
    Status: Under examination (published)

    Publication number: WO2012040742A3

    Publication date: 2012-06-14

    Application number: PCT/US2011053337

    Filing date: 2011-09-26

    Abstract: An apparatus and method are described herein for conditionally committing and/or speculatively checkpointing transactions, which potentially results in dynamic resizing of transactions. During dynamic optimization of binary code, transactions are inserted to provide memory ordering safeguards, which enable a dynamic optimizer to optimize code more aggressively. The conditional commit enables efficient execution of the dynamic optimization code while attempting to prevent transactions from running out of hardware resources, and the speculative checkpoints enable quick and efficient recovery upon abort of a transaction. Processor hardware is adapted to support dynamic resizing of the transactions, for example by including decoders that recognize a conditional commit instruction, a speculative checkpoint instruction, or both. Processor hardware is further adapted to perform operations to support conditional commit or speculative checkpointing in response to decoding such instructions.
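
One way to picture how a conditional commit resizes transactions is the sketch below. The abstract does not specify the commit decision, so the policy here is an assumption: the function name `run_with_conditional_commits`, the per-region cost list, and the single `capacity` figure are all hypothetical, and a commit is taken at a conditional commit point only when the next region would otherwise exceed the remaining hardware capacity.

```python
def run_with_conditional_commits(region_costs, capacity):
    """Return the indices of regions that were preceded by a commit.

    region_costs: hardware entries (e.g. buffered stores) needed by each code
                  region between consecutive conditional commit points.
    capacity:     total entries available to one transaction (assumption).
    """
    used = 0
    commits = []
    for index, cost in enumerate(region_costs):
        if cost > capacity:
            raise ValueError("a single region exceeds the hardware capacity")
        # Conditional commit point: commit only if continuing would overflow.
        if used + cost > capacity:
            commits.append(index)
            used = 0  # a new, smaller transaction starts here
        used += cost
    return commits
```

With region costs `[3, 4, 2, 5, 1]` and a capacity of 8, the only commit happens before the third region (index 2), so the code runs as two transactions instead of aborting on overflow.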

    APPARATUS, METHOD, AND SYSTEM FOR IMPROVING POWER PERFORMANCE EFFICIENCY BY COUPLING A FIRST CORE TYPE WITH A SECOND CORE TYPE
    6.
    Invention application
    Status: Under examination (published)

    Publication number: WO2012005949A3

    Publication date: 2012-05-18

    Application number: PCT/US2011041429

    Filing date: 2011-06-22

    Abstract: An apparatus and method are described herein for coupling a processor core of a first type with a co-designed core of a second type. Execution of program code on the first core is monitored and hot sections of the program code are identified. Those hot sections are optimized for execution on the co-designed core, such that upon subsequently encountering those hot sections, the optimized hot sections are executed on the co-designed core. When the co-designed core is executing optimized hot code, the first processor core may be in a low-power state to save power or may execute other code in parallel. Furthermore, multiple threads of cold code may be pipelined on the first core, while multiple threads of hot code are pipelined on the co-designed core to achieve maximum performance.
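
The monitor, identify, and redirect flow in this abstract can be sketched with a simple execution counter. This is an illustrative sketch only: the `dispatch` function, the `hot_threshold` parameter, and the use of a plain execution count as the hotness criterion are assumptions, since the abstract does not say how hot sections are detected.

```python
from collections import Counter


def dispatch(trace, hot_threshold):
    """Decide, for each executed section in the trace, which core runs it."""
    counts = Counter()
    optimized = set()   # sections already optimized for the co-designed core
    placement = []
    for section in trace:
        if section in optimized:
            # Subsequent encounters of a hot section run on the co-designed core.
            placement.append((section, "co-designed"))
            continue
        # Otherwise the section runs on the first core while being monitored.
        counts[section] += 1
        placement.append((section, "first"))
        if counts[section] >= hot_threshold:
            optimized.add(section)  # identified as hot; optimize it
    return placement
```

For the trace `["A", "B", "A", "A", "B", "A"]` with `hot_threshold=2`, section `A` runs twice on the first core and then moves to the co-designed core, while `B` becomes hot only on its second execution.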

    METHOD AND SYSTEM FOR COLLABORATIVE PROFILING FOR CONTINUOUS DETECTION OF PROFILE PHASE
    7.
    Invention application
    Status: Under examination (published)

    Publication number: WO02077821A3

    Publication date: 2003-03-06

    Application number: PCT/US0206292

    Filing date: 2002-02-28

    Inventor: WU YOUFENG

    CPC classification number: G06F11/3612

    Abstract: A method and system for collaborative profiling for continuous detection of profile phase transitions is disclosed. In one embodiment, the method comprises using hardware and software to perform continuous edge profiling on a program; detecting profile phase transitions continuously; and optimizing the program based upon the profile phase transitions and edge profile.

    APPARATUS, METHOD, AND SYSTEM FOR PROVIDING A DECISION MECHANISM FOR CONDITIONAL COMMITS IN AN ATOMIC REGION
    9.
    Invention application
    Status: Under examination (published)

    Publication number: WO2012040715A3

    Publication date: 2012-06-21

    Application number: PCT/US2011053285

    Filing date: 2011-09-26

    Abstract: An apparatus and method are described herein for conditionally committing and/or speculatively checkpointing transactions, which potentially results in dynamic resizing of transactions. During dynamic optimization of binary code, transactions are inserted to provide memory ordering safeguards, which enable a dynamic optimizer to optimize code more aggressively. The conditional commit enables efficient execution of the dynamic optimization code while attempting to prevent transactions from running out of hardware resources, and the speculative checkpoints enable quick and efficient recovery upon abort of a transaction. Processor hardware is adapted to support dynamic resizing of the transactions, for example by including decoders that recognize a conditional commit instruction, a speculative checkpoint instruction, or both. Processor hardware is further adapted to perform operations to support conditional commit or speculative checkpointing in response to decoding such instructions.
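
The recovery role of a speculative checkpoint can be illustrated with a software snapshot. This is a minimal sketch under stated assumptions: the `CheckpointedTransaction` class is hypothetical, a dictionary stands in for architectural state, and deep copies stand in for whatever hardware mechanism actually records the checkpoint.

```python
import copy


class CheckpointedTransaction:
    """Toy transaction that can roll back to its latest speculative checkpoint."""

    def __init__(self, state):
        self.state = state
        self._start = copy.deepcopy(state)   # state at transaction begin
        self._checkpoint = None              # latest speculative checkpoint

    def speculative_checkpoint(self):
        # Record the current state so an abort need not restart from the beginning.
        self._checkpoint = copy.deepcopy(self.state)

    def abort(self):
        # Recover to the latest checkpoint if one exists, else to the start.
        source = self._checkpoint if self._checkpoint is not None else self._start
        self.state = copy.deepcopy(source)
        return self.state
```

If a transaction sets `r1` to 1, takes a checkpoint, sets `r1` to 2, and then aborts, the recovered state has `r1` equal to 1, so only the work done after the checkpoint is lost.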

    CONTEXT-SENSITIVE SLICING FOR DYNAMICALLY PARALLELIZING BINARY PROGRAMS
    10.
    Invention application
    Status: Under examination (published)

    Publication number: WO2011056278A3

    Publication date: 2011-06-30

    Application number: PCT/US2010046685

    Filing date: 2010-08-25

    CPC classification number: G06F11/3604 G06F8/433 G06F8/456

    Abstract: In one embodiment of the invention, a method comprises (1) receiving an unstructured binary code region that is single-threaded; (2) determining a slice criterion for the region; (3) determining a call edge, a return edge, and a fallthrough pseudo-edge for the region based on analysis of the region at a binary level; and (4) determining a context-sensitive slice based on the call edge, the return edge, the fallthrough pseudo-edge, and the slice criterion. Embodiments of the invention may include a program analysis technique that can be used to provide context-sensitive slicing of binary programs for slicing hot regions identified at runtime, with few underlying assumptions about the program from which the binary is derived. Also, in an embodiment, a slicing method may include determining a context-insensitive slice when a time limit is met, by treating call edges as normal control flow edges.
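
The context-insensitive fallback mentioned at the end of this abstract amounts to plain backward reachability once call edges are treated like any other edge. The sketch below shows only that fallback, not the context-sensitive algorithm: the `backward_slice` function and the dependence-map representation are assumptions made for illustration.

```python
from collections import deque


def backward_slice(deps, criterion):
    """Return every node the slice criterion transitively depends on.

    deps maps each node to the nodes it directly depends on. All edge kinds
    (including call and return edges) are treated alike, which is what makes
    the result context-insensitive.
    """
    in_slice = {criterion}
    worklist = deque([criterion])
    while worklist:
        node = worklist.popleft()
        for predecessor in deps.get(node, ()):
            if predecessor not in in_slice:
                in_slice.add(predecessor)
                worklist.append(predecessor)
    return in_slice
```

Given `deps = {"s4": ["s2", "s3"], "s3": ["s1"], "s5": ["s4"]}` and criterion `"s4"`, the slice contains `s1` through `s4` and excludes `s5`, which uses the criterion but does not influence it.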
