Patent search ap:("INTEL CORP") AND inv:"WU YOUFENG" Page 1

1.

Invention patent
Method and apparatus for efficient load processing using buffer (Status: granted)

Publication number: JP2011129103A

Publication date: 2011-06-30

Application number: JP2010248263

Application date: 2010-11-05

Applicant: Intel Corp , インテル・コーポレーション

Inventor: LIU WEI , WU YOUFENG , WILKERSON CHRISTOPHER B , HUM HERBERT H

IPC: G06F12/08 , G06F9/45

CPC classification number: G06F12/0888 , G06F8/4442 , G06F9/30043 , G06F9/3826 , G06F9/383 , Y02B60/1225 , Y02D10/13

Abstract: PROBLEM TO BE SOLVED: To provide a method and apparatus for power and time efficient load handling. SOLUTION: A compiler may identify producer loads, consumer reuse loads, consumer forwarded loads, and producer/consumer hybrid loads. Based on this identification, performance of the load may be efficiently directed to a load value buffer, store buffer, data cache, or elsewhere. Consequently, accesses to cache are reduced, through direct loading from load value buffers and store buffers, thereby efficiently processing the loads. COPYRIGHT: (C)2011,JPO&INPIT

2.

Invention application
DYNAMIC DATA SYNCHRONIZATION IN THREAD-LEVEL SPECULATION (Status: under examination, published)

Publication number: WO2012006030A2

Publication date: 2012-01-12

Application number: PCT/US2011042040

Application date: 2011-06-27

Applicant: INTEL CORP , LIU WEI , WU YOUFENG

Inventor: LIU WEI , WU YOUFENG

IPC: G06F9/46 , G06F9/06 , G06F9/30

CPC classification number: G06F9/3834 , G06F9/3004 , G06F9/30087 , G06F9/3851 , G06F9/52

Abstract: In one embodiment, the present invention introduces a speculation engine to parallelize serial instructions by creating separate threads from the serial instructions and inserting processor instructions to set a synchronization bit before a dependence source and to clear the synchronization bit after a dependence source, where the synchronization bit is designed to stall a dependence sink from a thread running on a separate core. Other embodiments are described and claimed.
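As a rough software analogy for the abstract above, the following sketch models the synchronization bit with a condition variable. The names `SyncBit` and `run_demo` are hypothetical and introduced here only for illustration; the abstract describes inserted processor instructions and a stall across cores, not Python threads, so this shows only the ordering guarantee (the sink cannot read until the source has written and the bit is cleared).

```python
import threading


class SyncBit:
    """Hypothetical software stand-in for the synchronization bit."""

    def __init__(self):
        self._cond = threading.Condition()
        self._is_set = False

    def set(self):
        # Corresponds to the instruction inserted before the dependence source.
        with self._cond:
            self._is_set = True

    def clear(self):
        # Corresponds to the instruction inserted after the dependence source.
        with self._cond:
            self._is_set = False
            self._cond.notify_all()

    def wait_until_clear(self):
        # The dependence sink stalls here while the bit is set.
        with self._cond:
            self._cond.wait_for(lambda: not self._is_set)


def run_demo():
    bit = SyncBit()
    shared = {"x": 0}
    seen = []

    def sink():
        bit.wait_until_clear()      # stall until the source has finished
        seen.append(shared["x"])    # dependence sink: reads the value

    bit.set()                       # set the bit before the dependence source
    worker = threading.Thread(target=sink)
    worker.start()
    shared["x"] = 42                # dependence source: writes the value
    bit.clear()                     # clear the bit after the dependence source
    worker.join()
    return seen
```

Because the bit is set before the sink thread starts and cleared only after the write, the sink in this sketch always observes the written value rather than the initial one.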

3.

Invention patent
Dynamic data synchronization in thread-level speculation (Status: unknown)

Publication number: AU2011276588A1

Publication date: 2013-01-10

Application number: AU2011276588

Application date: 2011-06-27

Applicant: INTEL CORP

Inventor: LIU WEI , WU YOUFENG

IPC: G06F9/06 , G06F9/30 , G06F9/46

Abstract: In one embodiment, the present invention introduces a speculation engine to parallelize serial instructions by creating separate threads from the serial instructions and inserting processor instructions to set a synchronization bit before a dependence source and to clear the synchronization bit after a dependence source, where the synchronization bit is designed to stall a dependence sink from a thread running on a separate core. Other embodiments are described and claimed.

4.

Invention application
METHODS AND APPARATUS TO MANAGE CACHE BYPASSING (Status: under examination, published)

Publication number: WO2004038583A9

Publication date: 2005-06-09

Application number: PCT/US0328783

Application date: 2003-09-12

Applicant: INTEL CORP

Inventor: WU YOUFENG , CHEN LI-LING

IPC: G06F9/00 , G06F9/45 , G06F12/08

CPC classification number: G06F12/0888 , G06F8/4442 , G06F9/3824 , G06F9/3836 , G06F9/3838 , G06F12/0897

Abstract: Methods and apparatus to manage bypassing of a first cache are disclosed. In one such method, a load instruction having an expected latency greater than or equal to a predetermined threshold is identified. A request is then made to schedule the identified load instruction to have a predetermined latency. The software program is then scheduled. An actual latency associated with the load instruction in the scheduled software program is then compared to the predetermined latency. If the actual latency is greater than or equal to the predetermined latency, the load instruction is marked to bypass the first cache.
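The steps in this abstract can be sketched as a small decision procedure. This is a minimal illustration under stated assumptions, not the patented implementation: the `Load` record, the `mark_bypass_loads` function, and the `schedule` callback (a stand-in for a real instruction scheduler that reports the latency it actually achieved) are all hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass
class Load:
    name: str
    expected_latency: int             # estimated before scheduling (cycles)
    actual_latency: int = 0           # filled in by the scheduler
    bypass_first_cache: bool = False


def mark_bypass_loads(loads, threshold, predetermined_latency, schedule):
    """Mark loads that should bypass the first cache; return their names."""
    # Step 1: identify loads whose expected latency meets the threshold.
    candidates = [ld for ld in loads if ld.expected_latency >= threshold]

    # Step 2: request that each candidate be scheduled with the
    # predetermined latency.
    requests = {ld.name: predetermined_latency for ld in candidates}

    # Step 3: schedule the program. The hypothetical scheduler records the
    # latency it actually achieved in each load's actual_latency field.
    schedule(loads, requests)

    # Step 4: compare actual and predetermined latency, then mark.
    for ld in candidates:
        if ld.actual_latency >= predetermined_latency:
            ld.bypass_first_cache = True
    return [ld.name for ld in loads if ld.bypass_first_cache]
```

Only loads that pass both checks are marked: a load that was never a candidate is left alone even if the scheduler happens to give it a long latency.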

5.

Invention application
APPARATUS, METHOD, AND SYSTEM FOR DYNAMICALLY OPTIMIZING CODE UTILIZING ADJUSTABLE TRANSACTION SIZES BASED ON HARDWARE LIMITATIONS (Status: under examination, published)

Publication number: WO2012040742A3

Publication date: 2012-06-14

Application number: PCT/US2011053337

Application date: 2011-09-26

Applicant: INTEL CORP , WANG CHENG , LIU WEI , BORIN EDSON , BRETERNITZ JR MAURICIO , WU YOUFENG , HU SHILIANG

Inventor: WANG CHENG , LIU WEI , BORIN EDSON , BRETERNITZ JR MAURICIO , WU YOUFENG , HU SHILIANG

IPC: G06F9/30 , G06F9/06

CPC classification number: G06F9/3842 , G06F8/52 , G06F9/3004 , G06F9/30072 , G06F9/30087 , G06F9/30116 , G06F9/3857

Abstract: An apparatus and method is described herein for conditionally committing and/or speculatively checkpointing transactions, which potentially results in dynamic resizing of transactions. During dynamic optimization of binary code, transactions are inserted to provide memory ordering safeguards, which enables a dynamic optimizer to more aggressively optimize code. The conditional commit enables efficient execution of the dynamic optimization code while attempting to prevent transactions from running out of hardware resources, and the speculative checkpoints enable quick and efficient recovery upon abort of a transaction. Processor hardware is adapted to support dynamic resizing of the transactions, such as by including decoders that recognize a conditional commit instruction, a speculative checkpoint instruction, or both. Processor hardware is further adapted to perform operations to support conditional commit or speculative checkpointing in response to decoding such instructions.
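To make the idea of dynamic transaction resizing concrete, here is a toy model. The abstract does not specify the commit decision rule, so the rule used below (commit early when the next code region would exceed a fixed resource capacity) is purely an assumption, as are the names `run_with_conditional_commits`, `region_costs`, and `capacity`.

```python
def run_with_conditional_commits(region_costs, capacity):
    """Group code regions into transactions, committing early when needed.

    region_costs: estimated hardware resources each region consumes
                  (assumed units, e.g. buffered cache lines).
    capacity:     assumed hardware limit for a single transaction.
    Returns a list of transactions, each a list of region indices.
    """
    transactions = []
    current, used = [], 0
    for index, cost in enumerate(region_costs):
        # Conditional commit point: commit only if continuing would
        # exhaust the assumed hardware resources.
        if current and used + cost > capacity:
            transactions.append(current)
            current, used = [], 0
        current.append(index)
        used += cost
    if current:
        transactions.append(current)   # final commit at the end of the code
    return transactions
```

With four regions of cost 3 and a capacity of 7, this sketch yields two transactions of two regions each; raising the capacity lets the same code run as one larger transaction, which is the resizing effect the abstract refers to.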

6.

Invention application
APPARATUS, METHOD, AND SYSTEM FOR IMPROVING POWER PERFORMANCE EFFICIENCY BY COUPLING A FIRST CORE TYPE WITH A SECOND CORE TYPE (Status: under examination, published)

Publication number: WO2012005949A3

Publication date: 2012-05-18

Application number: PCT/US2011041429

Application date: 2011-06-22

Applicant: INTEL CORP , WU YOUFENG , HU SHILIANG , BORIN EDSON , WANG CHENG C , BRETERNITZ MAURICIO JR , LIU WEI

Inventor: WU YOUFENG , HU SHILIANG , BORIN EDSON , WANG CHENG C , BRETERNITZ MAURICIO JR , LIU WEI

IPC: G06F9/46 , G06F1/32 , G06F9/38

CPC classification number: G06F9/30076 , G06F9/30174 , G06F9/3879 , G06F9/4893 , Y02D10/24

Abstract: An apparatus and method is described herein for coupling a processor core of a first type with a co-designed core of a second type. Execution of program code on the first core is monitored and hot sections of the program code are identified. Those hot sections are optimized for execution on the co-designed core, such that upon subsequently encountering those hot sections, the optimized hot sections are executed on the co-designed core. When the co-designed core is executing optimized hot code, the first processor core may be in a low-power state to save power or executing other code in parallel. Furthermore, multiple threads of cold code may be pipelined on the first core, while multiple threads of hot code are pipelined on the co-designed core to achieve maximum performance.

7.

Invention application
METHOD AND SYSTEM FOR COLLABORATIVE PROFILING FOR CONTINUOUS DETECTION OF PROFILE PHASE (Status: under examination, published)

Publication number: WO02077821A3

Publication date: 2003-03-06

Application number: PCT/US0206292

Application date: 2002-02-28

Applicant: INTEL CORP , WU YOUFENG

Inventor: WU YOUFENG

IPC: G06F11/36

CPC classification number: G06F11/3612

Abstract: A method and system for collaborative profiling for continuous detection of profile phase transitions is disclosed. In one embodiment, the method comprises using hardware and software to perform continuous edge profiling on a program; detecting profile phase transitions continuously; and optimizing the program based upon the profile phase transitions and edge profile.

8.

Invention application
SYSTEMS, APPARATUSES, AND METHODS FOR A HARDWARE AND SOFTWARE SYSTEM TO AUTOMATICALLY DECOMPOSE A PROGRAM TO MULTIPLE PARALLEL THREADS (Status: under examination, published)

Publication number: WO2012087561A3

Publication date: 2012-08-16

Application number: PCT/US2011063466

Application date: 2011-12-06

Applicant: INTEL CORP , SAGER DAVID J , SASANKA RUCHIRA , GABOR RON , RAIKIN SHLOMO , NUZMAN JOSEPH , PELED LEEOR , DOMER JASON A , KIM HO-SEOP , WU YOUFENG , YAMADA KOICHI , NGAI TIN-FOOK , CHEN HOWARD H , BOBBA JAYARAM , COOK JEFFREY J , SHAIKH OSMAR M , SRINIVAS SURESH

Inventor: SAGER DAVID J , SASANKA RUCHIRA , GABOR RON , RAIKIN SHLOMO , NUZMAN JOSEPH , PELED LEEOR , DOMER JASON A , KIM HO-SEOP , WU YOUFENG , YAMADA KOICHI , NGAI TIN-FOOK , CHEN HOWARD H , BOBBA JAYARAM , COOK JEFFREY J , SHAIKH OSMAR M , SRINIVAS SURESH

IPC: G06F9/30 , G06F9/38

CPC classification number: G06F8/4442 , G06F9/3842 , G06F9/3851 , G06F9/3861 , G06F9/54 , G06F11/3612 , G06F11/3636 , G06F11/3648 , G06F2213/0038

Abstract: Systems, apparatuses, and methods for a hardware and software system to automatically decompose a program into multiple parallel threads are described. In some embodiments, the systems and apparatuses execute a method of original code decomposition and/or generated thread execution.

9.

Invention application
APPARATUS, METHOD, AND SYSTEM FOR PROVIDING A DECISION MECHANISM FOR CONDITIONAL COMMITS IN AN ATOMIC REGION (Status: under examination, published)

Publication number: WO2012040715A3

Publication date: 2012-06-21

Application number: PCT/US2011053285

Application date: 2011-09-26

Applicant: INTEL CORP , BRETERNITZ JR MAURICIO , WU YOUFENG , WANG CHENG , BORIN EDSON , HU SHILIANG , ZILLES CRAIG B

Inventor: BRETERNITZ JR MAURICIO , WU YOUFENG , WANG CHENG , BORIN EDSON , HU SHILIANG , ZILLES CRAIG B

IPC: G06F9/30 , G06F9/06 , G06F9/305 , G06F15/76

CPC classification number: G06F8/443 , G06F8/52 , G06F9/3004 , G06F9/30072 , G06F9/30087 , G06F9/30116 , G06F9/3842 , G06F9/3857 , G06F9/466 , G06F11/3672 , G06F11/3688

Abstract: An apparatus and method is described herein for conditionally committing and/or speculatively checkpointing transactions, which potentially results in dynamic resizing of transactions. During dynamic optimization of binary code, transactions are inserted to provide memory ordering safeguards, which enables a dynamic optimizer to more aggressively optimize code. The conditional commit enables efficient execution of the dynamic optimization code while attempting to prevent transactions from running out of hardware resources, and the speculative checkpoints enable quick and efficient recovery upon abort of a transaction. Processor hardware is adapted to support dynamic resizing of the transactions, such as by including decoders that recognize a conditional commit instruction, a speculative checkpoint instruction, or both. Processor hardware is further adapted to perform operations to support conditional commit or speculative checkpointing in response to decoding such instructions.

10.

Invention application
CONTEXT-SENSITIVE SLICING FOR DYNAMICALLY PARALLELIZING BINARY PROGRAMS (Status: under examination, published)

Publication number: WO2011056278A3

Publication date: 2011-06-30

Application number: PCT/US2010046685

Application date: 2010-08-25

Applicant: INTEL CORP , BLOMSTEDT JOSEPH , WANG CHENG , WU YOUFENG

Inventor: BLOMSTEDT JOSEPH , WANG CHENG , WU YOUFENG

IPC: G06F17/21 , G06F9/30 , G06F9/38 , G06F13/14

CPC classification number: G06F11/3604 , G06F8/433 , G06F8/456

Abstract: In one embodiment of the invention, a method comprises (1) receiving an unstructured binary code region that is single-threaded; (2) determining a slice criterion for the region; (3) determining a call edge, a return edge, and a fallthrough pseudo-edge for the region based on analysis of the region at a binary level; and (4) determining a context-sensitive slice based on the call edge, the return edge, the fallthrough pseudo-edge, and the slice criterion. Embodiments of the invention may include a program analysis technique that can be used to provide context-sensitive slicing of binary programs for slicing hot regions identified at runtime, with few underlying assumptions about the program from which the binary is derived. Also, in an embodiment, a slicing method may include determining a context-insensitive slice, when a time limit is met, by determining the context-insensitive slice while treating call edges as normal control flow edges.
