Assisting parallelization of a computer program
    Assisting parallelization of a computer program

    Patent type: granted invention
    Legal status: in force

    Publication number: US09250877B2

    Publication date: 2016-02-02

    Application number: US14033306

    Filing date: 2013-09-20

    Applicant: Cray Inc.

    Abstract: A parallelization assistant tool system to assist in parallelization of a computer program is disclosed. The system directs the execution of instrumented code of the computer program to collect performance statistics information relating to execution of loops within the computer program. The system provides a user interface for presenting to a programmer the performance statistics information collected for a loop within the computer program so that the programmer can prioritize efforts to parallelize the computer program. The system generates inlined source code of a loop by aggressively inlining functions substantially without regard to compilation performance, execution performance, or both. The system analyzes the inlined source code to determine the data-sharing attributes of the variables of the loop. The system may generate compiler directives to specify the data-sharing attributes of the variables.

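The data-sharing analysis described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the patented method: the loop is already inlined and reduced to a list of hypothetical `(target, sources, accum_op)` statements, arrays are assumed to be indexed by the loop index, and the generated directive uses OpenMP syntax, which the abstract does not name.

```python
def classify_variables(loop_index, arrays, stmts):
    """Classify the variables of one loop body.

    stmts is a list of (target, sources, accum_op) tuples in program order
    (hypothetical representation). accum_op is None for a plain assignment,
    or an operator such as "+" for an update of the form
    `target = target + ...`.
    """
    attrs = {loop_index: "private"}
    # Assumption: array elements are indexed by the loop index, so
    # iterations touch disjoint elements and the array can be shared.
    attrs.update((name, "shared") for name in arrays)

    assigned = set()       # scalars given a plain assignment so far
    exposed_reads = set()  # scalars read before being assigned in an iteration
    all_reads = set()
    accum = {}             # scalar -> set of accumulation operators

    for target, sources, accum_op in stmts:
        for name in sources:
            all_reads.add(name)
            if name not in assigned:
                exposed_reads.add(name)
        if accum_op is None:
            assigned.add(target)
        else:
            accum.setdefault(target, set()).add(accum_op)

    for name in sorted(assigned | all_reads | set(accum)):
        if name in attrs:
            continue
        if name in accum:
            ops = accum[name]
            clean = (len(ops) == 1 and name not in assigned
                     and name not in all_reads)
            attrs[name] = f"reduction({next(iter(ops))})" if clean else "unresolved"
        elif name not in assigned:
            attrs[name] = "shared"      # read-only inside the loop
        elif name not in exposed_reads:
            attrs[name] = "private"     # always written before it is read
        else:
            attrs[name] = "unresolved"  # carried across iterations
    return attrs


def omp_directive(attrs):
    """Render the attributes as an OpenMP-style directive string.

    Unresolved variables are left out on purpose; with default(none) the
    compiler then rejects the loop until the programmer decides.
    """
    groups = {}
    for name, attr in attrs.items():
        groups.setdefault(attr, []).append(name)
    clauses = []
    for attr in sorted(groups):
        names = ",".join(sorted(groups[attr]))
        if attr.startswith("reduction("):
            clauses.append(f"reduction({attr[len('reduction('):-1]}:{names})")
        elif attr in ("shared", "private"):
            clauses.append(f"{attr}({names})")
    return "#pragma omp parallel for default(none) " + " ".join(clauses)


# Loop body: t = a[i] * scale; b[i] = t + 1; total += t
stmts = [
    ("t", ["a", "i", "scale"], None),
    ("b", ["t", "i"], None),
    ("total", ["t"], "+"),
]
attrs = classify_variables("i", ["a", "b"], stmts)
print(omp_directive(attrs))
```

A variable that is read before it is written in the same iteration and is not a simple accumulation is reported as `unresolved`, which corresponds to the cases where a tool of this kind would have to ask the programmer.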

    High-bandwidth prefetcher for high-bandwidth memory

    Publication number: US09946654B2

    Publication date: 2018-04-17

    Application number: US15335041

    Filing date: 2016-10-26

    Applicant: Cray Inc.

    CPC classification number: G06F12/0862 G06F12/1054 G06F2212/602 G06F2212/68

    Abstract: A method for prefetching data into a cache is provided. The method allocates an outstanding request buffer (“ORB”). The method stores in an address field of the ORB an address and a number of blocks. The method issues prefetch requests for a degree number of blocks starting at the address. When a prefetch response is received for all the prefetch requests, the method adjusts the address of the next block to prefetch and adjusts the number of blocks remaining to be retrieved and then issues prefetch requests for a degree number of blocks starting at the adjusted address. The prefetching pauses when a maximum distance between the reads of the prefetched blocks and the last prefetched block is reached. When a read request for a prefetched block is received, the method resumes prefetching when a resume criterion is satisfied.
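The issue, pause, and resume cycle in the abstract can be sketched as a small software model. This is only an illustration under stated assumptions: addresses are counted in whole blocks, the class and parameter names (`OrbPrefetcher`, `max_distance`, `resume_distance`) are hypothetical, and the resume criterion is assumed to be "the distance has dropped to `resume_distance` or less", which the abstract does not specify.

```python
class OrbPrefetcher:
    """Toy model of one outstanding request buffer (ORB) entry."""

    def __init__(self, address, num_blocks, degree, max_distance, resume_distance):
        self.address = address            # next block to prefetch
        self.remaining = num_blocks       # blocks still to be retrieved
        self.degree = degree              # blocks requested per batch
        self.max_distance = max_distance
        self.resume_distance = resume_distance  # assumed resume criterion
        self.last_prefetched = address - 1      # nothing prefetched yet
        self.last_read = address - 1            # nothing read yet
        self.in_flight = set()
        self.paused = False

    def distance(self):
        """How far prefetching has run ahead of the reads, in blocks."""
        return self.last_prefetched - self.last_read

    def issue(self):
        """Return the block addresses to request now (possibly none)."""
        if self.in_flight or self.remaining == 0:
            return []
        if self.distance() >= self.max_distance:
            self.paused = True            # too far ahead of the reader
            return []
        count = min(self.degree, self.remaining)
        batch = list(range(self.address, self.address + count))
        self.in_flight = set(batch)
        return batch

    def on_response(self, block):
        """Record a prefetch response; issue the next batch once all arrived."""
        self.in_flight.discard(block)
        self.last_prefetched = max(self.last_prefetched, block)
        if self.in_flight:
            return []
        advanced = self.last_prefetched + 1 - self.address
        self.address += advanced          # adjust the next address
        self.remaining -= advanced        # adjust the remaining block count
        return self.issue()

    def on_read(self, block):
        """Record a read of a prefetched block; resume if the criterion holds."""
        self.last_read = max(self.last_read, block)
        if self.paused and self.distance() <= self.resume_distance:
            self.paused = False
            return self.issue()
        return []


orb = OrbPrefetcher(address=100, num_blocks=8, degree=2,
                    max_distance=4, resume_distance=2)
first = orb.issue()           # first batch of `degree` blocks
orb.on_response(100)
second = orb.on_response(101) # all responses in, so the next batch goes out
```

Waiting for every response in a batch before advancing keeps the bookkeeping to a single address and a single counter per ORB, at the cost of stalling on the slowest response in each batch.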

    HIGH-BANDWIDTH PREFETCHER FOR HIGH-BANDWIDTH MEMORY

    Publication number: US20180074963A1

    Publication date: 2018-03-15

    Application number: US15335041

    Filing date: 2016-10-26

    Applicant: Cray Inc.

    CPC classification number: G06F12/0862 G06F12/1054 G06F2212/602 G06F2212/68

    Abstract: A method for prefetching data into a cache is provided. The method allocates an outstanding request buffer (“ORB”). The method stores in an address field of the ORB an address and a number of blocks. The method issues prefetch requests for a degree number of blocks starting at the address. When a prefetch response is received for all the prefetch requests, the method adjusts the address of the next block to prefetch and adjusts the number of blocks remaining to be retrieved and then issues prefetch requests for a degree number of blocks starting at the adjusted address. The prefetching pauses when a maximum distance between the reads of the prefetched blocks and the last prefetched block is reached. When a read request for a prefetched block is received, the method resumes prefetching when a resume criterion is satisfied.

    HIGH-BANDWIDTH PREFETCHER FOR HIGH-BANDWIDTH MEMORY

    Publication number: US20190042435A1

    Publication date: 2019-02-07

    Application number: US15913749

    Filing date: 2018-03-06

    Applicant: Cray Inc.

    Abstract: A method for prefetching data into a cache is provided. The method allocates an outstanding request buffer (“ORB”). The method stores in an address field of the ORB an address and a number of blocks. The method issues prefetch requests for a degree number of blocks starting at the address. When a prefetch response is received for all the prefetch requests, the method adjusts the address of the next block to prefetch and adjusts the number of blocks remaining to be retrieved and then issues prefetch requests for a degree number of blocks starting at the adjusted address. The prefetching pauses when a maximum distance between the reads of the prefetched blocks and the last prefetched block is reached. When a read request for a prefetched block is received, the method resumes prefetching when a resume criterion is satisfied.

    Memory allocation system for multi-tier memory

    Publication number: US10185659B2

    Publication date: 2019-01-22

    Application number: US15374114

    Filing date: 2016-12-09

    Applicant: Cray, Inc.

    Abstract: A system is provided for allocating memory for data of a program for execution by a computer system with a multi-tier memory that includes LBM and HBM. The system accesses a data structure map that maps data structures of the program to the memory addresses within an address space of the program to which the data structures are initially allocated. The system executes the program to collect statistics relating to memory requests and memory bandwidth utilization of the program. The system determines an extent to which each data structure is used by a high memory utilization portion of the program based on the data structure map and the collected statistics. The system generates a memory allocation plan that favors allocating data structures in HBM based on the extent to which the data structures are used by a high memory utilization portion of the program.
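One way to picture the allocation plan is the sketch below. It is an illustration under stated assumptions rather than the patented method: the data structure map is a hypothetical dict of `name -> (start_address, size_in_bytes)`, the collected statistics are reduced to sampled addresses flagged as falling inside a high-utilization portion of the run or not, and the plan is built with a simple greedy pass by samples per byte. LBM and HBM are read here as low-bandwidth and high-bandwidth memory.

```python
def build_allocation_plan(structure_map, samples, hbm_capacity):
    """Assign each data structure to "HBM" or "LBM".

    structure_map: name -> (start_address, size_in_bytes), sizes > 0
    samples: iterable of (address, in_high_utilization_portion)
    hbm_capacity: bytes of HBM available to the program
    """
    # Extent of use: samples from high-utilization portions per structure.
    hits = {name: 0 for name in structure_map}
    for address, high in samples:
        if not high:
            continue
        for name, (start, size) in structure_map.items():
            if start <= address < start + size:
                hits[name] += 1
                break

    # Greedy assumption: favor structures with the most hits per byte.
    order = sorted(structure_map,
                   key=lambda n: (-hits[n] / structure_map[n][1], n))

    plan, free = {}, hbm_capacity
    for name in order:
        size = structure_map[name][1]
        if hits[name] > 0 and size <= free:
            plan[name] = "HBM"
            free -= size
        else:
            plan[name] = "LBM"
    return plan


structure_map = {
    "grid": (0x1000, 4096),
    "halo": (0x2000, 1024),
    "log":  (0x3000, 8192),
}
samples = [
    (0x1000, True), (0x1800, True), (0x1FFF, True),  # grid, high utilization
    (0x2000, True), (0x2010, True),                  # halo, high utilization
    (0x3000, False), (0x3100, False),                # log, low utilization
    (0x1004, False),                                 # grid, low utilization
]
plan = build_allocation_plan(structure_map, samples, hbm_capacity=4096)
print(plan)
```

Ranking by hits per byte rather than raw hit count means a small, heavily used structure can displace a larger one when HBM is scarce; a real planner could weigh other factors, and that choice is an assumption of this sketch.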

    MEMORY ALLOCATION SYSTEM FOR MULTI-TIER MEMORY

    Publication number: US20180322064A1

    Publication date: 2018-11-08

    Application number: US16034216

    Filing date: 2018-07-12

    Applicant: Cray, Inc.

    Abstract: A system is provided for allocating memory for data of a program for execution by a computer system with a multi-tier memory that includes LBM and HBM. The system accesses a data structure map that maps data structures of the program to the memory addresses within an address space of the program to which the data structures are initially allocated. The system executes the program to collect statistics relating to memory requests and memory bandwidth utilization of the program. The system determines an extent to which each data structure is used by a high memory utilization portion of the program based on the data structure map and the collected statistics. The system generates a memory allocation plan that favors allocating data structures in HBM based on the extent to which the data structures are used by a high memory utilization portion of the program.

    ASSISTING PARALLELIZATION OF A COMPUTER PROGRAM

    Publication number: US20160110174A1

    Publication date: 2016-04-21

    Application number: US14978211

    Filing date: 2015-12-22

    Applicant: Cray Inc.

    Abstract: A parallelization assistant tool system to assist in parallelization of a computer program is disclosed. The system directs the execution of instrumented code of the computer program to collect performance statistics information relating to execution of loops within the computer program. The system provides a user interface for presenting to a programmer the performance statistics information collected for a loop within the computer program so that the programmer can prioritize efforts to parallelize the computer program. The system generates inlined source code of a loop by aggressively inlining functions substantially without regard to compilation performance, execution performance, or both. The system analyzes the inlined source code to determine the data-sharing attributes of the variables of the loop. The system may generate compiler directives to specify the data-sharing attributes of the variables.

    Assisting parallelization of a computer program

    Publication number: US10761820B2

    Publication date: 2020-09-01

    Application number: US14978211

    Filing date: 2015-12-22

    Applicant: Cray Inc.

    Abstract: A parallelization assistant tool system to assist in parallelization of a computer program is disclosed. The system directs the execution of instrumented code of the computer program to collect performance statistics information relating to execution of loops within the computer program. The system provides a user interface for presenting to a programmer the performance statistics information collected for a loop within the computer program so that the programmer can prioritize efforts to parallelize the computer program. The system generates inlined source code of a loop by aggressively inlining functions substantially without regard to compilation performance, execution performance, or both. The system analyzes the inlined source code to determine the data-sharing attributes of the variables of the loop. The system may generate compiler directives to specify the data-sharing attributes of the variables.

    HIGH-BANDWIDTH PREFETCHER FOR HIGH-BANDWIDTH MEMORY

    Publication number: US20190163637A9

    Publication date: 2019-05-30

    Application number: US15913749

    Filing date: 2018-03-06

    Applicant: Cray Inc.

    Abstract: A method for prefetching data into a cache is provided. The method allocates an outstanding request buffer (“ORB”). The method stores in an address field of the ORB an address and a number of blocks. The method issues prefetch requests for a degree number of blocks starting at the address. When a prefetch response is received for all the prefetch requests, the method adjusts the address of the next block to prefetch and adjusts the number of blocks remaining to be retrieved and then issues prefetch requests for a degree number of blocks starting at the adjusted address. The prefetching pauses when a maximum distance between the reads of the prefetched blocks and the last prefetched block is reached. When a read request for a prefetched block is received, the method resumes prefetching when a resume criterion is satisfied.

    High-bandwidth prefetcher for high-bandwidth memory

    Publication number: US10303610B2

    Publication date: 2019-05-28

    Application number: US15913749

    Filing date: 2018-03-06

    Applicant: Cray Inc.

    Abstract: A method for prefetching data into a cache is provided. The method allocates an outstanding request buffer (“ORB”). The method stores in an address field of the ORB an address and a number of blocks. The method issues prefetch requests for a degree number of blocks starting at the address. When a prefetch response is received for all the prefetch requests, the method adjusts the address of the next block to prefetch and adjusts the number of blocks remaining to be retrieved and then issues prefetch requests for a degree number of blocks starting at the adjusted address. The prefetching pauses when a maximum distance between the reads of the prefetched blocks and the last prefetched block is reached. When a read request for a prefetched block is received, the method resumes prefetching when a resume criterion is satisfied.
