-
公开(公告)号:US09400702B2
公开(公告)日:2016-07-26
申请号:US14320985
申请日:2014-07-01
Applicant: Intel Corporation
Inventor: Hu Chen , Ying Gao , Xiaocheng Zhou , Shoumeng Yan , Peinan Zhang , Mohan Rajagopalan , Jesse Fang , Avi Mendelson , Bratin Saha
CPC classification number: G06F9/544 , G06F12/0815 , G06F12/084 , G06F12/1009 , G06F12/1063 , G06F12/1072 , G06F12/1081 , G06F12/109 , G06F2212/161 , G06F2212/622 , G06F2212/656 , G06F2212/657 , G06F2212/682 , G06T1/20 , G06T1/60
Abstract: Embodiments of the invention provide a programming model for CPU-GPU platforms. In particular, embodiments of the invention provide a uniform programming model for both integrated and discrete devices. The model also works uniformly for multiple GPU cards and hybrid GPU systems (discrete and integrated). This allows software vendors to write a single application stack and target it to all the different platforms. Additionally, embodiments of the invention provide a shared memory model between the CPU and GPU. Instead of sharing the entire virtual address space, only a part of the virtual address space needs to be shared. This allows efficient implementation in both discrete and integrated settings.
-
公开(公告)号:US08683487B2
公开(公告)日:2014-03-25
申请号:US13792427
申请日:2013-03-11
Applicant: Intel Corporation , Kerry Tweet
Inventor: Zhou Xiaocheng , Shoumeng Yan , Gao Ying , Hu Chen , Peinan Zhang , Mohan Rajagopalan , Avi Mendelson , Bratin Saha
CPC classification number: G06F9/544 , G06F12/0815 , G06F12/084 , G06F12/1009 , G06F12/1063 , G06F12/1072 , G06F12/1081 , G06F12/109 , G06F2212/161 , G06F2212/622 , G06F2212/656 , G06F2212/657 , G06F2212/682 , G06T1/20 , G06T1/60
Abstract: Embodiments of the invention provide language support for CPU-GPU platforms. In one embodiment, code can be flexibly executed on both the CPU and GPU. CPU code can offload a kernel to the GPU. That kernel may in turn call preexisting libraries on the CPU, or make other calls into CPU functions. This allows an application to be built without requiring the entire call chain to be recompiled. Additionally, in one embodiment data may be shared seamlessly between CPU and GPU. This includes sharing objects that may have virtual functions. Embodiments thus ensure the right virtual function gets invoked on the CPU or the GPU if a virtual function is called by either the CPU or GPU.
Abstract translation: 本发明的实施例为CPU-GPU平台提供语言支持。 在一个实施例中,可以在CPU和GPU两者上灵活地执行代码。 CPU代码可以将内核卸载到GPU。 该内核可能会调用CPU上的预先存在的库,或者调用其他CPU函数。 这允许构建应用程序,而不需要重新编译整个调用链。 此外,在一个实施例中,数据可以在CPU和GPU之间无缝共享。 这包括共享可能具有虚拟功能的对象。 因此,如果CPU或GPU调用虚拟功能,这些实施例可以确保在CPU或GPU上调用正确的虚拟功能。
-
公开(公告)号:US09535838B2
公开(公告)日:2017-01-03
申请号:US13691106
申请日:2012-11-30
Applicant: Intel Corporation
Inventor: Jasmin Ajanovic , Mahesh Wagh , Prashant Sethi , Debendra Das Sharma , David J. Harriman , Mark B. Rosenbluth , Ajay V. Bhatt , Peter Barry , Scott Dion Rodgers , Anil Vasudevan , Sridhar Muthrasanallur , James Akiyama , Robert G. Blankenship , Ohad Falik , Avi Mendelson , Ilan Pardo , Eran Tamari , Eliezer Weissmann , Doron Shamia
CPC classification number: G06F12/0831 , G06F1/3203 , G06F1/324 , G06F1/3253 , G06F12/0815 , G06F13/385 , G06F13/4045 , G06F13/4068 , G06F13/4265 , G06F2212/621 , H04L12/66 , Y02D10/126 , Y02D10/151
Abstract: A method and apparatus for enhancing/extending a serial point-to-point interconnect architecture, such as Peripheral Component Interconnect Express (PCIe) is herein described. Temporal and locality caching hints and prefetching hints are provided to improve system wide caching and prefetching. Message codes for atomic operations to arbitrate ownership between system devices/resources are included to allow efficient access/ownership of shared data. Loose transaction ordering provided for while maintaining corresponding transaction priority to memory locations to ensure data integrity and efficient memory access. Active power sub-states and setting thereof is included to allow for more efficient power management. And, caching of device local memory in a host address space, as well as caching of system memory in a device local memory address space is provided for to improve bandwidth and latency for memory accesses.
Abstract translation: 这里描述了用于增强/扩展串行点对点互连架构的方法和装置,例如外围组件互连Express(PCIe)。 提供了时间和地点缓存提示和预取提示,以改进系统范围的缓存和预取。 包括用于仲裁系统设备/资源之间的所有权的原子操作的消息代码,以便有效地访问/拥有共享数据。 提供的松散的事务排序,同时将对应的事务优先级保持到内存位置,以确保数据完整性和高效的内存访问。 包括有功功率子状态及其设置以允许更有效的电源管理。 并且,提供设备本地存储器在主机地址空间中的缓存以及设备本地存储器地址空间中的系统存储器的缓存,以提高存储器访问的带宽和延迟。
-
公开(公告)号:US20150161050A1
公开(公告)日:2015-06-11
申请号:US14583601
申请日:2014-12-27
Applicant: Intel Corporation
Inventor: Jasmin Ajanovic , Mahesh Wagh , Prashant Sethi , Debendra Das Sharma , David J. Harriman , Mark B. Rosenbluth , Ajay V. Bhatt , Peter Barry , Scott Dion Rodgers , Anil Vasudevan , Sridhar Muthrasanallur , James Akiyama , Robert G. Blankenship , Ohad Falik , Avi Mendelson , Ilan Pardo , Eran Tamari , Eliezer Weissmann , Doron Shamia
IPC: G06F12/08
CPC classification number: G06F12/0831 , G06F1/3203 , G06F1/324 , G06F1/3253 , G06F12/0815 , G06F13/385 , G06F13/4045 , G06F13/4068 , G06F13/4265 , G06F2212/621 , H04L12/66 , Y02D10/126 , Y02D10/151
Abstract: A method and apparatus for enhancing/extending a serial point-to-point interconnect architecture, such as Peripheral Component Interconnect Express (PCIe) is herein described. Temporal and locality caching hints and prefetching hints are provided to improve system wide caching and prefetching. Message codes for atomic operations to arbitrate ownership between system devices/resources are included to allow efficient access/ownership of shared data. Loose transaction ordering provided for while maintaining corresponding transaction priority to memory locations to ensure data integrity and efficient memory access. Active power sub-states and setting thereof is included to allow for more efficient power management. And, caching of device local memory in a host address space, as well as caching of system memory in a device local memory address space is provided for to improve bandwidth and latency for memory accesses.
Abstract translation: 这里描述了用于增强/扩展串行点对点互连架构的方法和装置,例如外围组件互连Express(PCIe)。 提供了时间和地点缓存提示和预取提示,以改进系统范围的缓存和预取。 包括用于仲裁系统设备/资源之间的所有权的原子操作的消息代码,以便有效地访问/拥有共享数据。 提供的松散的事务排序,同时将对应的事务优先级保持到内存位置,以确保数据完整性和高效的内存访问。 包括有功功率子状态及其设置以允许更有效的电源管理。 并且,提供设备本地存储器在主机地址空间中的缓存以及设备本地存储器地址空间中的系统存储器的缓存,以提高存储器访问的带宽和延迟。
-
公开(公告)号:US09032103B2
公开(公告)日:2015-05-12
申请号:US13706575
申请日:2012-12-06
Applicant: Intel Corporation
Inventor: Jasmin Ajanovic , Mahesh Wagh , Prashant Sethi , Debendra Das Sharma , David J. Harriman , Mark B. Rosenbluth , Ajay V. Bhatt , Peter Barry , Scott Dion Rodgers , Anil Vasudevan , Sridhar Muthrasanallur , James Akiyama , Robert G. Blankenship , Ohad Falik , Avi Mendelson , Ilan Pardo , Eran Tamari , Eliezer Weissmann , Doron Shamia
CPC classification number: G06F12/0831 , G06F1/3203 , G06F1/324 , G06F1/3253 , G06F12/0815 , G06F13/385 , G06F13/4045 , G06F13/4068 , G06F13/4265 , G06F2212/621 , H04L12/66 , Y02D10/126 , Y02D10/151
Abstract: A method and apparatus for enhancing/extending a serial point-to-point interconnect architecture, such as Peripheral Component Interconnect Express (PCIe) is herein described. Temporal and locality caching hints and prefetching hints are provided to improve system wide caching and prefetching. Message codes for atomic operations to arbitrate ownership between system devices/resources are included to allow efficient access/ownership of shared data. Loose transaction ordering provided for while maintaining corresponding transaction priority to memory locations to ensure data integrity and efficient memory access. Active power sub-states and setting thereof is included to allow for more efficient power management. And, caching of device local memory in a host address space, as well as caching of system memory in a device local memory address space is provided for to improve bandwidth and latency for memory accesses.
Abstract translation: 这里描述了用于增强/扩展串行点对点互连架构的方法和装置,例如外围组件互连Express(PCIe)。 提供了时间和地点缓存提示和预取提示,以改进系统范围的缓存和预取。 包括用于仲裁系统设备/资源之间的所有权的原子操作的消息代码,以便有效地访问/拥有共享数据。 提供的松散的事务排序,同时将对应的事务优先级保持到内存位置,以确保数据完整性和高效的内存访问。 包括有功功率子状态及其设置以允许更有效的电源管理。 并且,提供设备本地存储器在主机地址空间中的缓存以及设备本地存储器地址空间中的系统存储器的缓存,以提高存储器访问的带宽和延迟。
-
公开(公告)号:US09442855B2
公开(公告)日:2016-09-13
申请号:US14583601
申请日:2014-12-27
Applicant: Intel Corporation
Inventor: Jasmin Ajanovic , Mahesh Wagh , Prashant Sethi , Debendra Das Sharma , David J. Harriman , Mark B. Rosenbluth , Ajay V. Bhatt , Peter Barry , Scott Dion Rodgers , Anil Vasudevan , Sridhar Muthrasanallur , James Akiyama , Robert G. Blankenship , Ohad Falik , Avi Mendelson , Ilan Pardo , Eran Tamari , Eliezer Weissmann , Doron Shamia
CPC classification number: G06F12/0831 , G06F1/3203 , G06F1/324 , G06F1/3253 , G06F12/0815 , G06F13/385 , G06F13/4045 , G06F13/4068 , G06F13/4265 , G06F2212/621 , H04L12/66 , Y02D10/126 , Y02D10/151
Abstract: A method and apparatus for enhancing/extending a serial point-to-point interconnect architecture, such as Peripheral Component Interconnect Express (PCIe) is herein described. Temporal and locality caching hints and prefetching hints are provided to improve system wide caching and prefetching. Message codes for atomic operations to arbitrate ownership between system devices/resources are included to allow efficient access/ownership of shared data. Loose transaction ordering provided for while maintaining corresponding transaction priority to memory locations to ensure data integrity and efficient memory access. Active power sub-states and setting thereof is included to allow for more efficient power management. And, caching of device local memory in a host address space, as well as caching of system memory in a device local memory address space is provided for to improve bandwidth and latency for memory accesses.
Abstract translation: 这里描述了用于增强/扩展串行点对点互连架构的方法和装置,例如外围组件互连Express(PCIe)。 提供了时间和地点缓存提示和预取提示,以改进系统范围的缓存和预取。 包括用于仲裁系统设备/资源之间的所有权的原子操作的消息代码,以便有效地访问/拥有共享数据。 提供的松散的事务排序,同时将对应的事务优先级保持到内存位置,以确保数据完整性和高效的内存访问。 包括有功功率子状态及其设置以允许更有效的电源管理。 并且,提供设备本地存储器在主机地址空间中的缓存以及设备本地存储器地址空间中的系统存储器的缓存,以提高存储器访问的带宽和延迟。
-
公开(公告)号:US08806068B2
公开(公告)日:2014-08-12
申请号:US13706575
申请日:2012-12-06
Applicant: Intel Corporation
Inventor: Jasmin Ajanovic , Mahesh Wagh , Prashant Sethi , Debendra Das Sharma , David J. Harriman , Mark B. Rosenbluth , Ajay V. Bhatt , Peter Barry , Scott Dion Rodgers , Anil Vasudevan , Sridhar Muthrasanallur , James Akiyama , Robert G. Blankenship , Ohad Falik , Avi Mendelson , Ilan Pardo , Eran Tamari , Eliezer Weissmann , Doron Shamia
Abstract: A method and apparatus for enhancing/extending a serial point-to-point interconnect architecture, such as Peripheral Component Interconnect Express (PCIe) is herein described. Temporal and locality caching hints and prefetching hints are provided to improve system wide caching and prefetching. Message codes for atomic operations to arbitrate ownership between system devices/resources are included to allow efficient access/ownership of shared data. Loose transaction ordering provided for while maintaining corresponding transaction priority to memory locations to ensure data integrity and efficient memory access. Active power sub-states and setting thereof is included to allow for more efficient power management. And, caching of device local memory in a host address space, as well as caching of system memory in a device local memory address space is provided for to improve bandwidth and latency for memory accesses.
-
公开(公告)号:US09588826B2
公开(公告)日:2017-03-07
申请号:US14566654
申请日:2014-12-10
Applicant: Intel Corporation
Inventor: Hu Chen , Ying Gao , Xiaocheng Zhou , Shoumeng Yan , Peinan Zhang , Mohan Rajagopalan , Jesse Fang , Avi Mendelson , Bratin Saha
CPC classification number: G06F9/544 , G06F12/0815 , G06F12/084 , G06F12/1009 , G06F12/1063 , G06F12/1072 , G06F12/1081 , G06F12/109 , G06F2212/161 , G06F2212/622 , G06F2212/656 , G06F2212/657 , G06F2212/682 , G06T1/20 , G06T1/60
Abstract: Embodiments of the invention provide a programming model for CPU-GPU platforms. In particular, embodiments of the invention provide a uniform programming model for both integrated and discrete devices. The model also works uniformly for multiple GPU cards and hybrid GPU systems (discrete and integrated). This allows software vendors to write a single application stack and target it to all the different platforms. Additionally, embodiments of the invention provide a shared memory model between the CPU and GPU. Instead of sharing the entire virtual address space, only a part of the virtual address space needs to be shared. This allows efficient implementation in both discrete and integrated settings.
Abstract translation: 本发明的实施例提供了一种用于CPU-GPU平台的编程模型。 特别地,本发明的实施例为集成和分立设备提供统一的编程模型。 该模型还适用于多个GPU卡和混合GPU系统(分立和集成)。 这允许软件供应商编写单个应用程序堆栈并将其定位到所有不同的平台。 另外,本发明的实施例提供了CPU和GPU之间的共享存储器模型。 而不是共享整个虚拟地址空间,只需要共享虚拟地址空间的一部分。 这允许在离散和集成设置中有效实现。
-
公开(公告)号:US20150123978A1
公开(公告)日:2015-05-07
申请号:US14566654
申请日:2014-12-10
Applicant: Intel Corporation
Inventor: Hu Chen , Ying Gao , Xiaocheng Zhou , Shoumeng Yan , Peinan Zhang , Mohan Rajagopalan , Jesse Fang , Avi Mendelson , Bratin Saha
CPC classification number: G06F9/544 , G06F12/0815 , G06F12/084 , G06F12/1009 , G06F12/1063 , G06F12/1072 , G06F12/1081 , G06F12/109 , G06F2212/161 , G06F2212/622 , G06F2212/656 , G06F2212/657 , G06F2212/682 , G06T1/20 , G06T1/60
Abstract: Embodiments of the invention provide a programming model for CPU-GPU platforms. In particular, embodiments of the invention provide a uniform programming model for both integrated and discrete devices. The model also works uniformly for multiple GPU cards and hybrid GPU systems (discrete and integrated). This allows software vendors to write a single application stack and target it to all the different platforms. Additionally, embodiments of the invention provide a shared memory model between the CPU and GPU. Instead of sharing the entire virtual address space, only a part of the virtual address space needs to be shared. This allows efficient implementation in both discrete and integrated settings.
Abstract translation: 本发明的实施例提供了一种用于CPU-GPU平台的编程模型。 特别地,本发明的实施例为集成和分立设备提供统一的编程模型。 该模型还适用于多个GPU卡和混合GPU系统(分立和集成)。 这允许软件供应商编写单个应用程序堆栈并将其定位到所有不同的平台。 另外,本发明的实施例提供了CPU和GPU之间的共享存储器模型。 而不是共享整个虚拟地址空间,只需要共享虚拟地址空间的一部分。 这允许在离散和集成设置中有效实现。
-
公开(公告)号:US09026682B2
公开(公告)日:2015-05-05
申请号:US13713635
申请日:2012-12-13
Applicant: Intel Corporation
Inventor: Jasmin Ajanovic , Mahesh Wagh , Prashant Sethi , Debendra Das Sharma , David J. Harriman , Mark B. Rosenbluth , Ajay V. Bhatt , Peter Barry , Scott Dion Rodgers , Anil Vasudevan , Sridhar Muthrasanallur , James Akiyama , Robert G. Blankenship , Ohad Falik , Avi Mendelson , Ilan Pardo , Eran Tamari , Eliezer Weissmann , Doron Shamia
CPC classification number: G06F12/0831 , G06F1/3203 , G06F1/324 , G06F1/3253 , G06F12/0815 , G06F13/385 , G06F13/4045 , G06F13/4068 , G06F13/4265 , G06F2212/621 , H04L12/66 , Y02D10/126 , Y02D10/151
Abstract: A method and apparatus for enhancing/extending a serial point-to-point interconnect architecture, such as Peripheral Component Interconnect Express (PCIe) is herein described. Temporal and locality caching hints and prefetching hints are provided to improve system wide caching and prefetching. Message codes for atomic operations to arbitrate ownership between system devices/resources are included to allow efficient access/ownership of shared data. Loose transaction ordering provided for while maintaining corresponding transaction priority to memory locations to ensure data integrity and efficient memory access. Active power sub-states and setting thereof is included to allow for more efficient power management. And, caching of device local memory in a host address space, as well as caching of system memory in a device local memory address space is provided for to improve bandwidth and latency for memory accesses.
Abstract translation: 这里描述了用于增强/扩展串行点对点互连架构的方法和装置,例如外围组件互连Express(PCIe)。 提供了时间和地点缓存提示和预取提示,以改进系统范围的缓存和预取。 包括用于仲裁系统设备/资源之间的所有权的原子操作的消息代码,以便有效地访问/拥有共享数据。 提供的松散的事务排序,同时将对应的事务优先级保持到内存位置,以确保数据完整性和高效的内存访问。 包括有功功率子状态及其设置以允许更有效的电源管理。 并且,提供设备本地存储器在主机地址空间中的缓存以及设备本地存储器地址空间中的系统存储器的缓存,以提高存储器访问的带宽和延迟。
-
-
-
-
-
-
-
-
-