-
Publication number: US10289469B2
Publication date: 2019-05-14
Application number: US15338247
Filing date: 2016-10-28
Applicant: NVIDIA CORPORATION
Inventor: Nick Fortino, Fred Gruner, Ben Hertzberg
Abstract: Systems and methods for enhancing reliability are presented. In one embodiment, a system comprises a processor configured to execute program instructions and contemporaneously perform reliability enhancement operations (e.g., fault checking, error mitigation, etc.) incident to executing the program instructions. The fault checking can include: identifying functionality of a particular portion of the program instructions; speculatively executing multiple sets of operations contemporaneously; and comparing execution results from the multiple sets of operations. The multiple sets of operations are functional duplicates of the particular portion of the program instructions. If the execution results have a matching value, then the value can be made architecturally visible. If the execution results do not have a matching value, the system can be put in a safe mode. An error mitigation operation can be performed and can include a corrective procedure. The corrective procedure can include rollback to a known valid state.
-
Publication number: US20180121273A1
Publication date: 2018-05-03
Application number: US15338247
Filing date: 2016-10-28
Applicant: NVIDIA CORPORATION
Inventor: Nick Fortino, Fred Gruner, Ben Hertzberg
CPC classification number: G06F11/079, G06F9/30174, G06F9/3842, G06F9/3859, G06F9/3863, G06F11/0721, G06F11/0751, G06F11/0793, G06F11/36
Abstract: Systems and methods for enhancing reliability are presented. In one embodiment, a system comprises a processor configured to execute program instructions and contemporaneously perform reliability enhancement operations (e.g., fault checking, error mitigation, etc.) incident to executing the program instructions. The fault checking can include: identifying functionality of a particular portion of the program instructions; speculatively executing multiple sets of operations contemporaneously; and comparing execution results from the multiple sets of operations. The multiple sets of operations are functional duplicates of the particular portion of the program instructions. If the execution results have a matching value, then the value can be made architecturally visible. If the execution results do not have a matching value, the system can be put in a safe mode. An error mitigation operation can be performed and can include a corrective procedure. The corrective procedure can include rollback to a known valid state.
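The fault-checking flow in the abstract (duplicate, speculate, compare, then commit or roll back) can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware mechanism: the names `Machine`, `run_checked`, `add_one`, and `faulty_add_one` are hypothetical, the duplicate operation sets run one after another rather than contemporaneously, and a fault is simulated by a deliberate bit flip.

```python
import copy
from dataclasses import dataclass, field
from typing import Callable, Dict, List

State = Dict[str, int]


@dataclass
class Machine:
    """Hypothetical model: `state` stands in for architecturally visible state."""
    state: State = field(default_factory=dict)
    safe_mode: bool = False


def run_checked(machine: Machine, duplicates: List[Callable[[State], None]]) -> bool:
    """Run functionally duplicate operation sets speculatively and compare results.

    Each duplicate works on a private copy of the checkpoint, so nothing
    becomes visible until the comparison succeeds.
    """
    checkpoint = copy.deepcopy(machine.state)  # known valid state
    results = []
    for op in duplicates:
        scratch = copy.deepcopy(checkpoint)    # speculative, not yet visible
        op(scratch)
        results.append(scratch)

    if all(r == results[0] for r in results[1:]):
        machine.state = results[0]             # match: make the value visible
        return True

    machine.state = checkpoint                 # mismatch: roll back
    machine.safe_mode = True                   # and enter safe mode
    return False


def add_one(s: State) -> None:
    s["r0"] = s.get("r0", 0) + 1


def faulty_add_one(s: State) -> None:
    # Simulated fault: the correct result with one bit flipped.
    s["r0"] = (s.get("r0", 0) + 1) ^ 4
```

With a machine whose `r0` is 1, `run_checked(m, [add_one, add_one])` commits the matching result, while `run_checked(m, [add_one, faulty_add_one])` leaves the state at the checkpoint and sets `safe_mode`.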
-