Abstract:
Embodiments of the present disclosure describe methods, apparatus, and system configurations for providing rank-specific cyclic redundancy checks in memory systems.
Abstract:
Provided are an apparatus and method to store data from a cache line at locations having errors in a sparing directory. In response to a write operation having write data for locations in one of the cache lines, the write data for a location in the cache line having an error is written to an entry in a sparing directory including an address of the cache line.
Abstract:
Error correction in a memory subsystem includes determining whether an error is a transient error or a persistent error, and adjusting an approach to ECC (error checking and correction) based on error type. The type of error can be determined by a built in self-test. If the error is a persistent error, the memory controller can perform in erasure mode, including correcting an erasure for an identified error location prior to applying an ECC correction algorithm. Otherwise, if the error is transient, the memory controller can perform standard full ECC correction by applying the ECC correction algorithm.
Abstract:
Methods, techniques, systems and apparatuses for utilizing reserved space for error correcting functionality. A cache line ("reserved line") in a plurality of cache lines to store error correcting code (ECC) data is utilized for storing ECC data corresponding to other cache lines within the plurality of cache lines when a memory device has failed.
Abstract:
Error correction in a memory subsystem includes a memory device generating internal check bits after performing internal error detection and correction, and providing the internal check bits to the memory controller. The memory device performs internal error detection to detect errors in read data in response to a read request from the memory controller. The memory device selectively performs internal error correction if an error is detected in the read data. The memory device generates check bits indicating an error vector for the read data after performing internal error detection and correction, and provides the check bits with the read data to the memory controller in response to the read request. The memory controller can apply the check bits for error correction external to the memory device.
Abstract:
A novel ECC scheme is disclosed that offers an error protection level that is at least the same as (if not better than) that of the conventional ECC scheme without negatively impacting latency and design complexity. Embodiments of the present disclosure utilize an ECC scheme which leaves up to extra 2B for metadata storage by changing the error detection and correction process flow. The scheme adopts an early error detection mechanism, and tailors the need for subsequent error correction based on the results of the early detection.
Abstract:
Memory subsystem error management enables dynamically changing lockstep partnerships. A memory subsystem has a lockstep partnership relationship between a first memory portion and a second memory portion to spread error correction over the pair of memory resources. The lockstep partnership can be preconfigured. In response to detecting a hard error in the lockstep partnership, the memory subsystem can cancel or reverse the lockstep partnership between the first memory portion and the second memory portion and create or set a new lockstep partnership. The detected error can be a second hard error in the lockstep partnership. The memory subsystem can create new lockstep partnerships between the first memory portion and a third memory portion as lockstep partners and between the second memory portion and a fourth memory portion as lockstep partners. The memory subsystem can also be configured to change the granularity of the lockstep partnership when changing partnerships.
Abstract:
An apparatus and method are described for performing forward and reverse memory sparing operations. For example, one embodiment of a processor comprises memory sparing logic to perform a first forward memory sparing operation at a first level of granularity in response to detecting a memory failure; the memory sparing logic to perform a reverse memory sparing operation in response to a determination of an improved sparing state having a second level of granularity; and the memory sparing logic to responsively perform a second forward memory sparing operation at the second level of granularity.
Abstract:
In accordance with some embodiments, memory modules containing phase change memory elements may be organized so that each memory integrated circuit includes both data and error correcting code. As a result of including the error correcting code in each integrated circuit, extra accesses of the memory module to extract the error correcting code can be avoided, improving the performance of the overall memory module in some embodiments.
Abstract:
Memory subsystem error management enables dynamically changing lockstep partnerships. A memory subsystem has a lockstep partnership relationship between a first memory portion and a second memory portion to spread error correction over the pair of memory resources. The lockstep partnership can be preconfigured. In response to detecting a hard error in the lockstep partnership, the memory subsystem can cancel or reverse the lockstep partnership between the first memory portion and the second memory portion and create or set a new lockstep partnership. The detected error can be a second hard error in the lockstep partnership. The memory subsystem can create new lockstep partnerships between the first memory portion and a third memory portion as lockstep partners and between the second memory portion and a fourth memory portion as lockstep partners. The memory subsystem can also be configured to change the granularity of the lockstep partnership when changing partnerships.