Electrical and Computer Engineering
Journal or Book Title
As DRAM technology continues to evolve towards smaller feature sizes and increased densities, faults in DRAM subsystem are becoming more severe. Current servers mostly use CHIPKILL based schemes to tolerate up-to one/two symbol errors per DRAM beat. Multi-symbol errors arising due to faults in multiple data buses and chips may not be detected by these schemes. In this paper, we introduce Single Symbol Correction Multiple Symbol Detection (SSCMSD) - a novel error handling scheme to correct single-symbol errors and detect multi-symbol errors. Our scheme makes use of a hash in combination with Error Correcting Code (ECC) to avoid silent data corruptions (SDCs). SSCMSD can also enhance the capability of detecting errors in address bits. We employ 32-bit CRC along with Reed-Solomon code to implement SSCMSD for a x4 based DDRx system. Our simulations show that the proposed scheme effectively prevents SDCs in the presence of multiple symbol errors. Our novel design enabled us to achieve this without introducing additional READ latency. Also, we need 19 chips per rank (storage overhead of 18.75 percent), 76 data bus-lines and additional hash-logic at the memory controller.
© 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Yeleswarapu, Ravikiran and Somani, Arun K., "Addressing multiple bit/symbol errors in DRAM subsystem" (2019). Electrical and Computer Engineering Publications. 224.