SDC-causing Error Detection Based on Lightweight Vulnerability Prediction


Cheng Liu, Jingjing Gu, Zujia Yan, Fuzhen Zhuang, Yunyun Wang
Proceedings of The Eleventh Asian Conference on Machine Learning, PMLR 101:1049-1064, 2019.

Abstract

System vulnerability caused by soft errors is growing exponentially. Among soft-error effects, Silent Data Corruption (SDC) is one of the most harmful because it introduces unnoticed changes to the original data and produces erroneous outputs. Detecting SDC-causing errors is therefore critical to system reliability. However, most current detection techniques require a sufficient amount of fault-injection data for training, which is difficult to obtain in practice because of high resource consumption, such as long execution time and code-size overhead. To this end, we propose a lightweight model named Deep Forest Regression based Multi-granularity Redundancy (DFRMR), which improves the error detection rate while reducing resource consumption. Specifically, first, we use program analysis to extract instruction features that are highly related to SDCs. Second, we design a deep forest regression model to predict the SDC vulnerability of instructions. Third, we optimize the error detection procedure by duplicating critical instructions at different granularities. Finally, we evaluate DFRMR on the MiBench benchmarks with multiple test programs. The results show that our method attains better detection accuracy than other state-of-the-art methods while keeping the multi-granularity redundancy low.
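
The third step above, duplicating only the most critical instructions, can be illustrated with a minimal sketch. It is not the DFRMR procedure itself: the abstract does not specify how instructions are selected, so the greedy score-per-cost rule, the `Instruction` type, the `budget` parameter, and all numeric values below are assumptions made for illustration. The scores stand in for the output of a vulnerability predictor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instruction:
    name: str
    sdc_score: float  # predicted SDC vulnerability (hypothetical value)
    cost: float       # relative overhead of duplicating it (hypothetical, must be > 0)


def select_for_duplication(instructions, budget):
    """Greedily pick instructions with the best score-per-cost ratio
    until the overhead budget is used up (an assumed selection rule)."""
    ranked = sorted(instructions, key=lambda i: i.sdc_score / i.cost, reverse=True)
    chosen, spent = [], 0.0
    for ins in ranked:
        if spent + ins.cost <= budget:
            chosen.append(ins)
            spent += ins.cost
    return chosen


def covered_vulnerability(chosen, instructions):
    """Fraction of the total predicted vulnerability that the chosen set covers."""
    total = sum(i.sdc_score for i in instructions)
    return sum(i.sdc_score for i in chosen) / total if total else 0.0


# Hypothetical program with four instructions and made-up scores.
program = [
    Instruction("load", 0.40, 1.0),
    Instruction("add", 0.10, 1.0),
    Instruction("mul", 0.30, 2.0),
    Instruction("cmp", 0.20, 1.0),
]

chosen = select_for_duplication(program, budget=2.0)
coverage = covered_vulnerability(chosen, program)
print([i.name for i in chosen], round(coverage, 2))
```

With these made-up numbers, duplicating two of the four instructions covers most of the predicted vulnerability. That is the trade-off selective redundancy aims for: protection concentrated on the instructions most likely to cause SDCs, at a fraction of the cost of full duplication.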
