Single Event Upsets Fault Tolerance of Convolutional Neural Networks Based on Adaptive Boosting
Abstract: Single-event upsets (SEUs) in the space radiation environment pose a serious threat to the reliability of satellite-borne intelligent systems. Traditional fault-tolerance methods such as triple modular redundancy (TMR) and periodic scrubbing incur large resource overhead and high power consumption. This paper proposes a lightweight Adaptive Boosting-based Fault-Tolerance Method (AB-FTM) to address SEU vulnerabilities in convolutional neural networks. The method builds a heterogeneous ensemble of three weak models (ResNet20, ResNet32, and ResNet44) and combines it with a dynamic weight adjustment mechanism. Compared with the original ResNet110, it reduces the parameter count by 18.2% while improving classification accuracy, robustness, and fault tolerance. Experiments on the CIFAR-10, MNIST, EuroSAT, and Galaxy10 DECals datasets show that, when 0.032‰ of the parameters suffer single-event upsets, the method improves classification accuracy over TMR-based ResNet110 by 53.25%, 63.49%, 57.67%, and 47.43%, respectively. The approach offers future space science satellites that use satellite-borne intelligent systems a solution that balances reliability, light weight, and computational efficiency.
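The abstract does not spell out how the three weak models are combined, so the following is only a minimal sketch of one common choice: a multi-class AdaBoost (SAMME-style) weighted vote. The function names, the error rates, and the idea of re-estimating a model's error rate after it degrades are all illustrative assumptions, not the authors' implementation.

```python
import math


def vote_weight(error_rate: float, num_classes: int) -> float:
    """SAMME-style vote weight for one model (assumed rule, not from the paper).

    A model that is no better than random guessing gets weight 0.
    """
    eps = 1e-12  # keep the logarithm finite for error rates of exactly 0 or 1
    e = min(max(error_rate, eps), 1.0 - eps)
    alpha = math.log((1.0 - e) / e) + math.log(num_classes - 1)
    return max(alpha, 0.0)


def weighted_vote(predictions, weights, num_classes: int) -> int:
    """Return the class whose votes carry the largest total weight."""
    scores = [0.0] * num_classes
    for label, weight in zip(predictions, weights):
        scores[label] += weight
    return max(range(num_classes), key=scores.__getitem__)


# Hypothetical error rates for three weak models (illustrative values only).
healthy = [vote_weight(e, num_classes=10) for e in (0.02, 0.45, 0.45)]
# The accurate first model outweighs the two weaker models that disagree with it.
print(weighted_vote([1, 2, 2], healthy, num_classes=10))

# If the first model's estimated error later rises (for example after an upset),
# recomputing the weights shifts the decision to the other two models.
degraded = [vote_weight(e, num_classes=10) for e in (0.90, 0.45, 0.45)]
print(weighted_vote([1, 2, 2], degraded, num_classes=10))
```

Unlike a plain majority vote, the decision here depends on how much each model is trusted, which is what allows a degraded model to be down-weighted without tripling the parameter count as TMR does.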
Key words:
- Single event upset
- Adaptive boosting
- Convolutional neural network
- Fault tolerance
- Spacecraft
Table 1. Comparison of ResNet architectures based on different dataset tasks

| Architecture | Dataset | Image size/pixel | Parameters/($\times 10^{6}$) | FLOPs |
|---|---|---|---|---|
| ResNet18 | ImageNet | $256\times 256$ | $11.69$ | $1.8\times 10^{9}$ |
| ResNet34 | ImageNet | $256\times 256$ | $21.80$ | $3.6\times 10^{9}$ |
| ResNet50 | ImageNet | $256\times 256$ | $25.56$ | $3.8\times 10^{9}$ |
| ResNet101 | ImageNet | $256\times 256$ | $44.55$ | $7.6\times 10^{9}$ |
| ResNet20 | CIFAR-10 | $32\times 32$ | $0.27$ | $4.0\times 10^{7}$ |
| ResNet32 | CIFAR-10 | $32\times 32$ | $0.46$ | $6.9\times 10^{7}$ |
| ResNet44 | CIFAR-10 | $32\times 32$ | $0.66$ | $9.7\times 10^{7}$ |
| ResNet110 | CIFAR-10 | $32\times 32$ | $1.72$ | $2.5\times 10^{8}$ |

Table 2. Number of parameters with SEU under different error injection rates

| Method | 0.001‰ | 0.002‰ | 0.004‰ | 0.008‰ | 0.016‰ | 0.032‰ | 0.064‰ | 0.128‰ |
|---|---|---|---|---|---|---|---|---|
| Single ResNet110 | 2 | 3 | 7 | 14 | 27 | 54 | 109 | 218 |
| TMR ResNet110 (TMR) | 5 | 10 | 20 | 41 | 82 | 163 | 326 | 653 |
| Static ensemble ResNet20/32/44 (SE) | 1 | 3 | 6 | 11 | 22 | 45 | 90 | 179 |
| AB-FTM | 1 | 3 | 6 | 11 | 22 | 45 | 90 | 179 |

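The counts in Table 2 are, to a close approximation, the total parameter count multiplied by the injection rate. The sketch below computes that count and then flips one bit in each selected parameter. The 32-bit floating-point format and the uniformly random choice of parameter and bit position are assumptions for illustration (the 4 bytes per parameter implied by Table 4 are consistent with 32-bit storage, but the injection procedure is not described in this section). All function names are hypothetical.

```python
import random
import struct


def num_upset_params(total_params: int, rate_permille: float) -> int:
    """Number of parameters hit at an injection rate given in per mille (‰)."""
    return round(total_params * rate_permille / 1000.0)


def flip_bit_float32(value: float, bit: int) -> float:
    """Flip one bit (0 = least significant, 31 = sign) of an IEEE 754 float32."""
    (as_int,) = struct.unpack("<I", struct.pack("<f", value))
    (flipped,) = struct.unpack("<f", struct.pack("<I", as_int ^ (1 << bit)))
    return flipped


def inject_seu(weights, rate_permille: float, rng: random.Random):
    """Return a copy of `weights` with one random bit flipped in each chosen entry."""
    corrupted = list(weights)
    count = num_upset_params(len(weights), rate_permille)
    for index in rng.sample(range(len(weights)), count):
        corrupted[index] = flip_bit_float32(corrupted[index], rng.randrange(32))
    return corrupted


# Rounded parameter count of a single ResNet110 taken from Table 4 (1.70 x 10^6).
print(num_upset_params(1_700_000, 0.032))  # 54, matching Table 2

# One upset in a high exponent bit turns a small weight into a huge one.
print(flip_bit_float32(0.5, 30))

# Inject upsets into a toy weight list (values are illustrative only).
toy_weights = [0.5] * 100_000
corrupted = inject_seu(toy_weights, 0.032, random.Random(0))
print(sum(a != b for a, b in zip(toy_weights, corrupted)))
```

With the rounded parameter counts from Table 4, this reproduces the single-model and TMR rows exactly, while the ensemble rows differ by one at the higher rates, presumably because the table was computed from unrounded parameter counts.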
Table 3. Classification accuracy of each method without error injection

| Method | MNIST/(%) | CIFAR-10/(%) | EuroSAT/(%) | Galaxy10 DECals/(%) |
|---|---|---|---|---|
| Single ResNet110 | 99.65 | 93.68 | 97.37 | 84.33 |
| TMR ResNet110 (TMR) | 99.65 | 93.68 | 97.37 | 84.33 |
| Static ensemble ResNet20/32/44 (SE) | 99.68 | 93.83 | 97.56 | 84.49 |
| AB-FTM | 99.68 | 94.27 | 97.81 | 84.50 |

Table 4. Comparison of resource cost among four methods

| Method | Models | Parameters/($\times 10^{6}$) | Storage/MB | FLOPs |
|---|---|---|---|---|
| Single ResNet110 | 1×ResNet110 | $1.70$ | $6.78$ | $2.5\times 10^{8}$ |
| TMR ResNet110 (TMR) | 3×ResNet110 | $5.10$ | $20.34$ | $7.5\times 10^{8}$ |
| Static ensemble ResNet20/32/44 (SE) | ResNet20+32+44 | $1.39$ | $5.56$ | $2.1\times 10^{8}$ |
| AB-FTM | ResNet20+32+44 | $1.39$ | $5.56$ | $2.1\times 10^{8}$ |

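The 18.2% parameter reduction quoted in the abstract can be recovered from the parameter counts in Table 4. The same arithmetic gives the saving relative to TMR and shows that the storage column corresponds to 4 bytes per parameter, taking 1 MB as $10^{6}$ bytes (an inference from the numbers, not a statement in the source):

```latex
\begin{align*}
\text{Reduction vs. ResNet110:}\quad & \frac{1.70 - 1.39}{1.70} = \frac{0.31}{1.70} \approx 18.2\% \\
\text{Reduction vs. TMR:}\quad & \frac{5.10 - 1.39}{5.10} = \frac{3.71}{5.10} \approx 72.7\% \\
\text{Ensemble storage:}\quad & 1.39\times 10^{6} \times 4\ \text{bytes} = 5.56\times 10^{6}\ \text{bytes} = 5.56\ \text{MB}
\end{align*}
```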
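Table 5. Contribution analysis of dynamic weight adjustment mechanism

| Dataset | SE accuracy/(%) | SE accuracy drop/(%) | AB-FTM accuracy/(%) | AB-FTM accuracy drop/(%) | Accuracy gain from dynamic weight adjustment/(%) |
|---|---|---|---|---|---|
| MNIST | 76.47 | 23.21 | 83.17 | 16.51 | 6.70 |
| CIFAR-10 | 56.43 | 37.40 | 63.82 | 30.45 | 7.39 |
| EuroSAT | 64.80 | 32.76 | 71.30 | 26.51 | 6.50 |
| Galaxy10 DECals | 49.17 | 35.32 | 55.90 | 28.60 | 6.73 |
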
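The columns of Table 5 are linked to Table 3 by simple differences. The short check below uses values copied from both tables. The reading that the accuracy drop is a difference in percentage points (error-free accuracy minus accuracy under injection) rather than a relative ratio is inferred from the numbers, and the names used are illustrative.

```python
# Error-free accuracy (%) from Table 3, as (SE, AB-FTM).
CLEAN = {
    "MNIST": (99.68, 99.68),
    "CIFAR-10": (93.83, 94.27),
    "EuroSAT": (97.56, 97.81),
    "Galaxy10 DECals": (84.49, 84.50),
}

# Accuracy (%) under error injection from Table 5, as (SE, AB-FTM).
FAULTY = {
    "MNIST": (76.47, 83.17),
    "CIFAR-10": (56.43, 63.82),
    "EuroSAT": (64.80, 71.30),
    "Galaxy10 DECals": (49.17, 55.90),
}


def drop_and_gain(dataset: str):
    """Return (SE drop, AB-FTM drop, AB-FTM gain over SE) in percentage points."""
    se_clean, ab_clean = CLEAN[dataset]
    se_faulty, ab_faulty = FAULTY[dataset]
    return (
        round(se_clean - se_faulty, 2),
        round(ab_clean - ab_faulty, 2),
        round(ab_faulty - se_faulty, 2),
    )


for name in CLEAN:
    print(name, drop_and_gain(name))
```

For every dataset the three computed values agree with the drop and gain columns of Table 5.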
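LUO Xi, male, was born in May 2000 in Yongzhou, Hunan Province. He is currently a master's student in computer technology at the National Space Science Center, Chinese Academy of Sciences. His main research interest is satellite-borne artificial intelligence.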