Improving Fault Tolerance to Radiation Effects in Integrated Systems

  • Gustavo Neuberger UFRGS
  • Fernanda Kastensmidt UERGS
  • Ricardo Reis UFRGS

Abstract


This paper describes radiation effects in integrated systems and discusses techniques to mitigate them. The main circuits analyzed are SRAM memories and SRAM-based FPGAs. New techniques to protect these circuits against radiation effects are also proposed. One of the techniques presented for FPGAs combines Double Modular Redundancy (DMR) with Concurrent Error Detection (CED), which can reduce overhead compared with Triple Modular Redundancy (TMR). For SRAM memories, a technique based on Reed-Solomon and Hamming codes was developed, together with a tool that generates a fault-tolerance core that minimizes the area cost of the code. Some results are presented.
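
The trade-off between TMR and DMR named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names and the 8-bit example value are hypothetical, and the sketch covers only generic textbook redundancy. Three copies allow a majority vote that masks one faulty copy. Two copies only reveal a mismatch, so a further mechanism is needed to decide which copy to trust (the abstract names CED for the combined scheme).

```python
def tmr_vote(a: int, b: int, c: int) -> int:
    """Bitwise majority of three redundant copies; masks one faulty copy."""
    return (a & b) | (a & c) | (b & c)


def dmr_mismatch(a: int, b: int) -> bool:
    """Compare two redundant copies; detects a fault but cannot locate it."""
    return a != b


# Hypothetical 8-bit module output and a single-bit upset in one copy.
golden = 0b1011_0010
upset = golden ^ 0b0000_1000

voted = tmr_vote(golden, upset, golden)   # the upset copy is outvoted
detected = dmr_mismatch(golden, upset)    # the upset is flagged, not corrected
```

In this sketch TMR pays for a third copy plus a voter, while DMR pays for two copies plus a comparator and whatever detection logic resolves the mismatch, which is where an overhead reduction could come from.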

Published
10/05/2004
NEUBERGER, Gustavo; KASTENSMIDT, Fernanda; REIS, Ricardo. Improving Fault Tolerance to Radiation Effects in Integrated Systems. In: WORKSHOP DE TESTES E TOLERÂNCIA A FALHAS (WTF), 5., 2004, Gramado/RS. Proceedings [...]. Porto Alegre: Sociedade Brasileira de Computação, 2004. p. 109-120. ISSN 2595-2684. DOI: https://doi.org/10.5753/wtf.2004.23384.