Research Papers

Intermittent Failures in Hardware and Software

Author and Article Information
Roozbeh Bakhshi

CALCE Electronic Products and Systems Center,
Building 89, Room 1103,
University of Maryland,
College Park, MD 20742
e-mail: Roozbeh@calce.umd.edu

Surya Kunche

CALCE Electronic Products and Systems Center,
Building 89, Room 1103,
University of Maryland,
College Park, MD 20742
e-mail: ksurya@umd.edu

Michael Pecht

CALCE Electronic Products and Systems Center,
Building 89, Room 1103,
University of Maryland,
College Park, MD 20742
e-mail: pecht@calce.umd.edu

1Corresponding author.

Contributed by the Electronic and Photonic Packaging Division of ASME for publication in the JOURNAL OF ELECTRONIC PACKAGING. Manuscript received September 23, 2013; final manuscript received January 18, 2014; published online February 18, 2014. Assoc. Editor: Yi-Shao Lai.

J. Electron. Packag. 136(1), 011014 (Feb 18, 2014) (5 pages) Paper No: EP-13-1111; doi: 10.1115/1.4026639

Intermittent failures and no fault found (NFF) phenomena are a concern in electronic systems because of their unpredictable nature and irregular occurrence. They can impose significant costs on companies, damage a company's reputation, or be catastrophic in systems such as nuclear plants or avionics. Intermittent failures in systems can be attributed to hardware failures or software failures. To diagnose and mitigate intermittent failures, their nature and root causes have to be understood. In this paper, we review the current literature on intermittent failures and present a comprehensive study of how these failures happen, how to detect them, and how to mitigate them.

Copyright © 2014 by ASME

References

[1] Thomas, D. A., Ayers, K., and Pecht, M., 2002, “The ‘Trouble Not Identified’ Phenomenon in Automotive Electronics,” Microelectron. Reliab., 42(4–5), pp. 641–651.
[2] James, I., Lumbard, D., Willis, I., and Goble, J., 2003, “Investigating No Fault Found in the Aerospace Industry,” Annual Reliability and Maintainability Symposium, Tampa, FL, January 27–30, pp. 441–446.
[3] Söderholm, P., 2007, “A System View of the No Fault Found (NFF) Phenomenon,” Reliab. Eng. Syst. Saf., 92(1), pp. 1–14.
[4] Steadman, B., Pombo, T., Madison, I., Shively, J., and Kirkland, L., 2002, “Reducing No Fault Found Using Statistical Processing and an Expert System,” IEEE AUTOTESTCON 2002, Huntsville, AL, October 15–17, pp. 872–878.
[5] WDS White Paper, 2006, “No Fault Found Returns Cost the Mobile Industry $4.5 Billion per Year,” http://www.wds.co/news/whitepapers/20060717/MediaBulletinNFF.pdf
[6] Maul, C., McBride, J. W., and Swingler, J., 2001, “Intermittency Phenomena in Electrical Connectors,” IEEE Trans. Compon. Packag. Technol., 24(3), pp. 370–377.
[7] Schafft, H. A., 1973, “Failure Analysis of Wire Bonds,” 11th Annual Reliability Physics Symposium, Las Vegas, NV, April 3–5, pp. 98–104.
[8] McCullough, R. E., 1972, “Screening Techniques for Intermittent Shorts,” 10th Annual Reliability Physics Symposium, Las Vegas, NV, April 5–7, pp. 19–22.
[9] Koch, T., Richling, W., Whitlock, J., and Hall, D., 1986, “A Bond Failure Mechanism,” 24th Annual Reliability Physics Symposium, Anaheim, CA, April 1–3, pp. 55–60.
[10] Minzari, D., Jellesen, M. S., Møller, P., and Ambat, R., 2011, “On the Electrochemical Migration Mechanism of Tin in Electronics,” Corros. Sci., 53(10), pp. 3366–3379.
[11] Minzari, D., Grumsen, F. B., Jellesen, M. S., Møller, P., and Ambat, R., 2011, “Electrochemical Migration of Tin in Electronics and Microstructure of the Dendrites,” Corros. Sci., 53(5), pp. 1659–1669.
[12] Reid, M., Punch, J., Grace, G., Garfias, L. F., and Belochapkine, S., 2006, “Corrosion Resistance of Copper-Coated Contacts,” J. Electrochem. Soc., 153(12), pp. B513–B517.
[13] Pan, S., Hu, Y., and Li, X., 2010, “IVF: Characterizing the Vulnerability of Microprocessor Structures to Intermittent Faults,” Design, Automation & Test in Europe Conference Exhibition (DATE), Dresden, Germany, March 8–12, pp. 238–243.
[14] Constantinescu, C., 2008, “Intermittent Faults and Effects on Reliability of Integrated Circuits,” Annual Reliability and Maintainability Symposium (RAMS 2008), Las Vegas, NV, January 28–31, pp. 370–374.
[15] Blaauw, D. T., Oh, C., Zolotov, V., and Dasgupta, A., 2003, “Static Electromigration Analysis for On-Chip Signal Interconnects,” IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst., 22(1), pp. 39–48.
[16] Kothawade, S., Chakraborty, K., Roy, S., and Han, Y., 2012, “Analysis of Intermittent Timing Fault Vulnerability,” Microelectron. Reliab., 52(7), pp. 1515–1522.
[17] Mathew, S., Das, D., Rossenberger, R., and Pecht, M., 2008, “Failure Mechanisms Based Prognostics,” International Conference on Prognostics and Health Management (PHM 2008), Denver, CO, October 6–9, pp. 1–6.
[18] Kirkland, L. V., 2011, “When Should Intermittent Failure Detection Routines be Part of the Legacy Re-Host TPS?,” IEEE AUTOTESTCON 2011, Baltimore, MD, September 12–15, pp. 54–59.
[19] Qi, H., Ganesan, S., and Pecht, M., 2008, “No-Fault-Found and Intermittent Failures in Electronic Products,” Microelectron. Reliab., 48(5), pp. 663–674.
[20] Steadman, B., Berghout, F., Olsen, N., and Sorensen, B., 2008, “Intermittent Fault Detection and Isolation System,” IEEE AUTOTESTCON 2008, Salt Lake City, UT, September 8–11, pp. 37–40.
[21] Gracia, J., Saiz, L., Baraza, J. C., Gil, D., and Gil, P., 2008, “Analysis of the Influence of Intermittent Faults in a Microcontroller,” 11th IEEE Workshop on Design and Diagnostics of Electronic Circuits and Systems (DDECS 2008), Bratislava, Slovakia, April 16–18, pp. 80–85.
[22] Syed, R. A., Robinson, B., and Williams, L., 2010, “Does Hardware Configuration and Processor Load Impact Software Fault Observability?,” Third International Conference on Software Testing, Verification and Validation (ICST), Paris, April 6–10, pp. 285–294.
[23] Wei, J., Rashid, L., Pattabiraman, K., and Gopalakrishnan, S., 2011, “Comparing the Effects of Intermittent and Transient Hardware Faults on Programs,” 2011 IEEE/IFIP 41st International Conference on Dependable Systems and Networks Workshops (DSN-W), Hong Kong, June 27–30, pp. 53–58.
[24] Anderson, T., and Knight, J. C., 1983, “A Framework for Software Fault Tolerance in Real-Time Systems,” IEEE Trans. Software Eng., SE-9(3), pp. 355–364.
[25] Lyu, M. R., 1995, Software Fault Tolerance, John Wiley & Sons, Inc., New York.
[26] Randell, B., 1975, “System Structure for Software Fault Tolerance,” International Conference on Reliable Software, Los Angeles, CA, April 21–23, pp. 437–449.
[27] Avizienis, A., 1985, “The N-Version Approach to Fault-Tolerant Software,” IEEE Trans. Software Eng., SE-11(12), pp. 1491–1501.
[28] Blum, M., Luby, M., and Rubinfeld, R., 1993, “Self-Testing/Correcting With Applications to Numerical Problems,” J. Comput. Syst. Sci., 47(3), pp. 549–595.

Figures

Fig. 1 FMMEA methodology [17]

Fig. 2 Fishbone diagram for intermittent failures in hardware and software
