MICROELECTRONICS RELIABILITY PERGAMON
Microelectronics Reliability 38 (1998) 1171-1175
Comparison between field reliability and new prediction methodology on avionics embedded electronics
P. Charpenel a, P. Cavemes b, V. Casanovas a, J. Borowski b, J.M. Chopin a
a AEROSPATIALE Aéronautique, 316 Route de Bayonne, M8621, 31060 Toulouse Cedex, France
b GIAT Industries, 155 Av. de Grande-Bretagne, 31052 Toulouse Cedex, France
Abstract:
This paper presents a critical analysis of a previously described deterministic reliability prediction approach. The proposed methodology is then applied to an item of avionics embedded electronic equipment, and the results are compared with field return data. Very good agreement between the predicted reliability data and the field data is shown. The limitations of the methodology are analyzed. © 1998 Elsevier Science Ltd. All rights reserved.
1. Introduction
A previous paper [1] presented another way to assess electronics reliability, based on the reliability tests performed by parts manufacturers. Some critical points were identified and remain to be addressed. This paper presents a critical analysis of this methodology and its application to an item of avionics embedded electronic equipment. The results are then compared with field return data and with the classical prediction methods.
2. Context
The identification and follow-up of our avionics is a prime necessity for our operational dependability policy. A systematic follow-up has been put in place, based on finding the root cause of the anomaly behind each equipment removal. A set of indicators is then created in order to monitor:
• reliability growth (according to the Duane model),
• the relative distribution of failures observed during aircraft integration and during normal use by the airlines, so that early failures are distinguished from wear-out ones,
• the comparison, for each component family, between the operational and estimated failure rates.
Various prediction methods exist (MIL-HDBK-217 [2], CNET RDF [3], ...), but they are based on empirical failure rate models developed from curve fits of field failure data. These data are limited in terms of the number of failures in a given environment and of the determination of the actual cause of failure [4]. Moreover, the future development of these methods is doubtful, which calls for new methodologies. Figure 1 shows field return results for an avionics computer, compared with the predicted failure rate.
3. Methodology description
Our electronic component policy is based on the use of commercial off-the-shelf (COTS) components with defined levels of quality and reliability. We request from the manufacturer all the information necessary to assess its component technology and then its performance, since only the manufacturer knows its products in detail.
0026-2714/98/$ - see front matter. © 1998 Elsevier Science Ltd. All rights reserved. PII: S0026-2714(98)00077-8
[Fig. 1. Example of comparison between field and predictive failure rates of an embedded avionics system. Curves: field return; MIL-HDBK-217E method.]
A previous paper [1] described the proposed methodology, which is based on the reliability tests performed by parts manufacturers. The data obtained from these standard tests can be used to estimate the reliability of the COTS parts used in our applications. Two assumptions are made: all of the intrinsic failure mechanisms are revealed by these tests, and the failure mechanisms observed in the tests are the same as those in the field. The chosen tests are High Temperature Operating Life (HTOL), Temperature Humidity Bias (THB) and Temperature Cycle (TC), or similar tests. To analyze the results of these reliability stress tests, probabilistic and physical laws are used: the Arrhenius model for temperature effects, the Coffin-Manson model for thermal fatigue effects, the Gunn or Peck models for the influence of humidity, and Poisson or χ² distribution laws.
We have successfully studied the feasibility of the method on two electronic components. We also confirmed the mismatch between field return data and classical reliability prediction methods, and the agreement between our deterministic approach and the field return data. Furthermore, some critical points were identified, such as inaccurate knowledge of the use environment, which may lead to large variations in the predicted reliability data.
4. Critical analysis
In the feasibility study, we chose two plastic encapsulated parts, in small outline (SO8) and plastic leaded chip carrier (PLCC84) packages respectively, in order to take into account and assess the influence of humidity on reliability. The results show that the humidity-induced failure rate accounts for only a negligible percentage of the total failure rate, whichever humidity model is chosen [5]. In line with our field return experience, we may therefore neglect the influence of humidity in the calculations. We use the modified Coffin-Manson relation [6] to assess thermal fatigue effects. Thermal cycling, and hence thermal fatigue, appears to be the phenomenon with the greatest influence on part reliability. We can write the failure rate as:
$\lambda = \lambda_T + \lambda_{CF} + \lambda_H$    (1)
where $\lambda_T$, $\lambda_{CF}$ and $\lambda_H$ are the failure rates induced by temperature, thermal cycling and humidity respectively. According to the above comments, Eq. (1) becomes:
$\lambda \approx \lambda_T + \lambda_{CF}$    (2)
If we consider a daily ΔT of 30°C, we obtain, with an activation energy Ea of 0.7 eV:
$\lambda_T \approx 0.6\,\lambda$ and $\lambda_{CF} \approx 0.4\,\lambda$    (3)
In contrast, with a daily ΔT of 80°C, the split becomes:
$\lambda_T \approx 0.25\,\lambda$ and $\lambda_{CF} \approx 0.75\,\lambda$    (4)
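The temperature term of Eq. (1) relies on the Arrhenius model named in Section 3. As a minimal sketch of how sensitive that term is to junction temperature, the following computes the Arrhenius acceleration factor for the activation energy of 0.7 eV quoted above. The function name and the two junction temperatures (45°C and 55°C) are illustrative assumptions, not values taken from the study.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def arrhenius_acceleration(t_low_c, t_high_c, ea_ev=0.7):
    """Ratio of failure rates between two junction temperatures given in °C.

    Hypothetical helper: standard Arrhenius form exp(Ea/k * (1/T1 - 1/T2)).
    """
    t_low_k = t_low_c + 273.15
    t_high_k = t_high_c + 273.15
    return math.exp(ea_ev / BOLTZMANN_EV_PER_K * (1.0 / t_low_k - 1.0 / t_high_k))


# Assumed junction temperatures, chosen only to illustrate a 10 °C rise.
factor = arrhenius_acceleration(45.0, 55.0)
print(f"failure-rate multiplier for a 10 C rise: {factor:.2f}")
```

With these assumed temperatures the multiplier comes out slightly above 2, which shows why an error of a few degrees on the mean junction temperature matters for the temperature-induced failure rate.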
Accurate knowledge of the operational daily ΔT may greatly influence the predictive FIT (failure in time) values. This is the major key point of the proposed methodology. To confirm this assumption, Table 1 shows the influence of the operational temperature gradient on the FIT of the components chosen in the preliminary study (component A in SO8 package, components B1 and B2 in PLCC84 package, and different mission profiles [1]).
Table 1: Coffin-Manson calculated FIT vs operational daily ΔT values
Gradient ΔT (°C)    30      40      60      80
Component A         0.9     1.6     3.6     6.4
Component B1        12.6    22.5    50.5    90
Component B2        5       9       20      36
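The trend in Table 1 can be approximated with a simple power law in ΔT. The sketch below assumes a Coffin-Manson exponent of 2; this value is not stated in the text, but it reproduces the tabulated ratios to within rounding. The modified relation of [6] may contain further terms, so this is only an illustration, and the function and dictionary names are hypothetical.

```python
def coffin_manson_scale(fit_ref, dt_ref, dt_new, exponent=2.0):
    """Scale a thermal-cycling FIT from one daily ΔT to another.

    Assumes FIT is proportional to ΔT**exponent (exponent = 2 is an assumption).
    """
    return fit_ref * (dt_new / dt_ref) ** exponent


# FIT values at ΔT = 30 °C, taken from Table 1.
REFERENCE_FIT_AT_30C = {"A": 0.9, "B1": 12.6, "B2": 5.0}

for name, fit_30 in REFERENCE_FIT_AT_30C.items():
    scaled = [coffin_manson_scale(fit_30, 30.0, dt) for dt in (30.0, 40.0, 60.0, 80.0)]
    print(name, [round(value, 1) for value in scaled])
```

Under this assumption, going from a daily ΔT of 30°C to 80°C multiplies the thermal-cycling FIT by about 7, which is the sensitivity that motivates measuring the real thermal environment.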
To address this point, implementing Time Stress Measurement Devices (TSMD) in the equipment racks may be useful. $\lambda_T$ must also be calculated as accurately as possible. According to the Arrhenius law, we need to know accurately the mean junction temperature of each part on an electronic board (a 10°C increase approximately doubles $\lambda_T$). However, some difficulties arise:
- the major unknown of the Arrhenius law is the activation energy Ea, which is typical of the intrinsic failure mechanism and of the part technology. Because accurate values of Ea are not known for every technology and failure mechanism, we have to use average activation energies,
- a part can be heated by a neighboring one, so its ambient temperature will not be the ambient temperature inside the box,
- in some specific cases, the equipment packaging significantly changes the standard thermal paths of the component. It is then impossible to use some of the parameters given by the part manufacturer (e.g. the junction-to-ambient thermal resistance θJA). This problem is encountered inside equipment where natural convection is negligible, as in land military applications,
- the mean junction temperature of a part depends on the mean power dissipated by the part during its life. This mean power depends on the mission profile of the part, which often differs from that of the equipment.
To solve this problem, we have implemented an accurate thermal assessment methodology for electronic equipment design. This methodology is centered on a thermal simulation tool that allows us to calculate the junction temperature of each part before the first prototype is built. The tool takes into account all the parameters necessary to calculate the junction temperatures and their behavior accurately (accurate 3D representation of the equipment, thermal convection, thermal conduction, thermal radiation, thermal inertia of the materials, external environmental conditions, mission profile). The results of this simulation may be confirmed when necessary on the first prototype by measuring either the case temperature or the junction temperature of a particular part. These results (mean junction temperature and junction temperature gradients) are used directly as input parameters of the physical laws.
5. Application on an avionics CPU unit
We chose to apply our method to an avionics CPU unit in order to compare the results with field reliability results and with the classical methods (MIL-HDBK-217, CNET RDF93) used to assess its reliability. For the chosen CPU unit (made up of two cards, called here C1 and C2), the ratio between the MIL-HDBK-217 failure rate and the field failure rate is over 5. We also use a modified CNET RDF93, called CRDF, which uses corrective factors and whose calculated failure rates are currently close to operational failure rates. In 1996, our field return database recorded 16 component failures for the CPU unit during $1.4 \times 10^{6}$ flight hours. Since our reliability assessment method was applied to active components only, we counted the failures in this family: 8 component failures, 4 for each card. Table 2 shows the resulting operational failure rates, with a 60% confidence level. We define $\lambda_G$ as the global failure rate and $\lambda_A$ as the active-component failure rate. Table 3 presents the results obtained with the classical methods for the CPU unit.
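Confidence bounds of this kind are commonly derived from the Poisson or χ² laws mentioned in Section 3. The sketch below is one way to do it, assuming a two-sided interval with equal tails and a constant failure rate; the paper does not state which convention was used, and all function names are hypothetical. It uses the exact Poisson form, which is equivalent to the usual χ² bounds with 2n and 2n + 2 degrees of freedom.

```python
import math


def poisson_cdf(k, mu):
    """P(N <= k) for N following a Poisson law of mean mu."""
    term = math.exp(-mu)
    total = term
    for i in range(1, k + 1):
        term *= mu / i
        total += term
    return total


def solve_mean(k, target, lo=0.0, hi=200.0):
    """Bisection for mu such that poisson_cdf(k, mu) == target.

    The CDF decreases as mu grows, so the search interval is narrowed accordingly.
    """
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if poisson_cdf(k, mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def failure_rate_bounds(failures, hours, confidence=0.60):
    """Two-sided bounds on a constant failure rate (per hour); needs failures >= 1."""
    if failures < 1:
        raise ValueError("this sketch only handles at least one observed failure")
    tail = (1.0 - confidence) / 2.0
    mu_low = solve_mean(failures - 1, 1.0 - tail)  # P(N >= failures) = tail
    mu_high = solve_mean(failures, tail)           # P(N <= failures) = tail
    return mu_low / hours, mu_high / hours


# Active components: 4 failures per card, 8 for the unit, over 1.4e6 flight hours.
for label, n in (("one card", 4), ("CPU unit", 8)):
    low, high = failure_rate_bounds(n, 1.4e6)
    print(f"{label}: {low * 1e6:.2f} < lambda_A < {high * 1e6:.2f}  (1e-6 per hour)")
```

The bounds obtained this way are close to, but slightly lower than, the $\lambda_A$ values of Table 2. The small gap may come from rounding of the flight-hour figure or from a different interval convention; that explanation is a guess rather than something the source confirms.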
Table 2: Field failure rates for the considered CPU unit ($10^{-6}$ h$^{-1}$)
             $\lambda_G$              $\lambda_A$
Card C1      4.79 < λ < 9.28      1.7 < λ < 4.97
Card C2      3.51 < λ < 7.61      1.7 < λ < 4.97
CPU unit     11.55 < λ < 18.76    4.16 < λ < 8.46
In order to apply our method, we defined an average operational environment, described in Table 4. Moreover, to take into account operating hours and not only flight hours, we apply a factor of 1.5 to the predicted failure rates. Under these conditions, we obtained a failure rate $\lambda_A$ of $3.5 \times 10^{-6}$ h$^{-1}$. Using a daily ΔT of 80°C, a failure rate $\lambda_A$ of $6.9 \times 10^{-6}$ h$^{-1}$ is calculated, twice the previous value but of the same order of magnitude.
Table 3: Predictive failure rates with classical methods ($10^{-6}$ h$^{-1}$)
                 $\lambda_G$    $\lambda_A$
CRDF             19.65      4.34
MIL-HDBK-217     49.1 (a)   48.9 (b)
(a) MIL-HDBK-217E notice 1 with modified πQ
(b) MIL-HDBK-217F notice 1
The results are summarized in Figure 2.
[Fig. 2. Failure rates $\lambda_A$ obtained with various methods ($10^{-6}$ h$^{-1}$). Bars: MIL-HDBK-217F notice 1, CRDF, field return, our method.]
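To put the predictions and the field data on a common scale, the following sketch divides each predicted $\lambda_A$ by a field point estimate. The point estimate (8 active-component failures over $1.4 \times 10^{6}$ flight hours) is a simplification introduced here; the paper itself reports confidence intervals rather than a single value. Variable and function names are hypothetical.

```python
FIELD_FAILURES = 8       # active-component failures recorded for the CPU unit
FLIGHT_HOURS = 1.4e6     # flight hours quoted in the text

# Field point estimate in units of 1e-6 per hour (simplifying assumption).
field_rate = FIELD_FAILURES / FLIGHT_HOURS * 1e6

# Predicted lambda_A values quoted in the text and in Table 3 (1e-6 per hour).
PREDICTED = {
    "our method, daily dT = 30 C": 3.5,
    "our method, daily dT = 80 C": 6.9,
    "CRDF": 4.34,
    "MIL-HDBK-217F notice 1": 48.9,
}


def prediction_ratios(predicted, field):
    """Return predicted / field for each method."""
    return {name: value / field for name, value in predicted.items()}


ratios = prediction_ratios(PREDICTED, field_rate)
for name, ratio in ratios.items():
    print(f"{name}: predicted / field = {ratio:.2f}")
```

A ratio near 1 indicates agreement with the field estimate, so the spread of these ratios gives a quick view of how far each method sits from the observed behavior.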
The FIT predicted with the presented methodology is similar to the field value. The proposed methodology is therefore valid for assessing the reliability of electronic parts. We use it to discriminate between two potential electrically equivalent parts, choosing the part manufacturer with either the best reliability approach or the best reliability results.
6. Conclusions
The estimated failure rate of the avionics CPU unit studied is similar to the field failure rate. Since the input data of the proposed method are parts manufacturer data, discrimination between two electrically equivalent sources can easily be achieved. However, we have highlighted some critical points, such as the use of average activation energies and the need for accurate knowledge of the mission profile. To address the latter point, implementing Time Stress Measurement Devices (TSMD) in the equipment racks may be useful. The reliability assessment methodology will be implemented in the design of future electronic equipment.
Table 4: Average mission profile
Ambient temperature    45°C
Humidity rate          50%
Daily ΔT               1 cycle of 30°C
Acknowledgments
We want to acknowledge Mr Arnaud Meneut for field data analysis and calculations.
References
[1] Charpenel P. et al., "An other way to assess electronics part reliability", Microelectron. Reliab., Vol. 37, n°10/11, Proc. ESREF 1997.
[2] Mil-Hdbk-217 F.
[3] CNET RDF 1993.
[4] Cushing M.J. et al., "Comparison of electronics-reliability assessment approaches", IEEE Trans. Reliability, Vol. 42, n°4, 1993.
[5] Internal report.
[6] Lau J.H., "Thermal fatigue life prediction of flip chip solder joints by fracture mechanics method", Eng. Fracture Mechanics, Vol. 45, n°5, pp. 643-654, 1993.