An improved algorithm for parameter estimation suitable for mixed Weibull distributions

An improved algorithm for parameter estimation suitable for mixed Weibull distributions

International Journal of Fatigue 22 (2000) 75–80 www.elsevier.com/locate/ijfatigue Technical Note An improved algorithm for parameter estimation sui...

540KB Sizes 0 Downloads 42 Views

International Journal of Fatigue 22 (2000) 75–80 www.elsevier.com/locate/ijfatigue

Technical Note

An improved algorithm for parameter estimation suitable for mixed Weibull distributions Marko Nagode *, Matija Fajdiga Faculty of Mechanical Engineering, University of Ljubljana, Asˇkercˇeva 6, SI-1000 Ljubljana, Slovenia Received 21 May 1999; received in revised form 10 September 1999; accepted 10 September 1999

Abstract The shape of measured or design load spectra can vary considerably and therefore frequently cannot be approximated by simple distribution functions. The introduction of mixed Weibull distribution has thus been suggested. Compared to the extreme value distribution, the proposed distribution makes the extrapolation of load spectra and its scatter prediction much easier, especially in the case of variable operating conditions. The major drawback of mixed distributions in general is a complicated, and often impossible, calculation of unknown parameters. This article presents an improved algorithm for unknown parameter estimation. The algorithm is distribution-independent and consists of rough parameter estimation and optimization. Each component density is calculated separately. The solving of complicated systems of equations is not necessary. The proposed algorithm has been tested by analyzing load spectra resulting from in-the-field measurements.  1999 Elsevier Science Ltd. All rights reserved. Keywords: Numerical modelling; Fatigue load spectra; Operational loading; Probabilistic analysis

1. Introduction The service life of a structural component depends primarily on the loading conditions in service, which can be varied by operating conditions and by the usage of a component [1]. The most questionable is certainly the determination of usage. Tests on proving grounds are commonly used to obtain actual load time histories [2]. The usage profile can also be obtained by long-term real-time measurements [3,4]. In this case, a certain number of, e.g., vehicles equipped with measuring devices, is exposed to normal use. To decrease the necessary amount of computer memory, the measuring data are frequently transformed into load spectra already within a measuring device [5]. As a consequence of data compression, the information about operating conditions is lost. Each driver can thus be characterized by a representative load spectrum. If the operating conditions vary during service life, the shape of representative load spectra is expected to be multi-modal and can thus not be

* Corresponding author. Tel.: +386-61-1771-505; fax: +386-61218-567. E-mail address: [email protected] (M. Nagode)

characterized by unimodal distribution functions. Quite a few research works such as [6–8] have confirmed this. The most direct way to make the extrapolation of load spectra possible is if the measured spectra are modelled by mixed distribution functions. A significant difficulty common to all mixed distributions is the estimation of unknown parameters. This is resolved predominately by solving different systems of equations [9]. It has been shown that all possible basic shapes of load spectra can be approximated well by the two-parameter Weibull distribution [10]. Later Zhao and Baker [7] suggested that the load range probability density function (p.d.f.) for a Gaussian process should be defined as a linear combination of two Weibull distributions. As a reasonable step forward, the mixed Weibull distribution has been suggested for spectra composed of more than one basic shape [11]. In this case, solving the system of equations is very difficult and only possible if the spectrum is composed only of two Weibull distributions or other simplifications are made [9]. To overcome this problem, an alternative procedure of unknown parameter estimation has been proposed [11]. However, subsequent studies have shown that it can still be significantly improved. For this reason some modifications of the original procedure are suggested. They are

0142-1123/00/$ - see front matter  1999 Elsevier Science Ltd. All rights reserved. PII: S 0 1 4 2 - 1 1 2 3 ( 9 9 ) 0 0 1 1 2 - 7

76

M. Nagode, M. Fajdiga / International Journal of Fatigue 22 (2000) 75–80

mainly related to rough parameter estimation and the treatment of errors.

2. Problem definition

3. Critique of the algorithm

A mixed p.d.f. is defined as



The existing algorithm of working out unknown constants consists of the following steps:

m

f(s)⫽

wlfl(s)

(1)

l⫽1

where wl⬎0 (l=1, %, m) and ⌺ wl=1. Constant wl stands for a weighting factor, whereas fl(s) represents an arbitrary p.d.f. Generally speaking, f(s) can be composed of m component distributions fl(s), each of a different type. Detecting and unwrapping a model in its different underlying components is a difficult undertaking. Significant simplification can be achieved if all fl(s) are of the same type. It turns out that the selection of the proper component distribution is of paramount importance, especially in the extrapolated region of the load spectra. Furthermore, the total number of component distributions m is also influenced by the choice of fl(s). Considering all the points mentioned above the selection of Weibull distribution as fl(s) seems to be the most reasonable one m l=1

冉冊

bl s fl(s)⫽ ql ql

bl−1

冉 冉 冊冊

exp ⫺

s ql

bl

(2)

It can approximate normal or lognormal and is exactly equal to the exponential and Rayleigh distribution for bl=1 and 2, respectively. Constants bl and ql represent Weibull shape parameters. They are limited according to the characteristics of the Weibull distribution bl⬎0⵩ql⬎0; (l⫽1, %, m)

(3)

Eq. (1) can thus be rewritten as

冘 冉冊 m

f(s)⫽

wl

l⫽1

bl s ql ql

bl−1

冉 冉 冊冊

exp ⫺

s ql



F(s)⫽1⫺

l⫽1

冉 冉冊冊

wl exp ⫺

s ql

1. In the histogram of relative frequencies the position of the global maximum (mode) and its left and right minima are detected. 2. Extracting the lth mode. As a result two histograms are obtained. The first belongs to the lth mode, whereas the second forms the residuum. 3. Rough estimation of unknown parameters wl, bl and ql. 4. Estimating optimal values of unknowns, belonging to the lth mode fl(s). As the individual components are mixed together in a histogram of relative frequencies, there is no exact line of separation between them. The only parameter that can be detected is the mode position. The determination of local minima should be avoided to enable the usability of the algorithm in an n-dimensional space. In other words, with the increase in the number of independent variables the determination of local minima becomes more and more complicated. The algorithm has thus been so modified that the local minima detection is no longer required. The treatment of errors has a significant influence on histogram separation as well as on the number of component densities. Instead of searching for the maximum positive deviation, the mean value of absolute relative deviations Da [12] has been introduced. The algorithm can also be accelerated.

bl

(4)

Its cumulative distribution function (c.d.f.) is therefore m

consequence, solving the unknowns for a mixture becomes almost as simple and reliable as for the individual components.

bl

4. Treatment of errors Let relative frequency fi⬘ in class i of width ⌬s be expressed as

(5)

There exist a number of different methods that enable us to calculate the parameters of a mixture [9]. Unfortunately the efficiency of known methods depends on the number of component distributions and/or on the choice of fl(s). Therefore, problem simplifications are frequently required in order to make the estimation of unknown parameters possible. Because of the stated limitations, an alternative algorithm has been proposed. It does not depend on the number of components fl(s) at all. As a



i⌬s

fi⬘⫽

f(s) ds⫹⌬fi⬘; (i⫽1, …, k)

(6)

(i⫺1)⌬s

where f(s) stands for a hypothetical p.d.f., whereas ⌬fi⬘ represents an error. Furthermore, continuous distributions in Eq. (1) can be replaced with discrete ones. This yields

冘 m

fi⬘⫽

l⫽1

冘 冕

i⌬s

m

wlfli⬘⫽

l⫽1

wl

(i⫺1)⌬s

冘 m

fl(s) ds⫹

l⫽1

wl⌬fli⬘

(7)

M. Nagode, M. Fajdiga / International Journal of Fatigue 22 (2000) 75–80

77

5. Rough parameter estimation Since



冘 冕

i⌬s

i⌬s

m

f(s) ds⫽

wl

l⫽1

(i⫺1)⌬s

fl(s) ds

(8)



(i⫺1) s

Eq. (7) can be rewritten as

冘 m

⌬fi⬘⫽

wl⌬fli⬘; (i⫽1, …, k)

(9)

l⫽1

When comparing the difference between two probability densities, a certain measure of distance between them is usually introduced. For the algorithm in question the total of absolute relative deviations has turned out to be a better selection. The mean value of absolute relative deviations Da can thus be expressed as

冘| | 冘 冘| | 冘 k

Da⫽

m

⌬fi⬘ ⫽

i⫽1

k

wl

l⫽1

i⫽1

During the rough parameter estimation, the part of the histogram that does not correspond to the observed component density fl(s) is transferred to the residuum. Besides, unknown parameters wl, bl and ql are roughly estimated. The only information that can be reliably obtained from the histogram is global mode position smax. The extreme value of the closest component density function should therefore coincide or at least be very close to it. Let us suppose that fl(s) attains its maximum at s=smax. The first derivative of Eq. (2) yields

冉 冊

bl−1 smax⫽ql bl

1 bl

for bl⬎1

(12)

If Eq. (12) is inserted into Eq. (2), we obtain fl(bl)⫽fl(smax)smax⫺(bl⫺1) e−(bl−1)/bl⫽0

(13)

m

⌬f⬘li ⫽

wlDal

(10)

l⫽1

Generally speaking, quantities Dal and Da are not equal and cannot even be stated because parameters wl, bl and ql are not known. In any case the parameters reach their optimal values when Eq. (10) is minimal. To avoid solving the system of nonlinear equations, an iterative procedure consisting of a rough and optimal parameter estimation phase has been developed, as shown above. Fig. 1 depicts the interdependence between Da and iteration number I. If I increases, Da tends to its minimal value, which is either greater or equal to zero. The iteration process ends when the discrepancy between two successive iterations is less or equal to error rate er Da(I⫺1)⫺Da(I)ⱕer

(11)

Since fl(smax) and smax are known values, shape parameters bl and ql can be calculated by, e.g., using both the Newton–Raphson method and Eq. (12). The calculated constants are treated as an outcome of the first iteration. The initial weighting factor stems from



l⫺1

wl⫽1⫺

wi; (l⫽1, …, m)

(14)

i⫽1

In the steps to follow all relative differences are computed

冕 冕

i⌬s

f⬘li− eli⫽

fl(s) ds

(i⫺1)⌬s i⌬s

; (i⫽1, …, k)

(15)

fl(s) ds

(i⫺1)⌬s

and the maximal relative difference is extracted el max⫽max{eli; i⫽1, …, k}

(16)

As a measure of distance, relative differences have been employed instead of ⌬fi⬘ [12], as they do better in tails, which is particularly important if spectra are to be extrapolated. The cycles that do not belong to the lth component density are transferred to the residuum. It has been ascertained [11] that the procedure converges rapidly if the following transformation is carried out f⬘li⫽f⬘li⫺⌬f⬘li, ri⬘⫽ri⬘⫹⌬fi⬘ and wl⫽wl⫺⌬f⬘li

Fig. 1. Da as a function of iteration number.

(17)

for those indices i which fulfil the conditions eli⬍0 or eliⱖel max(1⫺ar). The negative value of ⌬fi⬘ must never be greater than the residuum value. However, if this is not true, the error has to be corrected

78

M. Nagode, M. Fajdiga / International Journal of Fatigue 22 (2000) 75–80

⌬f⬘li⫽⫺ri

(18)

The iteration process can be accelerated by choosing acceleration rate ar⬎0. However, if it is too high, it can influence wl, bl and ql. The procedure has to be repeated until Eq. (11) is accomplished. In Fig. 2, the transition from the initial to the final shape before optimization for two fl(s) is presented. The mode location is preserved and the residuum is formed. As distinguished from [11], the searching for local minima is not necessary any more.

6. Optimal parameter estimation As the global mode position is affected by the neighbouring component densities, the exact mode location of the observed distribution does not coincide with smax. Therefore, parameter optimization is required. In this case, instead of Eqs. (12) and (13), MLE has been used to recalculate the shape parameters. Otherwise the procedure is the same as before. When condition (11) is fulfilled, optimal values wl, bl and ql are reached. As can be observed from Fig. 3, mode position does not necessarily coincide with its initial position after optimization. Both component densities f1(s) and f2(s) also fit well in the histogram. The corresponding representative H(s) and partial load spectra Hl(s) are depicted in Fig. 4. What is left in residuum ri⬘ is used as an input histogram for the next, in this particular case the third component density. The whole procedure should thus be repeated (m⫺1) times, once for each fl(s). The remaining cycles join the last, mth component density by using MLE.

Fig. 3.

Estimating optimal values of unknowns wl, bl and ql.

Fig. 4.

Empirical and computed distribution functions.

It is well known [13] that MLE is a convenient method for parameter estimation in the case of two- or even three-parameter Weibull distribution. However for Weibull mixtures, MLE alone is not appropriate [14]. This is why the presented algorithm has been developed.

7. Example

Fig. 2. Rough estimation of weight and shape parameters.

The characteristics of the algorithm will be demonstrated by an example. In Fig. 5 load time histories, rep-

M. Nagode, M. Fajdiga / International Journal of Fatigue 22 (2000) 75–80

Fig. 5.

79

Wheel forces Fx, Fy and Fz and braking moment My.

resenting all three wheel forces and braking moment My obtained by measurements on proving ground, are shown. In the diagram it is evident that the operating conditions are not stable during the test. Since the load time histories are not stationary, it can be expected that the corresponding load spectra are composed of more than one basic shape. This can be seen in Fig. 6. Load ranges have been derived from level crossing counting. The agreement between empirical and hypothetical representative spectra is also good in the region of high amplitudes, which is especially important for spectrum extrapolation. What is left is the explanation of the quantities used. The total number of partial spectra m can be treated as the number of characteristic operating conditions that the vehicle was exposed to during the test. The weighting factors represent the usage of a vehicle. If they are changed and the shape parameters bl and ql are preserved, different users can be simulated. Fig. 6.

8. Conclusions This article presents an algorithm suitable for unknown parameter estimation, irrespective of the type of component p.d.f. Any other component p.d.f. could be used instead of the Weibull distribution. The selection of the counting method is optional. Instead of calculating all shape and weight parameters simultaneously, unknowns wl, bl and ql are calculated for every single component p.d.f. The local minima are no more necessary, so the algorithm could be expanded at least to two

Empirical and extrapolated representative and partial spectra.

independent variables. This way the extrapolation of the rainflow matrix using an arbitrary component distribution would be possible. By taking another criterion into consideration to stop the iteration process, the necessary number of iterations and the number of modes m have been reduced. It is also not necessary to bin data in a histogram. This has been done only to reduce the time necessary for calculations. An Excel file with a DLL that makes the algorithm user-friendly is available from the authors. Finally, the number of cycles to failure associated

80

M. Nagode, M. Fajdiga / International Journal of Fatigue 22 (2000) 75–80

calculated. Secondly, total damage is reached by the weighted sum of partial damage fractions. Acknowledgement The first author would like to thank the Fraunhofer Institut fu¨r Betriebsfestigkeit — LBF in Darmstadt, because this institute allowed him to use the data from their measurements. References

Fig. 6. (continued)

with a Weibull mixture can be determined by using known cumulative damage rules. Firstly, the damage accumulation for each single component distribution is

[1] Grubisic V. Determination of load spectra for design and testing. Int J Vehicle Design 1994;15(1/2):8–26. [2] Bra¨ker KF, Hummel R, Schu¨tz D, Kla¨tschke H. VersuchszeitReduzierung bei der mehraxialen Betriebslastensimulation durch das LBF Simultanverfahren. DVM-Bericht 1997;123:191–205. [3] Oelmann B. Online Datenerfassung Verarbeitung und Analyse mit KISS. Praxisbericht aus der Pkw-Entwicklung. DVM-Bericht 1998;125:31–58. [4] Nagel KD. Langzeitfeldmessungen. DVM-Bericht 1998;125: 107–9. [5] Murakami Y, Mineki K, Wakamatsu T, Morita T. Data acquisition by a small portable strain histogram recorder (Mini-Rainflow-Coder) and application to fatigue design of car wheels. In: Marquis G, Solin J, editors. Fatigue design and reliability. ESIS publication, 23. Amsterdam: Elsevier, 1999:135–45. [6] Grubisic V, Fischer G. Methodology for effective design evaluation and durability approval of car suspension components. SAE technical paper series 970094, 1997. [7] Zhao W, Baker MJ. On the probability density function of rainflow stress range for stationary Gaussian processes. Int J of Fatigue 1992;14(2):121–35. [8] Nagode M, Fajdiga M. On the new method of the loading spectra extrapolation and its scatter prediction. In: Marquis G, Solin J, editors. Fatigue design and reliability. ESIS publication, 23. Amsterdam: Elsevier, 1999:147–53. [9] Titterington DM, Smith AFM, Makov UE. Statistical analysis of finite mixture distributions. Chichester: John Wiley and Sons, 1985. [10] Buxbaum O, Zaschel JM. Beschreibung stochastischer Beanspruchungs-Zeit-Funktionen. Verhalten von Stahl bei schwingender Beanspruchung. Kontaktstudium Werkstoffkunde Eisen und Stahl III. Du¨sseldorf: Verein Deutscher Eisenhu¨ttenleute, 1979: 208– 22. [11] Nagode M, Fajdiga M. A general multi-modal probability density function suitable for the rainflow ranges of stationary random processes. Int J of Fatigue 1998;20(3):211–23. [12] Grabec I, Sachse W. Synergetics of measurement prediction and control. Berlin: Springer Verlag, 1997. [13] Bain JL, Engelhardt M. Statistical analysis of reliability and lifetesting models. New York: Marcel Dekker, 1991. [14] Bishop CM. Neural networks for pattern recognition. Oxford: Clarendon Press, 1998.