Accepted Manuscript
Engineering Failure Data Analysis: Revisiting the Standard Linear Approach
Zhigang Wei, Fulun Yang, Henry Cheng, Shervin Maleki, Kamran Nikbin
PII: S1350-6307(12)00273-7
DOI: http://dx.doi.org/10.1016/j.engfailanal.2012.12.003
Reference: EFA 1891
To appear in: Engineering Failure Analysis
Received Date: 3 September 2012
Revised Date: 4 December 2012
Accepted Date: 12 December 2012
Please cite this article as: Wei, Z., Yang, F., Cheng, H., Maleki, S., Nikbin, K., Engineering Failure Data Analysis: Revisiting the Standard Linear Approach, Engineering Failure Analysis (2013), doi: http://dx.doi.org/10.1016/j.engfailanal.2012.12.003
Engineering Failure Data Analysis: Revisiting the Standard Linear Approach
Zhigang Wei1*, Fulun Yang1, Henry Cheng2, Shervin Maleki3, Kamran Nikbin4
1 Emission Control Team, Tenneco, Inc., USA
2 Tenneco China Tech Center, Shanghai, China
3 The Welding Institute, Ltd., Granta Park, Cambridge, UK
4 Department of Mechanical Engineering, Imperial College, UK
* Corresponding author
Abstract In this paper, the current standard method and other commonly used engineering practices for linear or linearized engineering failure data analysis are critically reviewed first, and the existing issues are also indicated. To overcome these issues, a linear data analysis method based on a new equilibrium mechanism is subsequently presented. Based on the equilibrium mechanism, three possible ideal data patterns are identified and the corresponding best curve fitting approaches are proposed. Compared to the existing methods, the equilibrium method not only provides quantitative solutions of fit parameters, but also gives an obvious physical meaning and, therefore, is more intuitive in quickly and correctly identifying data patterns, subsequent data preprocessing, and evaluating goodness-of-fit. The equilibrium method is further applied to fatigue and creep data analyses to demonstrate its applicability.
Keywords: Failure data analysis, Least squares method, Spring analogy, Fatigue, Creep.
Page 1 of 34
1. Introduction
Curve and surface fitting is one of the most important data processing procedures in engineering failure analysis. Linear curve/surface fitting is widely applied in many engineering fields because of its simplicity and practical feasibility [1, 2] compared with nonlinear data analysis methods [3-5]. In fact, many engineering data analysis problems can be reduced to a linear problem. For example, stage-II fatigue crack growth data described by the Paris law can be reasonably approximated by a straight line in a log-log plot. Although a typical fatigue crack growth rate curve may have a more complex shape than a straight line, it can be approximated piecewise with power-law segments, and linear data analysis can be applied to each segment. The least squares (LS) method and maximum likelihood estimation (MLE) are two of the most widely used methods for curve/surface fitting, and both lead to identical closed-form analytical solutions for normally distributed data [6, 7]. However, improper use of these methods leads to wrong and misleading conclusions. To avoid these issues, many engineering standards, such as those of the American Society for Testing and Materials (ASTM) [8] and Det Norske Veritas (DNV) [9], have been published for the analysis of fatigue S-N data. For example, in the ASTM standard for metallic fatigue [8], the stress range S is defined as the independent variable because it is the “controlled” variable in experiments, and the number of cycles to failure N is considered the dependent variable. The ASTM standard also recommends that the cycles to failure N be plotted on the abscissa and the stress range S on the ordinate [8]. Therefore, the horizontal offsets method is used to evaluate the variation of the cycles to failure N. For materials used specifically for wind turbine design, the DNV standard for fatigue design gives a similar recommendation [9].
However, in practice, engineers sometimes use different procedures even though the standard methods are widely used in industry. The following are a few examples: (1) Many engineering software packages provide a convenient tool for simple curve fitting. The tool can be used to add a trendline and is therefore very popular in the engineering community [10]. However, the trendline function in many engineering software packages assumes that the independent variable is on the abscissa and the dependent variable is on the ordinate. In other words, for a given set of fatigue S-N data, the fatigue S-N curve obtained with such software is essentially based on the vertical offsets method, i.e. residuals of the stress range S are evaluated. If the data are not properly organized, the trendline obtained from the software is opposite to the ASTM convention for fatigue S-N curve analysis, and one should not expect the fit curve to resemble that of the ASTM standard method; (2) Some engineers treat the stress range S as the dependent variable in their practice, e.g. in the VAMAS (Versailles Project on Advanced Materials and Standards) round-robin test [11, 12] for high cycle fatigue data analysis, six out of seven of the analyses (submitted by participants from Japan and Germany) used the stress range as the dependent variable; (3) A complete fatigue S-N curve can usually be divided into two parts: a sloped (inclined) part in the relatively low cycle regime and a horizontal part in the very high cycle regime [13]. For this kind of curve, some researchers use a mixed approach to fit fatigue S-N data, e.g. the sloped part is obtained with the horizontal offsets method, whereas the vertical offsets method is used for the high cycle regime near the endurance limit [14]. In addition, there is no universally accepted guidance on fitting the two-slope fatigue S-N curve, in which the conventional horizontal ‘fatigue limit’ is replaced with a decreasing curve [15]. The inadequacy and inconsistency of the conventional LS and MLE methods as adopted by the standard methods have been recognized for a long time [13, 16], even though the methods themselves are well understood.
For example, for a series of test data, the existing standards lead to S-N curves with differing slopes (−3.00 to −3.75) and different cutoff points ($2 \times 10^{6}$ to $10 \times 10^{6}$, or $\infty$), where the S-N curve changes slope to the horizontal [17]. These unsatisfactory data fits clearly indicate that some fundamental link between the LS/MLE theories and their targeted applications in data analysis is still missing, and the missing part should be clearly understood before an uncontroversial method can be fully established. This paper is intended to fill this gap with a new insight. In this paper, an equilibrium based linear curve/surface fitting concept is proposed for the first time. The new method reveals the nature of the linear curve/surface fitting process, i.e. the equilibrium mechanism, which is analogous to force and angular moment equilibrium in classical mechanics.
2. Equilibrium mechanism based linear curve/surface fitting
2.1 Spring analogy
The principles behind the LS and the MLE [6, 7] are simple and elegant but abstract, and it is difficult to judge whether the obtained fit parameters and curves are accurate. Therefore, a model with a physical analogy should be very helpful in understanding the nature of data fitting and in subsequent engineering design. The following spring model for the linear functions y = a + bx, Figure 1(a), and z = a + bx + cy, Figure 1(b), is developed in this paper for this purpose. In this spring model, each test data point is treated as a mass connected to a massless, frictionless, unbreakable, infinitely stretchable spring of zero initial length. The linear elastic behavior of the springs can then be described by Hooke’s law. Therefore, the force is proportional to the displacement of a spring from its equilibrium position, and the mass itself makes no contribution because static equilibrium is assumed. The best fit curve can then be solved by introducing the equilibrium (both linear and angular) concept, which is analogous to the equilibrium of forces and angular moments in classical mechanics.
Figure 1 Data points and their elastic spring analogy for (a) curve fitting and (b) surface fitting
It should be noted that there are several possible data evaluation methods in terms of the direction along which a curve/surface fitting process is conducted. For example, the vertical offsets method, Figure 2(a), and the horizontal offsets method, Figure 2(b), are used when only one variable is subject to variation, and a total least squares method, such as the perpendicular offsets method, Figure 2(c), is used when two or more variables are subject to variation [17]. Therefore, the ‘force’ and ‘moment’ are vectors, and the equilibrium must be established in a specific direction, which could be vertical, horizontal, or perpendicular. The same choice of directions also applies to surface fitting.
Figure 2 Equilibrium mechanism based linear curve fitting method: (a) vertical equilibrium, (b) horizontal equilibrium, (c) perpendicular equilibrium. In each panel the net 'force' and the net 'moment' about the point c are zero in the stated direction.
2.2 Linear curve fitting
(1) Equilibrium along y (vertical) direction
As shown in Figure 2(a), $F_i^y$ $(i = 1, \dots, n)$ has the same meaning as $d_i = \Delta y_i$ in the least squares method [6, 7], but is written in terms of the 'force' from a point $(x_i, y_i)$ to the expected curve $y = a + bx$ along the vertical direction. For a point i, $F_i^y$ is derived as
$$F_i^y = y_i - (a + b x_i) \tag{1}$$
The total force equilibrium is then
$$\sum_{i=1}^{n} F_i^y = \sum_{i=1}^{n} \left[ y_i - (a + b x_i) \right] = 0 \tag{2}$$
The total moment balance, with arms along the x direction, of all these data around a point $(x_c, y_c)$, which could be the centroid of the system of points, is
$$\sum_{i=1}^{n} M_i^y = \sum_{i=1}^{n} \left\{ \left[ y_i - (a + b x_i) \right] (x_i - x_c) \right\} = \sum_{i=1}^{n} \left\{ \left[ y_i - (a + b x_i) \right] x_i \right\} = 0 \tag{3}$$
In deriving Equation (3), the terms with $x_c$ cancel out because of Equation (2). The final solution is
$$a = \frac{\sum_{i=1}^{n} y_i \sum_{i=1}^{n} x_i^2 - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} x_i y_i}{n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2}, \qquad b = \frac{n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2} \tag{4}$$
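As a minimal sketch of the vertical equilibrium just derived, the closed form of Equation (4) can be coded directly and the two equilibrium conditions, Equations (2) and (3), checked numerically. The function names and the five data points below are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def fit_vertical(xs, ys):
    """Vertical-offsets fit y = a + b*x using the closed form of Equation (4)."""
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    denom = n * sxx - sx * sx
    a = (sy * sxx - sx * sxy) / denom
    b = (n * sxy - sx * sy) / denom
    return a, b


def vertical_equilibrium(xs, ys, a, b):
    """Return the net 'force' (Equation 2) and net 'moment' (Equation 3)."""
    forces = [y - (a + b * x) for x, y in zip(xs, ys)]
    net_force = sum(forces)
    net_moment = sum(f * x for f, x in zip(forces, xs))
    return net_force, net_moment


# Hypothetical data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = fit_vertical(xs, ys)
net_force, net_moment = vertical_equilibrium(xs, ys, a, b)
print(a, b, net_force, net_moment)
```

Both returned sums should vanish to within floating-point round-off, which is the numerical counterpart of the spring system being in static equilibrium.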
It should be noted that the concept of "centroid" has also been introduced [13] from the observation of a variety of fitting procedures.
(2) Equilibrium along x (horizontal) direction
In the same way, the equations of the fitted curve can be obtained by applying the equilibrium concept along the x direction, Figure 2(b). The distance $F_i^x$ from a point $(x_i, y_i)$ to the expected curve $y = a + bx$ along the horizontal direction is
$$F_i^x = x_i - \frac{1}{b} (y_i - a) \tag{5}$$
Then, the total force balance is
$$\sum_{i=1}^{n} F_i^x = \sum_{i=1}^{n} \left[ x_i - \frac{1}{b} (y_i - a) \right] = -\frac{1}{b} \sum_{i=1}^{n} F_i^y = 0 \tag{6}$$
It is also interesting to note that the force equilibrium in the x and y directions results in identical equations, Equations (2) and (6). This observation indicates that force equilibrium in one direction guarantees force equilibrium in the other direction. The total angular moment balance, with arms along the y direction, of all these data around a point $(x_c, y_c)$ is
$$\sum_{i=1}^{n} M_i^x = \sum_{i=1}^{n} \left\{ \left[ x_i - \frac{1}{b} (y_i - a) \right] (y_i - y_c) \right\} = -\frac{1}{b} \sum_{i=1}^{n} \left\{ \left[ y_i - (a + b x_i) \right] y_i \right\} = 0 \tag{7}$$
In deriving Equation (7), the term with $y_c$ cancels out because of Equation (6). The final solution is then
$$a = \frac{\sum_{i=1}^{n} y_i \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i^2}{n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}, \qquad b = \frac{n \sum_{i=1}^{n} y_i^2 - \left( \sum_{i=1}^{n} y_i \right)^2}{n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i} \tag{8}$$
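The horizontal equilibrium can be sketched in the same way: the code below evaluates the closed form of Equation (8) and then checks the horizontal 'force' and 'moment' sums of Equations (6) and (7). As before, the function names and data are hypothetical illustrations rather than material from the paper.

```python
def fit_horizontal(xs, ys):
    """Horizontal-offsets fit y = a + b*x using the closed form of Equation (8)."""
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    syy = sum(y * y for y in ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    denom = n * sxy - sx * sy  # zero when x and y are uncorrelated
    a = (sy * sxy - sx * syy) / denom
    b = (n * syy - sy * sy) / denom
    return a, b


def horizontal_equilibrium(xs, ys, a, b):
    """Return the net horizontal 'force' (Equation 6) and 'moment' (Equation 7)."""
    forces = [x - (y - a) / b for x, y in zip(xs, ys)]
    net_force = sum(forces)
    net_moment = sum(f * y for f, y in zip(forces, ys))
    return net_force, net_moment


# Hypothetical data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a_h, b_h = fit_horizontal(xs, ys)
force_h, moment_h = horizontal_equilibrium(xs, ys, a_h, b_h)
print(a_h, b_h, force_h, moment_h)
```

For positively correlated data such as these, the horizontal-offsets slope is never smaller than the vertical-offsets slope of the same data set, which is why the choice of direction matters once scatter is appreciable.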
The equilibrium equations in the perpendicular direction can also be derived with the perpendicular offsets method (see Appendix A), which is an alternative to the conventional vertical and horizontal offsets methods.
2.3 Linear surface fitting
Similar to the linear curve fitting procedure, a surface fitting procedure can be developed as follows. A linear function with two independent variables x and y is expressed in Equation (9)
$$z = f(x, y) = a + bx + cy \tag{9}$$
The signed distance or ‘force’ $F_i^z$ from a point $(x_i, y_i, z_i)$ to the expected surface along the vertical direction, i.e. the z direction, is $F_i^z = z_i - (a + b x_i + c y_i)$, and the total ‘force’ equilibrium is
$$\sum_{i=1}^{n} F_i^z = \sum_{i=1}^{n} \left[ z_i - (a + b x_i + c y_i) \right] = 0 \tag{10}$$
The corresponding total ‘moment’ equilibrium is then
$$\sum_{i=1}^{n} M_i^x = \sum_{i=1}^{n} \left[ F_i^z (x_i - x_c) \right] = \sum_{i=1}^{n} \left\{ \left[ z_i - (a + b x_i + c y_i) \right] x_i \right\} = 0 \tag{11}$$
for moment arms along the x direction, and
$$\sum_{i=1}^{n} M_i^y = \sum_{i=1}^{n} \left[ F_i^z (y_i - y_c) \right] = \sum_{i=1}^{n} \left\{ \left[ z_i - (a + b x_i + c y_i) \right] y_i \right\} = 0 \tag{12}$$
for moment arms along the y direction. $(x_c, y_c, z_c)$ could be the centroid of the system of data points. It should be noted that in deriving Equation (11) and Equation (12) the moment arms in the x and y directions are written as $(x_i - x_c)$ and $(y_i - y_c)$, respectively. Eventually, the terms with $x_c$ and $y_c$ cancel out because of the 'force' equilibrium, Equation (10). Therefore, as in the curve fitting case, the exact location of the reference point $(x_c, y_c, z_c)$ does not make any difference. For the linear function, Equation (9), the parameters a, b, and c can be uniquely solved with Equations (10), (11), and (12), or with Equation (13) in matrix form.
$$\mathbf{M} \mathbf{X} = \mathbf{K} \quad \text{or} \quad \mathbf{X} = \mathbf{M}^{-1} \mathbf{K} \tag{13}$$
where
$$\mathbf{M} = \begin{bmatrix} n & \sum_{i=1}^{n} x_i & \sum_{i=1}^{n} y_i \\ \sum_{i=1}^{n} x_i & \sum_{i=1}^{n} x_i^2 & \sum_{i=1}^{n} x_i y_i \\ \sum_{i=1}^{n} y_i & \sum_{i=1}^{n} x_i y_i & \sum_{i=1}^{n} y_i^2 \end{bmatrix}, \quad \mathbf{X} = \begin{bmatrix} a \\ b \\ c \end{bmatrix}, \quad \mathbf{K} = \begin{bmatrix} \sum_{i=1}^{n} z_i \\ \sum_{i=1}^{n} x_i z_i \\ \sum_{i=1}^{n} y_i z_i \end{bmatrix} \tag{14}$$
The equilibrium equations can also be applied in other directions, such as the perpendicular (normal) direction. The formula for the fit parameters for linear surface fitting with the perpendicular offsets method is derived in Appendix B. The solutions of the equilibrium based surface fitting method reduce to those of the curve fitting method for the case with a single independent variable. It should be emphasized here that all of the formulae derived using the equilibrium concept are exactly the same as those obtained with the conventional LS method and the MLE method for normal distribution [6, 7]. The basic idea of the least squares method is to find the parameters, e.g. a, b, and c for the expected best fit surface, by minimizing the sum of the squares of residuals, whereas the maximum likelihood estimate is obtained by maximizing the likelihood function. For example, for the vertical offsets method with the LS, the sum of squares of the residuals $d_i = \Delta z_i$ for a given set of data points $(x_i, y_i, z_i)$, $i = 1, \dots, n$, can be expressed as
$$R_z^2 = \sum_{i=1}^{n} \left[ z_i - (a + b x_i + c y_i) \right]^2$$
Setting the derivatives of $R_z^2$ with respect to a, b, and c to zero, i.e. $\partial (R_z^2) / \partial a = 0$, $\partial (R_z^2) / \partial b = 0$, and $\partial (R_z^2) / \partial c = 0$, leads to the normal Equations (10), (11), and (12), respectively. Therefore, the equilibrium method and the LS method are equivalent in surface fitting of the linear function, Equation (9). The Gauss-Markov theorem [6, 7] provides the mathematical proof for the LS method, and therefore implicitly confirms the correctness of the spring analogy of the equilibrium method. The Gauss-Markov theorem states that in a linear model in which the errors have expectation zero, are uncorrelated, and have equal variances, the best linear unbiased estimators of the coefficients are the least-squares estimators. Therefore, the equilibrium method has both a mathematical basis and an intuitive physical picture.
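The matrix form of Equations (13) and (14) lends itself to a short numerical sketch. The code below assembles M and K from the data sums, solves the 3×3 system with a small Gaussian elimination routine written here for self-containment (any linear solver would do), and then checks the 'force' and 'moment' equilibrium of Equations (10) to (12). All names and data values are hypothetical.

```python
def solve_linear(m, k):
    """Solve M X = K by Gaussian elimination with partial pivoting."""
    n = len(k)
    aug = [list(row) + [rhs] for row, rhs in zip(m, k)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    sol = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * sol[c] for c in range(r + 1, n))
        sol[r] = (aug[r][n] - tail) / aug[r][r]
    return sol


def fit_plane_vertical(xs, ys, zs):
    """Vertical-offsets fit z = a + b*x + c*y via Equations (13) and (14)."""
    n = len(xs)
    sx, sy, sz = sum(xs), sum(ys), sum(zs)
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxz = sum(x * z for x, z in zip(xs, zs))
    syz = sum(y * z for y, z in zip(ys, zs))
    m = [[n, sx, sy], [sx, sxx, sxy], [sy, sxy, syy]]
    k = [sz, sxz, syz]
    return solve_linear(m, k)


# Hypothetical data scattered about the plane z = 1 + 2x + 3y.
xs = [0.0, 1.0, 0.0, 1.0, 2.0, 2.0]
ys = [0.0, 0.0, 1.0, 1.0, 0.0, 1.0]
zs = [1.1, 2.9, 4.0, 6.1, 5.0, 7.9]

a, b, c = fit_plane_vertical(xs, ys, zs)
forces = [z - (a + b * x + c * y) for x, y, z in zip(xs, ys, zs)]
net_force = sum(forces)                                  # Equation (10)
moment_x = sum(f * x for f, x in zip(forces, xs))        # Equation (11)
moment_y = sum(f * y for f, y in zip(forces, ys))        # Equation (12)
print(a, b, c, net_force, moment_x, moment_y)
```

The three sums should be zero to round-off, and data lying exactly on a plane should return that plane's coefficients.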
3. Data pattern identification with the equilibrium method
The identical solutions derived from the LS method, the MLE method for normal distribution, and the equilibrium method clearly demonstrate the equilibrium nature of the LS and MLE methods for linear data. In other words, the equilibrium method is shown to be consistent with the widely used LS and MLE methods. Therefore, the equilibrium method can be confidently applied to statistical analysis of linear or linearized data.
3.1 Standard data patterns
As described previously, there are many possible equilibrium directions, such as vertical, horizontal, and perpendicular directions, and different directions result in different fitted curves. Therefore, in order to find a unique solution in a data fitting process, the data pattern has to be determined first. Otherwise, the fitted curve or surface obtained by blindly applying a fitting method to an arbitrary fitting direction could result in inaccurate prediction. It should be noted that for most engineering applications, the boundaries (the regime near the low and the high ends of the independent variable) of data control the fit of a curve due to very often insufficient data collection at the two extreme ends. Without losing generality, we first assume that the test data follow normal or lognormal distribution and the standard deviations of data are uniformly distributed for the range studied, i.e. the scatter band width is the same everywhere. With the equilibrium mechanism, it is reasonable to define the following three ideal data patterns as shown in Figure 3 (a), (b), and (c) for curve fitting and two ideal data patterns as shown in Figure 4(a) and (b) for surface fitting. For data sets with these patterns, based on the equilibrium mechanism, the best fit curve/surface must be the middle curve (line)/middle surface (plane), so that the data can be symmetrically distributed
around the expected curve/surface and balanced to make sure that the net ‘force’ and net ‘moment’ are zero as required from the equilibrium principle.
Figure 3 Ideal curve fitting methods for given data patterns: (a) vertical offsets, (b) horizontal offsets, and (c) perpendicular offsets
Figure 4 Vertical (a) and perpendicular (b) offsets directions
Since the best fit can be guaranteed for these ideal data patterns with their middle lines or planes, we call these data patterns ‘standard patterns’ from now on, i.e. the vertical pattern (Figure 3(a), Figure 4(a)), the horizontal pattern (Figure 3(b)), and the perpendicular pattern (Figure 3(c), Figure 4(b)), respectively. It can be demonstrated that, for a given standard data pattern and its corresponding fit curve or surface, any deviation from the standard pattern will result in a deviated fit curve/surface, which is not accurate and is therefore undesired. Take Figure 3(a), the 'standard vertical pattern', also shown in Figure 5(a), as an example. If two triangular data blocks are symmetrically added to the lower and upper bounds of the existing data set with the standard vertical pattern (Figure 5(b)), the fit curve will move down to meet the new 'force' equilibrium because of the two added blocks of data. The net 'angular moment' cancels out in the case shown in Figure 5(b). If the two blocks added to the existing standard pattern are anti-symmetrical, Figure 5(c), then the fit curve will rotate around the centroid by a certain angle to establish a new equilibrium because of the added net 'angular moment'. In the case shown in Figure 5(c), the net 'force' contributions from the two blocks cancel out. For more general added data blocks, such as the case with only one triangular block, both 'force' and 'moment' will cause the fit curve to translate and rotate, which will result in inaccurate fit curves. Therefore, for a given equilibrium direction, any deviation of the data pattern from the standard pattern will lead to an inaccurate fit curve.
Figure 5 Equilibrium establishment of (a) vertical pattern; (b) vertical patterned data with added symmetrical data blocks (net 'force' F); (c) vertical patterned data with added anti-symmetrical data blocks (net 'moment' M)
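The translation and rotation argument of Figure 5 can be reproduced with a small numerical experiment. In the sketch below, a hypothetical 'standard vertical pattern' is built as pairs of points one unit above and below the line y = x, and the triangular blocks of the figure are simplified to single added points at the two ends. The fit uses the centroid form of the vertical-offsets solution, which is algebraically equivalent to Equation (4).

```python
def fit_vertical(points):
    """Vertical-offsets fit y = a + b*x written about the centroid (x_c, y_c)."""
    n = len(points)
    x_c = sum(x for x, _ in points) / n
    y_c = sum(y for _, y in points) / n
    sxx = sum((x - x_c) ** 2 for x, _ in points)
    sxy = sum((x - x_c) * (y - y_c) for x, y in points)
    b = sxy / sxx
    a = y_c - b * x_c
    return a, b


# Hypothetical standard vertical pattern: uniform band of half-width 1 about y = x.
base = [(float(x), x + d) for x in (-2, -1, 0, 1, 2) for d in (-1.0, 1.0)]

# Symmetric addition: both extra points lie 3 below the line (net 'force' only).
symmetric = base + [(-2.0, -5.0), (2.0, -1.0)]

# Anti-symmetric addition: one point 3 below, one 3 above (net 'moment' only).
antisymmetric = base + [(-2.0, -5.0), (2.0, 5.0)]

a0, b0 = fit_vertical(base)
a_sym, b_sym = fit_vertical(symmetric)
a_anti, b_anti = fit_vertical(antisymmetric)
print((a0, b0), (a_sym, b_sym), (a_anti, b_anti))
```

With these data the symmetric addition shifts the intercept while leaving the slope unchanged, and the anti-symmetric addition changes the slope while leaving the intercept unchanged, matching the pure translation and pure rotation described for Figures 5(b) and 5(c).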
3.2 Consistency and significance of identifying data patterns
The importance of data pattern identification can be further demonstrated by examining the relationship between the fit parameters obtained with different standard data patterns from a data set of the same source. Figure 6(a) is a set of data with the standard vertical pattern trimmed from a fatigue crack growth data set from the aircraft industry [18]. The data shown in Figure 6(b) with the standard horizontal pattern and the data shown in Figure 6(c) with the standard perpendicular pattern are obtained by further trimming the data shown in Figure 6(a). These data with the standard patterns are then fitted with the corresponding curve fitting methods shown in Figures 3(a), (b), and (c). The fit curves are plotted in Figure 6 and the fit parameters are listed in Table 1, with the linearized equation y = a + bx used for curve fitting. From the highlighted values in Table 1 it is clear that the fit parameters obtained by fitting the vertical pattern with the vertical offsets method, the horizontal pattern with the horizontal offsets method, and the perpendicular pattern with the perpendicular offsets method are very close to each other, while the values obtained with the other combinations vary widely. This clearly demonstrates the importance of identifying the data pattern before fitting data to a curve: the predicted best fit curve should be consistent and accurate as long as the equilibrium direction and the corresponding standard data pattern are chosen consistently and correctly for a given set of data. It should be noted that the perpendicularity of a data pattern depends strongly on the coordinates and scales used and is thus essentially subjective. By contrast, the definitions of ‘vertical’ and ‘horizontal’ are objective no matter what coordinates and scales are used. Therefore, the vertical and horizontal offsets methods are recommended in data fitting whenever possible. Special caution should be taken if data with the vertical or horizontal pattern are not available and the use of the perpendicular offsets method is necessary.
Figure 6 Fatigue crack growth data with standard data patterns: (a) vertical pattern, (b) horizontal pattern, (c) perpendicular pattern. Each panel plots crack growth rate (mm/SFH) against stress intensity factor range (MPa·m^0.5) on log-log axes and shows the data together with the vertical, horizontal, and perpendicular fit curves.
Table 1 Calculated fitting parameters with a = log(C) and b = m for the power law da/dN = C(ΔK)^m, where N is number of cycles to failure and ΔK is the stress intensity factor range
                        Vertical pattern      Horizontal pattern    Perpendicular pattern
                        a         b           a         b           a         b
Vertical offsets        -6.168    2.510       -5.754    1.997       -5.916    2.196
Horizontal offsets      -6.414    2.817       -6.187    2.535       -6.234    2.596
Perpendicular offsets   -6.383    2.779       -6.113    2.443       -6.178    2.525
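The scale dependence of the perpendicular offsets method noted in Section 3.2 can be illustrated with a short sketch. The perpendicular slope below uses the textbook closed form for orthogonal regression with equal weighting of both axes, which is an assumption here and may be presented differently in Appendix A. The data and the scale factor of 10 are hypothetical. The x values are rescaled (as a change of units would do), the line is refitted, and the slope is converted back to the original units.

```python
import math


def centered_sums(xs, ys):
    """Return the centered sums Sxx, Syy, Sxy about the centroid."""
    n = len(xs)
    x_c = sum(xs) / n
    y_c = sum(ys) / n
    sxx = sum((x - x_c) ** 2 for x in xs)
    syy = sum((y - y_c) ** 2 for y in ys)
    sxy = sum((x - x_c) * (y - y_c) for x, y in zip(xs, ys))
    return sxx, syy, sxy


def slope_vertical(xs, ys):
    sxx, _, sxy = centered_sums(xs, ys)
    return sxy / sxx


def slope_perpendicular(xs, ys):
    """Orthogonal-regression slope (assumed standard form; requires Sxy != 0)."""
    sxx, syy, sxy = centered_sums(xs, ys)
    return ((syy - sxx) + math.sqrt((syy - sxx) ** 2 + 4.0 * sxy ** 2)) / (2.0 * sxy)


# Hypothetical scattered data.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.0, 2.0, 1.0, 3.0, 4.0]
scale = 10.0
xs_scaled = [scale * x for x in xs]

b_vert = slope_vertical(xs, ys)
b_vert_back = scale * slope_vertical(xs_scaled, ys)       # back in original units
b_perp = slope_perpendicular(xs, ys)
b_perp_back = scale * slope_perpendicular(xs_scaled, ys)  # back in original units
print(b_vert, b_vert_back, b_perp, b_perp_back)
```

The vertical-offsets slope is recovered unchanged after the unit conversion, whereas the perpendicular-offsets slope is not, which is the sense in which 'perpendicular' depends on the coordinates and scales chosen.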
3.3 Guideline on data trimming
Clearly, blindly applying a curve fitting method, such as the standard horizontal offsets method [8], to a given data set, no matter what pattern the data belong to, is not good practice. To obtain an ideal data pattern and, eventually, physically accurate fit parameters, efforts should start at the very early stage of test planning if possible, i.e. the same number of data points should be allotted at all stress levels [2]; the ideal standard data patterns can then be guaranteed as long as the data set is linear. However, in practice, test data may not be obtained by careful planning, or the data may come from different sources. In these cases, the standard data patterns cannot be guaranteed and a data trimming process, as demonstrated in Section 3.2, may be a necessary step toward an accurate curve fit. However, data trimming is a delicate process, and to facilitate it, the following procedure is provided to guide engineers toward consistent operation. The procedure:
1. Find the boundary of the data set as schematically shown in Figure 7(a). The convex hull algorithm provides an ideal tool for quantitatively determining the boundary [19]. The data pattern can be easily identified by observing the determined boundary, such as the red solid line shown in Figure 7(a).
2. Find the extreme points on the top, bottom, left and right, Figure 7(b). If a pair of opposite extremes are vertical lines or horizontal lines, as in the cases shown in Figures 6(a) and (b), respectively, the ideal patterns can be immediately identified.
3. If a pattern has not been identified in step 2, translate the caliber lines (the blue vertical or horizontal lines initially passing through the extreme points) inward until the two boundary lines of the data enclosed by the caliber lines are almost straight, without significant curvature, Figure 7(c) and (d). Then the data swept by the caliber lines can be discarded, and only the data enclosed by the caliber lines and the boundary lines are used for the data fit: the vertical method for the vertical data pattern, Figure 7(c), and the horizontal method for the horizontal data pattern, Figure 7(d). Theoretically, the fit parameters obtained from Figure 7(c) and (d) should be similar, as demonstrated in Section 3.2. However, in practice, the results are sensitive to the data trimmed off; the rule of thumb is to keep as much data as possible as long as the standard pattern is kept. Therefore, the trimming shown in Figure 7(c) is preferred over that of Figure 7(d).
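The first steps of the procedure above can be sketched in code. The example below computes the convex hull with Andrew's monotone chain algorithm (one common convex hull algorithm, chosen here for brevity and not necessarily the one used in [19]), reads off the extreme coordinates, and trims the data between two vertical caliber lines. The point set and the caliber positions are hypothetical, and deciding how far to move the caliber lines inward remains a judgment made by inspecting the boundary.

```python
def cross(o, a, b):
    """z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def extreme_coordinates(points):
    """Return (x_min, x_max, y_min, y_max), the initial caliber positions."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return min(xs), max(xs), min(ys), max(ys)


def trim_between_vertical(points, x_lo, x_hi):
    """Keep only the points between two vertical caliber lines (inclusive)."""
    return [p for p in points if x_lo <= p[0] <= x_hi]


# Hypothetical data: a square block with one interior point and one outlying point.
points = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0), (1.0, 1.0), (3.0, 1.0)]

hull = convex_hull(points)
bounds = extreme_coordinates(points)
trimmed = trim_between_vertical(points, 0.0, 2.0)
print(hull, bounds, trimmed)
```

After trimming, the remaining points are bounded by two vertical lines, so they form a vertical pattern and would be fitted with the vertical offsets method.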
Figure 7 Recommended data trimming procedure, panels (a) to (d)
It should be noted that the final location of the caliber lines may not be accurately determined, depending on the specific data structure. However, the accuracy of the fit can be substantially improved as long as the main data pattern is captured and a trimming process is carried out. It should also be noted that in some cases no pattern can be identified, and reliable fit parameters cannot be obtained with either method. To improve the accuracy of the test data in such cases, new planned tests are recommended.
4. Application of the equilibrium method
Several examples are given below to demonstrate the advantage and the capability of the equilibrium method.
4.1 Example-1
Figure 8 shows a set of fatigue data of welded automotive exhaust components made of steel. Tests were conducted by controlling the applied force, and only two force levels were tested, with a total of six data points at each force level. Wide scatter bands can be observed for both force levels because many factors were involved in the failure of the exhaust components. Since the data pattern in Figure 8 is similar to that shown in Figure 3(b), the horizontal offsets method, which is the method recommended by the ASTM standard [8], should provide a reasonable fit curve. The fit curves obtained with the three fitting methods are plotted in Figure 8 and the fit parameters are listed in Table 2. It is clear that the results of the horizontal offsets method are very different from those of the other two methods, while the vertical and the perpendicular results are almost identical.
Figure 8 Vertical, horizontal, and perpendicular offsets methods for fatigue data of an automotive exhaust component (Log(S) plotted against Log(N))
Table 2 Calculated fit parameters with a = log(C) and b = −1/m for the power law S = C N^(−1/m)
      Vertical offsets   Horizontal offsets   Perpendicular offsets
a     3.345              4.037                3.355
b     -0.117             -0.254               -0.119
4.2 Example-2
The current structural-stress based fatigue master S-N curve shown in Figure 9 was built upon more than 900 individual test points representing many different test configurations and notch classes of steels [20]. The mean curve was obtained with the standard horizontal offsets method [8] and the minus and plus shown in Figure 9 represent the left shift and right shift from the mean curve, respectively, and STD stands for standard deviation.
Figure 9 Master S-N curve and the database used to build the S-N curve (Eq. SS range, MPa, plotted against cycles to failure; test data with the mean, ±2*STD, and ±3*STD curves).
It is clear that the data pattern shown in Figure 9 does not belong to the category of Figure 3(b). A data trimming operation could be carried out before any curve fitting with a standard pattern, which would lead to a more accurate prediction. Here, instead of trimming the data, both the vertical offsets method and the perpendicular offsets method are used as alternatives. Figure 10 compares the fit curves obtained with the perpendicular offsets method (termed the "new master S-N curve"), the dashed lines, and with the horizontal offsets method (the ASTM standard method [8]), the solid lines. The fit parameters obtained with all three methods are listed in Table 3. It is clear that the difference is noticeable, even though the data collected have a relatively small scatter band. The horizontal offsets method can lead to a conservative prediction in the high cycle fatigue regime but a non-conservative prediction in the low cycle fatigue regime. The pivotal point is somewhere between the two fatigue regimes.
Figure 10 The comparison between the new master S-N curve and the current master S-N curve (Eq. SS range, MPa, plotted against cycles to failure; mean, ±2*STD, and ±3*STD curves for each).
Table 3 Master S-N curve with three curve fitting methods
Master S-N data   Vertical   Horizontal   Perpendicular
a                 4.0764     4.3371       4.0955
b                 -0.2794    -0.3254      -0.2827
4.3 Example-3
The importance of the equilibrium curve fitting method can be clearly demonstrated by the following fatigue S-N data of 360° welds with a wide scatter band. It is clear from Figure 11 that the data pattern of the fatigue S-N data does not belong to the standard horizontal pattern, Figure 3(b). Instead, the data fit the standard vertical pattern, Figure 3(a), or the standard perpendicular pattern, Figure 3(c), well. The mean curves obtained with the vertical (solid line) and horizontal (dash-dot line) methods are plotted in Figure 11 and the fit parameters are listed in Table 4; the difference between these two methods is significant for these data. The vertical and the perpendicular methods provide similar results. It should be noted that the horizontal method is the ASTM recommended curve fitting method [8]. The rotation of the fit curve obtained with the horizontal method can be readily explained by the equilibrium argument shown in Figure 5. It is interesting to note that the mean curve obtained with the vertical offsets method for these data follows a trend similar to the mean curve shown in Figure 9 and Table 3 for welded structures made of similar materials but with a much smaller scatter band [20]. For example, the slope b obtained with the vertical method for the wide-band data shown in Figure 11 is -0.218, which is close to -0.279 and -0.325 obtained, respectively, with the vertical and horizontal methods for the narrow-band data shown in Figure 9. However, the value obtained with the horizontal method for the wide-band data is -0.807, which is significantly different from that of the narrow-band data. The intercept a shows a similar difference. Therefore, the true material properties can be revealed only when the data pattern and the corresponding evaluation direction are correctly identified. Obviously, the standard ASTM recommended method [8] leads to misleading conclusions for this kind of data.
[Figure 11: log-log plot of Eq. SS range (MPa) versus cycles to failure (1.E+03 to 1.E+07); data series: Bushing, Flange, Pipe; fit lines: Horizontal method (Standard), Vertical method.]
Figure 11 360° closed welds data and the fit curves with standard horizontal offsets method (dash-dot line) and vertical method (solid line).
Table 4 Calculated fit parameters with $a = \log(C)$ and $b = -1/m$ for the power law $S = C N^{-1/m}$

Parameter   Vertical offsets   Horizontal offsets   Perpendicular offsets
a           3.772              6.809                3.929
b           -0.218             -0.807               -0.249
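The sensitivity of the slope to the chosen offset direction can be illustrated with a minimal sketch. The data below are synthetic (not the weld data of Figure 11), and the function names `fit_vertical` and `fit_horizontal` are hypothetical helpers introduced only for this illustration. The sketch assumes that the vertical method is ordinary least squares of y on x and that the horizontal method is least squares of x on y, re-expressed in the form y = a + bx.

```python
import numpy as np

def fit_vertical(x, y):
    """Least squares of y on x (vertical offsets); returns (a, b) in y = a + b*x."""
    b, a = np.polyfit(x, y, 1)
    return a, b

def fit_horizontal(x, y):
    """Least squares of x on y (horizontal offsets), rewritten as y = a + b*x."""
    b_inv, a_inv = np.polyfit(y, x, 1)  # x = a_inv + b_inv * y
    return -a_inv / b_inv, 1.0 / b_inv

# Synthetic wide-band data in log-log coordinates (illustrative values only)
rng = np.random.default_rng(0)
log_n = rng.uniform(3.0, 7.0, 200)                       # log10(cycles to failure)
log_s = 3.8 - 0.22 * log_n + rng.normal(0.0, 0.3, 200)   # log10(stress range)

a_v, b_v = fit_vertical(log_n, log_s)
a_h, b_h = fit_horizontal(log_n, log_s)
print(f"vertical:   a = {a_v:.3f}, b = {b_v:.3f}")
print(f"horizontal: a = {a_h:.3f}, b = {b_h:.3f}")
```

For any data set the ratio of the two slopes equals the squared correlation coefficient, so the horizontal-offsets line is always at least as steep as the vertical-offsets line, and the gap widens as the scatter band grows.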
4.4 Example-4
Figure 12 plots a collection of average creep rate data r of a steel at various temperature T and stress σ levels, and it is found that a linear function can be used to describe the creep rate. The temperature is expressed in degrees Celsius, based on the linear trend observed in the data. Overall, the data follow the pattern identified in Figure 4(a) for the overall data, or Figure 3(a) for the individual data set at each temperature level, except for some data, e.g. the lower stress level data at temperatures of 650 °C and 700 °C. Therefore, the vertical offsets method can be used in this data analysis to estimate the values of the fit parameters. Simple linear curve fitting can be done for each individual data set; however, it is more convenient to use a unified formula to fit all of the data at all temperature levels tested. Both the vertical offsets method, i.e. Equations (13) and (14), and the perpendicular offsets method, Equation (B-12) in Appendix B, are used here to demonstrate the surface fitting capability. The linear function $z = a + bx + cy$ with $z = \log(r)$, $x = T$, $y = \log(\sigma)$ is used here. The corresponding power law form of the function is $r = 10^{a}\,10^{bT}\,\sigma^{c}$. The predicted curves for each temperature level are also plotted in Figure 12, and the fit parameters are listed in Table 5. It is found that the fit curves obtained with the perpendicular method rotate counter-clockwise with respect to the respective curves obtained with the vertical offsets method, by an amount depending on the orientation of the given data. Furthermore, the individual curves obtained with each method are parallel to each other because of the use of the unified linear function $z = a + bx + cy$, which is linear in the log-log plot.
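A unified surface fit with vertical offsets can be sketched as follows. This is a minimal illustration that assumes the vertical offsets method reduces to ordinary least squares on z. The function name `fit_plane_vertical` and the synthetic data are hypothetical and are not taken from the creep data set analysed here.

```python
import numpy as np

def fit_plane_vertical(x, y, z):
    """Fit z = a + b*x + c*y by minimizing vertical (z-direction) offsets."""
    design = np.column_stack([np.ones_like(x), x, y])
    coeffs, *_ = np.linalg.lstsq(design, z, rcond=None)
    return coeffs  # array([a, b, c])

# Synthetic creep-like data (illustrative values only)
rng = np.random.default_rng(1)
temp = rng.choice([550.0, 600.0, 650.0, 700.0, 750.0, 800.0], size=120)  # T, deg C
log_sigma = rng.uniform(1.5, 2.5, size=120)                              # log10(stress, MPa)
log_rate = -30.0 + 0.025 * temp + 6.0 * log_sigma + rng.normal(0.0, 0.1, 120)

a, b, c = fit_plane_vertical(temp, log_sigma, log_rate)
print(f"a = {a:.2f}, b = {b:.4f}, c = {c:.2f}")
```

Because a single set (a, b, c) is shared by all temperatures, the fitted lines at different temperatures differ only by the shift bT and are therefore parallel in the log-log plot.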
[Figure 12: log-log plot of average creep rate (%/h, 1.E-05 to 1.E+01) versus stress (MPa, 10 to 1000); test data at 550, 600, 650, 700, 750 and 800 °C, with predicted curves from the vertical and perpendicular offsets methods at each temperature.]
Figure 12 Results of vertical and perpendicular offsets method with the unified linear surface $z = a + bx + cy$, where $z = \log(r)$, $x = T$, $y = \log(\sigma)$

Table 5 Fit parameters for creep test data for all testing temperatures (°C) with $z = a + bx + cy$, where $z = \log(r)$, $x = T$, $y = \log(\sigma)$

Method                  a        b       c
Vertical offsets        -31.66   0.025   6.27
Perpendicular offsets   -38.43   0.031   7.74
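As a usage illustration, the fitted parameters of Table 5 can be inserted into the power law form $r = 10^{a}\,10^{bT}\,\sigma^{c}$. The sketch below is hedged: the function name `creep_rate` and the evaluation points are hypothetical, and the predictions are only as accurate as the rounded values printed in the table.

```python
import math

# Fit parameters copied from Table 5 (T in deg C, stress in MPa, rate in %/h)
TABLE_5 = {
    "vertical": (-31.66, 0.025, 6.27),
    "perpendicular": (-38.43, 0.031, 7.74),
}

def creep_rate(temp_c, stress_mpa, method="vertical"):
    """Evaluate r = 10**a * 10**(b*T) * sigma**c for the chosen fitting method."""
    a, b, c = TABLE_5[method]
    log_r = a + b * temp_c + c * math.log10(stress_mpa)
    return 10.0 ** log_r

for method in TABLE_5:
    print(method, creep_rate(650.0, 100.0, method))
```

The ratio of predicted rates at two temperatures does not depend on the stress level, which is the parallel-curve property of the unified linear function.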
It should be noted that the curve fit for the data at temperatures of 650 °C and 700 °C can be improved by trimming the data into a standard vertical pattern. In addition, the curve fit for the data at a temperature of 800 °C can be enhanced by adding more test data.
5. Discussion
Linear curve fitting has been studied for hundreds of years and it is of paramount importance in engineering applications. However, even when using the widely adopted standard methods and procedures, engineers and scientists are often perplexed by poor data correlation, especially for wide-band data. Without detailed insight into the nature of curve fitting, the problem cannot be solved. In this paper, the equilibrium mechanism is introduced, with the help of a spring analogy, to help understand the equilibrium nature of the curve fitting problem. With the equilibrium mechanism, the importance of the data pattern for the accuracy of a fitted curve has been identified for the first time. Many issues faced by engineers can now be solved, and its potential impact on engineering applications is expected to be far-reaching.
In addition to data pattern identification as a pre-processing procedure, the equilibrium method also provides an intuitive tool to quickly determine the fitting quality of an existing fit curve/surface by visually examining the ‘force’ and ‘moment’ balance of the mean curve/surface with respect to the given data. Figure 5 gives a simple example of this procedure, and it can be done before applying the rigorous and sophisticated mathematical criteria for examining fitting quality [7].
In this paper, the data variance is assumed to be constant over the whole range of the independent variable. For data with variable variance, it may not be reasonable to assume that every observation should be treated equally; in fact, fatigue S-N data usually display a much larger standard deviation at low stress levels. To solve this problem, the weighted equilibrium method can be used to maximize the accuracy of parameter estimation. This has been successfully achieved by applying the equilibrium method developed in this paper and giving each data point its proper amount of influence (weight) over the parameter estimates [21]. Finally, it should be emphasized that the fit curves/surfaces obtained above are just mean curves/surfaces, which are the basis of further probabilistic data analysis. With an accurate mean curve/surface, the lower and upper bounds, such as the mean-minus-two-standard-deviation curve, can subsequently be developed for engineering design purposes.
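The ‘force’ and ‘moment’ balance check and the mean-minus-two-standard-deviation bound can be sketched numerically. The following is a minimal illustration under stated assumptions: vertical offsets are used, the moment is taken about the data centroid, the standard deviation uses n - 2 degrees of freedom, and the data are synthetic. The helper name `balance` is hypothetical.

```python
import numpy as np

def balance(x, y, a, b):
    """'Force' and 'moment' (about the data centroid) of the vertical residuals."""
    e = y - (a + b * x)
    force = e.sum()
    moment = (e * (x - x.mean())).sum()
    return force, moment

# Synthetic narrow-band data (illustrative values only)
rng = np.random.default_rng(2)
x = rng.uniform(3.0, 7.0, 60)
y = 3.8 - 0.22 * x + rng.normal(0.0, 0.1, 60)

# Vertical least squares line: both force and moment vanish
b_fit, a_fit = np.polyfit(x, y, 1)
force, moment = balance(x, y, a_fit, b_fit)

# Same line rotated about the centroid: force still balances, moment does not
force_rot, moment_rot = balance(x, y, a_fit + 0.5 * x.mean(), b_fit - 0.5)

# Mean-minus-two-standard-deviation curve: shift the intercept down by 2*s
s = np.sqrt(np.sum((y - (a_fit + b_fit * x)) ** 2) / (len(x) - 2))
a_lower = a_fit - 2.0 * s
print(force, moment, force_rot, moment_rot, a_lower)
```

A rotated line can therefore pass the force check while failing the moment check, which is why both balances need to be examined when judging an existing fit.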
6. Conclusions
1. An equilibrium based curve/surface fitting method is presented, and the equilibrium nature of a linear data fitting process is discovered in this paper.
2. Several ideal data patterns are defined based on the equilibrium nature of curve/surface fitting, and three corresponding fitting methods are determined for the best curve/surface fit.
3. The gap between the classic LS and MLE theories and their application in data analysis is filled by the equilibrium method, which provides a guideline for data preprocessing as well as goodness-of-fit examination.
4. The perpendicular offsets method provides an alternative way of curve/surface fitting in the cases where the vertical and horizontal offsets methods cannot be properly used. The formulae of the perpendicular offsets method for surface fitting are derived and provided.
Acknowledgments
The authors are grateful to Dr. Loris Molent at Defence Science and Technology Organisation, Australia, for providing fatigue test data.
References
1. ASTM Manual on Fitting Straight Lines, STP 313, ASTM International, 1962.
2. Little, R.E., Jebe, E.H., Statistical Design of Fatigue Experiments, Applied Science Publishers, London, 1975.
3. Nakazawa, H., Kodama, S., Statistical S-N testing method with 14 specimens: JSME standard method for determination of S-N curves, Statistical Research on Fatigue and Fracture, Elsevier Applied Science, New York, 1987, 59-69.
4. Ling, J., Pan, J., A maximum likelihood method for estimating P-S-N curves, International Journal of Fatigue, 1997, 19, 415-419.
5. Pascual, F.G., Meeker, W.Q., Estimating fatigue curves with the random fatigue-limit model, Technometrics, 1999, 41, 277-302.
6. Guest, P.G., Numerical Methods of Curve Fitting, Cambridge University Press, Cambridge, 1961.
7. Neter, J., Wasserman, W., Kutner, M.H., Applied Linear Statistical Models, Richard D. Irwin, Inc., Homewood, IL, 1990.
8. Standard practice for statistical analysis of linear or linearized stress-life (S-N) and strain-life (ε-N) fatigue data, ASTM Designation: E739-10.
9. Design and manufacture of wind turbine blades, offshore and onshore wind turbines, DNV-OS-J102, Det Norske Veritas, October 2006.
10. Dong, P., Hong, J.K., A robust structural stress parameter for evaluation of multiaxial fatigue of weldments, Journal of ASTM International, 2006, 3, 1-17.
11. Nishijima, S., Monma, Y., Kanazawa, K., Computational Models for Creep and Fatigue Data Analysis, VAMAS Technical Report No. 7, NRIM, 1990.
12. Nishijima, S., Monma, Y., Kanazawa, K., Significance of Data Evaluation Models in Materials Databases, VAMAS Technical Report No. 6, NRIM, 1990.
13. Spindel, J.E., Haibach, E., The method of maximum likelihood applied to the statistical analysis of fatigue data, International Journal of Fatigue, 1979, 1, 81-88.
14. Shimizu, S., Tosha, K., Tsuchiya, K., New data analysis of probabilistic stress-life (P-S-N) curve and its application for structural materials, International Journal of Fatigue, 2009, 32, 565-575.
15. Sonsino, C.M., Course of S-N-curves especially in the high-cycle fatigue regime with regard to component design and safety, International Journal of Fatigue, 2007, 29, 2246-2258.
16. Bishop, T.A., Collier, R.P., Kurth, R.E., Statistical analysis of ECC bypass data using a nonlinear constrained maximum likelihood estimation technique, Nuclear Engineering and Design, 1981, 64, 87-91.
17. Van Huffel, S., Vandewalle, J., The Total Least Squares Problem: Computational Aspects and Analysis, SIAM, Frontiers in Applied Mathematics, 1991.
18. Molent, L., Jones, R., Barter, S., Pitt, S., Recent developments in fatigue crack growth assessment, International Journal of Fatigue, 2006, 28, 1759-1768.
19. Wei, Z., Dong, P., A rapid path-length searching procedure for multi-axial fatigue cycle counting, Fatigue & Fracture of Engineering Materials & Structures, 2012, 35, 556-571.
20. Dong, P., Hong, J.K., Osage, D., Prager, M., Master S-N curve method for fatigue evaluation of welded components, WRC Bulletin, 2002, No. 474.
21. Wei, Z., Yang, F., Maleki, S., Nikbin, K., Equilibrium based curve fitting method for test data with nonuniform variance, PVP2012-78234, Proceedings of the ASME 2012 Pressure Vessels & Piping Division Conference, July 15-19, 2012, Toronto, Canada.
22. Lin, C.C., Segel, L.A., Mathematics Applied to Deterministic Problems in the Natural Sciences, SIAM, 1988.
Appendix-A Equilibrium based curve fitting solution with perpendicular offsets method
The linear curve to be fitted has the form of Equation (A-1):
$$ y = a + bx \tag{A-1} $$
To facilitate analysis, the following signed distance and signed area are introduced first:
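$$ d_i = \frac{y_i - (a + b x_i)}{\sqrt{1 + b^2}} \tag{A-2} $$
$$ A_i = \frac{1}{2} \begin{vmatrix} x_c & y_c & 1 \\ x_{0i} & y_{0i} & 1 \\ x_i & y_i & 1 \end{vmatrix} \tag{A-3} $$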
Figure A-1 shows the expected curve $y = a + bx$, the data point $P_i(x_i, y_i)$, the reference point $C(x_c, y_c)$, and the intersection point $O_i(x_{0i}, y_{0i})$ of the expected curve with the perpendicular straight line drawn from the data point $P_i(x_i, y_i)$. The signed distance $d_i$ is the distance between $P_i$ and $O_i$, and the signed area $A_i$ is the area of the triangle $C O_i P_i$. The signed distance and area distinguish the linear and angular directions with positive and negative signs, which is very helpful for vector analysis. The sign definition can be chosen arbitrarily as long as it is consistent. The sign convention in this paper is: the distance from a point on the left side of the vector $C$-$O_1$ is defined as positive; the area following a counterclockwise path such as $C$-$O_1$-$P_1$ is defined as positive. Therefore, $d_1$ and $A_1$ are positive, whereas $d_2$ and $A_2$ are negative in Figure A-1. The coordinates of $O_i(x_{0i}, y_{0i})$ can be written as follows:
$$ x_{0i} = \frac{x_i + b(y_i - a)}{1 + b^2}, \qquad y_{0i} = \frac{a + b x_i + b^2 y_i}{1 + b^2} \tag{A-4} $$
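The geometric quantities of Equations (A-2) to (A-4) can be checked with a small numerical sketch. The function names and the sample line and points below are hypothetical and serve only to illustrate the sign convention stated above.

```python
import math

def foot_point(a, b, xi, yi):
    """Foot O_i of the perpendicular from P_i = (xi, yi) onto y = a + b*x, Eq. (A-4)."""
    denom = 1.0 + b * b
    return (xi + b * (yi - a)) / denom, (a + b * xi + b * b * yi) / denom

def signed_distance(a, b, xi, yi):
    """Signed perpendicular distance d_i, Eq. (A-2)."""
    return (yi - (a + b * xi)) / math.sqrt(1.0 + b * b)

def signed_area(c, o, p):
    """Signed area A_i of triangle C-O_i-P_i, Eq. (A-3); positive if counterclockwise."""
    (xc, yc), (x0, y0), (xi, yi) = c, o, p
    return 0.5 * (xc * (y0 - yi) - yc * (x0 - xi) + (x0 * yi - xi * y0))

a, b = 1.0, 2.0              # line y = 1 + 2x
p = (0.0, 6.0)               # data point above (to the left of) the line
o = foot_point(a, b, *p)     # (2.0, 5.0), which lies on the line
d = signed_distance(a, b, *p)            # sqrt(5), positive
area = signed_area((0.0, 1.0), o, p)     # 5.0, positive (C-O-P is counterclockwise)
print(o, d, area)
```

The reference point C = (0, 1) is chosen on the line here for convenience; the signed distance equals the Euclidean distance between P and O, and the area equals half the product of |CO| and |OP|.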
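[Figure A-1: sketch in the x-y plane showing the fit line with reference point C, data point P1 with foot point O1, distance d1 (+) and area A1 (+), and data point P2 with foot point O2, distance d2 (-) and area A2 (-).]
Figure A-1 Definition of signed distance and area for perpendicular equilibrium.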
We can derive the following equations based on the equilibrium concept. From Equation (A-2), we get the following force equilibrium equation
$$ \sum_{i=1}^{n} F_i = 0 \quad \text{or} \quad \sum_{i=1}^{n} \left[ y_i - (a + b x_i) \right] = 0 \tag{A-5} $$
Equation (A-5) is exactly the same as Equations (2) and (6) in the main text. From Equation (A-3), we get the following angular moment equilibrium equation
$$ M_i = 2A_i = x_c (y_{0i} - y_i) - y_c (x_{0i} - x_i) + (x_{0i} y_i - x_i y_{0i}) \tag{A-6} $$
and then
$$ \sum_{i=1}^{n} M_i = x_c \sum_{i=1}^{n} (y_{0i} - y_i) - y_c \sum_{i=1}^{n} (x_{0i} - x_i) + \sum_{i=1}^{n} (x_{0i} y_i - x_i y_{0i}) = \sum_{i=1}^{n} (x_{0i} y_i - x_i y_{0i}) = 0 \tag{A-7} $$
The first two terms of Equation (A-7), with $x_c$ and $y_c$, are zero because of Equation (A-5). Finally, we have
$$ \sum_{i=1}^{n} \left[ -b^2 x_i y_i + b \left( y_i^2 - x_i^2 \right) - a (b y_i + x_i) + x_i y_i \right] = 0 \tag{A-8} $$
Substituting Equation (A-5) into Equation (A-8), we have
$$ b^2 + K b - 1 = 0 \tag{A-9} $$
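Then the solution of fit parameters can be obtained as
$$ a = \frac{1}{n} \left[ \sum_{i=1}^{n} y_i - b \sum_{i=1}^{n} x_i \right] \tag{A-10} $$
$$ b = \frac{-K \pm \sqrt{K^2 + 4}}{2} \tag{A-11} $$
where
$$ K = \frac{Q - R}{S}, \quad S = \sum_{i=1}^{n} x_i y_i - \frac{1}{n} \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i, \quad Q = \sum_{i=1}^{n} x_i^2 - \frac{1}{n} \left( \sum_{i=1}^{n} x_i \right)^2, \quad R = \sum_{i=1}^{n} y_i^2 - \frac{1}{n} \left( \sum_{i=1}^{n} y_i \right)^2 \tag{A-12} $$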
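Equations (A-10) to (A-12) translate directly into a few lines of code. The sketch below is an illustration, not the authors' implementation: the function name `fit_perpendicular` is hypothetical, and the rule used to choose between the two roots of Equation (A-11), namely keeping the root with the smaller sum of squared perpendicular distances, is an assumption of this sketch.

```python
import numpy as np

def fit_perpendicular(x, y):
    """Fit y = a + b*x by perpendicular offsets using Eqs. (A-10) to (A-12)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    s = np.sum(x * y) - np.sum(x) * np.sum(y) / n
    q = np.sum(x ** 2) - np.sum(x) ** 2 / n
    r = np.sum(y ** 2) - np.sum(y) ** 2 / n
    if s == 0.0:
        raise ValueError("S = 0: K is undefined for these data")
    k = (q - r) / s
    best = None
    for sign in (1.0, -1.0):
        b = (-k + sign * np.sqrt(k * k + 4.0)) / 2.0           # Eq. (A-11)
        a = (np.sum(y) - b * np.sum(x)) / n                    # Eq. (A-10)
        cost = np.sum((y - (a + b * x)) ** 2) / (1.0 + b * b)  # sum of d_i**2
        if best is None or cost < best[0]:
            best = (cost, a, b)
    return best[1], best[2]

a, b = fit_perpendicular([0, 1, 2, 3], [1, 3, 5, 7])  # points on y = 1 + 2x
print(a, b)
```

The two roots of Equation (A-9) have a product of -1, so they describe two mutually perpendicular lines through the data centroid; one minimizes and the other maximizes the sum of squared perpendicular distances.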
Appendix-B Surface fitting solution for a linear function with perpendicular offsets method
For a general quadratic equation $z = a + bx + cy + dxy + ex^2 + fy^2$, the normal (perpendicular) direction to the expected surface for a given point away from the surface may not be unique. One extreme example is a circle in a 2-D plane, in which the number of normals going through the center point of the circle is infinite. Therefore, it is hard to find a solution with the perpendicular offsets method for a general quadratic function. For a simpler bi-linear function
$$ z = f(x, y) = a + bx + cy + dxy \tag{B-1} $$
the normal may not be unique either for a given point if the curvature caused by $d$ in Eq. (B-1) is too large. A unique normal may exist if the value of $d$ is small enough, and the perpendicular offsets method can then be used to obtain a best fit surface. It should be noted that there is no exact solution for the bi-linear equation in general, and solutions to Eq. (B-1) can be obtained only by approximate methods, such as the regular perturbation method [22]. The equation of the surface normal going through a point $(x_i, y_i, z_i)$ off the surface $z = f(x, y)$ and a point $(x, y, z)$ on the surface is
$$ \frac{x_i - x}{z_x} = \frac{y_i - y}{z_y} = \frac{z_i - z}{-1} \tag{B-2} $$
where $z_x$ and $z_y$ are the derivatives $\partial z / \partial x$ and $\partial z / \partial y$, respectively. Combining Equation (B-1) and Equation (B-2), we have
$$ \begin{aligned} (x_i - x) &= -(b + dy)(z_i - z) \\ (y_i - y) &= -(c + dx)(z_i - z) \\ z &= a + bx + cy + dxy \end{aligned} \tag{B-3} $$
For a given small value of $d$, we assume that the solution of Equation (B-3) can be written in the following form [22]:
$$ \begin{aligned} x &= x_0 + d\,x_1 + O(d^2) \\ y &= y_0 + d\,y_1 + O(d^2) \\ z &= z_0 + d\,z_1 + O(d^2) \end{aligned} \tag{B-4} $$
Then we have the following equations for the leading term and the first perturbation term:
$$ \begin{aligned} (x_i - x_0) &= -b (z_i - z_0) \\ (y_i - y_0) &= -c (z_i - z_0) \\ z_0 &= a + b x_0 + c y_0 \end{aligned} \tag{B-5} $$
and
$$ \begin{aligned} (x_1 - b z_1) &= y_0 (z_i - z_0) \\ (y_1 - c z_1) &= x_0 (z_i - z_0) \\ z_1 - b x_1 - c y_1 &= x_0 y_0 \end{aligned} \tag{B-6} $$
Solving Equation (B-5) for $(x_0, y_0, z_0)$ and substituting the solution into the following distance equation
$$ d_i = \sqrt{(x_i - x)^2 + (y_i - y)^2 + (z_i - z)^2} \tag{B-7} $$
we have
$$ d_i = \frac{z_i - (a + b x_i + c y_i)}{\sqrt{1 + b^2 + c^2}} \tag{B-8} $$
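Solving Equation (B-6) for $(x_1, y_1, z_1)$, we have
$$ \begin{aligned} x_1 &= \frac{(z_i - z_0) \left[ b c\, x_0 + (1 - c^2)\, y_0 \right]}{1 - (b^2 + c^2)} \\ y_1 &= \frac{(z_i - z_0) \left[ b c\, y_0 + (1 - b^2)\, x_0 \right]}{1 - (b^2 + c^2)} \\ z_1 &= \frac{(z_i - z_0) \left[ b\, y_0 + c\, x_0 \right]}{1 - (b^2 + c^2)} \end{aligned} \tag{B-9} $$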
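Substituting $(x_1, y_1, z_1)$ and $(x_0, y_0, z_0)$ into Equation (B-7), we can get the general equation of the distance, which is complex and will not be addressed here. The leading equation (B-5) can be solved as
$$ \begin{aligned} & a = \frac{1}{n} \left( \sum_{i=1}^{n} z_i - b \sum_{i=1}^{n} x_i - c \sum_{i=1}^{n} y_i \right) \\ & \left( 1 + b^2 + c^2 \right) \sum_{i=1}^{n} \left\{ \left[ z_i - (a + b x_i + c y_i) \right] x_i \right\} + b \sum_{i=1}^{n} \left\{ \left[ z_i - (a + b x_i + c y_i) \right]^2 \right\} = 0 \\ & \left( 1 + b^2 + c^2 \right) \sum_{i=1}^{n} \left\{ \left[ z_i - (a + b x_i + c y_i) \right] y_i \right\} + c \sum_{i=1}^{n} \left\{ \left[ z_i - (a + b x_i + c y_i) \right]^2 \right\} = 0 \end{aligned} \tag{B-10} $$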
The second and third relations of Equation (B-10) result in Equation (B-11):
$$ c \sum_{i=1}^{n} \left\{ \left[ z_i - (a + b x_i + c y_i) \right] x_i \right\} = b \sum_{i=1}^{n} \left\{ \left[ z_i - (a + b x_i + c y_i) \right] y_i \right\} \tag{B-11} $$
Then we have the following canonical form solution
$$ \begin{aligned} a = {} & \frac{1}{n} \left( \sum_{i=1}^{n} z_i - b \sum_{i=1}^{n} x_i - c \sum_{i=1}^{n} y_i \right) \\ b = {} & \frac{1}{\sum_{i=1}^{n} z_i^2 - \sum_{i=1}^{n} x_i^2} \Bigg[ -\sum_{i=1}^{n} x_i z_i + a \sum_{i=1}^{n} x_i + c \sum_{i=1}^{n} x_i y_i + b^2 \sum_{i=1}^{n} x_i z_i - a b^2 \sum_{i=1}^{n} x_i - b^2 c \sum_{i=1}^{n} x_i y_i \\ & - c^2 \sum_{i=1}^{n} x_i z_i - b c^2 \left( \sum_{i=1}^{n} y_i^2 - \sum_{i=1}^{n} x_i^2 \right) + c^3 \sum_{i=1}^{n} x_i y_i + 2ab \sum_{i=1}^{n} z_i + 2bc \sum_{i=1}^{n} y_i z_i \\ & + a c^2 \sum_{i=1}^{n} x_i - a^2 b n - 2abc \sum_{i=1}^{n} y_i \Bigg] \\ c = {} & \frac{1}{\sum_{i=1}^{n} x_i z_i} \left[ ac \sum_{i=1}^{n} x_i + bc \left( \sum_{i=1}^{n} x_i^2 - \sum_{i=1}^{n} y_i^2 \right) + c^2 \sum_{i=1}^{n} x_i y_i + b \sum_{i=1}^{n} y_i z_i - ab \sum_{i=1}^{n} y_i - b^2 \sum_{i=1}^{n} x_i y_i \right] \end{aligned} \tag{B-12} $$
There are two solutions to the perpendicular offsets based curve fitting method of Appendix A; only one solution is physically reasonable and the other can be easily excluded. The existence of a closed-form solution to Equation (B-10) or (B-12) is unknown, and finding one would be very difficult, if not impossible. An approximate solution is the only approach available so far. There are a few methods available to solve a system of nonlinear algebraic equations like Equation (B-12). However, several methods were tested and convergence is not guaranteed. Therefore, a brute-force method is used in this paper to search for a reasonable solution. It is found that there is only one reasonable approximate solution to Equation (B-12) for the data investigated in this paper. The two advantages of solving Equation (B-12) over solving Equation (B-10) are: (1) the numerical errors can be easily controlled; (2) b and c in Equation (B-12) follow the same increasing or decreasing trend, so the calculation results can be easily monitored.
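As a cross-check on any numerical solution of Equation (B-10) or (B-12), the plane that minimizes the sum of squared perpendicular distances of Equation (B-8) can also be obtained from a singular value decomposition of the centered data. This is the standard total least squares construction, not the brute-force search used in this paper, and the function names and synthetic data below are hypothetical. The sketch verifies that the resulting (a, b, c) make the second and third relations of Equation (B-10) vanish.

```python
import numpy as np

def fit_plane_perpendicular(x, y, z):
    """Orthogonal-distance fit of z = a + b*x + c*y via SVD of the centered data."""
    pts = np.column_stack([x, y, z]).astype(float)
    centroid = pts.mean(axis=0)
    _, _, vt = np.linalg.svd(pts - centroid, full_matrices=False)
    nx, ny, nz = vt[-1]  # unit normal: direction of least variance
    if abs(nz) < 1e-12:
        raise ValueError("plane is vertical; it cannot be written as z = a + b*x + c*y")
    b, c = -nx / nz, -ny / nz
    a = centroid[2] - b * centroid[0] - c * centroid[1]
    return a, b, c

def b10_residuals(x, y, z, a, b, c):
    """Left-hand sides of the second and third relations of Eq. (B-10)."""
    e = z - (a + b * x + c * y)
    g = 1.0 + b * b + c * c
    return (g * np.sum(e * x) + b * np.sum(e * e),
            g * np.sum(e * y) + c * np.sum(e * e))

# Synthetic data scattered about a plane (illustrative values only)
rng = np.random.default_rng(4)
x = rng.uniform(0.0, 10.0, 80)
y = rng.uniform(0.0, 5.0, 80)
z = 2.0 + 0.5 * x - 1.2 * y + rng.normal(0.0, 0.2, 80)

a, b, c = fit_plane_perpendicular(x, y, z)
res_b, res_c = b10_residuals(x, y, z, a, b, c)
print(a, b, c, res_b, res_c)
```

Because the fitted plane passes through the data centroid, the first relation of Equation (B-10) is satisfied automatically; the two residuals returned by `b10_residuals` should be zero to within rounding error.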
Highlights
The issues of the current standard methods are indicated.
The equilibrium mechanism of the linear data fitting process is discovered.
Several ideal standard data patterns are identified.
The advantage of the new equilibrium method is demonstrated.
The impact of the new method on current engineering applications is indicated.