Engineering Structures 128 (2016) 283–295
A reduction technique for long-term fatigue analysis of offshore structures using Monte Carlo simulation

Ying Min Low
Centre for Offshore Research and Engineering, Department of Civil and Environmental Engineering, National University of Singapore, 1 Engineering Drive 2, Singapore 117576, Singapore
Article history: Received 31 December 2015; Revised 21 September 2016; Accepted 22 September 2016

Keywords: Fatigue analysis; Scatter diagram; Control variates; Monte Carlo simulation
Abstract: Long-term fatigue assessment of an offshore structure is a challenging practical problem. An offshore structure is exposed to fluctuating sea conditions, so the fatigue analysis needs to account for many different wave heights and periods. Because each sea state entails a full dynamic analysis, the total computational effort can be formidable, particularly for time domain analysis. Recently, researchers have proposed various strategies for efficient estimation of the long-term mean damage, but these methods all have drawbacks: the predicted damage is approximate with no means of quantifying the error, and the simulation procedure is customized for a particular stress location. This paper outlines a new approach that uses control variates to enhance the efficiency of Monte Carlo simulation. The proposed approach has the advantage of being a post-processing scheme that does not alter the dynamic simulation procedure, enabling the damage at multiple locations to be evaluated from the same simulation results. The predicted mean damage is unbiased, and the sampling error can be quantified. In addition, the approach supports both time domain and frequency domain analysis. Case studies of a floating production system demonstrate that the proposed approach provides a substantial speedup in computational time; the actual performance varies with factors such as sample size and model complexity. © 2016 Elsevier Ltd. All rights reserved.
1. Introduction

Accurate fatigue assessment of a floating production system is a challenging practical problem. An offshore structure experiences environmental conditions that fluctuate extensively over its service life. In the short term (e.g. 3 h), the sea state is assumed to be stationary, and the wave statistics can be characterized by the significant wave height H_S and the peak spectral period T_p. The long-term variability of H_S and T_p is described by a site-specific wave scatter diagram. Whereas the design storm of specified return period [1,2] is a viable approach for predicting extreme load effects, computing the long-term fatigue damage necessarily involves all the sea states in the scatter diagram. As the evaluation of each sea state entails a full dynamic analysis, the total computational effort can be formidable. This problem is particularly acute for the moorings and risers of floating systems: because they are highly compliant to the environmental loads, time domain analysis is often essential to account for the nonlinearities.

As reflected by design guidelines [3], the prevailing industry practice is to condense the scatter diagram into a manageable number of blocks, each associated with a representative sea state. However, this crude approach cannot be expected to be accurate, and what is more, the associated error is difficult to quantify. A precise prediction of the long-term damage can be accomplished by direct numerical integration or Monte Carlo simulation, but these brute-force methods are too computationally demanding for design purposes. Hence, there is a pressing practical need for fast and accurate methods of long-term fatigue analysis, and unsurprisingly, this subject has received heightened attention from researchers, especially in recent years.

One strategy is to replace computationally intensive stochastic time domain analysis with deterministic regular-wave analyses covering a range of wave heights and periods. Sheehan et al. [4] applied the Longuet-Higgins distribution to extract the statistics of wave heights and periods from wave spectra. This procedure is rather approximate and neglects many features of irregular waves; for one thing, the second-order slow drift response cannot be modeled. The long-term fatigue damage can be formulated as a probability integral. Low and Cheung [5] investigated the use of the highly efficient perturbation method [6] to estimate the integral, but concluded that the method can be highly erroneous in some situations and therefore cannot be recommended. Low and Cheung [5] also proposed to evaluate the integral by asymptotic
approximation [6], which is found to be fairly accurate, as subsequently verified by other researchers [7]. Even so, the method suffers from several drawbacks, as will be elaborated later. In particular, the method is not suitable for time domain analysis, and the analysis results are applicable only to a particular hotspot stress location. The slow convergence rate of Monte Carlo simulation can be alleviated by importance sampling [5]. Unfortunately, constructing a good importance sampling density requires one to first obtain an approximate solution, which is not an easy task. Researchers have also explored the response surface method [8], the univariate dimension reduction method [9], and neural networks [10-13] for the fatigue problem. However, these methodologies are not without shortcomings, an in-depth discussion of which is beyond the purview of this paper. One drawback common to many approximate approaches is that the error is systematic (i.e. biased), yet difficult to ascertain a priori. Even if an approximate approach performs well for selected examples, it can be hard to detect when it becomes unstable in a particular application.

It appears that existing methods have limitations. The ideal method should have the following attributes. First, it needs to be efficient, accurate and robust. Second, the damage estimate should preferably be unbiased, and the error must be quantifiable. Third, the simulation results should be applicable to any location of interest, since in reality there are multiple critical hotspot locations that need to be checked. Finally, it is advantageous if the method can support both time domain and frequency domain analysis. The aim is to develop a long-term fatigue analysis approach that can fulfil these requirements, at least to a certain extent.

This paper is organized as follows. Sections 2 and 3 introduce the background theory and relevant literature for short-term and long-term analysis respectively. Subsequently, the proposed method is developed in Section 4 and illustrated in Section 5. Finally, Section 6 recapitulates the key findings.

2. Short-term fatigue analysis
2.1. Wave environment

In offshore engineering, the sea state is assumed to be statistically stationary in the short term, typically taken as 3 h. The random wave elevation η(t) is modeled as a Gaussian process; thus the short-term statistics can be fully characterized by a wave spectrum S_ηη(ω). There are several formulations for wave spectra. The JONSWAP spectrum adopted in this work is parameterized in terms of the significant wave height H_S, the peak spectral period T_p, and a shape parameter γ as follows:
S_{\eta\eta}(\omega) = \hat{\alpha} g^2 \omega^{-5} \exp\left[-\frac{5}{4}\left(\frac{\omega}{\omega_p}\right)^{-4}\right] \gamma^{\exp\left[-(\omega-\omega_p)^2/(2\hat{\sigma}^2\omega_p^2)\right]}   (1)

where

\hat{\alpha} = 5.061 \frac{H_S^2}{T_p^4}\left(1 - 0.287\ln\gamma\right), \quad \omega_p = \frac{2\pi}{T_p}, \quad \hat{\sigma} = \begin{cases} 0.07, & \omega \le \omega_p \\ 0.09, & \omega > \omega_p \end{cases}   (2)
For computational expediency and to avoid high-frequency contamination, frequencies above 1.4 rad/s are truncated from the wave spectrum. It is assumed that γ = 3.3, while the long-term variability of H_S and T_p is discussed in Section 3.1. For time domain analysis, the wave spectrum is partitioned into R regular wave components of equal frequency interval Δω. The time history of the wave elevation can then be simulated as

\eta(t) = \sum_{r=1}^{R} A_r \cos(\omega_r t + \theta_r)   (3)

where θ_r are random phase angles uniformly distributed from 0 to 2π, while the wave amplitudes A_r are represented either deterministically as A_r = \sqrt{2 S_{\eta\eta}(\omega_r)\,\Delta\omega}, or as Rayleigh-distributed random variables. For simplicity, but without loss of generality, the deterministic amplitude scheme is adopted herein.
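For illustration, the following is a minimal Python sketch (not part of the original analysis code) of Eqs. (1)-(3): it evaluates the JONSWAP density and synthesizes a wave elevation record with the deterministic amplitude scheme. Function names and discretization choices are illustrative assumptions.

```python
import numpy as np

G = 9.81  # gravitational acceleration (m/s^2)

def jonswap(w, Hs, Tp, gamma=3.3):
    """JONSWAP spectral density per Eqs. (1)-(2), w in rad/s."""
    wp = 2.0 * np.pi / Tp
    alpha = 5.061 * Hs**2 / Tp**4 * (1.0 - 0.287 * np.log(gamma))
    sigma = np.where(w <= wp, 0.07, 0.09)
    peak = gamma ** np.exp(-(w - wp)**2 / (2.0 * sigma**2 * wp**2))
    return alpha * G**2 * w**-5.0 * np.exp(-1.25 * (w / wp)**-4.0) * peak

def wave_elevation(t, Hs, Tp, R=200, w_cut=1.4, seed=0):
    """Eq. (3): R regular components with uniform random phases and
    deterministic amplitudes A_r = sqrt(2 S(w_r) dw); truncated at w_cut."""
    rng = np.random.default_rng(seed)
    dw = w_cut / R
    w = (np.arange(R) + 0.5) * dw            # component frequencies (avoid w = 0)
    A = np.sqrt(2.0 * jonswap(w, Hs, Tp) * dw)
    theta = rng.uniform(0.0, 2.0 * np.pi, R)
    return (A * np.cos(np.outer(t, w) + theta)).sum(axis=1)

t = np.arange(0.0, 3 * 3600.0, 0.5)          # one 3 h short-term realization
eta = wave_elevation(t, Hs=6.0, Tp=12.0)
```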
2.2. Dynamic analysis

The global dynamic analysis of a floating system can be performed in the time domain or the frequency domain; both options are considered herein. Time domain analysis captures all nonlinear effects rigorously, but at the expense of high computational cost. Frequency domain analysis is approximate but efficient. The simulations are carried out with an in-house numerical code developed in a prior work [14]. The floating system comprises the floater, moorings and risers, which are analyzed simultaneously in a single model. This approach is known as a coupled analysis, in which the dynamic interactions between the various components are fully incorporated. The floater is modeled as a rigid body with six degrees-of-freedom, while the moorings and risers (collectively referred to as lines) are discretized as line elements using the lumped mass approach. The floater is subjected to first-order wave-frequency and second-order low-frequency hydrodynamic loads. The fluid forces on the lines are described by Morison's equation. Time domain analysis allows geometric nonlinearity to be considered, as large displacements of the lines are permitted. In the frequency domain, small vibrations are assumed, and the drag force in Morison's equation is linearized. Because of space constraints, further details are not reproduced here.

Following the dynamic analysis, the stress X(t) at a particular location of interest is extracted for the subsequent fatigue damage calculation. For time domain analysis, the full time history of X(t) is available, whereas in the frequency domain, the stress response is produced in the form of a spectral density S_XX(ω). The variances of X(t) and its time derivative Ẋ(t), needed for the spectral fatigue analysis, are calculated from the spectral moments according to random vibration theory, i.e.
\sigma_X^2 = \int_0^\infty S_{XX}(\omega)\, d\omega, \qquad \sigma_{\dot{X}}^2 = \int_0^\infty \omega^2 S_{XX}(\omega)\, d\omega   (4)
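As a brief illustration, the spectral moments of Eq. (4) can be evaluated by simple quadrature from a tabulated stress spectrum; the sketch below assumes S_XX(ω) is available on a frequency grid (names are illustrative).

```python
import numpy as np

def spectral_variances(w, Sxx):
    """Return (var_X, var_Xdot) from tabulated S_XX(w), per Eq. (4)."""
    var_x = np.trapz(Sxx, w)             # zeroth spectral moment
    var_xdot = np.trapz(w**2 * Sxx, w)   # second spectral moment
    return var_x, var_xdot
```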
2.3. Fatigue damage calculation

Fatigue assessment of offshore structures is dominated by the use of S-N curves, which specify the number of cycles N_f at a constant stress range S that will lead to fatigue failure. The S-N curve is fitted to experimental data, and in the segment of interest it is approximately linear when depicted on a log-log graph. Hence it is common to assume N_f(S) = K S^{-m}, where K and m are empirical parameters. If time domain simulations have been performed, the rainflow counting algorithm is used to convert the stress time history into cycles with well-defined ranges. The fatigue damage for each cycle is defined as 1/N_f(S), and by invoking Miner's rule, the total damage from all the cycles is aggregated [3]. Due to the stochastic nature of irregular waves, each realization produces a different time history, and consequently there is variability in the short-term fatigue damage. Practical fatigue design utilizes only the mean damage, and this variability is typically neglected. Since the sea state is short-term stationary, the damage accumulation rate is constant in the short term. The damage rate for a particular sea state, denoted d̄(H_S, T_p), is estimated from M simulations as
\hat{d}(H_S, T_p) = \frac{1}{M T_{sim}} \sum_{j=1}^{M} D_j(H_S, T_p)   (5)
where Tsim is the duration for each simulation, and Dj(HS, Tp) is the damage from the jth simulation. There is no restriction on M and Tsim, although a larger MTsim corresponds to lower sampling variability. For frequency domain analysis where the stress spectrum is available, it is common to perform spectral fatigue analysis to estimate the damage rate directly from the spectrum, without recourse to simulation. The simplest method for spectral fatigue analysis is established on the premise that X(t) is a narrowband Gaussian process, in which case the damage rate reads [3]
\bar{d} = \frac{\sigma_{\dot{X}}}{2\pi\sigma_X} \frac{(2\sqrt{2}\,\sigma_X)^m}{K}\, \Gamma\!\left(1 + \frac{m}{2}\right)   (6)
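The two damage-rate estimates can be sketched as follows: the first function implements the estimator of Eq. (5) given rainflow-counted stress ranges (the rainflow step itself is assumed to be available), and the second implements the narrowband formula of Eq. (6). Both are illustrative sketches, not the author's implementation.

```python
import numpy as np
from scipy.special import gamma as Gamma

def damage_rate_td(cycle_ranges_per_sim, T_sim, K, m):
    """Eq. (5): average Miner damage rate over M simulations of duration T_sim.
    cycle_ranges_per_sim: list of arrays of rainflow stress ranges, one per simulation."""
    D = [np.sum(S**m / K) for S in cycle_ranges_per_sim]   # D_j = sum of 1/N_f(S)
    M = len(D)
    return np.sum(D) / (M * T_sim)

def damage_rate_nb(var_x, var_xdot, K, m):
    """Eq. (6): narrowband Gaussian damage rate from the spectral moments."""
    sig_x, sig_xdot = np.sqrt(var_x), np.sqrt(var_xdot)
    nu = sig_xdot / (2.0 * np.pi * sig_x)      # mean up-crossing rate
    return nu / K * (2.0 * np.sqrt(2.0) * sig_x)**m * Gamma(1.0 + m / 2.0)
```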
3. Long-term fatigue analysis

3.1. Long-term environmental condition

Over the service life of an offshore structure, many environmental actions fluctuate, including wind (mean and dynamic components), current (velocity and direction), and waves (wave height, period, and direction). The dominant source of stress reversals is wave loading, although vortex-induced vibrations - outside the scope of this work - are also important for fatigue assessment. In the present study, only the long-term variability of H_S and T_p is considered. The wave scatter diagram specifies the joint likelihood of H_S and T_p in a discrete fashion. The discrete data can be fitted to a continuous probabilistic model, offering better resolution and a more realistic description of the tail statistics. Here, the model developed by Haver and Nyhus [15] is adopted. The joint probability density function (jpdf) of H_S and T_p is specified as
f(H_S, T_p) = f(H_S)\, f(T_p \mid H_S)   (7)

f(H_S) = \begin{cases} \dfrac{1}{\sqrt{2\pi}\,\xi H_S} \exp\left[-\dfrac{(\ln H_S - \mu_h)^2}{2\xi^2}\right], & H_S \le \eta \\[1ex] \dfrac{\hat{c}}{\rho}\left(\dfrac{H_S}{\rho}\right)^{\hat{c}-1} \exp\left[-\left(\dfrac{H_S}{\rho}\right)^{\hat{c}}\right], & H_S > \eta \end{cases}   (8)

f(T_p \mid H_S) = \frac{1}{\sqrt{2\pi}\,\sigma T_p} \exp\left[-\frac{(\ln T_p - \mu_t)^2}{2\sigma^2}\right]   (9)

\mu_t(H_S) = a_1 + a_2 H_S^{a_3}, \qquad \sigma^2(H_S) = b_1 + b_2 \exp(-b_3 H_S)   (10)

For the North Sea, Haver [16] gives the following parameters: ξ = 0.6565, μ_h = 0.77, η = 2.90, ρ = 2.691, ĉ = 1.503, a_1 = 1.134, a_2 = 0.892, a_3 = 0.225, b_1 = 0.005, b_2 = 0.120, b_3 = 0.455. The corresponding contours of f(H_S, T_p) are depicted in Fig. 1.

Fig. 1. Contours of the joint probability density f(H_S, T_p).
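A sketch of the Haver-Nyhus density of Eqs. (7)-(10) with the North Sea parameters above is given below; the final lines perform a crude grid integration as a sanity check that the density integrates to approximately one. Variable names are illustrative.

```python
import numpy as np

XI, MU_H, ETA = 0.6565, 0.77, 2.90     # lognormal branch of Eq. (8)
RHO, C_HAT = 2.691, 1.503              # Weibull branch of Eq. (8)
A = (1.134, 0.892, 0.225)              # mu_t(Hs), Eq. (10)
B = (0.005, 0.120, 0.455)              # sigma^2(Hs), Eq. (10)

def f_hs(hs):
    logn = np.exp(-(np.log(hs) - MU_H)**2 / (2*XI**2)) / (np.sqrt(2*np.pi)*XI*hs)
    weib = C_HAT/RHO * (hs/RHO)**(C_HAT - 1) * np.exp(-(hs/RHO)**C_HAT)
    return np.where(hs <= ETA, logn, weib)

def f_tp_given_hs(tp, hs):
    mu_t = A[0] + A[1]*hs**A[2]
    var_t = B[0] + B[1]*np.exp(-B[2]*hs)
    return np.exp(-(np.log(tp) - mu_t)**2 / (2*var_t)) / (np.sqrt(2*np.pi*var_t)*tp)

def f_joint(hs, tp):
    return f_hs(hs) * f_tp_given_hs(tp, hs)

# Grid check: the density should integrate to ~1 over a generous domain.
hs, tp = np.meshgrid(np.linspace(0.05, 14, 400), np.linspace(2, 25, 400), indexing="ij")
print(np.trapz(np.trapz(f_joint(hs, tp), tp[0], axis=1), hs[:, 0]))   # ~1.0
```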
3.2. Fatigue damage calculation

Let D_LT represent the expected long-term damage accumulated over the intended service life T (typically 20 years). The fatigue failure criterion is k_SF D_LT < 1, where k_SF is a safety factor depending on the failure consequence. According to DNV [3], k_SF can be as high as 10, reflecting the high level of uncertainty in fatigue assessment methodology. One may write D_LT = T d̄_LT, where d̄_LT is the long-term mean damage rate characterized by the integral

\bar{d}_{LT} = \int_0^\infty \int_0^\infty \bar{d}(H_S, T_p)\, f(H_S, T_p)\, dH_S\, dT_p   (11)
If frequency domain analysis is performed followed by spectral fatigue analysis, d̄(H_S, T_p) is computed by Eq. (6), and there is no short-term variability in d̄(H_S, T_p). For time domain analysis, d̄(H_S, T_p) is estimated by d̂(H_S, T_p) (cf. Eq. (5)). To accommodate the short-term variability due to random waves, introduce a column vector θ representing the entire set of wave component random phases, inclusive of multiple simulations if necessary. For convenience, let X = [H_S T_p θ^T]^T represent the vector of random variables. The mean damage rate is then the integral over the relevant domain Ω, i.e.
\bar{d}_{LT} = \int_\Omega d(\mathbf{X}) f(\mathbf{X})\, d\mathbf{X}   (12)
Eq. (12) is more challenging to evaluate accurately, being a high-dimensional integral.

3.3. Review of existing methods
There is a wide assortment of techniques for evaluating probability integrals such as Eqs. (11) and (12). This section reviews several existing methods; the review is not intended to be exhaustive. There are essentially two classes of brute-force approach: deterministic numerical integration, and non-deterministic Monte Carlo simulation. For numerical integration, the domain of integration is meshed with a regular grid, where each cell is ΔH_S by ΔT_p. Eq. (11) is evaluated as
\bar{d}_{LT} = \Delta H_S\, \Delta T_p \sum_{q} \sum_{r} \bar{d}\!\left(H_S^{(q)}, T_p^{(r)}\right) f\!\left(H_S^{(q)}, T_p^{(r)}\right)   (13)
where H_S^{(q)} and T_p^{(r)} are respectively the qth grid point for H_S and the rth grid point for T_p. Provided that the grid resolution and domain of integration are sufficient, numerical integration is precise and therefore ideal for verification in a study. It is, however, too computationally demanding for practical purposes. Besides, the computational cost grows exponentially with the problem dimension, so numerical integration is not amenable to Eq. (12), or to cases where more environmental parameters are introduced.

Monte Carlo simulation (MCS) is a robust technique for evaluating integrals by repeated random sampling [17]. The appeal of MCS is that its efficiency is not governed by the problem dimension. Hence, without loss of generality, consider the more challenging Eq. (12). There are two options by which MCS can be implemented for the present problem. For Option 1, it is recognized that Eq. (12) is a probability integral (the integrand contains a jpdf). Thus, it is viable to draw samples from the underlying distribution, i.e. f(X) = f(θ)f(H_S, T_p). Let Y_1, Y_2, ..., Y_N be independent and identically
distributed (i.i.d.) outputs from N samples. Then, an unbiased point estimate of d̄_LT is the average

\hat{Y} = \frac{1}{N} \sum_{i=1}^{N} Y_i   (14)
Here, Y_i = d(x̃_i), where x̃_i are random variates. Option 2, which is often used for generic integrals, involves sampling from a multivariate uniform distribution within a specified domain of integration. In the present context, θ is sampled as usual, whereas H_S and T_p are drawn from the density h(H_S, T_p), given by
h(H_S, T_p) = 1/\kappa, \quad \text{if } H_S^{min} \le H_S \le H_S^{max} \text{ and } T_p^{min} \le T_p \le T_p^{max}, \qquad \kappa = (H_S^{max} - H_S^{min})(T_p^{max} - T_p^{min})   (15)
where H_S^{max}, H_S^{min}, T_p^{max} and T_p^{min} are the limits of integration. The MCS estimate is again given by Eq. (14), but Y_i becomes
Y_i = \kappa\, d(\tilde{\mathbf{x}}_i)\, f(\tilde{\mathbf{x}}_i)   (16)
For Option 1, there is no need to truncate the semi-infinite domain of integration. On the other hand, Option 2 ensures that the samples are more evenly dispersed, instead of being concentrated around the mode of f(H_S, T_p). Both options will be compared in the case studies. Irrespective of the option employed, the sampling variability of the estimator Ŷ can be quantified by its standard error (SE),
SE(\hat{Y}) \approx \hat{\sigma}_Y / \sqrt{N}   (17)

where \hat{\sigma}_Y^2 = (N-1)^{-1} \sum_{i=1}^{N} (Y_i - \hat{Y})^2 is the sample variance of Y.
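The following sketch assembles Eqs. (14)-(17) for Option 2, reusing f_joint from the earlier sketch; damage_rate is a hypothetical stand-in for a full dynamic plus fatigue analysis of one sea state.

```python
import numpy as np

def mcs_option2(damage_rate, N, bounds, seed=1):
    """Brute-force MCS, Option 2: uniform sampling of (Hs, Tp)."""
    rng = np.random.default_rng(seed)
    (h_lo, h_hi), (t_lo, t_hi) = bounds
    kappa = (h_hi - h_lo) * (t_hi - t_lo)               # Eq. (15)
    hs = rng.uniform(h_lo, h_hi, N)
    tp = rng.uniform(t_lo, t_hi, N)
    Y = kappa * damage_rate(hs, tp) * f_joint(hs, tp)   # Eq. (16)
    est = Y.mean()                                      # Eq. (14)
    se = Y.std(ddof=1) / np.sqrt(N)                     # Eq. (17)
    return est, se, (hs, tp, Y)
```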
The principal advantage of MCS is that SE(Ŷ) is independent of the problem dimension, but it has the drawback that the convergence rate, being inversely proportional to √N, is very slow. The computational cost of MCS may be measured by the total simulation duration NMT_sim, where N is the number of different sea states simulated and M is the number of simulations per sea state. It is sensible to take M = 1, select a relatively short T_sim, and set N as high as resources permit. The reasoning is that increasing MT_sim reduces only the short-term variability caused by θ, whereas increasing N diminishes both the short-term and the long-term variability (from H_S and T_p).

Asymptotic approximation [6] is a method for estimating probability integrals. Consider Eq. (11), and define S = [H_S T_p]^T. In addition, let L(S) = ln[d̄(S)f(S)] represent the natural logarithm of the integrand. Suppose the integrand has one local maximum point S*. A second-order Taylor expansion of L(S) about S* yields
L(\mathbf{S}) \approx L(\mathbf{S}^*) + \frac{1}{2}(\mathbf{S} - \mathbf{S}^*)^T \mathbf{H}(\mathbf{S}^*)(\mathbf{S} - \mathbf{S}^*)   (18)
where H(S*) is the Hessian matrix of L(S) at S = S*. The asymptotic approximation for d̄_LT reads
\bar{d}_{LT} \approx \frac{2\pi \exp[L(\mathbf{S}^*)]}{\sqrt{\det[-\mathbf{H}(\mathbf{S}^*)]}}   (19)
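A minimal sketch of this Laplace-type approximation is shown below: the maximum of L(S) is located numerically, the Hessian is formed by central differences, and Eq. (19) is evaluated. It reuses f_joint and a hypothetical damage_rate from the earlier sketches; the starting point and step size are arbitrary choices.

```python
import numpy as np
from scipy.optimize import minimize

def laplace_dlt(damage_rate, s0=(6.0, 12.0), h=1e-3):
    """Asymptotic approximation of Eqs. (18)-(19) for the 2-D integrand."""
    L = lambda s: np.log(damage_rate(s[0], s[1]) * f_joint(s[0], s[1]))
    s_star = minimize(lambda s: -L(s), x0=np.asarray(s0), method="Nelder-Mead").x
    H = np.zeros((2, 2))                 # central-difference Hessian of L at s_star
    for i in range(2):
        for j in range(2):
            ei, ej = np.eye(2)[i] * h, np.eye(2)[j] * h
            H[i, j] = (L(s_star + ei + ej) - L(s_star + ei - ej)
                       - L(s_star - ei + ej) + L(s_star - ei - ej)) / (4 * h * h)
    return 2 * np.pi * np.exp(L(s_star)) / np.sqrt(np.linalg.det(-H))  # Eq. (19)
```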
Asymptotic approximation is efficient, as the second-order derivatives in the Hessian matrix require only nine function evaluations. For the case of multiple peaks, Eq. (19) can be applied to each peak, and the contributions from all peaks summed. Various researchers [5,7] have demonstrated that asymptotic approximation is reasonably accurate for the long-term fatigue problem, but it suffers from several drawbacks. First, one must initially solve a time-consuming optimization problem to find the maximum point S*. Moreover, multiple peaks can compound the problem complexity [5]. Second, as pointed out by Giraldo et al.
[7], the integrand peak differs from location to location, hence the analysis must be repeated for multiple stress locations, impairing the efficacy of the method. Third, the derivatives of the response are evaluated numerically, rendering the method unsuitable for time domain analysis, owing to the sensitivity of the numerical derivatives.

The perturbation method is another derivative-based approach, based on a second-order Taylor expansion of d̄(S) about the mean S̄ = [H̄_S, T̄_p]. Applying this method to the present problem yields

\bar{d}_{LT} \approx \bar{d}(\bar{H}_S, \bar{T}_p) + \frac{1}{2}\frac{\partial^2 \bar{d}(H_S, T_p)}{\partial H_S^2}\,\mathrm{var}(H_S) + \frac{\partial^2 \bar{d}(H_S, T_p)}{\partial H_S\,\partial T_p}\,\mathrm{cov}(H_S, T_p) + \frac{1}{2}\frac{\partial^2 \bar{d}(H_S, T_p)}{\partial T_p^2}\,\mathrm{var}(T_p)   (20)
Because f(H_S, T_p) is known, the mean, variance and covariance can be readily determined. The computational effort amounts to evaluating the second-order derivatives. There is no need to find S*, and the results are applicable to any location of interest. Unfortunately, the accuracy of the method is rather poor [5], mainly because the expansion is carried out at S̄, which can be quite far from the important region.

Importance sampling (IS) is a well-known variance reduction technique aimed at improving the poor convergence rate of MCS [17]. The principle of IS is to draw samples of the random variables from another density (the IS density), so that more samples fall in important regions that contribute significantly to the total integral. However, constructing an effective IS density requires prior knowledge of the behavior of the integrand. Low and Cheung [5] showed that this information can be acquired by applying the asymptotic approximation method. However, the entire procedure entails solving the problem twice, and besides, the IS density is effective only for a particular stress location. Gao and Low [18] applied IS not to achieve variance reduction, but to allow the results for many sea states to be inferred from the analysis of one sea state; this approach is unbiased and efficient, but the analysis procedure is excessively complicated. In sum, all existing methods have drawbacks, motivating the need for new, improved approaches.

4. Proposed approach

4.1. Fundamentals of control variates

The proposed approach should fulfil the challenging requirements set out in the Introduction. It is reasoned that a Monte Carlo based approach is still the most promising, as it satisfies most of the requisites, barring efficiency. Hence, it is natural to explore variance reduction methods to enhance the efficiency of MCS. Importance sampling is but one of several variance reduction techniques; other notable ones include antithetic variates, stratified sampling, and control variates [17]. In particular, the proposed approach is underpinned by control variates, whose basic principles are briefly reviewed here.

In conventional MCS, Y(X) is an unbiased estimator of d̄_LT. The variable C(X) is said to be a control variate if it is correlated with Y(X) and its expectation C̄ = E[C(X)] is known. Consider a new estimator [17]
Z_\lambda(\mathbf{X}) = Y(\mathbf{X}) - \lambda\left[C(\mathbf{X}) - \bar{C}\right]   (21)
which is unbiased for any value of the control variate weight λ. The variance of Z_λ is
\mathrm{var}(Z_\lambda) = \mathrm{var}(Y) - 2\lambda\,\mathrm{cov}(Y, C) + \lambda^2\,\mathrm{var}(C)   (22)

It follows that the optimal λ that minimizes var(Z_λ) is
\lambda^* = \frac{\mathrm{cov}(Y, C)}{\mathrm{var}(C)}   (23)
and the corresponding minimal variance is
\mathrm{var}(Z_{\lambda^*}) = (1 - \rho_{YC}^2)\,\mathrm{var}(Y)   (24)
where ρ_YC is the correlation coefficient of Y and C. In practice, cov(Y, C) and var(C) are seldom known; however, they can be estimated from the available Monte Carlo data. The extent of variance reduction increases with |ρ_YC|. At worst, the variance remains unchanged when Y and C are uncorrelated; in this regard, control variates are "safe" and will never backfire, unlike, say, importance sampling. There is variance reduction for both positive and negative correlations. Intuitively, control variates work as follows. Take the case of positive correlation. For a particular replicate, if C(x̃_i) > C̄, it is likely also that Y_i > Ȳ, and this information is exploited by adjusting Y_i downwards by λ[C(x̃_i) - C̄]. The reverse also holds. On average, adjustments are made towards the mean value, thereby reducing the variance.

4.2. Constructing suitable control variate functions

Control variates are most often used when a simple analytical version of the problem is at hand. This is seldom the case in offshore engineering, which may explain why control variates are so rarely employed in this field (another reason, certainly, is that control variates are not well known in offshore engineering). Still, there are overlooked strategies that can overcome the absence of simple models. For the long-term fatigue problem, one feasible strategy is to build a low-fidelity finite element model (e.g. coarser mesh, fewer details), perform dynamic and fatigue analyses for pre-selected sea states, and subsequently fit the results to a surrogate model. This procedure is not pursued here, for the following reasons. For one, it requires skill to assess which details are crucial, and additional time is incurred to build the simplified model and run the analyses. It is possible to include details essential to one stress location and simplify the rest of the model; however, the resulting model will not be universally useful for all pertinent stress locations. Nevertheless, the above scheme is certainly viable and can be explored in future work.

The general strategy adopted herein is as follows. Owing to the diversity of structural systems, there is no one-size-fits-all model. Therefore, functional forms for C(·) are devised, but the actual parameters of the function are estimated from the MCS data. Moreover, it is feasible to experiment with different functional forms and select the best performing one without rerunning the simulations. In this way, variance reduction is accomplished without invoking additional simulations or specific prior information (unlike importance sampling). This "trick" is somewhat reminiscent of pulling a rabbit out of a hat, in the sense that a computational gain is seemingly created out of nowhere. In fact, it is nothing more than capitalizing on information that has always been available but hitherto neglected. In conventional MCS, the replicates Y_i are taken at face value without considering X. However, by drawing the connection between Y and X, one may infer whether a particular replicate Y_i is likely to be greater or smaller than the mean, and adjust accordingly.

Having laid out the general concept, the next logical step is to identify promising functional forms for the control variate. Earlier, two options for MCS were described. The proposed approach can be applied equally to either option. However, the ensuing exposition specializes to Option 2, which has lower variability, as will be shown in Section 5. In Option 2, Y = κd(X)f(X).
Since Y is the product of a known function f(X) and an unknown function d(X), one has the choice of fitting either d(X) or d(X)f(X) to a model.
After some trial and error, fitting to d(X) is found to be more profitable, as the damage is a physical quantity that follows certain trends, unlike d(X)f(X). In the sequel, it is understood that X = [H_S T_p θ^T]^T for time domain analysis, and X = [H_S T_p] when considering frequency domain analysis. The goal is to construct a functional form g(X) that approximates d(X). It is clear that modeling the influence of θ in g(X) is futile, since θ is a high-dimensional vector and the relation between θ and d(X) is too intricate and unpredictable. Hence, it is assumed that g(X) = g(H_S, T_p). A conceivable strategy is to choose a generic form for g(X), such as multivariate polynomials, but such a choice would be ungainly. Instead, problem-specific functions are likely to be more effective. To devise such functions, it is imperative to understand the problem at hand.

Following Low and Cheung [5], the clearest way to elucidate the problem characteristics is by way of example. Consider the floating system that was studied in Ref. [5] and will be described in Section 5. Reference is made to the probability integral given by Eq. (11). The contours of the jpdf of H_S and T_p were shown earlier in Fig. 1. Fig. 2 depicts the damage function d̄(H_S, T_p) as a surface plot, while Fig. 3 shows the contours of the integrand d̄(H_S, T_p)f(H_S, T_p), from which it can be observed that the maximum point of the integrand (H_S = 6.00 m, T_p = 12.01 s) is shifted "northwards" compared to the peak of f(H_S, T_p). This is the consequence of multiplying by d̄(H_S, T_p). In addition, there is a secondary peak at H_S = 2.08 m, T_p = 6.81 s, rendered by a spike in d̄(H_S, T_p) (Ref. [5] gives a more thorough explanation). The existence of secondary peak(s) varies from system to system. The complicated characteristics of the integrand make it unsuitable for data-fitting, hence the choice to fit the damage function instead. According to Fig. 2, d̄(H_S, T_p) is obviously period-sensitive; this is the consequence of various factors, such as system resonances, frequency-dependent wave loads, and T_p affecting the number of stress cycles in a specified time period. Another notable feature is that d̄(H_S, T_p) increases approximately exponentially with H_S for fixed T_p. With these considerations, it is assumed that

g(H_S, T_p) = a(T_p)\, H_S^{b(T_p)}   (25)
where a(T_p) and b(T_p) are parameters that vary with T_p. Here, ln a(T_p) and b(T_p) are fitted to polynomials of order q and r respectively, i.e.

\ln a(T_p) = a_0 + a_1 T_p + \cdots + a_q T_p^q, \qquad b(T_p) = b_0 + b_1 T_p + \cdots + b_r T_p^r   (26)

Fig. 2. Surface plot of the damage function d̄(H_S, T_p) for m = 3.
Fig. 3. Contours of the integrand d̄(H_S, T_p)f(H_S, T_p) for m = 3.
Inserting Eq. (26) into Eq. (25) yields
g(H_S, T_p) = \exp\left(a_0 + a_1 T_p + \cdots + a_q T_p^q\right) H_S^{\,b_0 + b_1 T_p + \cdots + b_r T_p^r}   (27)
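For reference, Eq. (27) can be evaluated for arbitrary polynomial orders as in the sketch below, with an assumed parameter layout p = [a_0, ..., a_q, b_0, ..., b_r] (the layout and function name are illustrative).

```python
import numpy as np

def g_eval(hs, tp, p, q, r):
    """Evaluate the control-variate form of Eq. (27)."""
    a = np.asarray(p[:q + 1])
    b = np.asarray(p[q + 1:q + 2 + r])
    ln_a = np.polyval(a[::-1], tp)    # a0 + a1*Tp + ... + aq*Tp^q
    b_tp = np.polyval(b[::-1], tp)    # b0 + b1*Tp + ... + br*Tp^r
    return np.exp(ln_a) * hs**b_tp
```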
It is found that odd values of q perform poorly, the reason being that odd-ordered polynomials are less stable: ln a(T_p) may "blow up" when T_p is too small or too large. The following models are considered:
Model 1: g(H_S, T_p) = \exp(a_0)\, H_S^{b_0}, \quad n_p = 2   (28a)

Model 2: g(H_S, T_p) = \exp(a_0 + a_1 T_p + a_2 T_p^2)\, H_S^{b_0}, \quad n_p = 4   (28b)

Model 3: g(H_S, T_p) = \exp(a_0 + a_1 T_p + a_2 T_p^2)\, H_S^{b_0 + b_1 T_p}, \quad n_p = 5   (28c)

Model 4: g(H_S, T_p) = \exp(a_0 + a_1 T_p + a_2 T_p^2 + a_3 T_p^3 + a_4 T_p^4)\, H_S^{b_0 + b_1 T_p}, \quad n_p = 7   (28d)
where n_p is the number of parameters. Models 1-3 allow for only one maximum (e.g. one resonant period) in the variation with T_p, whereas Model 4 caters to two maxima. The difference between Models 2 and 3 is the b_1 term, which captures the interaction between b and T_p. Parameter estimation is discussed next.

4.3. Parameter estimation by regression analysis

Considering that Y(X) = κd(X)f(X), and that g(X) is an approximation for d(X), it follows that the control variate is C(X) = κg(X)f(X). For C(X) to be a valid control variate, a prerequisite is that its mean,
\bar{C} = \int_0^\infty \int_0^\infty g(H_S, T_p)\, f(H_S, T_p)\, dH_S\, dT_p   (29)
is known. Eq. (29) cannot be evaluated in closed form, so the solution is obtained numerically. Because g(·) and f(·) are analytical functions, the double integral can be calculated fairly quickly on modern computing systems; typically, this computational overhead is negligible compared to the dynamic analyses.

Revisiting Eq. (24), recall that minimizing var(Z_λ) entails maximizing ρ²_YC. The parameters could be estimated by solving a nonlinear optimization problem to maximize ρ²_YC directly. However, optimizing the correlation coefficient often runs into numerical difficulties, owing to many (or infinitely many) possible solutions; a case in point is a model that contains a scaling parameter. Minimizing the sum of squared residuals (SSR) (i.e. least squares) is a time-honored method for parameter estimation. For linear regression with an intercept, minimizing the SSR is mathematically equivalent to maximizing ρ²_YC [19]. Although the two optimization objectives are not exactly equivalent for nonlinear regression, one can still expect the SSR to correlate strongly with 1 - ρ²_YC. The nonlinear least squares problem at hand seeks

\min_{\mathbf{p}} \sum_{i=1}^{N} \left[Y(\tilde{\mathbf{x}}_i) - \kappa f(\tilde{\mathbf{x}}_i)\, g(\tilde{\mathbf{x}}_i; \mathbf{p})\right]^2   (30)

where p = [a_0, a_1, ..., b_0, b_1, ...] is the vector of parameters of g(·). Nonlinear regression necessitates an iterative procedure to compute the parameters. The analyst must specify an initial guess, and a poor one can result in the iterations failing to converge, or converging to a local rather than the global minimum. Here, an initial guess is obtained by solving a transformed version of the problem. Taking the logarithm of Eq. (27) gives

\ln g(H_S, T_p) = a_0 + a_1 T_p + \cdots + a_q T_p^q + \left(b_0 + b_1 T_p + \cdots + b_r T_p^r\right) \ln H_S   (31)
The log-transformed Eq. (31) is in the form of a multivariate polynomial in T_p and ln H_S, albeit with the omission of second- and higher-order terms of ln H_S. Thus, the parameters can be efficiently determined by multivariate polynomial regression (MPR) [20], which is in fact a linear estimation problem. The MPR solution provides the initial guess for the nonlinear regression analysis. Certainly, one may take the MPR result directly, thereby avoiding nonlinear optimization altogether. However, as will be shown later, the MPR solution is suboptimal, owing to incorrect assumptions about the distribution of errors. There is a variety of iterative methods for solving the least squares problem. This study uses the Nelder-Mead simplex method [21], which has the advantage of not requiring analytical or numerical derivatives. This algorithm is found to be very robust for the present problem, provided that the MPR result is taken as the initial guess.

It is timely to re-examine Eq. (21) to better appreciate how the proposed control variates scheme achieves variance reduction. Solely to illustrate certain concepts, assume that g(·) is a good model for d(·), so that λ* is close to unity. Taking λ = 1, Eq. (21) specializes to
Z_\lambda(\mathbf{X}) = \bar{C} + \varepsilon(\mathbf{X}), \qquad \varepsilon(\mathbf{X}) = Y(\mathbf{X}) - C(\mathbf{X})   (32)
where ε(X) is the residual. Considering that C(·) is constructed by regression, it is akin to a surrogate (or response surface) model. The main distinction is that in response surface modeling, the design points are usually selected deterministically, whereas in the proposed approach the design points are randomly generated. Because virtually all models are imperfect, the estimate C̄ is inherently biased. For linear least squares regression, the sample mean of the residuals, ε̂, is zero. For nonlinear regression, however, ε̂ ≠ 0, and ε̂ serves as a correction for bias.

The procedure outlined so far seems promising conceptually, but a critical flaw remains. Consider the implication of adding more parameters, up to the point where n_p = N. Under this circumstance, the data can be fitted to a "perfect model", and ε̂ = 0. Obviously, this is too good to be true. What is happening is that the prediction for D̄ is entirely governed by the response surface method, and it is therefore biased. Even with fewer parameters, the prediction is biased because the function C(·) is constructed from all the data points, hence the function and any arbitrary data point x̃_i are dependent.

Another concern relates to estimating the standard error, for which there are two interrelated issues, namely dependence of
the residuals and underestimation of var(Z). In MCS, the outputs are i.i.d., allowing straightforward estimation of the standard error. For the approach outlined above, owing to the constraint of the least squares fitting, the residuals are dependent, and so are the outputs Z_i. If the Z_i are assumed independent, the standard error tends to be optimistic. The underestimation of var(Z) is caused by shrinkage of the mean-square residuals, which are always lower when tested on the training data than on new data. The reason is that the model inadvertently fits the random noise in the data, in addition to the true behavior. As the model complexity increases, say by including more parameters, the fit always improves, as reflected by a drop in the training error, but the performance of the model on new data is not necessarily enhanced. Overfitting is said to occur when an excessively complex model explains more of the noise than the actual trends, resulting in poor predictive capability. The interrelated issues of bias, variance estimation and overfitting are addressed in the following section.

4.4. Incorporating the cross-validation technique

Model validation is paramount for ascertaining how well a model generalizes to a fresh dataset. The simplest form of validation is the holdout set method [22]. The data is partitioned into two sets, one for training and the other for validation (a 70-30 training/validation split is common). When data is limited, it is extravagant to dedicate such a substantial portion of the data to testing. A more sophisticated technique that reduces data wastage is cross-validation, which has several variants. For k-fold cross-validation [22], the data is partitioned into k subsets. For illustration, suppose N = 100 samples and k = 5. Four subsets (80 samples) are used to construct the model, and the error is estimated from the remaining subset of 20 samples. The process is repeated five times for the various combinations. In this way, every data point rotates to train and also to validate a model, but never both concurrently.

A special case is k = N, giving rise to so-called leave-one-out cross-validation (LOOCV) [22]. LOOCV has two advantages: it allows the maximum number of data points to be utilized for training, and it avoids the problem of N not being divisible by k. The procedure for LOOCV comprises the following steps. Let Y(x̃_i) represent the ith observation used for testing, with the remaining N - 1 data points used to build a surrogate model C_{-i}(·). Then the ith predicted residual (as opposed to the ordinary residual) is given by
\varepsilon_i = Y(\tilde{\mathbf{x}}_i) - C_{-i}(\tilde{\mathbf{x}}_i), \qquad i = 1, 2, \ldots, N   (33)

The predicted mean-square error, PMSE = N^{-1}\sum_{i=1}^{N} \varepsilon_i^2, can be used to measure the predictive capability of the model. Strictly speaking, the PMSE slightly overestimates the true mean-square error, because the actual model is established from N data points, whereas the cross-validation models are assessed from N - 1 points. The LOOCV concept is incorporated into the proposed control variates scheme, such that the ith controlled output reads
Z_i = Y(\tilde{\mathbf{x}}_i) - \lambda_{-i}\left[C_{-i}(\tilde{\mathbf{x}}_i) - \bar{C}_{-i}\right]   (34)
where the subscript -i signifies that the ith sample is excluded. Since x̃_i is unrelated to λ_{-i} and C_{-i}(·), x̃_i qualifies as an independent input variable. As a corollary, Z_i and Z_i² are unbiased, and thus unbiased estimates of the mean and mean-square can be calculated as \hat{Z} = N^{-1}\sum_{i=1}^N Z_i and \widehat{Z^2} = N^{-1}\sum_{i=1}^N Z_i^2. Nelson [23] described a similar control variates scheme that incorporates LOOCV, referring to such a scheme as splitting. In Nelson's study, the control function is established a priori and does not depend on the samples, i.e.
Z_i = Z_\lambda(\tilde{\mathbf{x}}_i) = Y(\tilde{\mathbf{x}}_i) - \lambda_{-i}\left[C(\tilde{\mathbf{x}}_i) - \bar{C}\right]   (35)
The motivation for a splitting scheme is to make x̃_i and λ_{-i} independent, eliminating bias in Z_i. Nevertheless, splitting is seldom employed in control variates applications, as the bias is trivial unless the sample size is small. In fact, the bias caused by estimating λ* from the MCS data is often glossed over in many texts and articles. Although splitting in control variates is not a new concept, to the author's best knowledge, the notion of deriving the nonlinear function C(·) from the MCS data has not hitherto appeared in the literature. While the estimated mean is unbiased, the point estimator variance is still biased, because the Z_i are not completely independent. Nelson's study suggests that this bias is negligible for N ≥ 30.

The discussion now focusses on the standard error. Recall that \hat{Z} and \widehat{Z^2} are unbiased; this is true irrespective of whether the Z_i are dependent. Hence, the sample variance \hat{\sigma}_Z^2 = (\widehat{Z^2} - \hat{Z}^2)N/(N-1) is virtually unbiased for large N; the small bias is due to Bessel's correction N/(N - 1) being inexact for correlated samples. This nuance is hardly worth fussing over, because the effect is mild and besides, σ̂_Z would be biased anyway even if σ̂²_Z were unbiased. The crux is whether σ̂_Z/√N suffices as a good approximation for the standard error if the Z_i are correlated. For the present fatigue problem, the typical sample size is not extremely small. Hence, for simplicity of argument, assume that the dependency caused by λ_{-i} is negligible, and treat λ_{-i} as a known constant λ. Referring to Eq. (34), consider the covariance of Z_i and Z_j (i ≠ j), i.e.
\mathrm{cov}(Z_i, Z_j) = \mathrm{cov}\!\left[Y(\tilde{\mathbf{x}}_i), Y(\tilde{\mathbf{x}}_j)\right] - 2\lambda\,\mathrm{cov}\!\left[Y(\tilde{\mathbf{x}}_i),\, C_{-j}(\tilde{\mathbf{x}}_j) - \bar{C}_{-j}\right] + \lambda^2\,\mathrm{cov}\!\left[C_{-i}(\tilde{\mathbf{x}}_i) - \bar{C}_{-i},\, C_{-j}(\tilde{\mathbf{x}}_j) - \bar{C}_{-j}\right]   (36)
where the factor of two in the second term arises from the interchangeability of i and j. Clearly, cov[Y(x̃_i), Y(x̃_j)] = 0, since Y(x̃_i) and Y(x̃_j) are the raw MCS outputs. In general, cov[Y(x̃_i), C_{-j}(x̃_j) - C̄_{-j}] ≠ 0, because Y_i is involved in the definition of C_{-j}(·). Nevertheless, considering that Y_i is independent of x̃_j, and that C_{-j}(x̃_j) is mostly dominated by its argument x̃_j, one would expect the correlation between Y_i and C_{-j}(x̃_j) - C̄_{-j} to be weak. Finally, cov[C_{-i}(x̃_i) - C̄_{-i}, C_{-j}(x̃_j) - C̄_{-j}] is also non-zero, as both C_{-i}(x̃_i) and C_{-j}(x̃_j) depend on all the data points. When splitting is incorporated, it is conjectured that the correlation between these two terms is diminished, the rationale being that the functions C_{-i}(·) and C_{-j}(·) are defined independently of their arguments, while x̃_i and x̃_j are independent. In sum, it is proposed that the standard error be approximated by σ̂_Z/√N, in view of the absence of alternatives. The quality of this approximation will be ascertained in the case studies.

The overall procedure of the proposed approach is summarized as follows (a code sketch is given after the list):

1. Select a model from Eq. (28).
2. Apply linear regression on all N samples using the log-transformed model to produce a first estimate of the parameters p.
3. Using the initial guess from Step 2, perform nonlinear optimization on all N samples to acquire another estimate of p.
4. Apply the splitting technique: using the initial guess from Step 3, perform nonlinear optimization on N - 1 samples to construct C_{-i}(·), and compute C̄_{-i}. Repeat this step N times, once for each i.

This iterative scheme exploits the fact that the various C_{-i}(·) are similar across i, as they differ by only one data point; using the initial guess from Step 3 accelerates the convergence.
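The following Python sketch assembles Steps 1-4 for Model 3 of Eq. (28c), reusing f_joint from the sketch in Section 3.1. It is a minimal illustration under stated assumptions, not the author's code: the MPR fit of Eq. (31) supplies the initial guess, Eq. (30) is minimized by Nelder-Mead, C̄ is computed by quadrature per Eq. (29), and Eq. (34) is applied with leave-one-out splitting.

```python
import numpy as np
from scipy.optimize import minimize
from scipy import integrate

def g3(hs, tp, p):
    """Model 3 of Eq. (28c); p = [a0, a1, a2, b0, b1]."""
    a0, a1, a2, b0, b1 = p
    return np.exp(a0 + a1*tp + a2*tp**2) * hs**(b0 + b1*tp)

def fit_mpr(hs, tp, d):
    """Step 2: linear fit of the log-transformed model, Eq. (31)."""
    X = np.column_stack([np.ones_like(tp), tp, tp**2, np.log(hs), tp*np.log(hs)])
    coef, *_ = np.linalg.lstsq(X, np.log(d), rcond=None)
    return coef

def fit_nls(hs, tp, Y, kappa, p0):
    """Step 3: nonlinear least squares, Eq. (30), by Nelder-Mead."""
    ssr = lambda p: np.sum((Y - kappa * f_joint(hs, tp) * g3(hs, tp, p))**2)
    return minimize(ssr, p0, method="Nelder-Mead").x

def c_bar(p, bounds):
    """Mean of the control variate, Eq. (29), by numerical quadrature."""
    (h_lo, h_hi), (t_lo, t_hi) = bounds
    val, _ = integrate.dblquad(lambda t, h: g3(h, t, p) * f_joint(h, t),
                               h_lo, h_hi, t_lo, t_hi)
    return val

def cv_splitting(hs, tp, Y, kappa, bounds):
    """Step 4: leave-one-out splitting per Eq. (34)."""
    N = len(Y)
    d = Y / (kappa * f_joint(hs, tp))               # raw damage samples for the MPR fit
    p_all = fit_nls(hs, tp, Y, kappa, fit_mpr(hs, tp, d))
    Z = np.empty(N)
    for i in range(N):
        keep = np.arange(N) != i
        p_i = fit_nls(hs[keep], tp[keep], Y[keep], kappa, p_all)  # warm start
        C = kappa * f_joint(hs, tp) * g3(hs, tp, p_i)
        lam = np.cov(Y[keep], C[keep])[0, 1] / np.var(C[keep], ddof=1)  # Eq. (23)
        Z[i] = Y[i] - lam * (C[i] - c_bar(p_i, bounds))
    return Z.mean(), Z.std(ddof=1) / np.sqrt(N)     # unbiased mean, approx. SE
```

Note that the N refits are warm-started from the full-sample parameters, mirroring the remark above; the repeated quadrature for C̄_{-i} dominates the post-processing cost, but this remains negligible next to the dynamic analyses.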
4.5. Advantages of proposed approach/general discussion

Having fully described the proposed approach, it is timely to consider its advantages. At its core, the proposed approach is effectively a synthesis of the response surface method and MCS, inheriting the strengths of both. It uses surrogate modeling to provide a reasonable estimate of the damage (in the form of C̄) without the need for excessive samples, unlike MCS. The prediction of the traditional response surface method is intrinsically biased; the proposed approach relies on the control variates formulation of MCS to make the necessary correction, resulting in an unbiased estimate. The classical favorable attributes of MCS also manifest in the proposed approach. The standard error can be estimated, allowing confidence intervals to be constructed. In addition, since the design points are not predetermined (unlike the response surface method), additional simulations can be performed as needed.

The proposed variance reduction strategy has many practical advantages over other simulation approaches. It is non-intrusive, in the sense that the analyst does not alter the code or the dynamic simulation procedure, and all the extra computations for variance reduction are conducted during the post-processing stage. For this reason, the method is applicable to multiple stress locations, by devising control functions individually for each location. In contrast, techniques such as importance sampling modify the simulation itself, so an optimized performance for one stress location may come at the expense of another. Importance sampling requires great expertise to implement, and is liable to backfire if the sampling density is not appropriately chosen. By comparison, control variates are virtually immune to backfiring even in the worst scenario. In addition, importance sampling necessitates pilot simulations to obtain prior knowledge of the system, whereas the proposed approach has no such requirement.

Being a post-processing method comes with yet another benefit. One impediment to applying advanced probabilistic methods in the practical design of offshore structures is the shortage of suitable expertise in stochastic analysis. The proposed approach allows the "ordinary" engineer to perform the dynamic and fatigue analyses in the usual fashion, and the stochastic analysis specialist in the design team - whose availability is probably limited - need not intervene until the simulations are completed.

The proposed concept of constructing control variates from the data appears to be powerful and versatile. There are many more possibilities that can be investigated in future work. For example, more environmental variables can be included, such as wave direction, wind and current. The nonlinear control function can be generalized to accommodate additional variables, but it is also feasible to employ multiple controls, in which case the scalar C(X) is replaced by a vector of control functions.
5. Case study

5.1. Description of offshore structure

The floating system studied in the previous work [5] is considered in the present case studies. The floater is an FPSO (floating production storage and offloading) vessel. The moorings and risers are simplified as four representative lines in a catenary configuration, each orientated at 45° to the floater in plan. The random waves are unidirectional, with a heading of 30°. The short-term wave characteristics are defined by the JONSWAP spectrum (cf. Section 2.1), while the long-term joint distribution of H_S and T_p is described in Section 3.1. The future prospects of offshore oil and gas lie in ultra-deepwater (1000-3000 m) fields; accordingly, the water depth is selected as 2000 m, unless otherwise stated. For an ultra-deepwater floating system, the floater motions are small compared to the water depth, and the LF tension, which is primarily quasi-static, becomes insignificant [14]. For simplicity, the dynamic analyses are carried out only for the WF response. The top tension of one of the lines is considered in the fatigue assessment in the examples, but as mentioned on multiple occasions, the proposed approach can be used to evaluate the damage at any arbitrary stress location without rerunning the simulations. Two values of the S-N exponent are studied, namely m = 3 and m = 5.

5.2. Frequency domain analysis without short-term variability

The proposed approach is amenable to both frequency domain and time domain analysis. As frequency domain analysis is much faster and less intricate (there are no random phases), it is expedient to first consider frequency domain analysis for investigating a multitude of cases, to understand the applicability of the various methods. Direct numerical integration is computationally intensive, but it is useful as a benchmark; moreover, by analyzing the domain in fine resolution, insightful contours can be generated. Here, the grid intervals are chosen as ΔH_S = 0.1 m and ΔT_p = 1 s, and the integration limits are T_p^min = 5 s, T_p^max = 18 s, H_S^min = 0 m, H_S^max = 14 m. The fatigue damage results reported hereafter are normalized by dividing by the direct integration solution. The contours of the integrand d̄(H_S, T_p)f(H_S, T_p) for m = 3 were presented previously in Fig. 3; Fig. 4 shows the contours for m = 5. Compared to m = 3, the location of the peak for m = 5 has shifted further "upwards" owing to the more pronounced nonlinearity of the damage [5]. In addition, the secondary peak has almost vanished.

Two options for MCS were introduced in Section 3.3. For Options 1 and 2, samples of H_S and T_p are drawn from f(H_S, T_p) and the multivariate uniform distribution respectively. The bounds of the uniform distribution (T_p^min, etc.) are identical to the direct integration limits. For both options, the sample size is 50,000, selected such that the results are reliable; specifically, the standard error is within 0.01 for Option 2 (the preferred method). Table 1 lists the normalized long-term mean damage and the associated sample standard deviation. The normalized damage is close to unity, indicating good agreement with numerical integration; the slight difference can be ascribed to sampling variability. Option 1 exhibits higher sampling variability, which is accentuated when m = 5. The reason is clear from Fig. 5, which shows the scatter plots of the (H_S, T_p) samples for the two options. For Option 1, the samples are expectedly concentrated in the vicinity of the statistical mode of f(H_S, T_p), and there are inadequate samples near the primary peak of the integrand. This weakness is compounded for m = 5, as the peak is further away. For Option 2, the samples are more evenly dispersed and are able to cover the important region(s). Correspondingly, the sampling variability is smaller, and also less disparate between m = 3 and m = 5. The proposed approach is based on Option 2 on account of its better efficiency. Nevertheless, the efficiency of Option 2 is still objectively poor, motivating the need for variance reduction.

The proposed approach is implemented as follows. Supposing N = 100, the existing 50,000 MCS replicates are divided into J = 500 unique sets of N samples. The proposed approach is applied recursively to quantify the variability over the different sets.
Fig. 4. Contours of the integrand d̄(H_S, T_p)f(H_S, T_p) for m = 5.
Table 1. Mean damage and sampling variability for MCS with 50,000 samples.

                                     MCS, Option 1        MCS, Option 2
                                     m = 3     m = 5      m = 3     m = 5
  Normalized mean damage             1.01      0.974      0.997     0.992
  Standard deviation, σ              2.65      9.84       1.77      2.06
  Standard error, σ/√50,000          0.012     0.044      0.0079    0.0092
Let Z_{i,j} denote the ith controlled output for set j. The following quantities are computed:

\hat{d}_j = \frac{1}{N}\sum_{i=1}^{N} Z_{i,j}, \qquad s_{est,j}^2 = \frac{1}{N(N-1)}\sum_{i=1}^{N} \left(Z_{i,j} - \hat{d}_j\right)^2,

\hat{d}_{CV} = \frac{1}{J}\sum_{j=1}^{J} \hat{d}_j, \qquad s_{CV}^2 = \frac{1}{J-1}\sum_{j=1}^{J} \left(\hat{d}_j - \hat{d}_{CV}\right)^2, \qquad s_{est,all}^2 = \frac{1}{J}\sum_{j=1}^{J} s_{est,j}^2   (37)
where d̂_j is the sample mean damage for the jth set, d̂_CV is the mean across all sets (CV indicates control variates), s_CV is the "true" standard error, s_est,j is the estimated standard error for the jth set assuming independent outputs, and s_est,all is the average estimated standard error. The variance reduction is quantified by the speedup factor SF = s²_MCS/s²_CV, where s_MCS is the corresponding standard error of plain MCS. The optimistic factor, OF = s_CV/s_est,all, measures the optimism in the estimate of the standard error obtained by assuming the controlled outputs are independent; a higher OF indicates a more optimistic (i.e. unconservative) estimate.
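A sketch of these metrics, for controlled outputs arranged as a J x N array, is given below; s2_mcs denotes the squared standard error of plain MCS at the same N (e.g. var(Y)/N), which is an assumption of this illustration.

```python
import numpy as np

def performance(Z, s2_mcs):
    """Eq. (37) and the SF/OF metrics for controlled outputs Z of shape (J, N)."""
    J, N = Z.shape
    d_hat = Z.mean(axis=1)                  # per-set mean damage
    s2_est = Z.var(axis=1, ddof=1) / N      # per-set SE^2, assuming i.i.d. outputs
    s2_cv = d_hat.var(ddof=1)               # "true" variance of the mean across sets
    SF = s2_mcs / s2_cv                     # speedup factor
    OF = np.sqrt(s2_cv / s2_est.mean())     # optimistic factor
    return d_hat.mean(), SF, OF             # d_CV, SF, OF
```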
Fig. 5. Scatter plot of 400 samples: (a) Option 1; (b) Option 2.
Table 2 presents the results of SF, OF and d̂_CV for m = 3, comparing the four models described by Eq. (28). Three versions of the proposed approach are considered: (1) splitting with nonlinear optimization, (2) splitting with linear optimization, and (3) the "in-sample" method (no splitting). In all cases, d̂_CV (normalized by the direct integration result) is close to unity, indicating that the predicted damage is unbiased. This is consistent for the cross-validation methods, which are theoretically unbiased. Although the in-sample method is biased, the effect happens to be insignificant here, as n_p is much smaller than N.

The results for the in-sample method exemplify the perils of eschewing splitting. The optimism increases significantly with model complexity from Model 1 to Model 4. The wrongly estimated speedup factors are 12.83, 13.11, 18.26 and 18.27 (calculated from SF·OF²), giving the illusion of improving performance, when in reality the actual SF drops from Model 3 to Model 4.

The speedup for nonlinear optimization is quite encouraging; SF initially increases with n_p, reaching a peak for Model 3, but then falls slightly for Model 4. Most likely, Model 4 (n_p = 7) is too complex for the amount of data (N = 100), resulting in the model fitting the noise as well as the signal. Model 3 has a higher SF than Model 2, suggesting that the b_1 term capturing the interaction between H_S and T_p is important (cf. Eq. (28c)). The optimism amplifies with n_p; although OF is non-negligible, it is much lower than for the in-sample method, demonstrating the benefits of splitting. Linear optimization performs worse in terms of SF, as the parameters are not fully optimized with respect to the nonlinear system. On the positive side, not pushing the boundaries makes the control function C(·) less dependent on the data, and this seems to almost entirely eradicate the correlation between the Z_i, resulting in OF ≈ 1.

Consider Model 3, for which nonlinear optimization has the best performance.
Table 2. Summary of results for N = 100, m = 3.

           Nonlinear optimization      Linear optimization       In-sample method
  Model    SF      OF      d̂_CV       SF      OF      d̂_CV      SF      OF      d̂_CV
  1        9.72    0.982   0.991       7.73    1.01    0.999      11.5    1.05    1.003
  2        14.2    1.04    0.997       8.12    1.01    0.998      9.84    1.15    0.998
  3        17.3    1.08    1.001       9.85    1.02    0.997      13.7    1.15    1.000
  4        16.5    1.20    1.005       14.0    1.01    0.994      8.99    1.43    1.013
Fig. 6. Replicates of MCS outputs, control variates and controlled outputs.

Fig. 8. Sample mean damage across various sets.
Fig. 6 compares the replicates of the MCS output Y_i, the control variate C_{-i}(x̃_i), and the controlled output Z_i for a particular set. Evidently, Y_i is characterized by wide fluctuations, punctuated by exceptionally large values. Nonlinear optimization based on Model 3 is favorable, as C_{-i}(x̃_i) is able to reasonably reproduce the essential characteristics. As a result, the variability of Z_i is substantially diminished; but the model cannot be perfect, and there are still several instances where Z_i has small peaks (e.g. the 36th and 43rd replicates). It is interesting that the 53rd Z_i replicate transpires to be negative; however, this does not violate any principles, for although Z(X) is unbiased, it is not guaranteed to be positive, unlike Y_i. Occurrences of negative Z_i are relatively uncommon, yet crucial to offset the positive peaks. Fig. 7 shows the pdfs of Y(X) and Z(X) generated from all 50,000 replicates. The pdf of Y(X) is heavily skewed with a long right tail, explaining the high variance. In contrast, the pdf of Z(X) is more symmetrical with less spread, and there is a small but finite probability of negative values. Fig. 8 plots the sample mean damage d̂_j across the various sets; once again, it is apparent that MCS has wider scatter than the proposed approach. In contrast with Fig. 6, there are no negative values in this plot: the likelihood of a negative d̂_j is infinitesimal, as each d̂_j is averaged from 100 samples.
Fig. 7. Comparison of pdfs of MCS output Y(X) and controlled output Z(X).
Table 3 shows the results for m = 5. As before, the in-sample method is overly optimistic. Here, the speedup from nonlinear optimization is more impressive, particularly for Model 3 with SF ≈ 90. It is reasoned that the integrand for m = 3 is more complicated due to the presence of a secondary peak, which is almost non-existent for m = 5. Consequently Model 3, which allows for only a single hump in the variation with T_p, has a "blind spot" for m = 3, culminating in a lower SF. For m = 5, there is little benefit in adopting Model 4, which caters for a double hump, and accordingly the SF suffers a huge decline from overfitting. Linear optimization is less favorable for m = 5, a logical outcome since the higher nonlinearity accentuates the discrepancy between the linear and nonlinear models.

To explore the sample size effect, the proposed approach is applied with N = 200 and N = 50, and the results are tabulated in Table 4. For nonlinear optimization, the larger sample size (N = 200) furnishes a better fit and leads to an obvious improvement in the speedup; moreover, the better N/n_p ratio reduces the optimism. On the contrary, for N = 50 the overall performance deteriorates, and Model 2, with fewer parameters, is most effective. Interestingly, the sample size has a diminished influence on SF for linear optimization. More specifically, the gain is insignificant when more samples are available (N = 200), but on the positive side, there is a lower penalty for fewer samples (N = 50).

To verify the hypothesis that the less impressive performance for m = 3 originates from the secondary peak in the integrand, another structural configuration is investigated. The water depth is reduced to 400 m (only in this paragraph). The low-frequency responses are included, as they are important, and the stress response becomes bimodal. Further details are omitted due to space constraints, but are available in Ref. [5]. Table 5 shows the results. Because the integrand (not shown) has only one peak, Models 1-3 no longer have a blind spot and their performance improves dramatically.
Table 3
Summary of results for N = 100, m = 5.

Model   Nonlinear optimization      Linear optimization         In-sample method
        SF      OF      d̂_CV       SF      OF      d̂_CV       SF      OF      d̂_CV
1       13.3    1.02    1.000       10.5    0.985   0.983       7.71    1.04    0.999
2       58.7    1.05    1.002       4.83    1.01    1.000       24.5    1.12    0.998
3       90.3    1.06    1.002       7.43    0.996   1.001       32.2    1.12    1.002
4       6.32    1.04    1.003       12.6    1.01    1.001       14.3    1.34    1.001
Table 4
Effect of sample size N, with m = 3.

Model     Nonlinear optimization      Linear optimization
          SF      OF      d̂_CV       SF      OF      d̂_CV
N = 200
1         9.41    0.978   0.997       8.95    0.973   0.995
2         16.6    0.998   0.996       9.55    0.972   0.994
3         24.5    0.998   0.995       11.7    0.973   0.994
4         38.3    1.050   0.994       15.2    0.985   0.993
N = 50
1         6.85    1.012   1.003       6.95    1.01    1.003
2         10.2    1.026   0.997       7.58    1.00    1.001
3         7.41    1.048   0.991       9.46    0.998   0.999
4         2.10    1.378   1.033       7.41    0.99    0.997
Table 5
Summary of results for alternative system (water depth = 400 m), N = 100, m = 3.

Model   Nonlinear optimization      Linear optimization
        SF      OF      d̂_CV       SF      OF      d̂_CV
1       15.4    0.971   1.004       10.5    0.985   1.005
2       76.5    1.022   1.001       57.2    0.972   1.003
3       75.8    1.044   1.001       50.2    0.977   1.003
4       47.2    1.168   1.002       9.32    0.988   1.009
5.3. Frequency domain analysis with short-term variability

One distinguishing feature of time domain analysis is that the short-term variability of the sea state contributes to the uncertainty of the damage prediction. However, there is a practical limit to the number of simulations that can be performed in the time domain, even in a research study. In this section, frequency domain analysis is considered, and the short-term variability effect is mimicked by introducing a random variable w. This allows a large dataset to be generated, and also enables direct comparison with the results in Section 5.2. The ith raw MCS output Y_i is further randomized as Y'_i = w_i Y_i, where w is a lognormal variable with a mean of one and a coefficient of variation c_w. Because w and Y are independent, it follows that E[Y'] = E[Y]. To obtain a reasonable value for c_w, it is observed that for a narrowband Gaussian process, if the correlation between stress cycles is neglected, then c_w^2 is approximately [24]
c_w^2 = \frac{1}{2 N_c} \left[ \frac{\Gamma(1+m)}{\Gamma^2(1+m/2)} - 1 \right]    (38)
where N_c is the number of stress cycles. In general, a higher m is associated with a larger c_w. Assuming T_sim = 21 min and a response period of 15 s, then N_c = 84, and thus c_w = 0.119 and 0.242 for m = 3 and 5, respectively. Suppose again that N = 100, corresponding to 35 h of simulation time. Since it is the norm to perform a 3 h time domain simulation for a given spectrum, this is equivalent to analyzing 12 distinct sea states in the scatter diagram, which is considered manageable in practice.
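A minimal sketch of this randomization, with illustrative function names, computes c_w from Eq. (38) and applies unit-mean lognormal factors to the raw outputs:

import numpy as np
from scipy.special import gamma

def cov_short_term(m, n_cycles):
    """Coefficient of variation c_w from Eq. (38)."""
    c2 = (gamma(1.0 + m) / gamma(1.0 + m / 2.0) ** 2 - 1.0) / (2.0 * n_cycles)
    return np.sqrt(c2)

def randomize(y, m, n_cycles, rng):
    """Return Y'_i = w_i * Y_i, with w lognormal, E[w] = 1, CoV = c_w."""
    cw = cov_short_term(m, n_cycles)
    sigma2 = np.log(1.0 + cw**2)   # lognormal parameters giving
    mu = -0.5 * sigma2             # unit mean and the target CoV
    w = rng.lognormal(mean=mu, sigma=np.sqrt(sigma2), size=np.shape(y))
    return w * np.asarray(y)

print(cov_short_term(3, 84), cov_short_term(5, 84))  # ~0.119 and ~0.242

With N_c = 84, the printed values reproduce c_w = 0.119 and 0.242 for m = 3 and 5 quoted above.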
To explore the effect of sample size, N = 200 with m = 3 is also examined. The results are presented in Table 6. A comparison with the previous results (Tables 2–4) reveals that the performance of both the nonlinear and linear optimization methods is less favorable when short-term variability is included. The short-term variability is akin to random noise, thereby contributing to the variation unexplained by the model C(X). Nevertheless, the short-term variability is much lower than the long-term variability, and the proposed approach still produces a respectable speedup. Overall, m = 5 suffers a greater deterioration in performance, because w has larger uncertainty.

5.4. Time domain analysis

Nonlinear time domain simulations are performed with MCS (Option 2) on the floating system described earlier. The simulations consumed about two months of computational time, producing 7200 realizations with T_sim = 21 min. From these realizations, the coefficient of variation (i.e. standard deviation divided by mean) of the damage is computed as 1.75 and 2.49 for m = 3 and 5, respectively. To get a sense of the short-term variability, a supplementary set of 100 simulations is performed, keeping the sea state fixed at HS = 6.00 m, Tp = 12.01 s (the peak of the integrand for frequency domain analysis, m = 3). The coefficients of variation for the 100 samples are found to be 0.096 and 0.203 for m = 3 and 5, respectively; these values are similar to those in Section 5.3.
Table 6
Summary of results for frequency domain analysis with short-term variability.

Model          Nonlinear optimization      Linear optimization
               SF      OF      d̂_CV       SF      OF      d̂_CV
m = 3, N = 100
1              6.73    1.020   0.998       7.22    1.00    0.993
2              11.3    1.011   0.994       7.75    1.00    0.988
3              14.8    1.005   0.993       9.26    0.990   0.991
4              9.84    1.172   0.996       10.81   1.02    0.993
m = 5, N = 100
1              6.71    1.00    1.003       2.59    1.02    1.000
2              9.90    1.00    1.002       3.23    1.02    0.999
3              9.50    1.02    1.004       4.44    1.01    1.003
4              5.74    1.07    1.010       6.02    1.02    1.001
m = 3, N = 200
1              8.65    0.975   0.998       8.09    0.977   0.994
2              13.1    0.998   0.996       8.86    0.969   0.994
3              20.1    1.05    0.995       10.53   0.967   0.994
4              19.5    1.06    0.996       12.79   0.983   0.993
Table 7
Summary of results for time domain analysis (N = 100).

Model   Nonlinear optimization      Linear optimization
        SF      OF                  SF      OF
m = 3
1       6.62    0.997               6.28    0.960
2       13.0    0.939               6.44    0.975
3       16.5    0.952               7.39    0.975
4       21.2    0.972               10.1    0.966
m = 5
1       6.07    0.984               2.12    0.978
2       6.46    0.976               2.42    0.975
3       5.89    1.00                3.15    0.978
4       5.25    1.02                3.91    0.997
Assuming N = 100, the 7200 realizations are partitioned into J = 72 sets, and the proposed approach is implemented in the same fashion as before. Here, the damage is not normalized because the solution by numerical integration is unavailable; thus d̂_CV is not reported (the purpose of presenting d̂_CV is to verify that the proposed approach is unbiased, and this has already been verified). Table 7 summarizes the results, which should be regarded as merely indicative, since there is much more variability with 72 sets compared to the 500 sets considered previously. Although the floating system is identical, some differences from the frequency domain results can still be expected, owing to the manner in which the nonlinearities are treated. In addition, for frequency domain analysis the short-term uncertainty is artificially represented by w, assumed to be statistically identical for all sea states; in reality, the uncertainty may vary with HS and Tp. Nevertheless, the trends in Table 7 are broadly similar to those in Table 6. The important inference is that the proposed approach is verified to work well in conjunction with actual time domain analysis.

5.5. General discussion

In light of the foregoing results, some suggestions are made pertaining to the practical application of the proposed approach. The in-sample method, although the simplest, cannot be recommended in view of its relatively high OF. Nonlinear optimization outperforms linear optimization in terms of SF; the disparity is particularly distinct for high nonlinearity (m = 5). For m = 3, linear optimization is viable, as the slightly inferior SF is a reasonable tradeoff for the reduced complexity and optimism. Linear optimization is particularly attractive for small sample sizes owing to its stable performance, but it is unable to reap the full benefits of large sample sizes.
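For reference, the set-wise bookkeeping behind Tables 2–7 could be reproduced along the following lines. This sketch assumes SF is the variance ratio Var(Y)/Var(Z) and OF is the ratio of the empirical scatter of the set means to the standard error predicted by treating the Z_i as independent; both are interpretations of the tabulated quantities, not verbatim definitions from the paper.

import numpy as np

def setwise_summary(y, z, n_per_set):
    """Partition the raw outputs y and controlled outputs z into sets
    of size n_per_set, then summarize speedup and optimism."""
    J = len(z) // n_per_set
    z_sets = np.reshape(z[: J * n_per_set], (J, n_per_set))
    means = z_sets.mean(axis=1)                      # d-hat for each set
    se_pred = z_sets.std(axis=1, ddof=1).mean() / np.sqrt(n_per_set)
    se_emp = means.std(ddof=1)                       # scatter across sets
    sf = np.var(y, ddof=1) / np.var(z, ddof=1)       # speedup factor
    of = se_emp / se_pred                            # optimism factor
    return sf, of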
The primary weakness of the proposed approach (with nonlinear optimization) is that the assumption of independent Z_i leads to an underestimate of the standard error. Nevertheless, the optimism is not excessive and is tolerable, provided the chosen model is not overly complex. The optimism depends on several factors, including N, the model complexity (n_p), and SF. Hence, one avenue for further work is to investigate more systems, with a view to developing an empirical equation for OF in terms of these variables. From a practical perspective, Model 2 is recommended, as its speedup is relatively good while its OF is small in the vast majority of cases, although Model 3 or 4 may perform better in specific situations.

There are many sources of uncertainty in the fatigue assessment of offshore structures, such as the environmental loading (wind, wave and current), structural parameters, boundary conditions, and fatigue S-N parameters. Practical fatigue design of offshore structures is based on the mean fatigue damage, and the proposed method is accordingly geared towards evaluating the mean damage under a stochastic wave environment. The control variates framework can possibly be extended to include more uncertainties; however, the aim would still be to evaluate the mean. Uncertainty in the fatigue damage is not explicitly analyzed by the proposed method. In practice, all uncertainties can be accounted for by a global safety factor on the fatigue damage (see Section 3.2), or by partial safety factors. Alternatively, the fatigue uncertainty can be analyzed as a reliability problem, which is beyond the scope of this study.

6. Conclusions

Long-term fatigue assessment of an offshore structure is an important, yet computationally challenging problem.
Various efficient strategies have been proposed in the literature; however, they generally suffer from one or more of the following shortcomings: (1) the computed fatigue damage is approximate, (2) the error is difficult to quantify, and (3) the analysis procedure is customized for a particular stress location. This paper outlines a new analysis approach based on the classical control variates technique to improve the efficiency of Monte Carlo simulation (MCS). Control variates are rarely employed in offshore engineering, as the technique is effective when a simple version of the problem can be solved easily, whereas offshore structural systems are typically too complex. The proposed approach overcomes this restriction via the novel idea of exploiting the MCS data to estimate the parameters of the control function, whose form is deduced from physical considerations. Parameter estimation involves solving a nonlinear least squares problem. A splitting scheme, akin to cross-validation, is incorporated to minimize the detriments of overfitting.

The proposed approach has the following key advantages. First, control variates is a post-processing technique that does not alter the simulation procedure, enabling the damage at multiple locations to be evaluated from the same set of simulation results. Second, the predicted mean damage is unbiased and the sampling variability can be quantified. Third, the approach does not require additional dynamic simulations to attain prior knowledge, while the computational overhead for post-processing is small compared to the dynamic analyses. Fourth, the approach is equally applicable to time domain and frequency domain analysis.

Case studies of a floating production system show that the proposed approach provides a substantial reduction in the sampling variability of MCS. The improvement in computational efficiency depends on various factors, such as the sample size, model complexity, and problem complexity. The variance reduction is less impressive for time domain analysis owing to the short-term variability of the random waves, which contributes to the unexplained variance of the fitted model. In this study, only two random variables, HS and Tp, are considered for the long-term condition. Nevertheless, the framework is versatile, and there are many possibilities for future work, for instance including additional environmental variables such as wave direction, wind and current.
Acknowledgements

Financial support from the NUS Start-up Grant, No. R-302-000-099-133, is gratefully acknowledged.
References

[1] Det Norske Veritas. DNV-RP-C205 environmental conditions and environmental loads; 2010.
[2] Naess A, Moan T. Stochastic dynamics of marine structures. Cambridge University Press; 2012.
[3] Det Norske Veritas. DNV-RP-F204 riser fatigue; 2010.
[4] Sheehan JM, Grealish FW, Harte AM, Smith RJ. Characterizing the wave environment in the fatigue analysis of flexible risers. J Offshore Mech Arctic Eng 2006;128(2):108–18.
[5] Low YM, Cheung SH. On the long-term fatigue assessment of mooring and riser systems. Ocean Eng 2012;53:60–71.
[6] Papadimitriou C, Beck JL, Katafygiotis LS. Asymptotic expansions for reliability and moments of uncertain systems. J Eng Mech 1997;123(12):1219–29.
[7] Giraldo JSM, Sagrilo LVS, Dantas CMS. Efficient probabilistic fatigue analysis of a riser suspended and moored by chains. In: Proceedings of the 34th international conference on ocean, offshore and arctic engineering.
[8] Gao Y, Cheung SH. Long-term fatigue analysis of risers with multiple environmental random variables in time domain. In: Proceedings of the 25th international ocean and polar engineering conference. p. 228–33.
[9] Giraldo JSM, Dantas CMS, Sagrilo LVS. Probabilistic fatigue analysis of marine structures using the univariate dimension-reduction method. Mar Struct 2016;50:189–204.
[10] Christiansen NH, Voie PET, Hogsberg J, Sodahl N. Efficient mooring line fatigue analysis using a hybrid method time domain simulation scheme. In: Proceedings of the 32nd international conference on ocean, offshore and arctic engineering, vol. 1; 2013.
[11] de Pina AC, Albrecht CH, de Lima BSLP, Jacob BP. Wavelet network metamodels for the analysis of slender offshore structures. Eng Struct 2014;68:71–84.
[12] Quéau LM, Kimiaei M, Randolph MF. Approximation of the maximum dynamic stress range in steel catenary risers using artificial neural networks. Eng Struct 2015;92:172–85.
[13] Chaves V, Sagrilo LVS, Silva VRM, Vignoles MA. Artificial neural networks applied to flexible pipes fatigue calculations. In: Proceedings of the 34th international conference on ocean, offshore and arctic engineering.
[14] Low YM, Langley RS. Time and frequency domain coupled analysis of deepwater floating production systems. Appl Ocean Res 2006;28(6):371–85.
[15] Haver S, Nyhus KA. A wave climate description for long term response calculations. In: Proceedings of the 9th international conference on offshore mechanics and arctic engineering, vol. 4; 1986. p. 27–34.
[16] Haver S. On the prediction of extreme wave crest heights. In: Proceedings of the 7th international workshop on wave hindcasting and forecasting, meteorological service of Canada.
[17] Ross SM. Simulation. 5th ed. San Diego: Academic Press; 2013.
[18] Gao Y, Low YM. An efficient importance sampling method for long-term fatigue assessment of deepwater risers with time domain analysis. Probab Eng Mech 2016;45:102–14.
[19] Afifi A, May S, Clark VA. Practical multivariate analysis. 5th ed. CRC Press; 2011.
[20] Rencher AC, Christensen WF. Methods of multivariate analysis. 3rd ed. Wiley; 2012.
[21] Nelder JA, Mead R. A simplex method for function minimization. Comput J 1965;7:308–13.
[22] James G, Witten D, Hastie T, Tibshirani R. An introduction to statistical learning: with applications in R. New York: Springer; 2013.
[23] Nelson BL. Control variate remedies. Oper Res 1990;38(6):974–92.
[24] Low YM. Variance of the fatigue damage due to a Gaussian narrowband process. Struct Saf 2012;34(1):381–9.