Accepted Manuscript Probabilistic transformation model for preconsolidation stress based on clay index properties
Jianye Ching, Tsai-Jung Wu
PII: S0013-7952(17)30020-0
DOI: 10.1016/j.enggeo.2017.05.007
Reference: ENGEO 4570
To appear in: Engineering Geology
Received date: 5 January 2017
Revised date: 8 May 2017
Accepted date: 14 May 2017
Please cite this article as: Jianye Ching, Tsai-Jung Wu, Probabilistic transformation model for preconsolidation stress based on clay index properties. Engeo (2017), doi: 10.1016/j.enggeo.2017.05.007
This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Probabilistic transformation model for preconsolidation stress based on clay index properties
Jianye Ching1 and Tsai-Jung Wu2
1 (Corresponding author) Professor, Dept of Civil Engineering, National Taiwan University, #1, Roosevelt Road Section 4, Taipei 10617, Taiwan. Email: [email protected]. Tel: +886-2-33664328.
2 Graduate Student, Dept of Civil Engineering, National Taiwan University, #1, Roosevelt Road Section 4, Taipei 10617, Taiwan. Email: [email protected].
ABSTRACT

This paper develops probabilistic transformation models that predict the probability density function (PDF) of the preconsolidation stress (σ'p) of a clay based on its index properties. These probabilistic transformation models are more versatile than traditional transformation models that only provide point estimates, because the PDF quantifies the transformation uncertainty in σ'p and can also be used to develop median and confidence interval estimates for σ'p. The confidence interval is especially useful because the transformation uncertainty in σ'p is fairly significant. Three probabilistic transformation models are developed: one generic model and two models specialized for contractive and dilative clays. The generic model is applicable to both contractive and dilative clays, but its transformation uncertainty is larger. The two specialized models require prior knowledge of the contractive/dilative behavior of the clay of interest, but their transformation uncertainties are smaller. The performances of the three probabilistic transformation models are verified by statistical cross-validation and by an independent validation database. The analytical forms of the median and confidence interval estimates are presented in this paper so that engineers do not need to derive the Bayesian equations.

Keywords: preconsolidation stress; index properties; transformation model; transformation uncertainty; clays.
INTRODUCTION

The preconsolidation stress (σ'p) and overconsolidation ratio (OCR) are important parameters that quantify the stress history of a clay. The undrained shear strength of a clay can be estimated based on its σ'p (Mesri, 1975, 1989). The at-rest earth pressure coefficient can be estimated based on its OCR (Mayne and Kulhawy, 1982). The contractive/dilative behavior of a clay during shearing also depends on OCR. σ'p for a clay can be determined in the laboratory using the oedometer test. It can also be estimated from the index properties of a clay through a transformation model (Stas and Kulhawy, 1984; Nagaraj and Srinivasa Murthy, 1986; DeGroot et al., 1999; Ching and Phoon, 2012; Kootahi and Mayne, 2016) as a first-order approximation. The latter approach (index properties) is desirable for the preliminary design stage, where extensive laboratory tests have not yet been conducted, and for scenarios where the site investigation budget is limited.

However, there is significant uncertainty in the estimated σ'p given the knowledge of the index properties. Table 1 shows the bias and coefficient of variation (COV) for several σ'p transformation models based on index properties. The bias is defined as the mean value of (measured σ'p)/(predicted σ'p) and the COV is the coefficient of variation of (measured σ'p)/(predicted σ'p). The biases and COVs for the transformation models in Table 1 are calibrated by the CLAY/10/7490 database developed in Ching and Phoon (2014a). The magnitude of the transformation uncertainty is quantified by the COV. Consider the most recent transformation model developed by Kootahi and Mayne (2016). It has the smallest COV, and yet its COV = 0.67 is still fairly large. The COVs for the other models are even larger.
It is therefore desirable to quantify the significant transformation uncertainty in σ'p by adopting a probabilistic transformation model. The main difference between probabilistic and traditional transformation models is that the former provides a probability density function (PDF) for σ'p, whereas the latter only provides a point estimate for σ'p. The availability of a PDF is a significant advantage because design engineers can get a sense of the magnitude of the transformation uncertainty, e.g., the 95% confidence interval for σ'p is the interval bounded by the 0.025- and 0.975-fractiles of the PDF. Point estimates can also be derived from the PDF, e.g., the mean or median estimate; Eurocode 7 has suggested the 0.05-fractile as the characteristic value (CEN 2004). More recently, probabilistic transformation models for some soil and rock properties have been proposed (Yan et al., 2009; Wang et al., 2010; Ching and Phoon, 2012, 2014b; Wang and Cao, 2013; Ching et al., 2014, 2017b; D'Ignazio et al., 2016; Feng and Jimenez 2015; Ng et al., 2016), but to the authors' best knowledge, a probabilistic transformation model for σ'p based on clay index properties is not yet available.
This paper develops probabilistic transformation models for σ'p based on the index properties of a clay. First, the multivariate PDF for (σ'p/Pa, σ'v0/Pa, PL, LL, wn) is constructed from a multivariate clay database using the translation approach (Liu and Der Kiureghian, 1986; Li et al., 2012), where Pa = 101.3 kN/m2 is one atmosphere pressure, PL and LL are the plastic and liquid limits, and wn is the natural water content. σ'v0 is considered because it is well known that σ'p is positively correlated to σ'v0. wn is considered because a large σ'p tends to produce a smaller void ratio, hence a smaller water content. PL and LL are considered because they are limiting values for wn. (PL, LL, wn) can be consolidated into a single variable LI, the liquidity index. However, it is found in this paper that the prediction accuracy for σ'p decreases if such consolidation is adopted. As a result, (PL, LL, wn) are not consolidated into LI.

The constructed multivariate PDF for (σ'p/Pa, σ'v0/Pa, PL, LL, wn) serves as the prior PDF for the subsequent Bayesian analysis. Then, given the site-specific information on (σ'v0/Pa, PL, LL, wn) for a clay of interest, the PDF of its σ'p/Pa can be updated by the Bayesian analysis. The entire framework has the numerical advantage that the updated PDF of σ'p/Pa and its point and interval estimates can be expressed in analytical forms. Design engineers can directly implement these analytical equations without the need to re-derive the Bayesian equations. The performance of the resulting probabilistic transformation models will be compared with existing traditional transformation models. The effectiveness of the probabilistic transformation models will be assessed by validation databases.
MULTIVARIATE CLAY DATABASE

The CLAY/10/7490 database (Ching and Phoon, 2014a) is a generic (global) clay database consisting of 7490 data points from 30 countries/regions worldwide. The clay properties cover a wide range of OCR, sensitivity (St), and plasticity index (PI). In the database, there is a subset of 1217 data points with simultaneous knowledge of (σ'p/Pa, σ'v0/Pa, PL, LL, wn) from 141 sites worldwide. Data points with OCR > 20 (possibly fissured clays) are excluded. Table 2 shows the statistics for (σ'p/Pa, σ'v0/Pa, PL, LL, wn) and OCR. In the following, (Y1, Y2, …, Y5) are defined as

$$
Y_1 = \ln(\sigma'_p/P_a), \quad Y_2 = \ln(\sigma'_{v0}/P_a), \quad Y_3 = PL, \quad Y_4 = LL, \quad Y_5 = w_n \tag{1}
$$

where PL, LL, and wn are in percent, e.g., if wn = 90%, take wn = 90. Figure 1 shows the histograms for (Y1, Y2, Y3, Y4, Y5).
MULTIVARIATE JOHNSON PROBABILITY DENSITY FUNCTION

The 1217 data points are adopted to construct the prior multivariate PDF of (Y1, Y2, …, Y5) for the subsequent Bayesian analysis. It is common for real data points to follow non-normal distributions. This can be clearly seen in Figure 1, where most histograms exhibit a certain degree of asymmetry, which is not consistent with the normal model. A practical approach for constructing the multivariate PDF is the translation approach (Liu and Der Kiureghian, 1986; Li et al., 2012): the non-normal data (Y1, Y2, …, Y5) are first mapped into standard normal data (X1, X2, …, X5). Then, (X1, X2, …, X5) are assumed to follow the multivariate standard normal distribution.

Mapping (Y1, Y2, …, Y5) into (X1, X2, …, X5)

The first step of the translation approach is to map (Y1, Y2, …, Y5) into (X1, X2, …, X5). Ching and Phoon (2014b) and Ching et al. (2014, 2017b) have shown that the Johnson system of distributions (Phoon and Ching, 2013) is effective in modeling various histograms of soil parameters. We cover the Johnson system of distributions below, starting with a well-known member: the shifted lognormal distribution. The lognormal distribution has zero as its lower bound. The shifted lognormal distribution generalizes the lognormal distribution to account for non-zero lower bounds. If Y is shifted lognormal, the relationship between X and Y is

$$
\frac{X - b_X}{a_X} = \ln\left(\frac{Y - b_Y}{a_Y}\right) \tag{2}
$$

where X is standard normal; aX, bX, aY, and bY are the parameters of the shifted lognormal distribution. The parameter bY is the lower bound of Y and it is typically determined by physics.
The shifted lognormal distribution is a member of the more general Johnson system, which can be expressed in the following form, following the notation of Slifker and Shapiro (1980):

$$
\frac{X - b_X}{a_X} = \kappa(Y_n) = \kappa\left(\frac{Y - b_Y}{a_Y}\right) \tag{3}
$$

where Yn = (Y-bY)/aY is the normalized Y. The SU member ('U' denotes 'unbounded') is unbounded and it is defined by

$$
\kappa(Y_n) = \sinh^{-1}(Y_n) = \ln\left(Y_n + \sqrt{1 + Y_n^2}\right) \tag{4}
$$

The SB member ('B' denotes 'bounded') is bounded between [bY, aY + bY] and it is defined by

$$
\kappa(Y_n) = \ln\left(\frac{Y_n}{1 - Y_n}\right) \tag{5}
$$

The SL member ('L' denotes 'lognormal'; this is the shifted lognormal member) is bounded from below by bY and it is defined by

$$
\kappa(Y_n) = \ln(Y_n) \tag{6}
$$

Their probability density functions are as follows:

$$
f(y \mid a_X, b_X, a_Y, b_Y) =
\begin{cases}
\dfrac{a_X}{a_Y\sqrt{2\pi}\sqrt{1 + y_n^2}} \exp\left\{-0.5\left[b_X + a_X \sinh^{-1}(y_n)\right]^2\right\} & \text{for SU} \\[2ex]
\dfrac{a_X}{a_Y\sqrt{2\pi}\, y_n (1 - y_n)} \exp\left\{-0.5\left[b_X + a_X \ln\left(\dfrac{y_n}{1 - y_n}\right)\right]^2\right\} & \text{for SB} \\[2ex]
\dfrac{a_X}{a_Y\sqrt{2\pi}\, y_n} \exp\left\{-0.5\left[b_X + a_X \ln(y_n)\right]^2\right\} & \text{for SL}
\end{cases} \tag{7}
$$
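As a minimal sketch of Eqs. (3) to (7), the three Johnson members can be coded as a forward map from Y to X, its inverse, and the density of Y. The function names (`kappa`, `kappa_inv`, `y_to_x`, `x_to_y`, `johnson_pdf`) are hypothetical and are not taken from the paper; the density is written as the standard normal density of X multiplied by dX/dY, which is how Eq. (7) is read here.

```python
import math

def kappa(yn: float, member: str) -> float:
    """Johnson kappa function of the normalized variable yn (Eqs. 4-6)."""
    if member == "SU":
        return math.asinh(yn)
    if member == "SB":
        return math.log(yn / (1.0 - yn))  # requires 0 < yn < 1
    if member == "SL":
        return math.log(yn)               # requires yn > 0
    raise ValueError(f"unknown Johnson member: {member}")

def kappa_inv(k: float, member: str) -> float:
    """Inverse of kappa: returns yn for a given kappa value."""
    if member == "SU":
        return math.sinh(k)
    if member == "SB":
        return 1.0 / (1.0 + math.exp(-k))
    if member == "SL":
        return math.exp(k)
    raise ValueError(f"unknown Johnson member: {member}")

def y_to_x(y, member, a_x, b_x, a_y, b_y):
    """Eq. (3) solved for X: X = bX + aX * kappa((Y - bY) / aY)."""
    return b_x + a_x * kappa((y - b_y) / a_y, member)

def x_to_y(x, member, a_x, b_x, a_y, b_y):
    """Inverse of Eq. (3): Y = bY + aY * kappa_inv((X - bX) / aX)."""
    return b_y + a_y * kappa_inv((x - b_x) / a_x, member)

def johnson_pdf(y, member, a_x, b_x, a_y, b_y):
    """Density of Y (Eq. 7): standard normal density of X times dX/dY."""
    yn = (y - b_y) / a_y
    x = b_x + a_x * kappa(yn, member)
    if member == "SU":
        dk = 1.0 / math.sqrt(1.0 + yn * yn)
    elif member == "SB":
        dk = 1.0 / (yn * (1.0 - yn))
    else:  # SL (kappa() above has already rejected unknown members)
        dk = 1.0 / yn
    return a_x * dk / (a_y * math.sqrt(2.0 * math.pi)) * math.exp(-0.5 * x * x)

# Illustrative parameters only (not from Table 3).
params = (1.2, -0.5, 10.0, -1.0)
x = y_to_x(0.5, "SB", *params)
print(x, x_to_y(x, "SB", *params), johnson_pdf(0.5, "SB", *params))
```

Keeping `kappa` separate from the affine part means the same `y_to_x` serves all three members, which mirrors the way Eq. (3) is stated.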
Slifker and Shapiro (1980) proposed an elegant selection and parameter estimation approach for the Johnson distribution using percentiles. Based on the percentiles of Y, the distribution type (SU, SB or SL) is first identified, and the distribution parameters (aX, bX, aY, bY) can then be estimated. The current study adopts a two-step approach to select the distribution type and estimate the parameters (aX, bX, aY, bY). In the first step, the percentile-based approach proposed by Slifker and Shapiro (1980) is adopted to select among SU, SB, and SL and to find preliminary estimates for (aX, bX, aY, bY). In the second step, the maximum likelihood method is adopted to find the (aX, bX, aY, bY) estimates, using the preliminary estimates from the first step as the initial point for the optimization search. The maximum likelihood estimates of (aX, bX, aY, bY) are those that maximize the following likelihood function:

$$
L(a_X, b_X, a_Y, b_Y) = \prod_{i=1}^{n} f\left(Y^{(i)} \mid a_X, b_X, a_Y, b_Y\right) \tag{8}
$$

where Y(i) is the i-th data point of Y; n is the total number of data points (n = 1217). The selected types and estimated parameters for the marginal PDFs of (Y1, Y2, …, Y5) obtained by this two-step approach are summarized in Table 3. The fitted marginal Johnson PDFs are plotted in Figure 1 together with the histograms for the Y data.

With the selected types and estimated parameters in Table 3, Eqs. (3)~(6) can be adopted to transform Y to a standard normal variable X. Consider the mapping from Y1 to X1 as an example. In Table 3, the distribution type SB is identified for Y1. Therefore, Eq. (5) should be adopted for the mapping:

$$
\frac{X_1 - b_X}{a_X} = \ln\left(\frac{Y_n}{1 - Y_n}\right) \tag{9}
$$

where Yn = (Y1–bY)/aY = [ln(σ'p/Pa)–bY]/aY. It is then clear that

$$
X_1 = b_X + a_X \ln\left[\frac{\left(\ln(\sigma'_p/P_a) - b_Y\right)/a_Y}{1 - \left(\ln(\sigma'_p/P_a) - b_Y\right)/a_Y}\right] = 13.475 + 4.328 \ln\left[\frac{\ln(\sigma'_p/P_a) + 4.396}{102.514 - \ln(\sigma'_p/P_a)}\right] \tag{10}
$$

The mappings from (Y2, …, Y5) to standard normal variables (X2, …, X5) can also be derived based on the same principle:

$$
\begin{aligned}
X_2 &= 0.249 + 2.756 \ln\left[\frac{\ln(\sigma'_{v0}/P_a) + 5.420}{5.099 - \ln(\sigma'_{v0}/P_a)}\right] \\
X_3 &= -0.827 + 1.190 \sinh^{-1}\left(\frac{PL - 19.494}{7.309}\right) \\
X_4 &= 2.214 + 1.262 \ln\left(\frac{LL - 15.652}{297.290 - LL}\right) \\
X_5 &= 1.815 + 1.326 \ln\left(\frac{w_n - 2.723}{259.443 - w_n}\right)
\end{aligned} \tag{11}
$$
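The mapping in Eq. (11) can be sketched in a few lines. The function name `to_standard_normal` is hypothetical, and the signs of the constants are an assumption: they are chosen so that the depth = 11.0 m case of the implementation example (σ'v0 = 67.5 kPa, PL = 36.8, LL = 72.9, wn = 69.4) returns the (X2, X3, X4, X5) values reported there.

```python
import math

PA = 101.3  # one atmosphere in kPa (Pa in the text)

def to_standard_normal(sigma_v0_kpa, pl, ll, wn):
    """Map (sigma'_v0, PL, LL, wn) to (X2, X3, X4, X5) with the generic Eq. (11).

    PL, LL and wn are in percent (e.g. wn = 69.4 for 69.4%).
    The signs of the constants are assumed, see the lead-in.
    """
    y2 = math.log(sigma_v0_kpa / PA)                                # Y2 of Eq. (1)
    x2 = 0.249 + 2.756 * math.log((y2 + 5.420) / (5.099 - y2))      # Johnson SB
    x3 = -0.827 + 1.190 * math.asinh((pl - 19.494) / 7.309)         # Johnson SU
    x4 = 2.214 + 1.262 * math.log((ll - 15.652) / (297.290 - ll))   # Johnson SB
    x5 = 1.815 + 1.326 * math.log((wn - 2.723) / (259.443 - wn))    # Johnson SB
    return x2, x3, x4, x5

x = to_standard_normal(67.5, 36.8, 72.9, 69.4)
print([round(v, 3) for v in x])
```

The printed values should agree with (-0.009, 1.074, 0.490, 0.426) to within rounding of the published coefficients.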
The normality of (X1, X2, …, X5) can be checked using the Kolmogorov-Smirnov (K-S) test (Conover, 1999). The p-values associated with the K-S test for (X1, X2, …, X5) are listed in Table 3. The p-values indicate that the Johnson PDF may be acceptable for (Y1, Y2, Y3, Y4), but the p-value for X5 indicates that Y5 cannot be plausibly modeled as a Johnson random variable. In fact, several distributions for Y5 are attempted, including lognormal, Gamma, and Gumbel distributions. All of them are skew-to-the-left bell-shaped distributions, similar to the histogram of Y5 (see Figure 1). The parameters of these distributions are also estimated using the maximum likelihood method, yet the resulting p-values are all smaller than 0.05, i.e., none of them passes the K-S test, either. We decide to adopt the Johnson SB distribution, because (a) (Y1, Y2, Y3, Y4) are already modeled by the Johnson system; and (b) the p-value for Y5 (0.007) produced by the Johnson SB distribution is not exceptionally low.

Estimation for the correlation matrix of (X1, X2, …, X5)

A key assumption of the translation approach is that (X1, X2, …, X5) follow the multivariate standard normal distribution. It is crucial to note that collectively (X1, X2, …, X5) do not necessarily follow a multivariate normal PDF even if each component is normally distributed. The multivariate normal distribution is an assumption of the translation approach. Multivariate normality requires separate checks. Based on the authors' experience, soil parameters do not strictly follow a multivariate normal distribution, even when they are transformed individually to normal random variables. Nonetheless, Ching and Phoon (2012, 2014b), Ching et al. (2014), and Ching et al. (2017b) demonstrated that meaningful results can be obtained even in the absence of rigorous justification for this multivariate normality assumption.

The multivariate standard normal PDF is defined uniquely by a correlation matrix C:

$$
f(\mathbf{x}) = |C|^{-\frac{1}{2}} (2\pi)^{-\frac{5}{2}} \exp\left(-\frac{1}{2}\mathbf{x}^T C^{-1} \mathbf{x}\right) \tag{12}
$$

where x = (x1, x2, …, x5)T is a standard normal random vector ("T" refers to vector/matrix transpose), and C is the correlation matrix:

$$
C = \begin{bmatrix}
1 & \delta_{12} & \delta_{13} & \delta_{14} & \delta_{15} \\
 & 1 & \delta_{23} & \delta_{24} & \delta_{25} \\
 & & 1 & \delta_{34} & \delta_{35} \\
 & & & 1 & \delta_{45} \\
\text{Sym.} & & & & 1
\end{bmatrix} \tag{13}
$$

where δij is the Pearson product-moment correlation coefficient between Xi and Xj. Ching et al. (2016c) showed that the equation proposed by Hotelling and Pabst (1936) is robust in estimating δij:

$$
\delta_{ij} = 2\sin\left(\frac{\pi r_{ij}}{6}\right) \tag{14}
$$

where rij is the Spearman (rank) correlation between Yi and Yj. Table 4 shows the estimated C matrix. It is a positive definite matrix. With the marginal PDFs in Table 3 and the C matrix in Table 4, the construction of the multivariate PDF for (Y1, Y2, …, Y5) is complete. This multivariate PDF will be referred to as the multivariate Johnson PDF.
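The conversion in Eq. (14) can be illustrated with a small, dependency-free sketch. The helper names (`ranks`, `pearson`, `spearman`, `delta_from_spearman`) and the sample data are hypothetical; the rank correlation is computed as the Pearson correlation of average ranks, which is one common definition and may differ in detail from the authors' implementation.

```python
import math

def ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = 0.5 * (i + j) + 1.0
        for k in range(i, j + 1):
            r[order[k]] = mean_rank
        i = j + 1
    return r

def pearson(u, v):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)

def spearman(u, v):
    """Spearman rank correlation r_ij between two samples."""
    return pearson(ranks(u), ranks(v))

def delta_from_spearman(r):
    """Eq. (14): Pearson correlation in standard normal space from r_ij."""
    return 2.0 * math.sin(math.pi * r / 6.0)

# Made-up sample, for illustration only.
y_a = [0.2, 0.5, 0.1, 0.9, 0.7]
y_b = [1.1, 1.9, 1.4, 3.5, 2.0]
print(delta_from_spearman(spearman(y_a, y_b)))
```

Because ranks are unchanged by the monotonic Johnson mappings, rij can be computed directly from the Y data, which is why Eq. (14) is stated in terms of Yi and Yj.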
BAYESIAN UPDATING FOR σ'p/Pa

The 5-dimensional multivariate Johnson PDF constructed above serves as the prior PDF for the Bayesian updating. The engineer should first judge whether the clay of interest falls within the range of clay types appearing in Table 2. If so, it is possible to update the PDF of Y1 = ln(σ'p/Pa) for the clay of interest based on the information for its (Y2, Y3, Y4, Y5). An attractive feature of the multivariate Johnson framework is that the updated PDF for Y1 = ln(σ'p/Pa) is still Johnson (Phoon and Ching, 2013; Ching and Phoon, 2015).
Bayesian updating for X1

Let us start the Bayesian derivations in the standard normal space. The random vector in standard normal space is partitioned into two subvectors:

$$
\mathbf{X} = \begin{bmatrix} X_1 \\ X_2 \\ X_3 \\ X_4 \\ X_5 \end{bmatrix} = \begin{bmatrix} X_1 \\ \mathbf{X}_{(2)} \end{bmatrix} \tag{15}
$$

where X(2) = (X2, X3, X4, X5)T. The correlation matrix can be partitioned accordingly as

$$
C = \begin{bmatrix}
1 & 0.834 & -0.380 & -0.331 & -0.726 \\
0.834 & 1 & -0.336 & -0.259 & -0.640 \\
-0.380 & -0.336 & 1 & 0.842 & 0.708 \\
-0.331 & -0.259 & 0.842 & 1 & 0.718 \\
-0.726 & -0.640 & 0.708 & 0.718 & 1
\end{bmatrix}
= \begin{bmatrix} 1 & C_{(12)} \\ C_{(21)} & C_{(22)} \end{bmatrix} \tag{16}
$$

The updated PDF of X1 given X(2) = (X2, X3, X4, X5)T is a normal PDF with the following mean and standard deviation:

$$
\begin{aligned}
\mu'_1 &= C_{(12)} C_{(22)}^{-1} \mathbf{X}_{(2)} = 0.575 X_2 + 0.056 X_3 + 0.119 X_4 - 0.483 X_5 \\
\sigma'_1 &= \sqrt{1 - C_{(12)} C_{(22)}^{-1} C_{(21)}} = 0.480
\end{aligned} \tag{17}
$$

where (X2, X3, X4, X5) can be computed using Eq. (11) based on (σ'v0/Pa, PL, LL, wn) for the clay of interest. Equation (17) indicates that the updated X1 is a normal variable with mean = μ'1 and standard deviation = σ'1, hence the updated X1 can be expressed as

$$
X_1 = \mu'_1 + \sigma'_1 X \tag{18}
$$

where X is standard normal.

Bayesian updating for Y1

Because Y1 is Johnson SB, the relationship between Y1 and X1 is (Eq. 5)

$$
X_1 = b_X + a_X \ln\left(\frac{Y_1 - b_Y}{a_Y + b_Y - Y_1}\right) \tag{19}
$$

where aX = 4.328, bX = 13.475, aY = 106.913, and bY = -4.396 (see Table 3). Combining Eqs. (18) and (19), we have

$$
\mu'_1 + \sigma'_1 X = 13.475 + 4.328 \ln\left(\frac{Y_1 + 4.396}{102.514 - Y_1}\right) \tag{20}
$$

or

$$
X = \frac{13.475 - \mu'_1}{\sigma'_1} + \frac{4.328}{\sigma'_1} \ln\left(\frac{Y_1 + 4.396}{102.514 - Y_1}\right) \tag{21}
$$

By comparing Eq. (5) with Eq. (21), it is clear that the updated PDF of Y1 is Johnson SB with the following updated parameters:

$$
\begin{aligned}
a'_X &= \frac{4.328}{\sigma'_1} = 9.026 \\
b'_X &= \frac{13.475 - \mu'_1}{\sigma'_1} = 28.102 - 1.199 X_2 - 0.116 X_3 - 0.247 X_4 + 1.008 X_5 \\
a'_Y &= 106.913 \\
b'_Y &= -4.396
\end{aligned} \tag{22}
$$
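Eqs. (16), (17) and (22) can be checked numerically with a short sketch. The off-diagonal signs of C used below are an assumption: they are the signs for which solving the partitioned system returns the weights 0.575, 0.056, 0.119 and -0.483 of Eq. (17). The helper `solve` is a hypothetical plain Gaussian elimination, included only to keep the block free of third-party dependencies.

```python
import math

# Correlation matrix of (X1, ..., X5), Eq. (16); off-diagonal signs are assumed.
C = [
    [1.000, 0.834, -0.380, -0.331, -0.726],
    [0.834, 1.000, -0.336, -0.259, -0.640],
    [-0.380, -0.336, 1.000, 0.842, 0.708],
    [-0.331, -0.259, 0.842, 1.000, 0.718],
    [-0.726, -0.640, 0.708, 0.718, 1.000],
]

def solve(a, b):
    """Solve a @ z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (m[r][n] - sum(m[r][c] * z[c] for c in range(r + 1, n))) / m[r][r]
    return z

c12 = C[0][1:]                    # C(12), equal to the transpose of C(21)
c22 = [row[1:] for row in C[1:]]  # C(22)

# C(22) is symmetric, so the weights C(12) C(22)^-1 solve C(22) beta = C(21).
beta = solve(c22, c12)
sigma1 = math.sqrt(1.0 - sum(b * c for b, c in zip(beta, c12)))  # Eq. (17)

# Updated Johnson SB parameters of Y1, Eq. (22), with aX = 4.328, bX = 13.475.
a_x, b_x = 4.328, 13.475
a_x_upd = a_x / sigma1
b_x_const = b_x / sigma1
b_x_coefs = [-b / sigma1 for b in beta]  # multipliers of X2..X5 in b'_X

print([round(b, 3) for b in beta], round(sigma1, 3))
print(round(a_x_upd, 3), round(b_x_const, 3), [round(c, 3) for c in b_x_coefs])
```

Since the published matrix is rounded to three decimals, the recomputed values are expected to match Eqs. (17) and (22) only approximately.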
The updated PDF for σ'p/Pa has the following analytical form:

$$
\text{Updated PDF } f(\sigma'_p/P_a) = \frac{a'_X\, a'_Y \exp\left\{-0.5\left[b'_X + a'_X \ln\left(\dfrac{\ln(\sigma'_p/P_a) - b'_Y}{a'_Y + b'_Y - \ln(\sigma'_p/P_a)}\right)\right]^2\right\}}{\sqrt{2\pi}\,(\sigma'_p/P_a)\left[\ln(\sigma'_p/P_a) - b'_Y\right]\left[a'_Y + b'_Y - \ln(\sigma'_p/P_a)\right]} \tag{23}
$$

The p-fractile for the updated σ'p/Pa, denoted by (σ'p/Pa)p, also has the following analytical form:

$$
(\sigma'_p/P_a)_p = \exp\left\{b'_Y + a'_Y\left[1 + \exp\left(-\frac{\Phi^{-1}(p) - b'_X}{a'_X}\right)\right]^{-1}\right\} \tag{24}
$$

where Φ-1 is the inverse cumulative distribution function (CDF) of the standard normal variable.

Point and interval estimates for σ'p/Pa

Substituting the (a'X, b'X, a'Y, b'Y) in Eq. (22) into Eq. (24), the median for the updated σ'p/Pa (letting p = 0.5 in Eq. 24) can be expressed as

$$
\text{Median for } \sigma'_p/P_a = \exp\left\{-4.396 + 106.913\left[1 + \exp\left(\frac{28.102 - 1.199X_2 - 0.116X_3 - 0.247X_4 + 1.008X_5}{9.026}\right)\right]^{-1}\right\} \tag{25}
$$

where (X2, X3, X4, X5) can be computed using Eq. (11) based on (σ'v0/Pa, PL, LL, wn) for the clay of interest. The (1-2p)100% confidence interval (CI) for the updated σ'p/Pa is the interval bounded by the p-fractile and (1-p)-fractile:

$$
\text{Upper \& lower bounds} = \exp\left\{-4.396 + 106.913\left[1 + \exp\left(-\frac{\pm\Phi^{-1}(1-p) - 28.102 + 1.199X_2 + 0.116X_3 + 0.247X_4 - 1.008X_5}{9.026}\right)\right]^{-1}\right\} \tag{26}
$$

In particular, the 95% confidence interval for σ'p/Pa can be computed by setting p = 0.025.
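A quick numerical consistency check between the updated PDF (Eq. 23) and the median (Eq. 25) is sketched below: integrating the PDF up to the median should give a probability of about 0.5. The function names and the trapezoidal integration are illustrative choices, and the value b'X = 28.102 used in the example corresponds to the hypothetical case X2 = X3 = X4 = X5 = 0.

```python
import math

A_Y, B_Y = 106.913, -4.396  # a'_Y and b'_Y of the generic model, Eq. (22)
A_X_UPD = 9.026             # a'_X of the generic model, Eq. (22)

def updated_pdf(s, b_x_upd):
    """Updated PDF of s = sigma'_p/Pa (Eq. 23); needs ln(s) inside (b'_Y, a'_Y + b'_Y)."""
    y = math.log(s)
    lo = y - B_Y
    hi = A_Y + B_Y - y
    x = b_x_upd + A_X_UPD * math.log(lo / hi)
    return A_X_UPD * A_Y * math.exp(-0.5 * x * x) / (math.sqrt(2.0 * math.pi) * s * lo * hi)

def median(b_x_upd):
    """Median of sigma'_p/Pa: Eq. (24) with p = 0.5, i.e. the form of Eq. (25)."""
    return math.exp(B_Y + A_Y / (1.0 + math.exp(b_x_upd / A_X_UPD)))

def cdf_numeric(s_hi, b_x_upd, s_lo=0.02, steps=20000):
    """Trapezoidal integral of the updated PDF from s_lo to s_hi.

    s_lo must exceed exp(b'_Y); the probability mass below 0.02 is negligible here.
    """
    h = (s_hi - s_lo) / steps
    total = 0.5 * (updated_pdf(s_lo, b_x_upd) + updated_pdf(s_hi, b_x_upd))
    total += sum(updated_pdf(s_lo + i * h, b_x_upd) for i in range(1, steps))
    return total * h

b_x_upd = 28.102  # all of X2..X5 equal to zero
m = median(b_x_upd)
print(m, cdf_numeric(m, b_x_upd))
```

The same check can be repeated for other fractiles by replacing the median with Eq. (24) evaluated at the probability of interest.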
VERIFICATION

The two main prediction equations are Eqs. (25) and (26). The purpose of this section is to verify whether these two equations are effective.

Verification of the median estimate

The effectiveness of the median estimate (Eq. 25) with respect to the 1217 data points of CLAY/10/7490 is verified. Because Eq. (25) is "trained" on the 1217 data points, it is not fair to verify Eq. (25) directly using the internal training data. Instead, the effectiveness of the median estimate is verified indirectly using cross-validation (Geisser, 1993). Recall that the 1217 data points are from 141 sites worldwide. The (σ'p/Pa, σ'v0/Pa, PL, LL, wn) data points for 140 sites are used to "train" the median equation in Eq. (25), then the trained equation (which will be slightly different from Eq. 25 because one site is left out) is used to predict the σ'p values of the data points for the left-out site using the (σ'v0/Pa, PL, LL, wn) information (we "pretend" their σ'p values are unknown). This cross-validation is conducted 141 times by switching among the 141 sites. By doing so, each of the 1217 data points has a predicted σ'p value, and the predicted value is based on the cross-validation, not on internal training data. Figure 2f compares the measured σ'p with the predicted σ'p. The performances of the five traditional transformation models in Table 1 are also demonstrated in Figures 2a~2e. The five traditional transformation models are verified directly using the 1217 data points, because these models were not trained by these data points. The effectiveness of each model is evaluated by three indices (Kootahi and Mayne, 2016): (a) coefficient of determination (R2); (b) coefficient of efficiency (E); and (c) mean absolute error (MAE):
14
ACCEPTED MANUSCRIPT n n n n Y1k Y1k* Y1k Y1k* k 1 k 1 k 1 R2 = 2 2 n n n 2 n * 2 * n Y Y n Y Y 1k 1k 1k 1k k 1 k 1 k 1 k 1
2
Y n
E 1
k 1
1k
Y1k*
2
1 n Y1k n Y1k k 1 k 1 n
2
MAE
1 n Y1k Y1k* n k 1
(27)
where n = 1217 is the total number of data points; Y1k is the measured value of ln(σ'p/Pa) for the k-th data point (k = 1, 2, …, n); Y1k* is the predicted value of ln(σ'p/Pa) for the k-th data point. The prediction is effective if R2 is large, E is large, and MAE is small. It is evident that the transformation model proposed by Kootahi and Mayne (2016) and the median estimate proposed in this study are superior. Notice that these two more accurate transformation models also need the most information, including (σ'v0, PL, LL, wn). In terms of the amount of required information, they are not the most convenient models. Models developed by Stas and Kulhawy (1984) and DeGroot et al. (1999) are more convenient because they only require LI. Note that the probabilistic transformation model proposed in this study provides not only a point estimate but also an interval estimate. The effectiveness of the interval estimate will be verified in the next section.

It is also possible to consolidate (Y3 = PL, Y4 = LL, Y5 = wn) into a single variable Y3 = LI. The (σ'p/Pa, σ'v0/Pa, LI) data points can be adopted to re-construct the multivariate Johnson distribution as well as to re-develop the median estimate of ln(σ'p/Pa). However, the performance of this LI-based probabilistic transformation model is worse than the one based on (PL, LL, wn) in terms of R2, E, and MAE.
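The three indices of Eq. (27) are straightforward to compute. The sketch below uses hypothetical function names and made-up numbers; `y` stands for the measured ln(σ'p/Pa) values and `y_hat` for the predicted ones.

```python
def r_squared(y, y_hat):
    """Coefficient of determination, Eq. (27): squared Pearson correlation."""
    n = len(y)
    sy, sp = sum(y), sum(y_hat)
    syp = sum(a * b for a, b in zip(y, y_hat))
    syy = sum(a * a for a in y)
    spp = sum(b * b for b in y_hat)
    return (n * syp - sy * sp) ** 2 / ((n * syy - sy ** 2) * (n * spp - sp ** 2))

def efficiency(y, y_hat):
    """Coefficient of efficiency E, Eq. (27): 1 - SSE / total sum of squares."""
    mean_y = sum(y) / len(y)
    sse = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    sst = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - sse / sst

def mae(y, y_hat):
    """Mean absolute error, Eq. (27)."""
    return sum(abs(a - b) for a, b in zip(y, y_hat)) / len(y)

# Made-up values: predictions that are perfectly correlated but biased by +1.
y = [1.0, 2.0, 3.0]
y_hat = [2.0, 3.0, 4.0]
print(r_squared(y, y_hat), efficiency(y, y_hat), mae(y, y_hat))
```

The example illustrates why the three indices are reported together: a constant bias leaves R2 at 1 but is penalized by both E and MAE.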
Verification of the interval estimate

The effectiveness of the interval estimate (Eq. 26) with respect to the 1217 data points is verified. The cross-validation is again adopted. After the cross-validation, each of the 1217 data points has a predicted confidence interval (CI) for σ'p/Pa. This CI can be compared with the measured σ'p/Pa to obtain a "yes" or "no" ("yes" means that the measured σ'p is within the CI). It is desirable that the percentage of "yes", called the cross-validated percentage in this study, is close to the "nominal" percentage, i.e., (1-2p)100%. Figure 3 shows how the cross-validated percentage varies with the nominal percentage. The vertical error bars show the statistical uncertainties in the cross-validated percentages, obtained by bootstrapping (Efron and Tibshirani, 1993). It is evident that the cross-validated percentage is generally close to the nominal percentage. This indicates that the probabilistic transformation model is effective, in the sense that the resulting "nominal CI" is close to the "genuine CI".

However, this conclusion of a close-to-genuine CI depends on the nature of the validation database. Later, we will show that the nominal CI ceases to be genuine if the validation database has characteristics that are substantially different from those of the 1217 data points.
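The validated percentage and its bootstrap uncertainty can be sketched as follows. The function names are hypothetical, and resampling individual data points with replacement is an assumption made for illustration; the text does not state which resampling unit was used for the error bars in Figure 3.

```python
import math
import random

def hit_flags(measured, lower, upper):
    """True where the measured value falls inside its predicted CI."""
    return [lo <= m <= hi for m, lo, hi in zip(measured, lower, upper)]

def validated_percentage(flags):
    """Percentage of 'yes' outcomes, to be compared with (1-2p)100%."""
    return 100.0 * sum(flags) / len(flags)

def bootstrap_std(flags, n_boot=1000, seed=0):
    """Standard deviation of the validated percentage over bootstrap resamples."""
    rng = random.Random(seed)
    n = len(flags)
    pct = []
    for _ in range(n_boot):
        resample = [flags[rng.randrange(n)] for _ in range(n)]
        pct.append(100.0 * sum(resample) / n)
    mean = sum(pct) / n_boot
    return math.sqrt(sum((v - mean) ** 2 for v in pct) / (n_boot - 1))

# Made-up numbers for illustration.
measured = [1.0, 2.0, 3.0, 4.0]
lower = [0.0, 0.0, 0.0, 5.0]
upper = [2.0, 3.0, 4.0, 6.0]
flags = hit_flags(measured, lower, upper)
print(validated_percentage(flags), bootstrap_std(flags))
```

Repeating the calculation for a grid of nominal percentages gives the kind of curve plotted against the nominal percentage.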
INDEPENDENT REGIONAL DATABASE

It is insightful to verify the performance of the proposed probabilistic transformation model with respect to another independent clay database. The F-CLAY/7/216 database developed by D'Ignazio et al. (2016) is adopted in this section. This database is a "regional" clay database consisting of 216 data points from 24 different test sites in Finland. The clay properties cover wide ranges of St (2~64), PI (2~95), OCR (1~7.5), and natural water content wn (25~150). 216 data points with simultaneous knowledge of (σ'p/Pa, σ'v0/Pa, PL, LL, wn) are extracted from this regional database. All data points have OCR < 20. None of the 216 data points overlaps with the 1217 data points from CLAY/10/7490. Table 5 shows the statistics for (σ'p/Pa, σ'v0/Pa, PL, LL, wn). Compared with Table 2, it is evident that the parameter ranges for F-CLAY/7/216 are significantly narrower than those for CLAY/10/7490. This is expected because F-CLAY/7/216 is a regional database. The purpose of this section is to verify whether Eqs. (25) and (26) are effective with respect to the regional database F-CLAY/7/216.
Verification of the median estimate

Because F-CLAY/7/216 is an external validation dataset, cross-validation is not needed. For each of the 216 data points, its σ'p is predicted by Eq. (25) using its (σ'v0/Pa, PL, LL, wn) information (again, we pretend its σ'p value is unknown). Figure 4f compares the measured σ'p with the predicted σ'p. The performances of the five traditional transformation models in Table 1 are also demonstrated in Figures 4a~4e. It is again evident that the transformation model proposed by Kootahi and Mayne (2016) and the median estimate proposed in this study are superior. It is remarkable that although both models were developed based on global databases, they provide satisfactory prediction results for the regional data points. The probabilistic transformation model proposed in this study also provides an interval estimate, to be presented in the next section.

Verification of the interval estimate

Figure 5 shows how the validated percentage varies with the nominal percentage. It is clear that the validated percentage is systematically larger than the nominal percentage. Namely, the CI predicted by Eq. (26) ceases to be genuine with respect to the regional database. For instance, the 95% CI predicted by Eq. (26) is in effect a 99% CI for the F-CLAY/7/216 database. This is because the calibration database for the probabilistic transformation model is a global database (i.e., the 1217 data points) with a wide coverage over clay types, but the validation database is a regional database with a narrower coverage. In particular, the global database is a mixture of contractive and dilative clays, whereas the regional database contains primarily contractive clays, as discussed below.
Kootahi and Mayne (2016) showed that the transformation model for σ'p depends on whether the clay is contractive or dilative. Two substantially different transformation models were proposed by Kootahi and Mayne (2016), one for contractive clays and the other for dilative clays. Figure 6 shows the OCR histograms for the two databases: CLAY/10/7490 and F-CLAY/7/216. CLAY/10/7490 is a mixture of contractive and dilative clays: about 80% of the data points are contractive (OCR < 3) and about 20% of the data points are dilative (OCR ≥ 3). F-CLAY/7/216 is predominantly contractive: only about 6% of the data points are dilative. This may explain why the predicted CI developed from CLAY/10/7490 is not genuine with respect to F-CLAY/7/216. The predicted CI has to be wide in order to accommodate the mixture of contractive and dilative data points in CLAY/10/7490. However, the predicted CI can be too wide for F-CLAY/7/216, which is predominantly contractive.
MODELS FOR CONTRACTIVE AND DILATIVE CLAYS

To resolve the issue of a non-genuine CI, it is desirable to develop probabilistic transformation models that are specialized for contractive and dilative clays.

Probabilistic transformation model for contractive clays

Among the 1217 data points in CLAY/10/7490, 982 data points are contractive (OCR < 3). By following the same steps as in the previous sections, these 982 data points are used to develop the following prediction equations:

Median for σ'p/Pa
(28)
1 8.389 2.634X 2 0.035X 3 0.223X 4 0.027X 5 exp 4.741 14.858 1 exp 10.989
The (1-2p)100% confidence interval: Upper & lower bounds
1 1 p 8.389 2.634X 2 0.035X 3 0.223X 4 0.027X 5 exp 4.741 14.858 1 exp 10.989
(29)
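Note that (X2, X3, X4, X5) are now computed from the following equations, rather than from Eq. (11):

$$
\begin{aligned}
X_2 &= 0.402 + 1.856 \ln\left[\frac{\ln(\sigma'_{v0}/P_a) + 3.529}{3.484 - \ln(\sigma'_{v0}/P_a)}\right] \\
X_3 &= -0.852 + 1.251 \sinh^{-1}\left(\frac{PL - 19.535}{8.345}\right) \\
X_4 &= 2.015 + 1.210 \ln\left(\frac{LL - 15.745}{284.813 - LL}\right) \\
X_5 &= 2.086 + 1.463 \ln\left(\frac{w_n - 4.341}{288.676 - w_n}\right)
\end{aligned} \tag{30}
$$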
Recall that F-CLAY/7/216 is primarily contractive, so the performance of Eqs. (28)~(30) can be verified by F-CLAY/7/216. Among the 216 data points in F-CLAY/7/216, 203 data points are contractive (OCR < 3). For the point estimate, Figure 7a compares the measured σ'p with the median value of σ'p predicted by Eq. (28) for these 203 contractive data points. The predictions are satisfactory. For the interval estimate, Figure 7b shows how the validated percentage varies with the nominal percentage. Figure 7b can be compared with Figure 5: the validated percentage is now closer to the nominal percentage in Figure 7b. This is probably because the CI is now developed from the 982 global contractive data points and the 203 regional data points are also contractive. The results in Figure 7 show that the probabilistic transformation model calibrated by global data points can perform well for regional data points with similar soil behavior characteristics. This observation is in contrast with the common criticism that transformation models calibrated by global data points are not applicable to regional or local data points.

Probabilistic transformation model for dilative clays

Among the 1217 data points in CLAY/10/7490, there are 235 data points with OCR ≥ 3. These 235 data points are used to develop the following prediction equations:

Median for σ'p/Pa
(31)
1 8.614 2.064X 2 0.364X 3 0.396X 4 1.493X 5 exp 2.696 7.097 1 exp 4.158
The (1-2p)100% confidence interval: Upper & lower bounds
1 1 p 8.614 2.064X 2 0.364X 3 0.396X 4 1.493X 5 exp 2.696 7.097 1 exp 4.158
(32)
Note that (X2 , X3 , X4 , X5 ) are now computed from the following equations: ln v0 Pa 6.269 X 2 0.010 2.579 ln 5.391 ln v0 Pa LL 10.277 X 4 4.537 2.147 sinh 10.424 1
PL 18.710 X 3 0.809 1.115 sinh 1 5.123 w n 8.653 X 5 1.904 0.961 ln 225.367 w n
(33)
The performance of Eqs. (31)~(33) is also verified by F-CLAY/7/216. Among the 216 data points in F-CLAY/7/216, 13 data points are dilative (OCR ≥ 3). Figure 8 compares the measured σ'p with the σ'p predicted by Eq. (31) for these 13 dilative data points. The predictions are satisfactory. The plot that compares the validated percentage with the nominal percentage is not presented, because the number of validation data points (13 data points) is too small to obtain meaningful statistics.
RI
IMPLEMENTATION EXAMPLE
This section demonstrates the estimation of the σ′p profile based on the (σ′v0, PL, LL, wn) profiles for a Norway site. Some data points for the Onsøy site (Norway) are extracted from D'Ignazio et al. (2016). The site investigation data for these data points are shown in Table 6. Three scenarios are considered: (a) there is no prior knowledge about the contractive/dilative behavior of the clays, so the equations for generic clays (Eqs. 11, 25, and 26) are used to update the σ′p estimates; (b) there is prior knowledge that the clays are contractive, so the equations for contractive clays (Eqs. 28~30) are used to update the σ′p estimates; (c) there is prior knowledge that the clays are dilative, so the equations for dilative clays (Eqs. 31~33) are used to update the σ′p estimates.

Scenario (a): no prior knowledge

If there is no prior knowledge about the contractive/dilative behavior, the generic equations (Eqs. 11, 25, and 26) can be used to update the σ′p estimates. Consider the case in Table 6 with depth = 11.0 m: σ′v0 = 67.5 kPa, PL = 36.8, LL = 72.9, and wn = 69.4. The steps for predicting its σ′p are as follows:

1. Determine (Y2, Y3, Y4, Y5): Y2 = ln(σ′v0/Pa) = -0.406, Y3 = 36.8, Y4 = 72.9, and Y5 = 69.4.
2. Determine (X2, X3, X4, X5) using Eq. (11): (X2, X3, X4, X5) = (-0.009, 1.074, 0.490, 0.426).
3. Update the median estimate for σ′p/Pa using Eq. (25). The resulting median is σ′p/Pa = 1.060, i.e., σ′p = 107.3 kPa.
4. Update the interval estimate using Eq. (26). Consider the 95% CI (letting p = 0.025 in Eq. 26). The resulting 95% CI for σ′p/Pa is [0.457, 2.955], i.e., σ′p ∈ [46.3, 299.3] kPa.
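Steps 1 and 2 can be sketched in code. The sketch assumes that Eq. (11) takes the standard Johnson forms, X = bX + aX ln[(Y - bY)/(aY + bY - Y)] for the SB type and X = bX + aX sinh^-1[(Y - bY)/aY] for the SU type, with the parameters listed in Table 3, and that Pa = 101.3 kPa. The function and variable names are illustrative only.

```python
import math

PA_KPA = 101.3  # assumed value of one atmosphere (kPa)

# (Johnson type, aX, bX, aY, bY) for Y2..Y5, as listed in Table 3
JOHNSON = {
    "Y2": ("SB", 2.756, 0.249, 10.518, -5.420),    # ln(sigma'_v0 / Pa)
    "Y3": ("SU", 1.190, -0.827, 7.309, 19.494),    # PL
    "Y4": ("SB", 1.262, 2.214, 281.639, 15.652),   # LL
    "Y5": ("SB", 1.326, 1.815, 256.720, 2.723),    # wn
}


def to_standard_normal(name, y):
    """Map Y_i to X_i with the assumed Johnson SB/SU forms."""
    kind, a_x, b_x, a_y, b_y = JOHNSON[name]
    if kind == "SB":
        return b_x + a_x * math.log((y - b_y) / (a_y + b_y - y))
    return b_x + a_x * math.asinh((y - b_y) / a_y)


def index_properties_to_x(sigma_v0_kpa, pl, ll, wn):
    """Step 1 (form Y2..Y5) and step 2 (transform to X2..X5)."""
    y = {"Y2": math.log(sigma_v0_kpa / PA_KPA), "Y3": pl, "Y4": ll, "Y5": wn}
    x = {k.replace("Y", "X"): to_standard_normal(k, v) for k, v in y.items()}
    return y, x


# Depth = 11.0 m case from Table 6
y, x = index_properties_to_x(67.5, 36.8, 72.9, 69.4)
print({k: round(v, 3) for k, v in x.items()})
```

For the depth = 11.0 m case the sketch agrees with the reported (X2, X3, X4, X5) to within about 0.002; the small differences are consistent with rounding of the tabulated parameters.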
The above steps are repeated for all depths in Table 6. Figure 9a shows the profile of the updated estimates for σ′p. The grey squares are the medians for σ′p, whereas the grey horizontal line segments are the 95% CIs.

Scenario (b): prior knowledge for contractive clays
If there is prior knowledge indicating that the clays are contractive, the equations for contractive clays (Eqs. 28~30) should be used to update the σ′p estimates. Figure 9b shows the profile of the updated σ′p estimates. It is evident that with the prior knowledge, the 95% CIs become significantly narrower than those in Figure 9a. This shows the value of prior knowledge: uncertainty can be significantly reduced by past experience. The case with depth = 1.9 m is a false judgment: it is a dilative clay (OCR = 5.0). This explains why the 95% CI at depth = 1.9 m does not contain the measured σ′p.

Scenario (c): prior knowledge for dilative clays

Suppose that there is prior knowledge indicating that the clay at depth = 1.9 m is dilative. The equations for dilative clays (Eqs. 31~33) should be used to update the σ′p estimates. The red horizontal line in Figure 9b shows the resulting 95% CI for depth = 1.9 m: now the 95% CI contains the measured σ′p.
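The false judgment at depth = 1.9 m can be checked against the measured values in Table 6. The sketch below is for illustration only: it assumes that OCR = 3 is the boundary between contractive and dilative behavior, and it uses the measured σ′p, which is not available in a real prediction, where the classification would have to come from other prior knowledge.

```python
# Rows of Table 6: depth (m), sigma'_v0 (kPa), measured sigma'_p (kPa)
TABLE_6 = [
    (1.9, 12.2, 61.1),
    (3.5, 22.4, 48.2),
    (5.5, 34.3, 46.1),
    (7.9, 48.9, 56.3),
    (11.0, 67.5, 86.9),
    (13.6, 83.5, 107.0),
    (16.3, 99.8, 100.2),
]
OCR_THRESHOLD = 3.0  # assumed boundary: dilative if OCR exceeds this value


def classify(sigma_v0, sigma_p, threshold=OCR_THRESHOLD):
    """Return (OCR, behavior label) for one data point."""
    ocr = sigma_p / sigma_v0
    return ocr, ("dilative" if ocr > threshold else "contractive")


for depth, sigma_v0, sigma_p in TABLE_6:
    ocr, label = classify(sigma_v0, sigma_p)
    print(f"depth {depth:4.1f} m: OCR = {ocr:.1f} -> {label}")
```

Only the depth = 1.9 m point is classified as dilative, which is why the contractive equations are inappropriate for that depth.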
CONCLUSIONS

There is significant transformation uncertainty in the transformation model that predicts the preconsolidation stress (σ′p) based on the index properties of a clay. A limitation of traditional transformation models is that they only provide point estimates for σ′p, and the associated transformation uncertainty is neither treated nor quantified. This study develops probabilistic transformation models that overcome this limitation: they are able to predict the probability density function of σ′p based on the index properties. Three probabilistic transformation models are developed. One is a generic model that can accommodate both contractive and dilative clays. The transformation uncertainty of this model is the largest. The other two are specialized for contractive and dilative clays. Their implementation requires prior knowledge about the contractive/dilative behavior of the clay of interest. In return, the transformation uncertainties of these specialized models are smaller than that of the generic model. The performances of the three probabilistic transformation models are verified by statistical cross-validation and by an independent validation database.

An important observation made in this study is that the probabilistic transformation model calibrated by global data points can perform well for regional data points with similar soil behavior characteristics. This observation is in contrast with the common criticism that transformation models calibrated by global data points are not applicable to regional or local data points. More research is needed to clarify the main factors that affect the applicability of generic transformation models to regional or local cases.
REFERENCES

CEN, 2004. EN 1997-1:2004 Geotechnical Design – Part 1: General Rules.
Ching, J., Phoon, K.K., 2012. Modeling parameters of structured clays as a multivariate normal distribution. Canadian Geotechnical Journal, 49(5), 522-545.
Ching, J., Phoon, K.K., 2014a. Transformations and correlations among some clay parameters – the global database. Canadian Geotechnical Journal, 51(6), 663-685.
Ching, J., Phoon, K.K., 2014b. Correlations among some clay parameters – the multivariate distribution. Canadian Geotechnical Journal, 51(6), 686-704.
Ching, J., Phoon, K.K., Chen, C.H., 2014. Modeling CPTU parameters of clays as a multivariate normal distribution. Canadian Geotechnical Journal, 51(1), 77-91.
Ching, J., Phoon, K.K., 2015. Constructing multivariate distributions for soil parameters. Chap. 1 in Risk and Reliability in Geotechnical Engineering (Eds.: K.K. Phoon and J. Ching). Taylor & Francis.
Ching, J., Lin, G.H., Chen, J.R., Phoon, K.K., 2017a. Transformation models for effective friction angle and relative density calibrated based on a multivariate database of coarse-grained soils. Canadian Geotechnical Journal (in press).
Ching, J., Lin, G.H., Phoon, K.K., Chen, J.R., 2017b. Correlations among some parameters of coarse-grained soils – the multivariate probability distribution model. Canadian Geotechnical Journal (in press).
Ching, J., Phoon, K.K., Li, D.Q., 2016c. Robust estimation of correlation coefficients among soil parameters under the multivariate normal framework. Structural Safety, 63, 21-32.
Conover, W.J., 1999. Practical Nonparametric Statistics. 3rd edition, John Wiley & Sons, Inc., New York.
DeGroot, D.J., Knudsen, S., Lunne, T., 1999. Correlations among pc, su, and index properties for offshore clays. Proc., Int. Conf. on Offshore and Nearshore Geotechnical Engineering, Balkema, Rotterdam, Netherlands, 173-178.
D'Ignazio, M., Phoon, K.K., Tan, S.A., Lansivaara, T., 2016. Correlations for undrained shear strength of Finnish soft clays. Canadian Geotechnical Journal, 53(10), 1628-1645.
Efron, B., Tibshirani, R., 1993. An Introduction to the Bootstrap. Chapman and Hall/CRC, Boca Raton, FL.
Feng, X., Jimenez, R., 2015. Estimation of deformation modulus of rock masses based on Bayesian model selection and Bayesian updating approach. Engineering Geology, 199, 19-27.
Geisser, S., 1993. Predictive Inference. Chapman and Hall, New York, NY.
Hotelling, H., Pabst, M.R., 1936. Rank correlation and tests of significance involving no assumption of normality. Annals of Mathematical Statistics, 7, 29-43.
Kootahi, K., Mayne, P.W., 2016. Index test method for estimating the effective preconsolidation stress in clay deposits. ASCE Journal of Geotechnical and Geoenvironmental Engineering, 142(1), 04016049.
Li, D.Q., Wu, S.B., Zhou, C.B., Phoon, K.K., 2012. Performance of translation approach for modeling correlated non-normal variables. Structural Safety, 39, 52-61.
Liu, P.L., Der Kiureghian, A., 1986. Multivariate distribution models with prescribed marginals and covariances. Probabilistic Engineering Mechanics, 1(2), 105-112.
Mayne, P.W., Kulhawy, F.H., 1982. K0-OCR relationships in soil. ASCE Journal of Geotechnical Engineering, 108(GT6), 851-872.
Mesri, G., 1975. Discussion on “New design procedure for stability of soft clays”. ASCE Journal of the Geotechnical Engineering Division, 101(4), 409-412.
Mesri, G., 1989. A re-evaluation of su(mob) = 0.22σ′p using laboratory shear tests. Canadian Geotechnical Journal, 26(1), 162-164.
Nagaraj, T.S., Srinivasa Murthy, B.R., 1986. Prediction of compressibility of overconsolidated uncemented soils. ASCE Journal of Geotechnical Engineering, 112(4), 484-488.
Ng, I.T., Yuen, K.V., Dong, L., 2016. Nonparametric estimation of undrained shear strength for normally consolidated clays. Marine Georesources and Geotechnology, 34(2), 127-137.
Phoon, K.K., Ching, J., 2013. Multivariate model for soil parameters based on Johnson distributions. Foundation Engineering in the Face of Uncertainty, Geotechnical Special Publication honoring Professor F. H. Kulhawy, 337-353.
Slifker, J.F., Shapiro, S.S., 1980. The Johnson system: selection and parameter estimation. Technometrics, 22(2), 239-246.
Stas, C.V., Kulhawy, F.H., 1984. Critical evaluation of design methods for foundations under axial uplift and compressive loading. Electric Power Research Institute, Palo Alto, Calif. Report EL-3771.
Wang, Y., Au, S.K., Cao, Z., 2010. Bayesian approach for probabilistic characterization of sand friction angles. Engineering Geology, 114(3-4), 354-363.
Wang, Y., Cao, Z., 2013. Probabilistic characterization of Young's modulus of soil using equivalent samples. Engineering Geology, 159, 106-118.
Yan, W.M., Yuen, K.V., Yoon, G.L., 2009. Bayesian probabilistic approach for the correlations of compression index for marine clays. ASCE Journal of Geotechnical and Geoenvironmental Engineering, 135(12), 1932-1940.
Table 1 Biases and coefficients of variation (COVs) for several σ′p transformation models.

Literature                            Transformation model                                                      n      Bias    COV
Stas and Kulhawy (1984)               σ′p/Pa = 10^(1.11 − 1.62LI) (for St < 10)                                 248    1.13*   1.36*
Nagaraj and Srinivasa Murthy (1986)   σ′p (kPa) = 10^[5.97 − 5.32(wn/LL) − 0.25 log10(σ′v0 (kPa))]              1242   1.38*   3.46*
DeGroot et al. (1999)                 σ′p (kPa) = 10^(2.90 − 0.96LI)                                            1313   1.86*   1.20*
Ching and Phoon (2012)                σ′p/Pa = 0.235 LI^(−1.319) St^(0.536)                                     506    1.36    0.89
Kootahi and Mayne (2016)              σ′p/Pa = 1.62 (σ′v0/Pa)^0.89 (LL)^0.12 (wn)^(−0.14) if DS** > 1.123;      1242   1.10    0.67
                                      σ′p/Pa = 7.94 (σ′v0/Pa)^0.71 (LL)^0.53 (wn)^(−0.71) if DS ≤ 1.123

Pa = one atmosphere pressure; LI = liquidity index; St = sensitivity; wn = natural water content; LL = liquid limit; σ′v0 = in-situ effective vertical stress; n = number of calibration cases; Bias and COV are the calibration results.
* The ratios (measured σ′p)/(predicted σ′p) for some cases are extremely large, so an alternative additive definition (Ching et al., 2017a) for bias and COV is adopted: bias = (sample mean of measured σ′p)/(sample mean of predicted σ′p), standard deviation = sample standard deviation of (measured σ′p – bias × predicted σ′p), and COV = (standard deviation)/(sample mean of measured σ′p).
** DS = 5.152 log10(σ′v0/Pa) − 0.061 LL − 0.093 PL + 6.219 en, where en is the natural void ratio. It may be estimated as en ≈ 2.65 wn/100 (wn in %).
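The additive definition of bias and COV in the first footnote can be written as a short sketch. The function name and the numerical values are hypothetical and serve only to illustrate the arithmetic.

```python
from statistics import mean, stdev


def additive_bias_and_cov(measured, predicted):
    """Additive bias and COV as described in the footnote of Table 1.

    bias = mean(measured) / mean(predicted)
    sd   = sample standard deviation of (measured - bias * predicted)
    COV  = sd / mean(measured)
    """
    bias = mean(measured) / mean(predicted)
    residuals = [m - bias * p for m, p in zip(measured, predicted)]
    sd = stdev(residuals)  # sample standard deviation (n - 1 denominator)
    return bias, sd / mean(measured)


# Hypothetical measured and predicted values (kPa), for illustration only
measured = [80.0, 120.0, 150.0, 200.0]
predicted = [70.0, 110.0, 160.0, 170.0]
bias, cov = additive_bias_and_cov(measured, predicted)
print(round(bias, 3), round(cov, 3))
```

Because the statistics are built from sample means rather than from the individual ratios, a few cases with very large (measured)/(predicted) ratios do not dominate the result.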
Table 2 Statistics for (σ′p/Pa, σ′v0/Pa, PL, LL, wn) & OCR for 1217 data points in CLAY/10/7490.

Clay parameter   Random variable   Mean    Standard deviation   Min     Max
σ′p/Pa           –                 2.48    4.27                 0.08    48.49
ln(σ′p/Pa)       Y1                0.26    1.04                 -2.55   3.88
σ′v0/Pa          –                 1.03    1.16                 0.03    14.96
ln(σ′v0/Pa)      Y2                -0.39   0.92                 -3.60   2.71
PL               Y3                27.3    11.1                 0.6     76.5
LL               Y4                64.4    30.5                 20.3    236.4
wn               Y5                60.9    31.7                 10.3    193.4
OCR              –                 2.47    2.48                 0.46*   20

* There are six data points with OCR significantly less than 1 (under-consolidated).
Table 3 Johnson PDF types and parameters for (Y1, Y2, …, Y5).

Random variable   Clay parameter   Distribution type   aX      bX       aY        bY       p-value
Y1                ln(σ′p/Pa)       SB                  4.328   13.475   106.913   -4.396   0.59
Y2                ln(σ′v0/Pa)      SB                  2.756   0.249    10.518    -5.420   0.56
Y3                PL               SU                  1.190   -0.827   7.309     19.494   0.45
Y4                LL               SB                  1.262   2.214    281.639   15.652   0.18
Y5                wn               SB                  1.326   1.815    256.720   2.723    0.007

(aX, bX, aY, bY) are the distribution parameters.
Table 4 Estimated C matrix (entries are δij).

      X1      X2      X3       X4       X5
X1    1.000   0.834   -0.380   -0.331   -0.726
X2            1.000   -0.336   -0.259   -0.640
X3                    1.000    0.842    0.708
X4    Symmetric                1.000    0.718
X5                                      1.000
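The C matrix and the Johnson parameters for Y1 are sufficient to sketch how a median and a confidence interval for σ′p/Pa can be obtained. The sketch below is not a reproduction of Eqs. (25) and (26): it assumes that (X1, ..., X5) are jointly standard normal with correlation matrix C, applies the standard conditional-normal formulas, and inverts the assumed Johnson SB form X1 = bX + aX ln[(Y1 - bY)/(aY + bY - Y1)]. A small pure-Python solver is used so that the block has no external dependencies; all names are illustrative.

```python
import math
from statistics import NormalDist

# Correlation matrix C from Table 4 (order X1..X5)
C = [
    [1.000, 0.834, -0.380, -0.331, -0.726],
    [0.834, 1.000, -0.336, -0.259, -0.640],
    [-0.380, -0.336, 1.000, 0.842, 0.708],
    [-0.331, -0.259, 0.842, 1.000, 0.718],
    [-0.726, -0.640, 0.708, 0.718, 1.000],
]
# Johnson SB parameters for Y1 = ln(sigma'_p / Pa) from Table 3
A_X, B_X, A_Y, B_Y = 4.328, 13.475, 106.913, -4.396


def solve(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    z = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (m[r][n] - s) / m[r][r]
    return z


def x1_to_ratio(x1):
    """Invert the assumed SB transform: X1 -> Y1 -> sigma'_p / Pa."""
    y1 = B_Y + A_Y / (1.0 + math.exp((B_X - x1) / A_X))
    return math.exp(y1)


def predict(x_obs, p=0.025):
    """Median and (1 - 2p) x 100% CI of sigma'_p / Pa given (X2..X5)."""
    c12 = C[0][1:]                       # correlations between X1 and X2..X5
    c22 = [row[1:] for row in C[1:]]     # correlation matrix of X2..X5
    beta = solve(c22, c12)               # weights of the conditional mean
    cond_mean = sum(b * x for b, x in zip(beta, x_obs))
    cond_std = math.sqrt(1.0 - sum(b * c for b, c in zip(beta, c12)))
    z = NormalDist().inv_cdf(1.0 - p)
    return (x1_to_ratio(cond_mean),
            x1_to_ratio(cond_mean - z * cond_std),
            x1_to_ratio(cond_mean + z * cond_std))


# (X2, X3, X4, X5) reported for the depth = 11.0 m case
median, lower, upper = predict([-0.009, 1.074, 0.490, 0.426])
print(round(median, 3), round(lower, 3), round(upper, 3))
```

Under these assumptions the sketch agrees with the values reported for the depth = 11.0 m case (median 1.060, 95% CI [0.457, 2.955]) to within about 1%, which is consistent with rounding of the tabulated parameters.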
Table 5 Statistics for (σ′p/Pa, σ′v0/Pa, PL, LL, wn) & OCR for 216 data points in F-CLAY/7/216.

Clay parameter   Random variable   Mean    Standard deviation   Min     Max
σ′p/Pa           –                 0.79    0.40                 0.20    2.27
ln(σ′p/Pa)       Y1                -0.35   0.48                 -1.62   0.82
σ′v0/Pa          –                 0.46    0.22                 0.07    1.61
ln(σ′v0/Pa)      Y2                -0.88   0.47                 -2.60   0.48
PL               Y3                27.7    5.7                  10      50
LL               Y4                66.3    19.8                 22      125
wn               Y5                76.3    20.5                 25      150
OCR              –                 1.84    0.94                 1       7.5
Table 6 Some site investigation data for the Onsøy site (Norway).

Depth (m)   σ′v0 (kPa)   PL (%)   LL (%)   wn (%)   σ′p (kPa)   OCR
1.9         12.2         32.1     50.2     65.1     61.1        5.0
3.5         22.4         29.4     59.9     57.6     48.2        2.2
5.5         34.3         34.0     56.4     58.9     46.1        1.3
7.9         48.9         34.9     66.2     65.8     56.3        1.2
11.0        67.5         36.8     72.9     69.4     86.9        1.3
13.6        83.5         35.6     71.5     68.9     107.0       1.3
16.3        99.8         37.9     72.7     64.5     100.2       1.0
Figure 1 Histograms for (Y1, Y2, Y3, Y4, Y5). The red lines are the fitted Johnson PDFs.

Figure 2 Comparison between measured σ′p and predicted σ′p for various models (1217 data points in CLAY/10/7490).

Figure 3 Comparison between cross-validated percentage and nominal percentage (with respect to 1217 data points in CLAY/10/7490).

Figure 4 Comparison between measured σ′p and predicted σ′p for various models (216 data points in F-CLAY/7/216).

Figure 5 Comparison between validated percentage and nominal percentage (with respect to 216 data points in F-CLAY/7/216).

Figure 6 OCR histograms for CLAY/10/7490 and F-CLAY/7/216: (a) 1217 data points in CLAY/10/7490; (b) F-CLAY/7/216.

Figure 7 Verification results for probabilistic transformation model developed based on 983 contractive data points (with respect to 203 contractive data points in F-CLAY/7/216): (a) measured σ′p versus predicted σ′p; (b) validated % versus nominal %.

Figure 8 Verification results for probabilistic transformation model developed based on 235 dilative data points (with respect to 13 dilative data points in F-CLAY/7/216).

Figure 9 Profiles of updated estimates for σ′p: (a) without prior knowledge; (b) with prior knowledge.
Highlights

A transformation model for preconsolidation stress based on clay index properties is developed.
The model is probabilistic because it predicts the PDF of preconsolidation stress.
Validation shows that the model outperforms previous models proposed in the literature.
Analytical forms for the model are available. Engineers do not need to derive Bayesian equations.