Accepted Manuscript

Model parametrization strategies for Newton-based acoustic full waveform inversion

Amsalu Y. Anagaw, Mauricio D. Sacchi

PII: S0926-9851(17)30730-9
DOI: doi:10.1016/j.jappgeo.2018.06.004
Reference: APPGEO 3532
To appear in: Journal of Applied Geophysics
Received date: 4 August 2017
Revised date: 16 March 2018
Accepted date: 3 June 2018

Please cite this article as: Amsalu Y. Anagaw, Mauricio D. Sacchi, Model parametrization strategies for Newton-based acoustic full waveform inversion. Appgeo (2017), doi:10.1016/j.jappgeo.2018.06.004

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Model parametrization strategies for Newton-based acoustic Full Waveform Inversion

Amsalu Y. Anagaw 1,* and Mauricio D. Sacchi 1

1 Department of Physics, University of Alberta, Edmonton, AB, Canada T6G 2E1

* Corresponding author: [email protected]
Abstract

Full waveform inversion (FWI) is an efficient engine for estimating subsurface model parameters. FWI is often implemented via local optimization methods such as conjugate gradients or steepest descent. In this paper, we examine the effect of model parameterization on the estimation of velocity models obtained via Newton-based FWI methods. We consider three model parameterizations for acoustic FWI: velocity, slowness and slowness squared. We numerically demonstrate that model parametrization significantly influences the convergence rate of the Newton method. A high convergence rate for Newton-based FWI algorithms can be attained by using a slowness squared model parameterization. The latter is true for data with a relatively high signal-to-noise ratio. However, in the case of data contaminated by high levels of noise, the slowness parameterization provides a good trade-off between the resolution of the reconstructed model parameters and the convergence rate in comparison to the velocity or slowness squared parameterizations. Numerical experiments were conducted with the Marmousi velocity model to support our premises.
Keywords: Full waveform inversion, Numerical modelling, Wave propagation, Optimization
1 Introduction
The main goal of full waveform inversion (FWI) is to estimate high-resolution images of subsurface structures by minimizing the misfit between observed data and modelled seismograms (Lailly, 1983; Tarantola, 1984, 1987; Stekl and Pratt, 1998; Pratt et al., 1998; Operto et al., 2007; Virieux and Operto, 2009). Various iterative optimization methods based on either gradient or Newton techniques have been proposed. These techniques include steepest-descent, nonlinear conjugate gradient (NCG) (Hestenes and Stiefel, 1952), limited memory BFGS (l-BFGS) (Nocedal, 1980), Gauss-Newton (GN) and full Newton (FN) optimization (Pratt et al., 1998; Brossier et al., 2010; Hu et al., 2011; Herrmann et al., 2011; Anagaw and Sacchi). For large-scale problems, the computational cost of simulating multiple sources at every iteration makes gradient-based methods a practical and popular optimization engine for FWI. However, gradient-based optimization methods converge slowly, and finding the proper scaling operator that needs to be applied to the gradient can be a challenge. A common method to improve the rate of convergence of gradient-based optimization algorithms is to scale the gradient by the diagonal of the pseudo-Hessian matrix (Shin et al., 2001).
Newton-based methods, which require the Hessian matrix, have a fast convergence rate compared to gradient-based methods. However, the rapid convergence of the Newton-based approach comes at the expense of inverting a large dense Hessian matrix. An economical way of approximating the inverse of the Hessian matrix is the limited memory BFGS (l-BFGS) method. The l-BFGS method approximates the inverse of the Hessian matrix without explicitly computing or storing it. Instead, it approximates a positive-definite Hessian matrix from a limited history of gradient vectors and model perturbation updates.
Even though Newton-based methods have a faster convergence rate, the choice of model parameterization influences the structure of the Hessian matrix and thereby affects the convergence rate and the quality of the reconstructed model parameters. In acoustic full waveform inversion, one often adopts a velocity parameterization. However, other types of model parameterization, such as slowness and slowness squared, can also be employed (Herrmann et al., 2011; Anagaw and Sacchi, 2014). These three model parameterizations do not change the shape of the misfit function, but they do affect the gradient. Therefore, model parameterization influences the convergence rate and the accuracy of the inversion. In this paper, we study the impact of model parameterization in Newton-based acoustic full waveform inversion for velocity estimation. Our focus is on velocity model building, but the study of model parameterization is also of prominent importance for multiparameter FWI. In fact, our analysis could be extended to, for example, joint velocity and density estimation. For our numerical inversion, we adopt a multiscale FWI approach in the frequency domain, where a finite set of discrete frequencies is selected and the inversion is carried out sequentially from low- to high-frequency data components (Sirgue and Pratt, 2004). First, long-wavelength components of the model parameters are recovered from the low-frequency data, and then more details are extracted as the inversion proceeds to higher frequencies. In the multiscale inversion technique, model parameters obtained from the inversion of low-frequency data are used as a starting model for the higher-frequency data components.
The paper is organized as follows. First, we describe the formulation and implementation of the Newton-based method for FWI. We use a matrix-free conjugate gradient iterative technique to approximate the solution of the Newton equations. The matrix-free Newton-based method is formulated using the second-order Lagrangian formulation. The gradient of the objective function can be obtained directly from the first-order adjoint-state formulation (Plessix, 2006). Then, the Singular Value Decomposition (SVD) of the Hessian matrix extracted from a small portion of the Marmousi velocity model is used to analyze model parameterization.
The matrix-free Gauss-Newton (GN) and full Newton (FN) methods were used in our numerical experiments. Numerical experiments based on the Marmousi data set were used to highlight the influence of model parameterization in gradient-based and Newton-based methods for full waveform inversion. Finally, we address the sensitivities of the three model parameterizations to noise using the Newton-based algorithms.
2 Theory
We start by considering the acoustic forward modelling problem. We consider the two-dimensional acoustic wave equation in the frequency domain (Hu et al., 2009)
\frac{\omega^2}{\kappa}\, p(\omega) + \frac{\partial}{\partial x}\left(\frac{1}{\rho}\frac{\partial p(\omega)}{\partial x}\right) + \frac{\partial}{\partial z}\left(\frac{1}{\rho}\frac{\partial p(\omega)}{\partial z}\right) = -f(\omega),   [1]
where κ = ρv² is the bulk modulus, ρ is the density, v is the velocity, p is the pressure field, ω is the angular frequency and f(ω) is the source signature. For notational simplicity, the dependency of the variables on the spatial positions is not written explicitly. After discretizing Eq. (1) via an optimal 9-point stencil staggered-grid finite-difference approximation (Jo et al., 1996; Hustedt et al., 2004), the acoustic wave equation for a single source and frequency can be written in the compact matrix form

A(m, \omega)\, p_s(\omega) = f_s(\omega),   [2]

where A(m, ω) ∈ ℂ^{N×N} is the discrete Helmholtz operator, p_s(ω) ∈ ℂ^N, f_s(ω) ∈ ℂ^N, m ∈ ℝ^N is the vector of model parameters, s is the index number of the sources and N is the number of model parameters. The model m can be parameterized in terms of velocity (v), slowness (v⁻¹) or slowness squared (v⁻²). Typically, Eq. (2) is solved using a direct solver based on the multi-frontal LU decomposition of the finite-difference Helmholtz operator A (Amestoy et al., 2001, 2006; Schenk and Gartner, 2004). This operator is quite sparse and, therefore, storable in memory. Once the decomposition is performed for a given frequency ω and a given background velocity, the pressure fields can be efficiently solved for multiple sources using forward and backward substitutions. During modelling, in order to avoid boundary reflections, we apply a perfectly matched layer absorbing boundary condition (Berenger, 1994) to absorb outgoing waves at the boundaries. The modelled data at the receiver positions, d_s^{cal}(ω), can then be computed as

d_s^{cal}(\omega) = r\, p_s(\omega),   [3]

where r ∈ ℝ^{N_r×N} is a receiver sampling operator that maps wavefields to the receiver positions and N_r is the number of receivers.
Waveform inversion often uses the least-squares misfit defined as the l2 norm of the residual between the observed data d^{obs} and the modelled data d^{cal},

J(m) = \frac{1}{2} \sum_{i}^{N_\omega} \sum_{s,r}^{N_s, N_r} \left( d_{s,r}^{obs}(\omega_i) - d_{s,r}^{cal}(\omega_i) \right)^\dagger \left( d_{s,r}^{obs}(\omega_i) - d_{s,r}^{cal}(\omega_i) \right) + \mu R(m),   [4]

where † is the complex conjugate transpose, R(m): ℝ^N → ℝ is a regularization term and μ is the regularization parameter. The parameters N_s and N_ω represent the number of sources and frequencies, respectively. For the regularization term, R(m), we use the zero-order quadratic regularization R(m) = ‖m‖₂².
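The misfit of Eq. (4) is straightforward to evaluate once the residuals are gathered into arrays; the sketch below is illustrative (the array names and function are ours, not the paper's implementation).

```python
import numpy as np

def misfit(d_obs, d_cal, m, mu):
    """Least-squares misfit of Eq. (4) with a zero-order Tikhonov term.

    d_obs, d_cal: complex arrays over (frequency, source, receiver);
    m: real model vector; mu: regularization weight.
    """
    r = d_obs - d_cal                         # data residuals
    data_term = 0.5 * np.real(np.vdot(r, r))  # (1/2) sum r^dagger r
    reg_term = mu * np.dot(m, m)              # mu * ||m||_2^2
    return data_term + reg_term
```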
The minimization of the misfit function is a nonlinear problem. The misfit function depends explicitly on the complex pressure field and implicitly on the model parameters through the pressure field, which depends on the properties of the medium. The resulting nonlinear optimization problem is minimized using a constrained optimization method

\arg\min_{m} J(m) \quad \text{subject to} \quad A(m, \omega)\, p_s(\omega) = f_s(\omega).   [5]
For gradient-based and Newton-based optimization methods, the misfit function, J(m), is expressed using the Lagrangian formulation (Plessix, 2006; Akcelik et al., 2002; Fichtner and Trampert, 2011; Métivier et al., 2013), where we minimize the associated Lagrangian function

\mathcal{L}(p, m, \lambda) = \frac{1}{2} \sum_{i,s,r}^{N_\omega, N_s, N_r} \left\| d_{s,r}^{obs}(\omega_i) - W(\omega_i)\, r\, p_s(\omega_i) \right\|^2 + \mu R(m) + \sum_{i,s}^{N_\omega, N_s} \Re e \left\langle \lambda_s(\omega_i),\, A(m, \omega_i)\, p_s(\omega_i) - f_s(\omega_i) \right\rangle_x,   [6]

where ⟨·,·⟩_x is the scalar product in space x and λ_s(ω) ∈ ℂ^N is the vector of Lagrangian multipliers.
For each frequency at every iteration, the complex source scaling parameter, W(ω_i), is estimated via the solution of a linear least-squares problem in the frequency domain (Pratt, 1999)

W(\omega_i) = \frac{d^{obs}(\omega_i)\, d^{cal}(\omega_i)^\dagger}{d^{cal}(\omega_i)\, d^{cal}(\omega_i)^\dagger}.   [7]
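A minimal implementation of Eq. (7) for one frequency might look as follows (array names are ours and purely illustrative):

```python
import numpy as np

def source_scaling(d_obs, d_cal):
    """Complex source scaling W(omega) of Eq. (7) for a single frequency.

    Least-squares scalar that best matches modelled to observed data:
    W = (d_obs . d_cal^dagger) / (d_cal . d_cal^dagger), summed over all
    source-receiver traces at this frequency.
    """
    num = np.vdot(d_cal, d_obs)   # sum over traces of conj(d_cal) * d_obs
    den = np.vdot(d_cal, d_cal)   # sum over traces of |d_cal|^2
    return num / den
```

As a sanity check of the design: if the modelled data differ from the observed data only by a complex constant, the estimate recovers that constant exactly.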
The first-order sufficient criteria for an optimal solution that satisfies the Karush-Kuhn-Tucker (KKT) conditions (Kuhn and Tucker, 1951) are obtained by setting \nabla_{\chi} \mathcal{L}(p, m, \lambda) = 0 for \chi \in \{p, m, \lambda\}:

A^{*}(m, \omega)\, \lambda_s(\omega) - r^\dagger \triangle d(\omega) = 0,   [8a]

\Re e\left[ \left( \nabla_m A(m, \omega)\, p_s(\omega) \right) \odot \lambda_s^{*}(\omega) \right] + \mu \nabla_m R(m) = 0,   [8b]

A(m, \omega)\, p_s(\omega) - f_s(\omega) = 0,   [8c]

where "*" denotes the complex conjugate, \nabla_m A(m, \omega) = \partial A(m, \omega)/\partial m, △d(ω) = d^{obs}(ω) − d^{cal}(ω) is the data residual and ⊙ represents the Hadamard product. Eq. (8a) is the adjoint-state equation, which computes the adjoint wavefield λ_s(ω) by back-propagating the residual wavefields at the receiver locations (Plessix, 2006). Eq. (8b) is basically the gradient of the objective function, which is often written as g (g_m). Similarly, Eq. (8c) is the forward modelling operator.
The optimality conditions of the KKT system in Eqs. (8a-8c) define a system of nonlinear equations. If one applies Newton's method, the Newton direction is obtained by solving the following second-order KKT system of linear equations

\begin{bmatrix} r^\dagger r & B & A^\dagger \\ B^\dagger & G + \mu \nabla_m^2 R & C^\dagger \\ A & C & 0 \end{bmatrix} \begin{bmatrix} \triangle p \\ \triangle m \\ \triangle \lambda \end{bmatrix} = - \begin{bmatrix} 0 \\ g \\ 0 \end{bmatrix},   [9]

where

G = \left( (\nabla_m^2 A)\, p_s(\omega) \right)^\dagger \lambda_s(\omega),   [10a]

B = (\nabla_m A^\dagger)\, \lambda_s(\omega),   [10b]

C = (\nabla_m A)\, p_s(\omega).   [10c]
Newton's step is obtained by solving for △m after eliminating the perturbation of the state wavefield △p and the adjoint wavefield △λ, respectively. The resulting equation leads to the following linear system of equations

\Re e[H]\, \triangle m = -\Re e[g],   [11]
where \Re e[\cdot] represents the real part and H is the full Hessian matrix given by

H = \left( (\nabla_m A)\, p_s \right)^\dagger (A^\dagger)^{-1}\, r^\dagger r\, A^{-1} (\nabla_m A)\, p_s + \mu \nabla_m^2 R + G - B^\dagger A^{-1} C - C^\dagger (A^\dagger)^{-1} B.   [12]
Further details on the formulation of the KKT conditions can be found in numerical optimization books (Nocedal and Wright, 2006) or in Kuhn and Tucker (1951) and Akcelik et al. (2002). The first term on the right-hand side of Eq. (12) can be expressed in terms of the Jacobian matrix, J, where J = A⁻¹ (∇_m A) p_s. The Jacobian matrix is the first-order derivative of the wavefield with respect to the model parameters. The last three terms in Eq. (12) correspond to second-order derivatives of the wavefields and are equivalent to (∇²_m p_s)† r† △d(ω). A physical interpretation of these terms is provided in Pratt et al. (1998). Incorporating the inverse of the Hessian matrix in the FWI optimization scheme not only boosts the convergence rate by improving the distribution of amplitudes in poorly illuminated subsurface areas but also acts as a deblurring operator; it removes artifacts that arise in the gradient vector due to multiply reflected wavefields (Pratt et al., 1998). The second-order derivatives of the wavefield, the last three terms of Eq. (12), account for wavefields generated by multiple scattering in the subsurface. In the presence of high velocity contrasts, the effect of the second-order derivatives of the wavefield is significant, and employing these terms in the numerical inversion mitigates the effect of high-amplitude multiply reflected wavefields.
In the Gauss-Newton approximation, data residuals are assumed to be small and the last three terms in Eq. (12) are ignored. This leads to the approximate Hessian

H_{GN} = \left( (\nabla_m A)\, p_s \right)^\dagger (A^\dagger)^{-1}\, r^\dagger r\, A^{-1} (\nabla_m A)\, p_s + \mu \nabla_m^2 R,   [13]

where H_{GN} represents the approximate Hessian matrix.

2.1 Numerical optimization
Gradient-based minimization schemes such as the steepest descent and conjugate gradient methods (Nocedal, 1980) ignore the Hessian matrix by assuming H = I. These methods require the solutions of two forward modelling problems to compute the gradient: one for forward modelling the wavefield p_s(ω) and the other for estimating the adjoint-state wavefield λ_s(ω). The gradient is then computed as the product between the forward-propagated wavefields and the back-propagated residual (adjoint) wavefields for each frequency. Interested readers can find a detailed description of the steepest descent and conjugate gradient algorithms in Nocedal (1980) and Nocedal and Wright (2006). For our tests via gradient-based optimization, we adopted the nonlinear conjugate gradient (NCG) method with a line search that satisfies the Wolfe conditions (Hestenes, 1969; Nocedal and Wright, 2006).

Gradient-based iterative methods have been shown to converge slowly (Shin et al., 2001). On the other hand, full Newton and Gauss-Newton methods are well known for their fast convergence rate on well-posed problems. Unfortunately, these methods are computationally expensive for large-scale inverse problems. Solving for the model parameters with Newton-based algorithms requires the gradient, the Hessian and the inverse of the Hessian matrix. The Hessian matrix is dense and large. In practice, the computation of the inverse of the Hessian matrix is often avoided by employing matrix-free iterative techniques like the method of conjugate gradients (CG). In the Gauss-Newton approximation, the Hessian matrix is positive definite (see Eq. 13), whereas in the full Newton method the real part of the full Hessian is an indefinite symmetric matrix. Therefore, the standard conjugate gradient method is no longer suitable for such a system of linear equations (Eq. 11). As a result, one needs to use the stopping criterion discussed by Eisenstat et al. (1994) in the inner loop of the optimization algorithm (see Algorithm 1). The latter avoids solving for the negative curvature direction of Newton's equations. Finally, we update the solution using a line search method. This iterative technique for solving Newton's equations is called the inexact or truncated Newton method. Algorithm 1 is the pseudo-code for the inexact full Newton and Gauss-Newton methods. The stopping criterion is also required to avoid over-fitting the system of Newton's equations (Nash, 2000). In general, the conjugate gradient iterative algorithm is terminated early and the model is updated using the resulting inexact descent direction in conjunction with a line search method,

m_{k+1} = m_k + \alpha \triangle m_k,   [14]

where the scalar α is the step length obtained via the line search.

Algorithm 1 Pseudo-code: Multiscale Newton's method
Input: d^{obs}(ω), m_0, N_w, max_iter
Output: argmin_m J(m)
Initialize parameters
for w_i in 1 : N_w do
    Starting model ← m_k
    for k in 1 : max_iter do
        Compute:
            Forward wavefields d^{cal}(ω)
            Residual wavefields △d(ω)
            Back-propagate residual wavefields λ(ω)
            Gradient ∇_m J(m_k)
        Solve:
            △m_k = −[Re(H)]⁻¹ Re(∇_m J(m_k))   ▷ solve for △m_k using the CG method; three forward modelling operators are required to compute the action of the Hessian matrix on △m inside the CG algorithm
        Compute: α using a line search method
        Update model: m_{k+1} = m_k + α△m_k
    end for
end for

The conjugate gradient method inside Algorithm 1 requires only the action of the Hessian on the model perturbation, a matrix-vector product, which can be implemented in a matrix-free manner. For a single frequency, as described before, once the matrix A is factorized using LU decomposition, a set of solutions for multiple shots can be computed by forward and backward substitutions at a relatively low computational cost. To compute the action of the Hessian on a vector, we only need to solve three forward problems, and the product is computed on the fly at every iteration of the conjugate gradient method. Details about the computation of Hessian-vector products for second-order adjoint-state methods can be found in Anagaw and Sacchi (2014) and Métivier et al. (2014).
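The inner CG solve of Algorithm 1 can be sketched as below. The Hessian-vector product is abstracted into a callable, as in a matrix-free implementation; the early-stopping rule shown is a simplified stand-in for the Eisenstat-Walker criterion cited in the text, not their adaptive rule, and all names are ours.

```python
import numpy as np

def inexact_newton_step(hess_vec, grad, eta=0.1, max_cg=20):
    """One truncated-Newton step: approximately solve Re[H] dm = -Re[g].

    hess_vec(x) returns the Hessian-vector product H @ x matrix-free
    (in FWI this costs a few extra forward/adjoint solves per call).
    CG stops early once the relative residual drops below the forcing
    term eta, yielding an inexact descent direction for a line search.
    """
    b = -np.real(grad)
    x = np.zeros_like(b)
    r = b.copy()                      # residual b - H x (x = 0 initially)
    p = r.copy()
    rs = r @ r
    tol2 = (eta * np.linalg.norm(b)) ** 2
    for _ in range(max_cg):
        if rs <= tol2:
            break                     # inexact solve: good enough for a step
        Hp = np.real(hess_vec(p))
        alpha = rs / (p @ Hp)
        x += alpha * p
        r -= alpha * Hp
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

# Toy usage with a dense SPD stand-in for the matrix-free Hessian:
Hmat = np.array([[4.0, 1.0], [1.0, 3.0]])
g = np.array([1.0, 2.0])
dm = inexact_newton_step(lambda v: Hmat @ v, g, eta=1e-10)
```

With a tight forcing term the step reduces to the exact Newton direction; with a loose one it reproduces the cheap, truncated behaviour described in the text.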
2.2 Model Parametrization
In this section, we describe three model parameterizations for 2D acoustic full waveform inversion. Model parametrization influences the convergence rate and the accuracy of the reconstructed model parameters. Even though our main goal is to estimate a velocity model, one can parametrize the model parameter m in terms of velocity (v), slowness (v⁻¹) or slowness squared (v⁻²). In this study, we only consider these three model parameterizations. The shape of the misfit function in Eq. (4) is not changed by the type of model parameterization. However, the choice of model parameterization influences the eigenvalue spectrum of the Hessian matrix and the direction of the gradient. Consequently, this also influences the convergence rate of the algorithm. The change in the direction of the gradient in Eq. (8b) comes from the scaling term induced by the partial derivatives of the Helmholtz operator, ∂A(m, ω)/∂m. Furthermore, the eigenvectors of the Hessian matrix in Eq. (12) are influenced by the terms ∂A(m, ω)/∂m and ∂²A(m, ω)/∂m². The operator A(m, ω) is the 2D Helmholtz wave equation operator, which can be expressed as

A = \frac{\omega^2}{\rho v^2} + \frac{\partial}{\partial x}\frac{1}{\rho}\frac{\partial}{\partial x} + \frac{\partial}{\partial z}\frac{1}{\rho}\frac{\partial}{\partial z}.   [15]

The Helmholtz operator in Eq. (15) in its discrete form is a large sparse matrix with non-zero off-diagonal elements. Assuming constant density, the computation of the derivatives of this matrix with respect to the model parameters is trivial. The first and second derivatives have non-zero elements only along the matrix diagonal, corresponding to the model parameter element m_k. Table 1 summarizes the derivatives of the operator A(m, ω) with respect to the three model parameterizations studied in our numerical work.
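As a concrete check of the entries summarized in Table 1, one can differentiate the mass term of Eq. (15) under the constant-density assumption. The derivation below is our own sketch; exact signs and factors should be taken from Table 1.

```latex
% Mass term of A(m,\omega) written in each parameterization (constant \rho):
%   m = v:       \omega^2 \rho^{-1} m^{-2}
%   m = v^{-1}:  \omega^2 \rho^{-1} m^{2}
%   m = v^{-2}:  \omega^2 \rho^{-1} m
% Differentiating with respect to m gives
\frac{\partial A}{\partial v} = -2\,\omega^2 \rho^{-1} v^{-3}, \qquad
\frac{\partial A}{\partial (v^{-1})} = 2\,\omega^2 \rho^{-1} v^{-1}, \qquad
\frac{\partial A}{\partial (v^{-2})} = \omega^2 \rho^{-1},
% and for the slowness squared parameterization the second derivative vanishes:
\frac{\partial^2 A}{\partial (v^{-2})^2} = 0 .
```

Up to constant factors, these scalings match the magnitudes ∼ω²ρ⁻¹v⁻³, ω²ρ⁻¹v⁻¹ and ω²ρ⁻¹ quoted for the three gradients in Section 3.1.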
In the next section, we compare numerical solutions of estimated model parameters via Newton-based schemes where we have incorporated the Hessian matrix with scaling according to the different parameterizations.
3 Results and Discussion
The performance of the three model parameterizations is compared numerically through a series of experiments performed with the Marmousi data set. We first consider a portion of the Marmousi velocity model. Figure 1 shows the portion of the true Marmousi velocity model (Figure 5a), taken approximately from lateral position 3.9 km to 4.8 km. The analysis focuses on the eigenvalues and eigenvectors obtained from the Singular Value Decomposition (SVD) of the Hessian matrix. The model is discretized on a 41×61 grid with a uniform grid spacing of 24 m in both the vertical and horizontal directions. For the numerical forward modelling, a single source located in the middle of the survey, one grid point below the surface, was used to simulate the shot gather. A total of 36 receivers at the surface were used to record the seismic data.
To simplify the analysis, instead of computing the full Hessian matrix, we calculate the approximate Hessian matrix in the Gauss-Newton approximation. Figures 2a-c show the approximate Hessian matrix of the model (Figure 1) at 12 Hz using the velocity, slowness and slowness squared model parameterizations, respectively. To compute the approximate Hessian matrix, the model in Figure 1 was first smoothed with a 3×3 Gaussian 2D filter. The approximate Hessian matrix is computed via Eq. (13) after ignoring its last two terms. In the GN approximation, each element of the Hessian matrix represents the correlation between the partial derivatives of the wavefields at one pixel point m_j and another pixel point m_i (Pratt et al., 1998). In the high-frequency limit, the partial derivatives of the wavefields with respect to two different model parameters would be uncorrelated with each other, and the Hessian matrix would be mainly diagonal. The non-zero elements of the Hessian matrix observed off the main diagonal are due to the band-limited nature of the seismic data, the limited acquisition geometry and the heterogeneity of the model. Due to geometrical spreading and incomplete acquisition, the amplitudes of the correlations of the partial derivative wavefields decay with depth along the main diagonal (Pratt et al., 1998); see Figure 2a. Note that the velocity model and the unknown vector of parameters are stored in matrices that were reshaped into vectors by ordering the elements column-wise. Therefore, the elements of the diagonal of the resulting discrete Hessian matrix increase block-wise with depth.
3.1 Eigenvectors and Eigenvalues

To continue our analysis, we evaluate the eigenvalues and eigenvectors of the Hessian matrix. The approximate Hessian matrix is a symmetric matrix that can be factorized as follows

H_{GN} = Q \Lambda Q^T,   [16]
where Λ and Q are the eigenvalue and eigenvector matrices of the approximate Hessian matrix H_{GN}, respectively. Plugging Eq. (16) into Eq. (11),

\triangle m = -\sum_{k=1}^{N} \frac{1}{\sigma_k}\, q_k q_k^T\, g,   [17]

where q_k is the k-th column of the eigenvector matrix Q and σ_k is the corresponding eigenvalue in Λ. The model update perturbation direction can be expressed as the sum of the projected gradient along the eigenvectors q_k. The eigenvectors q_k corresponding to the largest and smallest eigenvalues σ_k point in the directions of the greatest and smallest curvature of the objective function, and indicate the well and poorly retrieved parts of the model parameters, respectively. The eigenvector q_k corresponding to the largest eigenvalue σ_k provides the most important part of the model parameters that can be retrieved by FWI.
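Eq. (17) can be verified numerically on a small symmetric Hessian. The sketch below is illustrative: it uses a dense eigendecomposition rather than the paper's SVD workflow, and truncating to the largest eigenpairs shows how the best-resolved model components dominate the update.

```python
import numpy as np

def newton_step_by_eig(H_gn, g, n_keep=None):
    """Model update of Eq. (17): dm = -sum_k (1/sigma_k) q_k q_k^T g.

    H_gn: small symmetric (Gauss-Newton) Hessian; g: gradient vector;
    n_keep: optionally keep only the n_keep largest eigenpairs.
    """
    sigma, Q = np.linalg.eigh(H_gn)       # eigenvalues in ascending order
    order = np.argsort(sigma)[::-1]       # re-sort descending
    sigma, Q = sigma[order], Q[:, order]
    if n_keep is not None:
        sigma, Q = sigma[:n_keep], Q[:, :n_keep]
    coeffs = (Q.T @ g) / sigma            # projected gradient over eigenvalues
    return -Q @ coeffs

# With all eigenpairs kept, the update equals the exact Newton step -H^{-1} g.
H = np.diag([2.0, 1.0])
g = np.array([2.0, 3.0])
dm = newton_step_by_eig(H, g)
```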
Figure 3 shows the largest 60 eigenvalues of the Hessian matrix for the three types of model parameterizations. Figure 4 shows the first three eigenvectors versus depth. The resulting eigenvectors were extracted from the middle part of the velocity model. In Figure 4, the plots from bottom to top are the 1st, 2nd and 3rd eigenvectors in depth associated with the three largest eigenvalues, respectively. The red, green and blue colours represent eigenvectors for the velocity, slowness and slowness squared model parameterizations, respectively. As we see from the figure, unlike the eigenvectors that arise from the slowness or slowness squared model parameterizations, the magnitude of the eigenvectors resulting from the velocity parametrization decreases with depth. As a result, with the velocity parametrization, the very shallow parts of the velocity model are retrieved better than the deep part of the model, where the velocities are highest. In the case of the slowness squared parameterization, the eigenvectors are almost depth independent and are stronger at depth compared to the eigenvectors from the slowness or velocity parameterizations. Hence, the deep part of the model is properly reconstructed. The gradient vectors resulting from the velocity, slowness and slowness squared model parameterizations are scaled by magnitudes proportional to ∼ 3ω²ρ⁻¹v⁻³, ω²ρ⁻¹v⁻¹ and ω²ρ⁻¹, respectively. In the case of the velocity parameterization, the gradients associated with the high velocities of the deep parts of the model are damped by a factor of ∼ 3ω²v⁻³ relative to the other two cases. The Hessian matrix changes accordingly (see Eq. (13) and Table 1). However, the unbalanced scaling amplitude observed in the velocity parameterization case is mitigated by the use of the slowness or slowness squared parameterization.
The use of slowness or slowness squared parameterization makes the eigenvectors of the Hessian matrix nearly independent of depth, even where velocities are highest in the deeper parts, so the model update perturbation steers in the right direction. In all three cases, the eigenvectors, and hence the perturbation updates, are larger in the shallow parts than in the deeper parts. In the velocity parameterization, the velocity in the deep part is large, so large updates are needed; in the slowness parameterization, the slowness in the deep part is small, so only small updates are needed. As we see from the eigenvectors of the Hessian matrix in Figure 4, at depth the amplitude is larger for the slowness squared parametrization than for the slowness or velocity parametrizations. Hence, a more accurate reconstruction of the velocity model is achieved by using the slowness or slowness squared parameterization in the FWI algorithm. In the next section, we demonstrate the effectiveness of the three types of model parameterizations using gradient-based and Newton-based optimization methods. We present a series of numerical experiments of FWI on the Marmousi model for both noise-free and noisy data sets.
3.2 Marmousi Velocity Model
3.2.1 Numerical experiment setups
For the comparison of numerical solutions computed from the three types of model parametrizations, we first present results obtained from various numerical optimization methods using gradient-based and Newton-based FWI schemes on the Marmousi velocity model shown in Figure 5a. We then give more emphasis to the Newton-based methods. The 2D Marmousi velocity model is discretized by 126×384 grid points on a 25×25 m regular square grid. Figure 5b shows the starting velocity model used to run all the inversions presented in this paper. The starting model is obtained by smoothing the true velocity model. A good starting model for full waveform inversion is crucial to ensure convergence to a minimum, and generally, the quality of the numerical solutions obtained from FWI depends on the starting velocity model and the quality of the low-frequency data content.
We first generate synthetic data adopting 95 sources with an interval of 100 m and 192 receivers with an interval of 25 m. The sources are placed 25 m, one grid point, below the surface, and the receivers are placed at the surface. To simulate the source function, we used a Ricker wavelet with a 10 Hz central frequency. For the inversion, a set of 8 discrete frequencies between 3.1 Hz and 21 Hz was selected based on the strategy proposed by Sirgue and Pratt (2004). The selected frequencies are 3.15, 4.70, 6.22, 7.90, 9.52, 13.00, 15.87 and 20.02 Hz. The numerical inversion is then carried out sequentially, starting from the low-frequency data and proceeding to the high-frequency data. During the inversion, we recalibrate the source wavelet signature at every iteration using Eq. (7).
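A simple variant of the Sirgue and Pratt (2004) selection rule can be sketched as follows. We assume the constant-geometry recursion f_{n+1} = f_n / alpha_min with alpha_min = z / sqrt(h² + z²), for a target depth z and maximum half-offset h; the paper's 8 frequencies need not coincide with this simplified recursion, and the offset/depth values below are illustrative only.

```python
import numpy as np

def select_frequencies(f_min, f_max, half_offset, depth):
    """Frequency selection in the spirit of Sirgue and Pratt (2004).

    Successive frequencies grow by 1/alpha_min, chosen so that the
    vertical-wavenumber coverage of adjacent frequencies stays
    contiguous (simplified constant-geometry variant).
    """
    alpha_min = depth / np.hypot(half_offset, depth)   # cos of widest angle
    freqs = [f_min]
    while freqs[-1] / alpha_min <= f_max:
        freqs.append(freqs[-1] / alpha_min)
    return freqs

# Hypothetical geometry: 4.6 km maximum half-offset, 3 km target depth.
freqs = select_frequencies(3.15, 21.0, half_offset=4600.0, depth=3000.0)
```

Larger offsets push alpha_min down and so allow coarser frequency sampling, which is the economy the multiscale strategy exploits.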
We consider the following numerical optimization methods: nonlinear conjugate gradient (NCG), preconditioned nonlinear conjugate gradient (PNCG), limited memory BFGS (l-BFGS), Gauss-Newton (GN) and full Newton (FN). Results from the gradient-based techniques are presented merely for comparison with Newton's method. For the preconditioned NCG, the diagonal of the Hessian matrix in the GN approximation is taken as a preconditioner. To fairly compare the solutions obtained from these various optimization methods, for all numerical inversions presented in this paper we run the inversion with approximately the same number of forward modelling problems solved. In the numerical inversion, approximately 4-5 forward modelling problems are solved per iteration on average. For each frequency in the FN and GN methods, we run up to a maximum of 12 GN or FN iterations. Inside the GN or FN loop, every iteration requires the solution of two forward modelling problems to compute the action of the Hessian on the model perturbation. With this, in the outer loop, we adjust the number of iterations for all algorithms so that the inversions require roughly similar computational times. In addition, we also impose the stopping criterion proposed by Eisenstat et al. (1994) for solving the inner GN and FN systems of equations. A maximum of 25 iterations per frequency is used for the NCG, PNCG and l-BFGS methods. If the iterative process does not converge for a given frequency, the inversion proceeds to the next frequency.
3.2.2 Numerical results
Figures 6a-i show the reconstructed velocity models at the end of the inversion of the first frequency component using the l-BFGS, GN and FN methods. From top to bottom are the reconstructed velocity models using the velocity (v), slowness (v⁻¹) and slowness squared (v⁻²) parameterizations, respectively. On the right, from top to bottom, (a)-(c) are the results obtained using l-BFGS; in the centre, (d)-(f) are the results using the GN method; and on the left, (g)-(i) are the results obtained using the FN method. As we see from the reconstructed velocity models obtained with all the Newton-based methods, most of the long-wavelength structures of the velocity model are suitably resolved by the slowness or slowness squared parameterization.
The deep parts of the model, where velocities are the highest, are more adequately estimated with the slowness or slowness squared parameterization than with the velocity parameterization. Thus, the FWI solutions using the slowness and slowness squared parameterizations have a more balanced amplitude response in the shallow and deep parts of the model. These results coincide with the analysis carried out via the eigendecomposition of the Hessian matrix. From our numerical results with Newton-based optimization methods, we conclude that the GN and FN methods attain better reconstructions than the l-BFGS method (Figure 6).
Figure 7 show reconstructed velocity models at the end of all data frequency components for velocity, slowness and slowness squared parameterization (from left to right), respectively. Each figure contains reconstructed models estimated by NCG, PNCG, l-BFGS, GN and FN methods. As we see from the results obtained from different optimization methods, except the PNCG all the methods are struggling to retrieve the deep part of the model when FWI is parametrized with velocity model. The main reason for the better result in the deeper parts of the model in the case of the PNCG FWI is that its gradient is scaled by the inverse of the diagonal of the Hessian matrix. However, the shallow parts of the model are well retrieved by the NCG, GN and FN methods, and by PNCG as well. The l-BFGS algorithm for this numerical experiment setup in the velocity parametrization of FWI fails to converge to the local minima. On the other hand, when slowness or slowness squared parametrization is employed, the deep parts and all features of the original Marmousi velocity model are properly reconstructed. Though a high-resolution model is obtained by employing a slowness squared parametrization. The reconstructed velocity model obtained by using slowness squared parametrization in FWI optimization scheme provides an accurate velocity model than the other two model parameterizations. All features of the original Marmousi velocity model are reconstructed in the correct position, and better focusing is achieved by using slowness squared parametrization(see Figure 7 on the left column). To further examine the influence of the three model parameterizations, we compare the vertical velocity profile of the results obtained from all numerical optimization methods
investigated in this paper. Figures 8, 9 and 10 show velocity profiles extracted at (2.25 km, 0 km) and (3.5 km, 0 km) for the velocity, slowness and slowness squared parameterizations, respectively. These results highlight once again that the velocity profiles in both the shallow and deep parts of the model are fairly well resolved when the slowness or slowness squared parameterization is employed. In the case of the velocity parameterization, almost all methods fail to recover the deep parts of the model.
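As a rough sketch of the diagonal-Hessian preconditioning that helps PNCG recover the deep parts of the model, the search direction can be formed by scaling the gradient with the inverse of the (damped) diagonal of the Hessian. The function below is a hypothetical illustration, not the authors' implementation; the damping scheme and the level `eps` are our own assumptions.

```python
import numpy as np

def preconditioned_direction(grad, hess_diag, eps=1e-3):
    """Return the PNCG-style descent direction, -g scaled by 1/diag(H).

    `eps` sets a hypothetical damping floor (a fraction of the largest
    diagonal entry) so the division stays stable where diag(H) is ~0.
    """
    damping = eps * np.max(np.abs(hess_diag))
    return -grad / (np.abs(hess_diag) + damping)

# Deep-model cells receive little illumination, so their Hessian diagonal
# entries are tiny; preconditioning boosts their updates accordingly:
g = np.array([1.0, 1.0])   # same raw gradient in a shallow and a deep cell
h = np.array([10.0, 0.1])  # shallow cell is far better illuminated
d = preconditioned_direction(g, h)
```

With the toy numbers above, the deep cell's update is roughly two orders of magnitude larger than the shallow cell's, which is the balancing effect described in the text.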
3.2.3 Convergence behaviour
The efficiency of the optimization engine is critical for practical applications of full waveform inversion to large-scale geophysical inverse problems. In this paper, we compare the convergence properties of all optimization methods for the three types of model parameterization employed in the numerical inversion scheme of FWI. Figure 11 shows the convergence of the nonlinear conjugate gradient, the preconditioned nonlinear conjugate gradient, the limited-memory BFGS, the Gauss-Newton and the full Newton methods at 4.70 Hz. In the plots, for each frequency, the misfit function is normalized by its value at the first iteration. Figures 11a, b and c show the convergence of the misfit function using the velocity, slowness and slowness squared parameterizations, respectively.
As the convergence curves of the gradient and Newton-based methods show, in the case of the velocity parameterization the PNCG method converges faster than the other methods because it scales the amplitude of the model perturbation and thereby updates the deeper parts of the model. Hence, the optimization scheme updates the deep and shallow parts properly compared to the other methods, and the inversion proceeds faster in the right direction. In the Newton-based methods, by contrast, the lack of proper scaling between the shallow and deep parts of the model and the resulting poor search directions prevent the optimization scheme from providing a satisfactory inversion result (see the first column of Figure 7). A much-improved rate of convergence can be achieved with the slowness or slowness squared parameterization, as these parameterizations properly scale the descent direction in both the deep and shallow parts of the model. Compared to the velocity parameterization, the performance of the GN and FN methods shown in Figure 11 is superior for the slowness or slowness squared parameterization. The convergence of NCG is slow for all three parameterizations, and it therefore requires many iterations and forward-modelling steps to reach the desired solution. To clearly show the influence of the three parameterizations on the convergence of the Newton-based methods (GN and FN), we plot them separately in Figure 12. On the left, from top to bottom, are the misfit functions for the 4.70 Hz and 20.02 Hz data sets using the Gauss-Newton method; on the right, from top to bottom, are the corresponding misfit functions using the full Newton method.
The results in the figure demonstrate that a faster reduction in the misfit function, using either the GN or the FN method, is achieved with the slowness squared parameterization than with the slowness or velocity parameterization. From the eigenvalue and eigenvector analysis, we see that the slowness squared parameterization better
scales the deep parts of the model than the slowness parameterization does, and corrects the model perturbation update far better than the velocity parameterization. During the early iterations, the inversion produces larger velocity updates in the deep parts and, as a result, a rapid reduction of the misfit is achieved by the slowness squared parameterization. From our numerical results, we conclude that velocity estimation via FWI based on the GN or FN optimization methods should be carried out using a slowness or slowness squared parameterization; the slowness squared parameterization, in particular, achieves superior convergence rates for the Gauss-Newton and full Newton methods. In the next section, we address the noise sensitivity of the three model parameterizations in the Gauss-Newton and full Newton methods.
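The effect of reparameterization on the descent direction can be made explicit with the chain rule: if g_m is the gradient of the misfit with respect to the model parameter m, the equivalent velocity-space gradient is g_v = g_m · dm/dv, with dm/dv = 1, −v⁻² and −2v⁻³ for the velocity, slowness and slowness squared parameterizations, respectively. The snippet below is a small illustrative check of this relation on a toy misfit; it is not the paper's implementation.

```python
import numpy as np

def grad_wrt_velocity(g_m, v, parameterization):
    """Map a gradient taken w.r.t. m back to velocity: g_v = g_m * dm/dv."""
    dm_dv = {
        "velocity": np.ones_like(v),
        "slowness": -v**-2.0,              # m = 1/v
        "slowness_squared": -2.0 * v**-3.0,  # m = 1/v**2
    }[parameterization]
    return g_m * dm_dv

# Toy misfit J(v) = sum(v**2), so dJ/dv = 2v. Under the slowness squared
# parameterization, v = m**-0.5 and J = sum(1/m), hence dJ/dm = -m**-2.
v = np.array([1500.0, 3000.0, 4500.0])   # toy velocities in m/s
m = v**-2.0
g_m = -m**-2.0
g_v = grad_wrt_velocity(g_m, v, "slowness_squared")  # recovers 2v
```

The v-dependent chain factor is what rebalances the relative weight of shallow (low-v) and deep (high-v) cells between parameterizations.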
3.3 Noisy data set
In this section, we test the Gauss-Newton and full Newton minimization methods for the three model parameterizations with data contaminated by noise. First, time-domain synthetic data were generated and then contaminated with random noise at SNR=10 and SNR=5. Figures 13 and 14 depict the velocity models reconstructed by the GN and FN methods from noisy data with SNR=10 and SNR=5, respectively. On the right, Figures 13a, b and c are the velocity models reconstructed by the GN method using the velocity, slowness and slowness squared parameterizations, respectively. On the left, Figures 13d, e and f are the velocity models reconstructed by the FN method using the velocity, slowness and slowness squared parameterizations, respectively; the same layout applies to Figure 14.
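The contamination step described above can be sketched as follows. Since the paper does not state its exact SNR definition, we assume SNR is the ratio of average signal power to noise power; the helper below is a hypothetical illustration, not the authors' code.

```python
import numpy as np

def add_noise(data, snr, seed=0):
    """Add zero-mean Gaussian noise so that
    mean signal power / noise power = snr (assumed SNR definition)."""
    rng = np.random.default_rng(seed)
    signal_power = np.mean(data**2)
    sigma = np.sqrt(signal_power / snr)   # per-sample noise std. deviation
    return data + rng.normal(scale=sigma, size=data.shape)

# Stand-in for a synthetic time-domain record:
clean = np.sin(np.linspace(0.0, 40.0, 50_000))
noisy = add_noise(clean, snr=10)
```

Measuring mean(clean²)/mean((noisy − clean)²) on the result recovers a ratio close to the requested SNR of 10.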
It is clear from our results that the numerical inversion via the GN and FN methods was able to reproduce models comparable to the original velocity model. For data with a high noise level (SNR=5), more artifacts appear in the deep parts of the reconstructed model when the slowness squared parameterization is employed. For such noisy data sets, FWI velocity estimation is less sensitive to noise under the slowness parameterization than under the slowness squared parameterization, and better results in terms of resolution and quality of the inverted model are achieved with the slowness parameterization. To quantify our analysis, we compute the reconstruction quality Q = 10 log10(||m_o||^2 / ||m_o − m||^2) as our comparison metric, where m_o and m are the true and reconstructed velocity models, respectively. A high Q value corresponds to a more accurate solution (see Table 2). The quality factors in Table 2 demonstrate that, for a very noisy data set, Newton-based full waveform inversion should be carried out using the slowness parameterization. For clean data, which are unlikely in practice, or data with low noise levels, the slowness squared parameterization is the best choice in terms of quality and efficiency of FWI. The eigenvalues of the Hessian matrix are very sensitive to noise, and a high signal-to-noise ratio is needed to retrieve reliable model parameters.
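The quality metric Q can be computed directly; a minimal sketch (the helper name is ours):

```python
import numpy as np

def reconstruction_quality(m_true, m_est):
    """Q = 10*log10(||m_true||^2 / ||m_true - m_est||^2); higher is better."""
    return 10.0 * np.log10(np.sum(m_true**2) / np.sum((m_true - m_est)**2))

# Example: an error energy 100x smaller than the model energy gives Q = 20.
m_o = np.array([3.0, 4.0])         # ||m_o||^2 = 25
m = m_o - np.array([0.5, 0.0])     # ||m_o - m||^2 = 0.25
q = reconstruction_quality(m_o, m) # -> 20.0
```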
4 Conclusions
We examined the effect of model parameterization on 2D acoustic FWI velocity model building using gradient and Newton-based optimization methods. Velocity, slowness and slowness squared model parameterizations were considered. In the Newton-based algorithms, full waveform inversion is formulated and implemented in a matrix-free fashion. The use of the Hessian matrix in the FWI algorithm plays a crucial role in improving the resolution and amplitude response of the deeper parts of the velocity model. We analyzed the influence of the three model parameterizations on the behaviour of the Hessian matrix using its eigenvalue decomposition. Analyzing the eigenvectors of the Hessian matrix provides insight into the influence of model parameterization on FWI: the eigenvectors become almost depth independent under the slowness or slowness squared parameterization, in contrast to the velocity parameterization. As a result, the deeper parts of the model can be properly reconstructed when one adopts a slowness or slowness squared parameterization. For the Newton-based optimization methods (GN and FN), a faster convergence rate can be attained with a slowness or slowness squared parameterization than with a velocity parameterization.
The series of numerical experiments carried out in the preparation of this work suggests that velocity estimation via FWI with Gauss-Newton or full Newton optimization should adopt a slowness or slowness squared parameterization. For noise-free data or data with a high signal-to-noise ratio, we observe a significant improvement in the convergence of the Gauss-Newton and full Newton methods when one adopts the slowness squared parameterization. On the other hand, the slowness squared parameterization can be sensitive to noise; a slowness parameterization should be preferred when the data are contaminated by a non-negligible amount of noise.
5 Acknowledgements
We wish to thank the sponsors of the Signal Analysis and Imaging Group (SAIG) at the University of Alberta for financial support.
Table 1 Model parameterizations m. Note that the derivatives of the matrix A lie on its diagonal.

Model Parametrization | m | ∂A_ik/∂m_j | ∂²A_ik/(∂m_j ∂m_l)
Velocity | v | −2ω²ρ⁻¹v⁻³ δ_ij δ_jk | 6ω²ρ⁻¹v⁻⁴ δ_ij δ_jk δ_il δ_lk
Slowness | v⁻¹ | 2ω²ρ⁻¹v⁻¹ δ_ij δ_jk | 2ω²ρ⁻¹ δ_ij δ_jk δ_il δ_lk
Slowness squared | v⁻² | ω²ρ⁻¹ δ_ij δ_jk | 0
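The derivative entries in Table 1 follow from differentiating the m-dependent diagonal term of the impedance matrix, assumed here to be the Helmholtz mass term a = ω²/(ρv²), after expressing it in each parameterization: a = ω²ρ⁻¹m⁻², ω²ρ⁻¹m² and ω²ρ⁻¹m for velocity, slowness and slowness squared, respectively. The finite-difference check below, with ω = ρ = 1 since only the functional form matters, is our own sanity check rather than part of the paper.

```python
# Central finite differences for first and second derivatives.
def d1(f, x, h=1e-5):
    return (f(x + h) - f(x - h)) / (2.0 * h)

def d2(f, x, h=1e-3):
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h**2

m = 2.0  # arbitrary test point (omega = rho = 1)
# For each parameterization: diagonal term a(m), then its analytic
# first and second derivatives, matching the Table 1 entries.
cases = {
    "velocity":         (lambda m: m**-2.0, -2.0 * m**-3.0, 6.0 * m**-4.0),
    "slowness":         (lambda m: m**2.0,   2.0 * m,       2.0),
    "slowness squared": (lambda m: m,        1.0,           0.0),
}
errors = {name: (abs(d1(a, m) - da), abs(d2(a, m) - d2a))
          for name, (a, da, d2a) in cases.items()}
```

Note in particular that the slowness squared parameterization has a vanishing second derivative, which is why its Hessian lacks the corresponding second-order term.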
Table 2 Quality Q of the reconstructed Marmousi velocity model obtained from the three types of model parameterization from noisy data sets using the Gauss-Newton (GN) and full Newton (FN) methods.

Model Parameter | SNR=10 GN | SNR=10 FN | SNR=5 GN | SNR=5 FN
Velocity | 36.21 | 36.18 | 36.27 | 36.20
Slowness | 39.48 | 38.61 | 38.53 | 38.36
Slowness squared | 40.25 | 40.10 | 35.23 | 37.25
Figure 1 Portion of the Marmousi velocity model (Figure 5a) from 3.9 km to 4.8 km.
Figure 2 The approximate Hessian matrix for the three choices of model parameterization from a single source at 12 Hz: velocity (v) (a), slowness (v⁻¹) (b) and slowness squared (v⁻²) (c).
Figure 3 The 60 largest eigenvalues of the Hessian matrix in Figure 2 for the three types of model parameterization: velocity (red), slowness (green) and slowness squared (blue).
Figure 4 The first three eigenvectors of the approximate Hessian matrix plotted against depth, corresponding to the three largest eigenvalues, using the velocity (red), slowness (green) and slowness squared (blue) parameterizations. From bottom to top are the eigenvectors associated with the 1st, 2nd and 3rd largest eigenvalues.
Figure 5 True Marmousi velocity model (a) and smooth velocity model used as starting model for inversion (b).
Figure 6 l-BFGS, Gauss-Newton and full Newton FWI. Reconstructed velocity models at the end of the inversion of the first frequency component. From top to bottom are the reconstructed velocity models using the velocity (v), slowness (v⁻¹) and slowness squared (v⁻²) parameterizations, respectively. On the right, from top to bottom, (a), (b) and (c) are results obtained using l-BFGS; in the centre, (d), (e) and (f) using Gauss-Newton; on the left, (g), (h) and (i) are results obtained using the full Newton method.
Figure 7 Reconstructed velocity models at the end of the inversion of all frequency components. From left to right, the columns correspond to the velocity (v), slowness (v⁻¹) and slowness squared (v⁻²) model parameterizations, respectively. From top to bottom: (a) nonlinear conjugate gradient (NCG), (b) preconditioned nonlinear conjugate gradient (PNCG), (c) limited-memory BFGS (l-BFGS), (d) Gauss-Newton (GN) and (e) full Newton (FN) methods.
Figure 8 Comparison of vertical velocity profiles of the reconstructed velocity models obtained from different optimization methods using the velocity parameterization. The depth profiles are extracted from Figure 7 at (2.25 km, 0 km) (a) and (3.50 km, 0 km) (b).
Figure 9 Comparison of vertical velocity profiles of the reconstructed velocity models obtained from different optimization methods using the slowness parameterization. The depth profiles are extracted from Figure 7 at (2.25 km, 0 km) (a) and (3.50 km, 0 km) (b).
Figure 10 Comparison of vertical velocity profiles of the reconstructed velocity models obtained from different optimization methods using the slowness squared parameterization. The depth profiles are extracted from Figure 7 at (2.25 km, 0 km) (a) and (3.50 km, 0 km) (b).
Figure 11 Relative data misfit reduction for the Marmousi velocity model for different optimization methods using the three types of model parameterization: velocity (v) (a), slowness (v⁻¹) (b) and slowness squared (v⁻²) (c), for the 4.70 Hz data set.
Figure 12 Data misfit reduction of the Marmousi velocity model for three types of model parametrizations: velocity (v), slowness (v −1 ) and slowness squared (v −2 ). On the left, from top to bottom are misfit functions for 4.70 Hz and 20.02 Hz data sets using Gauss-Newton method, and on the right from top to bottom are misfit functions using full Newton optimization method.
Figure 13 Gauss-Newton and full Newton FWI. Reconstructed velocity models at the end of the inversion of all frequency components, obtained from noisy data with SNR=10. From top to bottom are the reconstructed velocity models using the velocity (v), slowness (v⁻¹) and slowness squared (v⁻²) parameterizations, respectively. On the right, from top to bottom, (a), (b) and (c) are results obtained using Gauss-Newton; on the left, (d), (e) and (f) are results obtained using the full Newton method.
Figure 14 Gauss-Newton and full Newton FWI. Reconstructed velocity models at the end of the inversion of all frequency components, obtained from noisy data with SNR=5. From top to bottom are the reconstructed velocity models using the velocity (v), slowness (v⁻¹) and slowness squared (v⁻²) parameterizations, respectively. On the right, from top to bottom, (a), (b) and (c) are results obtained using Gauss-Newton; on the left, (d), (e) and (f) are results obtained using the full Newton method.