Neural networks in auroral data assimilation


Journal of Atmospheric and Solar-Terrestrial Physics 70 (2008) 1243–1250


Review article

Fabrício P. Härter, Haroldo F. de Campos Velho, Erico L. Rempel, Abraham C.-L. Chian

University of Waterloo (UofW), Waterloo, ON, Canada N2L 3G1
National Institute for Space Research (INPE), São José dos Campos, SP, 12227-010, Brazil
Instituto Tecnológico de Aeronáutica, Praça Marechal Eduardo Gomes, 50, CEP 12228-900, São José dos Campos, Brazil

Corresponding author at: University of Waterloo (UofW), Waterloo, ON, Canada N2L 3G1. Tel.: +55 61 3343 1779; fax: +55 61 3343 1619. E-mail address: [email protected] (F.P. Härter).


Abstract

Article history: Received 14 November 2006; received in revised form 14 February 2008; accepted 23 March 2008; available online 18 April 2008.

Data assimilation is an essential step for improving space weather forecasting by means of a weighted combination of observational data and data from a mathematical model. In the present work, data assimilation methods based on the Kalman filter (KF) and on artificial neural networks are applied to a three-wave model of auroral radio emissions. A novel data assimilation method is presented, whereby a multilayer perceptron neural network is trained with cross-validation to emulate a KF for data assimilation. The results obtained support the use of neural networks as an assimilation technique for space weather prediction.

© 2008 Elsevier Ltd. All rights reserved.

Keywords: Auroral radio emissions; Nonlinear dynamics; Chaos; Data assimilation; Kalman filter; Neural networks

Contents

1. Introduction
2. Nonlinear coupled wave equations
3. Methodology
   3.1. Kalman filter
   3.2. Artificial neural network
4. Numerical experiments
5. Concluding remarks
Acknowledgments
References

1. Introduction

Space weather research is the study of the disturbances in the space environment, usually caused by the solar



activity and/or by the interaction of the interstellar medium and galactic cosmic rays with the heliosphere. Owing to the potential impact of space weather on technological systems, as well as on human health (Chian, 2003), space weather forecasting is today an essential task. One of the main problems in forecasting is the chaotic nature of the mathematical models. Nonlinear and chaotic phenomena represented by mathematical models depend intrinsically on the initial conditions (ICs). Therefore,


starting from two very similar ICs, the solutions of some systems can diverge after only a few time-steps. In other words, sensitive dependence on the IC can cause the forecasting error to grow exponentially fast with the integration time (Grebogi et al., 1987). This implies that a better representation of the initial condition produces a better prediction. The problem of estimating the initial condition is so complex and important that it constitutes a science of its own, called data assimilation (Daley, 1993; Kalnay, 2003). Nowadays data assimilation is a research topic in several areas of applied physics, such as meteorology (Miller et al., 1994) and ionospheric weather (Schunk et al., 2004; Scherliess et al., 2004; Hajj et al., 2004). In addition to the forecasting error associated with the uncertainties of chaotic systems, there are errors of representativeness from the observational system, as well as errors due to the modeling and the numerical approximation of the physical system.

Data assimilation is a sophisticated way of combining data from a mathematical model with observations. Many methods have been developed for this purpose (Daley, 1993; Kalnay, 2003). They follow different strategies for combining numerical forecasts and observations, using Kalman filters (KFs) or variational approaches. The use of artificial neural networks (ANNs) for data assimilation is a very recent issue, with applications found only in meteorology. ANNs were suggested as a possible technique for data assimilation by Hsieh and Tang (1998), but the first implementation of ANNs as a new approach for data assimilation was carried out by Nowosad et al. (2000a). ANNs have also been used for data assimilation by Liaqat et al. (2001) and Tang and Hsieh (2001), where the ANN represents an unknown equation in the mathematical equation system of the model. The technique developed by Nowosad et al. (2000a) is quite different from the latter two: it produces the analysis (assimilated) data as the output of the ANN, with the model data and the observations as inputs.

An ANN is an arrangement of units characterized by a large number of very simple neuron-like processing units; a large number of weighted connections between the units, in which the knowledge of the network is stored; and highly parallel, distributed control. Two distinct phases can be distinguished in the use of an ANN: the training phase (learning process) and the run phase (activation). The training phase is an iterative process that adjusts the weights for the best performance of the network in establishing the mapping of many input/target vector pairs. Once trained, the weights are fixed and new inputs can be presented to the network, which calculates the corresponding outputs based on what it has learned.

In this paper, a multilayer perceptron neural network (MLP-NN) (Haykin, 1994; Nowosad, 2001) is trained to emulate a KF-based data assimilation system. This novel data assimilation strategy is applied to a three-wave model of auroral radio emissions near the electron plasma frequency, involving resonant interactions of Langmuir, Alfvén, and whistler waves (Chian et al., 1994, 2002; Rempel et al., 2003).

Observational evidence of auroral radio emission and of nonlinear coupling between Langmuir, Alfvén, and whistler waves has been obtained in rocket experiments in the Earth's auroral plasmas (Boehm et al., 1990). These auroral whistler waves may explain the leaked AKR (auroral kilometric radiation), providing radio signatures of the solar–terrestrial connection, and may be used for monitoring space weather from the ground.

2. Nonlinear coupled wave equations

It has been shown that the nonlinear coupling of Langmuir waves with low-frequency magnetic field fluctuations, such as Alfvén waves, may provide an efficient mechanism for generating magnetospheric radio waves (Chian et al., 2002; Rempel et al., 2003). In this section, we present a nonlinear dynamical model of three-wave interactions involving Langmuir, Alfvén, and whistler waves in planetary magnetospheres.

Consider the nonlinear parametric interaction of Langmuir (L), whistler (W), and Alfvén (A) waves, all propagating along the ambient magnetic field $\mathbf{B} = B_0 \hat{z}$. We assume the phase-matching conditions $\omega_L \approx \omega_W + \omega_A$, $k_L = k_W + k_A$. Then, following Lopes and Chian (1996), by starting with a two-fluid plasma description, separating the physical variables into two time scales, and considering slow spatiotemporal modulations of the wave fields, the set of coupled wave equations governing the nonlinear interaction $L \rightleftharpoons W + A$ is given by (see also Chian et al., 2002; Rempel et al., 2003)

$\dot{A}_L = \nu_L A_L + A_W A_A$,   (1)

$\dot{A}_W = \nu_W A_W - A_L A_A^{*}$,   (2)

$\dot{A}_A = i \delta A_A + \nu_A A_A - A_L A_W^{*}$,   (3)

where $A_\alpha$ is the normalized slowly varying complex envelope of the wave electric field, with $\alpha = (L, W, A)$; $\delta$ is a parameter accounting for the frequency mismatch among the three waves, set here to $\delta = 2$; and $\nu_\alpha$ are the normalized growth/damping parameters. The dot denotes the derivative with respect to the phase variable $\tau = k(z - vt)$, where $v$ and $k$ are an arbitrary wave velocity and wave number, respectively. We set $\nu_L = 1$ (linearly unstable L wave) and $\nu_W = \nu_A = -\nu$ (linearly damped W and A waves).

The dynamics of the system described by Eqs. (1)–(3) is extremely sensitive to small variations of the control parameter $\nu$. Chian et al. (2002) performed a detailed analysis of this system, using a bifurcation diagram to study the evolution from periodic to chaotic and intermittent regimes. Routes to chaos such as period-doubling cascades and saddle-node bifurcations, as well as global bifurcations, are found in a small range of $\nu$. In the present paper we choose $\nu = 28.14$ for the periodic regime and $\nu = 28.128$ for the chaotic regime.
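For concreteness, the sketch below integrates Eqs. (1)–(3) with a fourth-order Runge–Kutta scheme and the parameter values quoted in this paper ($\nu_L = 1$, $\nu_W = \nu_A = -\nu$, $\delta = 2$, $\Delta t = 10^{-2}$). It is an illustrative reconstruction, not the authors' code; the function names and the initial condition are arbitrary choices.

```python
import numpy as np

# Control parameter from the paper: nu = 28.128 (chaotic) or 28.14 (periodic).
NU, DELTA = 28.128, 2.0
NU_L, NU_W, NU_A = 1.0, -NU, -NU   # growth/damping coefficients


def three_wave_rhs(A):
    """Right-hand side of Eqs. (1)-(3); A = [A_L, A_W, A_A] (complex envelopes)."""
    A_L, A_W, A_A = A
    return np.array([
        NU_L * A_L + A_W * A_A,                              # Eq. (1)
        NU_W * A_W - A_L * np.conj(A_A),                     # Eq. (2)
        1j * DELTA * A_A + NU_A * A_A - A_L * np.conj(A_W),  # Eq. (3)
    ])


def rk4_step(A, dt):
    """One fourth-order Runge-Kutta step for the three-wave system."""
    k1 = three_wave_rhs(A)
    k2 = three_wave_rhs(A + 0.5 * dt * k1)
    k3 = three_wave_rhs(A + 0.5 * dt * k2)
    k4 = three_wave_rhs(A + dt * k3)
    return A + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0


def integrate(A0, n_steps, dt=1e-2):
    """Integrate and return the |A| time series (A0 is a hypothetical initial condition)."""
    traj = np.empty((n_steps + 1, 3))
    A = np.asarray(A0, dtype=complex)
    traj[0] = np.abs(A)
    for n in range(n_steps):
        A = rk4_step(A, dt)
        traj[n + 1] = np.abs(A)
    return traj


if __name__ == "__main__":
    traj = integrate([1.0 + 0j, 1.0 + 0j, 1.0 + 0j], n_steps=5000)
    print(traj[-1])   # |A_L|, |A_W|, |A_A| at the final step
```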

A prediction system based on a mathematical model is only an approximation of the real world. Therefore, the forecast drifts away from the true state of the dynamical system. We can force the mathematical system to follow the dynamics of the real world by inserting observational data (obtained from measurements of the natural system) into the mathematical prediction model. However, when noisy data (a permanent feature of observations) are inserted into the numerical model without a suitable assimilation technique, the forecast simply loses the dynamics of the process. Fig. 1 illustrates such an insertion for the variable $A_W$ in the three-wave model: the observed data are inserted into the mathematical model without any assimilation technique. Here, the synthetic observations are generated by adding low-level random noise to the exact value, $A^{\mathrm{Obs}}_W(t_n) = A_W(t_n) + \lambda r_n$, where $\lambda = 10^{-5}$ and $r_n$ is a random value at time $t_n$. The continuous line represents the reference model and the dash-dotted line represents the shock¹ created in the forecast by the noisy data inserted into the numerical model without any assimilation technique. Clearly, the dynamics of the system is lost, even for a small difference in the IC.

¹ Meteorologists use the word shock to denote the dynamical equilibrium broken by the data insertion.

Fig. 1. (a) Shock illustration: data inserted without an assimilation technique every five $\Delta t$, for a short time period; (b) noise added to the model to create the synthetic data ($\nu = 28.128$).

3. Methodology

In recent years data assimilation has become crucial in forecasting, with applications to meteorology (Miller et al., 1994), the magnetosphere (Garner et al., 1999), the ionosphere (Guo et al., 2003; Hajj et al., 2004; Scherliess et al., 2004; Schunk et al., 2004), and the solar wind (Arge et al., 2004). Several methods have been developed for data assimilation, with different strategies for combining numerical forecasts and observations. The extended KF (Jazwinski, 1970) is the gold standard for data assimilation (see Kalnay, 2003, p. 179). The KF provides a recursive solution of the least-squares method. For linear problems the KF provides the Best Linear Unbiased Estimate (BLUE) of the state, and the extended KF provides suboptimal solutions for nonlinear cases. Despite its accuracy, the KF is computationally expensive, and simpler strategies are being investigated. As mentioned before, ANNs have been suggested as an alternative tool for data assimilation (Hsieh and Tang, 1998; Nowosad et al., 2000b; Liaqat et al., 2001).

From the mathematical point of view, the assimilation process can be represented by a weighted balance between the forecast value and the observation. This can be expressed as

$x^a = x^f + W \rho[y^o - H(x^f)]$,   (4)

where $x^a$ is the analysis (estimated) value; $x^f$ is the forecast (from the mathematical model); $W$ is the weighting matrix (most of the effort of the assimilation techniques goes into the computation of this matrix, which is generally calculated from the covariance matrices of the forecast and observation errors); $y^o$ denotes the observation; $H$ represents the observation operator; $\{y^o - H(x^f)\}$ is the innovation; and $\rho[\cdot]$ is a discrepancy function. For the KF, the matrix $W$ is the Kalman gain and $\rho[x] = x$. Data assimilation by ANN is not a weighted combination similar to Eq. (4); the analysis is obtained from a nonlinear mapping,

$x^a = f_{\mathrm{ANN}}(y^o, x^f)$.   (5)
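As a minimal illustration of the two analysis forms, the sketch below contrasts the weighted update of Eq. (4), with $\rho[x] = x$, against the generic nonlinear mapping of Eq. (5). The function `f_ann` is only a placeholder for any trained network; it is not part of the paper.

```python
import numpy as np

def analysis_weighted(x_f, y_o, W, H):
    """Eq. (4) with rho[x] = x: x_a = x_f + W (y_o - H x_f)."""
    innovation = y_o - H @ x_f          # observation-minus-forecast residual
    return x_f + W @ innovation

def analysis_ann(x_f, y_o, f_ann):
    """Eq. (5): the analysis is a nonlinear mapping of (y_o, x_f)."""
    return f_ann(np.concatenate([y_o, x_f]))
```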

3.1. Kalman filter

Starting from a prediction model (subscript $n$ denotes the discrete time-step, and superscript $f$ the forecast value) and an observation system,

$w^f_{n+1} = F_n(w^f_n) + \mu_n$,   (6)

$z^f_n = H_n(w^f_n) + \nu_n$,   (7)

where $F_n$ is our mathematical model and $\mu_n$ is the stochastic forcing (random modeling error). The observation system is modeled by the operator (or matrix, for linear systems) $H_n$, and $\nu_n$ is the noise associated with the observations. The usual hypotheses of Gaussian probability density functions with zero mean are adopted for both noises. The state vector is defined as


$w_{n+1} = [(A_L)_{n+1}, (A_W)_{n+1}, (A_A)_{n+1}]^T$, and it is estimated through the recursion

$w^a_{n+1} = (I - G_{n+1} H_{n+1}) F_n w^a_n + G_{n+1} z_{n+1}$,   (8)

where $w^a_{n+1}$ denotes the estimated (analysis) value, $H_n$ is a matrix representing the observation system, $z$ denotes the measured values, and $G_n$ is the Kalman gain, computed from the minimization of the estimation error variance $J_{n+1}$ (Jazwinski, 1970),

$J_{n+1} = E\{(w^a_{n+1} - w^f_{n+1})^T (w^a_{n+1} - w^f_{n+1})\}$,   (9)

where $E\{\cdot\}$ denotes the expected value. For computing the Kalman gain matrix, three matrices are considered: $Q_n$ (the covariance of $\mu_n$), $R_n$ (the covariance of $\nu_n$), and the forecast error covariance matrix $P^f_n$, given by

$P^f_{n+1} = F_n P^a_n F_n^T + Q_n$,   (10)

the Kalman gain being calculated as

$G_{n+1} = P^f_{n+1} H_n^T [R_n + H_n P^f_n H_n^T]^{-1}$.   (11)

The assimilation is driven by the innovation (residual)

$r_{n+1} \equiv z_{n+1} - z^f_{n+1} = z_{n+1} - H_n w^f_{n+1}$,   (12)

and the error propagation is expressed by the matrix

$P^a_{n+1} = [I - G_{n+1} H_n] P^f_{n+1}$.   (13)

The KF technique for data assimilation can be summarized as:

(1) compute the forecast step: $w^f_{n+1} = F_n(w^f_n)$;
(2) compute the predicted observation: $z^f_{n+1} = H_n(w^f_n)$;
(3) compute the Kalman gain $G_{n+1}$ (Eq. (11));
(4) perform the assimilation (analysis): $w^a_{n+1}$ (Eq. (8));
(5) update the error covariance matrix $P^a_{n+1}$ (Eq. (13)).
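The sketch below implements one such forecast/analysis cycle. It is a generic extended-KF step under two assumptions that the paper does not spell out: the model operator is linearized numerically (finite-difference Jacobian) for the covariance propagation of Eq. (10), and the state is a real vector (for the three-wave system it could be, for instance, the real and imaginary parts of $A_L$, $A_W$, $A_A$ stacked together), with a linear observation operator $H$ (identity if all components are observed). The covariances $Q$ and $R$ would be the diagonal matrices quoted later in Section 4.

```python
import numpy as np

def numerical_jacobian(f, x, eps=1e-6):
    """Finite-difference Jacobian of the model map f at state x (a simplification)."""
    m = x.size
    J = np.empty((m, m))
    fx = f(x)
    for j in range(m):
        dx = np.zeros(m)
        dx[j] = eps
        J[:, j] = (f(x + dx) - fx) / eps
    return J

def kf_cycle(w_a, P_a, z_obs, model, H, Q, R):
    """One forecast/analysis cycle following steps (1)-(5) and Eqs. (8), (10), (11), (13)."""
    # (1) forecast step
    w_f = model(w_a)
    F = numerical_jacobian(model, w_a)               # linearized model for Eq. (10)
    P_f = F @ P_a @ F.T + Q                          # Eq. (10)
    # (2) predicted observation
    z_f = H @ w_f
    # (3) Kalman gain, Eq. (11)
    G = P_f @ H.T @ np.linalg.inv(R + H @ P_f @ H.T)
    # (4) analysis; note that Eq. (8) is equivalent to w_f + G (z - H w_f)
    w_a_new = w_f + G @ (z_obs - z_f)
    # (5) error covariance update, Eq. (13)
    P_a_new = (np.eye(w_a.size) - G @ H) @ P_f
    return w_a_new, P_a_new
```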

The KF algorithm is an expensive procedure from the computational point of view. Under certain conditions the algorithm can be simplified. For example, if one is only interested in obtaining the steady-state filter, and the dimension of the system noise process is much smaller than the dimension of the system state, a Chandrasekhar-type algorithm can be employed to reduce the computational cost (Zhang et al., 1997). In our case, where the system has low dimension, this is not necessary. One of the difficulties associated with the KF algorithm is the estimation of the modeling error covariance matrix $Q$. One approach is to determine $Q$ by solving an associated Fokker–Planck (FP) equation (Belyaev and Tanajura, 2002). Alternatively, the ensemble KF can be applied to compute such a covariance matrix (Kalnay, 2003). Another strategy is to employ an adaptive KF, in which the modeling error covariance matrix $Q$ is parameterized by a few (or only one) parameters; such parameters can be estimated with a secondary KF (Nowosad et al., 2000b).

3.2. Artificial neural network

In order to emulate a KF in data assimilation, an MLP-NN was trained with the backpropagation algorithm (Haykin, 1994). Multilayer perceptrons with the backpropagation learning algorithm, commonly referred to as backpropagation neural networks (NNs), are feedforward networks composed of an input layer, an output layer, and a number of hidden layers of neurons. The $k$-th neuron can be described by the two coupled equations

$u_k = \sum_{j=1}^{m} w_{kj} x_j$,   (14)

$y_k = \varphi(u_k + b_k)$,   (15)

where $x_1, \ldots, x_m$ are the inputs; $w_{k1}, \ldots, w_{km}$ are the connection weights of neuron $k$; $u_k$ is the output of the linear combination of the weighted inputs; $b_k$ is the bias; $\varphi(\cdot)$ is the activation function; and $y_k$ is the neuron output. The bias is a special weight that applies a linear shift to $u_k$. The activation function implemented in the hidden layers is the hyperbolic tangent, and a linear function is used in the output layer. An MLP-NN with two hidden layers and a learning rate equal to 0.1, without momentum, was implemented and trained with the backpropagation method, using data from the numerical forecast (numerical integration of Eqs. (1)–(3)) and simulated observations (numerical forecast plus noise) as input vectors, and data from the KF as target vectors.

The data assimilation using the MLP-NN has two phases: (i) the training phase, in which a set of filtered (assimilated) data $w^a = [w^a_1\ w^a_2 \ldots w^a_m]^T$ is employed to calculate the connection weights; (ii) the activation phase, in which the trained MLP-NN computes the assimilated data from the observed and predicted data given as input to the network. Considering an ANN with $2m$ inputs, $m$ outputs, and $K$ hidden layers ($K \ll m$) with $O(m)$ neurons in each hidden layer, it can be shown that, for the activation process, the ANN is an algorithm of complexity $O(m^2)$, whereas a KF with $m$ state variables and $m$ observations has complexity $O(m^3)$ (Nowosad, 2001).

Other types of ANN can also be applied to data assimilation (Härter and Campos Velho, 2005). However, for the present system, the MLP-NN with the backpropagation learning algorithm has produced good results. In a previous paper we proposed a modification of the learning algorithm (Nowosad et al., 2003); in that case, the assimilation process was applied to the shallow-water model, a standard model in meteorology and oceanography. That model has a much higher dimension than the problem studied here, and such a procedure for the training process did not bring improvement.
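The sketch below shows how such a KF emulator could be set up, using scikit-learn's MLPRegressor as a stand-in for the authors' own backpropagation code: two hidden layers, hyperbolic-tangent hidden activations, a linear output layer, learning rate 0.1 and no momentum, trained on (observation, forecast) → KF-analysis pairs. The data arrays are random placeholders for the values produced by a KF assimilation run, and the early-stopping holdout split only stands in for the separate cross-validation set described in Section 4.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Placeholder training data: inputs are (observation, forecast) pairs, targets are
# the KF analyses stored during a previous assimilation run (Section 4, step (2)).
n_train, m = 20_000, 3                      # e.g. the three wave amplitudes
rng = np.random.default_rng(0)
y_obs = rng.normal(size=(n_train, m))       # hypothetical observations
x_forecast = rng.normal(size=(n_train, m))  # hypothetical model forecasts
w_kf_analysis = 0.5 * (y_obs + x_forecast)  # hypothetical KF targets

# Two hidden layers (2 and 10 neurons), tanh activations, linear output,
# learning rate 0.1, no momentum -- mirroring the configuration in the text.
emulator = MLPRegressor(hidden_layer_sizes=(2, 10), activation="tanh",
                        solver="sgd", learning_rate_init=0.1, momentum=0.0,
                        early_stopping=True, validation_fraction=0.3,
                        max_iter=500, random_state=0)

# Training phase: learn the mapping (y_obs, x_forecast) -> KF analysis.
emulator.fit(np.hstack([y_obs, x_forecast]), w_kf_analysis)

# Activation phase: during forecasting, the KF update is replaced by the network.
x_a = emulator.predict(np.hstack([y_obs[:1], x_forecast[:1]]))
print(x_a)
```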

4. Numerical experiments

Our numerical experiments are intended only to test the strategy. For simplicity, we assume that all error covariance matrices are diagonal. The numerical values adopted are

$Q_n = 0.1\,I$, $\quad R_n = 2\,I$, $\quad (P^f_0)_{ii} = 10\,[(w^f_0)_i]^2$, $\quad (P^f_0)_{ij} = 0$ for $i \neq j$.   (16)

The three-wave system was integrated using a fourth-order Runge–Kutta scheme with $\Delta t = 10^{-2}$. The insertion of observational data into the model is done every five time-steps. The network is trained using cross-validation. Cross-validation consists of splitting the target data set into two subsets, one for training and another for validation. We call an epoch a whole training cycle involving the presentation of the first (training) data set to the ANN. After each epoch, the connection weights are evaluated on the second (validation) data set to check the ANN's generalization skill. For the training phase, a first data set containing 20 000 pairs of input/target vectors is considered (observation and forecast vectors in the input layer and the KF output as the target vector). After training, cross-validation is performed with a second data set of 30 000 vector pairs. The data sets for the validation phase are obtained from a previous simulation using the KF for data assimilation every five time-steps. As mentioned in the Introduction, the goal here is to design an ANN that emulates the KF. The strategy is outlined below:

(1) The three-wave system is run, and the KF is employed for data assimilation; for the worked example, it is not necessary to use an additional technique to estimate the modeling error covariance matrix $Q$.
(2) The data assimilated by the KF are stored, to be used as targets for the MLP-NN training.
(3) The MLP-NN is trained using the assimilated data (or analysis, $x^a$), and a set of connection weights is computed.
(4) With the weight set found in step (3), the MLP-NN is ready to be applied for data assimilation, emulating the KF performance.

The errors for training ($TE$), cross-validation ($CVE$), and estimation ($EE$) for each epoch $n$ are computed as follows:

$TE_n = \frac{1}{N_{TR}} \sum_{i=1}^{N_{TR}} \frac{1}{2} (A^{KF}_{\alpha,i} - A^{NN}_{\alpha,i})^2$,   (17)

$CVE_n = \frac{1}{N_R} \sum_{i=N_{TR}}^{N_{CV}} \sqrt{(A^{KF}_{\alpha,i} - A^{NN}_{\alpha,i})^2}$,   (18)

$EE_i = \frac{1}{3} \sum_{\alpha=1}^{3} \sqrt{(A^{f}_{\alpha,i} - A^{NN}_{\alpha,i})^2}$,   (19)

where $\alpha = L, W, A$; $N_{TR}$ denotes the number of elements of the training set; $N_{CV}$ is the number of elements of the cross-validation set; the superscript $f$ denotes the forecast value; and the superscripts $KF$ and $NN$ denote values assimilated by the KF and the NN, respectively. During the cross-validation period $t_i$, $i = N_{TR}, N_{TR}+1, \ldots, N_{CV}$, the data for the simulated prediction do not belong to the training data set. In $EE_i$, $i = N_{CV}, N_{CV}+1, \ldots, N_T$, with $N_T = 10^4$, and in $CVE_n$, $N_R = N_{CV} - N_{TR}$.
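A direct transcription of the error measures (17)–(19) could look as follows. The arrays are assumed to hold the KF analysis, the NN analysis, and the raw forecast, which is our reading of the indices in Eqs. (17)–(19); the function and variable names are not from the paper.

```python
import numpy as np

def training_error(a_kf, a_nn, n_tr):
    """Eq. (17): mean half-squared KF-NN difference over the training window."""
    d = a_kf[:n_tr] - a_nn[:n_tr]
    return np.mean(0.5 * d**2)

def cross_validation_error(a_kf, a_nn, n_tr, n_cv):
    """Eq. (18): mean absolute KF-NN difference over the cross-validation window."""
    d = a_kf[n_tr:n_cv] - a_nn[n_tr:n_cv]
    return np.mean(np.sqrt(d**2))          # sqrt of a square = absolute value

def estimation_error(a_f, a_nn, i):
    """Eq. (19): mean absolute forecast-NN difference over the 3 waves at step i."""
    return np.mean(np.abs(a_f[:, i] - a_nn[:, i]))   # a_f, a_nn shaped (3, N_T)
```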

To design an appropriate topology for the ANN, we tried to follow the same standard applied to data assimilation in Lorenz's system (Härter and Campos Velho, 2005). In that paper, the authors investigated different ANN types and topologies for data assimilation, considering one and two hidden layers with several numbers of neurons: multilayer perceptron, radial basis function, Elman NN, and Jordan NN. In that experiment, convergence was observed for all ANNs and topologies studied; good results were obtained with one hidden layer, and two hidden layers did not improve the results. For the present study, one hidden layer did not produce acceptable results, whereas good results were obtained with two hidden layers.

We first test the MLP-NN for data assimilation in the periodic regime, at $\nu = 28.14$. We use an MLP-NN with two hidden layers, with 2 neurons in the first hidden layer and 10 neurons in the second. During the learning process, the training error decreases dramatically over the first two epochs and remains almost constant for the rest of the training. This reflects the benefits of using cross-validation.

Fig. 2. Time series of $|A_W|$ for the periodic regime ($\nu = 28.14$): (a) reference model (upper); (b) data assimilation using the KF (middle); (c) data assimilation using the MLP-NN (bottom).

Fig. 3. Assimilation error ($EE$) for the periodic regime ($\nu = 28.14$): (a) Kalman filter (upper); (b) neural network (bottom).

After the choice of the best weight set, Eqs. (1)–(3) are integrated with data assimilation every five time-steps. Fig. 2 depicts the last $10^3$ time-steps of a time series of $|A_W|$. Fig. 2(a) shows the numerical model (reference, without noise); Fig. 2(b) shows the integration of Eqs. (1)–(3) in which the noisy observational data are inserted into the model and assimilated by the KF; in Fig. 2(c) the data assimilation is performed by the MLP-NN. Both the KF and the MLP-NN carry out the assimilation effectively. Fig. 3 shows the mean errors of the KF and the MLP-NN; small errors are obtained for both schemes.

Next we analyze the efficiency of the two data assimilation systems (MLP-NN and KF) in chaotic dynamics. The same steps performed for the periodic regime are repeated for the chaotic regime, at $\nu = 28.128$. The maximum Lyapunov exponent of the chaotic attractor is $\lambda \approx 0.35$, computed with the algorithm presented by Wolf et al. (1985). The same training procedure applied in the periodic regime was employed for the chaotic dynamics, again using the cross-validation strategy. As before, the learning error decreases dramatically over the first two iterations and remains almost constant for the rest of the process. The minimum average error of 0.00518 is reached by the network after three epochs.

After the choice of the best weight set, the system is integrated with data assimilation every five time-steps, using the KF and the MLP-NN (with two hidden layers, 2 neurons in the first hidden layer and 10 neurons in the second). The last $10^3$ time-steps of the time series of $|A_W|$ are shown in Fig. 4 (upper part); the assimilation by the KF and by the MLP-NN is also shown in Fig. 4, in the middle and at the bottom, respectively. Fig. 5 shows the mean errors for the KF and the MLP-NN. Both assimilation systems work well, but the MLP-NN seems to give better results, as seen by comparing the high peaks in Fig. 5. The errors in the chaotic regime are larger than in the periodic regime (Fig. 3).

Fig. 4. Time series of $|A_W|$ for the chaotic regime ($\nu = 28.128$): (a) reference model (upper); (b) data assimilation using the KF (middle); (c) data assimilation using the MLP-NN (bottom).

Fig. 5. Assimilation error ($EE$) for the chaotic regime ($\nu = 28.128$): (a) Kalman filter (upper); (b) neural network (bottom).
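The quoted value $\lambda \approx 0.35$ was obtained with the Wolf et al. (1985) algorithm. As a rough cross-check only, a simpler two-trajectory (Benettin-type) estimate, which is not the algorithm the authors used, can be sketched as follows; `step(x)` is any one-step integrator, for instance the `rk4_step` function from the earlier sketch with $\Delta t = 10^{-2}$.

```python
import numpy as np

def largest_lyapunov(step, x0, n_steps, dt, d0=1e-8, seed=0):
    """Benettin-style estimate of the largest Lyapunov exponent.

    `step(x)` advances the state by one time-step dt. This is a simplified
    two-trajectory estimate, not the Wolf et al. (1985) time-series algorithm.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=complex)
    pert = rng.normal(size=x.shape)             # random initial perturbation
    y = x + d0 * pert / np.linalg.norm(pert)
    log_sum = 0.0
    for _ in range(n_steps):
        x, y = step(x), step(y)
        d = np.linalg.norm(y - x)
        log_sum += np.log(d / d0)
        y = x + (d0 / d) * (y - x)              # renormalize the separation
    return log_sum / (n_steps * dt)             # exponent per unit time
```

Transients and the renormalization interval affect such an estimate, so it should be taken only as an order-of-magnitude comparison with the quoted value.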

5. Concluding remarks

In this work the techniques of the Kalman filter (KF) and of artificial neural networks (ANNs) were tested for data assimilation in a three-wave model of auroral radio emissions, in periodic and chaotic regimes. An ANN was trained with data from a KF using a cross-validation scheme. Both the KF and the ANN techniques are effective in the periodic and chaotic regimes. However, after the training phase the ANN presents a lower computational cost than the KF. Cross-validation is a good strategy for choosing the best weight set.

The nonlinear three-wave model given by Eqs. (1)–(3) is a realistic formulation of auroral radio emissions. High-frequency leaked auroral kilometric radiation (Chian et al., 1994) can be detected on a continuous basis by ground receivers in the polar region, which provides enough data to use either the KF or the NN. Thus, auroral radio emissions can be used as a practical means for monitoring the solar–terrestrial relation and for the prediction of space weather. Taking into account that measuring devices may have different observation frequencies and noise levels, and that the acquisition system may partially stop operating, implying no access to the full information (a device could, for some reason, stop monitoring for a period of time), these cases deserve a deeper investigation. Such an investigation will be the focus of future work.


A study of the predictability of the three-wave system could be carried out using the bred-vector approach (Kalnay, 2003), as employed for Lorenz's system (Evans et al., 2004). Another interesting future study is the use of an unsupervised NN, such as the Hopfield NN; however, for the present nonlinear system, a nonlinear version of the Hopfield NN should be applied, such as those used in recent papers (Sebastião and Braga, 2005; Viterbo et al., 2004). It is important to note that ANNs are intrinsically parallel procedures, and a hardware implementation (neuro-computers) is also possible. Hence, we propose ANNs as an assimilation technique for space weather prediction.


Acknowledgments


This work was supported by CNPq and FAPESP.


References





Arge, C.N., Luhmann, J.G., Odstril, D., Schrijver, C.J., Li, Y., 2004. Stream structure and coronal sources of the solar wind during the May 12th, 1997 CME. Journal of Atmospheric and Solar-Terrestrial Physics 66, 1295–1309.
Belyaev, K., Tanajura, C.A.S., 2002. An extension of a data assimilation method based on the application of the Fokker–Planck equation. Applied Mathematical Modelling 26, 1019–1027.
Boehm, M.H., Carlson, C.W., McFadden, J.P., Clemmons, J.H., Mozer, F.S., 1990. High-resolution sounding rocket observations of large-amplitude Alfvén waves. Journal of Geophysical Research 95, 12157–12171.
Chian, A.C.-L., 2003. Foreword: advances in space environment research. Space Science Reviews 107, 1–3.
Chian, A.C.-L., Lopes, S.R., Alves, M.V., 1994. Generation of auroral whistler-mode radiation via nonlinear coupling of Langmuir waves and Alfvén waves. Astronomy and Astrophysics 290, L13–L16.
Chian, A.C.-L., Rempel, E.L., Borotto, F.A., 2002. Chaos in magnetospheric radio emissions. Nonlinear Processes in Geophysics 9, 435–441.
Daley, R., 1993. Atmospheric Data Analysis. Cambridge University Press, Cambridge.
Evans, E., Bhatti, N., Kinney, L., Pann, L., Peña, M., Yang, S.-C., Kalnay, E., Hansen, J., 2004. RISE undergraduates find Lorenz's model regime changes predictable. Bulletin of the American Meteorological Society 85, 520–524.
Garner, T.W., Wolf, R.A., Spiro, R.W., Thomsen, M.F., 1999. First attempt at assimilating data to constrain a magnetospheric model. Journal of Geophysical Research 104, 25145–25152.
Grebogi, C., Ott, E., Yorke, J.A., 1987. Chaos, strange attractors, and fractal basin boundaries in nonlinear dynamics. Science 238, 585–718.
Guo, J.S., Shang, S.P., Shi, J.K., Zhang, M.L., Luo, X.G., Zheng, H., 2003. Optimal assimilation for ionospheric weather – theoretical aspects. Space Science Reviews 107, 229–250.
Hajj, G.A., Wilson, B.D., Wang, C., Pi, X., Rosen, I.G., 2004. Data assimilation of ground GPS total electron content into a physics-based ionospheric model by use of the Kalman filter. Radio Science 1, RS1S05.
Härter, F.P., Campos Velho, H.F., 2005. Recurrent and feedforward neural networks trained with cross validation scheme applied to the data assimilation in chaotic dynamics. Revista Brasileira de Meteorologia 20, 411–420.
Haykin, S., 1994. Neural Networks: A Comprehensive Foundation. Macmillan, New York.
Hsieh, W.W., Tang, B.Y., 1998. Applying neural network models to prediction and data analysis in meteorology and oceanography. Bulletin of the American Meteorological Society 79, 1855–1870.
Jazwinski, A.H., 1970. Stochastic Processes and Filtering Theory. Academic Press, New York.
Kalnay, E., 2003. Atmospheric Modeling, Data Assimilation and Predictability. Cambridge University Press, Cambridge.
Liaqat, A., Fukuhara, M., Takeda, T., 2001. Application of neural network collocation method to data assimilation. Computer Physics Communications 141, 350–364.


Lopes, S.L., Chian, A.C.-L., 1996. A coherent nonlinear theory of auroral Langmuir–Alfvén–whistler (LAW) events in the planetary magnetosphere. Astronomy and Astrophysics 305, 669–676.
Miller, R.N., Ghil, M., Gauthiez, F., 1994. Advanced data assimilation in strongly nonlinear dynamical systems. Journal of the Atmospheric Sciences 51, 1037–1056.
Nowosad, A.G., 2001. New approaches for meteorological data assimilation. D.Sc. dissertation, Applied Computing Graduate Course, National Institute for Space Research (CAP-INPE), São José dos Campos (SP), Brazil (in Portuguese).
Nowosad, A.G., Campos Velho, H.F., Rios Neto, A., 2000a. Neural network as a new approach for data assimilation. In: Proceedings of the Brazilian Congress on Meteorology, 16–20 October, Rio de Janeiro (RJ), Brazil (paper code PT00002), pp. 3078–3086.
Nowosad, A.G., Rios Neto, A., Campos Velho, H.F., 2000b. Data assimilation using an adaptative Kalman filter and Laplace transform. Hybrid Methods in Engineering 2, 291–310.
Nowosad, A.G., Campos Velho, H.F., 2003. New learning scheme for multilayer perceptron neural network applied to meteorological data assimilation. In: XXIV Iberian Latin-American Congress on Computational Methods in Engineering (CILAMCE-2003), 29–31 October, Ouro Preto (MG), Brazil (Proceedings on CD-ROM: paper code cil26228, 8pp.).
Rempel, E.L., Chian, A.C.-L., Borotto, F.A., 2003. Chaotic temporal variability of magnetospheric radio emissions. Space Science Reviews 107, 503–506.
Scherliess, L., Schunk, R.W., Sojka, J.J., Thompson, D.C., 2004. Development of a physics-based reduced state Kalman filter for the ionosphere. Radio Science 1, RS1S04.
Schunk, R.W., Scherliess, L., Sojka, J.J., Thompson, D.C., Anderson, D.N., Codrescu, M., Minter, C., Fuller-Rowell, T.J., Heelis, R.A., Hairston, M., Howe, B.M., 2004. Global assimilation of ionospheric measurements (GAIM). Radio Science 1, RS1S02.
Sebastião, R.C.O., Braga, J.P., 2005. Retrieval of transverse relaxation time distribution from spin-echo data by recurrent neural network. Journal of Magnetic Resonance 177, 146–151.
Tang, Y., Hsieh, W., 2001. Coupling neural networks to incomplete dynamical systems via variational data assimilation. Monthly Weather Review 129, 818–834.
Viterbo, V.C., Braga, J.P., Shiguemori, E.H., Silva, J.D.S., Campos Velho, H.F., 2004. Atmospheric temperature retrieval using non-linear Hopfield neural network. In: Inverse Problems, Design and Optimization Symposium (IPDO), vol. 2, 17–19 March, Rio de Janeiro (RJ), Brazil, pp. 57–60.
Wolf, A., Swift, J.B., Swinney, H.L., Vastano, J.A., 1985. Determining Lyapunov exponents from a time series. Physica D 16, 285–317.
Zhang, X.F., Heemink, A.W., van Eijkeren, J.C.H., 1997. Data assimilation in transport models. Applied Mathematical Modelling 21, 2–14.