A prediction model of short-term ionospheric foF2 based on AdaBoost


Accepted Manuscript

Authors: Xiukuan Zhao, Baiqi Ning, Libo Liu, Gangbing Song
PII: S0273-1177(13)00762-X
DOI: http://dx.doi.org/10.1016/j.asr.2013.12.001
Reference: JASR 11630
To appear in: Advances in Space Research
Received Date: 1 September 2013
Revised Date: 28 November 2013
Accepted Date: 1 December 2013

Please cite this article as: Zhao, X., Ning, B., Liu, L., Song, G., A prediction model of short-term ionospheric foF2 based on AdaBoost, Advances in Space Research (2013), doi: http://dx.doi.org/10.1016/j.asr.2013.12.001

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Xiukuan Zhao a,b, Baiqi Ning a,b, Libo Liu a,b, Gangbing Song c,d

a Key Laboratory of Ionospheric Environment, Institute of Geology and Geophysics, Chinese Academy of Sciences, Beijing, 100029, China
b Beijing National Observatory of Space Environment, Institute of Geology and Geophysics, Chinese Academy of Sciences, Beijing, 100029, China
c Department of Mechanical Engineering, University of Houston, Houston, TX 77204, USA
d School of Civil Engineering, Dalian University of Technology, Dalian, Liaoning, China

Correspondence to: Dr. Xiukuan Zhao (email: [email protected]), Key Laboratory of Ionospheric Environment, Institute of Geology and Geophysics, Chinese Academy of Sciences, No. 19, Beitucheng Western Road, Chaoyang District, Beijing, 100029, China

Can also contact:
Dr. Baiqi Ning (email: [email protected]), Key Laboratory of Ionospheric Environment, Institute of Geology and Geophysics, Chinese Academy of Sciences, No. 19, Beitucheng Western Road, Chaoyang District, Beijing, 100029, China
Dr. Libo Liu (email: [email protected]), Key Laboratory of Ionospheric Environment, Institute of Geology and Geophysics, Chinese Academy of Sciences, No. 19, Beitucheng Western Road, Chaoyang District, Beijing, 100029, China
Dr. Gangbing Song (email: [email protected]), Department of Mechanical Engineering, University of Houston, Houston, TX 77204, USA

Abstract

In this paper, the AdaBoost-BP algorithm is used to construct a new model to predict the critical frequency of the ionospheric F2-layer (foF2) one hour ahead. Different indices were used to characterize ionospheric diurnal and seasonal variations and their dependence on solar and geomagnetic activity. These indices, together with the current observed foF2 value, were input into the prediction model, which output the foF2 value one hour ahead. We analyzed twenty-two years of foF2 data from nine ionosonde stations in the East-Asian sector. The first eleven years of data were used as a training dataset and the second eleven years as a testing dataset. The results show that AdaBoost-BP performs better than the BP Neural Network (BPNN), Support Vector Regression (SVR) and the IRI model. For example, the AdaBoost-BP absolute prediction error of foF2 at Irkutsk (a middle-latitude station) is 0.32 MHz, which is better than 0.34 MHz from BPNN and 0.35 MHz from SVR, and significantly better than the 0.64 MHz of the IRI model. Likewise, the AdaBoost-BP absolute prediction error at Taipei (a low-latitude station) is 0.78 MHz, which is better than 0.81 MHz from BPNN, 0.81 MHz from SVR and 1.37 MHz from the IRI model. Finally, the variation of the AdaBoost-BP prediction error with season, solar activity and latitude is also discussed.

Keywords: AdaBoost; ionosphere; foF2; short-term prediction

1 Introduction

The ionosphere is highly variable owing to the influence of solar, geomagnetic and other sources (Davies, 1990; Kelley, 2009; Mukhtarov et al., 2013). Accurate specification of the spatial and temporal variations of the ionosphere during geomagnetically quiet and disturbed conditions is critical for applications such as HF communications, satellite positioning and navigation, power grids and pipelines. Therefore, developing empirical models to forecast ionospheric perturbations is a high priority for real applications. The critical frequency of the F2 layer, foF2, is an important ionospheric parameter, especially for radio wave propagation applications. Thus, the variations of foF2 deserve further intensive research and require modeling with improving accuracy.

Forecasting the ionospheric F2 layer is still an unresolved and challenging issue (Belehaki et al., 2009; Chen et al., 2010a; Mikhailov et al., 2007). The Earth's upper atmosphere is an open system with many uncontrolled inputs forcing it from both above and below, and the physics of variations in the solar, magnetospheric and ionospheric characteristics cannot be properly modeled because current information about several natural physical parameters is lacking. Various efforts have been made to build station-specific, regional and global empirical models (Kutiev and Muhtarov, 2001; Liu et al., 2004; Liu et al., 2006; Pietrella and Perrone, 2008; Tsagouri and Belehaki, 2008). The most widely used model, the International Reference Ionosphere (IRI) (Bilitza, 2001; Bilitza and Reinisch, 2008), serves as a standard description of the ionospheric parameters. Other methods have also been adopted to predict ionospheric behavior, such as the neural network approach (Cander et al., 1998; Francis et al., 2000; Habarulema et al., 2007; Habarulema et al., 2010; McKinnell and Poole, 2004; Oyeyemi et al., 2006; Wang et al., 2013; Wintoft and Cander, 1999), the autocorrelation method (Muhtarov and Kutiev, 1999), the storm-time correction model (Araujo-Pradere and Fuller-Rowell, 2002; Araujo-Pradere et al., 2002), the assimilation ionosphere model (Liu et al., 2012; Sojka et al., 2001; Yue et al., 2007), and the support vector machine (Ban et al., 2011; Chen et al., 2010b). Several reviews (Belehaki et al., 2009; Bilitza, 2002; Schunk et al., 2003) have presented many of the available empirical models.

The development of accurate models to forecast ionospheric conditions on short time scales (typically 1-24 h in advance) is of crucial importance for related applications. In this paper, we propose a model based on the AdaBoost method, which can provide an acceptable accuracy and may be recommended for practical use. It is the first time that the AdaBoost method has been used in modeling or predicting ionospheric parameters. AdaBoost is an ensemble method: it is often possible to increase the accuracy of a classifier/predictor by averaging the decisions of an ensemble of classifiers/predictors (Freund and Schapire, 1996; Schapire et al., 1997; Schwenk and Bengio, 2000). It has been successfully applied to a number of problems, including disease detection (Morra et al., 2010), fingerprint classification (Liu, 2010), music classification (Bergstra et al., 2006) and prediction of protein structural class (Niu et al., 2006). The purpose of this work is to build a short-term empirical model for predicting foF2 one hour ahead and to evaluate its applicability and accuracy.

2 Short-term prediction model of foF2 based on AdaBoost

Among the statistical methods used to build short-term prediction models of foF2, Support Vector Regression (SVR) and the BP Neural Network (BPNN) are the two most popular. In SVR, the input x is first mapped onto an m-dimensional feature space using some fixed mapping (usually a Gaussian kernel), and then a linear model is constructed in this feature space. The SVR estimates are obtained by minimizing the empirical risk on the training data. A BPNN is composed of several layers of neurons: an input layer, one or more hidden layers and an output layer. Each layer of neurons receives its input from the previous layer or from the network input, and the output of each neuron feeds the next layer or the output of the network. The unknown parameters of any neural network are the weights, which can be found through training with different algorithms on known input-output patterns. The network learns by adjusting the weights connecting neurons in different layers to minimize the error of the outputs (Bishop, 1995; Haykin, 2008). For the application of different back propagation algorithms in modeling ionospheric parameters, see the paper by Habarulema and McKinnell (2012). AdaBoost is an ensemble method: the accuracy can often be increased by averaging the decisions of an ensemble of BPNNs. The short-term prediction model of foF2 based on AdaBoost is described as follows.


2.1 AdaBoost-BP algorithm

The AdaBoost algorithm, introduced by Freund (1995), solved many practical difficulties of the earlier boosting algorithms. It combines a set of weak learners in order to form a strong classifier. In general, more improvement can be expected when the individual classifiers are diverse and yet accurate. We use a standard BP neural network with an input layer, a hidden layer and an output layer as the weak learner for AdaBoost and call the result the AdaBoost-BP prediction algorithm. Pseudocode for AdaBoost-BP is given as follows:

(1) Given the training dataset $(x_1, y_1), \ldots, (x_N, y_N)$, where $x_i \in X$, $y_i \in Y$, and $X$ and $Y$ indicate some domain or instance space, respectively. The weight of the initial distribution on training example $x_i$ is denoted as $D_1(i) = 1/N$, $i = 1, \ldots, N$, where $N$ is the size of the training data. Then, configure the initial BP neural network parameters based on the input and output dimensions.

(2) For $t = 1, \ldots, T$, where $T$ is the number of weak learners:

• Train the BP neural network and obtain the hypothesis $h_t : X \to Y$.

• Calculate the error of $h_t$: $\varepsilon_t(i) = \left| h_t(x_i) - y_i \right|$.

• Update the distribution $D_t$: if $\varepsilon_t(i) \ge \Delta$ then $D_{t+1}(i) = c \times D_t(i)$, else $D_{t+1}(i) = D_t(i)$, with $Err_t = \sum_{i:\, \varepsilon_t(i) \ge \Delta} D_t(i)$, where $\Delta$ is a threshold and $c$ is a coefficient.

• Calculate the weight of the weak learner $h_t$: $\alpha_t = \dfrac{1}{2 \exp(Err_t)}$.

• Normalize the distribution: $D_{t+1}(i) = \dfrac{D_{t+1}(i)}{\sum_{j=1}^{N} D_{t+1}(j)}$.

(3) Output the final hypothesis: $h_{final} = \sum_{t=1}^{T} \alpha_t h_t$.
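The three steps of the pseudocode can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the names `train_bp` and `adaboost_bp` are hypothetical, the distribution $D_t$ is assumed to enter training as per-sample weights on the squared error (the text does not say how $D_t$ is used by the weak learner), the error is taken as an absolute value, and the weights $\alpha_t$ are normalized to sum to one so that the combined output stays on the scale of the target. The defaults for the threshold and the coefficient follow the values 0.2 and 1.1 quoted in Section 3.3.

```python
import numpy as np

def train_bp(X, y, weights, hidden=17, epochs=300, lr=0.01, seed=0):
    """One-hidden-layer network fitted by batch backpropagation.

    Assumption: D_t is used as per-sample weights on the squared error.
    """
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.3, (X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.3, hidden)
    b2 = 0.0
    w = weights / weights.sum()
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)               # hidden activations, (N, hidden)
        g = w * (H @ W2 + b2 - y)              # weighted output residuals, (N,)
        gH = np.outer(g, W2) * (1.0 - H ** 2)  # error propagated to the hidden layer
        W2 -= lr * (H.T @ g)
        b2 -= lr * g.sum()
        W1 -= lr * (X.T @ gH)
        b1 -= lr * gH.sum(axis=0)
    return lambda Z: np.tanh(Z @ W1 + b1) @ W2 + b2

def adaboost_bp(X, y, T=10, delta=0.2, c=1.1):
    """Steps (1)-(3) of the pseudocode with BP networks as weak learners."""
    N = len(y)
    D = np.full(N, 1.0 / N)                    # step (1): D_1(i) = 1/N
    learners, alphas = [], []
    for t in range(T):                         # step (2)
        h = train_bp(X, y, D, seed=t)
        eps = np.abs(h(X) - y)                 # eps_t(i)
        hard = eps >= delta                    # examples with error above the threshold
        err_t = D[hard].sum()                  # Err_t
        D = np.where(hard, c * D, D)           # up-weight the hard examples
        D = D / D.sum()                        # renormalize the distribution
        learners.append(h)
        alphas.append(1.0 / (2.0 * np.exp(err_t)))
    alphas = np.array(alphas)
    alphas = alphas / alphas.sum()             # assumption: normalized learner weights

    def predict(Z):                            # step (3): weighted combination
        return sum(a * h(Z) for a, h in zip(alphas, learners))

    return predict, alphas, D
```

Because $Err_t$ lies between 0 and 1, each unnormalized $\alpha_t$ lies between $1/(2e)$ and $1/2$, so no single weak learner can dominate the ensemble.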

2.2 Inputs to the prediction model

As with all mathematical modeling, we seek input quantities to which our target output is either known or suspected to be related. Observations show that the ionospheric foF2 exhibits diurnal variation, seasonal variation, solar cycles and magnetic variation.

2.2.1 Diurnal variation

Observations show that the ionospheric foF2 exhibits remarkable diurnal variations. We choose the hour number (HR) as the primary index of diurnal variation, where HR is an integer in the range $0 \le HR \le 23$. As in Poole and McKinnell (2000), HR is converted to two quadrature components to avoid an unrealistic numerical discontinuity at the midnight boundary:

$$HRS = \sin(2\pi \times HR / 24) \qquad (1)$$

$$HRC = \cos(2\pi \times HR / 24) \qquad (2)$$

where HRS and HRC are the sine and cosine components of HR, respectively.

2.2.2 Seasonal variation

It has been demonstrated that foF2 has distinct diurnal and seasonal variations. These variations are ultimately due to the rotation of the Earth and its revolution around the Sun. For more information about the seasonal behavior of the F region, please refer to Liu et al. (2009). As described in Poole and McKinnell (2000), we use the day of year DoY ($1 \le DoY \le 365$ or $366$) to describe the seasonal variation, similarly converted to two quadrature components:

$$DNS = \sin(2\pi \times DoY / DiY) \qquad (3)$$

$$DNC = \cos(2\pi \times DoY / DiY) \qquad (4)$$

where DiY, the total number of days in the year, equals 365 or 366 (366 for leap years), and DNS and DNC are the sine and cosine components of DoY, respectively.
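Equations (1)-(4) translate directly into code. The following is a minimal sketch using only the Python standard library; the function name `cyclic_time_features` and the dictionary layout are illustrative choices, not part of the original model.

```python
import calendar
import math
from datetime import datetime

def cyclic_time_features(ts: datetime) -> dict:
    """Return HRS, HRC, DNS and DNC (Eqs. 1-4) for a timestamp."""
    hr = ts.hour                                    # 0..23
    doy = ts.timetuple().tm_yday                    # 1..365 or 366
    diy = 366 if calendar.isleap(ts.year) else 365  # days in this year
    return {
        "HRS": math.sin(2 * math.pi * hr / 24),
        "HRC": math.cos(2 * math.pi * hr / 24),
        "DNS": math.sin(2 * math.pi * doy / diy),
        "DNC": math.cos(2 * math.pi * doy / diy),
    }

# 23:00 and 00:00 differ by 23 in HR, but only by one hourly step in the
# (HRS, HRC) plane, which is the discontinuity the encoding removes.
late = cyclic_time_features(datetime(1970, 2, 3, 23))
early = cyclic_time_features(datetime(1970, 2, 4, 0))
gap = math.hypot(late["HRS"] - early["HRS"], late["HRC"] - early["HRC"])
```

The same reasoning applies at the year boundary: 31 December and 1 January are adjacent points on the (DNS, DNC) circle.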

2.2.3 Solar cycle and magnetic variation

We choose the 10.7 cm solar flux F107 as the input describing the solar cycle variation because it is correlated with foF2. In addition, the ap and Dst indices are used to represent magnetic variation. The ap index is usually used to represent global geomagnetic activity, while the Dst index is more sensitive to geomagnetic storms at low and middle latitudes.

2.2.4 Present value of foF2

Considering that the foF2 values themselves should be meaningful in forecasting foF2 one hour ahead, the present value of foF2 is fed to the model. A large number of statistical analyses were made to study the auto-correlation coefficients of foF2. Not surprisingly, the correlation coefficients between the present foF2 value and the value of foF2 one hour ahead are high, and we have embodied this in the prediction model.
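The kind of auto-correlation check described here can be sketched as follows. The helper `lag_autocorrelation` is a hypothetical name, and the series is a synthetic 24-hour sinusoid standing in for hourly foF2, not observed data; real ionosonde records would also need their gaps handled before the correlation is computed.

```python
import numpy as np

def lag_autocorrelation(series, lag=1):
    """Pearson correlation between a series and itself shifted by `lag` samples."""
    x = np.asarray(series, dtype=float)
    return float(np.corrcoef(x[:-lag], x[lag:])[0, 1])

# Synthetic stand-in for 30 days of hourly foF2 in MHz (hypothetical values).
hours = np.arange(24 * 30)
fof2 = 7.0 + 3.0 * np.sin(2 * np.pi * hours / 24)

r1 = lag_autocorrelation(fof2, lag=1)   # one hour ahead: strongly correlated
r6 = lag_autocorrelation(fof2, lag=6)   # quarter of a day ahead: nearly uncorrelated
```

For a purely diurnal signal the lag-1 coefficient is close to $\cos(2\pi/24)$, which illustrates why the present value is such an informative input for a one-hour-ahead forecast.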

2.3 foF2 short-term prediction model

Fig. 1 shows the flow chart of the foF2 short-term prediction model. The parameters of the training dataset consist of HRS, HRC, DNS, DNC, F107, ap, Dst, and foF2. An initial distribution is generated over the training dataset and a weak learner is obtained in each iteration. The final hypothesis is obtained by combining all the weak learners after the defined loop terminates.

3 Experiment and discussion

3.1 Data sets

In order to check the ability of our model for short-term prediction, we chose hourly foF2 values from nine ionosonde stations in this study. The Wuhan ionosonde data are from the Wuhan ionosonde station, which belongs to the Institute of Geology and Geophysics, Chinese Academy of Sciences (IGGCAS). The others are available in the American National Geophysical Data Center (NGDC-NOAA) database and from the WDC for Ionosphere, Tokyo, National Institute of Information and Communications Technology. The nine selected stations are listed in Table 1. We chose twenty-two years of data from each station. The first eleven years were used as a training dataset to train the model and the second eleven years were used as a testing dataset to verify the model. After the model was constructed, the present value at time t was used as an input for predicting foF2 at time t+1. The training and testing datasets cover different levels of solar activity. The testing dataset did not contain any data from the training dataset, so the testing results can be considered a fair evaluation. Fig. 2 shows the locations of the stations selected in this study.
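The pairing of inputs at time t with the target at time t+1, and the chronological split into training and testing years, can be sketched as below. The function names are hypothetical, and the sketch assumes the hourly series lies on a regular grid with missing hours stored as NaN; the text does not describe how data gaps were treated.

```python
import numpy as np

def make_one_hour_pairs(features, fof2):
    """Pair the inputs at hour t with the foF2 target at hour t+1.

    `features` holds HRS, HRC, DNS, DNC, F107, ap, Dst per hour (N x 7);
    the present foF2 is appended as the eighth input.
    """
    features = np.asarray(features, dtype=float)
    fof2 = np.asarray(fof2, dtype=float)
    X = np.column_stack([features[:-1], fof2[:-1]])
    y = fof2[1:]
    # Assumption: drop any pair in which an input or the target is missing.
    keep = np.isfinite(X).all(axis=1) & np.isfinite(y)
    return X[keep], y[keep], keep

def split_by_year(X, y, years, first_test_year):
    """Chronological split: earlier years for training, later years for testing."""
    test = np.asarray(years) >= first_test_year
    return (X[~test], y[~test]), (X[test], y[test])

# Tiny illustration with placeholder features and one missing foF2 value.
feats = np.zeros((6, 7))
fof2 = np.array([5.0, 5.5, np.nan, 6.0, 6.5, 7.0])
years = np.array([1968, 1968, 1968, 1969, 1969, 1969])
X, y, keep = make_one_hour_pairs(feats, fof2)
train, test = split_by_year(X, y, years[:-1][keep], first_test_year=1969)
```

Splitting by year rather than at random keeps every testing hour strictly later than the training period, which matches the Train/Test year ranges listed in Table 1 (for example, 1969 is the first testing year at Wakkanai).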

3.2 Error analysis method

The performance of the prediction model is evaluated by the absolute error $\Delta foF2$, the relative error RE and the root mean square error RMSE (Habarulema et al., 2007), which are defined, respectively, as

$$\Delta foF2 = \frac{1}{N} \sum_{i=1}^{N} \left| foF2_o(i) - foF2_p(i) \right| \qquad (5)$$

$$RE = \frac{1}{N} \sum_{i=1}^{N} \frac{\left| foF2_o(i) - foF2_p(i) \right|}{foF2_o(i)} \times 100\% \qquad (6)$$

$$RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( foF2_o(i) - foF2_p(i) \right)^2} \qquad (7)$$

where $foF2_o$ and $foF2_p$ represent the observed values and the predicted values, respectively, and $N$ is the prediction data size.
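The three error measures can be computed together, as in the sketch below. The function name `prediction_errors` and the dictionary keys are illustrative, and the absolute value is assumed in both the absolute and the relative error.

```python
import numpy as np

def prediction_errors(observed, predicted):
    """Mean absolute error (MHz), relative error (%) and RMSE (MHz), Eqs. (5)-(7)."""
    o = np.asarray(observed, dtype=float)
    p = np.asarray(predicted, dtype=float)
    diff = o - p
    return {
        "abs_error_MHz": float(np.mean(np.abs(diff))),
        "RE_percent": float(np.mean(np.abs(diff) / o) * 100.0),
        "RMSE_MHz": float(np.sqrt(np.mean(diff ** 2))),
    }

# Hypothetical values: observations of 5 and 10 MHz, predictions of 4 and 12 MHz.
errors = prediction_errors([5.0, 10.0], [4.0, 12.0])
```

In this example both predictions are off by 20% of the observed value, yet the larger observation contributes twice the absolute error, so RE and RMSE can rank the same predictions differently.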

3.3 Discussion

To show how each input parameter contributes to the accuracy of predicting foF2, diurnal variation (HRS, HRC), seasonal variation (DNS, DNC) and solar flux (F107) were chosen as the initial modeling parameters for the foF2 prediction. Keeping the initial modeling parameters, ap, Dst and foF2 were then added to the model in turn to determine the influence of each input parameter on the foF2 prediction. Fig. 3 shows the computed RMSE values between the modeled and actual foF2 values for the selected stations within our region of interest. Overall, the results show that the greatest contributor to foF2 prediction is the present foF2. For the magnetic variation component, Dst is a greater contributor than ap for most stations, the exceptions being Okinawa and Taipei. For the Yakutsk, Irkutsk, Wakkanai, Kokubunji and Yamagawa stations, using ap and Dst together performs better than using ap or Dst alone. For Wuhan and Manila, using ap and Dst together performs better than using ap alone but worse than using Dst alone. For Okinawa and Taipei, using ap and Dst together performs better than using Dst alone but worse than using ap alone. However, for all nine stations, using ap and Dst together achieves better performance than the initial modeling parameters, which contain neither ap nor Dst. Therefore, in the rest of this article, the parameters of the dataset consist of HRS, HRC, DNS, DNC, F107, ap, Dst, and foF2.

Taking the Wakkanai station as an example, Fig. 4 compares the observed foF2 on the 34th and 35th days of 1970 with the foF2 predicted by AdaBoost-BP. The x-axis represents UT time and the y-axis indicates foF2 values. The dark solid curve represents the observed data and the red dashed curve represents the predicted data. The figure shows that the AdaBoost-BP model provides accurate predictions one hour ahead.
The predictions present comparatively smooth diurnal variations, which are close to the observations during both the ascending and descending phases and reproduce the general behavior of foF2.

To validate our short-term prediction model based on the AdaBoost-BP method, we compared it with the IRI model and with the BPNN and SVR prediction methods. For a fair comparison, the AdaBoost-BP and BPNN methods used the same BP architecture, with one input layer (8 input nodes), one hidden layer and one output layer. Hecht-Nielsen (1987) suggested that an optimum number of hidden nodes for a single-hidden-layer network can be 2n+1, where n is the number of input nodes. Here, we chose 17 hidden nodes for 8 inputs. The initial parameters and weights of the BPNN were optimized by a genetic algorithm. After repeated experiments, the threshold Δ in AdaBoost-BP was set to 0.2 and the coefficient c was set to 1.1. The SVR penalty factor c (distinct from the AdaBoost-BP coefficient c) and kernel parameter g were also optimized by a genetic algorithm.

The mean ΔfoF2, RE and RMSE between the predicted and observed values were calculated for each method and each station; Table 2 lists these mean errors on the testing dataset. From Table 2, the proposed AdaBoost-BP method achieves the lowest error at every station, apart from a tie with SVR in ΔfoF2 at Yamagawa (0.56 MHz). For the Irkutsk station in a mid-latitude area, the absolute error of AdaBoost-BP is 0.32 MHz, compared with 0.34 MHz for the BP network, 0.35 MHz for SVR and 0.64 MHz for the IRI model. For the Taipei station in a low-latitude area, the absolute error of AdaBoost-BP is 0.78 MHz, compared with 0.81 MHz for the BP network, 0.81 MHz for SVR and 1.37 MHz for the IRI model.

For further comparison of the different methods, we plotted the RMSE and RE curves of the Wakkanai station by month in Fig. 5 and by year in Fig. 6. The other stations show the same trend. In Fig. 5, the x-axis represents the month; the y-axis represents the root mean square error in Fig. 5 (a) and the relative error in Fig. 5 (b). Each point in (a) and (b) is the RMSE or RE, respectively, for a given calendar month, averaged over the eleven years of the testing dataset. From Fig. 5, we can see that the AdaBoost-BP method achieves the lowest RMSE and RE in every month. Meanwhile, the RMSE and RE in summer are lower than those in winter, which means that foF2 is more difficult to predict in winter than in summer. We attribute this to the larger variability of foF2 in winter. For example, the prediction RMSE of the AdaBoost-BP method is 0.45 MHz in June but 0.64 MHz in December, and the prediction RE is 5.00% in June but 8.20% in December.

In Fig. 6, the x-axis represents the year; the y-axis represents the solar flux F107 in Fig. 6 (a), the root mean square error in Fig. 6 (b) and the relative error in Fig. 6 (c). Each point in Fig. 6 (b) and (c) indicates the mean RMSE or RE for each year. From Fig. 6, we can see that the AdaBoost-BP method achieves the lowest RMSE and RE in every year. Comparing (a) and (b), the RMSE during a period of high solar activity is higher than that during low solar activity.
For example, the prediction RMSE of the AdaBoost-BP method is 0.57 MHz in 1979 (high solar activity) but 0.50 MHz in 1976 (low solar activity). However, from Fig. 6 (c), the prediction RE of the AdaBoost-BP method is 4.54% in 1979, which is lower than the 7.45% in 1976.

To analyze how the prediction error changes with the station geomagnetic latitude, the AdaBoost-BP errors are plotted in Fig. 7, where the x-axis represents the station code, arranged by geomagnetic latitude, and the y-axis represents the root mean square error. From Fig. 7, the prediction errors of three stations (Wuhan, Okinawa, Taipei) in mid- and low-latitude areas (geomagnetic latitude from about 10° to 20°) are larger than those of the other stations. This is probably because these three stations are located near or in the northern hump of the equatorial anomaly, where the electron density of the ionosphere is high and changes remarkably. These effects reduce the accuracy of foF2 prediction for these stations.
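The monthly and yearly curves of Figs. 5 and 6 amount to computing the error measures within groups. The sketch below uses a hypothetical helper `grouped_errors` and made-up numbers (not the paper's data) chosen only to show how RMSE can be higher while RE is lower when foF2 itself is larger, as reported for 1979 versus 1976.

```python
import numpy as np

def grouped_errors(observed, predicted, groups):
    """RMSE (MHz) and RE (%) for each distinct label in `groups` (month or year)."""
    o = np.asarray(observed, dtype=float)
    p = np.asarray(predicted, dtype=float)
    g = np.asarray(groups)
    out = {}
    for label in np.unique(g):
        m = g == label
        diff = o[m] - p[m]
        out[int(label)] = {
            "RMSE": float(np.sqrt(np.mean(diff ** 2))),
            "RE": float(np.mean(np.abs(diff) / o[m]) * 100.0),
        }
    return out

# Hypothetical values: foF2 near 12 MHz in a high-activity year and near
# 6 MHz in a low-activity year, with slightly larger absolute misses in the former.
observed = [12.0, 12.0, 6.0, 6.0]
predicted = [12.6, 11.4, 6.5, 5.5]
by_year = grouped_errors(observed, predicted, [1979, 1979, 1976, 1976])
```

Here the high-activity year has the larger RMSE (0.6 MHz against 0.5 MHz) but the smaller RE (5% against about 8.3%), because the relative error divides by a larger observed foF2.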

4 Conclusion

We developed a new short-term prediction model of the ionospheric parameter foF2 based on AdaBoost. The effectiveness of the prediction model was verified against other prediction methods using data from nine ionosonde stations in the East-Asian sector.

(1) The proposed prediction model achieves better performance than the BP Neural Network, Support Vector Regression and the IRI model. Thus, the AdaBoost method can improve the foF2 prediction.

(2) For different seasons, the prediction error of all methods in summer is lower than that in winter. Meanwhile, the AdaBoost-BP method performs better than the other frequently used methods in all twelve months.

(3) The prediction RMSE during a period of high solar activity is higher than that during low solar activity. However, the prediction RE during a period of high solar activity is lower than that during low solar activity. This is because the amplitude of foF2 during high solar activity is larger than that during low solar activity, whereas its range of variation is smaller. Meanwhile, the AdaBoost-BP method outperforms the other three methods during both high and low solar activity.

(4) Regarding the change of prediction error with station geomagnetic latitude, the prediction errors of stations in mid- and low-latitude areas (geomagnetic latitude from about 10° to 20°) are larger than those of the other stations.

Acknowledgments

The research was supported by the National Natural Science Foundation of China (No. 41104106, 41074113, 41204113), National Key Basic Research Program of China (2012CB825604), and the Science Fund for Creative Research Groups from the National Science Foundation of China under Grant No. 51121005. The F107, ap and Dst indices are downloaded from the SPIDR Web site http://spidr.ngdc.noaa.gov/. The ionosonde data used are available in the American National Geophysical Data Center (NGDC-NOAA) database and WDC for Ionosphere, Tokyo, National Institute of Information and Communications Technology.

References

Araujo-Pradere, E.A., Fuller-Rowell, T.J. STORM: An empirical storm-time ionospheric correction model 2. Validation. Radio Science 37, 1071, 2002.
Araujo-Pradere, E.A., Fuller-Rowell, T.J., Codrescu, M.V. STORM: An empirical storm-time ionospheric correction model 1. Model description. Radio Science 37, 1070, 2002.
Ban, P.-P., Sun, S.-J., Chen, C., Zhao, Z.-W. Forecasting of low-latitude storm-time ionospheric foF2 using support vector machine. Radio Science 46, RS6008, 2011.
Belehaki, A., Stanislawska, I., Lilensten, J. An Overview of Ionosphere-Thermosphere Models Available for Space Weather Purposes. Space Science Reviews 147, 271-313, 2009.
Bergstra, J., Casagrande, N., Erhan, D., Eck, D., Kégl, B. Aggregate features and ADABOOST for music classification. Mach Learn 65, 473-484, 2006.
Bilitza, D. International Reference Ionosphere 2000. Radio Science 36, 261-275, 2001.
Bilitza, D. Ionospheric models for radio propagation studies. In: Stone, W.R., (Ed.). Review of Radio Science: 1999-2002. 1 ed. Wiley-IEEE Press, 625-679, 2002.
Bilitza, D., Reinisch, B.W. International Reference Ionosphere 2007: Improvements and new parameters. Advances in Space Research 42, 599-609, 2008.
Bishop, C.M. Neural Networks for Pattern Recognition. Oxford University Press, New York, 1995.
Cander, L.R., Milosavljevic, M.M., Stankovic, S.S., Tomasevic, S. Ionospheric forecasting technique by artificial neural network. Electronics Letters 34, 1573-1574, 1998.
Chen, C., Wu, Z., Sun, S., Ban, P., Ding, Z., Xu, Z. Forecasting the ionospheric foF2 parameter one hour ahead using a support vector machine technique. Journal of Atmospheric and Solar-Terrestrial Physics 72, 1341-1347, 2010a.
Chen, C., Wu, Z.S., Xu, Z.W., Sun, S.J., Ding, Z.H., Ban, P.P. Forecasting the local ionospheric foF2 parameter 1 hour ahead during disturbed geomagnetic conditions. Journal of Geophysical Research 115, A11315, 2010b.
Davies, K. Ionospheric radio. Peter Peregrinus Ltd., London, United Kingdom, 1990.
Francis, N.M., Cannon, P.S., Brown, A.G., Broomhead, D.S. Nonlinear prediction of the ionospheric parameter foF2 on hourly, daily, and monthly timescales. Journal of Geophysical Research 105, 12839-12849, 2000.
Freund, Y. Boosting a weak learning algorithm by majority. Information and Computation 121, 256-285, 1995.
Freund, Y., Schapire, R. Experiments with a new boosting algorithm. Machine Learning: Proceedings of the Thirteenth International Conference. Morgan Kaufmann, San Francisco, 1996.
Habarulema, J.B., McKinnell, L.-A., Cilliers, P.J. Prediction of global positioning system total electron content using Neural Networks over South Africa. Journal of Atmospheric and Solar-Terrestrial Physics 69, 1842-1850, 2007.
Habarulema, J.B., McKinnell, L.-A., Opperman, B.D.L. TEC measurements and modelling over Southern Africa during magnetic storms; a comparative analysis. Journal of Atmospheric and Solar-Terrestrial Physics 72, 509-520, 2010.
Habarulema, J.B., McKinnell, L.A. Investigating the performance of neural network backpropagation algorithms for TEC estimations using South African GPS data. Ann. Geophys. 30, 857-866, 2012.
Haykin, S. Neural Networks: A Comprehensive Foundation. Prentice Hall, 2008.
Hecht-Nielsen, R. Kolmogorov's mapping neural network existence theorem. IEEE First Annual International Conference on Neural Networks. pp. 11-14, 1987.
Kelley, M.C. The Earth's Ionosphere: Plasma Physics & Electrodynamics. Access Online via Elsevier, 2009.
Kutiev, I., Muhtarov, P. Modeling of midlatitude F region response to geomagnetic activity. Journal of Geophysical Research: Space Physics 106, 15501-15509, 2001.
Liu, J., Liu, L., Zhao, B., Wan, W., Chen, Y. Empirical modeling of ionospheric F2 layer critical frequency over Wakkanai under geomagnetic quiet and disturbed conditions. SCIENCE CHINA Technological Sciences 55, 1169-1177, 2012.
Liu, L., Wan, W., Ning, B. Statistical modeling of ionospheric foF2 over Wuhan. Radio Science 39, RS2013, 2004.
Liu, L., Wan, W., Ning, B., Pirog, O.M., Kurkin, V.I. Solar activity variations of the ionospheric peak electron density. Journal of Geophysical Research: Space Physics 111, A08304, 2006.
Liu, L., Zhao, B., Wan, W., Ning, B., Zhang, M.-L., He, M. Seasonal variations of the ionospheric electron densities retrieved from Constellation Observing System for Meteorology, Ionosphere, and Climate mission radio occultation measurements. Journal of Geophysical Research: Space Physics 114, A02302, 2009.
Liu, M. Fingerprint classification based on Adaboost learning from singularity features. Pattern Recognition 43, 1062-1070, 2010.
McKinnell, L.-A., Poole, A.W.V. Predicting the ionospheric F layer using neural networks. Journal of Geophysical Research: Space Physics 109, A08308, 2004.
Mikhailov, A.V., Depuev, V.H., Depueva, A.H. Short-Term foF2 Forecast: Present Day State of Art. In: Lilensten, J., (Ed.). Space Weather. Springer Netherlands, 169-184, 2007.
Morra, J.H., Zhuowen, T., Apostolova, L.G., Green, A.E., Toga, A.W., Thompson, P.M. Comparison of AdaBoost and Support Vector Machines for Detecting Alzheimer's Disease Through Automated Hippocampal Segmentation. IEEE Transactions on Medical Imaging 29, 30-43, 2010.
Muhtarov, P., Kutiev, I. Autocorrelation method for temporal interpolation and short-term prediction of ionospheric data. Radio Science 34, 459-464, 1999.
Mukhtarov, P., Pancheva, D., Andonov, B., Pashova, L. Global TEC maps based on GNSS data: 1. Empirical background TEC model. Journal of Geophysical Research: Space Physics 118, 4594-4608, 2013.
Niu, B., Cai, Y.-D., Lu, W.-C., Li, G.-Z., Chou, K.-C. Predicting Protein Structural Class with AdaBoost Learner. Protein and Peptide Letters 13, 489-492, 2006.
Oyeyemi, E.O., McKinnell, L.A., Poole, A.W.V. Near-real time foF2 predictions using neural networks. Journal of Atmospheric and Solar-Terrestrial Physics 68, 1807-1818, 2006.
Pietrella, M., Perrone, L. A local ionospheric model for forecasting the critical frequency of the F2 layer during disturbed geomagnetic and ionospheric conditions. Annales Geophysicae 26, 323-334, 2008.
Poole, A.W.V., McKinnell, L.-A. On the predictability of foF2 using neural networks. Radio Science 35, 225-234, 2000.
Schapire, R.E., Freund, Y., Bartlett, P., Lee, W.S. Boosting the margin: A new explanation for the effectiveness of voting methods. Proceedings of the Fourteenth International Conference on Machine Learning. Morgan Kaufmann Publishers Inc., 1997.
Schunk, R.W., Scherliess, L., Sojka, J.J. Recent approaches to modeling ionospheric weather. Advances in Space Research 31, 819-828, 2003.
Schwenk, H., Bengio, Y. Boosting Neural Networks. Neural Computation 12, 1869-1887, 2000.
Sojka, J.J., Thompson, D.C., Schunk, R.W., Bullett, T.W., Makela, J.J. Assimilation Ionosphere Model: Development and testing with Combined Ionospheric Campaign Caribbean measurements. Radio Science 36, 247-259, 2001.
Tsagouri, I., Belehaki, A. An upgrade of the solar-wind-driven empirical model for the middle latitude ionospheric storm-time response. Journal of Atmospheric and Solar-Terrestrial Physics 70, 2061-2076, 2008.
Wang, R., Zhou, C., Deng, Z., Ni, B., Zhao, Z. Predicting foF2 in the China region using the neural networks improved by the genetic algorithm. Journal of Atmospheric and Solar-Terrestrial Physics 92, 7-17, 2013.
Wintoft, P., Cander, L.R. Short-term prediction of foF2 using time-delay neural network. Physics and Chemistry of the Earth 24, 343-347, 1999.
Yue, X., Wan, W., Liu, L., et al. Data assimilation of incoherent scatter radar observation into a one-dimensional midlatitude ionospheric model by applying ensemble Kalman filter. Radio Science 42, RS6006, 2007.

Fig. 1. foF2 short-term prediction model based on AdaBoost.
Fig. 2. Locations of the stations selected in this study.
Fig. 3. The contribution of input parameters that influence foF2 prediction (the initial parameters are HRS, HRC, DNS, DNC and F107).
Fig. 4. Comparison of the observed foF2 and AdaBoost-BP forecasted foF2 on the 34th and 35th days of 1970 at Wakkanai.
Fig. 5. Comparison of monthly RMSE and RE between AdaBoost-BP and other methods for Wakkanai data.
Fig. 6. Comparison of yearly RMSE and RE between AdaBoost-BP and other methods for Wakkanai data versus F107 values.
Fig. 7. Prediction errors of AdaBoost-BP for different stations.

Table 1. List of the nine selected stations where the foF2 are modeled in the paper.

| Station | Station code | Geomagnetic latitude (°N) | Geographic latitude (°N) | Geographic longitude (°E) | Cover Year (Train) | Cover Year (Test) |
|---|---|---|---|---|---|---|
| Yakutsk | YA462 | 51.2 | 62.0 | 129.6 | 1959-1969 | 1970-1980 |
| Irkutsk | IR352 | 41.2 | 52.5 | 104.0 | 1959-1969 | 1970-1980 |
| Wakkanai | WK545 | 35.5 | 45.4 | 141.7 | 1958-1968 | 1969-1979 |
| Kokubunji | TO535 | 25.7 | 35.7 | 139.5 | 1958-1968 | 1969-1979 |
| Yamagawa | YG431 | 20.6 | 31.2 | 130.6 | 1958-1968 | 1969-1979 |
| Wuhan | WU430 | 19.3 | 30.6 | 114.4 | 1969-1979 | 1980-1990 |
| Okinawa | OK426 | 15.5 | 26.3 | 127.8 | 1958-1968 | 1969-1979 |
| Taipei | TP424 | 13.8 | 25.0 | 121.5 | 1959-1969 | 1970-1980 |
| Manila | MN414 | 3.6 | 14.6 | 121.1 | 1964-1974 | 1975-1985 |

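Table 2. List of the prediction errors at the nine stations with different methods.

| Station | Prediction method | ΔfoF2 (MHz) | RE (%) | RMSE (MHz) |
|---|---|---|---|---|
| Yakutsk | IRI | 0.76 | 12.97 | 0.99 |
| Yakutsk | BPNN | 0.32 | 5.73 | 0.49 |
| Yakutsk | SVR | 0.33 | 6.08 | 0.53 |
| Yakutsk | AdaBoost-BP | 0.31 | 5.49 | 0.47 |
| Irkutsk | IRI | 0.64 | 10.30 | 0.85 |
| Irkutsk | BPNN | 0.34 | 5.34 | 0.49 |
| Irkutsk | SVR | 0.35 | 5.79 | 0.54 |
| Irkutsk | AdaBoost-BP | 0.32 | 5.07 | 0.47 |
| Wakkanai | IRI | 0.71 | 10.73 | 0.93 |
| Wakkanai | BPNN | 0.40 | 6.03 | 0.57 |
| Wakkanai | SVR | 0.40 | 6.11 | 0.58 |
| Wakkanai | AdaBoost-BP | 0.39 | 5.87 | 0.55 |
| Kokubunji | IRI | 0.88 | 11.44 | 1.14 |
| Kokubunji | BPNN | 0.49 | 6.74 | 0.68 |
| Kokubunji | SVR | 0.49 | 6.71 | 0.68 |
| Kokubunji | AdaBoost-BP | 0.48 | 6.56 | 0.66 |
| Yamagawa | IRI | 1.02 | 12.32 | 1.32 |
| Yamagawa | BPNN | 0.58 | 7.30 | 0.80 |
| Yamagawa | SVR | 0.56 | 7.08 | 0.78 |
| Yamagawa | AdaBoost-BP | 0.56 | 7.00 | 0.77 |
| Wuhan | IRI | 1.02 | 11.13 | 1.34 |
| Wuhan | BPNN | 0.78 | 8.71 | 1.12 |
| Wuhan | SVR | 0.77 | 8.93 | 1.15 |
| Wuhan | AdaBoost-BP | 0.73 | 8.18 | 1.06 |
| Okinawa | IRI | 1.34 | 15.43 | 1.79 |
| Okinawa | BPNN | 0.76 | 8.19 | 1.04 |
| Okinawa | SVR | 0.74 | 8.09 | 1.04 |
| Okinawa | AdaBoost-BP | 0.73 | 7.90 | 1.01 |
| Taipei | IRI | 1.37 | 13.96 | 1.79 |
| Taipei | BPNN | 0.81 | 7.74 | 1.11 |
| Taipei | SVR | 0.81 | 7.72 | 1.12 |
| Taipei | AdaBoost-BP | 0.78 | 7.44 | 1.08 |
| Manila | IRI | 1.04 | 10.94 | 1.35 |
| Manila | BPNN | 0.65 | 6.79 | 0.92 |
| Manila | SVR | 0.66 | 7.01 | 0.95 |
| Manila | AdaBoost-BP | 0.61 | 6.29 | 0.85 |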
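To put the differences in Table 2 on a common scale, the reduction in mean absolute error relative to each baseline can be computed as a percentage. The sketch below copies the ΔfoF2 values for Irkutsk and Taipei from Table 2; the helper `relative_reduction` is a hypothetical name, and the percentages it produces are simple arithmetic on the tabulated values, not additional results from the paper.

```python
# Mean absolute errors (MHz) copied from Table 2.
abs_error = {
    "Irkutsk": {"IRI": 0.64, "BPNN": 0.34, "SVR": 0.35, "AdaBoost-BP": 0.32},
    "Taipei": {"IRI": 1.37, "BPNN": 0.81, "SVR": 0.81, "AdaBoost-BP": 0.78},
}

def relative_reduction(station, baseline, method="AdaBoost-BP"):
    """Percentage by which `method` lowers the absolute error of `baseline`."""
    e = abs_error[station]
    return 100.0 * (e[baseline] - e[method]) / e[baseline]

for station in abs_error:
    for baseline in ("IRI", "BPNN", "SVR"):
        reduction = relative_reduction(station, baseline)
        print(f"{station:8s} vs {baseline:4s}: {reduction:5.1f}%")
```

On these numbers the gain over the IRI model is large (about half of the error at Irkutsk), whereas the gain over BPNN and SVR is a few percent.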