Chin.Astron.Astrophys.11 (1987) 186-190 Act.Astrophys.Sin.L (ITT) 55-61
FUZZY SETS. CLASSIFICATION OF SOLAR ACTIVE REGIONS AND FLARE PREDICTION
LIU Xu-zhao and LI Wei
Beijing Observatory, Academia Sinica
Pergamon Journals. Printed in Great Britain 0275-1062/87$10.00+.00
Keywords: Sun - Solar Active Regions and Flares - Fuzzy Set Analysis
ABSTRACT
We selected 400 sample solar active regions of 1980, each described by the following data: sunspot type, total spot area, magnetic field type, maximum field intensity, peculiar structures, radio flux at 3 cm, and radio flux ratio between 3 cm and 10 cm. These quantities define a generalized vector for each sample point. We applied a fuzzy set analysis, which resulted in 14 types of active regions. On the basis of the fuzzy classification, we make a fuzzy prediction (whether there will be a flare of Class 1 or over on the day following each of the 12 mid-month days of 1981). The result proved to be encouraging.
1. INTRODUCTION
Zadeh [1] in 1965 first introduced the concept of the fuzzy set, generalising ordinary sets into fuzzy sets. Mathematics thereby began to penetrate some hitherto forbidden regions in applied mathematics, natural sciences, humanities and management [2]; even in pure mathematics, it found its peculiar use. Since then, Chinese scholars have also applied the methods of fuzzy sets in meteorology [3], medicine [4], athletics, image recognition and control, agriculture [5], biology, building materials, geology, mining industry, economics and psychology [6], linguistics and law.
Solar active regions are the seats of complicated physical processes. Although great strides have been made in the study of solar physics, our ideas on the physical processes of flares are far from clear. The active regions can be classified according to different physical parameters, e.g., morphology, magnetic type, etc. The complexity of the physical process involved is also reflected in several physical parameters being operative in the prediction of solar activity. Predictions based on the different parameters are distinctive and cannot be substituted one for the other. The various quantities characterising the active region each have their own significance and are mutually irreplaceable, and there exist no clear mathematical relations among them. Researchers the world over have, in the course of time, used empirical formulae, probability calculations and other mathematical methods for solar prediction work (cf. Refs. [7,8,9]).
In this paper, we attempt to make a classification of solar active regions by means of fuzzy set analysis, using the relations between certain physical quantities and flare activity found by different authors. We shall then attempt a short-term (next day) prediction of solar activity on the basis of the fuzzy classification. Lastly, we shall give a brief discussion.
2. FUZZY SETS
In our case, we use 7 physical quantities of active regions. The active region type expressed by the 7 quantities is a fuzzy vector formed of 7 fuzzy components. We call

$$\mu_{A_i}(x^0) = \min\{\mu_{i1}(x_1^0), \ldots, \mu_{i7}(x_7^0)\} \qquad (1)$$

the degree of membership of the sample $x^0$ with respect to the fuzzy vector $A_i$, where $\mu_{ij}(x_j^0)$ is the degree of membership of each component. In order to obtain the fuzzy equivalent relationship of the sample set, we need first the fuzzy consistent relationship over the set. We use the cosine of the generalized angle to fix the fuzzy consistent relationship, i.e.,

$$r_{ij} = \frac{\sum_{k=1}^{7} x_{ik}\, x_{jk}}{\sqrt{\sum_{k=1}^{7} x_{ik}^2}\;\sqrt{\sum_{k=1}^{7} x_{jk}^2}} \qquad (2)$$
where i and j label the samples and k the vector components. In this paper, we classify the active regions on the basis of the fuzzy equivalent relationship of the sample set. For an introduction to fuzzy set theory, Refs. [10,11] may be consulted.

We use the following quantities as the components of the active region vector: sunspot type, total area, magnetic type, field intensity, anomalous features, radio flux at 3 cm and ratio of radio fluxes at 3 and 10 cm. The last two parameters were taken from "S.G.D." and the others from "Taiyang Diqiu Wuli Ziliao", published by Beijing Observatory. The flare data were taken from "S.G.D.".
The active region data on one day were taken as one sample point. In order to reflect the effect of [illegible] analysis, we confined ourselves to the active regions within ±60° of the central meridian. Further, in order to cut down the amount of reduction, we discarded all sunspot groups with sunspot type A or magnetic type α and some groups with types B or J. For the same reason, we sampled only every other day. In this way, we obtained nearly 500 sample points from the entire data of the year 1980. After discarding randomly some data in the middle part (April) and the end part (December), we were left with 400 sample points for use.
When we made predictions using the results of cluster analysis, we took the data on each mid-month day of the subsequent year, 1981 (if data was not complete on the 15th, then we took the next day's), and predicted the flare activity on the next day. Again, only regions within ±60° of the central meridian were used. Therefore, our prediction was for flare activity within ±60° of the central meridian.

4. THE SEVEN COMPONENTS OF THE ACTIVE REGION VECTOR
We present in turn 7 physical quantities that characterize an active region. Their relationships with flare activity are given in numerous papers (cf. Refs. [9,12,13,14]). Here, we put together these research results to construct a sunspot group vector. Each factor is normalized to between 0 and 1.
Component 1 is identified with the sunspot group type. The different types are assigned the following values according to their degree of association with flare activity (i = 1-400 represents the serial number of the sample point):

Type      F     E     G      H      D      C      B     J
$x_{i1}$  1     0.5   0.179  0.393  0.286  0.179  0.06  0.06

Component 2 is the sunspot group area. When the area is 1000 solar surface units or greater, $x_{i2} = 1$; as the area decreases from 1000 to 0, $x_{i2}$ decreases linearly from 1 to 0.
Component 3 is the magnetic field intensity. As in the case of sunspot area, we use a linear relation: when the field is 3000 G or over, $x_{i3} = 1$; between 0 and 3000 G, $x_{i3}$ is between 0 and 1.
Component 4 is the magnetic type of the sunspot group. The numerical assignment is as follows:

Type      γ     βγ     α     β
$x_{i4}$  1     0.931  0     0.345

Component 5 is the morphology of the active region. We mean the peculiar morphologies [14] connected with the occurrence of flares, also called peculiar structures. There are 11 types:
a. The A structure (two rows of spots close together with opposite polarities).
b. The δ-type magnetic structure (umbral cores of opposite polarities and close together appearing in the same penumbra).
c. Composite group (complex polarities yet easily separable into leading and following groups).
d. Magnetic inclination [value illegible].
e. Bipolar group with anomalous polarities.
f. The leader spot at a higher latitude than the follower spot.
g. Globular structure (only one very large main spot, its penumbra containing several penumbral cores, surrounded by mainly small spots).
h. Large follower spot (the leader spot being relatively much smaller).
i. The line joining the follower and leader spots is rotated through 60° or more with respect to the direction of solar rotation.
j. The total sunspot area having varied by 50% or more.
k. Positive or negative rotations.
We assign the numerical values of $x_{i5}$ as follows. The legible entries of the assignment table are 1, 0.67, 0.5 (type b) and 0.233 (all other types); type f is also listed, but the pairing of the two highest values with their type labels is not legible in the source.
Component 6 is the radio flux at 3 cm. All radio fluxes are taken from the one-dimensional images in "S.G.D.". What we give is the relative flux; when there is confusion in the image, we apportion the flux. Our measured relative flux varied from 0 to 13. These values were linearly normed to between 0 and 1.
Component 7 is the ratio of radio fluxes at 3 and 10 cm. It was normed to between 0 and 1 as in the last entry.
We shall not list here the 400 sample vectors.
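As an illustration of the prescriptions of this section, the construction of one sample vector can be sketched in Python. This is only a sketch under stated assumptions: the function and parameter names are hypothetical, the two lookup tables are as read from the partly garbled assignment tables, the value of component 5 is passed in directly, and the upper bound `ratio_max` of the flux ratio is not given in the text and must be supplied by the user.

```python
# Lookup tables as read from the (partly garbled) assignment tables.
# ASSUMPTION: the type-to-value pairing follows the printed column order.
SPOT_TYPE = {"F": 1.0, "E": 0.5, "G": 0.179, "H": 0.393,
             "D": 0.286, "C": 0.179, "B": 0.06, "J": 0.06}
MAG_TYPE = {"gamma": 1.0, "beta-gamma": 0.931, "beta": 0.345, "alpha": 0.0}


def linear_norm(value, upper):
    """Map value linearly onto [0, 1], saturating at 1 when value >= upper."""
    return max(0.0, min(value / upper, 1.0))


def region_vector(spot_type, area, field_gauss, mag_type,
                  structure_value, flux_3cm, flux_ratio, ratio_max):
    """Build the 7-component active region vector (hypothetical helper).

    structure_value: the x_i5 value already looked up for the peculiar structure.
    ratio_max: ASSUMED upper bound of the 3 cm / 10 cm flux ratio (not in the text).
    """
    return [
        SPOT_TYPE[spot_type],               # component 1: sunspot group type
        linear_norm(area, 1000.0),          # component 2: area, saturates at 1000 units
        linear_norm(field_gauss, 3000.0),   # component 3: field, saturates at 3000 G
        MAG_TYPE[mag_type],                 # component 4: magnetic type
        structure_value,                    # component 5: peculiar structure
        linear_norm(flux_3cm, 13.0),        # component 6: relative flux ranged 0 to 13
        linear_norm(flux_ratio, ratio_max), # component 7: flux ratio
    ]
```

For example, `region_vector("D", 500, 1500, "beta", 0.233, 6.5, 1.0, 2.0)` describes a hypothetical type D group of 500 units with a 1500 G field.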
TABLE 1 Active Region Fuzzy Classes and Next-Day Flare Prediction
[Table body not recoverable from the source. For each active region category it lists the components of the fuzzy vector and, in the last three columns, the sample size, the number of next-day flares and the flare probability. Category 14 comprises all sunspot groups of Type A or of magnetic Type α.]
5. FUZZY CLASSIFICATION OF ACTIVE REGIONS
To derive the equivalent partition of the 400 sample points, we must carry out the following sequence of calculations:
1) We determine the fuzzy consistent relationship. We calculate the coefficients $r_{ij}$ defined at (2), which constitute the matrix $R$ of the fuzzy consistent relationship.
2) We keep repeating the operation of self-composition of $R$ until $R^n = R^{2n}$, that is, until further composition leaves the matrix unchanged.
3) We cut the resulting matrix at a suitable level λ (taken to be 0.985 here) and obtain the level set $A(0.985)$.
In $A(0.985)$, we have 14 major categories, each containing 4 or more sample points (see TABLE 1). Of the 63 sample points not included in these categories, nearly all were singletons, with no group having more than 3 members. For convenience in use, the number of categories must be kept small, so these maverick points were assigned to suitable categories according to the expression (1). Category 13 in the TABLE was the result of combining two categories which had similar properties, being both of low activity level, with no flares (of importance class 1 or greater: this proviso will be understood throughout this paper) occurring on the next day. Category 14 consists of all active regions with sunspot group type A or magnetic type α, which were discarded during the initial data selection. TABLE 1 gives the fuzzy vectors of the 13 active region categories and their probabilities of having flares appearing on the next day. The form of membership function of vector component j of active region category i was taken to be
The mean values were calculated from all the sample points belonging to the given category, while the variances were taken to be greater than the calculated values, and were determined, somewhat subjectively, with due regard to the calculated values and the actual distributions. It should be said that the determination of the membership function is as yet an unsolved problem, and the present state of the art depends very much on the experience of individual researchers.
The last 3 columns of TABLE 1 give, for each active region category, the number of sample points included in that class, the number of instances with a flare occurring on the next day, and the probability of such an occurrence. For some categories, the sample size is small, and the statistical result may be suspect. For example, Category 7 has only 4 sample points. Even for Category 1 with 14 sample points, the result on flare probability does not match the facts. Nevertheless, this is all we can do at present pending future improvement.
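The three clustering steps of this section (cosine matrix, repeated self-composition, cut at level λ) can be sketched as follows. This is a minimal pure-Python illustration and not the original program: the function names are hypothetical, the composition is assumed to be the usual max-min composition of fuzzy relations, and all sample vectors are assumed to be non-zero.

```python
import math


def cosine_matrix(samples):
    """Fuzzy consistent relationship: r_ij = cosine of the angle between samples i and j."""
    norms = [math.sqrt(sum(x * x for x in s)) for s in samples]
    n = len(samples)
    r = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dot = sum(a * b for a, b in zip(samples[i], samples[j]))
            # Clamp to 1 to guard against rounding slightly above 1.
            r[i][j] = min(1.0, dot / (norms[i] * norms[j]))
        r[i][i] = 1.0  # every sample is fully similar to itself
    return r


def max_min_compose(r, s):
    """Max-min composition of two square fuzzy relations (ASSUMED composition rule)."""
    n = len(r)
    return [[max(min(r[i][k], s[k][j]) for k in range(n)) for j in range(n)]
            for i in range(n)]


def transitive_closure(r):
    """Square R repeatedly until the matrix stops changing (R^n = R^2n)."""
    while True:
        squared = max_min_compose(r, r)
        if squared == r:
            return r
        r = squared


def lambda_cut_classes(equiv, lam):
    """Partition sample indices: i and j share a class when equiv[i][j] >= lam."""
    classes, seen = [], set()
    for i in range(len(equiv)):
        if i in seen:
            continue
        members = [j for j in range(len(equiv)) if equiv[i][j] >= lam]
        seen.update(members)
        classes.append(members)
    return classes
```

With three toy two-component vectors, `lambda_cut_classes(transitive_closure(cosine_matrix([[1, 0], [1, 0.1], [0, 1]])), 0.985)` groups the first two samples together and leaves the third as a singleton. The naive composition costs on the order of n³ operations per squaring, which is acceptable for a few hundred sample points.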
6. SOLAR FLARE PREDICTION
We shall use the above results of fuzzy classification of active regions in 1980 for predicting solar activity on some selected days in 1981. We selected the 12 mid-month days and predict whether or not a flare will occur on the next day, within 60° of the central meridian on the solar disk. The prediction can be extended to the whole disk. Where data on a mid-month day is incomplete, we move forward by one day. There were 55 active regions on the 12 days selected.
The prediction can be made in different ways. One way is through a fuzzy clustering analysis, but it would be very time-consuming. So we opted for the following method, where we calculated the degree of membership of each active region with respect to the various region categories. Specifically, the following steps were taken:
1. After gathering the data on a given active region, we construct its 7-dimensional characteristic vector according to the numerical prescriptions of Section 4.
2. In accordance with the expressions (3) and (1), we calculate the degree of membership of the active region vector with respect to each of the 13 fuzzy vectors representing the region categories.
3. The active region is then taken to belong to that category with respect to which it has the highest degree of membership.
4. From TABLE 1, we look up the corresponding probability of a flare on the next day.
5. We regard the active regions to be statistically independent. The sum rule of probabilities of independent events $B_i$ is

$$P(B_1 \cup B_2 \cup \cdots \cup B_n) = \sum_{i} P(B_i) - \sum_{i<j} P(B_i B_j) + \cdots + (-1)^{n-1} P(B_1 B_2 \cdots B_n), \qquad (4)$$

and we easily calculate the probability of a flare on the next day when we know the active region(s) on the given day.
6. If the flare probability is greater than 0.5, we predict "Danger"; if less than 0.5, then "Safe". We can, of course, choose other levels, depending on how we want to balance the two kinds of false predictions.
Let us take 1981 Jan 15 as a worked example. Apart from type A and magnetic type α spots, there were 4 other spot groups within 60° of the central meridian on that day. These groups were of spot types D, C, B, E, respectively. From the above procedure, we derived that the D, C, E spots belonged to Category 10, while the B spot belonged to Category 13. The probability of a flare on the next day is 0.077 for a Category 10 region, and is zero for a Category 13 region. Hence, according to (4), which for independent events equals one minus the product of the no-flare probabilities, we derived the probability of a flare on Jan 16 1981 to be

$$P = 1 - (1 - 0.077)^3 (1 - 0) \approx 0.21.$$

According to our rule, we therefore predicted "Safe". As a matter of fact, there were indeed no flares on the next day. TABLE 2 gives our results.

TABLE 2 Fuzzy Prediction Results ("1" for flares; "0" for no flares)
[Table body not recoverable from the source; the surviving column labels include the dates 9.16, 10.16, 11.16 and 12.16.]

7. DISCUSSION
This paper is a first attempt at applying the method of fuzzy sets to the problem of classification of solar active regions and the prediction of solar flares. In several aspects, it is incomplete and arguable.
First, as in many problems treated by the method of fuzzy mathematics, the degree-of-membership functions are determined by experience, often subjectively. We may use different numerical assignments for the vector components characterizing the active region, and the membership function (3) need not have the form stated. The values of $\sigma_{ij}^2$ in (3) were determined with a high degree of subjectivity. Three of the seven components (the spot type, magnetic type and morphological group) were discrete. Actually they should be fuzzified quantities, and their fuzzification awaits further work. The procedure presented here applies equally well when these three quantities are suitably fuzzified.
Although we have used the main parameters that characterize active regions, there could be some other more suitable quantities, e.g. from X-ray and UV observations. Also, certain dynamic parameters have become increasingly important in solar physics and in the prediction work. And among the chosen quantities, some may be discarded on further examination. In summary, to further improve the result of prediction, we need to search for more effective parameters.
The use of fuzzy classification given here may lead to quite a large number of region categories. In the above, small categories were removed. The main reason was that these could not be effectively established when the sample size was so small, but it also shows that the form of the membership function may be improved.
It is regrettable that only statistical results can be obtained by this method. Of our 12 predictions, 83.3% are hits, 22.2% are misses, and there were no false alarms. This result seems to be even better than expected. We must await the result of much larger analyses before we venture an over-optimistic conclusion.
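The prediction steps 2 to 6 of Section 6 can be sketched in Python as below. This is a hedged illustration only. The form of the component membership function is an assumption (a Gaussian-type function of the category mean and spread), because expression (3) is not legible in this copy; all function names and the structure of `categories` are hypothetical. The combination of probabilities uses the complement form, which for independent events is equivalent to expression (4).

```python
import math


def component_membership(x, mean, sigma):
    # ASSUMPTION: Gaussian-type form; the paper's expression (3) is not legible here.
    return math.exp(-((x - mean) / sigma) ** 2)


def membership(vector, means, sigmas):
    """Expression (1): the smallest of the component memberships."""
    return min(component_membership(x, m, s)
               for x, m, s in zip(vector, means, sigmas))


def classify(vector, categories):
    """Return the category name with the highest degree of membership.

    categories: dict mapping a name to a (means, sigmas) pair (hypothetical layout).
    """
    return max(categories, key=lambda name: membership(vector, *categories[name]))


def flare_probability(region_probs):
    """P(at least one flare) for independent regions: 1 - prod(1 - p_i)."""
    no_flare = 1.0
    for p in region_probs:
        no_flare *= 1.0 - p
    return 1.0 - no_flare


def forecast(region_probs, threshold=0.5):
    """'Danger' if the combined probability exceeds the threshold, else 'Safe'."""
    return "Danger" if flare_probability(region_probs) > threshold else "Safe"
```

For the worked example of 1981 Jan 15, `flare_probability([0.077, 0.077, 0.077, 0.0])` gives about 0.21, so `forecast` returns "Safe". The `threshold` argument corresponds to the choice of level mentioned in step 6.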
REFERENCES
[1] Zadeh, L.A., Information & Control 8 (1965) 338.
[2] Dubois, D., Prade, H., Fuzzy Sets and Systems: Theory and Application (1980).
[3] CHEN Guo-fan and SHENG Jia-rong, Ziran Zazhi 7 (1984) 181.
[4] WANG Huai-qing and XU Zai-fu, Mohu Shuxue ("Fuzzy Mathematics") 2 (1982) 91.
[5] FU Ning and HE Zhong-xiong, Mohu Shuxue 2 (1982) 79.
[6] FENG Xiao-yu and XU Yong-chun, Mohu Shuxue 2 (1982) 73.
[7] Smith, J.B. Jr., Solar Activity Observations and Predictions (1972).
[8] Solar Activity Prediction Group of Beijing Observatory (1979), in "Solar Terrestrial Predictions, Vol 1 Prediction Group Report".
[9] HU Wen-rui et al., "Taiyang Yaoban" (Solar Flares), Kexue Chubanshe (1983).
[10] WANG Pei-zhuang, "Mohu Jiehe Lun Ji Qi Yingyong" ("Theory of Fuzzy Sets and its Application") (1983).
[11] Asai, K., "Introduction to the Theory of Fuzzy Sets" (1983).
[12] CHEN Xie-zhen and ZHAO Ai-di, "Selected Papers in the Symposium on Solar Physics and Radio Astronomy" (in Chinese) (1973).
[13] WNG Shi-lun and WANG Jia-long, ibid.
[14] SHI Zhong-xian, LIN Yuan-zhang et al., Act.Astron.Sin. 16 (1975) 12.