Vistas in Astronomy, Vol.38, pp. 299-307, 1994
Pergamon
Copyright © 1995 Elsevier Science Ltd. Printed in Great Britain. All rights reserved. 0083-6656/94 $26.00 0083-6656(94)00016-6
THE USE OF A HYBRID NEURAL SYSTEM FOR THE CLASSIFICATION OF STARS
M. Klusch
Institut für Informatik und Praktische Mathematik, Christian-Albrechts-Universität Kiel, 24118 Kiel, Germany
Abstract: A hybrid neural approach for a fully automated spectral and luminosity classification of stars is presented. The hybrid neural system (HNS) integrates a neural classifier and a semantic network, used for similarity-based reasoning and conceptual knowledge representation, respectively. The semantic net is designed in accordance with the two-dimensional spectral and luminosity MK classification of stars. In this paper, the architecture, functionality and application of the HNS in astronomy are presented. The functional capabilities and results of stellar classification of the HNS show significant improvements compared to conventional astronomical techniques. Once knowledge acquisition is completed, the system classifies stellar objects very fast, reliably and without any need for further pre-classification. The pure neural classification is complemented by information retrieval using the attached semantic net. Moreover, the HNS is also able to compare classes of stars without requiring the user to supply any raw input data or special knowledge about relations between these classes. This function relies on the integration of both networks using spreading activation. The HNS provides the first application of a hybrid neural approach in the area of astronomical classification of stellar objects. Aspects for further work on the HNS are briefly proposed.
Keywords: star classification, neural net, semantic net, integration
1. INTRODUCTION
Most research efforts in astronomy on techniques for the classification of stars aim to develop automated procedures as the only practical way of handling the enormous amount of digitized data produced by machine scans of objective-prism spectra. Ongoing projects and basic attempts to develop procedures for an automated, fast and reliable spectral and luminosity classification (the Morgan-Keenan (MK) classification) are reported e.g. in [14], [15], [1], [21] and [24]. The major drawback of these approaches is that they provide only semi-automated procedures, or procedures of very restricted applicability. All these classification tools used in astronomy differ fundamentally from the hybrid neural approach presented in this work. Moreover, in most cases moderately better results were obtained by the latter. In particular, the HNS is the first system which uses a neural net
for the MK classification of stars [11]. Until now there has been no other fully automated, very fast and reliable method applicable to this two-dimensional classification of stars without any restriction or use of special prior knowledge. All hybrid neural systems are instances of a hybrid synthesis of neural classification and symbolic knowledge representation [20]. Recent works on hybrid neural systems, such as those found in [9] and [25], confirm that this approach is very promising for numerous applications. The architecture of the presented hybrid neural system HNS integrates a backpropagation neural network and a semantic net. For the purpose of spectral and luminosity (MK) classification of stars, the integrated neural net of the HNS currently uses photometric data. In addition, the semantic net provides an accessible symbolic representation of knowledge about the properties of stars according to their spectral and luminosity type. It is also partially responsible for executing the comparison of named classes of stars with respect to some similarity learned by the neural net. The direct comparison of classes without any given numerical input offers new facilities, and not only for astronomical considerations. While working with the HNS there is no need for any pre-classification of stars or prior knowledge about relations between classes of stars. It will be shown that the presented hybrid system improves and extends the functionality of the existing systems for an MK classification of stellar objects within astronomy.
The remainder of this paper is organized as follows. Sections 2.1 to 2.3 briefly present the hybrid architecture of the HNS and its functionality. The main results of using the HNS for the MK classification of stars and for comparing named classes of stars are summarized in Section 2.4. Section 3 presents a short discussion of the hybrid system with respect to existing astronomical classification methods and an outlook on future work.
2. THE HYBRID NEURAL SYSTEM HNS
2.1. Functionality
The structure and functionality of the HNS are given by the hybrid synthesis of a neural classifier and a semantic network. Conceptual knowledge about the Morgan-Keenan (MK) classification of stars is symbolically represented in an appropriate class hierarchy of the semantic net. Knowledge about the similarity of stellar objects with respect to their photometric indices is distributed over the weighted links of the neural net, which are learned by supervised training on samples. Independently of the desired initialization of the semantic net, which leads to the actual state of the knowledge base of stars (KBS), the user can construct the neural net and train it over a set of sample patterns. A hidden compatibility check for both components is part of the internal integration of the HNS. Before working with both parts of the system, the created neural net can be tested extensively using a test tool which is part of the implemented HNS.
There exist two classification methods: bottom up (BU) and top return (TR). In the BU mode, neural classification determines the class most similar to the given numerical input. An immediately following identification finds the respective class in the currently overlaid semantic net. The evaluation then accesses the symbolically coded information in the KBS which belongs to the classified class, such as property values. In the TR mode, spreading activation (cf. Section 2.3) [7] finds the class most similar to a named one using all weights of the last neural layer. Comprehensive examples for both classification methods of the HNS are given in Tables 1 and 2.
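The three BU steps described above (neural classification, identification, evaluation) can be illustrated by a minimal Python sketch. It is not the HNS implementation: the layer sizes, the weights in TOY_LAYERS, the sigmoid activation and the two-class knowledge base are assumptions chosen only so that the example runs, and the mass values merely follow the reading of Table 1.

```python
import math

# Hypothetical mapping of output units to MK classes and a toy knowledge
# base (KBS). All values here are illustrative assumptions.
MK_CLASSES = ["B0I", "B5I"]
KBS = {"B0I": {"mass": 50}, "B5I": {"mass": 25}}

# Hypothetical weights: 3 input units -> 2 hidden units -> 2 output units.
TOY_LAYERS = [
    [[1.0, 1.0, 1.0], [-1.0, -1.0, -1.0]],   # hidden layer, one row per unit
    [[2.0, -2.0], [-2.0, 2.0]],              # output layer, one row per unit
]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(layers, inputs):
    """Forward pass through fully connected layers (no bias terms)."""
    activation = list(inputs)
    for weights in layers:
        activation = [sigmoid(sum(w * a for w, a in zip(row, activation)))
                      for row in weights]
    return activation


def classify_bottom_up(layers, inputs, requested):
    """BU mode: classification, identification, evaluation."""
    outputs = forward(layers, inputs)
    # Classification: the output unit with the highest activation wins.
    best = max(range(len(outputs)), key=outputs.__getitem__)
    # Identification: map the winning unit to its class in the semantic net.
    mk_class = MK_CLASSES[best]
    # Evaluation: look up the requested properties, "unknown" if not stored.
    properties = KBS[mk_class]
    return mk_class, {name: properties.get(name, "unknown") for name in requested}


result = classify_bottom_up(TOY_LAYERS, (0.102, 0.001, 2.557), ["temperature", "mass"])
```

With these toy weights the first output unit wins, so `result` names class B0I, reports the stored mass and returns "unknown" for the temperature, which mirrors the shape of the dialogue in Table 1.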
2.2. Architecture: Neural and Semantic Net
The neural net of the HNS uses an improved version of the well-known supervised backpropagation learning procedure with momentum term for multi-layered neural networks [2]. The purpose of
the algorithm is to train the net in a supervised manner such that it approximates the desired classification function to a given precision ε. This is done by minimizing the net error, defined over all weights and the given training set of sample patterns, by computing its conjugated gradient locally at each unit. For a general survey and a more detailed discussion of neural nets used for classification tasks see [16], [27], [10] and [2]. In contrast to the HNS, most implemented improvements of the original backpropagation ignore, or do not conform to, the required strictly local neural computation stated particularly in [6] and [12]. The need for exclusively local computation in all neural units motivates the special Hecht-Nielsen backpropagation architecture based on neurophysiological investigations [6], which is used by the HNS. For reasons of space the details are omitted here; a formal description of this neural architecture and the complexity of its computation can be found in [10].

[Fig. 1. Global structure of the HNS. Semantic network (KBS): STARS; B-STARS, A-STARS, F-STARS; B-V-STARS ... B-I-STARS, A-V-STARS ... A-I-STARS, F-V-STARS ... F-I-STARS. The output units of the neural network are identified with instances. Neural network (backpropagation BP-net): input layer, hidden layers and output layer, with weighted links between all units of successive layers.]

Table 1. Bottom up (BU) classification.
  user input:  m1 = 0.102, c1 = 0.001, β = 2.557
  HNS output:  classified MK class is B0I
  user input:  temperature :a  mass :p
  HNS output:  Evaluate: temperature = unknown, mass = 50, 25 (B0I, B5I)
  :a  actual property value of the identified object
  :p  permitted property values of all respective MK classes

Table 2. Top return (TR) classification.
  user input:  B0-I  radius :a  Mvis :a
  HNS output:  Similarity with conceptual class B0III detected. Evaluate: radius = 16, Mvis = -5
  user input:  B0-I  A0-III
  HNS output:  No similarity with MK class A0III detected.
  user input:  BI-STARS  BIII-STARS
  HNS output:  Similarity with conceptual class BIII-STARS.
  user input:  B0-I  A-STARS
  HNS output:  <rank list of most similar MK classes> No similarity with conceptual class A-STARS.

For representing conceptual knowledge about stars, a semantic net is designed in accordance with the MK classification. A semantic network is a graphical knowledge representation formalism. It represents attributed concepts as nodes and their semantic relations as directed edges between them. The structure of a particular semantic net depends on the kind of domain knowledge the designer intends to represent. In the semantic net used by the HNS for its current application, all considered MK classes are organized hierarchically in a simple is-a hierarchy, following the fact that all stars which belong to a certain spectral and luminosity type (MK) class are necessarily members of the respective spectral type class. For example, each star seen as an instance of the semantic net concept 'B-V stars' is also an instance of the concept 'B-stars'. Thus, each concept in this semantic net represents one certain set of stars as its set of instances. The attributes of a concept are all properties known to be valid for describing this concept. All instances of one concept have the same properties but can possess different attribute values. A derived, i.e. more specific, concept in the hierarchy implicitly inherits all attributes of the more general concept in addition to the new attributes which belong exclusively to the derived concept. This constitutes an attribute inheritance hierarchy in the reverse direction of the conceptual is-a relation. For a more detailed introductory survey of semantic nets we refer the interested reader to [3]. A formal description of the semantic net used in the HNS by a frame model FRS is given in [10].
In the current version of the HNS, stars with spectrum B, A, F (in subtypes 0 and 5) and luminosity type V, III, I are stored. In general, the semantic net serves as the basic structure for information retrieval with respect to the MK class classified by the neural net, and moreover provides the starting point for the TR classification mode of the HNS. The benefits of the hybrid synthesis of the semantic net with the neural net are, on the one hand, fast and comfortable information retrieval with respect to the MK class actually classified by the neural net and, on the other hand, a facility for comparing named MK classes without forcing the user to give any specific numerical input.
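The is-a hierarchy with implicit attribute inheritance described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the class name Concept, its methods, and the assignment of attributes to particular concepts are hypothetical and are not taken from the frame model FRS of [10]; only the concept names follow Fig. 1.

```python
class Concept:
    """A semantic-net node with at most one more general (is-a) parent."""

    def __init__(self, name, parent=None, attributes=()):
        self.name = name
        self.parent = parent
        self.own_attributes = tuple(attributes)
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def is_a(self, other):
        """True if `other` is this concept or one of its ancestors."""
        node = self
        while node is not None:
            if node is other:
                return True
            node = node.parent
        return False

    def attributes(self):
        """Inherited attributes first, then those owned by this concept."""
        inherited = self.parent.attributes() if self.parent else ()
        return inherited + self.own_attributes

    def most_specific(self):
        """Follow the is-a relation in reverse down to the leaf concepts."""
        if not self.children:
            return [self]
        leaves = []
        for child in self.children:
            leaves.extend(child.most_specific())
        return leaves


# Hypothetical fragment of the hierarchy; attribute placement is assumed.
stars = Concept("STARS", attributes=("temperature", "mass"))
b_stars = Concept("B-STARS", parent=stars)
b_v = Concept("B-V-STARS", parent=b_stars, attributes=("radius",))
b_i = Concept("B-I-STARS", parent=b_stars, attributes=("radius",))
```

Here `b_v.is_a(b_stars)` holds while `b_stars.is_a(b_v)` does not, and `b_v.attributes()` contains the attributes of STARS followed by its own, so inheritance runs opposite to the direction of the is-a edges. The `most_specific` method corresponds to the descent to the most specific concept level that the TR mode starts from.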
2.3. Functional Integration
The hybrid character of the HNS is based mainly on the integration of the neural and the semantic net, which makes the functionality of both components available. Since we use a backpropagation neural net (cf. Section 2.2), the functionality and performance of the BU classification method rest on the forward pass through the neural net, the identification of the respective class in the semantic net and the final evaluation of properties. The more interesting TR classification method is intrinsically based on a spreading activation algorithm which determines similarities between two different MK classes of the semantic net using the strengths of neural connections. The use of this technique is motivated in particular by [7]. Each symbolic input (see e.g. Table 2) contains in first position the name of the concept for which some other similar concept has to be detected, where no actual is-a relation path between the two exists in the semantic net. In particular, this means finding a connection between some of the respective concept
instances based on the similarity relation learned by the neural net from the numerical photometric input values. Spreading activation starts within the semantic net exactly at the given concept node. In order to find some connection it follows the is-a hierarchy in the reverse direction down to the most specific concept level. For each instance of the given concept, the uniquely associated neural output unit is activated with a normalized numerical activation value (1.0). This unit $u_{k,id}$ then spreads its activation through its backward connections to the units of the previous layer. From these units, in turn, activation passes back to all units of the last layer except the initial unit. The maximum of the resulting Top-Return (TR) activities of the units $u_{k,i}$ leads to the class of stars most similar to the given one by identifying the instance to which that unit is attached. Thus the TR-activity of each output unit $u_{k,i}$ is computed by

\[ \mathrm{TR}(u_{k,i}) = \sum_{j=1}^{M_{k-1}} \frac{w_{k,id,j}}{M_{k-1}} \cdot \frac{w_{k,i,j}}{M_k}, \qquad i \neq id, \]

where $M_k$ is the number of units at layer $k$ and $w_{l,i,j}$ is the weight of the connection between unit $u_{l,i}$ at layer $l$ and unit $u_{l-1,j}$ at layer $l-1$. The immediate identification of the respective instance of some semantic net concept and the use of the is-a hierarchy lead to an answer for the required comparison of named classes of stars with respect to some similarity detected by the neural net. A formal notation of the identification, evaluation and communication algorithms is given in [10][11]. Future work on the HNS will deal with the analysis and applicability of several different types of such TR-activity functions.
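As a worked illustration of the TR-activity, the following Python sketch assumes the formula reads as the sum over all units j of the previous layer of (w_{k,id,j} / M_{k-1}) * (w_{k,i,j} / M_k) for every output unit i other than the start unit id. The function names and the toy weight matrix are hypothetical; only the last weight layer is needed, as stated in Section 2.1.

```python
def tr_activities(w_out, source):
    """TR-activity of every output unit except the start unit `source`.

    w_out[i][j] is the weight between output unit i (layer k) and
    unit j of the previous layer (layer k-1).
    """
    m_k = len(w_out)                # number of units at the output layer
    m_prev = len(w_out[source])     # number of units at the previous layer
    return {
        i: sum((w_out[source][j] / m_prev) * (w_out[i][j] / m_k)
               for j in range(m_prev))
        for i in range(m_k)
        if i != source
    }


def most_similar(w_out, source):
    """Index of the output unit with the maximum TR-activity."""
    activities = tr_activities(w_out, source)
    return max(activities, key=activities.get)


# Hypothetical last-layer weights: 3 output units, 2 units in the layer below.
W_OUT = [
    [1.0, -2.0],    # unit 0, the start unit in this example
    [0.5, -1.0],    # unit 1: weights point the same way as unit 0
    [-1.0, 2.0],    # unit 2: weights point the opposite way
]
activities = tr_activities(W_OUT, source=0)
winner = most_similar(W_OUT, source=0)
```

For these weights unit 1 receives 5/12 and unit 2 receives -5/6, so unit 1 is returned as the most similar. The sum is a scaled dot product of two weight vectors, which is why units whose last-layer weights resemble those of the start unit score highest.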
2.4. Empirical Investigations
The success of the HNS depends greatly on that of the neural classification. Since there is no general solution to the problem of how to design an optimal neural network for a given application, the only way of testing the efficiency is an empirical investigation. In order to adapt the integrated neural net of the HNS to the MK classification of stars, combinations of the Strömgren uvbyβ photometric indices taken from the catalogue of Hauck & Mermilliod [5] are used. These values compensate for the effect of interstellar reddening [3] and fulfill the required condition: they are significant and standardized by the use of standard stars [10]. The steps necessary for calculating quasi-photometric indices from objective-prism plates are described in [1]. Once this work is done, the HNS is applicable to the survey data in its present form. Alternatively, one can derive photometric colours from exposures of photographic plates with different filters, as was done in [21] for spectral classifications. The data reduction for the ongoing project in [24] is essentially the conversion of objective-prism plates and catalogue into one-dimensional digital spectra.
In this paper I present the classification results for backpropagation networks with three layers and fixed values for the learning parameters: α = 0.5, η = 0.5. Initial values for the weights were chosen randomly from the interval [-0.5, 0.5]. A comprehensive examination of the influence of the learning parameters on the learning process for a given net configuration, observing the direction and velocity of convergence for different combinations of values for α and η, is presented in [10].
We have to distinguish between the BU (bottom up) mode and the TR (top return) mode. The classification method of the TR mode is not comparable with that of the BU mode.
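As an aside on the training configuration stated earlier in this section, the following Python sketch shows weight initialization in [-0.5, 0.5] and one textbook backpropagation update with momentum. It is an assumption-laden illustration, not the improved Hecht-Nielsen variant used by the HNS: the reading of η as the learning rate and α as the momentum coefficient follows common convention rather than the text (both are 0.5 here, so the choice does not affect the numbers), and the gradients are supplied by the caller.

```python
import random

ETA = 0.5     # assumed to be the learning rate
ALPHA = 0.5   # assumed to be the momentum coefficient


def init_weights(n_units, n_inputs, rng):
    """Random initial weights drawn uniformly from [-0.5, 0.5]."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
            for _ in range(n_units)]


def momentum_step(weights, grads, prev_delta, eta=ETA, alpha=ALPHA):
    """One update: delta(t) = -eta * dE/dw + alpha * delta(t-1).

    Returns the updated weights and the delta to reuse in the next step.
    """
    delta = [[-eta * g + alpha * d for g, d in zip(g_row, d_row)]
             for g_row, d_row in zip(grads, prev_delta)]
    new_weights = [[w + d for w, d in zip(w_row, d_row)]
                   for w_row, d_row in zip(weights, delta)]
    return new_weights, delta


# Two steps on a single weight with a constant gradient of 1.0.
w1, d1 = momentum_step([[0.0]], [[1.0]], [[0.0]])
w2, d2 = momentum_step(w1, [[1.0]], d1)
initial = init_weights(2, 3, random.Random(0))
```

The second step is larger than the first (-0.75 against -0.5) because the momentum term adds half of the previous change, which is how momentum speeds up movement along a consistent gradient direction.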
A star is considered successfully classified by the BU classification if the class identified as most likely for a given numerical input agrees with the given MK class. In particular, this means that both the spectral and the luminosity class must be correctly recognized by the neural net. Moreover, a star is considered successfully classified by the TR classification if the identified class is identical with the second candidate of the rank list obtained by the BU classification. Therefore the TR classification rate indicates in particular the relation between the classification of the BU method and spreading activation. The classification rate is the simple percentage ratio of the total number of correctly classified test stars, summed over all classes, to the cardinality of the whole test set. Thus the term classification rate denotes the widely used percent-correct recognition rate.
The main results of testing the HNS for the MK classes B0V, B0III, B0I, B5V, B5III, B5I, A0V, A0III, A0I in test run RUN-1a are shown in Table 3 below. The test set contains a large number of stars which have an accurate spectral classification in the standard catalogue [18] and which are different from the standard stars used during the learning phase. Note that the neural nets have to learn just one corresponding input pattern for each class. The system will probably yield equally good or even better results without this restriction. For reasons of space we refer the interested reader to [10] for a more detailed presentation of the performance of the HNS. The optimum classification rates are reached for different precisions ε of approximation, i.e. net error ≤ ε, for different networks. It is demonstrated in [10][11] that the classification rate is not necessarily improved by a smaller ε. Fig. 2 gives an example of the dependence of the classification rate on ε in RUN-1g.

Table 3. Results: classification rate, corresponding number of backpropagation cycles N_BP and ε for the BU and the TR mode.

           Bottom up                   Top return
  Units    Rate (%)   N_BP    ε        Rate (%)   N_BP    ε
  4        83         5886    1.8      60         5886    1.8
  5        90         4492    1.7      66         4492    1.7
  6        76         2908    1.3      50         2212    1.5
  7        86         3133    1.3      63         1421    1.7
  8        93         3809    1.2      56         1753    1.5
  9        80         4045    1.0      56         807     2.3
  10       90         2032    1.4      53         6259    0.9
After investigating the behavior of neural nets with up to 16 hidden units, the networks with 8 and 5 hidden units were found to produce, for this test run, the highest classification rates for the BU (93%) and TR (66%) method, respectively. The network with 5 hidden units is a reasonable compromise, with a classification rate of 90%/66% for the BU/TR method. Although in nearly all test runs the classification rate of the TR method was below that of the BU method, in most cases the MK class obtained by the TR method as the most similar class to a given one makes sense from the astrophysical point of view, even if it differs from the class obtained by the BU method as the second candidate [10]. Furthermore, it is remarkable that for the BU classification no misclassifications concerning the temperature class were observed. Only in a few cases were the luminosity classes confused. This behavior is also reported for the photometric classification done by [21] and for classifications obtained in the conventional way (cf. Hoffleit 1982). The recently initiated project of [24] is restricted to neural net classification of stellar spectra. For reasons of space, the interested reader is referred to [10] for further classification results of the HNS with respect to an extended set of MK classes (including F spectra and luminosity types V, II and Iab), the topology of the neural net and the variation of training data.
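The percent-correct classification rate defined in this section, and the selection of the best networks from Table 3, can be reproduced with a short Python sketch. The rates are copied from Table 3; the function name and the four-star example are hypothetical and serve only to show the arithmetic.

```python
def classification_rate(predicted, expected):
    """Percent correct: correctly classified test stars / size of test set."""
    if len(predicted) != len(expected) or not expected:
        raise ValueError("need two non-empty label lists of equal length")
    correct = sum(p == e for p, e in zip(predicted, expected))
    return 100.0 * correct / len(expected)


# Classification rates (%) from Table 3, keyed by number of hidden units.
BU_RATE = {4: 83, 5: 90, 6: 76, 7: 86, 8: 93, 9: 80, 10: 90}
TR_RATE = {4: 60, 5: 66, 6: 50, 7: 63, 8: 56, 9: 56, 10: 53}

best_bu = max(BU_RATE, key=BU_RATE.get)   # hidden units with the best BU rate
best_tr = max(TR_RATE, key=TR_RATE.get)   # hidden units with the best TR rate

# Hypothetical mini test set: one luminosity class is confused (B0V vs B0III).
example_rate = classification_rate(
    ["B0I", "B0V", "A0I", "B5I"],
    ["B0I", "B0III", "A0I", "B5I"],
)
```

The lookup yields 8 hidden units for the BU mode and 5 for the TR mode, in agreement with the text, and the example set scores 75% because a star counts as correct only when both the spectral and the luminosity class match.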
3. DISCUSSION
We compare the HNS with astrophysical methods used for the MK classification or for the determination of equivalent basic stellar parameters such as effective temperatures and surface gravities [3]. The classical, visual MK classification of stars uses photographic plates with a number of objective-prism spectra taken in a single exposure. Classification is done by visually comparing the observed spectra with standard spectra. One basic catalogue of standard stars is [18]. Since every star has to be classified by visual comparison with the standards, this method is very inefficient and also a source of errors and inconsistencies between different authors. The "Bright Star Catalogue" of Hoffleit [8] reports many discrepant classifications of several stars.

[Fig. 2. Dependence of the classification rate on ε for a neural net with 8 hidden units. Curves: bottom up, top return.]

The other conventional classification method, Strömgren uvbyβ photometry, is used for the determination of basic stellar parameters. Moon published a FORTRAN program [17] for the analysis of photometric data which requires an approximate pre-classification, and this is difficult to obtain by means of photometric data alone. In contrast, the HNS is able to learn the decision regions without any prior knowledge about relations between colours and photometric indices. A statistical procedure for the MK classification of stars based on Vilnius multicolour photometry was developed by [21]. [22] presents classification rates of this procedure analogous to the BU classification rates of the HNS. Their results are nearly as reliable as those obtained by the HNS. Nevertheless, their most serious problem is the luminosity classification of stars, which is shown to be solved by the HNS. The method in [14] and [15] uses the objective-prism spectra of stars and is based on a weighted metric distance technique, known as a simple pattern recognition method. A major disadvantage is the need for a very time-consuming comparison of the spectrum, i.e. a vast number of spectral points, of every investigated star with that of many standard stars. The comparison with visual spectral classifications presented there showed results moderately poorer than those obtained by the HNS. Recently, [24] presented initial results of a project aimed at the automated determination of stellar spectral types for a large number of stars scanned from objective-prism plates.
Like [19] for a simple galaxy-star separation, their neural approach is based on a backpropagation net. Although they are able to determine highly reliable spectral classifications for spectral subtypes ranging from B3 to M4, they "do not yet have the ability to determine the luminosity classifications". In particular, they use about 380 input units, resulting from the number of resolution elements over the useful wavelength range of the measured spectra. The HNS currently uses at most 4 (cf. Section 2.4).
The HNS is the first hybrid neural system used for astronomical classification tasks. It offers significant improvements for the MK classification of stars compared to existing astronomical classification tools. One main advantage of the HNS is the fully automatic, fast and reliable spectral and luminosity classification of stars. Another advantage is the useful similarity comparison of two named classes of stars without giving any raw input data (Top-Return classification). For this the HNS only uses knowledge it acquires from the user during a learning phase. I believe that this is a new and promising approach to comparing classes of stellar objects. The HNS is implemented in an object-oriented language (C++) and is used successfully at the Institute of Astronomy and Observatory at CAU Kiel.
Future research efforts on the HNS will include the following topics:
• further investigation of the functional integration of both components - in particular the analysis of different types of Top-Return activity functions and their properties;
• building an HNS Development Environment HNS-DE: the use of the HNS architecture is not restricted to the MK classification, nor even to astronomical classification tasks - it should be possible to create independently, for each type of classification task within any application domain, the respective HNS component nets, possibly cascaded for several subtasks; improving net version management and extending the integrated test tool of the HNS;
• continuing work on neural classification of stellar objects: exploring the use of unsupervised neural nets (see e.g. Kohonen feature maps [13]) and their possible functional integration with semantic nets (cf. Section 2.3).
Acknowledgements
I would like to thank Prof. Dr. D. Klusch for reviewing an earlier draft of this paper and providing constructive comments on its presentation. Dr. R. Napiwotzki and Dr. S. Jordan have supported this work by giving much helpful advice with regard to astronomical considerations.
References
[1] Flynn, C., Morrison, H.L., 1990, Astronomy Journal 100, 1181
[2] Grauel, A., 1992, Neuronale Netze, BI-Wiss.Verl., Mannheim
[3] Griffith, R.L., 1982, "Three principles of representation for semantic networks", ACM TODS
[4] Golay, M., 1974, "Introduction to astronomical photometry", Reidel Publ. Comp.
[5] Hauck, B., Mermilliod, M., 1985, Astronomy & Astrophysics 60, 61
[6] Hecht-Nielsen, R., 1989, "Theory of the backpropagation neural network", Proc. IEEE IJCNN, Vol. 1
[7] Hendler, J.A., 1989, "Problem solving and reasoning: a connectionist perspective", in: Connectionism in Perspective, Elsevier Sci. Publ.
[8] Hoffleit, D., 1982, The Bright Star Catalogue, Yale University Observatory, New Haven, Connecticut, USA
[9] IEEE IJCNN, Proc. of International Joint Conference on Neural Networks, 1990-93
[10] Klusch, M., 1991, "Entwicklung und Implementierung eines hybrid-neuralen Systems HNS zur Klassifikation von Sternen", Master Thesis, Institut für Informatik, CAU Kiel; short version in: Tech. Rep. 9307, 1993
[11] Klusch, M., 1993, "HNS: a hybrid neural system and its use for the classification of stars", Proc. Intern. Joint Conference on Neural Networks IJCNN-93, Japan, pp. 687-692
[12] Klusch, M., 1993, European Journal on Astronomy and Astrophysics, 276, pp. 309-319
[13] Klusch, M., 1994, "Using a hybrid neural system for astronomical classification tasks", Proc. ME-94 "Mustererkennung", 21.9.-24.9.1994, Wien, TU Wien Informatik XPress 5, pp. 281-287
[14] Kohonen, T., 1988, "Introduction to Neural Computing", Neural Network, Vol. 1, Pergamon Journals
[15] Kohonen, T., 1988b, "Self-organizing and associative memory", Springer Berlin
[16] Kurtz, M.J., LaSala, J., 1991, "Toward an automated spectral classification", in: Objective-prism and other surveys, eds. A.G.D. Philip & A.R. Upgren, Van Vleck Obs. Contrib. No. 12, Davis Press, Schenectady, New York, p. 133
[17] LaSala, J., 1988, "A program for automated two-dimensional classification of objective-prism spectra", in: Astronomy from large databases - scientific objectives and methodological approaches, eds. F. Murtagh & A. Heck, ESO Conf. Workshop Proc. No. 28, p. 127
[18] Lippmann, R.P., 1987, "An introduction to computing with neural nets", IEEE ASSP Magazine, Vol. 4
[19] Moon, T.T., 1985, Comm. Univ. London Obs. 78
[20] Morgan, W.W., Keenan, P.C., Kellman, E., 1943, "An atlas of stellar spectra", University of Chicago Press, Chicago, Illinois, USA
[21] Odewahn, S.C., Stockwell, E.B., Pennington, R.L., Humphreys, R.M., Zumach, W.A., 1992, Astronomy Journal 103, 318
[22] Rich, E., 1990, "Expert systems and neural networks can work together", IEEE Expert, Vol. 10
[23] Smriglio, F., Boyle, R.P., Straizys, V., Janulis, R., Nandy, K., MacGillivray, H.T., McLachlan, A., Coluzzi, R., Segato, C., 1986, Astronomy & Astrophysics 66, 181
[24] Smriglio, F., Dasgupta, A.K., Nandy, K., Boyle, R.P., 1990, Astronomy & Astrophysics 228, 399
[25] Storrie-Lombardi, M.C., Lahav, O., Sodré, L. Jr., Storrie-Lombardi, L.J., 1992, "Morphological classification of galaxies by artificial neural networks", Mon. Not. R. Astron. Soc., 259, 8p
[26] Storrie-Lombardi, M.C. et al., 1993, "Automated Classification of Stellar Spectra. I: Initial Results with Neural Nets", Mon. Not. R. Astron. Soc., 269, 97
[27] Tarr, G.L., Rogers, S.K., Kabriesky, M., 1989, "Hybrid neural networks for tactical target recognition", Proc. IEEE Neural Networks, Vol. 1
[28] Ullman, J.R., 1973, "Pattern recognition techniques", Butterworths
[29] Widrow, B., Lehr, M.A., 1990, "30 years of adaptive neural networks: perceptron, madaline and backpropagation", Proc. IEEE, Vol. 78/9