Ast roparticle Physics ELSEVIER
Astroparticle
Physics
12 (1999)
l-17 www.elsevier.nl/locate/astropart
The cosmic ray composition between 1014 and 1016 eV M.A.K. Glasmacher a, M.A. Catanese a,1, M.C. Chantell b, C.E. Covault b, J.W. Cronin b, B.E. Fick b, L.F. Fortsonbv2, J.W. Fowlerb, K.D Green b,3 D.B. Kieda”, J. Matthewsas4, B.J. Newportbv5, D.F. Nitzay6, R.A. Ongb, S. Oserb, D. Sinclair”, J.C. van der Veldea a Dept. ofPhysics, The University of Michigan, Ann Arboc MI 48109, USA h Enrico Fermi Institute, The University of Chicago, Chicago, IL 60637, USA c Dept. ofPhysics. The University of Utah, Saft Lake City, UT 84112. USA Received 22 February
1999; revised 24 March 1999; accepted
7 April 1999
Abstract The mass composition of cosmic rays with primary energies between lOI eV and IO’” eV has been studied using the surface and buried scintillators of the CASA-MIA air shower array. Near 1OL4eV, the composition of cosmic rays is in agreement with direct measurements, roughly half light elements (protons and helium) and half heavier elements. The average mass increases with energy, becoming heavier above lOI eV. The mass changes coincide with the spectral steepening of the energy spectrum known as the knee. There is evidence for rigidity dependence in the spectral change. A method of calculating the primary cosmic ray energy which is insensitive to the composition is employed to achieve these results. @ 1999 Elsevier Science B.V. All rights reserved.
1. Introduction
Supernova shock wave acceleration may explain the origin of cosmic rays below about lOI eV. Normal supernovae, however, ,sre expected to have neither high enough magnetic fields nor long enough shock life-
’ Current address: Dept. of Physics and Astronomy, Iowa State University, Ames, IA 501311, USA. ‘Joint appointment with The Adler Planetarium and Astronomy Museum, Astronomy Department, Chicago, Illinois 60605, USA. 3 Current address: Koch Industries, Wichita, KS 67201, USA. 4 Current address: Dept. of Physics and Astronomy, Louisiana State University, Baton Rouge, LA 70803 and Dept. of Physics, Southern University, Baton Rouge, LA 70801, USA. ‘Current address: Micro Encoder, Inc., Kirkland, WA 98034, USA. ’ Current address: Dept. of Physics, Michigan Technological University, Houghton, MI 4993 1, USA. 0927-6505/99/$ PII SO927-6505
times to accelerate particles to higher energy [ I]. Nevertheless, cosmic rays have been observed with extreme energies, beyond lo*’ eV. The energy spectrum undergoes a significant change in slope, known as the “knee”, just beyond lOI eV. This suggests that the source or the acceleration mechanism of the majority of observed cosmic rays is changing in this region [ 21. One aspect of cosmic rays which may provide clues to their source is the mass composition. If the knee of the spectrum represents a “turning off” of a supernova source, for example, the average primary mass would be expected to increase with energy in this energy regime. Assuming supernovae use electromagnetic fields to accelerate particles, the maximum energy attainable is proportional to the charge 2 of the particle. Higher Z nuclei thus reach a larger maximum energy than lower mass particles. The average mass of
- see front matter @ 1999 Elsevier Science B.V. All rights reserved. (99) 00076-6
2
M.A.K. Glasmacher
et al./Astroparticle
observed cosmic rays also would increase if the particles begin to leak out of the galaxy in a charge dependent way. On the other hand, if some new source of cosmic rays came to dominance at the knee-energy, the composition could become lighter with increasing energy. An example could be an expanding shock (from a supernova or other cause) which accelerates particles in the normal interstellar medium, a population dominated by protons. Another case would be one where the source of cosmic rays is rapidly rotating compact objects, where the local high-radiation environment breaks up nuclei before they can emerge. Many experiments have attempted to determine the composition of cosmic rays with energies near the knee. Pioneering work was done by the Maryland group, who studied the arrival times of hadrons near the cores of air showers, and concluded that the composition was changing to heavier species in the knee region [ 3,4]. The most recent experimental work has been well summarized by Watson [ 51. These experiments used a variety of techniques, including ground studies of air showers, air-Cerenkov work, and underground (i.e., high-energy) muon measurements. Of these experiments, seven observe the composition becoming heavier through the knee, and three suggest it is becoming lighter. The EAS-TOP experiment employed several detector systems and analyses to infer that the composition is becoming heavier in the knee region [ 5,6]. The KASCADE experiment, using a high-statistics data set from surface counters, reports that the mass remains fairly light through the knee, but becomes heavier at somewhat higher energy [ 5,7]. The DICE experiment [ 81 has published results from air Cerenkov studies of the altitude of shower maximum which indicate a trend toward lighter mass primaries above the knee. Experiments whose detection thresholds are well above the knee, notably the Fly’s Eye, have reported a heavy, iron-like composition above 10 l7 eV, these results have been recently reviewed by Yoshida and Dai [ 91. This mixed set of results on composition through the knee has emerged from a group of experiments which were not all designed specifically to study this problem. In most cases there are insufficient statistics to obtain an unambiguous result. Differing air shower simulations complicate matters further, making comparisons between experiments difficult. Finally, an inability to calculate the energy of the primary cosmic
Physim
I2 (1999) 1-17
ray, in a composition-independentway and based upon quantities measured at the ground. can bias the results on composition as a function of energy. It is possible too that the differing results from different techniques (e.g., DICE [ 81) point to fundamental problems in our understanding of the physics at high energy. The CASA-MIA experiment measures the underground muon and surface electromagnetic components of air showers in the energy range of lOi eV to lOI eV. Combined with a simulation which represents the data well, the CASA-MIA data is used here to determine the probability, on an event by event basis, that a given data shower resulted from a “light” or a “heavy” primary. These distinctions are made by comparison to simulations of proton- or iron-induced air showers. A compositionally independent measure of the energy is used to search for trends in the composition as a function of energy. It will be shown that the CASA-MIA experiment indicates that the composition becomes heavier with increasing energy in the region of the knee.
2. The CASA-MIA
Detector
The CASA-MIA detector (more thoroughly described elsewhere [IO] ) is a ground based array of 1089 surface particle detectors (CASA) and 1024 underground muon detectors (MIA) located at the Dugway Proving Grounds southwest of Salt Lake City, Utah. The mean atmospheric overburden is 870 g cm-‘. The layout is shown in Fig. 1. The small squares in Fig. 1 represent the CASA detector stations. A station contains four counters, each of which consists of a two inch diameter photomultiplier tube glued to a square sheet of acrylic scintillator 61 cm on a side and 1.5 cm thick, wrapped in a black plastic tray. Each station has a thin (1 radiation length) lead sheet on its top, which converts some of the photons in the air shower, resulting in a net increase of the scintillator signal. The CASA stations are spaced 15 m apart on a square grid and connected to each of their four neighbors by timing cables. The array triggers when any three stations report at least three hit counters each. Stations with two or more counters reporting hits are “alerted” and digitize and store their data. The MIA counters are arranged in 16 patches, each
M.A. K. Glasmacher et al./Astroparticle Physics 12 (1999) I-I 7
Descriptions of the detector, its calibrations, and the standard fits have been previously published [ lo].
CASA-MIA -15
-10
0
-5
3
+5
110
+15
.................................
.................................... ................................. ................................. .............. .... I ............................ ....................................... ................................. ................................. ................................. ..........
................................. ................................. ................................ ............. .............
..)..........I
3. Data
....
n ...................
.::?:::I:::::::::
................................... ................
..V
..............
.............
................. ?
................................ ...... ..I -5 ..................... ............................... ................................. ................................. ................................. -lo.................................-
.......................
n
......... .-5
.Q
.................................
..m
..........................
....
................................. ................................. -15.................................................................. -15
l
-10
Several strict selection cuts are employed in the present analysis to ensure the integrity of the data. Firstly, only events reconstructing from the vertical (cost9 > 0.970) are used. This angular range corresponds to a systematic deviation of shower size due to attenuation of about 3% between showers of a given energy arriving exactly from the vertical (19 = 0” ) and those at the maximum zenith angle allowed (0 = 14”). Secondly, to avoid possible triggering or reconstruction effects of data near the edge of the array, only events whose reconstructed core positions fall within a square 300 m on a side, centered in the array, are used. Finally, events for which any one of the fitting routines (core location, direction, size) failed are rejected. After these cuts are performed, 54 million events remain for use in studying the vertical spectrum. These represent a live-time equivalent to 342 days of continuous running.
1% -5
0
15
CASA statlon
Fig. I. The layout of the CASA-MIA surface, and MIA is buried 3 meters
110
I
+15
MIA patch
array.
CASA
is on the
underground. The CASA
stations are I.5 meters apart. of which contains 64 individual counters. Each counter is a 1.6 m by 1.9 m sheet of acrylic scintillator viewed by one 5 inch diameter photomultiplier tube. The MIA patches are buried 3 meters underground and so register muons with energies exceeding about 750 MeV at the surface. Electromagnetic punch-through to the muon counters is negligible in comparison to the muon signal. The data are processed through a series of standard fits to determine the direction, core location, “electron size” (N,,) and muon size (N,), among other quantities, of the shower. The subscript “e*” is used here to emphasize that, strictly speaking, the quantity N,, does not simply represent the number of electrons above some threshold energy, but includes a fraction of shower photons and positrons as well. The sizes are obtained by fitting either a Nishimura-KamataGreisen (NKG) [ 1 l] function for surface counter data, or a Greisen function [ 121 for underground data, to the observed particle densities as they diminish with distance from the shower core. In the present analysis, a more detailed lateral distribution fit of the NKG function is also performed (described below in Section 6).
4. Simulation An air shower simulation is necessary to connect the measurements of air shower properties at the ground with the properties of the initiating cosmic ray. The MOCCA [ 131 shower simulation program, using the SIBYLL [ 141 hadron interaction codes, was chosen here for this purpose. In the present work, modifications of the results of this code have been made so that its predictions match those of a different hadron interaction algorithm, QGSJET. An E-’ differential spectrum of over one thousand iron and one thousand proton showers of energies between 1Ol3 and 10’6.5 eV was generated. The events were weighted to account for the steeper cosmic ray spectrum (see Section 8). To conserve computing time, the algorithm employs a technique called thinning, where air shower particles are only selectively tracked when their energies are below 10 GeV ( 100 GeV), in showers whose primary energy was less than (greater than) 1016 eV.
M.A. K. Glasrnacher et al. /Astroparticle
Physics 12 (1999) l-l 7
Muon Lateral Distribution
EM Lateral Distribution
R t-,
R
,nrr*,
Fig. 2. Comparisons of the distributions of surface and underground particles for the simulation (o - protons, A - iron) to data - l . The distributions for showers with more than 60 and less than 200 alerts are averaged. The simulation showers are weighted to correct their flat spectrum to the observed cosmic ray spectrum.
The simulated air shower particles are then processed through a simulated detector to account for the effects of particle detection, the array trigger, and the electronics. The detector simulation followed the particles in detail. All steps from the deposit of energy in the scintillator to the generation of a signal waveform from the photomultiplier were accurately simulated, based where possible on field measurements [ 10,151. After accounting for detector calibration and triggering procedures, the simulated events closely resemble real data showers, enabling the use of the same set of reconstruction codes that are used on actual data. The resulting core location, shower direction, and shower sizes are then retained for further analysis. Fig. 2 compares the lateral distributions of particles collected in the surface array and in the buried array, obtained from the data, to those from the simulated iron and proton showers. In both cases, the data have a shape intermediate between the “extremes” of iron and protons, giving confidence that the simulation is accurate. Many other comparisons have been made which show good agreement between the data and simulation [ 151, including examination of quantities such as the arrival times of particles at the ground, the shape of the shower front, and the distributions of alerted and triggered surface stations.
The different air shower simulations used by different experiments make comparing results complicated. Some simulations differ enough to change the character of the results based on them. The main disagreement between the most commonly used codes lies in their hadronic interaction characteristics, specifically their assumptions about inelastic cross sections and the distribution of energy and momentum flow in interactions (e.g., inelasticity). At the energies of interest in this analysis (10” eV to lOI eV) the observable differences between simulations are manifested primarily in the total numbers of muons and electrons ( Np and N,,) at the ground, but not in the shapes of the lateral particle distributions. A careful, systematic comparison [ 16,171 showed that the SIBYLL hadron interaction code differs more from a group of other interaction codes (VENUS, QGSJET, HDPM, and DPMJET) than these codes differ from each other. This by itself does not imply that SIBYLL is somehow incorrect as such, but rather that experiments which use SIBYLL to interpret their analyses may yield different results from work using the other models. Indeed, agreement between different experiments may be quite strong when the same simulations are employed [ IS]. Nevertheless, at the highest energies studied in this
M.A.K. Glasrnacher
et al. /Astroparticle
analysis (N 1016 eV) , SIBYLL may “underproduce” muons when compared to CASA-MIA data. The average N, reconstructed from the data appears to correspond to SIBYL], predictions for primary nuclei slightly heavier than iron [ 151. The other interaction algorithms appear to constrain the data between iron and proton primaries better at the highest energy (see Section 7 below). Therefore, in this analysis MOCCA/SIBYI,L’s predictions of the fitted sizes N, and N,, are modestly shifted to approximate the other simulations. For proton showers, log,,(N,* lw,,CN,)
) = log,(l r:N;*) ;t (0.07 - 0.04 log,a E) , (1) = h,o(
N;)
+ (-.03
+ 0.02 log,, E) , (2)
where N” denotes t.he original unscaled values, and E is the primary energy in units of TeV. For iron induced events, the shifts am log,,(N,,) log,,( N,)
= log,cl:N,P, ) + (0.05 - 0.04 log,, E) , (3) = log,,( N;)
+ (0.01 + 0.02 log,, E) , (4)
using the same notation. Other quantities which are directly related to the shower size should also be scaled using these equations. In this analysis, these include pr and pK, the density of surface particles and of muons, respectively. It is important to assess whether the choice of simulation influences the outcome of the analysis. The effect of using Eqs. ( l)-(4) to scale the values of fitted quantities from the simulation is discussed later, The choice of simulation - using SIBYLL or using other algorithms like QGSJET - does not qualitatively change the results reported here (see Section 12 below).
5. Energy determination The sensitivity of measurements of electron size at the ground to the nuclear species of the primary cosmic ray has been a potential problem for previous composition and spectrum measurements. Assumptions needed to be made about the composition in order to establish a relationship between electron size at the ground and
Physics 12 (1999) l-17
s
primary energy. However, using the muon information available from CASA-MIA, a good measurement of the primary energy which is not dependent on the underlying nuclear composition can be obtained [ 221. The bulk of the energy of the primary cosmic ray is dissipated through ionization losses in air of particles in the resulting air shower. This energy loss is roughly proportional to the primary energy. The remaining energy (also proportional to the primary energy) reaches the ground in the form of kinetic energy of individual particles, divided mainly between the electromagnetic portion (electrons, positrons, and photons) and the muons and neutrinos from hadronic interactions. CASA-MIA misses only the neutrino portion of the shower, which typically carries of the order of 10% of the primary energy. It might be expected that a suitable combination of the number of muons and the number of electrons at ground level will give a quantity that is simply related to the primary energy and independent of the type of primary particle. The number of muons in showers from heavy nuclei is greater than that from proton showers (at the same primary energy), but the number of electrons is less. The latter effect is due both to less of the shower energy flowing into electromagnetic channels as well as more rapid attenuation of the electron and photon numbers with depth after shower maximum. Shower maximum is higher in the atmosphere for heavy primaries. Fig. 3 shows the empirically determined optimal value of the energy parameterization, (N,, +25 x NP), as a function of energy, for two sets of vertical simulation showers, one purely protons and the other entirely iron. N,, and N, are obtained from the standard fits of events (see Section 2 and [ lo] ). This parameterization is logarithmically linear with energy and insensitive to the primary particle type. The factor of 25 multiplying N, differs numerically from that published previously [22] because the simulation used in this work (based on QGSJET) is different from the simulation used before (SIBYLL) . For simulated events at a given primary energy, the size N,, is less than before, but N, has increased, so the net change in energy assignment between the two simulations is typically less than 10%. No systematic change in the measured energy spectra1 index arises from the choice of simulation [ 151. The relative weighting of N, and N,, suggests that
M.A. K. Glastnacher et al. /A.stroparticle
6
Physics 12 (1999) I-17
6. The fits In order to study composition, certain parameters must be chosen and extracted to be used for the comparisons. Each parameter should have some discriminatory power by itself, and the set of parameters should be chosen carefully to give the maximum amount of discriminating power between proton and iron induced showers. This section describes the parameters chosen to best distinguish proton and iron induced showers. 6.1. The electromugnetic
4.75
4.5L..l,,.1..,l.,.l”‘1...“,,1,..1”“,.’ 2
2.2
2.4
2.5
2.8
3
3.2
3.4
3.6
3.8
logiO(EnergyTev
Fig. 3. The energy parameter loglu( A’,, + 25 x N,) as a function of energy for simulated showers from protons and iron nuclei. The two species are almost indistinguishable as desired for an energy parameterization. Over 900 proton and 900 iron vertical showers are included. The energy is the actual energy of the particle used in the simulated events.
the above parameterization is related to the total particle kinetic energy arriving at ground level. A typical muon energy at the ground is about 1 GeV, while electrons and photons are usually less than 20 MeV due to their more significant energy degradation from ionization losses. This method of energy determination yields systematic differences between the iron and proton energy assignments of less than 5%. The average absolute values of the energy reconstruction errors decreases from about 25% near 10’” eV to about 16% above 1015 eV. Such compositional insensitivity is crucial to producing an accurate energy spectrum. That is, if the composition changes in some energy region of interest, then the relationship between measured electron size and inferred energy will also change. Without an independent means of assessing the composition, an energy spectrum determined from N,, alone is subject to (unknown) systematic shifts. The weighted combination of N,, and N, is a robust way of determining the energy without these systematic uncertainties.
lateral distribution
For the purposes of the present analysis it is desirabie to examine the slope of the lateral distribution of particles as well as on the total size N,,. Iron showers develop higher in the atmosphere and so are expected to have a slightly broader lateral distribution on the ground than will proton showers of the same energy. This feature provides some discriminatory power between showers from different primaries but with the same ground-level size. Put another way, if a proton shower and an iron shower have the same size N,,, then the proton shower will have higher densities of particles very near its core, on average. A fit is performed here which differs from the standard fit of CASA-MIA data. The usual lateral distribution function is used, but its slope is allowed to vary as well as its normalization. The resulting parameterized function is then evaluated near the core, where proton and iron showers differ the most. For this analysis, a maximum likelihood fit is performed on the data using the following functional form: p:(r)
= PP(rlrl) f(r)/f(r,,)
3
(5)
where p:(r) is the measured density of particles in surface detectors at a distance r from the shower axis. The radial dependence f(r) is that of the well-known NKG [ 1 l] function, f(r) = r”( 1 + r/ro)a-2.5. The length scale parameter rO is analogous to the Moliere radius in electromagnetic cascades. The slope parameter (Yis related to the “age” s of the shower, (Y= s - 2. The evaluation point rn, is preselected and held constant during the fit. The results of this fit are two parameters, p,( r,, ) and CX.The evaluation point r,, is taken near zero so that the fitted function is evaluated very near the
M.A.K.
Glasrnacher et al. /A.stroparticle
1
Physics I2 (1999) I-17
0
2
~~,1,,,‘,,,‘,,,‘,~“111’111’,~~‘,,,’,~/ 2.2
24
2.6
2.8
3
J 3.2
34
3.6
3.8
log1 O(Energy To\
1;
Fig. 4. Fit to the lateral distribution of a typical CASA-MIA event. The histogram represents measured number of particles per CASA surface station, the curve is the fitted NKG function with detector effects. R is the distance of the measured points from the reconstructed shower core.
Fig. near iron used
shower core. At this point, the difference in pr between an iron and a proton shower of the same overall size N,, is maxima1 due to the different slopes of their lateral distributions. The standard analysis of CASA-MIA data (used in prior work) does not fit the slope of the lateral distribution. Its purpose is to evaluate overall shower size N,,. It performs a one-parameter fit of the measured lateral distribution to the NKG function,
face counters as a function of the distance r from the (fitted) core. The fitted NKG function is shown by the dashed curve. For small radii (less than about 50 m) the CASA stations begin to suffer saturation. At large radii, the detector’s inclusion of data only from stations which alerted (registered two or more hit counters) also reduces the number of particles reported by the detector. Note that in the figure, the displayed fit function has not been modified to show the saturation adjustment within 50 meters, but does accurately exhibit the “alert” adjustment as indicated by the “drop off” beyond 150 meters. The fitted function represents the data well over the entire lateral range. Studies of the procedure for larger and smaller events, and other zenith angles, indicate that it works very well for all CASA-MIA data [ 151. Fig. 5 shows the average value of the fitted NKG function evaluated near the shower core (the fitted parameter pe ( r,) ) versus primary energy, for simulated iron and proton showers. Fig. 6 similarly gives the fitted NKG slope parameter LYfor the two simulation sets. In each case, the separation of the two kinds of showers is evident.
p:(r)
0: N,* f(r),
(6)
where, as above, p:;(r) is the measured density of particles in a detector which is a distance r from the shower core and f(r) is the NKG radial dependence. The parameters LYand r0 are held constant in the standard fit. This method is computationally fast and yields accurate values of the shower size [ 10,151. The parameters pr (I-,,) an’d N,, from the two kinds of fits (i.e., the standard fit and the one used in this analysis) correlate linearly with each other to high precision [ 151. Fig. 4 displays a sample two-parameter NKG fit from the present analysis to a typical CASA-MIA event. In this figure, the data are shown by the histogram of the number of particles collected in the sur-
5. The average fitted pp (expressed as log,,,(particles/m’ ) the core) versus primary energy for simulated proton and events. Energy is the actual energy of the primary particle in the simulated events, in units of TeV.
M.A.K.
Glustnacher
et al. /Astrcprticle
Fig. 6. The average fitted slope parameter
cy versus primary energy for simulated proton and iron events. Energy is the actual energy of the primary particle used in the simulated events, in units of TeV.
Physics 12 (1999)
I-17
Fig. 7. The muon tateral distribution
fit to a single typical shower
6.2. The muon lateral distribution
As was the case for the surface particles described above, the buried counter data is fit using a maximum likelihood approach. The standard fits used by CASAMIA obtain N, by fitting the lateral distribution of muon counter hits to the Greisen function [ 121. The present analysis casts the Greisen function in a form like that in Eq. (5), substituting pP in place of pu, and using rO = 400 m. The radial dependence of the fit function is f(r) = r-,75( I + r/rO)-‘.O. The muon counter analysis only fits one parameter, the normalization pP (r,, ), using notation analogous to Eq. (5). In principle, the slope of the Greisen function could also be fit, as it was for the surface counter data using the slope parameter cy. However, the shapes of the iron and proton muon lateral distributions are very similar because of the lesser attenuation of the muon component of showers in air. The muon array does not provide enough coverage to fit the slope of the muon lateral density distribution with the required precision. The evaluation point rm of the fitted function is therefore somewhat arbitrary. A value r,, = 500 m was chosen based on technical reasons related to the quality of the convergence of the fitting algorithm [ 151. The fit of a typical muon lateral distribution is
-1.25
4.25
2.5
t,I,1,,,1,,,1...1...1,.‘1,,.1...1’.,1’.. 2
2.2
24
2.6
2.8
3
3.2
3.4
36
38
loglO(Energy TeV i
Fig. 8. The average fitted pfi 500 m from the core) versus and iron events. Energy is the used in the simulated events,
at as log,n(muons/w?) for simulated proton actual energy of the ptimary particle in units of TeV. (expressed
primary energy
shown in Fig. 7. The sparseness of the muon patches as well as their inability to measure particle numbers causes this distribution to be less clear than in the electron lateral distribution case on an event-by-event basis. Fig. 8 exhibits the clear separation of simulated proton showers from iron showers using the fitted pcL.The difference between the two sets accurately reflects the
M.A. K. Glasrntrcher et al. /A.stropnrticle
Physics 12 (1999) I-17
difference in the total muon number in showers produced by the two kinds of primary particle. 6.3. Rise times The temporal structure of particles in an air shower front is another technique which has been used in the past to study composition. It has been recently discussed as a potentially useful parameter in analyses which use muon and electron size information [ 191, as is also the case in this work. We comment here on why such information is not used in the present analysis. The analysis of rise times - the span of time in which the first 50% of the particles arrive, for example - was first employed by the Haverah Park group [ 201. This technique is more difficult to apply to CASAMIA data. Due to Haverah Park’s large size, it was able to measure particle arrival times at larger radii than CASA-MIA. The showers studied were of significantly higher energy. The arrival time distribution of the smaller showers in CASA-MIA is too short for this type of analysis, so rise times are not used here.
7. Muon size versus electron size As a first step toward evaluating the trend in primary composition. and before discussing the result of the full analysis, we 5how here a simpler approach, that of examining only the correlations of overall muon and electron sizes N, and N,,. These quantities are the usual sizes obtained from the standard fits of CASAMIA data, described in the last section, Eq. (6). This approach serves to give an idea of the behavior of the data which we will later quantify using a more complete statistical analysis. Fig. 9 shows the trend in the data and the expectation from simulation. The points represent the data. and the heavy dashed lines represent the averages for iron and proton simulated showers. The lighter dashed lines do the same for oxygen and helium simulations. More restrictive acceptance cuts are employed for this plot than in the full analysis described later: showers arriving from within 14” of vertical and landing within 50 meters of the center of the array are included. Over 240000 data showers meet these requirements. The single-composition simulation shower averages
5.5
3.5
” 5
‘I”
5.5
“““1 B
Fig. 9. Fitted Muon six represent CASA-MIA
5.5
j
’
7
”
1 I
I”’
7.5
loglO(N,j
versus fitted electron sze.
The points
data. The dashed lines represent simulated
iron, oxygen, helium. and proton showers, as labeled.
are derived from 975 (each) proton and iron showers between lOI cV and 10’6.’ eV, and 500 (each) oxygen and helium showers between 10” eV and IO’” eV. The trend seen in Fig. 9 is that the composition becomes heavier with increasing energy. The error bars represent the 1~ statistical uncertainty on the mean value of loglo( /VP) in each bin of log,,( N,,). The data appears reasonably “contained” between the simulation predictions for iron and for protons, although the muon sizes are becoming high for the largest events (log,,( N,,,) > 7.1). Had the SIBYLL simulation been used (recall SectiDn 4), the muon sizes of the data would have exceeded its predictions for iron showers in this region.
8. The KNN test The “K” Nearest Neighbor (KNN) test classifies objects as belonging to one type or another based on which type they look most resemble - or which type their characteristics are “closest” too. This can be understood in terms of an example illustrated in Fig. 10. In this figure there are two known samples, which will be referred to as the “open dots” and “closed dots” samples. A data point which fell in this graph near the position marked A would very likely belong to the
h4.A. K. Glasrnacher
IO
mu,,,,,,,,,,,,,,,,,,,,,,,,,,, 42
,.4
4.6
4.8
5
5.2
et (11./A.stroparticle
,,,,,,, j
5.4
some”c”haradt”eristic”
Fig. IO. The idea behind the KNN test. Data points falling at locations such as A, B. and C would be assigned a likelihood of belonging to the “closed dots” or “open dots” data set.
“closed dot” data set. One which fell near position B would very likely be a member of the “open dot” data set. However, a data point falling near the position C could belong to either of the two groups. So in each case, the data point is assigned a probability of belonging to one of the known distributions by finding the fraction of some number “K” nearest neighbors which also belong to that distribution. Using five nearest neighbors, point A would have nearly 100% probability of belonging to the “closed dot” distribution, B would have a 0% probability (or a 100% probability of belonging to the “open dot” distribution), and C would have about a 50% probability of belonging to the “closed dot” or the “open dot” set. This method can easily be extended to an arbitrarily large number of parameters - as many parameters as can be derived from the data and which can provide useful discrimination. The key to usefully calculating the distance from a datum point to the neighboring simulation points is ensuring that all parameters are measured in units which can be formally compared. One way is to compute distance units normalized to the variance or spread of the reference points. The Mahalobonis distance [ 211 provides a rigorous way of computing such lengths. If the set of parameters associated with two points are stored in vectors XI and x2, then the Mahalobonis
Physics
I2 (1999)
I-I 7
“distance”
between the points is
D =J(x,
- xz)‘Sp’(x,
- .x2).
(7)
In this equation S is the covariance matrix and the prime denotes the transpose. This measure of distance accurately accounts for correlations between parameters. These correlations can be extremely useful for discrimination, and it is important to account for them carefully [ 151. When implementing the KNN procedure, two or more “reference samples” are produced from simulations so that actual “data samples” may be compared to each. The reference samples in this work are the parameter sets associated with iron and with proton simulation events. The simulated reference events are fit using the same algorithms as are used on the data. Three parameters sensitive to composition (Section 6) are obtained: the density of particles near the core ptZ, the slope of the surface lateral distribution cy, and the density of muons at large distance p,+. The covariance matrix is calculated for each reference set separately. When the distance of a datum event to a shower in a reference set is calculated. then the covariance matrix for that reference sample is used. Thus, “distance” to a neighbor is defined differently for different species of reference shower, correctly accounting for any differences in the widths and corrclations of parameter distributions between species. In general, the total number K of nearest neighbors which are examined must be large enough to provide adequate statistics, yet small enough not to extend the nearest neighbor distances too far in parameter space from the datum under analysis. There are about a thousand showers per reference sample and five nearest neighbors are used in this analysis. This probes less than one percent of either reference sample. Optimization studies show that the analysis performs most accurately using K = 5 neighbors, but any choice of K between 3 and 10 works nearly as well [ 151. Special care is required using the KNN test with data on a steep energy spectrum. The cosmic ray differential energy spectrum has been measured to behave as EbY, where y = 2.7 + 3.0 through the knee region [22]. However, generating a set of simulation showers on such a steep energy spectrum will leave the high energy region sparsely populated, causing the KNN test to have to reach to greater and greater dis-
M.A. K. Glasmacher
1
0..
Fe
et al. /Astroparticle
I
p
Physics 12 (1999) I-I 7
aa-
Fe
1:
0 % Fe, 100%
1
04
02
1
Fe
1
oc
il
22
o!A
.a!,
-f
Fe
01
1
p 1
1
2.
0
0.2
FRACTION p NN
0..
p
I
p
80 % Fe,20%
P
FRACTION p NN
FRACTION p NN
as
11
p
01
0.1
LI
I
FRACTION p NN
I
L‘
1:
Fe
P
40 % Fe, 60%
p
+o: FRACTION p NN
FRACTION p NN
Fig. 11. Testing KNN. Plotted are the distributions of fractions of nearest neighbors which are protons for various input percentages of iron and proton simulatic~n showers (as labeled). The fractions can have six discrete values corresponding to O-5 proton neighbors out of the total of five. The disl:ributions have been normalized so that the y-axis corresponds to the fraction of the entire sample. The “p” and “Fe” in the upper right and left hand corners respectively mark the bins into which proton and iron showers would ideally fall.
tances to find the required number K nearest neighbors, when compared to searches at lower energies. The method adopted here is to generate a “flat” energy spectrum (E-’ ) of simulated reference showers, but then to weight leach reference event so that the weighted spectrum resembles that of the data ( E-‘.7). Then, during the examination nearest-neighbors, event weights are added until the total reaches the desired value of K. It is not necessary to precisely match the transition of energy spectral indices (2.7 --t 3.0) to obtain satisfactory, unbiased coverage of the high energy population.
9. Calibrating
the KNN test
To confirm quantitatively that the KNN test works properly, it is performed using a “calibration sample” of events instead of real data. The calibration sample is a set of simulated events of various primary nuclei and is compared to the (separately simulated) reference sample in the same way that a data sample would be. In this case, the composition is known a priori, and can be compared to the composition determined through use of the test. The analysis is performed in the same manner as that used with real data. More than 900 iron and 900
protons
0.4-
helium
0.30.20.10
( 0
I 0.2
I 0.4
, 0.6
I 0.8
’
0
, 1
(
1
0.2
0
I
I
0.4
0.6
FRACTION p NN
0.6
Fe
-I
P
I
0.6
r
1
FRACTION p NN
Fe
0.6
0.5
oxygen
iron I-l
0
,
0
I
I
0.2
0.4
I
0.6
I
0.6
I
1
FRACTION p NN Fig.
13. Testing
The fractions normalized
KNN
with
intermediate
can have six discrete
such that the y-axis
masses. The distribution
values corresponding
corresponds
to the fraction
FRACTION p NN of proton
to 0-S proton
nearest neighbor
neighbors
fraction
is shown for vxious
out of the total of five.
masq species.
The distnbutiow
have been
of the entire sample.
proton simulation showers with energies from 10” eV to lOi 5 eV are used as the reference sample. An additional 300 proton and 50 iron showers with energies below 10” eV are also included. This covers the compositionally nonsymmetric triggering threshold of the detector. Fig. I 1 shows the result of putting several different calibration sets (in place of data) through the KNN test. Each calibration set has a different fraction of proton and iron showers. Plotted is the fraction of the five nearest neighbors found in the reference set which were protons. In an ideal case, iron calibration showers would fall in the “0” bin, having no proton nearest neighbors in the reference sample, and proton calibration showers would fall in the “1” bin, having only proton nearest neighbors. In reality, fluctuations cause
many calibration proton showers to have some nearest neighbors in the reference set which are iron, and vice versa. Nevertheless, the trend will be toward iron showers falling on the left side of the histograms and proton showers falling on the right side. The method can be seen to work quite well, with more than 90% of the calibration showers having three or more nearest neighbors in the reference set of their own species. Around 50% have all five nearest neighbors of their own species. Another check of the KNN test is to attempt to identify species other than iron and protons. Fig. 12 shows the result, along with protons and iron for comparison. Helium calibration showers tend to have more proton nearest neighbors than iron neighbors in the reference set, but on average fewer protons reference
M.A. K. Glasrnacher et al./Astropurticle
’
I
a 0.9
B 5
0.8
2
0.7
_ t
L
p
-..-.._..
.-._ ‘.._ i‘-_
“+_.
0.5
He
0.4
....._._
Li
. .._ ..__ *
0.3
0
0.2
.._.
F:
.. ..
“‘.-._.__t
....
0.1 0
.._
Fig, 1.7. Nearest neighbor fraction versus log,n( A). The average fraction of nearest neighbors which were protons is shown versus the logarithm of the mas’i of the primary for simulation showers. The approximate zt IU slxead in the distribution is indicated by the boxed area. There is :I clear trend (shown by the dashed line), but the spread is too large to allow mass identification to high accuracy.
event nearest neighbors than calibration set protons have. Oxygen calibration showers, on the other hand, tend to have more iron nearest neighbors than protons, but not to the same degree as calibration iron showers themselves. Fig. 13, derived from the data in Fig. 12, shows the average fraction of nearest neighbors which were protons for each of five calibration sets using different primary masses. The boxed region represents the variation in the result in each of the five cases, showing the span over which the central 63% of the results lie. The difficulty with using the KNN test to attempt to identify the composition to finer divisions in mass than that between protons and iron is evident. Each of these distributions significantly overlaps the next higher and lower mass distribution, making it difficult to distinguish among species with even such course mass divisions. However, it is possible to say in general that a selection Iof showers which have 25% proton nearest neighbors will be heavier on average than a sample which sees 75% proton nearest neighbors.
10. The KNN results
Application of thle KNN test provides quantitative assessments of I:he likelihood that data are more “proton-like” than “iron-like” when compared to simulations. The distribution of the fraction of nearest
13
neighbors in the simulation reference set which were protons is shown in Fig. 14 for CASA-MIA data. Each event can take one of six possible values. corresponding to whether it has O-5 nearest proton neighbors out of the total of five nearest neighbors examined. The data were divided into four regimes of reconstructed energy for this plot. The highest energy data are clearly more “iron-like” than the lowest energy set. Dividing the data into smaller bins of reconstructed energy E, Fig. 15 shows the average fraction of the five nearest neighbors which were protons versus log,, E. This is termed the “proton resemblance”. Note again that the energy has been calculated in a way which is insensitive to the primary composition (Section 5). For comparison, results using pure proton and pure iron calibration sets instead of data are shown as well. The composition of the data appears to be changing toward heavier nuclei through this energy range, consistent with the trend seen earlier in the N,-N,, graphs (Section 7). Fig. 15 also exhibits two points showing the expected proton resemblance if the data were composed of a nuclear mix distributed according to the direct measurements of the JACEE experiment [ 23-251 for comparison. The location of these points has been estimated using their published fractions of nuclear types and the expected behavior of the KNN test for each (from Fig. 13). The error bars on the JACEE points estimate only the amount of error associated with converting the JACEE results into an expected fraction of proton nearest neighbors. The composition of the CASA-MIA data near lOI eV, obtained using the KNN technique, is consistent with the JACEE results. A portion of a purely proton simulation sample will be misidentified due to fluctuations. The ability to identify showers systematically improves with increasing energy. This is seen in Fig. 15 as an improvement of the identification of the “pure proton” or “pure iron” calibration samples. To better quantitatively compare the data to the simulation, Fig. 16 applies a correction to the results of Fig. 15 to account for this effect. The data points were normalized bin by bin to lie between the values of the pure iron and pure proton calibration samples shown. Hence, if the data were in fact purely composed of protons, the point in that bin would acquire a value of 1.O, while pure iron data would have a value 0.0. The size of the error bars is larger in this figure than in the
I-..,,,. -..__ .
‘il..
0.6
Physics 12 (I 999) l-l?
0
V
m
G -
ri
u-l
V
V
a
a
.A2
I
__._i
$ -.-.__-.. ._..._.._....._... __ .._ ._.._...
::.:. E....._.._
M.A.K.
04
03
0.2
0
Glasmacher
et ~11./Astroparticle
t t + ca I
#CO”
*irr
r,‘,‘,““,,,,‘,,,“,,.,‘U,,““’
2
225
25
275
3
325
35
375
4
425
Physics
I2 (1999)
1-17
t ‘t
L -t t
t
+-
45
logm(Energy. rev)
Fig. IS. Average fraction of proton nearest neighbors versus reconstructed primary energy. Error bars are statistical. Histograms show the results when substituting either a pure iron or pure proton simulation set for the data. The open squares estimate the result if the composition were distributed according to the JACEE measurements.
sults using tion 12.
11. Spectra
SIBYLL
are summarized
Fig. 16. Corrected average number of nearest neighbors versus reconstructed energy. A data set composed entirely of protons would lie along top border of the histogram, while iron would lie at the bottom. Error bars reflect the statistical uncertainty in both the data and simulation compositional assignments, providing an estimate of the overall accuracy. The open squares estimate the result if the composition were distributed according to the JACEE
below in Sec-
by comlposition
It is useful as a check of the consistency of the present analysis to see if there are differences between the energy spectra OF showers divided up according to their KNN-identified mass. Precise mass divisions cannot be performed here, but the spectra can be examined after being divided into coarse mass categories. If the knee of the spectrum is a result of cosmic rays escaping the galaxy or the upper limit of supernova acceleration, then we would expect that the energy spectra of heavy and of light elements will differ, with lighter nuclei exhibiting the knee at a lower energy than heavy ones. Fig. 17 gives the energy spectra of CASA-MIA data, having been divided into two large groups of composition. The data labeled “light” are defined for our purposes to have more than half proton nearest neighbors, and those labeled “heavy” have fewer than half. In a uniform distribution of primary types, or one similar
J
,.!
Fig. 17. CASA-MIA events selected by the KNN test as “light” or “heavy” (see text), binned by reconstructed energy. The spectra are multiplied by E2.5 for display purposes. Error bars represent the statistical uncertainty on the mean value; the potential effect of systematic errors in energy assignment are discussed in the text.
to the measured JACEE mixture [23,24], the “light” sample would be dominated by protons and helium nuclei. The actual difference in the calculated fluxes between the heavy and light sets below 500 TeV is likely less than the deviation of the points in the plot. Since
Fig.
proton
18.
notation
notation
the points represent the flux multiplied by a factor of E’,‘, a relative energy error between the points magnlfies their apparent differences. For example, an energy error of 20%. when multiplied by 2.5, would account for the observed deviations. Random energy reconstruction errors are of this size in this energy range, improving to about IO% near IO’” eV (Section 5). It is notable that when the SIBYLL simulation is used in the KNN analysis to identify the composition, the two spectrado not show such an intensity difference, but do exhibit the same degree of steepening at a similar energy as in this plot isee Section I? below. and [ 151). The spectra of the heavy and light components appear similar below 500 TeV, at which point the lighter component’s spectral index steepens. The heavier component shows no such “knee” at that energy. There may be a steepening of the heavy component at higher energy, but the statistics are too low for certainty. Given CASA-MIA’s mass resolution and the mass groupings above. we estimate that the heavy component would exhibit a spectral change at about IO times the energy of the corresponding knee of the tighter component if the composition is distributed as in the JACEE results. and is experiencing cutoffs of each component at fixed rigidity. (See [ 15,221 for further details about the spectrum and energy computation. )
showers.
with
as in Fig. 17
12. Use of other simulations The KNN analysis was also performed using a different simulation, based on the SIBYLL interaction generator ( see Section 4). None of the results are significantly altered when this is done. Fig. 18 shows the change in composition as a function of energy and the energy spectra for data grouped into sets identified as heavy or light, as described above. The notation and symbols on the left side of Fig. 18 are the same as in Fig. 16, and those on the right are as in Fig. 17. The trend toward a heavier average composition through the knee region is again apparent, as is the consistency with previous direct measurements at lower energy. A rigidity-dependent spectral knee is atso strongly suggested. The energies at which all changes occur appears to be slightly less when the SIBYLL-based simulation is employed. In light of the uncertainties discussed above, this difference is likely not significant.
13. Summary
and implications
The composition measured by CASA-MIA near IO” eV is consistent with direct measurements by other experiments. and becomes heavier through the knee region of the spectrum. At lOI eV, the data closely resemble simulated iron-induced events, in accord with measurements by other groups at higher energy. Spectra constructed separately for broad mass groups are consistent with cutoffs proportional to the
M.A. K. Glastnacher et al. /Astroparticle
particles’ rigidity. These results qualitatively do not depend on whether the hadronic interaction simulations are based on, the QGSJET model or on the SIBYLL code. These results have implications for the theory surrounding the acceleration and transport of cosmic rays. The compositional change and consistency with a mass-dependent cutoff would tend to favor a supernova shock wave acceleration scenario, a galactic escape scenario, or a combination of the two. Models which do not predict an increase in the average primary mass in this energy range are not supported by the CASA-MIA data.. The KNN analysis successfully employed here will be useful for analysis of data from the new generation of experiments at hl.gher energy, such as the HiRes Project [ 261 and the Pierre Auger Observatory [ 191. There will be even more parameters available in the new data - risetime., depth of shower maximum via fluorescence measurements, etc. - and so the potential power of the technique will be increased.
Acknowledgements The authors grateFully acknowledge the assistance of the Command and Staff of the U.S. Army Dugway Proving Grounds. We express thanks to M. Cassidy for technical support at Dugway, and acknowledge the contributions of A. Elorione, K. Gibbs, and L. Rosenberg during the construction and initial operations of CASA-MIA. We also thank S. Golwala, M. Galli, J. He, H. Kim, L. Nelson, M. Oonk, M. Pritchard, K. Riley, P. Rauske, and Z. Wells for help in the data processing. This work is supported in part by the National Science Foundation and the Department of Energy. Some authors wish to acknowledge the support of the Rackham Gra’duate School at the University of Michigan. the Grainger Foundation, and the Alfred P. Sloan Foundation. CEC is a Cottrell Scholar of the Research Corporation.
Physics 12 (1999) I-l 7
17
References II1 P.O. Lagage,
C.J. Cesarsky, Astron. Astrophys. 118 ( 1983) 223. Astrophysical 121 W.I. Axford Proc. ICRR Int. Symposium Aspects of the Most Energetic Cosmic Rays, Kofu (1990) p. 406. [31 J.A. Goodman et al., Phys. Rev. Lett. 42 ( 1979) 854. 141 H.T. Freudenreich et al.. Phys. Rev. D 41 ( 1990) 2732. 151 A.A. Watson. Proc. 25th Int. Cosmic Ray Conference, Durban, Vol. 8 (1997) p. 257. 1’31The EAS-TOP Collaboration, Proc. 25th Int. Cosmic Ray Conference, Durban, Vol. 4 (1997) p. 13. et al., Proc. 25th Int. Cosmic Ray 171 A. Chillingarian Conference, Durban, Vol. 4 (1997) p. 105. 181 K. Boothby et al., Ap. J. 491 (1997) L35. 191 S. Yoshida, H. Dai, J. Phys. G 24 (1998) 905. LlOl A. Borione et al., Nucl. Instrum. Meth. A 346 ( 1994) 329. [Ill K. Greisen, Ann. Rev. Nucl. Sci 10 (1960) 63. [I21 K. Greisen, Progress in Elementary Particle and Cosmic Ray Physics III ( 1956) 3. La [I31 A.M. Hillas, Proc. 19th hit. Cosmic Ray Conference. Jolla, Vol. I (1985) p. 155. 1141 R.S. Fletcher et al., Phys. Rev. D 50 ( 1994) 5710. I151 M.A.K. Glasmacher, Cosmic Ray Composition Studies with CASA-MIA, Ph.D. thesis (The University of Michigan, 1998), unpublished. [I61 J. Knapp, Proc. 25th Int. Cosmic Ray Conference, Durban. Vol. 6 (1997) p. 121. Karlsruhe 1171 I. Knapp, D. Heck, G. Schatz. Forschungszentrum Report FZKA 5828 ( 1996). unpublished. 1181 B.R. Dawson et al.. Astropart. Phys. 9 (1998) 331. Design Report, 2nd ed. (Fermilab. [I91 Auger Collaboration, 1997). 1201 R. Walker, A.A. Watson, J. Phys. G. 17 (1981) 1297. 1211 C.J. Hubetty. Applied Discriminant Analysis (Wiley, New York, 1994). [22] M.A.K. Glasmacher et al.. Astropart. Phys. (1999), to appear. [ 231 M.L. Cherry. Proc. 6th Conf. Intersections of Particle and Nuclear Physics, Big Sky, Montana ( 1997). to be published. [24] K. Asakimori et al., Ap. J 502 (1998) 278. [25] S. Swordy, Cosmic Ray Observations Below lOI4 eV, in: Proc. XXIII Int. Cosmic Ray Conference, (Calgary): Invited, Rapporteur, and Highlight Papers ( 1994) p. 243. [ 26 ] T. Abu-Zayad et al., Proc. 25th Int. Cosmic Ray Conference, Durban. Vol. 5 (1997) p. 321.