Forensic Science International, 17 (1981) 265 - 281
© Elsevier Sequoia S.A., Lausanne - Printed in The Netherlands

A CLASSIFICATION SCHEME FOR GLASS

D. A. HICKMAN
Metropolitan Police Forensic Science Laboratory, 109 Lambeth Road, London SE1 7LP (Great Britain)
(Received September 19, 1980; accepted December 10, 1980)
Summary

The analytical data (five elemental concentrations plus refractive index) for 349 glass samples have been assessed for the purpose of glass classification by the application of multivariate statistical procedures. The requirements of a classification scheme for glass samples are discussed, and such a scheme for unknown samples is presented. The use of this scheme to classify a glass sample involves a comparison of the glass analysis with a reference collection of analyses. By identifying glasses of similar composition to the unknown it is possible in many cases to classify the sample as sheet, container, tableware, headlamp, etc. The results are expressed as definite classifications, suggested classifications and as unclassifiable. The classification scheme was assessed by means of a blind trial set up within the Metropolitan Police Forensic Science Laboratory. For the type of sample occurring most often in casework the scheme produced a rate of successful definite classifications of 91%. No wrong classifications were obtained in the trial.
Introduction

In forensic science casework it is often important to be able to establish unambiguously the type or class to which an unknown glass sample belongs. With the primary objective of classification, an analytical procedure [1] involving the determination of manganese, iron, magnesium, aluminium and barium concentrations is in routine use at the Metropolitan Police Forensic Science Laboratory for the examination of glass samples. Background data comprising the analyses [five elements plus refractive index (RI)] of 349 glass samples have been published [2]. The purpose of the work described here was to assess these analyses in the context of glass classification. The data have been examined by a number of pattern recognition techniques. The requirements of a successful classification scheme are outlined and such a scheme, together with its practical assessment in the form of a blind trial, is presented.

Results and discussion

Classification
Classification can be defined as the desire to assign an unknown sample to one of a number of separate groups or classes. Various divisions - into
classes such as window, vehicle, container and tableware - were used in the presentation of the data from the overall survey of the 349 glass samples [2]. These groupings are traditional divisions of glasses, but it should be appreciated that such categories are somewhat arbitrary and subjective. For instance, the tableware group contains samples as diverse as drinking glasses, lead crystal glasses and dishes, ashtrays, vases and candleholders. If different manufacturing processes, with different chemical formulations, are used to produce these glasses, then they are unlikely to form one coherent group in terms of the elemental levels measured. It may be better to consider the tableware glasses as a number of distinct groups. The same argument can be applied to the other glass classes; it has been pointed out [3] that there are over 700 different glass compositions in commercial use. The number of glass formulations encountered in casework, however, is likely to be quite small since the majority of glass used in everyday life in Great Britain is produced by relatively few major manufacturers. The methods of production and the sources of raw materials of the major glass producers often do not change over considerable periods of time. The likelihood of achieving successful glass classification of this type of sample will thus be very high. Some glass classes can be expected to overlap. Much vehicle (windscreen and window) glass is manufactured by the same process as a large proportion of modern window glass, and it could be predicted that these two groups would be indistinguishable chemically. The basic philosophical approach adopted for the examination of the glass survey data was to determine how the samples divided, considering only the six analytical variables, and ignoring any pre-conceived classification of the samples. A detailed examination of the data might show that the traditional classes overlap, or that they should be split into smaller groupings. 
Any groupings so identified could possibly be related to the different compositions employed in glass manufacture.

Cluster analysis
Cluster analysis [4, 5] is a procedure that will identify natural groupings of samples, using the available analytical data. A preliminary step to the actual cluster analysis is the calculation of a single measure of similarity or dissimilarity between every pair of samples; i.e. in this case to reduce the six variables to a single variable. Various measures can be employed [6], but a distance measure is a convenient dissimilarity measure that is readily calculated. The simplest distance to comprehend is Euclidean (“ruler”) distance (ED); the ED between two specimens, 1 and 2, each measured for two variables, x and y, is given by

$$ED_{1,2} = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$$

If the two samples have been measured for three variables, x, y and z, then the ED between the samples is

$$ED_{1,2} = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2}$$
It is straightforward, mathematically, to extend this concept to four, five or n dimensions. Mean Euclidean distance (MED) and squared mean Euclidean distance (SMED) are extensions of Euclidean distance that will take into account any missing data by the use of a 1/m term (where m is the actual number of squared terms: 0 < m ≤ n). The SMED between two samples A and B is given by

$$SMED_{A,B} = \frac{1}{m} \sum_{i=1}^{n} (A_i - B_i)^2$$
where $A_i$ and $B_i$ are the co-ordinate values of variable i for samples A and B. Another important distance measure is mean character difference (MCD), also known as “Manhattan” or “City Block” distance, since the distance between two samples is not measured as a direct line, but as a “walk” along the co-ordinates [7, 8].

The analyses of the glass samples showed that the elemental levels covered different concentration ranges; for example, manganese at the parts-per-million level and magnesium at the percentage level. Since each variable is given equal weighting in calculating the distance measure, it is necessary to transform the original data in some way in order to reduce any bias in the distance measure which the different concentration ranges would produce. A simple logarithmic transformation of the data will realise the desire to give roughly equal weighting to a given percentage change in all variables. An alternative procedure is to standardise the data for each variable by adjusting to a mean of zero and a variance of one. This necessarily involves knowing the number of samples, and recalculation each time new samples are added to the data set. For logarithmic transformations such recalculations would be unnecessary. Standardisation to zero mean and unit variance implies a normal distribution of the variable; in practice this may not be the case.

The procedure adopted for studying the data from this survey was to carry out a logarithmic transformation of the data prior to the calculation of the distance measure. The elemental levels had an effective lower limit of zero, and the refractive index measurements were related to 1.5000 as a base value. The computer program NADIST [9] was used to calculate MCD, MED and SMED distances between every pair of samples in the survey, and to store these distances in the form of a matrix.
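The transformation and distance measures described above can be sketched in a few lines of Python. This is an illustration only and is not the NADIST program: the function names are hypothetical, base-10 logarithms are an assumption (the text does not state the base), and a missing measurement is represented by `None`.

```python
import math

BASE_RI = 1.5000  # RI is expressed relative to this base value before taking logs


def log_transform(sample):
    """Log-transform one analysis given as {variable: value or None}."""
    out = {}
    for name, value in sample.items():
        if value is None:                 # a missing measurement stays missing
            out[name] = None
        elif name == "RI":
            out[name] = math.log10(value - BASE_RI)
        else:
            out[name] = math.log10(value)  # assumes a strictly positive level
    return out


def _shared(a, b):
    """Value pairs for the m variables that were measured in both samples."""
    pairs = [(a[k], b[k]) for k in a if a[k] is not None and b.get(k) is not None]
    if not pairs:
        raise ValueError("samples share no measured variables")
    return pairs


def smed(a, b):
    """Squared mean Euclidean distance: sum of squared differences / m."""
    pairs = _shared(a, b)
    return sum((x - y) ** 2 for x, y in pairs) / len(pairs)


def med(a, b):
    """Mean Euclidean distance: square root of the SMED."""
    return math.sqrt(smed(a, b))


def mcd(a, b):
    """Mean character difference: mean absolute difference over m variables."""
    pairs = _shared(a, b)
    return sum(abs(x - y) for x, y in pairs) / len(pairs)


def distance_matrix(samples, dist=smed):
    """Full symmetric matrix of distances between log-transformed samples."""
    logs = [log_transform(s) for s in samples]
    return [[dist(p, q) for q in logs] for p in logs]


# Two analyses taken from the reference-collection comparison (Table 7).
sample_a = {"Mn": 9.0, "Fe": 0.018, "Mg": 1.11, "Al": 0.51, "Ba": 14.0, "RI": 1.5127}
sample_2560 = {"Mn": 15.0, "Fe": 0.022, "Mg": 1.24, "Al": 0.57, "Ba": 18.0, "RI": 1.5131}
d = smed(log_transform(sample_a), log_transform(sample_2560))
```

With these inputs `d` comes out at about 0.012, close to the 0.01223 printed for this pair in Table 7, which suggests that base-10 logarithms are a reasonable reading of the procedure. The small residual is consistent with the tabulated concentrations being rounded.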
The computers used for this work were an in-house ICL 1904S, and the CDC 7600, 6600 combination at Brookhaven National Laboratory, New York. A number of cluster analysis procedures have been described in the literature, but the approach most widely employed appears to be the SAHN method: sequential, agglomerative, hierarchic, non-overlapping clustering [8, 10]. The first step in a SAHN clustering is to search the distance matrix to identify the smallest distance between two samples and hence to define the first two-sample cluster. The matrix is then searched again and, depending on the value of a “clustering criterion”, either two new samples are joined, or a third sample is added to the first cluster of two samples. The process is repeated until all the samples are formed into one overall cluster. The clustering criterion is a mathematical function of the distance. The computer program AGCLUS [11], which incorporates seven different clustering criteria, was used to cluster the distance matrices. The clustering criteria available in AGCLUS have been described in detail by Harbottle [8].

The results of a cluster analysis are traditionally presented as a “dendrogram” [12]; this is a tree-like diagram in which the interrelationships between samples and clusters can be identified by their nearness to each other. A simple example is shown in Fig. 1. The branches of the dendrogram run horizontally across the diagram, and the abscissa is graduated in the similarity or dissimilarity measure on which the clustering is based. The ordinate in a dendrogram has no special significance, and the order in which the branches are presented can vary within wide limits without changing the inter-sample relationships. The points where two stems - either samples or clusters of samples - are tied together represent the values of the similarity or dissimilarity coefficient. In the example shown two adjacent samples are tied together at a point representing the SMED between the samples, whilst points joining groups of samples are at values of the dissimilarity coefficient calculated between the groups. This does not necessarily mean that all possible group relationships are shown. Thus in Fig. 1 the dissimilarity, for example, between 1958H and 2013H, and 1696C, 1702C and 1432C is not represented. The dendrogram is therefore an imperfect representation in two dimensions of relationships existing in many dimensions but it is nevertheless a useful means of visualising the fusions of samples which have been made at each stage in a cluster analysis.

Fig. 1. Dendrogram resulting from the cluster analysis of 21 glass samples of RI 1.5171 ± 0.0001 (S = sheet; C = container; H = headlamp; M = miscellaneous - a television tube). Abscissa: dissimilarity coefficient (0.00 - 1.20).
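The SAHN procedure outlined above can be illustrated with a short, deliberately naive Python sketch. It is not the AGCLUS program: the function name is hypothetical, and the clustering criterion used here is the “farthest neighbour” (complete linkage) rule, chosen as one possible criterion among several. The input is any symmetric distance matrix.

```python
def sahn_complete_linkage(dist):
    """Agglomerate samples from a symmetric distance matrix (list of lists).

    Returns the merge history as (cluster_a, cluster_b, level) tuples, where
    each cluster is a tuple of sample indices and `level` is the value of the
    dissimilarity coefficient at which the two stems are tied together.
    """
    clusters = [(i,) for i in range(len(dist))]   # start: one cluster per sample
    merges = []

    def linkage(c1, c2):
        # Complete linkage: the distance between the two farthest members.
        return max(dist[i][j] for i in c1 for j in c2)

    while len(clusters) > 1:
        best = None
        # Search every pair of current clusters for the smallest criterion value.
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                level = linkage(clusters[x], clusters[y])
                if best is None or level < best[0]:
                    best = (level, x, y)
        level, x, y = best
        merges.append((clusters[x], clusters[y], level))
        merged = clusters[x] + clusters[y]
        # Non-overlapping: the two fused clusters are replaced by their union.
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)] + [merged]
    return merges


# Toy example: four samples lying on a line at these (hypothetical) positions.
positions = [0.0, 1.0, 5.0, 6.5]
toy_matrix = [[abs(p - q) for q in positions] for p in positions]
history = sahn_complete_linkage(toy_matrix)
```

Each entry of `history` corresponds to one tie point in a dendrogram. The sketch rescans all cluster pairs at every step, which is adequate for a few hundred samples but is not how an optimised clustering program would be written.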
TABLE 1 Correlation coefficients for 349 glasses of various types
AI
-0.002
Ba
-0.071
RI
0.173 Mn
0.077 -0.226 0.033 Fe
-0.008 -0.177
0.324
-0.342
-0.301
-0.145
Mg
Al
Ba
Correlations between the measured variables

An assumption made in a cluster analysis approach to data interpretation is that “natural” groups are hyperspherical (i.e. spherical in more than three dimensions) in multi-dimensional space. The presence of highly correlated variables among the data would affect this assumption, and thus it was necessary to test the survey data for correlation. The Pearson Product-Moment correlation coefficients [13] were calculated for the 349 glasses considered as one group (Table 1). For strong correlations the Pearson Product-Moment correlation coefficients approach +1 or -1, with values reducing to zero for no correlation. Solomon [14] has stated, “As a rule of thumb, correlations as high as 0.5 will not produce Euclidean distances that lead to operational difficulties”, and Harbottle [8] has quoted 0.8 as a limiting value. The data in Table 1 show that only 1 of the 15 coefficients is > ±0.5, and none are ≥ ±0.8; the great bulk of the data is therefore not highly correlated. The practical consequence of this is that a Euclidean distance measure can be safely employed in the cluster analysis procedure.

Cluster analysis procedure for the glass data

The clusterings resulting from the different distance matrices were essentially similar, but the use of SMED produced “tighter” groupings than either MED or MCD. It has been noted [8, 11] that the average SMED between all pairs of samples in a cluster is a good index of the within-cluster variance. Since the goal of applying cluster analysis to the glass data is to identify groups of samples of the same class, with the exclusion of samples of other classes, it is desirable to minimise the within-cluster variance. Thus it could be predicted that SMED would be the appropriate measure to use in such an application. The use of some clustering criteria has been criticised on theoretical grounds [15].
Seven criteria were tested in this work, and the criterion of complete linkage proved to be the most useful, giving glass groups with the minimum number of anomalous samples. In complete linkage clustering the
requirement for a sample to be admitted to a cluster is that the sample must have the shortest distance to the farthest member of the cluster. This type of clustering, also known as “farthest neighbour” or “longest link”, leads to tight, discrete and generally hyperspherical clusters [16]. To summarise, the procedure found most informative for examining the glass data with the objective of classification was to take logarithms of the original data, calculate the SMEDs between all pairs of samples, and to carry out cluster analysis using the criterion of complete linkage.

Results of cluster analysis of the survey data
The problem of glass classification would have been solved easily if the cluster analysis had shown that the glasses could be divided into a small number of groups; for example, one group each of sheet, container, tableware and headlamp glasses. In practice the most noticeable result of the cluster analysis of the 349 samples is that the glasses are separated into a large number of groups. Some of these groups are large (more than twenty samples) whilst others contain only a few samples. The cluster analysis identified one very large group, comprising 109 glass samples of very similar composition. The group statistics for each variable are listed in Table 2. A comparison of the group coefficients of variation with those of the analytical method shows that this is a very “tight” group in terms of the variables measured. This presumably is a reflection of strict quality control in the manufacturing process. The magnesium level (approximately 2%) of these glasses indicates sheet glass of modern manufacture. By referring back to the index of the laboratory glass collection the samples were identified as a mixture of window and vehicle (windscreen and window) glasses, plus two mirror glasses and a tableware glass. The latter was a puzzling inclusion amongst a group of otherwise exclusively sheet glasses. This
sample was listed as the “tray of a dressing table set”, and was thus simply a piece of sheet glass. This example illustrates the danger of preassigning samples according to their use, and shows the advantage of using an objective method of classification, in this instance one based on composition. The existence of a group containing both window and vehicle glasses shows that these glasses should not be considered as separate groups since they are indistinguishable chemically, on the basis of the five elements measured. For this group, and for the other groups in the following discussion, the group ranges were calculated by adding twice the log standard deviation to, and subtracting it from, the logarithm of the geometric mean, and then converting back to the actual concentrations.

TABLE 2
Group statistics for group of 109 sheet glasses

Statistic                                 Mn (ppm)    Fe (%)           Mg (%)         Al (%)         Ba (ppm)    RI
Mean (arithmetic)                         88          0.073            2.19           0.56           112         1.5169
Mean (geometric)                          87          0.072            2.19           0.56           110         1.5169
Relative standard deviation (%)           12.3        19.1             8.2            14.6           17.9
Relative standard deviation (%) of
  analytical method (1-month period)      6.0         8.0              5.0            6.0            9.0
Group range (geometric mean ± 2
  group standard deviations)              68 - 110    0.050 - 0.102    1.84 - 2.60    0.42 - 0.74    77 - 155    1.5149 - 1.5192

Two other sheet glass groups which were identified by the cluster analysis contained 8 and 43 samples, respectively, and had magnesium levels similar to those of the samples in the 109-sample group. The main differences between the samples in the 8- and 43-member groups were in the manganese, iron and barium concentrations, but the two groups could be pooled to form a 51-member group, the group statistics of which are given in Table 3. The group ranges of this sheet group actually encompass all the samples of the 109-member group, and thus the ranges are applicable to a group of 160 sheet glasses. The important distinction between the two groups is that the ranges of the “tight” group encompass no non-sheet glasses, whereas the cluster analysis showed some non-sheet glasses to fall within the broader sheet group. The 51 sheet glasses were again a mixture of window, vehicle and mirror glasses. Several other large groups of glasses were evident in the cluster analysis; these could be combined to give a group of 58 container glasses and a group of 36 tableware glasses, the statistics for which are also given in Table 3. The remaining samples in the survey were grouped as small clusters of similar samples.
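The group-range calculation described above (geometric mean with limits set at two log standard deviations) can be sketched as follows. The function name and the `offset` parameter are hypothetical, base-10 logarithms are assumed, and the use of the sample (n - 1) standard deviation is also an assumption, since the text does not say which estimator was used.

```python
import math
import statistics


def group_range(values, k=2.0, offset=0.0):
    """Return (geometric_mean, low, high) for one variable of a glass group.

    The limits are mean(log) -/+ k * sd(log), converted back to the original
    scale. `offset` is subtracted before taking logs and added back at the
    end, e.g. offset=1.5000 for refractive index measured from that base.
    """
    logs = [math.log10(v - offset) for v in values]
    mean_log = statistics.mean(logs)
    sd_log = statistics.stdev(logs)      # sample standard deviation (assumption)
    centre = offset + 10 ** mean_log
    low = offset + 10 ** (mean_log - k * sd_log)
    high = offset + 10 ** (mean_log + k * sd_log)
    return centre, low, high


# Hypothetical concentrations spanning two decades, for illustration only.
gm, lo, hi = group_range([10.0, 100.0, 1000.0])
```

Because the limits are symmetric on the logarithmic scale, the ratio `hi / gm` equals `gm / lo`, so the range is wider above the geometric mean than below it on the concentration scale. The published ranges show this pattern, for example 68 - 110 ppm about a geometric mean of 87 ppm for manganese in Table 2.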
These clusters were usually of one type of glass but some contained samples of two types (for example, container and sheet or tableware and sheet), showing that the six variables measured were insufficient to classify unambiguously such compositions. The sheet glasses which were indistinguishable from some container and tableware glasses were some of the low level magnesium glasses, having compositions typical of old window glass. In most cases the small groups of samples were readily distinguishable one from another, but the calculation of group statistics on such small numbers of samples could be misleading. The cluster analysis showed that overall there were greater variations in the compositions of container and tableware glasses than there were in sheet glasses. This is a reflection of the greater range of manufacturing processes and compositions used in container and tableware glass manufacture. Several general points emerged from studying the compositions of different glass types. Glass from vehicles of foreign manufacture sometimes had low levels of barium and/or manganese. Tinted vehicle glass was similar to the most common type of sheet glass on all the parameters measured except iron which was present at much higher concentrations (0.2 - 0.4%). A few
sheet glasses were encountered which matched the ranges of the main sheet group on all variables except barium, which was present at the 1000 - 2000 µg g⁻¹ level. One sample of patterned window glass had an extremely high manganese content (1.03%).

TABLE 3
Group statistics for groups of 51 sheet glasses, 58 container glasses and 36 tableware glasses

Group                  Statistic                          Mn (ppm)    Fe (%)           Mg (%)         Al (%)         Ba (ppm)    RI
51 sheet glasses       Mean (arithmetic)                  68          0.080            2.18           0.52           80          1.5173
                       Mean (geometric)                   60          0.071            2.16           0.44           62          1.5171
                       Relative standard deviation (%)    41          58               16             50             67
                       Group range                        17 - 181    0.027 - 0.187    1.57 - 3.00    0.11 - 1.70    14 - 260    1.5126 - 1.5234
58 container glasses   Mean (arithmetic)                  135         0.034            0.098          0.70           185         1.5188
                       Mean (geometric)                   107         0.030            0.078          0.69           171         1.5188
                       Relative standard deviation (%)    68          77               89             19             44
                       Group range                        22 - 509    0.010 - 0.084    0.02 - 0.29    0.48 - 0.99    73 - 404    1.5158 - 1.5225
36 tableware glasses   Mean (arithmetic)                  13          0.022            1.35           0.61           288         1.5154
                       Mean (geometric)                   11          0.021            1.31           0.60           242         1.5154
                       Relative standard deviation (%)    48          36               26             18             64
                       Group range                        4 - 35      0.009 - 0.046    0.76 - 2.26    0.41 - 0.87    73 - 800    1.5111 - 1.5214

The groupings of container glass were often not exclusively of colourless glasses; the five-element analysis thus did not always identify the colouring agent. Some coloured containers were found to have high concentrations of iron, and some green containers had high chromium contents. Two interesting groupings of container glasses were observed: one contained several milk bottles whilst the other, well-defined, group comprised four milk bottles, one orange squash bottle and another bottle, all products from
one supermarket chain. These groupings presumably indicate glass from one factory or the use of one formulation. Some container glass samples were essentially unique; for example, glass from an Advocaat bottle, manufactured in the Netherlands, had low levels (for container glass) of manganese, iron, magnesium and barium. Two types of headlamp glass were identified. One type comprised glasses of low RI having low levels of manganese, iron, magnesium and barium but with aluminium present at about the 1% level. These were borosilicate glasses. The other type was characterised by high RIs and very high levels of barium (approximately 1%). Other samples with high barium levels were the one spectacle lens analysed (1.7% Ba) and one beer glass (0.5% Ba).

Criteria for a classification scheme for glass samples

(1) The scheme must be able to classify correctly the majority of glass samples likely to be encountered in casework. The samples occurring most often are colourless glasses with RIs in the range 1.5125 - 1.5250; it is extremely important to achieve a high success rate for these glasses. Since they occur less often, glasses of unusual RI may be of high evidential value, and it is desirable to achieve successful classification of such samples.
(2) Error rates should be extremely low (and preferably zero) for any scheme to be considered reliable; for forensic work it is more important to identify some glasses as unusual (and therefore unclassifiable) than it is to attempt classification of every glass sample, with the attendant risk of error.
(3) It must be possible to refine and update the classification scheme as more analytical data become available.
(4) The classification procedure and its interpretation must be readily comprehended by the casework reporting officers, since they will be using the scheme to assess their glass evidence, and may have to explain the methods in court.
(5) The practical performance of a classification scheme must be assessed using test data and by its performance in blind trials.

A classification scheme for unknown glass samples

The cluster analysis of the survey data showed the existence of several distinct groups of glasses, containing samples of similar composition. The important fact that the members of these groups appeared to be of a single type (for example, all sheet or all container) suggested a reasonable basis for classifying unknown glass samples, i.e. to compare the analysis of the unknown with the analytical ranges of the main groups. Three major groups were identified within the survey data, one each of tableware (36 samples), container (58 samples) and sheet (160 samples) glasses. These groups, which comprise over 70% of the survey samples, are well-separated from each other in terms of one or more of the variables measured. Within the sheet group a sub-group was identified, comprising 109 samples of very similar composition: group Sheet B. The relative importance of the measured variables in these four groups can be assessed by examining
Table 4, which gives the (mean ± 2σ) ranges (where σ is the group standard deviation) for each of the six variables measured.

TABLE 4
Group ranges for four major groups of glass samples

Group                      Mn (ppm)    Fe (%)           Mg (%)         Al (%)         Ba (ppm)    RI
Sheet A (160 samples)      17 - 181    0.027 - 0.187    1.57 - 3.00    0.11 - 1.70    14 - 260    1.5126 - 1.5234
Sheet B (109 samples)      68 - 110    0.050 - 0.102    1.84 - 2.60    0.42 - 0.74    77 - 155    1.5149 - 1.5192
Container (58 samples)     22 - 509    0.010 - 0.084    0.02 - 0.29    0.48 - 0.99    73 - 404    1.5158 - 1.5225
Tableware (36 samples)     4 - 35      0.009 - 0.046    0.76 - 2.26    0.41 - 0.87    73 - 800    1.5111 - 1.5214

In order to prove that the main groups of Table 4 are in fact separate from each other, and to establish whether any of the other samples have compositions which correspond to the group ranges detailed in this table, each sample in the survey (except the samples in a particular group) was tested by directly comparing its analytical range (i.e. mean analytical value ± an allowance for the experimental variation) with the ranges of that group. The results are summarised in Table 5; the analytical limits defining the main sheet group encompass six anomalous samples, whilst the narrower limits of group Sheet B excluded any anomalous samples from the survey data. The compositions of the anomalous samples are given in Table 6; manganese can be identified as an important element in discriminating these samples from the majority of samples in the two groups. The remaining 95 samples from the survey are a mixture of sheet, container, tableware, bulb, headlamp and other glasses, some of which are chemically similar and form small groupings of glasses of the same type, others falling in overlapping regions between types.

When confronted by the analysis of a glass of unknown class, the first step in classifying the sample is to compare its analytical range [i.e. the mean analytical value ± an allowance to take into account the known (long-term) analytical precision for each element] for each variable with the ranges of the four main groups (Table 4). A sample is assigned to a group if it matches on all six variables. If a sample can be assigned to group Sheet B as well as to the main Sheet group, then it will be reported as belonging to group Sheet B since greater confidence can be attached to this sheet classification: the group has no anomalous samples. Samples assigned to groups Sheet B or Container need no further testing for classification purposes.
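The first step of the scheme, comparing the analytical range of an unknown with the group ranges of Table 4, might be sketched as below. The group ranges are copied from Table 4. The allowances are assumptions made purely for illustration (twice the one-month analytical relative standard deviations of Table 2 for the elements, and a hypothetical ±0.0002 for RI), and the function names are likewise hypothetical.

```python
GROUP_RANGES = {  # (low, high) for each variable, from Table 4
    "Sheet A": {"Mn": (17, 181), "Fe": (0.027, 0.187), "Mg": (1.57, 3.00),
                "Al": (0.11, 1.70), "Ba": (14, 260), "RI": (1.5126, 1.5234)},
    "Sheet B": {"Mn": (68, 110), "Fe": (0.050, 0.102), "Mg": (1.84, 2.60),
                "Al": (0.42, 0.74), "Ba": (77, 155), "RI": (1.5149, 1.5192)},
    "Container": {"Mn": (22, 509), "Fe": (0.010, 0.084), "Mg": (0.02, 0.29),
                  "Al": (0.48, 0.99), "Ba": (73, 404), "RI": (1.5158, 1.5225)},
    "Tableware": {"Mn": (4, 35), "Fe": (0.009, 0.046), "Mg": (0.76, 2.26),
                  "Al": (0.41, 0.87), "Ba": (73, 800), "RI": (1.5111, 1.5214)},
}

# Assumed allowances: relative (fraction of value) for elements, absolute for RI.
ALLOWANCE = {"Mn": 0.12, "Fe": 0.16, "Mg": 0.10, "Al": 0.12, "Ba": 0.18, "RI": 0.0002}


def analytical_range(name, value):
    """Measured value widened by the assumed analytical allowance."""
    if name == "RI":
        return value - ALLOWANCE[name], value + ALLOWANCE[name]
    return value * (1 - ALLOWANCE[name]), value * (1 + ALLOWANCE[name])


def matches(sample, ranges):
    """True if the sample's analytical range overlaps the group on all variables."""
    for name, (low, high) in ranges.items():
        lo, hi = analytical_range(name, sample[name])
        if hi < low or lo > high:
            return False
    return True


def assign_groups(sample):
    """Main groups matched by the sample; Sheet B takes precedence over Sheet A."""
    hits = [group for group, ranges in GROUP_RANGES.items() if matches(sample, ranges)]
    if "Sheet B" in hits:
        hits = [group for group in hits if group != "Sheet A"]
    return hits


typical_sheet = {"Mn": 87, "Fe": 0.072, "Mg": 2.19, "Al": 0.56, "Ba": 110, "RI": 1.5169}
anomaly_g1693 = {"Mn": 16, "Fe": 0.027, "Mg": 1.83, "Al": 0.67, "Ba": 26, "RI": 1.5141}
```

Under these assumed allowances `assign_groups(typical_sheet)` reports Sheet B only, while the container sample G 1693 of Table 6 is matched by Sheet A. An empty list means the sample falls outside all four main groups and has to be compared with the remaining reference samples.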
Samples assigned to groups with anomalous samples must be compared directly to the anomalies to assess their similarity. A visual comparison of
the results, using the experience and judgement of the analyst, is the best procedure, although we have achieved some success in employing automatic calculation of a similarity measure.

TABLE 5
Anomalous samples for the main groups

Group        No. and type of glasses    Other glass samples tested for membership of the defined groups
             defining the group         No. tested    Types* falling within ranges of group
                                                      S     C     T     Other
Sheet A      160 sheet                  189           0     1     3     2 (bulb)
Sheet B      109 sheet                  240           0     0     0     0
Container    58 container               291           0     0     0     0
Tableware    36 tableware               313           1     0     0     0

*S = sheet; C = container; T = tableware.

TABLE 6
Compositions of the anomalous samples

Group        Anomalous sample    Type*    Mn (ppm)    Fe (%)    Mg (%)    Al (%)    Ba (ppm)    RI
Sheet A      G 1693              C        16          0.027     1.83      0.67      26          1.5141
             G 2583              T        18          0.041     1.84      0.53      75          1.5153
             G 2606              T        21          0.041     1.63      0.70      168         1.5180
             G 3963              T        14          0.035     1.67      0.39      89          1.5192
             G 2887              B        69          0.079     1.53      1.10      118         1.5128
             G 3322              B        85          0.083     1.73      0.73      120         1.5138
Tableware    G 33                S        38          0.040     1.84      0.47      122         1.5180

*C = container; T = tableware; B = bulb; S = sheet.

The SMEDs, using all six variables, between the assigned sample and the anomalies in that group are computed; experience has shown that a SMED of less than 0.01 would suggest samples to be indistinguishable, while SMEDs smaller than 0.03 indicate similar samples. These figures are not absolute indicators, and are always supplemented by visual examination of the data. The SMED calculation gives equal weighting to each variable, and in the present form it is relatively insensitive to (significant) changes in RI since the RIs are measured relative to 1.5000 as a base value. The analyst’s judgement gives more importance to magnesium, iron and refractive index than to the other variables, and it may be possible in the future to design a similarity measure to incorporate these intuitive judgements.
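The working thresholds quoted above, and the remark about the insensitivity of the SMED to refractive index, can be illustrated with a small Python sketch. The function names are hypothetical, base-10 logarithms are assumed, and the two analyses are invented for the purpose: they are identical except for an RI difference of 0.0010.

```python
import math

VARIABLES = ("Mn", "Fe", "Mg", "Al", "Ba", "RI")
BASE_RI = 1.5000


def smed_six(a, b):
    """SMED over all six variables after a log10 transform (RI from its base)."""
    total = 0.0
    for name in VARIABLES:
        x = a[name] - BASE_RI if name == "RI" else a[name]
        y = b[name] - BASE_RI if name == "RI" else b[name]
        total += (math.log10(x) - math.log10(y)) ** 2
    return total / len(VARIABLES)


def similarity_label(distance):
    """Apply the two working thresholds; visual checking is still required."""
    if distance < 0.01:
        return "indistinguishable"
    if distance < 0.03:
        return "similar"
    return "distinguishable"


# Invented analyses: same elemental levels, RI differing by 0.0010.
glass_1 = {"Mn": 87, "Fe": 0.072, "Mg": 2.19, "Al": 0.56, "Ba": 110, "RI": 1.5170}
glass_2 = dict(glass_1, RI=1.5180)

ri_only_distance = smed_six(glass_1, glass_2)
ri_only_label = similarity_label(ri_only_distance)
```

Here `ri_only_distance` is roughly 0.0001, far below the 0.01 threshold, even though an RI difference of 0.0010 would normally be regarded as significant. This is the behaviour that makes visual examination of the data, with extra attention to RI, a necessary supplement to the automatic calculation.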
If the unknown sample cannot be assigned to any of the four main groups it is compared in turn to the other 95 samples from the background survey. These “other” samples comprise 22 sheet glasses (14 of which have low concentrations of magnesium), 33 container, 22 tableware, 6 headlamp, 4 bulb, 1 spectacle lens and 7 miscellaneous (TV tube, goldfish bowl, etc.) glass samples. Although the comparisons could be effected by simple visual inspection of the data, in practice it is quicker and less subjective to use a computer-based method. A satisfactory procedure has been found to be the calculation of the SMEDs between the sample and the 95 “other” samples, and then sorting these distances to find the nearest neighbours to the unknown. The values of the SMEDs will reveal the closeness of matching, distances smaller than 0.03 indicating similar samples. Depending on the number, type and closeness of the nearest neighbours, it is often possible to make a reasonable judgement of the class of glass to which the unknown sample belongs. An example of such a comparison is given in Table 7. This listing of the closest samples to the unknown shows that two samples have distances smaller than 0.03, the criterion for “similar” samples. Inspection of the data shows that these two samples are very similar to the unknown; they are both tableware glasses and therefore the unknown is classified as tableware. The next two nearest samples, one a tableware and the other a container glass, can be discriminated from the unknown by their refractive index and magnesium content, respectively.

TABLE 7
Comparison of an unknown sample with the glasses in the reference collection

Distances sorted to find closest samples to unknown: ten nearest neighbours to sample A

Sample    Type*    Distance    Mn (ppm)    Fe (%)    Mg (%)    Al (%)    Ba (ppm)    RI
A                  0.00000     9.0         0.018     1.11      0.51      14.0        1.5127
2560      T        0.01223     15.0        0.022     1.24      0.57      18.0        1.5131
3153      T        0.01892     17.0        0.015     1.35      0.57      10.0        1.5139
3389      T        0.03607     18.0        0.014     0.85      0.42      7.0         1.5180
1693      C        0.03785     16.0        0.027     1.83      0.67      26.0        1.5141
2568               0.08151     4.0         0.027     1.31      0.17      26.0        1.5018
1866               0.09552     21.0        0.063     2.07      0.53      8.0         1.5232
c 01               0.11464     3.0         0.018     2.32      0.45      55.0        1.5164
789                0.13275     20.0        0.067     1.00      0.23      41.0        1.5195
2553               0.15197     14.0        0.013     0.24      0.56      61.0        1.5178

List of similar samples to sample A - Distances less than 0.03000

2560      0.01223
3153      0.01892

*T = tableware; C = container; S = sheet.

This approach for classifying “unusual” samples depends for its success on the reference collection containing a sufficient number of such samples, so that the chance of matching the unknown to sample(s) in the collection is maximised. Some glass types, such as spectacle lenses, bulbs and tinted windscreens, are encountered infrequently, but if similar glasses have been analysed previously such samples can often be correctly classified. It is now our routine practice to add all analyses of casework control glasses to the collection of background data. It can be expected that as more samples are added to the background data further main groupings (of greater than 20 samples) may be identified, for example, of old (low magnesium) sheet glass, borosilicate headlamps, etc. The procedure of finding and assessing the nearest neighbours to an unknown sample will in some cases suggest further tests which could be carried out; for example, the determination of the lead (for lead crystal) or boron (for borosilicate glass) content of the sample.

Sample classifications are reported as definite, as single suggestions, as double suggestions (for example, sheet or container), or as unclassifiable. Definite classifications result either from an assignment to one of the well-defined groups with discrimination of the sample from any anomalous samples in that group, or if the search of the “other” glasses reveals several similar glasses of one class. Single suggested classifications result from one or two similar samples of a single class being identified, while the double suggestions refer to the areas where two classes overlap on the six variables measured. A sample is termed unclassifiable if no similar samples are identified in the collection; such a classification implies the elimination of the common groups. The distance measure and sorting procedure can also be used to estimate the frequency of occurrence of a given glass composition, since it is often important to know whether a case sample is of a common or an unusual composition.
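The sorting procedure used for the nearest-neighbour search, and for counting similar samples, can be sketched as follows. The four reference analyses and the unknown are taken from Table 7; the types of samples 3389 and 1693 are inferred from the accompanying discussion. Function names are hypothetical, base-10 logarithms are assumed, and the decision between a definite and a suggested classification is deliberately left to the analyst.

```python
import math

BASE_RI = 1.5000
SIMILAR = 0.03  # SMED below which samples are regarded as similar

# sample id -> (type, Mn, Fe, Mg, Al, Ba, RI); a small extract of Table 7
REFERENCE = {
    "2560": ("T", 15.0, 0.022, 1.24, 0.57, 18.0, 1.5131),
    "3153": ("T", 17.0, 0.015, 1.35, 0.57, 10.0, 1.5139),
    "3389": ("T", 18.0, 0.014, 0.85, 0.42, 7.0, 1.5180),
    "1693": ("C", 16.0, 0.027, 1.83, 0.67, 26.0, 1.5141),
}
UNKNOWN = (9.0, 0.018, 1.11, 0.51, 14.0, 1.5127)  # sample A


def _logs(analysis):
    """log10 of the five elements and of (RI - base); RI is the last entry."""
    *elements, ri = analysis
    return [math.log10(v) for v in elements] + [math.log10(ri - BASE_RI)]


def smed(a, b):
    la, lb = _logs(a), _logs(b)
    return sum((x - y) ** 2 for x, y in zip(la, lb)) / len(la)


def nearest_neighbours(unknown, reference):
    """(sample id, type, SMED) for every reference glass, closest first."""
    rows = [(sid, row[0], smed(unknown, row[1:])) for sid, row in reference.items()]
    return sorted(rows, key=lambda r: r[2])


def suggest_class(neighbours, threshold=SIMILAR):
    """Classes found among the similar samples, or 'unclassifiable' if none."""
    classes = sorted({kind for _, kind, dist in neighbours if dist < threshold})
    return " or ".join(classes) if classes else "unclassifiable"


neighbours = nearest_neighbours(UNKNOWN, REFERENCE)
n_similar = sum(1 for _, _, dist in neighbours if dist < SIMILAR)
suggestion = suggest_class(neighbours)
```

For sample A the two closest glasses are 2560 and 3153, both tableware and both below 0.03, so `n_similar` is 2 and `suggestion` is "T". Distances computed from the rounded values printed in Table 7 differ slightly from the tabulated distances, so the ordering of the more distant neighbours should not be relied on. The count of similar samples is the quantity that would feed an estimate of how common a composition is.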
The listing of the compositions of the most similar samples can be assessed in terms of discrimination from the case sample to give the number of “similar” samples in the background data.

Assessment of classification scheme

A blind trial was set up within the laboratory in order to assess the performance of the classification scheme. Fifty glass samples were analysed in triplicate by the usual ICP-AES procedure, using amounts in the range 200 - 300 µg for each analysis, and the results were processed using the classification scheme described above. The results are listed in Table 8. Firm classifications were proposed for 33 samples; these attributions were all correct. In addition, eight single suggestions of either sheet, container, tableware or headlamp and four double suggestions of container or sheet, sheet or tableware and tableware or container were made; these were also all correct. The remaining five samples were categorised as unusual since no similar samples were identified amongst the background data; these samples comprised two tumblers, an ashtray, a Pils beer bottle and green glass from a church window, with RIs of 1.5082, 1.5115, 1.5086, 1.5217 and 1.5377, respectively. These samples were reported as unclassifiable.
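The four reporting categories can be sketched as a small decision rule. The Python below is only an illustration under stated assumptions: the function name `report`, the threshold of three similar glasses standing in for "several", and the handling of more than two classes are hypothetical and are not taken from the scheme, which in practice also relies on the analyst's judgement.

```python
from collections import Counter


def report(similar_classes, several=3):
    """Map the classes of the 'similar' reference glasses to a reporting category.

    similar_classes: class letters (e.g. "S", "C", "T", "H") of the reference
    glasses judged similar to the unknown. How similarity is decided is outside
    this sketch.
    several: ASSUMED minimum number of similar glasses of one class that
    justifies a definite classification.
    """
    counts = Counter(similar_classes)
    if not counts:
        return "U"  # no similar glasses in the collection: unclassifiable
    if len(counts) == 1:
        (cls, n), = counts.items()
        # several similar glasses of one class: definite; one or two: suggestion
        return cls if n >= several else f"{cls} (sugg.)"
    if len(counts) == 2:
        return "/".join(sorted(counts))  # two overlapping classes: double suggestion
    return None  # ASSUMPTION: more than two classes is left to the analyst


print(report([]))                 # U
print(report(["S"] * 5))          # S
print(report(["C", "C"]))         # C (sugg.)
print(report(["T", "S", "S"]))    # S/T
```

The rule deliberately returns `None` rather than a class when three or more classes are represented, since the text does not describe that situation.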
TABLE 8

The analyses and the results of classification of the blind trial glasses

No.  Description         Mn (ppm)  Fe (%)  Mg (%)  Al (%)  Ba (ppm)      RI  True class*  Classification*
  1  Brown container         46.0   0.169    0.38    0.86     154.0  1.5203  C            C (sugg.)
  2  Vehicle window          91.0   0.069    2.04    0.53     101.0  1.5161  S            S
  3  Half-pint beer mug      18.0   0.015    1.18    0.60     217.0  1.5185  T            T
  4  Shop window             83.0   0.057    2.16    0.53     109.0  1.5191  S            S
  5  Vehicle window         110.0   0.105    2.29    0.61     118.0  1.5152  S            S
  6  Shop window             85.0   0.056    1.88    0.09      32.0  1.5174  S            S
  7  Container               35.0   0.036    0.28    0.58     148.0  1.5174  C            C
  8  Coffee jar              58.0   0.193    0.44    0.86     156.0  1.5219  C            C (sugg.)
  9  Pub window              97.0   0.076    2.08    0.75     146.0  1.5161  S            S
 10  Tableware               17.0   0.005    0.01    0.08     152.0  1.5082  T            U
 11  Wine glass               9.0   0.018    1.11    0.51      14.0  1.5127  T            T
 12  Window                  91.0   0.073    2.22    0.48      88.0  1.5191  S            S
 13  Milk bottle            112.0   0.026    0.06    0.74     244.0  1.5218  C            C
 14  Shop door               88.0   0.062    1.77    0.50     101.0  1.5178  S            S
 15  House window            94.0   0.077    2.27    0.61      99.0  1.5157  S            S
 16  Milk bottle             16.0   0.021    0.05    0.70       5.0  1.5186  C            C (sugg.)
 17  Bottle                 164.0   0.025    0.07    0.62      59.0  1.5188  C            C
 18  Tumbler (clear)          9.0   0.010    0.01    0.01       8.0  1.5115  T            U
 19  Tumbler (yellow)         5.0   0.010    0.04    0.13      33.0  1.5127  T            T (sugg.)
 20  Jar                     90.0   0.032    0.07    0.86     240.0  1.5221  C            C
 21  Ashtray                  5.0   0.010    0.03    0.19    2500.0  1.5086  T            U
 22  Church window         3420.0   0.620    0.45    0.84    4300.0  1.5377  S            U
 23  Caravan window         101.0   0.088    2.29    0.74     151.0  1.5159  S            S
 24  Wine glass             370.0   0.010    0.03    0.05   10000.0  1.5256  T            T (sugg.)
 25  Windscreen              68.0   0.042    2.45    0.46      24.0  1.5163  S            S
 26  Patterned glass        419.0   0.017    0.03    0.13     165.0  1.5189  T            T
 27  Shop window             86.0   0.075    2.31    0.56     102.0  1.5169  S            S
 28  Sherry glass            12.0   0.034    1.83    0.52     232.0  1.5121  T            T
 29  Windscreen              77.0   0.072    2.22    0.56     104.0  1.5166  S            S
 30  Window                  90.0   0.068    1.81    0.57     102.0  1.5180  S            S
 31  Window                  43.0   0.073    1.69    0.43      37.0  1.5189  S            S
 32  Pils beer bottle       153.0   0.163    0.11    0.69    2760.0  1.5217  C            U
 33  House window            58.0   0.050    2.43    0.18      33.0  1.5219  S            S
 34  Garage window           82.0   0.074    1.88    0.46     108.0  1.5184  S            S
 35  Glass dish              15.0   0.020    0.08    0.37     181.0  1.5184  T            S/T
 36  Window                  41.0   0.029    0.06    0.11      27.0  1.5305  S            S/T
 37  Half-pint beer mug      20.0   0.023    1.15    0.55     190.0  1.5187  T            T
 38  Container               27.0   0.047    0.28    0.71     174.0  1.5174  C            C
 39  Wine glass              16.0   0.020    1.08    0.63     273.0  1.5127  T            T
 40  Vehicle glass           28.0   0.046    2.50    0.28      71.0  1.5182  S            S
 41  Headlamp                 5.0   0.022    0.04    0.91      10.0  1.4778  H            H (sugg.)
 42  Container              145.0   0.026    0.08    0.70      68.0  1.5202  C            C
 43  Green bottle           302.0   0.247    1.23    0.94     444.0  1.5246  C            C (sugg.)
 44  Shop window             96.0   0.085    1.81    0.12      28.0  1.5179  S            S
 45  Whisky bottle          328.0   0.263    0.14    0.75      66.0  1.5225  C            C (sugg.)
 46  Garage window           86.0   0.073    1.80    0.49     111.0  1.5184  S            S
 47  Pils beer bottle        57.0   0.144    0.11    1.03     155.0  1.5224  C            S/C
 48  House window            95.0   0.076    2.27    0.60      98.0  1.5157  S            S
 49  Window                 109.0   0.101    2.24    0.56     104.0  1.5165  S            S
 50  Window                  22.0   0.031    0.07    0.20      30.0  1.5300  S            S/T

*C = container; S = sheet; T = tableware; H = headlamp; U = unclassifiable.
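The summary counts quoted for the blind trial can be checked directly against the last two columns of Table 8. The short Python sketch below does this; the compact encoding of the classifications ("?" for a single suggestion, "/" for a double suggestion) and the helper names `kind` and `is_wrong` are conveniences introduced here, not part of the original scheme.

```python
from collections import Counter

# True class and reported classification for samples 1-50, transcribed from Table 8.
# Encoding (an assumption of this sketch): "?" = single suggestion,
# "/" = double suggestion, "U" = unclassifiable, bare letter = definite.
TRUE = ("C S T S S S C C S T T S C S S C C T T C T S S T S "
        "T S T S S S C S S T S T C T S H C C S C S C S S S").split()
REPORTED = ("C? S T S S S C C? S U T S C S S C? C U T? C U U S T? S "
            "T S T S S S U S S S/T S/T T C T S H? C C? S C? S S/C S S S/T").split()


def kind(reported):
    """Return the reporting category of one classification result."""
    if reported == "U":
        return "unclassifiable"
    if reported.endswith("?"):
        return "single suggestion"
    if "/" in reported:
        return "double suggestion"
    return "definite"


def is_wrong(true_class, reported):
    """A result is wrong only if it names classes that exclude the true one."""
    if reported == "U":
        return False  # no class was assigned, so nothing can be wrong
    return true_class not in reported.rstrip("?").split("/")


tally = Counter(kind(r) for r in REPORTED)
wrong = [n for n, (t, r) in enumerate(zip(TRUE, REPORTED), start=1) if is_wrong(t, r)]

print(dict(tally))
print("wrong assignments:", wrong)
```

Run on the transcribed columns, the tally gives 33 definite results, 8 single suggestions, 4 double suggestions and 5 unclassifiable samples, with no wrong assignments.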
The glass samples for the blind trial had been selected to include a number from each main class, together with a number of coloured samples. The samples were thus significantly different from the most commonly occurring samples in casework, which are colourless glasses (mainly sheet) in the RI range 1.5125 - 1.5250. The blind trial included 35 colourless glasses (21 sheet, 7 container and 7 tableware) in this RI range. Definite assignments were given for 32 of these samples (21 sheet, 6 container and 5 tableware): a 91.4% success rate. Of the remaining two samples, one was suggested as being container (correctly), the other was suggested as sheet or tableware (actually tableware). No firm classifications or single suggestions for the blind trial samples were incorrect. This zero error rate indicates the reliability of the classification scheme, the basis of which is the matching of an unknown sample to well-defined groups of samples, or directly to reference samples, with the interplay of human judgement.

Discrimination

Although the work described in this paper was undertaken with the principal aim of classifying glass samples, some observations regarding discrimination can be made. In order to assess the discrimination given by the five-element analysis, a search was made of the background data to find glasses having RIs of 1.5171 ± 0.0001. This produced a total of 21 samples, comprising 12 sheet glasses, 6 container glasses, 2 headlamp glasses and a television tube. Their analyses are listed in Table 9. Cluster analysis was performed on these samples and the resulting dendrogram is given in Fig. 1. The dendrogram shows that the four types of glass present are well separated from each other. Eleven of the 12 sheet glasses cannot be discriminated one from another on the basis of the five elements measured; the twelfth sample is distinguished by a slightly higher aluminium level and a considerably higher iron level.
The container glasses divide into two groups, due to differences in the levels of manganese and barium.

This example confirms that the five elements selected perform well for classifying glass samples. A small degree of discrimination within an overall class (such as container) is provided by the analytical procedure, but this must be regarded as a bonus. It could be predicted that variables found useful for classification will be less useful for discrimination. Amongst the variables studied aluminium is the least useful parameter for classification. A sub-division of the sheet glasses due to different levels of aluminium was identified, however, and thus in certain instances aluminium could be a useful discriminating element for sheet glass. The elemental distributions of 540 sheet glass samples reported by Goode et al. [17] indicate that caesium, rubidium, strontium and antimony might be other useful discriminating elements for sheet glass. In some cases the measurement of the elemental concentrations of manganese, iron, magnesium, aluminium and barium may provide a degree of discrimination between glass samples of identical RI, but the addition of several other elements to the analytical procedure should provide a more satisfactory basis for discrimination.

TABLE 9

Type and composition of 21 glass samples of RI 1.5171 ± 0.0001

"G" No.  Glass type*  Mn (ppm)  Fe (%)  Mg (%)  Al (%)  Ba (ppm)      RI
791      S                83.0   0.055    1.86    0.52     105.0  1.5171
1286     S                83.0   0.061    2.28    0.52     119.0  1.5170
1432     C               144.0   0.033    0.09    0.76     214.0  1.5172
1535     C                13.0   0.025    0.02    0.70     129.0  1.5170
1598     C                27.0   0.035    0.04    0.55     144.0  1.5172
1628     C               122.0   0.017    0.10    0.64     105.0  1.5171
1696     C               146.0   0.026    0.07    0.93     144.0  1.5172
1702     C               148.0   0.026    0.06    0.75     141.0  1.5171
1958     H                23.0   0.017    0.09    0.38   11000.0  1.5172
2013     H                 9.0   0.018    0.67    0.53    4100.0  1.5172
2038     M                85.0   0.030    0.02    2.40   59000.0  1.5172
2783     S                87.0   0.051    1.81    0.40      99.0  1.5171
2899     S                89.0   0.058    2.34    0.53     105.0  1.5170
3084     S                92.0   0.074    1.92    0.49      86.0  1.5171
3324     S                85.0   0.072    1.92    0.43     109.0  1.5171
3596     S                64.0   0.129    1.63    0.72     114.0  1.5171
3617     S                76.0   0.051    2.13    0.50      90.0  1.5171
3797     S               101.0   0.085    2.31    0.59     111.0  1.5170
3806     S                84.0   0.063    2.31    0.56     107.0  1.5171
3827     S                74.0   0.076    2.26    0.53      92.0  1.5172
3873     S                82.0   0.085    2.17    0.54      97.0  1.5170

*S = sheet; C = container; H = headlamp; M = miscellaneous.
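The separation of glass types among the 21 samples of RI 1.5171 ± 0.0001 can be explored with a few lines of Python. This is a minimal sketch and not the procedure behind Fig. 1: the log10 transformation of the concentrations, the Euclidean distance and the single-linkage merging rule are all assumptions made here for illustration, and the function names are hypothetical.

```python
import math
from itertools import combinations

# ("G" No., type, Mn ppm, Fe %, Mg %, Al %, Ba ppm), transcribed from Table 9.
SAMPLES = [
    (791, "S", 83.0, 0.055, 1.86, 0.52, 105.0),
    (1286, "S", 83.0, 0.061, 2.28, 0.52, 119.0),
    (1432, "C", 144.0, 0.033, 0.09, 0.76, 214.0),
    (1535, "C", 13.0, 0.025, 0.02, 0.70, 129.0),
    (1598, "C", 27.0, 0.035, 0.04, 0.55, 144.0),
    (1628, "C", 122.0, 0.017, 0.10, 0.64, 105.0),
    (1696, "C", 146.0, 0.026, 0.07, 0.93, 144.0),
    (1702, "C", 148.0, 0.026, 0.06, 0.75, 141.0),
    (1958, "H", 23.0, 0.017, 0.09, 0.38, 11000.0),
    (2013, "H", 9.0, 0.018, 0.67, 0.53, 4100.0),
    (2038, "M", 85.0, 0.030, 0.02, 2.40, 59000.0),
    (2783, "S", 87.0, 0.051, 1.81, 0.40, 99.0),
    (2899, "S", 89.0, 0.058, 2.34, 0.53, 105.0),
    (3084, "S", 92.0, 0.074, 1.92, 0.49, 86.0),
    (3324, "S", 85.0, 0.072, 1.92, 0.43, 109.0),
    (3596, "S", 64.0, 0.129, 1.63, 0.72, 114.0),
    (3617, "S", 76.0, 0.051, 2.13, 0.50, 90.0),
    (3797, "S", 101.0, 0.085, 2.31, 0.59, 111.0),
    (3806, "S", 84.0, 0.063, 2.31, 0.56, 107.0),
    (3827, "S", 74.0, 0.076, 2.26, 0.53, 92.0),
    (3873, "S", 82.0, 0.085, 2.17, 0.54, 97.0),
]


def features(row):
    """ASSUMED scaling: log10 of each of the five concentrations."""
    return [math.log10(v) for v in row[2:]]


def single_linkage(points, k):
    """Naive agglomerative clustering: merge the two closest clusters until k remain."""
    clusters = [[i] for i in range(len(points))]

    def gap(a, b):
        # single linkage: distance between the closest pair of members
        return min(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > k:
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda ab: gap(clusters[ab[0]], clusters[ab[1]]))
        merged = clusters.pop(b)  # b > a, so index a is unaffected
        clusters[a].extend(merged)
    return clusters


clusters = single_linkage([features(r) for r in SAMPLES], k=4)
types_per_cluster = sorted("".join(sorted({SAMPLES[i][1] for i in c})) for c in clusters)
print(types_per_cluster)
print(sorted(len(c) for c in clusters))
```

Under these assumptions a cut at four clusters reproduces the four glass types of Table 9 (12 sheet, 6 container, 2 headlamp and the single miscellaneous sample), driven mainly by the large differences in magnesium and barium. A different scaling or linkage rule could give a different tree.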
Conclusions

The combined variables of refractive index and concentrations of manganese, iron, magnesium, aluminium and barium provide a firm basis for classifying glass samples. A reliable scheme for classifying glass samples involves comparing the analysis of an unknown sample with a reference collection of analyses of glasses of known origin. Examination of the data from a survey of 349 glasses has shown that 70% of the survey glasses could be divided into three main groups - one each of sheet, container and tableware glasses. A simple comparison of the analysis of the unknown sample with the ranges for these groups will be sufficient to classify the majority of glasses likely to be encountered in forensic science casework. These results are not limited to analyses carried out by the inductively coupled plasma-atomic-emission spectrometric procedure; analyses performed by other techniques, such as atomic-absorption spectrometry or neutron activation analysis, could be classified by comparison to the group ranges quoted, provided that the analytical technique employed produces accurate analyses.
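The idea of comparing an unknown analysis with group ranges can be illustrated as follows. This is a sketch only: the published ranges for the main groups are not reproduced here, so the ranges are computed from the twelve sheet glasses of Table 9 purely as a stand-in, RI is left out because those twelve samples were selected on RI, and the function names are hypothetical.

```python
ELEMENTS = ("Mn", "Fe", "Mg", "Al", "Ba")

# Five-element analyses (Mn ppm, Fe %, Mg %, Al %, Ba ppm) of the 12 sheet
# glasses in Table 9. ASSUMPTION: used as a stand-in reference group only;
# these are not the published ranges of the main sheet group.
SHEET_REFERENCE = [
    (83.0, 0.055, 1.86, 0.52, 105.0), (83.0, 0.061, 2.28, 0.52, 119.0),
    (87.0, 0.051, 1.81, 0.40, 99.0), (89.0, 0.058, 2.34, 0.53, 105.0),
    (92.0, 0.074, 1.92, 0.49, 86.0), (85.0, 0.072, 1.92, 0.43, 109.0),
    (64.0, 0.129, 1.63, 0.72, 114.0), (76.0, 0.051, 2.13, 0.50, 90.0),
    (101.0, 0.085, 2.31, 0.59, 111.0), (84.0, 0.063, 2.31, 0.56, 107.0),
    (74.0, 0.076, 2.26, 0.53, 92.0), (82.0, 0.085, 2.17, 0.54, 97.0),
]


def group_ranges(rows):
    """Return {element: (min, max)} over the reference analyses."""
    return {el: (min(col), max(col)) for el, col in zip(ELEMENTS, zip(*rows))}


def within_ranges(analysis, ranges):
    """True if every element of the unknown lies inside the group range."""
    return all(lo <= analysis[el] <= hi for el, (lo, hi) in ranges.items())


ranges = group_ranges(SHEET_REFERENCE)

# Two blind-trial analyses taken from Table 8.
vehicle_window = dict(zip(ELEMENTS, (91.0, 0.069, 2.04, 0.53, 101.0)))  # sample 2
container = dict(zip(ELEMENTS, (35.0, 0.036, 0.28, 0.58, 148.0)))       # sample 7

print(within_ranges(vehicle_window, ranges))  # True
print(within_ranges(container, ranges))       # False
```

The vehicle window falls inside all five stand-in ranges, whereas the container is excluded on manganese, iron, magnesium and barium. A sample falling outside every group range would still need the nearest-neighbour search against the full collection.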
Acknowledgement

The author is grateful to Drs. Garman Harbottle and Edward V. Sayre of Brookhaven National Laboratory, Upton, New York, for informal discussions on many aspects of numerical taxonomy, and for the provision of computing facilities.

References

1 T. Catterick and D. A. Hickman, Sequential multi-element analysis of small fragments of glass by atomic-emission spectrometry using an inductively coupled radiofrequency argon plasma source. Analyst, 104 (1979) 516 - 524.
2 T. Catterick and D. A. Hickman, The quantitative analysis of glass by inductively coupled plasma-atomic-emission spectrometry: a five-element survey. Forensic Sci. Int., 17 (1981) 253 - 263.
3 G. B. Rothenberg (ed.), Glass Technology; Recent Developments, Noyes Data Corporation, New Jersey, 1976, p. 1.
4 B. Everitt, Cluster Analysis, Heinemann Educational Books Ltd., London, 1974.
5 P. H. A. Sneath and R. R. Sokal, Numerical Taxonomy, W. H. Freeman & Co., San Francisco, 1973, p. 201.
6 P. H. A. Sneath and R. R. Sokal, Numerical Taxonomy, W. H. Freeman & Co., San Francisco, 1973, p. 114.
7 P. H. A. Sneath and R. R. Sokal, Numerical Taxonomy, W. H. Freeman & Co., San Francisco, 1973, p. 121.
8 G. Harbottle, Activation analysis in archaeology. In G. W. A. Newton (ed.), Radiochemistry, Vol. 3, Chemical Society, London, 1976.
9 A. M. Bieber, NADIST, a Program for Calculating Different Kinds of Taxonomic Distance, Brookhaven National Laboratory, Upton, New York.
10 P. H. A. Sneath and R. R. Sokal, Numerical Taxonomy, W. H. Freeman & Co., San Francisco, 1973, p. 214.
11 D. C. Olivier, AGCLUS, an Aggregative Hierarchical Clustering Program, Department of Psychology and Social Relations, Harvard University, Cambridge, Massachusetts.
12 P. H. A. Sneath and R. R. Sokal, Numerical Taxonomy, W. H. Freeman & Co., San Francisco, 1973, p. 259 et seq.
13 N. H. Nie, C. H. Hull, J. G. Jenkins, K. Steinbrenner and D. H. Bent, Statistical Package for the Social Sciences, 2nd edn., McGraw-Hill, New York, 1975, p. 279 et seq.
14 H. Solomon, in F. R. Hodson, D. G. Kendall and P. Tautu (eds.), Mathematics in the Archaeological and Historical Sciences, Edinburgh University Press, Edinburgh, 1971, p. 67.
15 G. N. Lance and W. T. Williams, Computer programs for hierarchical polythetic classification (“similarity analyses”). Computer J., 9 (1966) 60 - 64.
16 P. H. A. Sneath and R. R. Sokal, Numerical Taxonomy, W. H. Freeman & Co., San Francisco, 1973, p. 222.
17 G. C. Goode, G. Wood, N. Brooke and R. F. Coleman, Multi-element analysis of glass fragments by neutron activation and the application to forensic science. Atomic Weapons Research Establishment, Aldermaston, U.K. Report No. 024/71, (1971).