Analytica Chimica Acta, 161(1984) 75-82 Elsevier Science Publishers B.V., Amsterdam - Printed in The Netherlands
ASPECTS OF BIOMARKER ANALYSIS BY GAS CHROMATOGRAPHY/MASS SPECTROMETRY WITH SELECTIVE METASTABLE ION MONITORING Part 2. Information Content of Biomarkers in Some Light Oils
OLAV H. J. CHRISTIE* and TRYGVE MEYER
Rogaland Research Institute, P.O. Box 2503, Ullandhaug, N-4001 Stavanger (Norway)
PAUL W. BROOKS
Institute of Sedimentary and Petroleum Geology, Geological Survey of Canada, 3303-33 St. NW, Calgary, Alberta (Canada)
(Received 21st November 1983)
SUMMARY
The information content of steranes and pentacyclic triterpanes is extracted by means of eigenvector (factor) analysis. Steranes carry information different from that of hopanes even though they both provide information about the maturity of the source rock. Moretanes and 20R hopanes provide similar information which differs from that of 20S hopanes. The steranes carry somewhat different information, which is opposite to that carried by a series of unidentified triterpanes. The diacholestanes and cholestanes yield different types of information.
Biomarker molecules, such as steranes and triterpanes, are of considerable interest in petroleum chemistry and organic geochemistry because they can be used to solve specific problems concerning the maturity of natural hydrocarbon source rocks and, in part, the migration of natural hydrocarbons from source rocks to petroleum reservoirs (see, e.g., [1-4]). Such compounds are identified and quantified by gas chromatography/mass spectrometry (g.c./m.s.), which until recently was hampered by considerable interference problems that limited quantitative measurements of sterane and pentacyclic triterpane chromatograms to a few peak ratios. With the introduction of metastable ion monitoring g.c./m.s., it became possible to record carbon number-specific chromatograms of steranes and triterpanes [5-7]. This helped to solve the interference problem and opened the way for more selective surveys of the significance of biomarkers.

In view of the increased selectivity provided by the new measurements, some criticisms may now be raised against current methods for data analysis of biomarker constituents in natural hydrocarbon mixtures. A major objection concerns the use of ratios. It is well known that information is lost when data are transformed to ratios, e.g., percentages. A decrease in the percentage of men in a population does not indicate whether some men have died, more men than women have died, some girls have been born, or more girls than boys have been born, all of which could have been read directly from the original data. This exemplifies how pertinent information may be lost by transformation of raw data to ratios.

A more serious problem, however, is that present techniques allow measurement of more than sixty molecule-selective sterane and triterpane concentrations. This means that the samples can be described in terms of a variable space with more than sixty dimensions. The location of sample plots in this multidimensional space is vastly more informative than a few ratios. Thus, by using ratios rather than plots in a multidimensional biomarker space, the organic geochemist discards a large proportion of the information inherent in the data. Therefore, new techniques of data analysis, e.g., disjoint principal components analysis and eigenvector projections, are likely to be used to a larger extent in future organic geochemistry (see, e.g., [8]).

In order to learn more about the multidimensional variable space constructed from the biomarker variable axes, measurements from a small set of oil samples from the North Sea were investigated by means of eigenvector analysis, one of the many methods of pattern recognition. The goal of the present study is to clarify the correlations between the biomarker molecules. As the behaviour of steranes and triterpanes represents one of the best clarified fields in organic geochemistry, these compounds were selected for an introductory survey of how useful eigenvector analysis would be for the study of variable interrelations.

DATA AND METHODS
A set of seven light oil and condensate samples was analysed by metastable ion monitoring g.c./m.s. The present data for 63 ion-specific peaks of pentacyclic triterpanes and steranes in the carbon number range 27-34 were obtained by the method described by Brooks et al. [5]. The peak identification is given in Tables 1 and 2.

TABLE 1

Peak identification of triterpanes

Peak no.  No. of carbons  Identification
1         27              18α(H)-trisnorneohopane
2         27              17α(H)-trisnorneohopane
3         28              unident.
4         28              unident.
5         28              unident. (Bisnorneohopane?)
6         28              unident. (Bisnorneomoretane?)
7         29              unident.
8         29              unident.
9         29              unident.
10        29              Norhopane
11        29              Normoretane
12        29              unident.
13        30              unident.
14        30              unident.
15        30              unident.
16        30              unident.
17        30              17α(H)-hopane
18        30              Moretane
19        31              unident.
20        31              unident.
21        31              unident.
22        31              17α(H)-homohopane (22S)
23        31              17α(H)-homohopane (22R)
24        31              Homomoretane
25        31              unident.
26        31              unident.
27        32              unident.
28        32              17α(H)-bishomohopane (22S)
29        32              17α(H)-bishomohopane (22R)
30        32              Bishomomoretane
31        33              17α(H)-trishomohopane (20S)
32        33              17α(H)-trishomohopane (20R)
33        34              17α(H)-tetrakishomohopane (20S)
34        34              17α(H)-tetrakishomohopane (20R)

The information carried by the different molecules was studied by means of multivariate correlation in terms of eigenvector loadings. Excellent descriptions of eigenvectors and factor analytical approaches have been given elsewhere [9, 10] and will not be reiterated here. This method was chosen in particular because it treats the data in a multivariate manner (i.e., simultaneously), in contrast to the pairwise sequential treatment of conventional cluster analytical algorithms.

The basic idea of the approach is that the chemical difference between samples can be read from the sample plots in a variable space diagram consisting of as many axes as there are variables measured. In the present study, the variable space is 63-dimensional, because 63 variables were measured. According to the fundamentals of pattern recognition (e.g., [8, 11]), large point scatter in the variable space is consistent with high density of information. The direction of largest point scatter corresponds to the direction of the first eigenvector, and the direction of highest point scatter at right angles to the first eigenvector is consistent with the direction of the second eigenvector. Consequently, the first-to-second eigenvector plane corresponds to the direction of largest point scatter and thus contains more information than any other two-dimensional plane of projection. The third eigenvector is perpendicular to this plane.

The multivariate correlation becomes evident from the direction of the eigenvectors in the variable space. The eigenvector direction is described in terms of a linear combination of all the variables, and the weighting of each variable defines the direction. This weighting is called the eigenvector (or factor) loading, and the correlation between the loadings of each eigenvector carries pertinent information about the data structure. The loadings were computed with the aid of the SIMCA program [12].
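The eigenvector loadings described above can be illustrated with a minimal sketch. It does not reproduce the SIMCA program; it uses a singular value decomposition from NumPy as a stand-in, and it assumes that each variable is centred and scaled to unit variance before the decomposition, which the text does not state. The function name `eigenvector_loadings` and the synthetic 7 x 63 matrix are hypothetical and are not the measured peak heights.

```python
import numpy as np

def eigenvector_loadings(X, n_components=3):
    """Return loadings, scores and explained-variance fractions.

    X has one row per sample and one column per variable (peak height).
    """
    X = np.asarray(X, dtype=float)
    # Centre each variable on its mean.
    Xc = X - X.mean(axis=0)
    # Assumption: scale each variable to unit variance (autoscaling).
    std = Xc.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # leave constant variables unscaled
    Z = Xc / std
    # Rows of Vt are the eigenvectors of Z.T @ Z, ordered by decreasing scatter.
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    loadings = Vt[:n_components].T                 # (n_variables, n_components)
    scores = U[:, :n_components] * s[:n_components]  # sample coordinates
    explained = s**2 / np.sum(s**2)                # fraction of scatter per eigenvector
    return loadings, scores, explained

# Synthetic stand-in for a 7 samples x 63 peaks matrix (not the paper's data).
rng = np.random.default_rng(0)
X = rng.lognormal(mean=0.0, sigma=0.5, size=(7, 63))
loadings, scores, explained = eigenvector_loadings(X)

# Columns 0 and 1 of `loadings` are the coordinates of each variable
# in a first-to-second eigenvector loading plot.
print(loadings[:5, :2])
print(explained)
```

With seven samples, centring leaves at most six independent directions, so the seventh explained fraction is zero within rounding. The loading plots discussed in the results correspond to plotting one column of `loadings` against another.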
RESULTS AND DISCUSSION
All the recorded biomarkers
The first-to-second eigenvector loading plot of all the recorded biomarker peaks is given in Fig. 1, with the known categories of molecules indicated. The most obvious feature of Fig. 1 is that the main groups of biomarker molecules form clusters: cholestanes to the right, moretanes and the 20R epimers of the hopanes at the top, and the hopane 20S epimers at the bottom. Further, most of the unidentified C28-C31 triterpanes form a cluster to the left.
TABLE 2

Peak identification of steranes

Peak no.  No. of carbons  Identification
35        27              13β,17α-diacholestane (20S)
36        27              13β,17α-diacholestane (20R)
37        27              unident.
38        27              13α,17β-diacholestane (20S)
39        27              13α,17β-diacholestane (20R)
40        27              unident.
41        27              unident.
42        27              unident.
43        27              14α,17α-cholestane (20S)
44        27              14β,17β-cholestane (20R)
45        27              14β,17β-cholestane (20S)
46        27              14α,17α-cholestane (20R)
47        28              24-Me-13β,17α-diacholestane (20S)
48        28              24-Me-13β,17α-diacholestane (20R)
49        28              24-Me-13α,17β-diacholestane (20S)
50        28              24-Me-13α,17β-diacholestane (20R)
51        28              unident.
52        28              unident.
53        28              24-Me-14α,17α-cholestane (20S)
54        28              24-Me-14β,17β-cholestane (20R)
55        28              24-Me-14β,17β-cholestane (20S)
56        28              24-Me-14α,17α-cholestane (20R)
57        29              24-Et-13β,17α-diacholestane (20S)
58        29              24-Et-13β,17α-diacholestane (20R)
59        29              24-Et-13α,17β-diacholestane (20S)
60        29              24-Et-13α,17β-diacholestane (20R)
61        29              24-Et-14α,17α-cholestane (20S)
62        29              24-Et-14β,17β-cholestane (20R + 20S)^a
63        29              24-Et-14α,17α-cholestane (20R)

^a Poorly resolved.
For the samples studied, the most important data structure is consistent with the direction of the first eigenvector (p1). This gives information on the segregation into two groups: cholestanes, and the triterpanes including the hopanes. There are a few exceptions to this general observation which will be discussed below. The second important feature of Fig. 1 is the segregation of the hopanes along the direction of the second eigenvector (p2). The location of the 20R and 20S clusters is in harmony with the well-known inverse proportionality of these hopane epimers, which has been used for evaluation of maturity [1, 3, 13]. Further, the moretane and the hopane (20R) clusters overlap in Fig. 1. This indicates that these two groups of biomarker molecules yield similar information in the set of samples studied. That is in harmony with the observation that 20R hopanes and moretanes both disappear with increasing maturity of the source rock [14].

Fig. 1. First (p1) to second (p2) eigenvector plot of the variables of the light oil and condensate samples studied. Clusters of steranes and triterpanes indicated. Variables not included in clusters are mostly diasteranes.

In order to avoid interferences from other groups, a closer analysis of each of the main groups of the studied biomarkers is given below. The point locations in the following diagrams cannot be directly compared to those of Fig. 1 because the directions of the eigenvectors are different from those for the whole set of data.

Hopanes and moretanes
The complete separation of moretanes and the hopane epimers is seen from the eigenvector diagram of Fig. 2(a). The diagram reflects the present theories on the geochemistry of these triterpanes: the moretanes and the 20R hopane epimers cluster close together on the right of the diagram. This demonstrates that they carry similar geochemical information, and the inverse proportionality of the 20S and 20R hopanes is illustrated by the 20S cluster being located at the edge opposite to the 20R cluster.
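The way inverse proportionality shows up in a loading plot can be sketched with synthetic data. The example below assumes a hypothetical maturity index for seven samples and three invented variables (`hopane_20S`, `hopane_20R`, `moretane`) that rise or fall with it; none of these numbers come from the measurements. It uses the same autoscaling assumption and NumPy decomposition as a generic principal components calculation, not the SIMCA program.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical maturity index for seven samples (not measured values).
maturity = np.linspace(0.1, 0.9, 7)

def noise():
    """Small random disturbance added to each synthetic peak height."""
    return rng.normal(scale=0.02, size=maturity.size)

peaks = {
    "hopane_20S": maturity + noise(),               # increases with maturity
    "hopane_20R": 1.0 - maturity + noise(),         # decreases with maturity
    "moretane": 0.5 * (1.0 - maturity) + noise(),   # decreases with maturity
}
names = list(peaks)
X = np.column_stack([peaks[n] for n in names])

# Centre and scale each variable, then take the first eigenvector.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
_, _, Vt = np.linalg.svd(Z, full_matrices=False)
p1 = dict(zip(names, Vt[0]))

for name, value in p1.items():
    print(f"{name:12s} loading on first eigenvector: {value:+.3f}")
```

In this sketch the two variables that decline together receive loadings of the same sign, while the variable that increases receives the opposite sign. The overall sign of an eigenvector is arbitrary, so only the relative signs carry meaning.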
Fig. 2. (a) First to second to third eigenvector plot of triterpanes. Third eigenvector along the vertical direction. S and R are the hopane epimer clusters, and M the moretanes; U denotes the two groups of unidentified triterpanes, one forming a flat U-shaped cluster at the bottom of the diagram, and the other a straight plane in the upper right half. Variables 1 and 2 are the 18α(H) and 17α(H) isomers of trisnorneohopane; variables 10 and 17 are norhopane and 17α(H)-hopane, respectively. Variable 21 lies between 7 and 20 and has been omitted for clarity; 26 lies on the straight U plane behind the R cluster, and 32 was probably erratic and was omitted. (b) First to second eigenvector plot of triterpanes, demonstrating the main density points of information. U and U' indicate the two groups of unidentified triterpanes.
The 18α/17α inversion of trisnorneohopane upon increasing maturity is also clearly seen from the inverse proportionality shown in Fig. 2(a) (points 1 and 2), the 18α-isomer (point 1) being consistent with relative immaturity. The inverse proportionality of point 17 (C30 hopane) and points 11 and 18 (C29 and C30 moretanes) is in harmony with the findings of Seifert and Moldowan [14], who introduced the ratio of C30 hopane to the sum of C29 and C30 moretanes as a specific indicator of maturation; their work was later supported experimentally [15]. Norhopane (point 10) falls between the 20R and the 20S epimers of the heavier hopanes. There seems to be inverse proportionality between norhopane and normoretane (points 10 and 11).

Among the unidentified triterpane peaks, a certain information structure may be observed in the set of samples studied. There seem to be two groups of unidentified isomers in the C28-C30 range: one consists of points 4-9 and 13-16 and carries information similar to that of 17α-trisnorneohopane (point 2), whereas the second group (points 3, 11, 12, and 18) can provide information similar to that of the C31 hopane 20R epimers and 18α-trisnorneohopane (point 1). Further clarification of the geochemical information of these compounds must await clarification of their chemical identity.

The major feature of Fig. 2(a) is that there seem to be three main density points of information. This is more clearly seen from the first-to-second eigenvector plot of Fig. 2(b), even though the information about the cluster separation along the third eigenvector is omitted from this diagram. One type of information is carried by the C31+ hopane 20S epimers, another type seems to be carried by the majority of unidentified triterpanes making up the cluster U, and a third type is carried by the 20R epimers, the moretanes and the smaller group of unidentified triterpanes U'.
Steranes
The eigenvector loading plots of the steranes and diasteranes are given in Fig. 3. As can be seen, the information content of the steranes is different from that of the diasteranes. The steranes are characterised by forming dense clusters in the diagram, whereas the diasterane clusters are much less dense. This may indicate that the steranes inform about only a simple process and the diasteranes about a variety of independent factors.

Fig. 3. First to second eigenvector plot of the identified C27-C29 steranes. The cholestanes form a well defined cluster to the right of the diagram (C), whereas the diasteranes are spread around (D1, D2 and D3). This indicates that the information supplied by the steranes is uniform and different from that of the diasteranes.

The sterane isomers of carbon numbers 27 and 28 fall in clusters that are elongated along the second eigenvector. The 14α,17α 20R and 20S epimers are inversely proportional (points 43 and 46, and 53 and 56, respectively). This is consistent with the well-known epimerisation of the steranes on increasing maturity [3, 4, 13, 15] and, from the clustering in Fig. 3, this direction in the data matrix studied can be assigned to maturity.

The 20R epimer of C27-C29 sterols seems to be a sensitive indicator of the origin of sedimentary organic matter in terms of land plant versus marine organism [16], and the corresponding steranes have been used as source indicators because they seem to be little influenced by thermal stress [2, 4, 13]. These compounds are not segregated in the direction of the second eigenvector in Fig. 3 (points 44, 46, 54, 56, 62, and 63), and, consequently, this direction is not consistent with source differences in the samples studied (cf. [17]).

The C29 sterane epimer and isomer relations were used for an estimate of combined maturation and migration [4] with the use of the ratio of the 20S and 20R epimers of the 14α,17α-isomer versus the ratio of the 20R epimers of the 14β,17β to 14α,17α isomers (points 61 to 63 versus 62 (poorly resolved) to 63 in Fig. 3). These compounds lie close to one another in this diagram, and it is notable that the C27 and C28 analogs are preferable because of their larger chromatographic peak separation as well as their larger separation in the present diagrams and, consequently, their higher load of information. For the C27 steranes, the corresponding variable ratio 43 to 46 versus 44 to 46 would result in a line of nearly the same angular coefficient as the line of first-order kinetic conversion in Fig. 10 of the paper by Seifert and Moldowan [4]. This is true for the C28 steranes also (points 53 to 56 versus 54 to 56). If the statement by Seifert and Moldowan [4] about the influence of maturation on these peak ratios and the selective influence of migration upon the 20R 14β,17β- and 14α,17α-isomers is correct, this gives further support to the statement that migration cannot be assigned to the variation along the first eigenvector of Fig. 3. However, it remains a fact that the data structure of the steranes of the samples studied is dominated by segregation of the diasteranes from the normal steranes. This point calls for further studies in order to be clarified.
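The ratio cross-plot mentioned for the C27 steranes (points 43 to 46 versus 44 to 46) can be sketched as follows. The function name `epimer_crossplot_slope` and the peak heights are hypothetical and serve only to show the arithmetic; fitting a straight line by least squares with `numpy.polyfit` is an assumption about how the angular coefficient might be estimated, not a description of the procedure used here or in [4].

```python
import numpy as np

def epimer_crossplot_slope(aa_20S, bb_20R, aa_20R):
    """Fit a line to the sterane epimer/isomer ratio cross-plot.

    aa_20S : peak heights of the 14α,17α (20S) epimer (e.g. point 43)
    bb_20R : peak heights of the 14β,17β (20R) epimer (e.g. point 44)
    aa_20R : peak heights of the 14α,17α (20R) epimer (e.g. point 46)
    """
    aa_20S = np.asarray(aa_20S, dtype=float)
    bb_20R = np.asarray(bb_20R, dtype=float)
    aa_20R = np.asarray(aa_20R, dtype=float)
    y = aa_20S / aa_20R   # 20S/20R epimer ratio of the 14α,17α isomer
    x = bb_20R / aa_20R   # 14β,17β to 14α,17α ratio of the 20R epimers
    # Least-squares straight line; polyfit returns [slope, intercept] for degree 1.
    slope, intercept = np.polyfit(x, y, 1)
    return x, y, slope, intercept

# Hypothetical peak heights for seven samples (illustration only).
aa_20S = [4.1, 4.8, 5.5, 6.0, 6.9, 7.4, 8.2]
bb_20R = [5.0, 5.9, 6.4, 7.2, 8.1, 8.8, 9.5]
aa_20R = [9.0, 8.6, 8.1, 7.7, 7.2, 6.9, 6.3]

x, y, slope, intercept = epimer_crossplot_slope(aa_20S, bb_20R, aa_20R)
print(f"angular coefficient: {slope:.3f}, intercept: {intercept:.3f}")
```

Note that both ratios share the same denominator, so part of any correlation between them is induced by the ratio transformation itself, which is the kind of effect the introduction warns about.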
Conclusion
The geochemical information carried by biomarker molecules analysed by metastable ion monitoring can be illustrated by means of eigenvector analysis. This method indicates that in the set of samples studied, the information carried by the steranes is different from that carried by the hopanes, although they carry information about maturity common to both groups of biomarkers. Evidently, diasteranes carry information very different from that carried by the steranes. The well-known segregation of the 20R and 20S hopanes, and the similarity of 20R hopanes and moretanes, are also seen from the eigenvector diagrams.

Even information about unidentified triterpanes can be read from eigenvector diagrams based on measurements of peak heights in metastable monitoring chromatograms. The information appears to fall into two quite different groups, one being somehow related to the 20R hopanes, and the other carrying unidentified information.

REFERENCES
1 W. K. Seifert, Geochim. Cosmochim. Acta, 42 (1978) 473.
2 W. K. Seifert, in A. Prashnowsky (Ed.), Int. Alfred Treibs Symp., Munich, Bayr. Julius-Maximilians-Univ. Würzburg, 1980, p. 13.
3 A. S. Mackenzie, R. L. Patience, J. R. Maxwell, M. Vandenbroucke and B. Durand, Geochim. Cosmochim. Acta, 44 (1980) 1709.
4 W. K. Seifert and J. M. Moldowan, Geochim. Cosmochim. Acta, 45 (1981) 783.
5 T. Meyer, O. H. J. Christie and P. W. Brooks, Anal. Chim. Acta, 160 (1984) 65.
6 P. W. Brooks, T. Meyer and O. H. J. Christie, in P. A. Schenck (Ed.), Advances in Organic Geochemistry 1983, Elsevier, 1984.
7 G. A. Warburton and J. E. Zumberge, Anal. Chem., 55 (1983) 123.
8 O. H. J. Christie, K. Esbensen, T. Meyer and S. Wold, in P. A. Schenck (Ed.), Advances in Organic Geochemistry 1983, Elsevier, Amsterdam, 1984.
9 J. C. Davis, Statistics and Data Analysis in Geology, Wiley, New York, 1973.
10 K. G. Jöreskog, J. E. Klovan and R. A. Reyment, Geological Factor Analysis, Elsevier, Amsterdam, 1976.
11 K. Varmuza, Pattern Recognition in Chemistry, Springer Verlag, Heidelberg, 1980.
12 S. Wold, Pattern Recog., 8 (1976) 127.
13 W. K. Seifert and J. M. Moldowan, Geochim. Cosmochim. Acta, 42 (1978) 77.
14 W. K. Seifert and J. M. Moldowan, in A. G. Douglas and J. R. Maxwell (Eds.), Advances in Organic Geochemistry 1979, Pergamon, Oxford, 1980, p. 229.
15 M. Schoell, M. Teschner, H. Wehner, B. Durand and J. L. Oudin, in M. Bjorøy (Ed.), Advances in Organic Geochemistry 1981, Wiley, Chichester, 1983, p. 156.
16 C. Lee, J. W. Farrington and R. B. Gagosian, Geochim. Cosmochim. Acta, 43 (1979) 35.
17 P. F. V. Williams and A. G. Douglas, in M. Bjorøy (Ed.), Advances in Organic Geochemistry 1981, Wiley, Chichester, 1983, p. 568.