Classification of malnutrition by statistical analysis of quantitative two-dimensional gel electrophoresis of plasma proteins


COMPUTERS AND BIOMEDICAL RESEARCH 19, 340-360 (1986)



KARL LONBERG-HOLM,* MURIEL S. DOLEMAN,‡ LAWRENCE B. SANDBERG,† AND AARON J. OWENS§

*E. I. duPont deNemours and Company, Central Research Department, Wilmington, Delaware 19898; †Departments of Pathology, University of Utah and Veterans Administration Medical Centers, Salt Lake City, Utah 84148; ‡Image Analytics Corporation, Montchanin, Delaware 19710; and §E. I. duPont deNemours and Company, Engineering Department, Wilmington, Delaware 19898

Received January 16, 1986

An attempt to use the relative concentrations of major plasma proteins for clinical assessment of severe malnutrition is described. Quantitative two-dimensional gel electrophoresis was used to measure the concentrations of 24 major proteins in small aliquots of plasma obtained from children, aged 0 to 3 years, who were patients and outpatients in Liberian hospitals. Fifteen had a clinical diagnosis of kwashiorkor, 36 were diagnosed with marasmus, and 18 were controls. There were also 5 controls from the United States. The individuals were placed in six groups: kwashiorkor, kwashiorkor who died during treatment, marasmus, marasmus who died, Liberian controls, and U.S. controls. The amount of protein in each spot in the two-dimensional gels was estimated by measuring bound stain using a laser scanner and computerized image analysis. We found very low serum transferrin levels in malnourishment, in agreement with reports from other investigators. All of the data for 24 protein variables were pooled for factor analysis; the mean factor scores for each group differed, with the kwashiorkor groups furthest from the controls. Results of discriminant analysis using the amounts of different numbers of protein variables (3 to 24) were compared for posterior assignment of individuals to groups. The validity of the method was tested by analysis of plasma aliquots which were obtained from patients following initiation of therapy and which were not a part of the training set. Predictive performance (prognosis of patient survival) depended upon the number of protein variables used. Although artifactual fitting of the data is expected to contribute to performance as the number of variables is increased, use of as many as 7 variables may be justified, even with our small patient groups. Possible use of these results for development of a practical clinical test is discussed. © 1986 Academic Press, Inc.

This work was supported by the Central Research Department of E. I. duPont deNemours and Co. Support for L. B. Sandberg also came from the U.S. Veterans Administration and the University of Utah.

0010-4809/86 $3.00. Copyright © 1986 by Academic Press, Inc. All rights of reproduction in any form reserved.

For many life processes there may be no single diagnostic biochemical variable. The technique of two-dimensional gel electrophoresis (2DGE) provides a way to separate many biochemical variables at one time. Up to several thousand proteins may be resolved as spots in a single “high-resolution” gel. Quantitation and statistical treatment of 2DGE have begun (1), and this approach may become a powerful method for distinguishing between cell types, stages of differentiation, or even species.

In our experience, the presently used techniques for “high-resolution” 2DGE do not give adequate recovery for quantitation of proteins with relative molecular mass (Mr) greater than about 80-100 kDa. Therefore we have adapted a “low-resolution” 2DGE method especially for plasma proteins, many of which have high Mr (2). This method produces relatively large spots with correspondingly reduced resolution, but has the advantage of not using an isoelectric focusing step. Isoelectric focusing in the first dimension of 2DGE leads to losses of high-Mr proteins, and also separates glycoproteins as “charge trains” of spots which may overlap other unrelated spots. An additional advantage of the “low-resolution” system is that disulfide-linked subunits are not reduced and dissociated, which would add to the complexity of the gel pattern. Applications of this system to human plasma proteins are described elsewhere (2-4).

Computerized methods have already been described for analysis of gel spots (5, 6). Our laboratory has developed instruments for scanning and analyzing 2D gels (3). The presently used instrument (7) is now available commercially. We have used this system to gather quantitative data on the major proteins in small samples of plasma.

In the present study, we compared control plasma with plasma obtained from 0- to 3-year-old children with marasmus (wasting malnutrition) or kwashiorkor (malnutrition with both wasting and edema). These were gravely ill children; some died while others recovered. We do not know in every case if the child recovered, nor can we be certain that malnutrition was the primary cause of death when a child did not recover.
However, we hoped to discover some criterion which could identify those in greatest jeopardy so that future cases might be given extra attention: treated as inpatients rather than outpatients, or administered parenteral fluids. Preliminary findings using conventional methods of analysis suggested that low levels of fibronectin might indicate an increased probability of mortality (8). Visual inspection of the 2D gels indicated that low levels of transferrin and hemopexin might also be indicators, and this was confirmed by quantitative 2DGE. Others have also found transferrin to be a marker for malnutrition (9-13).

We then analyzed our 2DGE data using a computer program for discriminant analysis (DA). Although our groups of patients were too small for confident application of DA using the available spot data, we experimented with different sized data sets containing between 24 and 3 variables. We also placed the samples into three as well as six groups. Although not definitive, our results are an early attempt to use quantitative 2DGE for selection of significant plasma protein variables for prognosis in disease.

METHODS

Samples. Collection of the plasma samples used in this work has been described elsewhere (8). Briefly, samples of EDTA plasma were obtained from malnourished and control children between ages 0 and 3 years during a physical examination upon admission to hospitals and clinics in Liberia. Rubeola is endemic in West Africa, and many of the cases of malnutrition may have been precipitated by measles. Samples were also obtained from a small group of similarly aged healthy children from the United States during medical examinations. Aliquots of 100 µl plasma were stored at -70°C or with solid CO2, and were analyzed within 1 year.

Table 1 describes the patients and controls in this study. The Liberian control patients in many cases were seen as a result of another health problem, but were not primarily malnourished. Also, in one case, a “recovered” malnourished child was included among the control group. Levels of fibronectin were not available for all of the controls. Patients with ambiguous clinical diagnosis, or patients given prior blood transfusions, were excluded from this study. For a few patients, several samples were taken at intervals following treatment; only the admission sample was included in the intergroup comparisons.

Chemical and immunochemical assays. Total plasma protein was measured with the Ektachem 400 Analyzer (Eastman Kodak, Rochester, N.Y.). Fibronectin was measured turbidimetrically (kits supplied by Cooper Biomedical, Malvern, Pa.), as previously described (8). α-2-Macroglobulin was measured by rocket electrophoretic immunoassay employing rabbit antiserum and purified antigen (to be described elsewhere).

TABLE 1

AVERAGE CHARACTERISTICS OF GROUPS ANALYZED BY 2DGE^a

Group  No. individuals  Female/male  Age (months)  Weight deficit^b (%)  Length deficit^b (%)  Arm circumference^c (cm)  Fibronectin^d (mg/dl)
CL     18               8/10         19            12                    2.5                   13.8                      26.1
CU     5                0/5          21            n.a.                  n.a.                  n.a.                      (17.7)
K      9                5/4          21            24                    6.6                   11.9                      10.3
KD     6                2/4          25            32                    8.1                   12.2                      6.0
M      30               11/19        16            42                    7.9                   9.8                       10.2
MD     6                2/4          18            50                    9.1                   9.2                       5.1

^a The plasma samples were from children aged 0-3 years and comprised six groups: CL, controls from Liberia; CU, controls from the United States; K, Liberian patients with kwashiorkor; KD, kwashiorkor patients who died subsequent to initiation of therapy; M, Liberian patients with marasmus; and MD, patients with marasmus who died. Mean values for anthropometric data are given for each group except CU, for which these data were not available (n.a.).
^b Length and weight deficits were calculated by the following formula: % deficit = 100 - 100 x (measurement/standard for age). Standard values were from standard charts for North American children (20).
^c Upper arm circumference in centimeters. Since these values are fairly constant in children from 1 to 5 years of age, only values from children over 12 months were used in this calculation.
^d Fibronectin was not measured in some of the controls: data were available and averages were taken from 17 CL and 2 CU individuals.
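As an illustration of the deficit formula given in footnote b of Table 1, the following is a minimal Python sketch. The function name and the example weights are hypothetical and are not taken from the study; the reference value would come from the growth charts cited in the footnote.

```python
def percent_deficit(measurement: float, standard_for_age: float) -> float:
    """Percent deficit relative to the reference standard for the child's age.

    Implements: % deficit = 100 - 100 * (measurement / standard for age).
    """
    if standard_for_age <= 0:
        raise ValueError("standard_for_age must be positive")
    return 100.0 - 100.0 * (measurement / standard_for_age)


# Hypothetical example: a child weighing 8.8 kg whose reference weight
# for age is 10.0 kg has a weight deficit of about 12%.
example_deficit = percent_deficit(8.8, 10.0)
```

A negative result would simply indicate a measurement above the reference standard.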


Quantitative 2D gel electrophoresis. All samples were subjected to “low-resolution” two-dimensional gel electrophoresis by the methods described in detail elsewhere (2). Samples (4 µl) were placed in 2-mm-diameter wells in 10-cm agarose gels and subjected to electrophoretic separation in pH 8.6 barbital buffer. Strips 1 cm wide were excised and placed upon 5.5-11% gradient sodium dodecyl sulfate (SDS) polyacrylamide gels. In each case residual agarose gel strips from each side of the sample strip were dried and stained to show that virtually all protein was transferred to the second-dimension gels. Following the second-dimension separation, the 12 x 12-cm gels were fixed individually in methanol/acetic acid/water (3/1/6 by volume, “destain solution”) and stained on a platform rocker overnight in Pyrex trays with Coomassie brilliant blue R250 (182 mg/liter in destain solution), using 330 ml for two gels in one dish. The excess stain was poured off, the dish and the surfaces of the gels were wiped clean of particulate material, and the gels were destained by rocking for 24 hr with 400 ml destain per two gels. The gels were drained and treated again with destain in the same way. Prior to scanning, the gels were stored in the second equilibrated destain solution for up to 3 days in the dark at 4°C.

Stained gels were scanned in the LGS-30 helium-neon laser scanner and 2D gel analyzer (Image Analytics Corporation, Montchanin, Del.). Each gel was covered with equilibrated destain solution and “sandwiched” between optically flat glass plates. The image of each gel was digitized in 1000 x 1000 picture elements using a range of 4096 gray levels for 3.0 absorbance units. In earlier experiments, only 256 gray levels were available, and gels were scanned twice, first with full scale at 0.75 absorbance units, and then with 3.0 absorbance units. Data from the second scan were used for spots which were saturated in the first scan.
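The two-pass scanning used with the earlier 256-level hardware can be sketched as follows. This is only an illustration under stated assumptions: both scans are taken to be already converted to absorbance units and registered pixel for pixel, the substitution is made per picture element rather than per spot as in the text, and the function name `merge_scans` is hypothetical rather than part of the instrument's software.

```python
import numpy as np


def merge_scans(low_range, high_range, low_full_scale=0.75):
    """Combine a sensitive scan with a wide-range scan of the same gel.

    low_range:  absorbance image recorded with full scale at 0.75 A
    high_range: absorbance image recorded with full scale at 3.0 A
    Picture elements that reach full scale in the first scan are treated
    as saturated and replaced by the value from the second scan.
    """
    low = np.asarray(low_range, dtype=float)
    high = np.asarray(high_range, dtype=float)
    if low.shape != high.shape:
        raise ValueError("the two scans must have the same shape")
    saturated = low >= low_full_scale
    return np.where(saturated, high, low)
```

The design choice here is to keep the finer gray-level resolution of the first scan wherever it is valid and fall back on the second scan only where the first is clipped.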
A “blank” image was recorded at least once every hour and used to correct each subsequently recorded gel image according to the procedure supplied with the instrument. This corrected for inhomogeneity in the response of the detector along the axis swept by the laser. The image was temporarily stored on a “Winchester” hard disk and then subsequently transferred to magnetic tape for archival storage.

It was found that Coomassie-blue-stained protein spots usually did not exceed an optical density of 3.0 when gels were prepared by the procedure described above. This probably resulted from the tendency for protein solutions not to exceed a concentration limited by their electrical conductivity during “stacking” in the discontinuous buffer system of the second-dimension separation. Thus the albumin spot, which contains about half the total plasma protein, had a broad outline but did not exceed the absorbance range of the detection system.

The digitized image of each gel was viewed on the LGS-30’s video monitor. Boundaries were placed to exclude visual edges, and a histogram of picture element density versus its frequency was constructed. It was assumed that the “background” staining of the gel was the most populated density in the histogram, and this uniform value was subtracted from the entire image. The image of the gel was then “modeled” using the SURF program and the PDP 11/74 computer supplied with the instrument. The components modeling each spot were combined manually using the interactive ISUMM program as described by the instrument’s manufacturer. The resulting integrated optical densities (or “volumes”) for each protein spot were placed for future evaluation into a Saturn Calc spreadsheet program (Saturn Systems, Minneapolis, Minn.). Spots which were too faint to be detected by the SURF program (using 600 components for modeling the gel image) were scored as zero concentration. Since the sensitivity for detection corresponded to 1-2 mg/dl, zero was a reasonable approximation for any of the index spot proteins, all of which had average concentrations equal to or greater than 6 mg/dl.

Most of the comparisons described in this report used data which had been normalized with total plasma protein measured in mg/dl. Arbitrarily, the volumes of 24 index spots were summed and then the volumes of the individual spots were multiplied by the ratio of the total protein to the sum of the volumes to give an approximate value for each protein component in mg/dl. The concentration of α-2-macroglobulin (spot 1) was also measured immunochemically and compared with values based upon quantitative 2DGE normalized with total protein. The directly measured values were on the average 15% lower. This was probably due to omission of about 15% of the total protein in the 24 index spots, assuming equivalent binding of Coomassie blue to different protein species under denaturing conditions.

The precision for measurement of protein concentrations from the 2D gels was tested in two ways. Plasma was diluted with buffer (0.8, 0.6, 0.4, and 0.2 times the initial concentration), and duplicate samples were analyzed. The r values (correlation coefficients) were calculated from the regressions of the spot volumes. In this test, r values were 0.92 or greater for 18 of the 24 index spots, and 0.98 or greater for 10.
This demonstrates that the measured “volume” of each spot is directly proportional to the concentration of the corresponding protein in the sample. Also, an estimate of the average standard deviation for spots was calculated by comparison of duplicate analyses of two samples, and triplicate analyses of two other samples. The average coefficient of variation for the 24 index spots was 26%, the average for the 18 spots with r above 0.92 was 22%, and the average for 7 “key” variables (spots 0, 1, 3, 5, 7, 11, and 33; see Results) was 19%. Further analyses of the limitations of the quantitative 2D gel system will be reported elsewhere (C. A. Phillips et al., in preparation).

Discriminant and factor analysis. We used the SAS statistical package (SAS Institute, Cary, N.C.) with the IBM 3081 mainframe computer (International Business Machines Corp., New York). First, the spot volumes were normalized as fraction of the sum of the 24 index spots. This data set for 74 samples was analyzed by the FACTOR program (14), which computed factor scores for each plasma sample based upon its location along directions of data variation in 24-dimensional hyperspace (15). The results were then examined by plotting the mean factor scores for each clinical group, to reveal structure within the data.

Data sets with proteins expressed as percentage of total volume for 24 index spots, or normalized with total protein concentration, were then subjected to the DISCRIM program for discriminant analysis, and the STEPDISC program for selection of those variables which are most important for discrimination of the groups (14). The DISCRIM program computed a coefficient for each variable depending upon group assignment. The variables for each plasma sample were multiplied by the corresponding coefficients for each group, one group at a time. A posterior probability that a plasma sample belonged to a group was computed from the sum of these products. When the probability of assignment to any one of the designated groups was less than 50%, we elected to assign the sample to a group called “other.” It was assumed that the members of each group were normally distributed with equal covariance matrices, so that linear discriminant analysis was used in all cases (14, 16). The STEPDISC program was used to evaluate the contribution of each variable to discrimination by iterative stepwise selection (14, 17).

RESULTS

Figure 1a is a map of the 2DGE separation of human plasma proteins. The 24 index spots we have analyzed are numbered, together with three other spots. The protein identities of some of the numbered spots are known (legend to Fig. 1). It should be noted that resolutions of spots 7 and 11 were not optimal in these gels, and they are known to contain more than one protein species. Under the conditions used, spot 9 (prealbumin) ran beyond the anode wick and was not recovered quantitatively. Spot 15 (haptoglobin) is shown as the 1-1 phenotype in all panels of Fig. 1. Since type 2-1 and 2-2 haptoglobins contain oligomers of very high molecular weight which are not recovered in the separating gel (2), and because the plasma levels of haptoglobin are variable following hemolysis or bruising, we did not include this protein in our analyses. Spot 35 (α-1-acid glycoprotein) was excluded from quantitation because of interference from albumin in some of the gels. Figure 1b shows the separation of plasma proteins of a Liberian control (the map was traced from this pattern). Figure 1c is the analysis of the plasma from a child with kwashiorkor who subsequently died, and Fig. 1d is from a child who presented with marasmus and who survived.

We examined samples from 74 children from six groups, as described in Table 1. Of these, a total of 51 were severely malnourished. We did not find the appearance of any “spot” characteristic of severe malnutrition or impending death. Some samples contained elevated acute phase reactants (C-reactive protein, ceruloplasmin, etc., not illustrated), probably reflecting infections. The 2D gel analyzer was used to measure the “volumes” of 24 index spots in each of the 74 plasma samples, as described under Methods. These were expressed either as a fraction of the sum of the 24, or were normalized to mg/dl using the ratio of chemically measured protein concentration divided by the sum of the “volumes” of the 24 spots (this introduces a systematic error of about 15% which represents unmeasured proteins; see Methods).
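The two ways of expressing spot “volumes” described above can be sketched in a few lines of Python. This is a minimal illustration, not the spreadsheet procedure actually used; the function names and the example numbers are hypothetical.

```python
import numpy as np


def fraction_of_sum(volumes):
    """Express each spot volume as a fraction of the summed index-spot volumes."""
    v = np.asarray(volumes, dtype=float)
    return v / v.sum()


def normalize_to_mg_per_dl(volumes, total_protein_mg_dl):
    """Scale spot volumes so that they sum to the measured total protein.

    Each volume is multiplied by (total protein / sum of volumes). Because
    proteins outside the index spots are ignored, the resulting values
    overestimate true concentrations (by about 15% according to the text).
    """
    v = np.asarray(volumes, dtype=float)
    return v * (total_protein_mg_dl / v.sum())


# Hypothetical sample with three spots and 6000 mg/dl total protein.
example = normalize_to_mg_per_dl([50.0, 30.0, 20.0], 6000.0)
```

In this hypothetical example the three spots are assigned 3000, 1800, and 1200 mg/dl, which by construction add up to the measured total protein.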

FIG. 1. Typical 2D gel patterns of plasma containing type 1-1 haptoglobin. (a) Map of 24 major spots used for analysis, plus spots 9, 15, and 35. (b) Control Liberian child. (c) Patient with kwashiorkor (this patient died). (d) Patient with marasmus. The gels were dried and then photographed (2). The identities of some of the spots are as follows: (0) albumin, (1) α-2-macroglobulin, (2) α-1-antitrypsin, (3) transferrin, (5) hemopexin, (7) α-lipoproteins, (9) prealbumin, (14) C3, (15) haptoglobin, (18) albumin dimer, (19) factor B, (24) inter-α-inhibitor, (28) α-1-antichymotrypsin, (35) α-1-acid glycoprotein, (53) fibrinogen, (99) IgG. Spots with dashed outlines are variable in control samples. The diffuse spot 3A reacted with antibody to transferrin (3), and is probably a complex containing this protein. Spots 7 and 11 were not resolved from neighboring minor spots, and thus contained additional protein components.

TABLE 2

CONCENTRATIONS OF PLASMA PROTEINS MEASURED BY 2DGE^a

        All          CL           CU           K            KD           M            MD
Spot    Mean  SEM    Mean  SEM    Mean  SEM    Mean  SEM    Mean  SEM    Mean  SEM    Mean  SEM
0       2820  113    3792  135    3884  385    1970  258    1523  162    2666  113    2360  250
1       293   14     390   25     304   51     168   9      201   24     294   20     270   37
2       195   11     174   22     146   64     200   22     169   26     205   19     207   46
3       195   18     375   23     346   47     76    21     25    11     150   20     104   23
5       52    3      71    5      78    14     30    7      14    5      54    5      33    11
7       162   9      206   13     177   23     182   24     73    16     147   11     150   58
10      26    4      13    2      8     2      33    7      37    4      28    6      47    32
11      49    2      49    3      41    11     64    9      43    6      46    3      50    12
14      122   6      164   12     165   29     103   11     83    15     111   9      82    14
16      13    2      18    4      18    6      16    4      8     4      10    2      9     4
18      11    1      19    2      13    4      4     1      2     1      10    2      7     3
19      28    2      35    3      26    7      23    3      22    4      27    3      28    7
23      10    1      13    1      15    2      10    2      3     1      9     1      7     2
24      8     1      14    2      11    3      8     2      1     1      7     1      5     2
28      45    4      34    6      36    13     33    4      32    10     58    7      55    12
29      19    1      22    2      19    3      23    4      9     3      19    2      19    5
30      19    2      25    4      16    6      14    4      12    6      20    3      9     4
33      15    1      19    2      15    2      13    2      8     1      14    1      20    6
38      7     1      10    1      13    3      4     1      3     2      6     1      5     1
3A      14    2      19    6      7     5      15    7      18    2      12    3      6     3
41      6     0.4    5     1      4     1      7     1      7     2      7     1      6     1
53      402   23     476   28     547   140    292   50     232   38     392   35     444   129
90      27    2      31    6      31    6      32    9      22    7      22    3      31    9
99      1338  59     1326  129    761   165    1446  103    1453  214    1445  97     1047  112

^a Mean concentration of proteins (mg/dl) in 24 different index spots for plasma samples in groups listed in Table 1 (see Methods).

Table 2 lists the average amounts of each of the 24 index spots, normalized to units of mg/dl. Averages are given for all 74 samples, and for the samples in each group listed in Table 1. There do not appear to be significant differences in the averages of the control groups from Liberia and the United States, except that spot 99 (immunoglobulin G; IgG) may be higher in the former. Many of the proteins in the kwashiorkor (K) group are lower than in the Liberian control (CL) group, with possible exceptions being spots 2, 7, 16, 28, 29, 90, and 99. Average values of spot 10 may be increased in all of the malnourished groups; the identity of this spot is not known. The average values in the marasmus (M) group may not differ as much from the control groups as do those in the K group. Spot 28 (α-1-antichymotrypsin, an acute phase reactant) may be somewhat elevated in the M group. Members of the kwashiorkor-died (KD) group tend to show further depression of the spots which are depressed in the K group, while spots 2 (α-1-antitrypsin) and 28, both acute phase reactants, remain “normal,” on the average. The marasmus-died (MD) group is similar to the M group, with possibly lower levels of IgG. A most significant feature of these data appears to be the depression of levels of spot 3 (transferrin), and to a lesser extent spot 5 (hemopexin), in the malnourished groups, particularly in the groups that died. This was also evident upon visual inspection of the gels.

Figure 2 shows quantitative data in mg/dl for the sum of spots 3 and 5 in each sample plotted versus the concentration of fibronectin, which has already been found to have significant predictive value for survival (8). The fibronectin values were determined immunochemically, because fibronectin was not well resolved by electrophoresis. The data points seem to describe a regression. A dashed line (A) was drawn by visual inspection approximately perpendicular to the line of regression in order to separate the controls and the malnourished children. If a vertical line had been drawn (depending only on concentration of transferrin and hemopexin), the separation of the groups would not have been as effective. With the diagonal line, only one control falls into the malnourished group, and only two of the malnourished fall into the control group. It is evident from Fig. 2 that other lines drawn parallel to line A would not be very helpful in separating K from M, or KD from MD. However, line B had some predictive value for survival among the combined malnourished patients. Of the malnourished samples (K, M, KD, and MD) which fell to the left of B, 67% (8/12) died, while 90% (35/39) of those to the right survived.

FIG. 2. Scatter plot of the concentrations of fibronectin and the proteins of spots 3 plus 5 (mg/dl) in 70 individuals divided into five groups: Liberian and U.S. controls combined, kwashiorkor, kwashiorkor-died, marasmus, and marasmus-died. Line A was drawn to separate controls from malnourished samples, line B to separate those who died from those who survived.

We then used computer programs to examine the data. First we used factor analysis to reveal internal structure within the quantitative 2DGE data for the 24 protein spots expressed as fractions of the sum of the spot “volumes” for each sample (Methods). It should be emphasized that the data for 74 samples were analyzed before separation into groups, and without addition of data from chemical analyses. Seven factors of decreasing importance were defined; examination of the relative contributions by principal components showed that spots 3, 24, and 99 were most important for factor 1 (F1), spots 19, 0, and 29 were most important for factor 2 (F2), and spots 7, 53, and 30 contributed most to factor 3 (F3). Factor scores were then computed for each sample (the sum of the products of the value for each spot multiplied by the factor coefficient) and the mean location of each sample group was calculated, together with its standard deviation. Figure 3 shows the center of each group versus F1 and F3; the standard deviations are indicated by the diamonds. These results, based only upon the relative amounts of the 24 proteins, indicate that there are consistent differences between the clinical groups. The separation of K and KD from each other and from the controls is greater than the separation of M and MD from each other and from the controls. Thus the kwashiorkor samples are more distinctly abnormal. Encouraged by this, we next subjected the data to discriminant analysis.

FIG. 3. Factor analysis of 24 2DGE spots from 74 individuals. The amount of each spot was expressed as the fraction of the total amount of stain in 24 index spots. The data were analyzed by the FACTOR program, and the locations of the mean scores for six groups were plotted for factor 1 and factor 3. The widths of the diamonds represent the standard deviations for each group. CU, control-USA. CL, control-Liberia. K, kwashiorkor. KD, kwashiorkor-died. M, marasmus. MD, marasmus-died.
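The computation of sample scores and group means described above can be illustrated with a small Python sketch. It uses principal components of the correlation matrix as a stand-in for the SAS FACTOR program, whose exact options are not given in the text, so it should not be read as a reproduction of the published analysis. The function names and any data passed to them are hypothetical.

```python
import numpy as np


def component_scores(data, n_components=3):
    """Score each sample along the leading principal components.

    data: array of shape (samples, variables), e.g. spot fractions.
    Each score is the sum over variables of (standardized value x
    component coefficient), analogous to the factor scores in the text.
    """
    x = np.asarray(data, dtype=float)
    z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)
    corr = np.corrcoef(z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    # eigh returns ascending eigenvalues; keep the largest ones first.
    order = np.argsort(eigvals)[::-1][:n_components]
    return z @ eigvecs[:, order]


def group_summary(scores, labels):
    """Mean and standard deviation of the scores within each clinical group."""
    labels = np.asarray(labels)
    summary = {}
    for g in np.unique(labels):
        member_scores = scores[labels == g]
        summary[str(g)] = (member_scores.mean(axis=0),
                           member_scores.std(axis=0, ddof=1))
    return summary
```

As in the text, all samples are scored together before they are split into groups; the group labels enter only when the means and standard deviations are summarized.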

TABLE 3

DATA USED FOR DISCRIMINANT ANALYSIS OF PLASMA PROTEINS IN MALNOURISHED AND CONTROL CHILDREN^a

DISCRIM  No.     Groups used            No.          Normalization             No.        Variables used
name     groups                         individuals                            variables
R24A     6       CL, CU, K, KD, M, MD   74           Fraction of total volume  24         Spots 0, 1, 2, 3, 5, 7, 10, 11, 14, 16, 18, 19, 23, 24, 28, 29, 30, 33, 38, 3A, 41, 53, 90, and 99
R24S     3       C, S, D                74           Fraction of total volume  24         (Above)
R24T     3       C, M, K                74           Fraction of total volume  24         (Above)
P24A     6       CL, CU, K, KD, M, MD   74           Total protein             24         (Above)
P24S     3       C, S, D                74           Total protein             24         (Above)
P24T     3       C, K, M                74           Total protein             24         (Above)
P18A     6       CL, CU, K, KD, M, MD   74           Total protein             18         Spots 0, 1, 3, 5, 7, 10, 11, 14, 18, 19, 23, 28, 29, 33, 41, 53, 90, and 99
P7A      6       CL, CU, K, KD, M, MD   74           Total protein             7          Spots 0, 1, 3, 5, 7, 11, and 33
P7S      3       C, S, D                74           Total protein             7          Spots 0, 1, 3, 5, 7, 11, and 33
P7T      3       C, K, M                74           Total protein             7          Spots 0, 1, 3, 5, 7, 11, and 33
P7FA     6       CL, CU, K, KD, M, MD   70           Total protein             8          Spots 0, 1, 3, 5, 7, 11, 33, and fibronectin
P8A      6       CL, CU, K, KD, M, MD   74           Total protein             8          Spots 0, 1, 3, 5, 7, 11, 33, and 38
P8S      3       C, S, D                74           Total protein             8          Spots 0, 1, 3, 5, 7, 11, 33, and 38
P3A      6       CL, CU, K, KD, M, MD   74           Total protein             3          Spots 1, 3, and 5
P3FA     6       CL, CU, K, KD, M, MD   70           Total protein             4          Spots 1, 3, 5, and fibronectin
P1FS     3       C, S, D                70           Total protein             2          Spots 3 + 3A + 5, and fibronectin

^a See Table 1 and text.

The DISCRIM program computed the center of mass of each group of samples in hyperspace, and estimated the probability that each individual sample belonged to any one group (Methods). Table 3 lists data sets analyzed by the DISCRIM program. In those cases where fibronectin values were included, there were complete data for only 70 samples; all other sets contain 74 individuals. These sets used from 24 to 2 variables, and have either six or three groups. In some cases when three groups were used, samples were pooled into control (C; control Liberian (CL) plus control United States (CU)), survived (S; K plus M), and died (D; KD plus MD) groups. In other cases they were pooled into C (CL plus CU), K (K plus KD), and M (M plus MD) groups. The DISCRIM program multiplies the values of the variables of each individual by a coefficient which it has computed for a group; these products are summed together with a group constant which it has also computed. This is repeated with coefficients and constants computed for each group. The constants and coefficients provide optimal discrimination between groups, and the magnitude of each summation determines the posterior probability that the individual belongs to a group. (When the probability for membership in any group was computed to be less than 50%, the individual was placed in an additional group called “other.”)

The relative contribution that the coefficients of each group make to sample placement may be represented graphically as the product of the mean value for each variable within a group by the corresponding group coefficient. Figure 4 shows these products, and also the group constants, for analysis of 74 individuals divided into six groups and using 7 variables normalized with total protein (P7A in Table 3). Generally, the group constants tend to balance the heavy contributions by albumin (spot 0). Thus relatively high albumin coefficients exaggerate differences between group albumin concentrations. In the case of spot 3 (transferrin), the products are negative for the malnourished groups and positive for the controls; a high spot 3 value will tend to place the sample in a control group. Some of the other relationships are more complex and not so easily explained.

FIG. 4. Relative contributions of 7 spots to discriminant analysis. The group constants and the products of group means times spot coefficients are shown for the DISCRIM analysis of P7A (see Tables 3 and 4). Identity of groups shown in lower left.

Table 4 gives the posterior placement of samples into six groups based upon use of different numbers of variables. Although the performance appears to be quite good with 24 variables, it is very likely that part of this is an artifactual “pattern recognition” due to use of too many variables (see Discussion). In attempts to reduce this, we first eliminated 6 spots which were measured with the lowest precision, based upon measurements on dilutions of plasma, and regression analysis of spot “volumes” (Methods). After running the DISCRIM program with both 24 and 18 spots (P24A and P18A, Table 3), the STEPDISC program was used to find which variables contributed most to discrimination between the six groups. In both cases spots 3, 5, and 1 were the most important, in decreasing order. Also, spots 0, 7, and 11 were among the first seven, and spot 38 (which was removed because of low precision) was replaced by spot 33 in the analysis with 18 variables. We then analyzed sets containing 7 variables (P7A; spots 0, 1, 3, 5, 7, 11, and 33), and three variables (P3A; spots 1, 3, and 5). In addition, since fibronectin was already known to be important for discrimination (see Fig. 2), it was added to the sets of 7 and 3 to give sets P7FA and P3FA. Spot 38 was also added back to the list of 7 variables (P8A) to test if its exclusion weakened the analysis.
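The scoring rule described above for DISCRIM (a coefficient per variable and a constant per group, a posterior probability from each summation, and an “other” category when no posterior reaches 50%) can be sketched as follows. This is a textbook linear discriminant with a pooled covariance matrix and equal prior probabilities; the equal priors and all function names are assumptions made for illustration and are not taken from the SAS documentation or from the study.

```python
import numpy as np


def fit_linear_discriminant(x, labels):
    """Return group names, coefficient vectors, and constants for linear DA."""
    x = np.asarray(x, dtype=float)
    labels = np.asarray(labels)
    groups = sorted(set(labels.tolist()))
    n_samples, n_vars = x.shape
    means = {}
    pooled = np.zeros((n_vars, n_vars))
    for g in groups:
        xg = x[labels == g]
        means[g] = xg.mean(axis=0)
        centered = xg - means[g]
        pooled += centered.T @ centered
    pooled /= n_samples - len(groups)  # pooled within-group covariance
    inv = np.linalg.pinv(pooled)
    coefs = {g: inv @ means[g] for g in groups}
    consts = {g: -0.5 * float(means[g] @ inv @ means[g]) for g in groups}
    return groups, coefs, consts


def assign(sample, groups, coefs, consts, threshold=0.5):
    """Assign one sample to a group, or to 'other' if no posterior reaches threshold."""
    sample = np.asarray(sample, dtype=float)
    # Group constant plus the sum of (variable value x group coefficient).
    scores = np.array([consts[g] + float(sample @ coefs[g]) for g in groups])
    post = np.exp(scores - scores.max())  # assumes equal prior probabilities
    post /= post.sum()
    best = int(np.argmax(post))
    label = groups[best] if post[best] >= threshold else "other"
    return label, dict(zip(groups, post.tolist()))
```

With only two groups the largest posterior can never fall below 50%, so the “other” category only comes into play when three or more groups are used, as in the six-group analyses.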
Analysis of gel data using six clinical groups and with 24 variables expressed as fraction of total “volume” (R24A) was as effective as analysis of data with 24 variables expressed as mg/dl protein (Table 4). Performance was surprisingly good with 24 or 18 variables, but was somewhat eroded as the number of variables decreased to 7 and then 3, especially in the M and MD groups.
TABLE 4

PERFORMANCE MATRIX FOR CLASSIFICATION OF INDIVIDUALS INTO SIX CLINICAL GROUPS BY DISCRIMINANT ANALYSIS^a

Diagnosis  DISCRIM  % CL     % CU     % K      % KD     % M      % MD      % Other   No. individuals
CL         R24A     83       11       0        0        6        0         0         18
           P24A     89       11       0        0        0        0         0         18
           P18A     83       17       0        0        0        0         0         18
           P7A      67       22       0        0        0        0         11        18
           P7FA     82       6        0        0        6        0         6         17
           P8A      83       11       0        0        6        0         0         18
           P3A      56(61)   33       0        0        6        0         6(0)b     18
           P3FA     94       6        0        0        0        0         0         17
CU         R24A     0        100      0        0        0        0         0         5
           P24A     0        100      0        0        0        0         0         5
           P18A     0        100      0        0        0        0         0         5
           P7A      0        80       0        0        0        0         20        5
           P7FA     0        100      0        0        0        0         0         5
           P8A      0        100      0        0        0        0         0         5
           P3A      0        80(100)  0        0        0        0         20(0)b    2
           P3FA     0        100      0        0        0        0         0         2
K          R24A     0        0        89       0        0        0         11        9
           P24A     0        0        89       0        0        0         11        9
           P18A     0        0        78       0        11       0         11        9
           P7A      0        0        67       22       0        0         11        9
           P7FA     0        0        67       22       11       0         0         9
           P8A      0        0        67       22       11       0         0         9
           P3A      0        0        11(56)   22(33)   0(11)    0         67(0)b    9
           P3FA     0        11       22(67)   11(22)   0        0         55(0)b    9
KD         R24A     0        0        0        100      0        0         0         6
           P24A     0        0        0        100      0        0         0         6
           P18A     0        0        0        100      0        0         0         6
           P7A      0        0        0        100      0        0         0         6
           P7FA     0        0        0        100      0        0         0         6
           P8A      0        0        0        83       0        0         17        6
           P3A      0        0        0(17)    50(67)   0        0(17)     50(0)b    6
           P3FA     0        0        0        67(83)   0        0(17)     33(0)b    6
M          R24A     3        0        0        0        93       [illeg.]  0         30
           P24A     3        0        0        0        83       [illeg.]  10        30
           P18A     10       0        7        0        67       [illeg.]  10        30
           P7A      7        3        3        3        47       20        17        30
           P7FA     10       0        3        3        53       17        13        30
           P8A      13       0        3        0        53       17        13        30
           P3A      7        7        3(23)    0(3)     20(40)   [illeg.]  63(0)b    30
           P3FA     7        7        3(27)    0(10)    23(33)   3(17)     57(0)b    30
MD         R24A     0        0        0        0        17       83        0         [illeg.]
           P24A     0        0        0        0        17       83        0         [illeg.]
           P18A     0        0        0        0        17       67        17        [illeg.]
           P7A      0        0        17       33       0        50        0         [illeg.]
           P7FA     0        0        17       33       0        50        0         [illeg.]
           P8A      0        0        17       33       0        50        0         [illeg.]
           P3A      0        0        0        0(17)    17       0(67)     83(0)b    [illeg.]
           P3FA     0        0        0(17)    17       17       50        17(0)b    [illeg.]

a Samples were analyzed by quantitative 2D gel electrophoresis, and sets of the resulting data were assembled as described in Table 3. Grouping by clinical diagnosis (Table 1) is listed in the left column. Samples were placed in the “other” group when there was less than 50% probability for placement in any of the clinical groups. With sets of data containing only 3 or 4 variables, the “other” group was also omitted, and assignment was made to the group with highest probability (results in parentheses).
b Values in parentheses omit “other” category.
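The assignment rule described in footnote a can be sketched in a few lines of Python. This is only an illustration of the rule as stated (less than 50% probability for every clinical group gives “other”, unless the “other” category is omitted); the function name and the example probabilities are hypothetical and are not taken from the DISCRIM output.

```python
GROUPS = ["CL", "CU", "K", "KD", "M", "MD"]


def assign_group(probabilities, use_other=True, threshold=0.5):
    """Assign a sample from its posterior probabilities, one per group.

    With use_other=True, a sample whose highest probability is below the
    threshold is placed in "other"; with use_other=False it always goes to
    the group with the highest probability (the values in parentheses in
    Table 4 correspond to this second mode).
    """
    best = max(range(len(GROUPS)), key=lambda i: probabilities[i])
    if use_other and probabilities[best] < threshold:
        return "other"
    return GROUPS[best]


# Hypothetical posterior probabilities for one borderline sample.
borderline = [0.05, 0.05, 0.40, 0.30, 0.15, 0.05]

print(assign_group(borderline))                   # other
print(assign_group(borderline, use_other=False))  # K
```

A sample that would be counted under “% Other” in the first mode is therefore moved into one of the six clinical columns in the second mode, which is why the parenthesized percentages are never lower than the unparenthesized ones on the diagonal.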

Performance was improved in the smaller data sets if the “other” category was eliminated and assignments were made to the group with highest probability, even if less than 50% (as shown by the values in parentheses). Comparison of P18A and P24A, and of P7A, P7FA, and P8A, shows that removal of data for spot 38 (which has low precision), or introduction of data on fibronectin, did not greatly influence the performance.

A few individuals were consistently misclassified using 24, 18, or 7 variables, but because of the limitations in patient documentation, this is not necessarily the fault of the method. One of the nine kwashiorkor patients was consistently classified as kwashiorkor-died, or as “other,” but this patient had no followup and may have actually belonged to the kwashiorkor-died group. In another case, DISCRIM consistently misclassified a marasmus-died patient. In this case the medical record shows the patient had cerebral malaria as a complication, and it is possible that the malaria was the primary cause of death. This information strengthens our impression that discriminant analysis may have some validity for these samples.

We attempted to increase the number of individuals in each group by combining groups. Tables 5 and 6 give performance matrices for discriminant analyses with individuals divided into three groups. The variables used for these were selected on the basis of results of STEPDISC analysis of sets with six groups (this selection may not have been optimal). Although use of larger groups reduced the likelihood of artifactual “fitting,” pooling dissimilar groups of samples, for example K and M, or K and KD, may have led to other problems if these samples did not belong together in a single group (as indicated by factor analysis in Fig. 3). With this reservation, the analyses performed well, except in group S (surviving patients with kwashiorkor or marasmus). Use of 24 variables did not offer any great advantage over use of 7 (Table 5). Also, the data set shown in the scatter plot of Fig. 2 was analyzed by DISCRIM. The precision with 2 variables only (the sum of spots 3 plus 5, and fibronectin) was fairly good, but S was not resolved from D in Table 5 as successfully as in the scatter plot. The reason for this may be that the DISCRIM boundaries between groups were selected using the standard deviation of the groups rather than by drawing an arbitrary line which gave a fortuitously good fit.

TABLE 5

DISCRIMINATION OF MORTALITY^a

Diagnosis  DISCRIM  % C   % S   % D   % Other  No. individuals
C          R24S     96    4     0     0        23
           P24S     100   0     0     0        23
           P7S      96    4     0     0        23
           P8S      100   0     0     0        23
           P7FS     89    5     0     5        19
S          R24S     5     77    13    5        39
           P24S     3     74    23    0        39
           P7S      10    64    26    0        39
           P8S      8     67    23    3        39
           P7FS     8     56    36    0        39
D          R24S     0     8     92    0        12
           P24S     0     8     92    0        12
           P7S      0     8     92    0        12
           P8S      0     8     92    0        12
           P7FS     0     25    75    0        12

a Performance matrix for classification of individuals into three clinical groups: C, control; S, malnourished; D, malnourished who died. See Tables 3 and 4.

TABLE 6

DISCRIMINATION OF KWASHIORKOR AND MARASMUS^a

Diagnosis  DISCRIM  % C   % K   % M   % Other  No. individuals
C          R24T     96    0     4     0        23
           P24T     100   0     0     0        23
           P7T      96    0     4     0        23
K          R24T     0     100   0     0        15
           P24T     0     100   0     0        15
           P7T      0     93    7     0        15
M          R24T     3     6     89    3        36
           P24T     3     3     94    0        36
           P7T      14    11    75    0        36

a Performance matrix for classification of individuals into three clinical groups: C, control; K, kwashiorkor (surviving and died); M, marasmus (surviving and died). See Tables 3 and 4.

Table 7 gives the classification of serial samples from four malnourished patients after initiation of therapy. This analysis used only the relative volume of the protein spots, and the coefficients and constants obtained from discriminant analysis of 74 individuals (R24A). Only the first sample in each series was part of the “training” set. Two of the patients belong in the K group, and two in the M group. The first sample of each series was correctly assigned by the program. The following three samples were either correctly assigned or were assigned to a related group (M rather than K, MD rather than M), except in the case of the last sample of one of the kwashiorkor patients, which was placed in the CU group. The correct assignment of most of these samples based upon analyses of 74 other samples suggests that our results with 24 variables are not due primarily to pattern recognition of random features in the data, although recognition of the “individuality” of the patient’s gel patterns cannot be entirely ruled out. Furthermore, the recovery of the patients by criteria of weight gain and increased fibronectin levels was more rapid than recovery based upon the overall pattern in the relative concentration of proteins (see footnotes to Table 7).

DISCUSSION

Two-dimensional gel electrophoresis provides a systematic method to search for plasma protein markers in disease, and analysis of the samples from malnourished children showed “at a glance” that there was a correlation between nutritional status and levels of transferrin and hemopexin. The concentrations of other major proteins appeared to be altered, but a consistent pattern was not deduced by visual inspection of the gels. Since detection was limited to proteins with concentrations greater than about 1 mg/dl, screening minor plasma components was not possible. However, with the computerized quantitative gel analyzer, it was possible to measure concentrations of 24 major plasma proteins as spots in each gel (Fig. 1).

One of the goals of this study was to find a marker to predict which patients were in greatest jeopardy of dying, so that they might be given extra medical attention. In an earlier investigation we found that decreased fibronectin concentration had predictive value for survival of these patients (8). Use of two variables, fibronectin concentration and the concentration of transferrin plus hemopexin, gave better prognostic performance than use of either variable alone (Fig. 2). We reasoned that use of a larger number of variables might further increase the predictive performance.

We inspected the data on all 24 protein spots in a total of 74 samples (51 from seriously malnourished children and 23 from controls of the same age range; Table 1). Of some interest were spots that appeared to go down (such as transferrin), or up (such as spot 10) on the average in the malnourished groups. We confirmed decreases in average levels of transferrin (9, 10) and α-2-macroglobulin (18) observed by others in marasmus and kwashiorkor. Differences in the concentration of α-2-macroglobulin (spot 1) must be used with some caution because its concentration is known to decrease after about 3 years of age (19). However, the average ages of children in our groups are all within the range 16-25 months, with the M and KD groups at the two extremes. Figure 4 indicates that spot 1 is not a variable which contributes to discrimination of these two groups.

Inspection of the quantitative data, sample by sample, did not seem to offer a means for discriminating clinical status. A computer program (FACTOR) was used to demonstrate internal structure in the data, and a result of this is shown in Fig. 3. The computation of the factor scores was made without reference to the clinical grouping of the individuals.

TABLE 7

DISCRIMINANT ANALYSIS OF SAMPLES FROM FOUR MALNOURISHED CHILDREN DURING TREATMENT^a

Patient  Age (months)  Sex       Clinical classification  Days treated  Predicted classification
A.S.     28            M         K                        0             K (1.00)
                                                          4             K (0.96)
                                                          8             K (1.00)
                                                          14            M (0.85)
C.R.     18            [illeg.]  K                        0             K (0.91)
                                                          2             M (0.83)
                                                          5             M (0.74)
                                                          8             CU (0.52)
N.J.     24            [illeg.]  M                        0             M (1.00)
                                                          7             M (0.97)
                                                          14            M (0.90)
                                                          21            M (1.00)
S.M.     23            [illeg.]  M                        0             M (0.79)
                                                          7             MD (0.79)
                                                          14            M (0.86)
                                                          21            M (0.71)

a Plasma samples were obtained upon first visit or admission to a clinic and at intervals thereafter. Coefficients and constants used for this analysis were obtained from DISCRIM analysis R24A (Tables 3 and 4). The predicted clinical classifications and probabilities (in parentheses) are given in the right column. All patients survived. Patients A.S., N.J., and S.M. showed weight gain and rapid rise in fibronectin levels within 3 weeks (data not shown). Patient C.R. was reported to have made good progress toward recovery in the third and fourth weeks after treatment began (plasma samples were not available).

The DISCRIM discriminant analysis program was then used to optimally separate the sample groups. Posterior placement of samples into groups was achieved by computing a score for each individual. Whenever the probability for placement in one of the clinical groups was less than 50%, i.e., when the individual was placed near a border between three or more groups, it was placed in a group called “other.” It is commonly recommended that the number of variables used in discriminant analysis be kept low relative to the number of individuals per group to avoid artifactual “fitting” (16). This is problematic in our analyses because the smallest groups contain only five or six samples. Thus, it is possible that we are “fitting” the results with random or irrelevant features in the data when using more than 2 or 3 variables. However, this may not be a serious problem if the groups are well separated in hyperspace. The data were first analyzed by the DISCRIM program using 24 variables. From the results (Table 4) it can be deduced that 65 out of 74 individuals were correctly placed in the six clinical groups. This was repeated using 18 variables so that the 6 which were measured with the poorest precision could be eliminated thereby decreasing the likelihood of artifactual pattern recognition in the data. This weakened discrimination somewhat in the M and MD groups, but overall, 57 out of 74 individuals were correctly placed. With the help of a separate routine (STEPDISC), we selected the 7 or 3 variables which contributed most to discrimination in the analyses made with 24 or 18 variables. The DISCRIM program was then run on the 74 individuals using only these 7 or 3 variables. Performance declined with variable elimination, but remained relatively good with 7 variables (at least 45 out of 74 samples were correctly placed in one of six clinical groups). 
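The counts quoted above (65, 57, and 45 of 74) can be recovered from the diagonal percentages of Table 4 and the group sizes. The short Python sketch below shows the arithmetic. The group size of 6 for MD is an inference (74 individuals minus the other five groups, or 36 marasmus patients minus the 30 in group M), and rounding each product to the nearest whole individual is an assumption about how the published percentages were derived.

```python
# Group sizes; the MD value is inferred as 74 - (18 + 5 + 9 + 6 + 30).
group_sizes = {"CL": 18, "CU": 5, "K": 9, "KD": 6, "M": 30, "MD": 6}

# Percent of each group assigned to its own clinical group (Table 4 diagonal).
percent_correct = {
    "P24A": {"CL": 89, "CU": 100, "K": 89, "KD": 100, "M": 83, "MD": 83},
    "P18A": {"CL": 83, "CU": 100, "K": 78, "KD": 100, "M": 67, "MD": 67},
    "P7A": {"CL": 67, "CU": 80, "K": 67, "KD": 100, "M": 47, "MD": 50},
}


def number_correct(percentages):
    """Convert per-group percentages back to whole individuals and sum them."""
    return sum(
        round(percentages[group] * size / 100)
        for group, size in group_sizes.items()
    )


for name, percentages in percent_correct.items():
    print(name, number_correct(percentages), "of", sum(group_sizes.values()))
# P24A 65 of 74
# P18A 57 of 74
# P7A 45 of 74
```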
Performance was poor with only three variables (26 out of 74), but could be improved if the “other” group was eliminated so that the samples were assigned to the group which gave the highest score, even though this score might be less than 50%. Runs were also made with 7 or 3 variables plus fibronectin, or plus one of the relatively imprecise variables eliminated with the first 6, but this did not give remarkably improved performance. Table 4 summarizes the results of all of the discriminant analyses made with six clinical groups.

We tentatively conclude that use of 7 or more variables for discrimination separates most of the groups well. Use of 7 variables is justified on the basis of group size in the case of the Liberian control group and the marasmus group. However, the performance with the marasmus group is poor with only seven variables. This may be due to inherent inhomogeneity in this clinical grouping; the marasmus group may contain patients who have different underlying conditions or complications. In contrast, the kwashiorkor patients may be more homogeneous; this is also suggested by factor analysis (Fig. 3).

Performance of discriminant analysis depends on the statistical distance between the groups of individuals. Values of D² are provided for each pair of groups by DISCRIM, and from these may be estimated the probability that a member of one group is incorrectly assigned to another (16). This is an estimate of the overlap of normally distributed groups in hyperspace. A D² value of about 10 indicates that only about 5% of the samples will be incorrectly placed between two groups. In Fig. 5 we have plotted D² for several representative pairs of groups as a function of the number of variables used by DISCRIM, as recorded in Table 4. These are from analyses of variables obtained only from quantitative 2DGE and normalized to milligrams per deciliter. The values are often about 10 in the range of 7-18 variables, although they are significantly lower for the M/MD pair, and higher for the K/CL pair. The values tend to be low with only three variables, and to increase between 18 and 24 variables. The increase above 18 variables may be largely due to spurious fitting from the use of too many variables. Tentatively we conclude that use of 7 to 18 variables is valid for discrimination among the samples in our groups, particularly CL, K, and KD.

We also combined groups of individuals. Groups K plus M, and KD plus MD are analyzed in Table 5, and K plus KD, and M plus MD are in Table 6. The control groups were pooled in both cases. This provided larger-sized groups but may also have pooled individuals that do not form a single group by their nature. For example, factor analysis shows that K is well separated from M or KD (Fig. 3). Despite this reservation, performance was adequate for some of the pooled groups using between 7 and 24 variables; 92% of the patients who died were assigned correctly, and 93-100% of the patients with kwashiorkor (K and KD) were assigned correctly.

The discriminant analysis method may be tested by applying the coefficients and constants obtained from a “training” session to new samples. We analyzed samples that were not used in the original set of 74; these were plasma samples taken at 2- to 7-day intervals from four patients following the beginning of therapy. Two of the patients had kwashiorkor, and two had marasmus. The results of this (Table 7) show remarkable accuracy.

FIG. 5. Statistical distance, D², between pairs of some groups analyzed by DISCRIM as a function of the number of variables. Values for the analyses P3A, P7A, P8A, P24A of Tables 3 and 4.
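The rule of thumb that a D² of about 10 corresponds to roughly 5% misassignment can be checked with a short Python sketch. It assumes the textbook two-group case (multivariate normal groups with a common covariance matrix and equal prior probabilities), in which the probability of assigning a member of one group to the other is Φ(−D/2), where Φ is the standard normal cumulative distribution. Whether reference (16) uses exactly this estimate is not stated here, so the code is offered only as a plausible reading.

```python
import math


def misassignment_probability(d_squared):
    """P(error) = Phi(-D/2) for two equal-covariance normal groups
    with equal priors, where D is the square root of D-squared."""
    d = math.sqrt(d_squared)
    # Phi(-x) = 0.5 * erfc(x / sqrt(2)), evaluated at x = D / 2.
    return 0.5 * math.erfc(d / (2.0 * math.sqrt(2.0)))


for d2 in (2, 4, 10, 20):
    print(f"D^2 = {d2:2d}: {100 * misassignment_probability(d2):.1f}% misassigned")
```

Under these assumptions a D² of 10 gives a value between 5% and 6%, in line with the figure quoted in the Discussion, and the error rate rises steeply as D² falls toward the lower values reported for the M/MD pair.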
It should be noted that these samples were analyzed using the coefficients and constants from analysis of 24 variables normalized as fractions of their total “volume”, a procedure most likely to be sensitive to idiosyncratic “fitting” based upon spurious variation in the data of the training set. This approach may be improved when more precise values can be obtained by 2DGE, and when larger training sets are available. The use of 7 or more variables for prognosis of malnutrition will require further validation. Clearly more work along these lines is justified. It is not likely that quantitative 2DGE will soon become a practical tool in the clinic. However, it is possible that through research using 2DGE, selected groups of variables may be identified which will be useful in management of malnutrition and other disease states.

ACKNOWLEDGMENTS

We are grateful for the expert assistance of D. Damato-McCabe in running and analyzing 2D gels and of T. Finney in writing and maintaining computer software for the PDP 11/74 and for data management. We also thank C. R. Crain and Dr. R. D. Holsten for valuable suggestions concerning the manuscript, and M. M. Iacono for assistance in typing. Samples of control plasma from five healthy North American children were a gift from the Primary Children’s Medical Center, Salt Lake City, Utah. Special thanks and appreciation are extended to Dr. David Van Reken, Acting Head of Pediatrics of the John F. Kennedy Medical Center of Monrovia, Liberia, and Dr. Aloisius Hanson, Director of the Liberian Institute for Biomedical Research of Robertsfield, Liberia, whose assistance was essential for this project. We are also grateful to the physicians and nurses of ELWA Hospital, John F. Kennedy Medical Center, Phebe Hospital, and Saint Joseph’s Hospital, all of Liberia, for use of their facilities and for assistance in collection of plasma samples and anthropometric measurements.

REFERENCES

1. ANDERSON, N. L., HOFMANN, J.-P., GEMMELL, A., AND TAYLOR, J. Global approaches to quantitative analysis of gene-expression patterns observed by use of two-dimensional gel electrophoresis. Clin. Chem. 30, 2031 (1984).
2. LONBERG-HOLM, K., BAGLEY, E. A., NUSBACHER, J., AND HEAL, J. M. Two-dimensional gel electrophoresis method for investigation of human plasma proteins: Detection of subtle changes during filtration leukapheresis. Clin. Chem. 28, 962 (1982).
3. JANSSON, P. A., GRIM, L. B., ELIAS, J. G., BAGLEY, E. A., AND LONBERG-HOLM, K. Implementation and application of a method to quantitate 2-D gel electrophoresis patterns. Electrophoresis 4, 82 (1983).
4. BAGLEY, E. A., LONBERG-HOLM, K., PANDYA, B. V., AND BUDDZYNSKI, A. Z. Two-dimensional gel electrophoretic analysis of plasma proteins from a patient bitten by a rattlesnake. Electrophoresis 4, 238 (1983).
5. ANDERSON, N. L., TAYLOR, J., SCANDORA, A. E., COULTER, B. P., AND ANDERSON, N. G. The TYCHO system for computer analysis of two-dimensional gel electrophoresis patterns. Clin. Chem. 27, 1807 (1981).
6. LUTIN, W. A., KYLE, C. F., AND FREEMAN, J. A. Quantitation of brain proteins by computer-analyzed two dimensional electrophoresis. In “Electrophoresis ‘78” (N. Catsimpoolas, Ed.), pp. 93-106. Elsevier/North-Holland, New York, 1978.
7. JANSSON, P. A. A laboratory son et lumiere; Part II. Anal. Chem. 56, 303A (1984).
8. SANDBERG, L., VANREKEN, D., WAIWAIKU, K., MARTIN-YEBOAH, P., WEISS, C., UPDEGRAFF, V., HANSON, A., SCHLEMAN, M., AND LODHIA, B. Plasma fibronectin levels in acute and recovering malnourished children. Clin. Physiol. Biochem. 3, 257 (1985).

9. SCHELP, F. P., PONGPAEW, P., SUTJAHJO, S. R., SUPAWAN, V., SAOVAKONTHA, S., AND MIGASENA, P. Proteinase inhibitors and other biochemical criteria in infants and primary school children from urban and rural environments. Brit. J. Nutr. 45, 451 (1981).
10. SCHELP, F. P., MIGASENA, P., PONGPAEW, P., AND SCHREUERS, H. P. Are proteinase inhibitors a factor for the derangement of homoeostasis in protein-energy malnutrition? Amer. J. Clin. Nutr. 31, 451 (1978).
11. BAKER, J. P., DETSKY, A. S., WESSON, D. E., WOLMAN, S. L., STEWART, S., WHITEWELL, J., LANGER, B., AND JEEJEEBHOY, K. N. Nutritional assessment: A comparison of clinical judgement and objective measurements. N. Engl. J. Med. 306, 969 (1982).
12. BUZBY, G. W., MULLEN, J. L., MATHEWS, D. C., HOBBS, C. L., AND ROSATO, E. F. Prognostic nutritional index in gastrointestinal surgery. Amer. J. Surg. 139, 160 (1980).
13. KAMINSKI, M. V., JR., FITZGERALD, M. J., MURPHEY, R. J., PAGAST, P., HOPPE, M. C., WINBORN, A. L., AND PLUTA, J. Correlation of mortality with serum transferrin and energy. J. Parenter. Enteral Nutr. 1, 27 (abstr) (1977).
14. SAS Institute Inc. “SAS User’s Guide: Statistics.” Cary, N.C., 1982.
15. MULAIK, S. A. “The Foundations of Factor Analysis.” McGraw-Hill, New York, 1972.
16. SOLBERG, H. E. Discriminant analysis. Clin. Lab. Sci. 9, 209 (1978).
17. KLECKA, W. R. “Discriminant Analysis.” Sage University Paper Series on Quantitative Applications in the Social Sciences, Series 07-019. Sage, Beverly Hills/London, 1980.
18. SCHELP, F. P., THANANGKUL, O., SUPAWAN, V., SUTTAJIT, M., MEYERS, C., PIMPANTHA, R., PONGPAEW, P., AND MIGASENA, P. Serum proteinase inhibitors and acute-phase reactants from protein-energy malnutrition children during treatment. Amer. J. Clin. Nutr. 32, 1415 (1979).
19. LAURELL, C.-B., AND JEPPSON, J.-O. Protease inhibitors in plasma. In “The Plasma Proteins: Structure, Function, and Genetic Control” (F. W. Putnam, Ed.), Vol. I, pp. 229-254. Academic Press, New York, 1975.
20. HAMILL, P. V. V., DRIZD, T. A., JOHNSON, C. L., REED, R. B., ROCHE, A. F., AND MOORE, W. M. Physical growth: National center for health statistics percentiles. Amer. J. Clin. Nutr. 32, 607 (1979).