G. Charalambous (Ed.), Food Flavors: Generation, Analysis and Process Influence © 1995 Elsevier Science B.V. All rights reserved

SOME MULTIVARIATE PERSPECTIVES ON SHELF LIFE RESEARCH

R.H. Albert, Ph.D.(1) and C. Zervos, Ph.D.(2)

(1) US FDA/CFSAN, Washington, DC 20204
(2) US FDA/CDER, Washington, DC 20204 (deceased)

(Contents do not necessarily reflect policies and decisions of the FDA.)

The purpose of statistics is not mere numbers: the purpose of statistics is understanding. As a bonus, insight and inspiration may also be achieved. Multivariate statistics is a key to understanding, but it is a set of tools, not an end in itself. These tools can find, and indeed have found, application in most aspects of shelf life studies. Some specific areas may be cited here: the chemical analysis of food components, the creation of relevant variables, the processing of foods, and the bringing of insights from different disciplines to bear on the problems of food technology.

A. Chemical Analysis

A crucial feature in shelf life research is the accurate identification of the chemicals present and the determination of their concentrations. The time course of decomposition during the shelf life must be monitored by careful analysis of the metabolic products produced, with a view to determining the reaction kinetics of the decomposition. Whether deterioration is due to bacterial decomposition, to protein denaturation, or to oxidation of linoleic acid or related unsaturated fatty acids, the researcher needs accurate and precise determination of the chemical intermediates that are involved in the complex process of food spoilage.

Even such a routine problem as improving the resolution of the peaks in gas chromatograms has been attacked using multivariate techniques. Morgan and Deming {S.L. Morgan and S.N. Deming, J. Chromatogr. 112, 267-286 (1975)} determined optimal instrument settings using a multivariate technique called simplex optimization {W. Spendley, G.R. Hext, and F.R. Himsworth, Technometrics, 4, 441-461 (1962)}.

Youden {W.J. Youden, "Critical Evaluation (Ruggedness) of an Analytical Procedure," in "Encyclopedia of Industrial Chemical Analysis", Wiley-Interscience, New York, 1966, pp. 755-788} has popularized the standard statistical technique known as fractional factorial design, whereby chemical analysis methods are pre-optimized prior to being subjected to interlaboratory validation. Called by Youden "ruggedizing", this multivariate approach is an integral part of the AOAC International [formerly Association of Official Analytical Chemists] protocol for development of chemical analysis methods for regulating foods and drugs. To avoid the "statistics shock" that could frighten away chemists, Youden furnished a "template" for optimizing over a set of up to seven operating variables, provided that the variables can take on only two states. Typical of such "binary" or two-state control variables are temperature (high/low), solvent (ethanol/water), flow rate (fast/slow), and pressure (high/low).
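As an illustration only, here is a minimal sketch of the kind of two-level, eight-run layout such a template prescribes, with the main effect of each factor estimated as the difference between its high-state and low-state averages. The design matrix is generated from a full 2^3 design rather than copied from Youden's published table; the last three factor names and all the recoveries are invented for the example.

```python
# A minimal sketch of a Youden-style "ruggedness" design: 7 two-level factors
# studied in 8 runs.  The last three factor names and the responses are hypothetical.
import numpy as np

# Full 2^3 design in three base factors A, B, C (coded -1/+1)...
base = np.array([[a, b, c] for a in (-1, 1) for b in (-1, 1) for c in (-1, 1)])
A, B, C = base[:, 0], base[:, 1], base[:, 2]

# ...then generate the remaining four columns from interactions (D=AB, E=AC,
# F=BC, G=ABC), giving a saturated 2^(7-4) fractional factorial.
design = np.column_stack([A, B, C, A*B, A*C, B*C, A*B*C])
factors = ["temperature", "solvent", "flow rate", "pressure",
           "pH", "extraction time", "analyst"]          # last three assumed

# Hypothetical recoveries (%) from the eight ruggedness runs.
y = np.array([98.2, 97.5, 99.1, 96.8, 98.0, 97.9, 98.8, 97.2])

# Main effect of each factor = mean response at +1 minus mean response at -1.
for name, col in zip(factors, design.T):
    effect = y[col == 1].mean() - y[col == -1].mean()
    print(f"{name:16s} effect = {effect:+.2f}")
```

Factors whose effects stand out as large relative to the others are the ones that would threaten the ruggedness of the method and deserve tighter control.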

The variables ("features") need not be such mundane entities: they could be weighted sums of such mundane entities. In this way the researcher might find the fractional factorial exploration more effective if, say, principal components were used. The template from Youden consists of a prescribed set of 8 experiments to be run, each experiment having a different combination of binary states for each of the 7 variables. A more complete menu of experimental designs can be found in the text by Haaland {Perry D. Haaland, "Experimental Design in Biotechnology", Marcel Dekker, New York (1989)}. This text includes the "Plackett-Burman designs", which include a 12-experiment design for 8 variables.

Even now, such pragmatic simplifications as those made by Youden are needed to bring the benefits of multivariate statistics to those involved in food chemistry, and in particular to those involved in investigating shelf life. As Aishima and Nakai {Tetsuo Aishima and Shuryo Nakai, Food Reviews International, 7(1), 33-101 (1991)} state, "When a new idea is introduced... a strong suspicion of its practical value among scientists [develops and]... chemometrics [= multivariate statistics applied to chemical problems]... is still unpopular among flavor researchers."

Calibration curves {D.L. Massart, B.G.M. Vandeginste, S.N. Deming, Y. Michotte, and L. Kaufman, "Chemometrics: a Textbook", Elsevier, Amsterdam (1988)} have the potential of being more effective and less liable to noise and interferences when multivariate techniques are used. Fatty acids, for example, may absorb at more than one infrared frequency; the calibration can then utilize more than one of the peaks, in particular by using a weighted sum of the absorption peak areas as the y-variable to plot against the true concentration, or true concentration added, as the x-variable. Obviously, such an approach can be applied as well to other absorption spectroscopies and even to mass spectrometry.

Meglen {Robert R. Meglen, Fresenius J. Anal. Chem. 338, 363-367 (1990)} has reported on the use of principal component analysis for multivariate quality control of food reference samples. He was able to uncover anomalous samples and to find suggestive groupings of the samples into meaningful clusters. For this particular study, he was concerned with how closely food reference materials match corresponding real foods. In addition, the author offers the tantalizing prospect, with some hypothetical sketches, of how principal components can be used for multicomponent data from foods to quantify the effectiveness of the standard reference materials being used. In the use of such standard reference materials, it is essential that the profile of the reference standard as to analytes and their concentrations approximate as closely as possible the actual food matrix. Consider how inefficient and costly it would be to create one single reference material for just one analyte at a time. By thinking globally, by using a multivariate approach, optimal multipurpose reference standards can be developed.
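A hedged sketch of the kind of principal-component screening Meglen describes might look as follows; the data matrix, the planted anomalous sample, and the cut-off are all invented for the illustration.

```python
# Sketch of principal-component screening of reference samples.  The data are
# hypothetical: rows are reference samples, columns are measured analytes.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 6))          # 30 samples x 6 analytes (stand-in data)
X[7] += 4.0                           # plant one deliberately anomalous sample

# Autoscale, then obtain principal components from the SVD of the scaled matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
U, s, Vt = np.linalg.svd(Z, full_matrices=False)
scores = U * s                        # sample scores on all components
explained = s**2 / np.sum(s**2)

# Keep the first two components and flag samples whose score distance from the
# origin is unusually large (an illustrative cut-off of 3 "typical" units).
t = scores[:, :2]
dist = np.sqrt(((t / t.std(axis=0, ddof=1))**2).sum(axis=1))
print("variance explained by PC1, PC2:", np.round(explained[:2], 2))
print("suspect samples:", np.where(dist > 3.0)[0])
```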

The author's work vividly illustrates how we need multivariate techniques in order to catch up with the large data bases now becoming available from our instruments. A more general discussion by the same author {R.R. Meglen, Chemometrics and Intelligent Laboratory Systems, 3, 17-29 (1988)} treats the role of chemometrics in chemical and measurement sciences.

Currie {Lloyd A. Currie, "The Importance of Chemometrics in Biomedical Measurements" (chapter 6, pp. 74-100) in "Biological Trace Element Research: Multidisciplinary Perspectives", edited by K.S. Subramanian, G.V. Iyengar, and K. Okamoto, ACS Symposium Series 445, American Chemical Society, Washington, DC (1991)}, in a review chapter, gives a good example of the use of the Youden ruggedizing approach in an air pollution study. He also gives an example of the simplex technique for optimizing instrument operating conditions and settings. Furthermore, he discusses multivariate quality control. Thus, he touches upon the key points addressed in this chapter.

Even the basic processes of designing chemical determinations and evaluating the results can be improved by application of multivariate statistical techniques. Resistance to these techniques will wear away as their utility becomes evident and as the flood of data from our instruments forces new perspectives on data processing, to go from raw data to meaningful information.

B. DESIGNING RELEVANT VARIABLES

Clearly multivariate statistical techniques can facilitate measuring chemical concentrations. Now it will be shown that these techniques can be used as well to develop meaningful variables that go beyond mere individual chemical concentrations.

Consider for example the problem of measuring rancidity. Oxidative rancidity in fats and oils has been with us since prehistoric times. Surely one of the reasons that the Europeans of Columbus' time were so avid for the spices of the Orient was to cover up the off-flavor and off-smell of meat in those pre-refrigerator days. The efficacy of smoking meat, which deposits antioxidative phenols, must have been a serendipitous discovery. Typical rancidity is a result of a complex sequence of oxidations of fats and oils; consequently, when a quantitative measure of rancidity is called for, some measure of the extent of oxidation is thereby needed.

Gray {J.I. Gray, "Simple Chemical and Physical Methods for Measuring Flavor Quality of Fats and Oils", chapter 12, pp. 223-239, of "Flavor Chemistry of Fats and Oils", edited by David B. Min and Thomas H. Smouse, American Oil Chemists' Society (1985)} lists 5 analytical chemical procedures for assessing the extent of oxidation of fats and fat-containing foods. These methods are: peroxide value, TBA (thiobarbituric acid) value, carbonyl value, anisidine value, and the Kreis test result.

Each of these tests actually detects a different aspect of the oxidation process. For example, the much-used TBA test essentially just measures the malonaldehyde concentration in the food matrix. The author mentions some objections to this measure of rancidity, among them being the possibility of other materials absorbing at the 532 nm peak, where the TBA-malonaldehyde complex is traditionally measured. The complex also has an absorbance peak at 550 nm. Doesn't this suggest that some combination of the two peaks might be used to determine the concentration of the TBA-malonaldehyde complex?

Pohle et al. {W.D. Pohle, R.L. Gregory, and B. Van Giessen, JAOCS 41, 649 (1964)} did find that the flavor score could be estimated from the TBA value for the fats and oils that they studied. Similarly, the peroxide number could be used to predict what the flavor score would be. All 5 tests simply measure different aspects of oxidative degradation, like the five blind men describing an elephant. The peroxide test actually measures the concentration of the hydroperoxides that are the initial products of lipid oxidation, but these are transitory and unstable. If it is assumed that the flavor score has some objectively verifiable, consistent meaning, the problem of combining the five measures to give a better fit to the flavor score would be an interesting research problem. One would hope that some of the weighting factors would be so small that those particular measures could be omitted as insignificant. Multivariate techniques would also show which of the 5 measures are so highly correlated that some might be redundant and need not be considered. In short, it might be fruitful to combine the scores for each of the measures of rancidity to create a new measure of rancidity. To further refine this rancidity assessment, the measures from the "dynamic methods" listed by Gray could be included: the Schaal oven test, active oxygen methods, and various oxygen absorption tests. Even the results of standard spectroscopic and polarographic techniques could be incorporated into this suggested multivariate measure of rancidity.

How important such a refinement of the rancidity measure might be is seen in the work of Verma et al. {Meat Sci. 14, 91 (1985)}, as reported by Ledward {Dave Ledward, Food Science & Technology Today, 1(3), 153-155 (1987)}. The problem addressed was the role of iron-containing proteins (hemoproteins) in lipid oxidation during the spoilage of raw meat. The investigation entailed the study of the catalytic effect of various hemoglobin and myoglobin derivatives on the formation of thiobarbituric acid reactive compounds. Thus, the TBA score, as is usual, was used as a surrogate for the extent of rancidity. Despite the imprecision of such a measure, Verma et al. apparently were able to show that the ferric-containing hemoproteins were the major catalysts for oxidative degradation.
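A minimal sketch of the combination suggested above, using simulated stand-in values for the five measures and the flavor score, might run as follows; the correlation matrix shows which measures are nearly redundant, and the least-squares weights show how the measures could be composited.

```python
# Sketch of the composite-rancidity idea: regress a flavor score on the five
# oxidation measures at once and inspect the weights and inter-correlations.
# All numbers here are simulated placeholders, not published data.
import numpy as np

rng = np.random.default_rng(1)
n = 40
peroxide  = rng.normal(5, 1.5, n)
tba       = 0.6 * peroxide + rng.normal(0, 0.5, n)   # deliberately correlated
carbonyl  = rng.normal(3, 1.0, n)
anisidine = rng.normal(8, 2.0, n)
kreis     = rng.normal(1, 0.3, n)
flavor = 10 - 0.8 * peroxide - 0.5 * carbonyl + rng.normal(0, 0.5, n)

X = np.column_stack([peroxide, tba, carbonyl, anisidine, kreis])
names = ["peroxide", "TBA", "carbonyl", "anisidine", "Kreis"]

# Correlation matrix: highly correlated measures may be redundant.
print(np.round(np.corrcoef(X, rowvar=False), 2))

# Least-squares weights for the composite predictor of flavor score.
A = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(A, flavor, rcond=None)
for name, w in zip(names, coef[1:]):
    print(f"{name:10s} weight = {w:+.2f}")
```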

Duerr and Schobinger {P. Duerr and U. Schobinger, pp. 179-193, "Flavour '81", 3rd Weurman Symposium, Proceedings of the International Conference, Munich, April 28-30, 1981, ed. by Peter Schreier, Walter de Gruyter, New York (1981)} claim that about 330 volatiles have been found in orange fruit and juice. Since off-flavor in orange juice is indicated by the presence of alpha-terpineol, the authors plotted the concentration of this volatile material in orange juice as a function of time, up to 90 days, for storage temperatures of 4°, 20°, and 32° Centigrade in cardboard packages and in a glass bottle. Their conclusion was that the increase of alpha-terpineol was linear over 90 days and was strong in the glass-bottled juice. As part of their shelf life experiments, they noted that neral and octanal decreased in the course of time over 90 days of storage and that, as expected, the decrease was greater at the higher temperatures. Why not sum up, with appropriate coefficients, the concentrations of these three well-defined volatiles (alpha-terpineol, neral, and octanal) to obtain a possibly more effective and reliable measure of orange juice deterioration?

Such a synthetic variable may or may not have a physical interpretation. As in much of multivariate research, the scientist deals interactively with the statistical results and brings to bear his/her own skill and knowledge. Exploration is often necessary to seek reasonable conclusions. Multivariate statistics does not invariably give the right answer, but it does provide more choices for the statistician. In the case of apple juice, the authors report the work of Koch et al., who found that a simple sum of the 2-hexenal and 2-hexenol concentrations is highly correlated with the intensity of apple juice essence. In other words, the Koch composite variable, which could be used to monitor deterioration in apple juice, is simply: new variable = 0.5 x hexenal concentration + 0.5 x hexenol concentration, i.e., the simple average of the two concentrations.

The researcher is invited to let his/her imagination soar above routine one-variable-at-a-time computations. Even in multivariate statistics, the imagination can come into play: an initial transformation of the raw data may provide more tractable variables, even prior to data reduction and manipulation via statistics. For example, the logarithm of a variable may be a better representation of reality than just the simple variable. In his study of off-flavors in canned beef sterilized by heat, Von Sydow {E. Von Sydow, Proc. R. Soc. Lond. B, 191, 145-153 (1975)} found a linear relation between the so-called "retort off-flavor" and the following variable: the square root of the product of the hydrogen sulfide concentration and the 2-methylthiophene concentration.

Another example of the need for a created variable arises from the term "shelf life" itself. People in the field would agree that the word puts too much emphasis on the time aspect and that other factors should be considered as well. Certainly temperature, or storage temperature, is crucial to how well a stored food or drug retains its desirable features. In fact, reality is a bit more complicated as far as temperature goes: a typical product may undergo several temperature environments as it goes from manufacturer to truck to warehouse to retailer. So temperature history is a factor. In fact, such a time-temperature-history based variable would be pertinent to the cyclic refrigeration scheme proposed by Scott, Steffe, and Heldman {E.P. Scott, J.F. Steffe, and D.R. Heldman, in "Changing Food Technology", pp. 189-208, edited by M. Kroger and A. Freed, Technomic Publishing Co., Lancaster, PA (1989)}.

Also relevant to quality retention are the conditions under which the product is kept by the consumer after purchase. Certainly, a variable analogous to degree-days would be a less misleading and more informative indicator of durability than would a mere declaration of some expiration date or time period. Handling, processing, additives, acidity, and wrapping are among other key factors that need to be considered when trying to quantify the durability of the desirable features of a food or drug product. A composite score might be feasible that would in effect be a linear discriminant function, serving to indicate when the values taken on by the environmental factors, time, processing, etc. cause this function to exceed some critical value.

Karel {Marcus Karel, Chapter 17, "Focal Issues in Food Science and Engineering", in "Food Product Development", ed. by Ernst Graf and Israel Sam Saguy, Van Nostrand Reinhold, New York (1991)} echoes the fact that shelf life is a function not simply of time but of conditions of storage. He alludes to a device that can be scanned at the checkout counter to indicate not only how long the product has been on the shelf but also the temperature conditions the product has been exposed to. The readout consists of the number of "equivalent shelf days", a parameter that provides a measure of both time-on-shelf and abuse. A critical value for this designed variable can be determined to warn the consumer. Of course, nature does not usually make abrupt jumps: the transition from acceptable quality to unacceptable quality is arbitrary, but the line must be drawn somewhere, and thinking in terms of a linear discriminant function may help draw that line.

Multivariate statistics can certainly help in determining a flavor score, that evanescent quantity often determined by the opinion of expert or not-so-expert panels. D. Thompson {David Thompson, Food Science and Technology Today, 3(2), 83-88 (1989)} has been advocating a special multivariate approach to sensory panel evaluation. This approach, called Generalized Procrustes Analysis (GPA), differs from the usual flavor scoring. In the typical panel set-up, a common set of vocabulary terms is established and the panel members are instructed on how to assign a score for each component of a flavor (e.g., fruitiness, tartness, etc.) in the food items being tested. The GPA approach lets each panelist establish his/her own idiosyncratic set of variables, and via a computer-intensive technique called multidimensional scaling the score space of each of the panelists is rotated to maximize the geometric similarity of the different spaces. Once a consensus score space is obtained, the usual principal component analysis to adjust for correlations among the flavor components is performed. The use of principal components, that is, the use of special linear combinations of the separate flavor component scores, may reduce the number of factors that need to be considered, may indicate superfluous variables, and might point to suggestive high correlations among some of the flavor components.
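Full GPA is an iterative, multi-panelist procedure; the following much-reduced sketch shows only its pairwise core, aligning one hypothetical panelist's score configuration to another's with an ordinary Procrustes rotation. The panel scores, the equal vocabulary sizes, and the scaling are all assumptions of the example, not a description of Thompson's method.

```python
# Much-simplified illustration of the Procrustes idea behind GPA: rotate and
# rescale one panelist's configuration to best match another's.  Full GPA
# iterates this over all panelists to build a consensus; data are hypothetical.
import numpy as np
from scipy.spatial import procrustes

rng = np.random.default_rng(2)
samples, terms = 8, 4                 # 8 food items, 4 vocabulary terms each

panelist_a = rng.normal(size=(samples, terms))
# Panelist B uses different words/scales: a rotated, rescaled, noisy copy of A.
q, _ = np.linalg.qr(rng.normal(size=(terms, terms)))
panelist_b = 2.5 * panelist_a @ q + rng.normal(scale=0.1, size=(samples, terms))

# procrustes() standardizes both configurations and finds the best-fitting
# rotation; "disparity" measures how far apart the aligned spaces remain.
a_std, b_aligned, disparity = procrustes(panelist_a, panelist_b)
print(f"disparity after alignment: {disparity:.3f}")
```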


A recent book dedicated to the application of multivariate statistics to sensory data has been written by Burgard and Kuznicki {David R. Burgard and James T. Kuznicki, "Chemometrics: Chemical and Sensory Data", CRC Press, Boca Raton (1990)}. It strives to bridge the gaps among analytical chemistry, sensory evaluation, and multivariate statistics. Individual chapters are dedicated to correlation and regression, discriminant analysis, and factor analysis. Factor analysis is the multivariate technique that is most concerned with uncovering new variables, but it is not primarily a routine tool. The mathematics can be formidable, even for a statistician, and a great demand is made on the researcher's creativity, insight, and luck. No pat formulas or algorithms exist in factor analysis, and a certain ambiguity and arbitrariness exists in the artificial constructs of this technique. For example, a data set of orange tea sensory attributes is presented, for which an "orangey" construct and a "spicy" construct are produced. However, Burgard and Kuznicki attractively present factor analysis for those who might want to explore the technique. They exhaustively work through one particular data set and teach the essential features of this challenging tool in the tool kit of multivariate statistics.

We (Zervos and Albert) have written a chapter in "Off-Flavors in Foods and Beverages" {C. Zervos and R.H. Albert, "Chemometrics: the Use of Multivariate Methods for the Determination and Characterization of Off-Flavors", in "Off-Flavors in Foods and Beverages", edited by G. Charalambous, Elsevier Science Publishers, Amsterdam (1992)} that gives a sample of the use of multivariate statistics in dealing with off-flavors. In this survey, the work of Bertuccioli et al. {M. Bertuccioli, G. Montedoro, and S. Clementi, ... (1986)}, among many others, was cited as typical of the kind of multivariate applications to flavor research actually being done. In our chapter, we demonstrated how clique analysis, a specialized version of clustering that lets an element belong to more than one group, can be applied to the Bertuccioli data to gain insight into the relationships among the variables used to measure the chemicals evolved during the storage of Provolone cheese. Variables that belonged together were flagged, wherein "belonging together" could mean, under favorable circumstances, that any variable in a clique might be able to act as a surrogate for all the other variables in the same clique. Thus, multivariate statistics can not only create new variables, it can reduce the number of variables that have to be dealt with. Subsequent correlation studies of the simplified variable set with sensory data could thereby be expedited.
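The clique idea can be sketched as follows: link variables whose pairwise correlations exceed a threshold and enumerate the maximal cliques of the resulting graph. The variable names and correlation values below are illustrative stand-ins, not the Bertuccioli data.

```python
# Sketch of clique analysis on a thresholded correlation graph.
# The variables and correlations are invented for the illustration.
import numpy as np
import networkx as nx

names = ["acetic", "propionic", "butyric", "ethanol", "diacetyl"]
corr = np.array([
    [1.00, 0.92, 0.88, 0.10, 0.15],
    [0.92, 1.00, 0.90, 0.05, 0.20],
    [0.88, 0.90, 1.00, 0.12, 0.18],
    [0.10, 0.05, 0.12, 1.00, 0.85],
    [0.15, 0.20, 0.18, 0.85, 1.00],
])

G = nx.Graph()
G.add_nodes_from(names)
threshold = 0.8
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        if abs(corr[i, j]) >= threshold:
            G.add_edge(names[i], names[j])

# Each maximal clique is a set of variables that "belong together"; any member
# might serve as a surrogate for the rest under favorable circumstances.
for clique in nx.find_cliques(G):
    print(sorted(clique))
```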


A much more recent paper on an aging index for cheese has been reported by Banks et al. {J.M. Banks, E.Y. Brechany, W.W. Christie, E.A. Hunter, and D.D. Muir, Food Research International, 25, 365-373 (1992)}, who investigated the volatile components of Cheddar cheese as indicators of cheese maturity, flavor, and odor. Thirty-one gas-liquid chromatograph peaks were studied from each of 12 cheese samples. A "partial least squares analysis" (which might actually have been a canonical correlation analysis) was carried out to relate the sensory scores (19 sensory attributes were used) from various panels and the chemical composition (31 peak areas). With only 12 samples, the use of multivariate statistics may be like using a sledgehammer to crack a walnut. Nevertheless, the authors do give, for the few data points involved, an interesting demonstration of how some multivariate techniques might be used, especially to generate those special linear combinations called principal components.

Another dairy product of crucial concern in shelf life studies is milk. Leland {J.V. Leland, G.A. Reineccius, and M. Lahiff, J. Dairy Sci. 70(3), 524-533 (1987)} studied 42 samples of milk, oxidized to various degrees using copper wire mesh. The oxidized material served as a surrogate for ordinary spoiled milk, but under controlled conditions. The authors analyzed 22 components via headspace gas chromatography and concurrently used a panel of five experts to evaluate the milk for quality. In this study, they employed the special linear combination of variables called a linear discriminant function. Both a principal component and a linear discriminant function are linear combinations of variables, in this case linear combinations of component concentrations. However, the "essences" are different: the principal component is intended to capture as much of the variance structure as possible, being that linear combination with the highest variance, subject to certain restrictions. By contrast, the linear discriminant function (which is just a variable too, of course) is that combination that gives a maximum separation among the centers of gravity of the sets of points corresponding to different groups (low quality, medium quality, high quality). The linear discriminant function often has associated with it a boundary value such that if the numerical value of the function for an unknown milk sample exceeds this critical value, the sample is assigned to one category, "high quality" for example, and if the value is not exceeded, the sample is assigned to a different category. Principal components are used for purposes of data reduction, by capturing the variance structure of the data set by means of a reduced number of variables. Linear discriminant functions serve to categorize and to separate the data into distinct a priori groupings.

A good example of a linear discriminant function is to be found in an article by Nishimura and Kato {Toshihide Nishimura and Hiromichi Kato, Food Reviews International, 4(2), 175-194 (1988)} on the taste of free amino acids and peptides. The authors observe that proteins without taste, when hydrolyzed by proteases, produce bitter peptides. They studied the reported decomposition during storage of miso as indicated by the change in average peptide length with storage time, up to 50 days. In order to predict the bitterness of any given peptide, they cite from the literature a table of coefficients for each of 16 key amino acids. A linear discriminant function is given by the sum of the number of occurrences of each amino acid in a given peptide, each weighted by its coefficient, divided by the number of amino acid residues in the peptide. If the numerical value of this sum, that is, the value of this linear discriminant function, exceeds 1400, then the peptide is expected to elicit a bitter taste.
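A sketch of that rule is easy to write down; the handful of per-residue coefficients used here are illustrative stand-ins for the published table of 16, so only the mechanics, not the numbers, should be taken literally.

```python
# Sketch of the bitterness rule described above: average the per-residue
# coefficients over a peptide and compare with the 1400 cut-off.  The
# coefficients below are hypothetical stand-ins, not the published table.
COEFF = {            # hypothetical hydrophobicity-style values
    "Gly": 0, "Ala": 500, "Val": 1500, "Leu": 1800,
    "Ile": 2950, "Pro": 2600, "Phe": 2500, "Lys": 1500,
}

def bitterness_score(peptide):
    """Weighted sum of residue coefficients divided by the number of residues."""
    return sum(COEFF[res] for res in peptide) / len(peptide)

def is_bitter(peptide, cutoff=1400):
    return bitterness_score(peptide) > cutoff

peptide = ["Leu", "Phe", "Pro", "Gly", "Ala"]      # hypothetical peptide
print(round(bitterness_score(peptide)), is_bitter(peptide))
```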


A chemical explanation is available for the relative values of these coefficients or weighting factors: the bitterness of peptides is caused by the hydrophobic character of the amino acid side chains. The authors sought a quantitative description of the frequent observation that the flavors of beef, pork, and chicken are actually improved by storage at low temperatures for certain periods of time. The linear discriminant score at least provides a handle on the problem.

C. MULTIVARIATE STATISTICS FOR ASSESSING PROCESSING CONDITIONS

Processing conditions can have a profound effect on the quality and stability of food and drug products. A wide range of factors can be involved in carrying out a process, with the result that multivariate tools are required to monitor and control them. Alt and Smith {F.B. Alt and N.D. Smith, "Multivariate Process Control", Ch. 17, in "Quality Control and Reliability", ed. by P.R. Krishnaiah and C.R. Rao, North-Holland, Amsterdam (1988)} develop some multivariate process control techniques that rely on the fundamental vector of means for each of the N processing conditions and on the corresponding N x N variance-covariance matrix.

A useful way to quantify the effect of the array of processing conditions on the array of product quality indicators is the multivariate technique of canonical correlation. Canonical correlation is especially adept at relating a cluster of variables of one type with a cluster of variables of another type. Typically, one type of variable could be processing or storage conditions and the other type could be the cluster of "flavor notes" associated with the product. Also of potential value in assessing processing conditions are linear discriminant functions. These serve to delimit the values for the acceptable processing conditions, providing a multidimensional boundary between acceptable and unacceptable processing conditions.

Here are some examples of multivariate processing studies. As an object lesson in how multivariate statistics might be of use, one can consider the paper by Lima and Cal-Vidal {A.W.O. Lima and J. Cal-Vidal, "Estimation of shelf life of film-packaged freeze-dried banana", Journal of Stored Products Research, 24(2), pp. 73-78 (1988)}, where the effect of various factors on the shelf life of film-packaged freeze-dried bananas was studied. For polyethylene films of different thicknesses, a plot is made of shelf life vs. water vapor pressure and a plot is made of shelf life vs. reciprocal temperature; a similar pair of plots is made for polypropylene films of varying thicknesses.

Some three-dimensional plots might be useful to demonstrate the interactions between the factors of temperature, film thickness, and water vapor pressure as they combine to determine the shelf life. The actual estimation of the shelf life was based in part on some semi-empirical formulas for moisture transfer rates.

Mittal et al. {G.S. Mittal, R. Nadulski, S. Barbut, and S.C. Negi, "Textural profile analysis test conditions for meat products", Food Research International 25, 411-417 (1992)} studied the effects of test conditions on the "texture profile" of three beef products: wieners, salamis, and corned beef. The testing factors (which here may be viewed as processing factors) included diameter-to-length ratios, % compression, and the speeds of the compressing device; among the texture factors ("parameters") were two different types of hardness, cohesiveness, springiness, chewiness, and gumminess. Common-sense adjustment of the data, prior to analysis, was made to account for cross-sectional area and for strain, an object lesson in the value of post-editing of raw experimental data to express the results in more meaningful terms. This was obviously a multivariate study that may or may not have benefited from application of canonical analysis. In fact, while multiple regression might have been of value, it could well be that, for a subtle reason, canonical analysis would be INAPPROPRIATE in this particular research. The hindrance is that each operating condition might be able to be assigned a level or value independently of the level or value of any of the other operating conditions. In such circumstances, the variance-covariance matrix for just the operating conditions could be manipulated to contain any values whatsoever. This would be a subtle but real violation of the assumptions underlying canonical correlation. In truth, the non-diagonal elements of the matrix for the processing factors are all zero if indeed the processing factors can be assigned values independently of one another. What the authors did calculate was the correlation of each of the separate operating conditions with each of the texture parameters. No variance-covariance matrix was used or needed by the authors. The authors could draw some valid conclusions as to the optimum testing conditions.

Radiation to retard or prevent bacterial spoilage has been a controversial but often successful processing procedure. In a recent article, Narvaiz et al. {P. Narvaiz, G. Lescano, and E. Kairiyama, "Physicochemical and sensory analyses on egg powder irradiated to inactivate Salmonella and reduce microbial load", Journal of Food Safety, 12, 263-282 (1992)} reported on the results of a study to identify an adequate gamma radiation dose consistent with retaining egg powder quality. Four different radiation intensities were utilized to inactivate Salmonella: 0, 2, 5, and 10 kGy, where Gy represents a gray, i.e., 1 joule of absorbed energy per kilogram. These were the 4 processing conditions studied. Among the features of the resulting product measured over a 4-month period were: peroxide number, visible and ultraviolet absorption peak areas, foam stability, and viscosity.

Concurrently, sensory panelists scored such properties as external appearance, odor, flavor, and acceptability. Microbial concentrations (MPNs = most probable numbers) were also determined, of course. No detailed attempt was made by the authors to apply multivariate techniques: each factor or feature was looked at separately, except to note that when rancidity (as measured by the peroxide number) was high, so was the rejection rate by the sensory panelists. For the authors' purposes, the nearly univariate approach used may be perfectly adequate.

Four general types of quality deterioration during production, transportation, and storage are: (1) physical, e.g., drying; (2) chemical, e.g., rancidity; (3) enzymic, e.g., browning; and (4) microbiological, e.g., microorganism growth. In order to process food to ensure the removal or diminishing of this last-mentioned type of spoilage, a realistic predictive mathematical model of microbial growth and survival in foods is needed. This need has been discussed, for example, by G. Gould {Grahame Gould, "Predictive Mathematical Modelling of Microbial Growth and Survival in Foods", Food Science and Technology Today, 3(2), pp. 89-92 (1989)}. The model must successfully predict the microbial behavior for the many processing alternatives, including heating, irradiation, decontamination with gases, drying, acidifying, adding preservatives, and decontaminating the ingredients. His work is an exemplar of how modern science is now so data-rich: he was able to capture simultaneously the growth patterns within 24 cultures with differing pH and salinity values. With the instrumentation used he could actually have studied up to 200 such cultures, and he could monitor the time-dependence of the bacterial growth via automated optical density measurements. His report is part of a long-range data-base creation project to uncover and record the key determinants of microbial growth for a wide spectrum of microorganisms under a wide range of processing conditions. First comes data, then information, then finally wisdom.

"Hurdle technology" in the processing and designing of food for safe storage is a multifactor approach that preserves food by combining methods. The "hurdles" may include such processing conditions as temperature, acidity, reduction-oxidation potential, atmosphere modification, presence of antioxidants, and presence of antimicrobials. "The concept is that for a given food the bacteria should not be able to 'jump over' all the hurdles present and so should be inhibited. If several hurdles are used simultaneously, a gentle preservation could be applied, which nevertheless secures stable and safe foods of high sensory and nutritional properties. This is due to the fact that the different hurdles in a food often have a synergistic (enhancing) effect." {anonymous, Food Science and Technology Today, 6(3), p. 139 (1992)}

A good example of the application of hurdle technology is to be found in the paper by Chirife and Favetto {Jorge Chirife and Guillermo Favetto, "Some physico-chemical basis of food preservation by combined methods", Food Research International 25, 389-396 (1992)}. They review such key processing and deterioration factors as water activity (a_w), pH, solute effects, and temperature, and consider the interaction of these factors. The water activity has historically been a more meaningful indicator than the simple moisture content because the activity includes the effects of the food matrix. Chirife and Favetto explore the combined effects of decreasing water activity and heating: by lowering the water activity, less heating is needed to deter bacterial growth. Be aware that the choice of the word "hurdle" to describe such a multivariate situation can mislead the reader, in that the process designer does not envision each of the conditions being met sequentially, as in a hurdle race; rather, the process variables must be considered simultaneously.

Chao and Rizvi {R.R. Chao and S.H. Rizvi, "Maximization of produce shelf life through modified microatmosphere packaging", from the book "Changing Food Technology 2", pp. 175-178, ed. by Manfred Kroger and Allen Freed, Technomic Publishing Co., Lancaster, PA (1989)} must deal with multivariate processing conditions in their study of designing a modified microatmosphere packaging system to extend the shelf life of apples. Among the parameters to be juggled simultaneously were: storage temperature, time required to reach desired oxygen and carbon dioxide concentration levels, the surface for gas exchange, oxygen and carbon dioxide permeabilities of the wrapping film, and the packaging materials. Part of the strategy to cope with the complex interactions was to use mass balance equations for the gases within the packaging. Mass balance equations simply express the conservation of mass and facilitate the mathematical description of the processes taking place within the film-wrapped product as the product consumes oxygen and generates carbon dioxide, while simultaneously both these gases are permeating through the film.
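The mass-balance bookkeeping for such a film-wrapped product can be sketched as a pair of rate equations, one for each gas, with permeation driven by the concentration difference across the film and respiration consuming oxygen. Every rate constant below is an assumed round number, not a value from the Chao and Rizvi study.

```python
# Sketch of gas mass balances inside a film-wrapped package: oxygen is consumed
# by respiration and both gases permeate through the film.  All constants are
# hypothetical round numbers chosen only to make the curves plausible.
import numpy as np
from scipy.integrate import solve_ivp

AIR_O2, AIR_CO2 = 20.9, 0.03      # % in the surrounding air
P_O2, P_CO2 = 0.01, 0.03          # film permeances (per h, per % gradient), assumed
K_RESP = 0.02                     # respiration: fraction of package O2 used per h, assumed
RQ = 1.0                          # respiratory quotient (CO2 produced / O2 used)

def package(t, y):
    o2, co2 = y
    uptake = K_RESP * o2                         # O2 uptake by the produce
    d_o2  = P_O2  * (AIR_O2  - o2) - uptake      # permeation in, respiration out
    d_co2 = P_CO2 * (AIR_CO2 - co2) + RQ * uptake
    return [d_o2, d_co2]

sol = solve_ivp(package, (0, 200), [AIR_O2, AIR_CO2], t_eval=np.linspace(0, 200, 5))
for t, o2, co2 in zip(sol.t, sol.y[0], sol.y[1]):
    print(f"t={t:5.0f} h  O2={o2:5.1f}%  CO2={co2:5.1f}%")
```

The steady state of these two equations is exactly the "desired oxygen and carbon dioxide concentration levels" that the package designer tunes by choosing film permeabilities and surface area.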

In thinking about such hurdle approaches, we could picture a three-dimensional space, with its three coordinate axes corresponding respectively to water activity, pH, and temperature. Any point in this space corresponds to a set of process conditions. Imagine a three-dimensional figure like an icosahedron such that this figure is made up of all those points that correspond to an adequate combination of processing conditions. Any point lying outside this figure would correspond to an unacceptable combination of processing conditions. The shape of this figure, the boundary between acceptable and not acceptable, can be delineated with the help of multivariate techniques. Even if the boundary is more like a crumpled sheet than a polyhedron, application of techniques like piece-wise regression will help trace out this boundary and will help to indicate suitable combinations of operating conditions that will achieve the goal of a safe product with less expenditure. As novel food products, such as low-acid foods or low-fat foods, designed to cater to new fads and fears, penetrate the marketplace, the application of hurdle technology and its inherent multivariate approach will gain in importance.

TRANSCENDING STATISTICS AND PEEKING OVER THE FENCE INTO OTHER FIELDS

Man does not live by statistics alone. The food technologist is truly confronted with the need to adopt a multidisciplinary approach to his/her problems. From ANTHROPOLOGY to ZOOLOGY, with important stops at ECONOMICS and PSYCHOLOGY, the accumulated store of many sciences must be tapped by those concerned with food sciences, and new contributions to this store often have to be made. For example, the clique analysis referred to above had its origin in SOCIOLOGY, with the first rigorous formula for complete enumeration of all cliques to be found in the journal Sociometry {F. Harary and I.C. Ross, "A procedure for clique detection using the group matrix", Sociometry, 20(3), 205-215 (1957)}. Mathematics, and in particular multivariate statistics, can provide a unifying approach to assessing the results and estimating the uncertainties.

To cite one example of how the scientist might have to adjust to a whole new paradigm, consider a simple sensory panel problem involving just a paired comparison test. Each panel member is presented with two items and he/she is asked to select which item is "better" on the basis of some pre-agreed-upon protocol for evaluation. If three items are involved (cheese, chicken, and chocolate), it is not inconceivable that the panelists (a) prefer cheese over chicken, and (b) prefer chicken over chocolate, and yet (c) prefer chocolate over cheese. Note how this set of preferences wreaks havoc on our ordinary quantitative training. Since Euclid, the third-century B.C. geometer and educator from Alexandria, every school person has been taught that if a is greater than b and if b is greater than c, then a is also greater than c. Mathematicians label such behavior "transitive." For example, 5 is greater than 3 and 3 is greater than 1, so 5 is greater than 1. Note that such is not the behavior in the cheese-chicken-chocolate dilemma (trilemma?). Numerical instincts fail us. Merely assigning a "satisfaction value" or "util" can't possibly capture this counter-intuitive behavior.

In one ECONOMICS textbook, Walsh {Vivian Charles Walsh, "Introduction to Contemporary Microeconomics", McGraw-Hill, New York (1970)} derides the use of utils, calling such a practice a relic of crude, unthinking British empiricism of the nineteenth century. Walsh proceeds to explain "indifference curves", curves that do not rely on exact numerical values but instead depend on preference rankings. However, the cheese-chicken-chocolate problem cannot be solved by this simple ruse of ranking and of "revealed preferences." If one insists on numerical values when transitivity does not hold, the researcher must literally rise above the "x-axis" and consider an item's utility to be expressed as a vector in multidimensional space; then the non-Euclidean preference ranking might be representable in a geometric fashion.

Think higher-dimensionally. Some of the more ambitious and daring might even want to explore the use of tensors to express utility. If this much trouble can realistically be expected to arise from time to time from a simple pair comparison, then imagine how creative the researcher needs to be in coping with the "duo-trio" test and the "triangular" test employed in sophisticated sensory panel evaluations {Roland Harper, "A guide to sensory analysis and its development", Food Science and Technology Today, 1(2), pp. 73-76 (1987)}.

Much of the initial portion of this chapter has dealt with the concept, maligned above, of utility. Used by economists, philosophers, insurance companies, and management scientists (see below), the word "utility" is difficult to define: an adequate definition requires precision-honed mathematical symbolism. "Utility index" as defined by Dyckman et al. {T.R. Dyckman, S. Smidt, and A.K. McAdams, "Management Decision Making Under Uncertainty", Macmillan, London (1969)} is "a real number that gives a preference measure attaching to an outcome or the payoff for an outcome." The authors then go on to define a utility function as a function that assigns a value to a "gamble" such that if gamble g1 is preferred to gamble g2, then the value assigned by the utility function to gamble g1 is greater than that assigned to gamble g2. Note that, by the use of a definition in terms of real numbers, the transitivity condition is assumed to hold.

From previous discussions in this chapter, it should be clear that the use of principal components might be a fruitful means of compositing several variables to create a utility function. Even the linear combination resulting from multivariate regression could be a candidate for a utility function. Even adding cross terms in a regression, such as sweetness x density, is not out of the question as part of an exploration to find a suitable utility function. A realistic utility function must combine a widely disparate array of variables, both objective and subjective. Consider how difficult it would be to create a single utility function that mirrors the "healthiness" of food. "Is 5 units of vitamin B12 worth 3 units of vitamin A?" Such questions arise naturally as possible explicit formulas for a utility function are reviewed. Now try to contrive a solution to the harder problem of coming up with a utility function that mirrors the "quality" of food: in addition to "healthiness" variables, such as vitamin and fat content, all the "hedonistic" variables, such as texture and taste, are to be considered. It may well be that in such situations the search for a single utility function is like the search for the holy grail: a noble enterprise but doomed to failure. It is perhaps not merely a matter of time and improved technology until such an explicit all-encompassing utility function can be devised; the goal may be illusory and may never be achieved.

Another complication with utility functions as currently envisioned is that the utility indices are often erroneously assumed to be additive. What is, after all, merely an ordinal ranking is treated as a full-fledged continuous variable (i.e., as a "ratio" variable). This mistake is particularly insidious when decisions are based on the results of multiplying the utility of each choice by the a priori known probability of some relevant condition holding true. A utility index of 5 is not five times higher than a utility index of 1; therefore 1/10 of 5 utils is not equivalent to 1 util in value. In fact, such a multiplication does not make any sense. Ranks don't work that way. This striving after utility functions does force the researcher to examine the actual physical problem closely. Therefore, the misuse of utility functions does not preclude them from being used as a crude guide. They're better than nothing.

As discussed in the preliminary portions of this chapter, the transition from individual utility functions to an overall utility function must be made in order to intelligently assess options in foods and in life. Here we transgress into the realms of POLITICS and PHILOSOPHY. A name exists for such an overall merged utility function: it is called, by Nobel-prize-winning Kenneth J. Arrow and by others, the social welfare function. If conceptual and practical difficulties arise for individual utility functions, then no better can be expected from the social or communal welfare function. Arrow's definition, given on page 12 of his book "Social Choice and Individual Values" {Kenneth J. Arrow, "Social Choice and Individual Values", Yale University Press, New Haven (1963)}, is as follows: "By a social welfare function will be meant a process or rule which, for each set of individual orderings R1, ..., Rn for alternative social states (one ordering for each individual), provides a corresponding social ordering of alternative social states, R." Much of the book is devoted to establishing conditions under which such a function can exist. Clearly "not yet ready for prime time", the problem of integrating individual preferences into communal preferences is of deep, abiding concern that has important impact on how we vote, how we are governed, how we are taxed, and how we live.

Associated with the problem of utility is the fact that utility is a function of how people perceive their needs and wants. Thus, the realm of PSYCHOLOGY is important in assessing and interpreting people's preferences. Shelf life is related to the retention of the quality of a food product, but the perception of quality is highly subjective. Even health aspects, matters of life and death, can be viewed differently by different peoples. Consider our attitude toward aflatoxin-contaminated grain versus the attitude toward the same grain of starving Somalis. As the proverb states, "One man's meat is another man's poison." ["One man's Mede is another man's Persian" might be an older version of this proverb.] Consider how different tastes are. Pork is anathema to both Jews and Muslims, yet some Polynesian islands are reputed to exist where pork is considered such a supreme delicacy that women are allowed to consume pork only on special occasions.

New Englanders prefer brown eggs. As we mentioned in our chapter on the application of chemometrics to flavor studies {Zervos and Albert, op. cit.}, year-old decayed whale meat is considered a delicacy in Iceland, and the off-flavor, as the non-aficionados would dub it, is essential to the enjoyment of this delicacy. As Williams et al. {Anthony P. Williams, Clive de W. Blackburn, and Paul Gibbs, "Techniques to improve the safety and extend the shelf life of foods", Food Science and Technology Today, 6(3), 148-151 (1992)} state, "... the criteria used to describe deterioration should be relevant to the cause for which the food would become unfit." In other words, good quality depends on both subjective and objective factors.

Surveys are undertaken periodically to determine just what the salient factors are in determining food preferences. A study of food consumption trends in the United Kingdom has been described by A.M. Rees {Ann Maree Rees, "Factors influencing consumer choice", Journal of the Society of Dairy Technology, 45(4), pp. 112-116 (1992)}. She particularly focuses on the influence of such new phenomena as housewives' working, the introduction of microwave ovens, the increase in snacking, and the obsession with "natural" food. In the United States, the results of a survey of factors affecting food consumption have been described by C.I. Waslien {Carol I. Waslien, "Factors influencing food selection in the American diet", pp. 239-269, in "Advances in Food Research", Volume 32, edited by C.O. Chichester and B.S. Schweigert, Academic Press, New York (1988)}. She describes the effects of age on food preference and food attitudes and on meal patterns. The complex issues of sex differences vis-a-vis the senses of taste and smell, food values and attitudes, and actual food selection are also addressed. Moreover, racial and ethnic patterns are discerned. She describes the dramatic alterations in the make-up of U.S. society, in age distribution, and in life styles. In her conclusion, she states that "The marked changes ... demand research methodologies that can consider multiple factors simultaneously [italics ours] and are still rapid enough to make predictions before the population has changed again." She also calls for techniques that can take advantage of computerized statistical procedures on the diverse populations needed to assess the full range of food-choice factors.

The "hard sciences", in contradistinction to the "soft sciences" like economics, would of course also be expected to contribute to food technology and the understanding and prolonging of shelf life. CHEMICAL ENGINEERING is replete with multivariable approaches that apply to foods. A paper by Saguy and Karel {I. Saguy and M. Karel, "Modeling of quality deterioration during food processing and storage", Food Technology, 34(2), pp. 78-85 (1980)} on modeling quality deterioration during processing and storage exemplifies how chemical engineers approach the complex situations frequently encountered in food technology. A catalog of typical statistical routines is presented, including multiple linear regression and stepwise regression. Emphasis is on the kinetics of the decomposition of food.
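That kinetic emphasis can be made concrete with a small sketch: assume first-order loss of a quality attribute with an Arrhenius temperature dependence, and shelf life becomes the time for the attribute to fall to a chosen threshold. The activation energy, reference rate, and threshold below are hypothetical values chosen only for illustration.

```python
# Sketch of the kinetic view of quality loss: first-order decay of a quality
# attribute with an Arrhenius temperature dependence.  All parameters assumed.
import numpy as np

EA = 80_000.0                   # activation energy, J/mol (assumed)
R  = 8.314                      # gas constant, J/(mol K)
K_REF, T_REF = 0.015, 298.15    # rate constant 0.015 per day at 25 C (assumed)

def rate(T_celsius):
    T = T_celsius + 273.15
    return K_REF * np.exp(-EA / R * (1.0 / T - 1.0 / T_REF))

def shelf_life_days(T_celsius, threshold=0.7):
    """Days for a first-order quality attribute to fall to `threshold` of its initial value."""
    return -np.log(threshold) / rate(T_celsius)

for temp in (4, 20, 32):
    print(f"{temp:2d} C: shelf life ~ {shelf_life_days(temp):5.0f} days")
```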

Lima and Cal-Vidal {op. cit.}, by their model-based equations, also provide a useful illustration of the chemical engineering approach to coping with all the variables that must be considered in food processing, even though they may fail to go far enough.

Even a certain mathematical technique from another of the "hard sciences", specifically from THEORETICAL BIOLOGY, may find application in shelf life work, especially in its chemical analysis and processing aspects. Called the "genetic algorithm", this technique, whose goal is to seek optimum values for instrumental and processing variables, is being actively explored for future routine application. Lucasius and Kateman {C.B. Lucasius and G. Kateman, "Genetic algorithms for large-scale optimization in chemometrics: an application", Trends in Analytical Chemistry, 10(8), 254-261 (1991)}, in order to suggest fruitful applications of the genetic algorithm, address the important chemical analysis problem of finding the wavelengths to use to estimate a complex mixture of chemicals, each having a different individual spectrum, i.e., each having a different functional relation between the absorbance (in the example used, of ultraviolet light) and the wavelength. The mixture dealt with in this feasibility study consisted of the four RNA nucleotides: adenylic, cytidylic, guanylic, and uridylic. The criterion of "goodness" is the selectivity, which can be calculated via the set of so-called Lambert-Beer equations and which is uniquely and directly calculable for a given choice of wavelengths.

In the initial stage of the genetic algorithm, strings of 1's and 0's are generated at random. Because 36 absorption wavelengths had to be considered, each string was composed of 36 1's and 0's, where a 1 at the nth position in a string indicates that the nth wavelength is used, while a 0 in that position indicates that the corresponding wavelength is not to be used. Each string is thus a coding for which wavelengths are to be used concurrently and which are not. For each of the starting set of 36-bit-long strings, a selectivity value, a measure of "goodness", can be straightforwardly computed. Since this selectivity is simply a numerical value, and not something so complicated as the preference vector alluded to above, the initial strings can be ranked in order from most selective to least selective. From this initial set, a certain number are selected with a probability proportional to the selectivity, and these selected ones are allowed to "breed", creating a new generation. Then the strings in this new generation are evaluated and a selectivity-based selection is again made to determine who shall reproduce the next generation. The algorithm resorts to this selection procedure, instead of picking, say, the top 35% of the strings, to ensure that every string has a chance, however small, of contributing to the ultimate goal of finding that string of 1's and 0's that yields the very highest numerical value for the selectivity. Also, such a probabilistic approach prevents genetic dead ends, where the same situations keep recurring: in the genetic algorithm, something new and invigorating can always be introduced by chance alone.


In mimicking nature, the genetic algorithm resorts to analogs of the crossovers and mutations found in genes. In crossovers, two strings are split at some random point along their respective lengths, say between positions 29 and 30, and a pair of new strings is generated: (a) the 29-bit portion of the first string is coupled to the 36 minus 29 = 7-bit segment of the second, and (b) the 7-bit segment of the first string is coupled to the 29-bit segment of the second string. As an analog to mutations, a certain number of bits are reversed at random, 1's becoming 0's and 0's becoming 1's. As each generation is created, the strings become associated with higher and higher selectivity values. However, since the process is random, it is interesting to note that you can get different alleged optima for different random starting sets of strings. A common-sense solution to this non-reproducibility problem is to resort to this efficient genetic algorithm to zero in on a good "fertile region" and then to let some more tedious optimization or "hill-climbing" algorithm slog through the computations to find the unique optimum.

The food technologist may wonder why all this bit-playing is necessary, when it is a matter of just trying out all the possibilities and identifying that one set of conditions, that one set of wavelengths, that yields the highest value. For some of the most interesting and important optimization problems, such a complete, exhaustive search may entail more computer time than is left in the life of the planet Earth. The genetic algorithm is a powerful general-purpose search strategy that has its greatest strength in those large-scale problems that have no analytical solutions. The genetic algorithm, of course, could effectively be applied to the optimization problems of food processing.
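A compact sketch of the whole loop, selection in proportion to fitness, one-point crossover, and occasional mutation, is given below. The "selectivity" function is a toy stand-in for the Lambert-Beer calculation, the informative wavelengths are invented, and the strings are shortened from 36 to 12 bits to keep the example small.

```python
# Compact sketch of the genetic algorithm described above.  The bit strings say
# which wavelengths are used; the "selectivity" function is a toy stand-in for
# the Lambert-Beer calculation, and the string length is cut to 12 bits.
import random
random.seed(0)

N_BITS, POP, GENERATIONS = 12, 20, 30
GOOD = {1, 4, 7, 10}                      # pretend these wavelengths are informative

def selectivity(bits):
    """Toy fitness: reward using the informative wavelengths, penalize clutter."""
    used = {i for i, b in enumerate(bits) if b}
    return len(used & GOOD) - 0.2 * len(used - GOOD)

def breed(parent_a, parent_b):
    cut = random.randrange(1, N_BITS)                 # one-point crossover
    child = parent_a[:cut] + parent_b[cut:]
    if random.random() < 0.1:                         # occasional mutation
        i = random.randrange(N_BITS)
        child[i] = 1 - child[i]
    return child

population = [[random.randint(0, 1) for _ in range(N_BITS)] for _ in range(POP)]
for _ in range(GENERATIONS):
    # Fitness-proportional selection: better strings are more likely to breed,
    # but every string keeps some small chance of contributing.
    fitness = [selectivity(s) for s in population]
    floor = min(fitness)
    weights = [f - floor + 0.1 for f in fitness]
    parents = random.choices(population, weights=weights, k=2 * POP)
    population = [breed(parents[2*i], parents[2*i + 1]) for i in range(POP)]

best = max(population, key=selectivity)
print("best string:", best, " selectivity:", round(selectivity(best), 2))
```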

As such, the genetic algorithm would provide an alternative to the already well-established bag of tricks employed by the discipline known as MANAGEMENT SCIENCE. Under the rubric of "operations research", this collection of procedures seeks, inter alia, to uncover those values for a set of variables that optimize the value of some function. This function is called the "objective function", equivalent to the selectivity in the above genetic algorithm. This function provides a measure of goodness. The search process strives to achieve the highest goodness, subject to the constraints imposed by limited resources. A frequent exercise is to calculate the least-cost nutritionally adequate diet: given the nutrient composition of available foods, the price of these foods, and the minimum daily requirements, the question is "What is the lowest cost diet consistent with the constraints of nutritional adequacy?"

The well-developed arsenal in operations research {Frederick S. Hillier and Gerald J. Lieberman, "Operations Research", Holden-Day, San Francisco (1974)} includes the techniques called linear programming, non-linear programming, dynamic programming, integer programming, PERT (Program Evaluation and Review Technique), and CPM (Critical Path Method). Also part of operations research is a classic repertoire of problems addressed by these weapons, among them: the warehouse problem, the knapsack problem, and the optimal-reorder-frequency problem. While not strictly in the domain of statistics, many of these areas of operations research rely heavily on statistics to model reality and to establish values for key parameters along with estimates of the uncertainty in these parameters. The following is a sample of some of the shelf life problems where operations research approaches may help provide the important answers:

* The simplex method {Lloyd Currie, op. cit.}, which was alluded to previously for optimizing experimental operating conditions, is a specialized development for chemists, based on the fundamental linear programming prescriptions of operations research.

* The processing of food in preparation for storage should be as rapid as possible. PERT and CPM help identify bottlenecks in production processes and allow the manager to focus on streamlining just those steps that make a difference in the overall time required.

* The insight into process sequencing provided by PERT and CPM can be applied to the HACCP [Hazard Analysis of Critical Control Points] monitoring of a food production process, such as fish packing. HACCP is a form of quality control that is vital when rare but serious defects are involved. A good example would be contamination by Salmonella; the infestation in a food is unlikely, but when it does occur it can be lethal. Mere statistical sampling will not work well, since to detect something that has a probability of .001 of happening will require prohibitively large samples. Instead, HACCP strives to identify those features of a process most liable to cause serious defects. It then focuses on assuring that those features are operating properly. An illustration of HACCP in practice would be the daily verification that the fish packers in an assembly line are wearing intact gloves.

* The retail grocer doesn't want to have too much product waiting on the shelves. The Japanese have introduced the concept of "just in time" delivery in shipping parts to an assembly site, which basically allows the manufacturer to keep no parts inventory at all. The grocer of course needs some inventory of the food product, even though the ideal situation would be for the shipment of a unit of product to arrive just as the customer is about to buy. The strategy that is adopted by the grocer must take into account the fact that perishable products have a utility diminishing with time and an availability that also diminishes with time as the consumer purchases the product. This type of problem is addressed in operations research under the general heading of inventory models.

* The classic warehouse problem can help the food manufacturer decide from which storage facilities to ship to which retail outlets.

Thus, many fields can contribute to the effective investigation and control of shelf life.

IMPLEMENTING THE STATISTICAL PROCEDURES VIA COMPUTER SOFTWARE

The value of multivariate statistics and of the multivariate approach has been demonstrated. However, unless the food scientist can actually apply the techniques, he/she is doomed to be a passive onlooker. It is obvious that there is very little that a scientist can do in the realm of multivariate statistics without resorting to a computer. A few simple multivariate approaches that require little if any computer power might be tried during preliminary data exploration:

(a) Simply graphing one variable vs. another can suggest trends.

(b) When the resulting so-called scatterplot is enclosed in a "convex hull", the limits of the variables are easier to spot. The convex hull of a set of points in a plane is simply the smallest polygon that contains all the points and has all its interior angles less than 180 degrees. If you have N points in your plot, you can connect each point with every other point (there are N x (N-1)/2 such lines in all) and then trace out the outermost lines. At each corner of the convex hull is a data point, and any non-corner data points are contained within the hull.

(c) You can plot one variable vs. another for several different categories, and the respective convex hulls can provide boundaries for categorizing new, unknown samples (a small computational sketch follows this list).

(d) You can fit a straight line to the scattered points, or you can first transform the values (take the logarithm or raise to a power, for example) and then fit a straight line, either via conventional regression formulas or by a minimax procedure. {Robert J. Blodgett, U.S. Food and Drug Administration, Washington, DC 20204, personal communication. This minimax procedure is not generally known but offers an alternative to the ordinary regression procedure, which makes certain demands on the behavior of the data. The undemanding minimax-fitted straight line provides the minimum maximum "miss" or individual error: for each point in your set of plotted points, note how much this minimax line misses it, as measured vertically. You will have N such values or misses; the worst of these N deviations is the maximum miss for the minimax line. If you draw any straight line other than the minimax line, it will miss at least one point by at least this maximum amount. To get the minimax line, draw the convex hull and find which corner lies farthest from the opposite side of the hull. Then draw a line parallel to that side, passing halfway between the corner and the side. This line is the minimax line, useful for spotting outlying data points and for indicating possible ranges for trends.}
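When a computer is available after all, items (b) and (c) are easy to automate. The sketch below is only an illustration, with invented data, and assumes the SciPy library: it picks out the corner points of the convex hull for one category of samples and then asks whether a new, unknown sample falls inside that category's hull.

    import numpy as np
    from scipy.spatial import ConvexHull, Delaunay

    # Hypothetical (x, y) measurements for samples known to belong to one category.
    category_points = np.array([[1.0, 2.0], [2.5, 3.5], [4.0, 2.2],
                                [3.0, 0.8], [1.5, 1.0], [2.8, 2.0]])

    hull = ConvexHull(category_points)
    corners = category_points[hull.vertices]      # the "corner" data points, in order around the hull
    print("Hull corners:\n", corners)

    # Item (c): classify a new sample by asking whether it lies inside this category's hull.
    triangulation = Delaunay(category_points)
    new_sample = np.array([2.0, 2.0])
    inside = triangulation.find_simplex(new_sample) >= 0
    print("New sample inside the category's hull:", inside)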

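The minimax fit of item (d) can also be computed numerically rather than by the geometric construction in the note: minimizing the largest vertical miss is a small linear program in the intercept a, the slope b, and the worst miss t. The sketch below, again an illustration with made-up data and the SciPy linear-programming routine, shows the formulation.

    import numpy as np
    from scipy.optimize import linprog

    # Hypothetical (x, y) data to be fitted by y = a + b * x.
    x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([1.1, 1.9, 3.2, 3.8, 5.3, 5.9])

    n = len(x)
    # Decision variables: [a, b, t] where t is the largest vertical miss.
    # Constraints:  a + b*x_i - y_i <= t   and   y_i - a - b*x_i <= t   for every point.
    A_ub = np.vstack([np.column_stack([np.ones(n), x, -np.ones(n)]),
                      np.column_stack([-np.ones(n), -x, -np.ones(n)])])
    b_ub = np.concatenate([y, -y])
    c = np.array([0.0, 0.0, 1.0])                      # minimize t only

    result = linprog(c, A_ub=A_ub, b_ub=b_ub,
                     bounds=[(None, None), (None, None), (0, None)],
                     method="highs")
    a, b, worst_miss = result.x
    print(f"minimax line: y = {a:.3f} + {b:.3f} x, worst vertical miss = {worst_miss:.3f}")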
That's pretty much all you can do without computer power, but fortunately the requisite computer power, both hardware and software, is available for applying multivariate statistics techniques to data. There is, however, both good news and bad news. The good news is that there are literally hundreds of software packages that include multivariate subroutines, and these proliferating software packages can be run on a personal computer: mainframe computers are not needed except for truly enormous data sets. That's also the bad news: there are scores of multivariate statistics software packages. The uncontrolled growth of such packages makes selection difficult, and some horror stories exist about the quality and validity of some of the programs in some of the packages. See, for example, the article by G.E. Dallal {Gerard E. Dallal, "Statistical microcomputing—like it is", The American Statistician, 42(3), 212-216 (1988)}, where 5 different packages are put through their paces and found wanting. These were only 5 out of the more than 200 packages that Dallal claims are available for the IBM PC alone. {Note that the naming of any commercial make of computer or software package is in no way to be construed as an endorsement. No conclusions may be drawn as to the relative merits of similar competing products.} Caution must always be exercised in the blind use of any technique, and the very fact that a computer does the computations in no way relieves the scientist of the responsibility of knowing what's going on. Searle {Shayle R. Searle, "Statistical Computing Packages: Some Words of Caution", The American Statistician, 43(4), 189-190 (1989)} issues the following warning: "With little or no knowledge of statistics, people with data can so easily have the data processed by a package that does the arithmetic for whatever sophisticated analysis they choose, whether the choice [of type of analysis] is appropriate or not." He recounts some horror stories of computer packages being abused. As far as mainframe packages are concerned, five are discussed in a book by Wolff and Parsons. {Diane D. Wolff and Michael L. Parsons, "Pattern Recognition Approach to Data Interpretation", Plenum Press, New York (1983)}
Of these, at least three (among them SAS [Statistical Analysis System] and SPSS [Statistical Package for the Social Sciences]) are available for personal computers and are well-documented and supported. We ourselves use the software package called Statgraphics, and we have seen references in British publications to a package called Genstat. Some shopping may have to be done, but having a software package is the sine qua non for doing multivariate statistics. It is necessary but not sufficient, however. A willingness to explore and to learn is required, and a cheerful attitude in the face of adversity. Goethe, who was born before the golden age of multivariate statistics and computers and who was habitually successful in everything he undertook, always advised any friends fearful of failure in some enterprise: "Vertraue und verhandle!" ("Have faith and just do it!")