Applied Ergonomics 62 (2017) 19–27. http://dx.doi.org/10.1016/j.apergo.2017.02.005
Design with limited anthropometric data: A method of interpreting sums of percentiles in anthropometric design
Thomas J. Albin
High Plains Engineering Services, USA
Article history: Received 22 March 2016; received in revised form 10 February 2017; accepted 10 February 2017

Abstract
Occasionally practitioners must work with single dimensions defined as combinations (sums or differences) of percentile values, but lack information (e.g. variances) to estimate the accommodation achieved. This paper describes methods to predict accommodation proportions for such combinations of percentile values, e.g. two 90th percentile values. Kreifeldt and Nah z-score multipliers were used to estimate the proportions accommodated by combinations of percentile values of 2–15 variables; two simplified versions required less information about variance and/or correlation. The estimates were compared to actual observed proportions; for combinations of 2–15 percentile values the average absolute differences ranged between 0.5 and 1.5 percentage points. The multipliers were also used to estimate adjusted percentile values that, when combined, estimate a desired proportion of the combined measurements. For combinations of two and three adjusted variables, the average absolute difference between predicted and observed proportions ranged between 0.5 and 3.0 percentage points. © 2017 Elsevier Ltd. All rights reserved.
Keywords: Adding percentiles; Subtracting percentiles; Anthropometric accommodation; Indicator function variables
1. Introduction A general goal of designers is to design a product that fits a known proportion of a population of intended users; practitioner ergonomists have a closely related goal of understanding whether the dimensions of some object, such as a chair or a workstation, are sufficient to accommodate a known percentage or percentage range of a population of individuals who use those objects. Frequently the only anthropometric data available are tables of percentile values for the several anthropometric variables of interest, for example, 90th, 50th and 10th percentile values. These single percentile values do not provide sufficient data to allow use of multivariate techniques to estimate accommodation, e.g. measurements on variables of interest for each individual in a sample. Often the dimensions of interest, such as clearance spaces under a desk, or the height above the floor of a seated individual's elbows, may not have been measured directly in the anthropometric datasets, but can be estimated by combining two or more percentile values of the dimensions that have been measured. Accommodation on a single dimension is generally defined as
the proportion of an intended user population who fit the dimensions specified in a design. As a simple example, the design of an opening for a door whose height is set at a height corresponding to the 95th percentile of stature for males in an intended user population will generally accommodate at least 95 percent of all male and female users, as males are generally taller than females. A percentile value, by definition, describes the fit, or accommodation, on a single anthropometric variable for the intended user population. Simultaneous accommodation on two or more anthropometric variables has been defined (Roebuck, 1995) as “moving beyond evaluation of what specific percentiles are accommodated to evaluation of what percentage of persons in a sample (and thus, by implication) the percentage of an entire population will be accommodated”. An expanded definition of multivariate accommodation states: “In a multivariate problem, accommodation may be assessed on many device parameters, but a user can be disaccommodated only once. For instance, if user 1 is disaccommodated on parameter A, user 1 is considered disaccommodated even if the user is accommodated on parameters B and C.” (Garneau, 2009) Some interesting examples of accommodation on a single dimension are found in ISO 11064-4 (ISO 11064-4, 2013). There, 5 of
15 seated and standing workspace dimensions are specified as combinations of percentile values of different anthropometric variables. If one were to attempt to design a work surface conforming to that standard and that would accommodate a 90th percentile individual's seated elbow level above the floor, one might estimate the 90th percentile work surface height for the intended users by adding the 90th percentile values of popliteal height and elbow rest height, seated. However, as Robinette and McConville (1981) have succinctly noted, “the sum of the percentiles is not the percentile of the sums”. That is, the sum of the 90th percentile values of elbow rest height and popliteal height is generally not equivalent to the 90th percentile of all such summed values. Similarly, Gordon et al. (1997) state “… percentile models always include less than the intended population proportion when the design problem has more than one body dimension critical to fit and function”. Consequently the practitioner is left uncertain as to what “the percentile of the sums” is; that is, what proportion of individuals would be accommodated by a single dimension estimated by summing the percentile values? In this paper we show that there are special cases of percentile models that, in fact are never less than the intended population proportion when they are combined. These are the special cases where a single specific dimension is defined as the combination of two or more percentile values. Kreifeldt and Nah (1995) describe a method that may be used to calculate the actual proportion of all individually summed measurements less than or equal to the sum of two equally valued percentile values. In this method, the two percentile values, when combined, form a new, univariate variable. 
The Kreifeldt and Nah method predicts the actual proportion accommodated on the combined variable in terms of a multiplier of the normal score, or z score, associated with the equally valued percentiles that are combined. For example, while two 90th percentile values each have an associated z score of 1.282, their sum (P90A + P90B) has an associated z score equal to some multiple of 1.282, where that multiple is always equal to or greater than 1.0. Consequently, for these special cases, the proportion accommodated by the sum of two percentile values is never less than the intended population proportion.

Calculating the Kreifeldt and Nah multiplier (K&N) requires knowledge of the standard deviation of each variable combined as well as the correlation between them. Often these data (the standard deviations, the correlation values, or both) are not available for the user population of interest. This paper describes how to calculate the Kreifeldt and Nah multiplier as it was originally defined, and suggests methods to calculate the multiplier in situations in which the individual standard deviations, the exact correlation between the two variables, or both are unknown.

Consider the case where it is desired to know the height of the eyes above the floor while seated, but where that dimension has not been measured directly. However, it can be estimated by combining the 90th percentile values of popliteal height and eye height, seated. Since the combination is a univariate parameter, an individual is considered to be accommodated if his or her individually summed measurements of popliteal height and eye height, seated are less than or equal to the combined 90th percentile values of popliteal height and eye height, seated. The Kreifeldt and Nah method described in this paper enables the practitioner to estimate what proportion of the intended users' individually summed measurement values will be accommodated; that is, will be less than or equal to the summed 90th percentile values.
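The conversion described above (percentile to z score, z score times multiplier, back to a proportion) can be sketched in a few lines. This is a minimal illustration that assumes the combined variable is approximately normal; the function name is hypothetical, and the multiplier values 1.0 and 1.2 are illustrative inputs rather than values computed from data.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution: mean 0, SD 1


def proportion_for_combined_percentiles(percentile: float, multiplier: float) -> float:
    """Estimate the proportion of individual sums <= the sum of equal percentile values.

    percentile -- the common percentile being combined, as a fraction (0.90 for the 90th)
    multiplier -- a Kreifeldt and Nah z-score multiplier (assumed to be >= 1)
    """
    z = STD_NORMAL.inv_cdf(percentile)       # z score of the single percentile
    return STD_NORMAL.cdf(multiplier * z)    # proportion for the combined variable


# A multiplier of exactly 1 leaves the proportion unchanged ...
print(f"{proportion_for_combined_percentiles(0.90, 1.0):.3f}")
# ... while any larger multiplier pushes it above the nominal 90 percent.
print(f"{proportion_for_combined_percentiles(0.90, 1.2):.3f}")
```

Because the multiplier only rescales the z score, percentiles below the 50th move in the opposite direction: the proportion falling below the sum of two 10th percentile values is smaller than 10 percent.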
Since the value of the Kreifeldt and Nah multiplier is never less than one, the proportion accommodated by the sum of the 90th percentile values will never be less than 90 percent. The second case describes how the Kreifeldt and Nah multiplier
may be used to estimate which equal percentile values should be summed in order to accommodate a specified proportion of the intended users. For example, which specific percentile values of popliteal height and eye height, seated must be combined in order to accommodate exactly 90 percent of the intended users? 1.1. Examples of use of the Kreifeldt and Nah multiplier As an example of the employment of the z score multiplier, suppose that a designer is working from a contract specification that specifies that the design must accommodate the central ninety percent of the range of seated eye heights of the intended users, specifically between the 95th and 5th percentile values for seated eye height above the floor. She has only percentile values with which to work. Although seated eye height above the floor has not been directly measured, it can be estimated by adding the 95th percentile values of popliteal height and eye height, seated. Example 1. It is not clear to the designer whether the sum of the 95th percentile values will satisfy the upper limit of the range of intended users, that is, is the sum equal to or greater than the 95th percentile value of the combined measurements? Using the method described in this paper, she determines the Kreifeldt and Nah multiplier to be 1.22. The z score associated with the 95th percentile is 1.645. Then the z score of the estimated proportion of users whose summed measurements will be less than or equal to the sum of the 95th percentile values of popliteal height (seat height) and eye height, seated (eye height), is equal to 1.22 times 1.645, or 2.007. A quick reference to a table of normal distribution values indicates that approximately 97.8 percent of all individuals will have summed measurements less than or equal to the two summed 95th percentile values. Example 2. The designer then estimates the lower limit of the range of adjustment values by adding the 5th percentile values of popliteal height and eye height, seated. 
The z score multiplier is again 1.22, and the z score associated with the 5th percentile is −1.645. Multiplying, she determines that the z score for the sum of the 5th percentile values will be −2.007. Reference to the normal table indicates that about 2.2 percent of individuals will have summed measurements less than the sum of the 5th percentile values.

Example 3. The designer now has estimates of the upper and lower limits of the adjustment range, obtained by summing the 95th and the 5th percentile values, respectively, of popliteal height and eye height, seated. The proportion accommodated by the range of adjustment between those two values is then obtained by subtracting the proportion of individuals whose measurements are less than the lower limit from the proportion whose measurements are less than the upper limit, or 97.8 percent minus 2.2 percent; the range would be expected to accommodate 95.6 percent of the intended users, well in excess of the required 90 percent accommodation.

Example 4. The designer notes that the contract explicitly requires that the upper limit of the range of accommodation must be the 95th percentile value of the combined measurements and the lower limit must be the 5th percentile. Additionally, there are opportunities to reduce the cost of the product if the dimensions of the upper and lower adjustment limits can be reduced slightly. Consequently, the designer uses the z score multiplier to determine the percentile n such that combining the nth percentile values of seat height and eye height gives an accurate estimate of the 95th percentile value of the combined measurements, and likewise for the 5th percentile value. She does this by dividing 1.645 by 1.22, which gives an adjusted z score of 1.348. A z score of 1.348 corresponds to the 91st percentile, so combining the 91st percentile values of seat height and eye height will give an estimate of the 95th percentile value of
seated eye height above the floor. By symmetry, the lower limit has an adjusted z score of −1.348, which corresponds approximately to the 9th percentile. The designer then concludes that, to accommodate the central 90 percent of the intended users for seated eye height above the floor, with the 95th percentile of the combined measurements as the upper limit and the 5th percentile of the combined measurements as the lower limit, she must estimate the upper limit by summing the 91st percentile values of popliteal height and eye height, seated. The lower limit is equal to the sum of the 9th percentile values of popliteal height and eye height, seated.

1.2. The Kreifeldt and Nah multiplier
Kreifeldt and Nah (1995) define the z-score multiplier in terms of two quantities. The first is K, the ratio of the smaller standard deviation of the two variables to the larger, and the second is r, the correlation between the two variables. K can take any value between 0 and 1, while r can take any value between −1 and 1. Table 1 is a sensitivity analysis of z-score multipliers for the range of possible values of K and r. Note that the value of the z-score multiplier is never less than 1.

Kreifeldt and Nah give separate equations for the sum and difference of two equal value percentiles. For sums, the z-multiplier is

$$\text{z-multiplier} = \frac{1 + K}{\sqrt{1 + K^{2} + 2rK}},$$

where K is the ratio of the smaller standard deviation of the two variables to the larger and r is the value of the correlation between them.

The Kreifeldt and Nah method requires knowledge of the individual standard deviations of the two variables whose percentile values are combined, as well as the correlation between them. Unfortunately, this information may not be included in the anthropometric data available to the practitioner.

1.2.1.
Variances and standard deviations in the simplified Kreifeldt and Nah z-score multipliers The Kreifeldt and Nah multiplier can be determined for a combination of equal percentile values of any pair of variables when the individual variables’ sample or population variances are unknown. This can be accomplished by transforming the anthropometric measurements into indicator function variables. An indicator function is defined to have a value of 1 if the individual measurement meets some criterion test, e.g. if an individual measurement value is less than or equal to some criterion, and a value of 0 if it is not. For example, if the indicator function criterion is whether or not an individual measurement on variable A is less than or equal to the 90th percentile value of variable A, then the indicator function will have a value of 1 for all individuals whose measurements on variable A are less than or equal to the 90th percentile value for variable A. The probability that any individual measurement for variable A will be less than or equal to the 90th percentile value is 0.90, and the probability that the indicator function will have a value of 1 is
Table 1 Z-score multipliers as a function of all possible values of K and for all values of r greater than −1. Note: A correlation of −1 results in an undefined z-score multiplier, as it results in division by zero. The limit of the z-score multiplier approaches infinity as the correlation value approaches −1.

Correlation value    K = 1     K = 0.5   K = 0
−0.999               44.721    2.994     1.000
0                    1.414     1.342     1.000
1                    1.000     1.000     1.000
also 0.90. The variance of an indicator function is equal to the probability that the event occurs multiplied by the probability that it does not occur; in the example, the variance of A = 0.90 × (1 − 0.90). Hence the variances of equal value percentiles are all equal when transformed into indicator function variables, so that K = 1. This simplifies the Kreifeldt and Nah multiplier to

$$\text{z-multiplier} = \frac{2}{\sqrt{2 + 2r}},$$

where r is the correlation between the two variables combined.

1.2.2. Correlation in the Kreifeldt and Nah z-score multiplier
It is sometimes the case that the specific correlation between the two variables is also unknown. In this paper an estimate of the average of all possible correlation values was substituted for the actual correlation value. As an example of the average of all possible correlation values, there are approximately 130 variables in the ANSUR database, and approximately 8400 possible correlation values for pairs of variables. The question of interest is then whether a modified Kreifeldt and Nah multiplier using the average of all those correlation values, or an estimate of it, will produce useful results.

1.2.3. A single form of the Kreifeldt and Nah multiplier instead of two
While the original Kreifeldt and Nah multiplier has two different equations, one for addition and a second for subtraction, the modified approach utilizes only the addition form. Multiplying the variable to be subtracted by negative one and then adding it to the second variable is equivalent to subtraction. There are two notable effects of this change. First, it is important to note that, when this method of subtraction is used, the percentile values of the variable that is subtracted change. For example, when multiplied by −1, the former 10th percentile value becomes the 90th percentile value of the negative values. It is this 90th percentile value of the negative values that should be added to the 90th percentile value of the other variable.
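The sum form of the multiplier from Section 1.2 and its simplified form from Section 1.2.1 can be sketched as follows. This is a minimal illustration; the function names are hypothetical. Handling a difference by passing the negated correlation to the simplified form is an assumption based on the negation approach of Section 1.2.3, since negating one variable reverses the sign of its correlation with the other.

```python
import math


def kn_multiplier_sum(sd_a: float, sd_b: float, r: float) -> float:
    """Sum form: (1 + K) / sqrt(1 + K^2 + 2 r K), with K = smaller SD / larger SD."""
    k = min(sd_a, sd_b) / max(sd_a, sd_b)
    return (1 + k) / math.sqrt(1 + k * k + 2 * r * k)


def kn_multiplier_simplified(r: float) -> float:
    """Simplified form with equal (indicator-variable) variances, i.e. K = 1."""
    return 2 / math.sqrt(2 + 2 * r)


def indicator_variance(p: float) -> float:
    """Variance of an indicator variable that equals 1 with probability p."""
    return p * (1 - p)


# Two cells of Table 1: K = 0.5 with r = 0, and K = 1 with r = -0.999.
print(f"{kn_multiplier_sum(2.0, 1.0, 0.0):.3f}")
print(f"{kn_multiplier_sum(1.0, 1.0, -0.999):.3f}")

# Sum versus difference for the same pair, using a hypothetical correlation of 0.3.
print(f"{kn_multiplier_simplified(0.3):.3f} {kn_multiplier_simplified(-0.3):.3f}")
```

A correlation of exactly −1 with K = 1 makes the denominator zero, so both functions raise `ZeroDivisionError` in that case, matching the note to Table 1.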
A quick method to determine the changed percentile value after multiplying by −1 is to subtract the original percentile value from 100. The result is the percentile value for the negative values. Then, for example, 100 minus the original 90th percentile equals the 10th percentile of the negative values.

1.3. Three studies of the Kreifeldt and Nah multiplier
The first study in this paper compares estimated and observed proportions of users whose summed individual measurements are less than or equal to the sums and differences of percentile values for 20 pairs of anthropometric variables, where the estimates are made with the Kreifeldt and Nah z-score multiplier.

In the second part of the study, ten sets of variables, each consisting of 2, 3, 4, 5, 10 and 15 variables, were randomly drawn from the ANSUR1 female dataset (Gordon et al., 1989). The Kreifeldt and Nah (average) multiplier was used to predict the proportion of all sums less than or equal to the sum of equal-valued percentiles for each combination of 2, 3, 4, 5, 10, and 15 variables. The predicted proportions were compared with the observed proportions for all 60 cases.

In the third part of this study, the Kreifeldt and Nah multiplier was used to determine adjusted percentile values whose summed values would accurately estimate a desired proportion of the summed measurements of each individual in the intended user population. The 90th, 70th, 50th, 30th and 10th percentile values of combinations of twenty sets of two anthropometric variables and twenty sets of three anthropometric variables were estimated as sums of the adjusted percentile values, where the adjusted
percentile values were determined using the Kreifeldt and Nah multipliers.

In summary, this paper studies three methods of accurately estimating the accommodation on univariate variables achieved by combining pairs of equal valued percentiles using the Kreifeldt and Nah multiplier. The accuracy of the predictions is demonstrated for combinations of as many as fifteen different variables. Each method is capable of predicting accommodation from a combination of percentile values, but the simplified Kreifeldt and Nah multipliers require less information.

2. Study one. Accuracy of proportion estimates for combinations of two variables into a single variable

2.1. Method
Twenty pairs of anthropometric variables were randomly drawn from the ANSUR1 (Gordon et al., 1989) female database. The first 130 measurement variables in ANSUR1 were numbered sequentially from 1 to 130. Pairs of random numbers, each between 1 and 130, were generated. The first twenty non-duplicative pairs of numbers were picked, and the corresponding pairs of variables were selected for combinations of two variables. The correlation value was computed for each of the twenty pairs. The 90th, 70th, 50th, 30th and 10th percentile values were determined for each variable in each pair, as were their sums and differences, e.g. the 70th percentile of A (P70A) added to or subtracted from the 70th percentile of B (P70B).

2.2. Criterion measurements
After each of the 2208 soldiers' measurements for the pairs of variables were combined, the proportion of those combined measurements less than or equal to the sum of the appropriate percentile values was determined for the 90th, 70th, 50th, 30th, and 10th percentile values of each of the 20 pairs of variables. These observed proportions were used as the criteria against which the estimated proportions were compared.
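The criterion measurement can be sketched as below: combine each individual's two measurements, then count how many of the combined values fall at or below the sum of the two percentile values. This is a minimal illustration on synthetic data, not on ANSUR1. The means, standard deviations, sample size, random seed, interpolation rule for percentiles, and function names are all assumptions made for the example; the correlation of 0.329 is the average value reported in this paper and is used only as a plausible input.

```python
import math
import random
from statistics import NormalDist


def percentile_value(values, p):
    """Empirical percentile (p as a fraction) by linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = p * (len(ordered) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])


def observed_proportion(a, b, p):
    """Share of individuals whose a + b is <= (p-th percentile of a) + (p-th percentile of b)."""
    limit = percentile_value(a, p) + percentile_value(b, p)
    return sum(x + y <= limit for x, y in zip(a, b)) / len(a)


# Synthetic stand-in data: two correlated, normally distributed variables (hypothetical units: mm).
rng = random.Random(0)
r = 0.329
a, b = [], []
for _ in range(5000):
    u, v = rng.gauss(0, 1), rng.gauss(0, 1)
    a.append(450 + 25 * u)
    b.append(240 + 25 * (r * u + math.sqrt(1 - r * r) * v))  # correlation with a is about r

observed = observed_proportion(a, b, 0.90)

# Prediction from the simplified multiplier 2 / sqrt(2 + 2r) applied to the 90th percentile z score.
std_normal = NormalDist()
predicted = std_normal.cdf(std_normal.inv_cdf(0.90) * 2 / math.sqrt(2 + 2 * r))

print(f"observed {observed:.3f}, predicted {predicted:.3f}")
```

With real data, `a` and `b` would be the measured values of the two variables for the same individuals, and the percentile rule should match whatever convention the source tables use.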
2.3. Estimation of proportion accommodated
The proportion less than the sum of the percentile values was estimated using Kreifeldt and Nah multipliers. The multipliers were calculated in three different ways. The first was as defined by Kreifeldt and Nah (1995), using the individual standard deviations for each variable and the actual correlation between the combined variables. The second and third forms of the Kreifeldt and Nah multiplier are simplified versions. The Kreifeldt and Nah (simplified) multiplier used the actual value of the correlation between each set of variables combined; the Kreifeldt and Nah (average) multiplier used an estimate of the average correlation of all possible pairs of variables that might be combined. The average correlation value of all twenty pairs of variables was 0.329. This average correlation value for the 20 pairs of variables is within the 95 percent confidence interval for a sample of 75 observations of ANSUR1 correlation values previously reported in Albin and Molenbroek (2016).

The proportions predicted by the three versions of the Kreifeldt and Nah multiplier were then compared to the observed proportion of the 2208 individuals in the ANSUR1 female database (Gordon et al., 1997).

2.4. Results
The results of adding and subtracting the percentile values of 20 pairs of randomly chosen variables are shown in Tables 2 and 3 below.

The second column in Tables 2 and 3 shows the criterion value against which the estimates are compared in order to assess their accuracy. It is the average observed proportion of individuals whose summed measurements for each pair of variables were less than or equal to the sum of the nominal percentile values for each of the 20 pairs of variables. The third column shows the average proportion predicted to be less than or equal to the sum of percentile values using the original form of the Kreifeldt and Nah multiplier (K&N). The fourth column shows the average proportion predicted to be less than or equal to the sum of percentile values using the simplified Kreifeldt and Nah multiplier (K&N simplified). The fifth column shows the average predicted proportion using the simplified Kreifeldt and Nah multiplier and the average correlation value between all variables combined (K&N average).

The differences between the estimated average proportions for sums of two values using the original Kreifeldt and Nah multiplier (K&N) and the criterion (observed) proportions are quite small. For sums of two variables (Table 2), the average of the absolute values of the differences is 0.1 percentage point for the original formulation of the Kreifeldt and Nah multiplier, 0.5 percentage point for the Kreifeldt and Nah (simplified) multiplier and 0.4 percentage point for the Kreifeldt and Nah (average) multiplier. The average algebraic difference between the observed and estimated proportions is 0 percentage point for all versions of the multiplier. The maximum difference between the average predicted and average observed proportions was 0.8 percentage point.

For Table 3, where one variable is subtracted from the other, the average absolute values of the differences between the predicted and observed proportions are 0.4, 1.5 and 1.1 percentage points for the original formulation of the Kreifeldt and Nah multiplier, the Kreifeldt and Nah (simplified) multiplier and the Kreifeldt and Nah (average) multiplier, respectively. The average algebraic difference was 0.4 percentage point for each of the three versions of the multiplier. The maximum difference between the average predicted and average observed proportions was 2.9 percentage points.

3. Study two. Combinations of more than 2 variables

3.1. Method
Ten sets of variables were randomly selected for groups of 2, 3, 4, 5, 10, and 15 variables drawn from the ANSUR1 female dataset. The 90th, 70th, 50th, 30th and 10th percentile values were determined for each variable in each set. In the addition condition, the pertinent percentile values of each variable were added. In the subtraction condition, one randomly chosen variable's measurement value was subtracted from the sum of all the other values. The proportion less than or equal to the sum of the pertinent number of variables (e.g. P90A + P90B + P90C) was determined for both addition and subtraction of variables. Finally, the proportion accommodated was estimated using the Kreifeldt and Nah (average) technique with an average correlation value of 0.329. This estimate was then compared with the criterion value. The criterion is the observed proportion of measurement sums less than or equal to the sum of the nominal percentile values of each combination of variables.

3.2. Results
The results are shown in Tables 4 and 5 below. In Tables 4 and 5, the first six horizontal rows show the proportion of individuals’
Table 2 Estimated average proportion accommodated compared with a criterion of the observed average proportion accommodated (accommodation as sum of measurements less than or equal to the sum of the specified percentile values) for addition of two percentile values (n = 20). Column 2 is the criterion: the average observed proportion ≤ the sum of percentile values. Columns 3 to 5 are estimated average proportions ≤ the sum of percentile values.

Equal percentile   Criterion    Original   Simplified K&N          Simplified K&N
values combined    (observed)   K&N        (actual correlation)    (average correlation)
90th %ile          0.936        0.937      0.942                   0.942
70th %ile          0.737        0.737      0.743                   0.740
50th %ile          0.499        0.500      0.500                   0.500
30th %ile          0.265        0.263      0.257                   0.260
10th %ile          0.064        0.063      0.058                   0.058
Table 3 Estimated proportion of intended users accommodated compared with a criterion of the observed proportion accommodated (accommodation as sum of measurements less than or equal to the sum of the specified percentile values) for subtraction of two percentile values (n = 20). Column 2 is the criterion: the average observed proportion ≤ the sum of percentile values. Columns 3 to 5 are estimated average proportions ≤ the sum of percentile values.

Equal percentile   Criterion    Original   Simplified K&N          Simplified K&N
values combined    (observed)   K&N        (actual correlation)    (average correlation)
90th %ile          0.974        0.972      0.983                   0.987
70th %ile          0.811        0.806      0.831                   0.817
50th %ile          0.507        0.500      0.500                   0.500
30th %ile          0.198        0.194      0.169                   0.183
10th %ile          0.028        0.028      0.017                   0.013
Table 4 Average observed proportion of sums of individual measurements ≤ sums of percentile values compared to simplified Kreifeldt and Nah (average) predictions of proportion ≤ sums of percentiles for ten sets each of 2–15 variables. Columns 2 to 6 give the observed proportion for combinations of the stated percentile.

Number of variables        90th     70th     50th     30th     10th
2                          0.941    0.742    0.490    0.256    0.057
3                          0.945    0.745    0.503    0.254    0.051
4                          0.951    0.757    0.499    0.236    0.044
5                          0.954    0.764    0.497    0.227    0.039
10                         0.958    0.765    0.493    0.223    0.037
15                         0.963    0.774    0.492    0.214    0.032
Estimated proportion
using K&N (average)        0.942    0.740    0.500    0.260    0.058
Table 5 Average proportion of differences of individual measurements ≤ sums of percentile values compared to simplified Kreifeldt and Nah (average) predictions of proportion ≤ sums of percentiles for ten sets each of 2–15 variables. Columns 2 to 6 give the observed proportion for combinations of the stated percentile.

Number of variables        90th     70th     50th     30th     10th
2                          0.982    0.837    0.507    0.169    0.018
3                          0.977    0.802    0.501    0.199    0.020
4                          0.974    0.812    0.504    0.191    0.021
5                          0.984    0.830    0.504    0.163    0.010
10                         0.968    0.788    0.496    0.200    0.024
15                         0.974    0.795    0.492    0.190    0.021
Estimated proportion
using K&N (average)        0.987    0.817    0.500    0.183    0.013
combined measurements that were observed to be less than or equal to the combined percentile values for the several different numbers of variables. The number of variables combined is shown in the first column. Finally, the lowest row shows the proportion of individuals estimated to be less than or equal to the sum of the percentile values using the Kreifeldt and Nah (average) z score multiplier. The estimated value shown in the lower row may be compared to the proportion observed for the various numbers of variables combined. For example, in Table 4, the cell in row 2, column 2 describes cases where the 90th percentile values of three variables were
added. The average proportion of individuals whose summed measurements on the three variables were less than or equal to the sum of the three 90th percentile values is 0.945. The Kreifeldt and Nah (average) estimate of that proportion, 0.942, is shown in row 7, column 2.

When 2 to 15 variables were added, the average absolute difference between the predicted proportions and the observed proportions was 0.4 percentage point for 2 variables, 0.5 percentage point for 3 variables, 1.3 percentage points for 4 variables, 1.8 percentage points for 5 variables, 2.1 percentage points for 10 variables, and 2.1 percentage points for 15 variables. The average
algebraic difference was 0.3 percentage point for 2 variables, 0 percentage point for 3 variables, 0.3 percentage point for 4 variables, 0.4 percentage point for 5 variables, 0.5 percentage point for 10 variables, and 0.5 percentage point for 15 variables. The maximum absolute difference between the average predicted and average observed proportions was 4.6 percentage points.

When one variable was subtracted in combinations of 2–15 variables, the average absolute difference between the observed and predicted proportions was 1.0 percentage point for combinations of 2 variables, 1.0 percentage point for 3 variables, 0.8 percentage point for 4 variables, 0.9 percentage point for 5 variables, 1.6 percentage points for 10 variables, and 1.2 percentage points for 15 variables. The average algebraic difference was 0.3 percentage point for 2 variables, 0 percentage point for 3 variables, 0 percentage point for 4 variables, 0.2 percentage point for 5 variables, 0.5 percentage point for 10 variables, and 0.6 percentage point for 15 variables. The maximum absolute difference between the average predicted and average observed proportions was 2.9 percentage points.

4. Study three. Combining adjusted percentile values to achieve a desired proportion of users
As was observed earlier, estimating a single variable by combining two or more variables generally results in an estimate for that dimension that accommodates a larger proportion of the user population than expected. For example, estimating seated elbow height above the floor by combining the 90th percentile values of popliteal height and elbow height, seated might accommodate 94 percent of the intended users rather than the expected 90 percent. In some cases it may be desirable to be able to determine an adjusted percentile value so that, when two such percentile values are combined, they estimate a specified proportion of the intended users.
This section of the paper describes a second use of the Kreifeldt and Nah multiplier: the estimation of such adjusted percentile values.

4.1. Calculation of adjusted percentile values

The adjusted percentile value is determined by first dividing the z-score associated with the desired proportion by the Kreifeldt and Nah multiplier, then converting the resulting z-score into an adjusted percentile value using a normal distribution table. For example, the z-score associated with 90 percent of a normal distribution is 1.282. For a Kreifeldt and Nah multiplier of 1.2, the adjusted z-score would be 1.282/1.2, or 1.07. The approximate percentile corresponding to a z-score of 1.07 is the 86th percentile. In this case, about 90 percent of the intended users would be expected to have summed measurements less than or equal to the sum of the adjusted percentile values.

4.2. The accuracy of combinations of two adjusted variables

Twenty sets of pairs of variables were randomly drawn from the female soldiers’ data in ANSUR1 via the use of random numbers as described earlier. Two Kreifeldt and Nah multipliers were calculated, one using the actual correlation between the two variables in each pair and another using the estimated average value of all correlations of the pairs of two variables. Adjusted percentile values were then calculated so that, when combined into a single variable, the sum of the adjusted percentile values would be expected to accommodate 90, 70, 50, 30, or 10 percent of the intended users. The proportion of individuals whose summed measurements
were less than or equal to the sum of the adjusted percentile values was determined for each of the 20 pairs of variables.

4.3. Combination of two variables involving subtraction

In order to subtract one variable from another, the variable to be subtracted was multiplied by -1. The result was then added to the second variable, accomplishing subtraction of the two variables.

4.4. Combinations of three variables

A third randomly chosen variable was added to each of the two-variable combinations. The average correlation value was used to calculate the z multiplier first for the combination of two variables, then again to estimate the z multiplier for the addition of the third variable to the sum of the first two variables.

4.5. Combination of three variables involving subtraction

Combinations of three variables involving subtraction of one variable were also investigated. The simplified form of the Kreifeldt and Nah multiplier was used; for subtraction, the negative value of the average correlation between the variables was used. The proportion estimated using the sum of the adjusted percentile values was then compared with the actual targeted percentile values, e.g., the estimated proportion using the adjusted percentile values for 90 percent accommodation was compared to the 90th percentile value of the combined measurements.

4.6. Results

The results for sums and differences of two variables are presented in Table 6; those for sums and differences of three variables are presented in Table 7.

4.7. Combinations of 2 variables

The method of first estimating the z-score multiplier via the modified form of Kreifeldt and Nah's formula, then using that multiplier to adjust the percentile values combined in order to accommodate the desired proportion of the two summed variables, appears to work well.
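The adjustment step described in Section 4.1 can be sketched in a few lines of Python. This is a minimal illustration rather than the author's own procedure: the function name `adjusted_percentile` is hypothetical, the multiplier of 1.2 is simply the example value quoted in the text, and the standard normal distribution from Python's `statistics` module stands in for a printed normal table.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, standard deviation 1),
# used here in place of a printed normal distribution table.
STD_NORMAL = NormalDist()


def adjusted_percentile(target_proportion: float, multiplier: float) -> float:
    """Return the adjusted percentile, expressed as a proportion (0-1).

    target_proportion: desired proportion accommodated by the combined
        dimension, e.g. 0.90.
    multiplier: a z-score multiplier for the combination; how it is
        obtained is outside the scope of this sketch.
    """
    z_target = STD_NORMAL.inv_cdf(target_proportion)  # about 1.282 for 0.90
    z_adjusted = z_target / multiplier                # divide by the multiplier
    return STD_NORMAL.cdf(z_adjusted)                 # convert back to a proportion


# Worked example from the text: target 90 percent, multiplier 1.2.
example = adjusted_percentile(0.90, 1.2)
print(f"adjusted percentile is roughly the {100 * example:.0f}th")
```

With a target of 0.90 and a multiplier of 1.2, the sketch gives an adjusted z-score near 1.07 and an adjusted percentile of about the 86th, in line with the worked example. A target of 0.50 is left unchanged by any multiplier, because its z-score is zero.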
The average absolute difference between the nominal percentage accommodated (90, 70, 50, 30, or 10 percent) and the proportion of the intended users estimated by adding the two adjusted percentile values was 0.5 percentage point whether using the actual or average correlation values. The average algebraic difference was 0 percentage point. For subtraction of one of the two variables from the other, the average absolute difference between the nominal percentage accommodated and that estimated by combining the two adjusted percentile values was 1.5 percentage points whether using the actual or average correlation values. The average algebraic difference was 0.4 percentage point.

4.8. Combinations of 3 variables

The average absolute difference between the nominal percentage accommodated and that achieved by adding the three adjusted percentile values was 1.6 percentage points using the average correlation value. The average algebraic difference was 0.2 percentage point. For subtraction of one of the three variables from the sum of the others, the average absolute difference between the nominal percentage accommodated and that estimated was 3.0 percentage
Table 6
Proportion of observed individuals whose summed measurements are less than or equal to the sum or difference of adjusted percentile values, using actual and average correlation values for two variables, N = 20. Entries are average proportions estimated.

Targeted proportion   Correlation value used   Sums of two variables   Differences of two variables
0.90 (90 percent)     Actual                   0.894                   0.877
0.90 (90 percent)     Average                  0.893                   0.877
0.70 (70 percent)     Actual                   0.696                   0.689
0.70 (70 percent)     Average                  0.696                   0.704
0.50 (50 percent)     Actual                   0.499                   0.504
0.50 (50 percent)     Average                  0.498                   0.505
0.30 (30 percent)     Actual                   0.308                   0.324
0.30 (30 percent)     Average                  0.306                   0.308
0.10 (10 percent)     Actual                   0.106                   0.125
0.10 (10 percent)     Average                  0.104                   0.124
Table 7
Proportion of observed individuals whose combined measurements on three variables are less than or equal to the sum of adjusted percentile values estimated using average correlation values, N = 20. Entries are average proportions estimated.

Targeted proportion   Sums of three variables   Three variables, one subtracted
0.90 (90 percent)     0.919                     0.927
0.70 (70 percent)     0.715                     0.751
0.50 (50 percent)     0.499                     0.512
0.30 (30 percent)     0.279                     0.264
0.10 (10 percent)     0.078                     0.078
points. The average algebraic difference was 0.6 percentage point.

5. Discussion

While not intended to replace more sophisticated multivariate techniques for estimating accommodation, the techniques described in this paper provide effective methods of estimating the accommodation on single variables approximated by summing percentile values. These techniques will be useful in estimating the proportion of individuals accommodated when the practitioner has only very limited information available, such as the most extreme instance where only an average correlation value and the two percentile values to be combined are known.

The Kreifeldt and Nah multiplier, when calculated with the standard deviations and correlation between a pair of variables A and B, provides an accurate estimate of the percentile of the sum, that is, the proportion of individuals whose summed measurements on A and B are less than or equal to the sum of the nominal percentiles. However, the individual standard deviations and correlation values may not be available. In those cases, modifying the Kreifeldt and Nah multiplier by transforming the percentile values to indicator function variables (Kreifeldt and Nah, simplified, and Kreifeldt and Nah, average) obviates the problem of unknown sample standard deviations. When transformed in this way, the variance and standard deviation are a function of the nominal value of the percentiles that are combined. While this resolves the issue with regard to standard deviation, the simplified form still requires knowledge of the correlation between the combined variables. In this study, however, the Kreifeldt and Nah (average) method, which uses the modified form of the multiplier with an estimated average correlation value, also produced accurate estimates of the proportion of users accommodated.
Estimation of the proportion accommodated using the Kreifeldt and Nah (average) multiplier with the average of all possible pairwise correlation values reduces, but does not completely eliminate, the problem of unknown correlation values. An interesting aspect of the Kreifeldt and Nah (average) multiplier is that the multiplier has the same value for any combination of equal-valued percentiles. Hence a table of percentile values that included the single Kreifeldt and Nah (average) multiplier
associated with that set of anthropometric data would enable a practitioner to make useful estimates of the accommodation achieved by combining any two or more percentile values into a univariate dimension.

The three forms of the Kreifeldt and Nah multiplier can be applied to any anthropometric dataset. While the modified forms obviate the problem of unknown variances, they still require some information regarding the correlation values, whether that is the actual or average value.

There are at least two situations of interest in which Kreifeldt and Nah multipliers might be employed: estimating a single dimension as a combination of variables, as in ISO 11064-4, and estimating the adjusted percentile values that, when combined, produce the desired proportion of accommodation on a dimension approximated by summing two or more percentile values. These applications are straightforward: a required dimension for which direct measurement data is not available is specified as a combination of percentile values of variables for which percentile values are available. These are similar to the Derived Dimensions in ANSUR1 (Gordon et al., 1989). The methodology described in this paper provides an accurate estimate of the proportion of individuals in the user population who will be accommodated on the needed dimension by the combination of known variables.

It is important to note that, as seen in Tables 2 and 3, combining two equal percentile values that are greater than the 50th percentile value will generally accommodate a greater proportion of all the combined individual values than the nominal percentile values that were combined. For example, in Table 2, on average about 97 percent of all individuals' combined measurement values were less than or equal to the sum of the 90th percentile values.
That is, if each individual's measurements for variable A and variable B are combined and then compared with the sum of the 90th percentile value of A and the 90th percentile value of B, the proportion of all individuals whose combined measurement values are less than or equal to the sum of the two 90th percentile values will be greater than 0.90 whenever the correlation value is not equal to 1.0, and 0.90 when the correlation is 1.0. Combining two equal percentile values less than the 50th percentile value will generally accommodate a smaller proportion than the nominal percentile value. In Table 3, the average proportion of individuals whose summed measurements were less than or equal to the sum of the 10th percentile values was about 0.03.
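The behaviour described above can be illustrated under an assumption that is not taken from this section: that A and B are jointly normal. In that case the sum A + B is also normal, and the z-score of the summed percentile values equals the single-variable z-score multiplied by the ratio of the summed standard deviations to the standard deviation of the sum. The sketch below writes that ratio directly from the normal model; the function name and the example standard deviations and correlations are hypothetical, and the formula is not quoted from the paper.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def proportion_below_percentile_sum(p: float, sd_a: float, sd_b: float,
                                    rho: float) -> float:
    """P(A + B <= a_p + b_p), assuming A and B are jointly normal.

    p: the common percentile of A and B, as a proportion (e.g. 0.90).
    sd_a, sd_b: standard deviations of A and B.
    rho: correlation between A and B.
    """
    variance_of_sum = sd_a ** 2 + sd_b ** 2 + 2.0 * rho * sd_a * sd_b
    if variance_of_sum <= 0.0:
        raise ValueError("the sum has no spread; the ratio is undefined")
    # Ratio of summed standard deviations to the standard deviation of the
    # sum. It equals 1 when rho is 1 and exceeds 1 for any smaller rho.
    ratio = (sd_a + sd_b) / sqrt(variance_of_sum)
    z = STD_NORMAL.inv_cdf(p)
    return STD_NORMAL.cdf(z * ratio)


# Illustrative values only: equal standard deviations, several correlations.
for rho in (1.0, 0.5, 0.0):
    upper = proportion_below_percentile_sum(0.90, 1.0, 1.0, rho)
    lower = proportion_below_percentile_sum(0.10, 1.0, 1.0, rho)
    print(f"rho={rho:.1f}: two 90th -> {upper:.3f}, two 10th -> {lower:.3f}")
```

Under these assumptions the proportion equals the nominal value only at a correlation of 1.0. For weaker correlations the sum of two 90th percentile values covers more than 90 percent, and the sum of two 10th percentile values covers less than 10 percent, which is the pattern the text reports for Tables 2 and 3.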
In a second situation, the practitioner may desire to achieve a level of accommodation for combined measures of a single dimension that more closely approximates the nominal, or desired percentile values; for example, accommodating 90 percent of the combined variables.

6. Percentile based estimates of multivariate accommodation

Accommodation was defined earlier (Garneau, 2009) as having two properties: first, a measurement or measurements fits a specified proportion of the population, and second, for multivariate dimensions, no individual is disaccommodated on any dimension. In the latter case, consider the multivariate accommodation for a chair in regard to seat height, seat width and seat depth. While it is not possible with the method of combining percentile values described in this paper to say with absolute certainty that an individual will be accommodated on all three variables, e.g. that the individual's three measurements for seat height, seat width and seat depth will all be within the 5th to 95th percentile range for each of the three variables, it is possible to draw some inferences regarding a sample of individuals' measurements using only percentile values.

To do so, we treat each measurement for each individual as an indicator function; it does or does not satisfy the pertinent criterion test. For example, consider a criterion test of whether or not an individual's measurements are greater than or equal to the 5th percentile value and also less than or equal to the 95th percentile value. Then we can treat each of the chair measurements as an independent binomial variable with the probability of a success of 0.90. A success is specifically defined as having an indicator function value of 1 (the individual measurement is greater than or equal to the 5th percentile value and less than or equal to the 95th percentile value).
For any binomial variable, the expected number of successes is np, where n is the number of trials and p is the probability of a success for a trial. The expected number of successes for the chair example with three dimensions then is 3 multiplied by 0.9, or 2.7. On average, the expectation is that any individual will fit on something more than two but less than three of the three dimensions. As p increases, the expected number of successes approaches, but never quite reaches, n. For example, for 5 variables, if p is equal to 0.90, then the expected number of successes for any individual (number of dimensions fitting) is 4.5. If p is equal to 0.95, then the expected number of successes for any individual is 4.75; if p = 0.99, then the expected number of successes is 4.95, and so on. Although the number of dimensions on which an individual will be expected to be accommodated (fit) will never be exactly equal to n as p increases, in some cases the expected number of successes may approximate n closely enough to be considered practically equivalent to n. An apocryphal piece of advice given to designers who want to gauge accommodation on multiple variables but have only percentile data is: “use the 99th percentile values”. As demonstrated here, using the 99th percentile values will closely approximate n, the desired number of successes with regard to concurrent accommodation on multiple variables. However, a lesser value of p may be sufficient for some cases. Note that this approximation will not be sufficient in other cases; for example, those cases where it is essential that the intended users will be concurrently accommodated on all n variables. Some of the classic papers describing the problems of combining percentiles use the example of clearances in an airplane cockpit from which pilots may be expected to eject. In such a case, it
is vital to know that there will always be sufficient clearance to enable the pilot to eject safely. In other words, it is critical to know that the cockpit design dimensions will accommodate a specified proportion of all intended users on all dimensions. While more sophisticated multivariate analyses of accommodation, such as Principal Component Analysis (Gordon, 2002), regression (Garneau, 2009) or Principal Component Regression (Parkinson and Reed, 2010), might give more efficient estimates of dimensions, the techniques described in this paper are intended for those situations where insufficient data or resources are available to support these multivariate techniques.
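The expected-value arithmetic used in this section is simple enough to check directly. The sketch below is illustrative only, and its function names are hypothetical. The first function reproduces the np calculation from the text. The second, the probability of fitting on all n dimensions at once, is not computed in the text; it follows from the same independence assumption and is included only to show why the approximation may fall short when concurrent accommodation on every dimension is essential.

```python
def expected_fits(n_dimensions: int, p: float) -> float:
    """Expected number of dimensions on which an individual fits (n * p)."""
    return n_dimensions * p


def prob_fit_on_all(n_dimensions: int, p: float) -> float:
    """Probability of fitting on every dimension, assuming independence."""
    return p ** n_dimensions


# Chair example from the text: three dimensions, p = 0.90.
print(f"3 dimensions, p=0.90: expected fits = {expected_fits(3, 0.90):.2f}")

# Five-variable examples from the text, with the added all-dimensions
# probability (an assumption-based extra, not a value from the paper).
for p in (0.90, 0.95, 0.99):
    print(f"5 dimensions, p={p:.2f}: "
          f"expected fits = {expected_fits(5, p):.2f}, "
          f"P(fit on all 5) = {prob_fit_on_all(5, p):.3f}")
```

The expected counts match the values given in the text (2.7 for the chair example; 4.5, 4.75 and 4.95 for five variables). Under independence, the chance of fitting on all dimensions at once is lower than the per-dimension probability, which is consistent with the caution about cases such as cockpit clearances.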
7. Conclusions

7.1. Kreifeldt and Nah multipliers as predictors of accommodation

Often designers and ergonomists have only limited anthropometric data, such as tables of percentile values, with which to work. Although past advice has been to avoid combining percentiles, this paper demonstrates that it is possible to extract useful information about single dimensions formed by combinations of percentiles using the Kreifeldt and Nah z score multiplier. However, the original Kreifeldt and Nah formulation requires information regarding the individual standard deviations of the variables combined, as well as the correlation between the combined variables, which may not be available.

The Kreifeldt and Nah (simplified) multiplier simplifies the original formulation by utilizing indicator function variable transformations of the percentile values that are combined. The advantage of this transformation is that the variance of each variable is a function of the nominal percentile values that are combined, so that knowledge of the variances, and consequently, the standard deviations, is always available. The Kreifeldt and Nah multiplier (average) is identical to the simplified version, except that an average correlation value is used. An advantage of the Kreifeldt and Nah (average) multiplier is that the value is constant for any specific sample of data. It does not seem unreasonable to expect compilers of percentile tables to provide an estimate of the average correlation value or the Kreifeldt and Nah (average) multiplier specific to a collection of data.

An unexpected finding was that the predicted accommodation proportion of a single dimension approximated by summing a single pair of variables provides a useful estimate of the accommodation proportion of combinations of multiple variables. It was initially expected that there might be a compounding effect as more and more variables were combined; however, it appears that any such effect is minimal.
While the Kreifeldt and Nah procedures were originally derived using exactly equal percentile values, they appear to be robust enough to accommodate small differences in the percentile values combined, e.g. 93rd percentile value of variable A added to the 90th percentile value of variable B. In summary, it is possible to gain useful information from combinations of percentiles when limited data or limited resources preclude using multivariate techniques. All forms of the Kreifeldt and Nah multiplier (original, simplified, average) are useful as predictors of the proportion of individuals accommodated on a single dimension estimated by combining percentile values. Use of a binomial model to determine the asymptotic limit of the expected number of variables upon which any user will be accommodated as a function of p is a useful tool when only percentile values are known.
Funding

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Acknowledgement

The author wishes to thank Mr. John Roebuck for his comments on an early version of this paper.

References

Albin, T.J., Molenbroek, J., 2016. Stepwise estimation of accommodation in multivariate anthropometric models using percentiles and an average correlation value. Theor. Issues Ergonomics Sci. 1-16.
Garneau, C.J., 2009. Investigation of Accommodation for Products Designed for Human Variability (Doctoral dissertation, The Pennsylvania State University).
Gordon, C.C., Churchill, T., Clauser, C.E., Bradtmiller, B., McConville, J.T., 1989. Anthropometric Survey of US Army Personnel: Methods and Summary Statistics 1988. Anthropology Research Project Inc, Yellow Springs OH.
Gordon, C.C., Corner, B.D., Brantley, J.D., 1997. Defining Extreme Sizes and Shapes for Body Armor and Load-bearing Systems Design: Multivariate Analysis of US Army Torso Dimensions (No. NATICK/TR-97/012). Army Natick Research Development and Engineering Center, MA.
Gordon, C.C., 2002. Multivariate anthropometric models for seated workstation design. Contemp. Ergon. 582-589.
ISO 11064-4, 2013. Ergonomic design of control centres - Part 4: layout and dimensions of work stations. Int. Organ. Stand. 1-18, 2013.
Kreifeldt, J.G., Nah, K., 1995, October. Adding and subtracting percentiles - how bad can it be? In: Proceedings of the Human Factors and Ergonomics Society Annual Meeting, vol. 39. SAGE Publications, pp. 301-305. No. 5.
Parkinson, M.B., Reed, M.P., 2010. Creating virtual user populations by analysis of anthropometric data. Int. J. Industrial Ergonomics 40 (1), 106-111.
Robinette, K.M., McConville, J.T., 1981. An Alternative to Percentile Models (No. 810217). SAE Technical Paper.
Roebuck, J.A., 1995. Anthropometric Methods: Designing to Fit the Human Body. Human Factors and Ergonomics Society.