European Journal of Operational Research 77 (1994) 253-271 North-Holland
Theory and Methodology
Constrained regression median for measuring possible salary discrimination
Toshiyuki Sueyoshi
College of Business, The Ohio State University, 1775 College Road, Columbus, OH 43210, USA
Received November 1991; Revised May 1992
Abstract: This study proposes a new analytical method for empirically examining possible salary discrimination against a protected (e.g., female or minority) group. The proposed approach is referred to as 'Constrained Regression Median' (CRM). The method regresses salary on job qualifications, which serve as the independent variables. The result of CRM is transformed into a 2 × 2 contingency table, and the degree of salary discrimination is determined from that table. The approach can be easily explained to, and understood by, lawyers, judges and other individuals involved in a discrimination case. The CRM method is particularly useful and practical when a data set suffers from outliers and/or multicollinearity between job qualifications and gender (or race).
Keywords: Goal programming; Regression median; Equal employment opportunity; Affirmative action requirement; Salary discrimination
Correspondence to: Toshiyuki Sueyoshi, College of Business, The Ohio State University, 1775 College Road, Columbus, OH 43210, USA.
0377-2217/94/$07.00 © 1994 - Elsevier Science B.V. All rights reserved
SSDI 0377-2217(92)00009-3

1. Introduction

Under Title VII of the Civil Rights Act of 1964, which was first cited as the Equal Pay Act of 1963 and then amended in 1972, it is an unlawful employment practice for employers to discriminate against any individual with respect to compensation, promotion and other employment situations because of the individual's race, religion, gender or national origin. The law, referred to as 'Equal Employment Opportunity' (EEO), has been a cornerstone of the civil rights movement in the United States. As a consequence of this legislation, there have been many arguments and practices related to implementing EEO. However, the objective of equality in employment has not been sufficiently achieved, both because of the lack of an appropriate method for measuring the level of discrimination empirically and because of bureaucratic resentment in many organizations. In an effort to overcome this methodological difficulty related to EEO, this study proposes the use of 'Constrained Regression Median' (CRM) as a special form of Goal Programming (GP).

The CRM has two unique features in measuring possible salary discrimination. First, CRM can produce parameter estimates of a regression hyperplane by which an observed data set is clearly dichotomized. CRM classifies the data into two subsets, each of which contains 50% of the observed sample. By taking this approach, CRM becomes less sensitive than conventional methods to the disturbing effect of outliers, which is a typical and serious problem in EEO analysis. For instance, a medical doctor usually receives much higher compensation than nurses or administrative staff. The data on the medical doctor thus become an outlier when evaluating possible salary discrimination in a health care system.

Second, CRM can easily incorporate various a priori information in the form of GP side constraints. In many EEO cases, prior information or policy restrictions are known in advance, and these requirements must be incorporated into the evaluation of salary discrimination. For instance, a mail clerk should not receive more compensation than the president of his firm. This type of violation produces a serious problem in maintaining an organization-wide consensus.

After obtaining CRM, this study transforms its results into a 2 × 2 contingency table. The findings in the contingency table can then serve as an empirical basis for examining possible salary discrimination. For instance, using a chi-square test, this study examines the null hypothesis of no salary discrimination in the contingency table.

It is important to note that there is an extensive literature on the use of multiple regression in investigating salary discrimination (see, e.g., Conway and Roberts [12], Finkelstein [14,15], Fisher [16], Gilmartin [19], Goldberger [20], Greene [22], Kamalich and Polacheck [27], and Hashimoto and Kochin [23]). These conventional research works have explored various methodological issues regarding the use of multiple regression in EEO, where the estimation is based upon the 'Least Squares' (LS) method. It is now widely accepted that the underlying assumptions of LS regression are often inconsistent with many real data sets (see, e.g., Bassett and Koenker [2,28], Huber [25], Hogg [24], and Sueyoshi [37]).
Therefore, this study does not follow the conventional research stream. Rather, this article explores the use of multiple regression for examining salary discrimination, following the approach of Least Absolute Value (LAV) estimation proposed by Charnes and Cooper [5-10]. In this sense, this research is concerned with providing a new type of analytical method for measuring possible salary discrimination.

The remainder of this article is organized as follows: Section 2 presents CRM, focusing upon its formulation and use in EEO. A modified form of CRM is also proposed in that section for examining possible salary discrimination. Section 3 proposes a new EEO measure for examining salary discrimination. The use of a conventional $\chi^2$ test for examining the null hypothesis of no discrimination is also discussed in the framework of the 2 × 2 contingency table. Section 4 applies CRM and its related EEO approach to illustrative and real data sets, documents the results, and compares the result of CRM with that of LS regression. Conclusions and future extensions are summarized in the last section.
2. Constrained regression median

In order to describe CRM and its application in measuring salary discrimination, this study considers a salary data set specified as follows: (a) $y$ is a measure of employee compensation, (b) $X = (x_1, x_2, \ldots, x_m)$ is a matrix in which each column represents a job qualification indicator (e.g., working experience, education, and tenure), and (c) $g$ is an indicator variable representing gender. For simplicity, this section focuses solely on salary discrimination related to gender. However, it is straightforward to extend this EEO analysis from gender to other types of discrimination, as illustrated in Section 4.2 of this article. This study fits the following linear regression model:

$$ y_i = X_i \beta + \varepsilon_i, \quad i = 1, \ldots, n, \tag{1} $$

to the salary data set, maintaining the null hypothesis that there is no salary discrimination. Here, the subscript $i$ indicates the $i$-th employee as a sample observation, $\varepsilon_i$ is the error, and $\beta$ is a parameter vector. This study temporarily excludes $g$ (i.e., gender) from the estimation of (1) so that the null hypothesis of no salary discrimination is built into the estimated regression hyperplane. Possible salary discrimination is then examined by comparing the numbers of employees of each gender above and below the resulting regression hyperplane. The CRM can be formally defined as a GP approach that determines parameter estimates $\hat{\beta}$, satisfying not only various a priori conditions but also the following requirement:
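$$ N(\hat{\beta})/n = 50\%, \tag{2} $$

where $N(\hat{\beta})$ denotes the number of employees $i$ such that $y_i < X_i\hat{\beta}$. Given $X$, CRM determines $X\hat{\beta}$, by which an observed salary data set is classified into the following two subgroups:

$$ G_A = \{\, i \mid y_i \geq X_i\hat{\beta} \,\} \quad \text{and} \quad G_B = \{\, i \mid y_i < X_i\hat{\beta} \,\}. \tag{3} $$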
The two subsets are referred to as the 'high salary group' and the 'low salary group', respectively. This study counts the number of male and female employees belonging to each group so that it can examine whether or not female employees are unfairly treated in compensation. Salary discrimination against women occurs when the number of women belonging to $G_A$ is much less than that of men, or when the number of women belonging to $G_B$ is much larger than that of men. Thus, (3) can serve as an empirical basis for measuring possible salary discrimination.
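The classification in (3) and the 50% requirement in (2) can be illustrated with a minimal sketch. It assumes that a fitted line is already available and uses the twenty observations and the line y = 2050 + 525x reported for the illustrative example in Section 4.1. The function name `split_by_hyperplane` is hypothetical, and observations lying exactly on the line are assigned to the high salary group, which is how the illustrative example treats them.

```python
from typing import List, Sequence, Tuple


def split_by_hyperplane(y: Sequence[float],
                        fitted: Sequence[float]) -> Tuple[List[int], List[int]]:
    """Split 1-based employee indices into a high and a low salary group.

    An employee is placed in the high group when the observed salary is on or
    above the fitted value, and in the low group when it is strictly below.
    """
    high = [i for i, (yi, fi) in enumerate(zip(y, fitted), start=1) if yi >= fi]
    low = [i for i, (yi, fi) in enumerate(zip(y, fitted), start=1) if yi < fi]
    return high, low


# Salaries and qualification scores of the twenty employees in the example.
y = [2700, 2400, 3100, 2800, 4000, 3300, 4200, 3600, 5000, 4000,
     5200, 4600, 5800, 3000, 7000, 5500, 6800, 6000, 8000, 5500]
x = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9, 10, 10]

# Fitted values from the line reported for the example (taken as given here).
fitted = [2050 + 525 * xi for xi in x]

g_a, g_b = split_by_hyperplane(y, fitted)
share_below = len(g_b) / len(y)  # the share that requirement (2) fixes at 50%
```

Once gender labels are attached to the two index lists, counting women and men in each list gives the comparison described above.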
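2.1. Algorithm of CRM

The CRM can be broken down into three computational processes. First, the following GP is solved:

$$ \text{Minimize} \quad \sum_{i=1}^{n} (\delta_i^+ + \delta_i^-) \tag{4a} $$

$$ \text{subject to} \quad X_i\beta + \delta_i^+ - \delta_i^- = y_i, \quad i = 1, \ldots, n, \tag{4b} $$

$$ \beta \in S, \tag{4c} $$

$$ \delta_i^+ \geq 0 \ \text{and} \ \delta_i^- \geq 0, \tag{4d} $$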
where $\delta_i^+$ and $\delta_i^-$ indicate positive and negative deviations, respectively, and $S$ represents a constraint set that the values of $\beta$ must satisfy. Model (4) was first proposed by Charnes et al. [8] in order to deal with the well-known problem of multicollinearity. It was called 'Goal Programming/Constrained Regression' (GP/CR) because (4) maintains a GP form with a single priority and the constraint set $S$. (See also Charnes et al. [7,8] and Sueyoshi [36].) A unique feature of (4) is that it minimizes the sum of absolute errors and can be solved by any linear programming algorithm.

Second, the CRM rearranges the dual variables $\omega_i$ ($i = 1, \ldots, n$) related to (4b) in descending order:

$$ \omega_1 \geq \omega_2 \geq \cdots \geq \omega_k \geq \cdots \geq \omega_n, \tag{5} $$

where the subscript $k$ ($k = 1, \ldots, n$) indicates the descending order, while the subscript $i$ denotes the observed order of a data set. Following (5), an observed EEO data set is classified into the following two subgroups:

$$ G_A = \{\, i \mid \text{the } i\text{-th observation (employee) has } \omega_i \text{ belonging to the top 50\% in (5)} \,\}, \tag{6a} $$

$$ G_B = \{\, i \mid \text{the } i\text{-th observation (employee) has } \omega_i \text{ belonging to the bottom 50\% in (5)} \,\}. \tag{6b} $$

By the complementary slackness condition of linear programming, (3) is identical to (6). The optimal $\omega_i^*$ indicates the rate of change in the objective of (4) per unit increase in $y_i$. Thus, examining $\omega_i^*$ shows how the optimal sum in (4) would respond if the estimated regression hyperplane were shifted slightly from its original location.
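For intuition about what (4) computes, the sketch below fits a least-absolute-value line by brute force rather than by linear programming. It is only a stand-in under stated assumptions: one regressor plus an intercept, no side constraints (S is ignored), and the known property that an LAV optimum for a line is attained on some line through two data points. The helper names `sum_abs_dev` and `lav_line` are hypothetical. The minimizer need not be unique, so ties between candidate lines are possible and only the objective value should be compared.

```python
from fractions import Fraction
from itertools import combinations


def sum_abs_dev(a, b, x, y):
    """Sum of absolute deviations of the line a + b*x from the observations."""
    return sum(abs(yi - (a + b * xi)) for xi, yi in zip(x, y))


def lav_line(x, y):
    """Return (objective, intercept, slope) of one LAV line found by enumeration.

    Every line through two observations with distinct x values is a candidate;
    the candidate with the smallest sum of absolute deviations is kept.
    Exact rational arithmetic avoids rounding issues when comparing candidates.
    """
    best = None
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        if x1 == x2:
            continue  # vertical line, not a regression candidate
        slope = Fraction(y2 - y1, x2 - x1)
        intercept = y1 - slope * x1
        candidate = (sum_abs_dev(intercept, slope, x, y), intercept, slope)
        if best is None or candidate[0] < best[0]:
            best = candidate
    return best


# Illustrative data of Section 4.1 (twenty employees).
y = [2700, 2400, 3100, 2800, 4000, 3300, 4200, 3600, 5000, 4000,
     5200, 4600, 5800, 3000, 7000, 5500, 6800, 6000, 8000, 5500]
x = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9, 10, 10]

objective, intercept, slope = lav_line(x, y)
reported = sum_abs_dev(2050, 525, x, y)  # line reported for the example
```

For these data the line y = 2050 + 525x attains the minimal sum of absolute deviations. An LP solver would be needed to obtain the dual variables used in (5) and (6), and to handle several regressors or a non-trivial S.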
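After classifying a data set into $G_A$ and $G_B$, CRM uses the following GP models to yield two different regression hyperplanes for $G_A$ and $G_B$:

$$ \text{Minimize} \quad \sum_{i \in G_A} \delta_i^+ + L \sum_{i \in G_B} \delta_i^+ \tag{7a} $$

$$ \text{subject to} \quad X_i\beta + \delta_i^+ = y_i, \quad i \in G_A, \tag{7b} $$

$$ X_i\beta + \delta_i^+ - \delta_i^- = y_i, \quad i \in G_B, \tag{7c} $$

$$ \beta \in S, \tag{7d} $$

$$ \delta_i^+ \geq 0 \ \text{and} \ \delta_i^- \geq 0, \tag{7e} $$

and

$$ \text{Minimize} \quad L \sum_{i \in G_A} \delta_i^- + \sum_{i \in G_B} \delta_i^- \tag{8a} $$

$$ \text{subject to} \quad X_i\beta + \delta_i^+ - \delta_i^- = y_i, \quad i \in G_A, \tag{8b} $$

$$ X_i\beta - \delta_i^- = y_i, \quad i \in G_B, \tag{8c} $$

$$ \beta \in S, \tag{8d} $$

$$ \delta_i^+ \geq 0 \ \text{and} \ \delta_i^- \geq 0. \tag{8e} $$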
Here, all the symbols in (7) and (8) are the same as described previously, except $L$, which represents a non-Archimedean large number. The two GP models, (7) and (8), yield regression hyperplanes $X\beta_A$ for $G_A$ and $X\beta_B$ for $G_B$, where $\beta_A$ and $\beta_B$ are the column vectors of parameter estimates obtained from (7) and (8), respectively. The two regression hyperplanes $X\beta_A$ and $X\beta_B$ constitute the upper and lower bounds of a CRM hyperplane. Thus, any regression hyperplane

$$ \hat{y} = X\beta = X[(1 - \alpha)\beta_A + \alpha\beta_B] \tag{9} $$

becomes CRM, where $\alpha$ is a constant on $[0, 1]$. Any such $\alpha$ satisfies (2). When $\alpha = 0$, $\hat{y} = X\beta = X\beta_A$ becomes the upper bound of the CRM hyperplane, which satisfies $y_i \geq \hat{y}_i$, $i \in G_A$, and $y_i < \hat{y}_i$, $i \in G_B$. The opposite result (i.e., the lower bound of CRM) is obtained in the case of $\alpha = 1$. In order to determine the optimal $\alpha^*$ value, the CRM incorporates the following quadratic programming problem:

$$ \text{Minimize} \quad \sum_{i \in G_A} (y_i - X_i\beta)^2 + \sum_{i \in G_B} (y_i - X_i\beta)^2 \tag{10a} $$

$$ \text{subject to} \quad \beta = (1 - \alpha)\beta_A + \alpha\beta_B \in S, \tag{10b} $$

$$ 0 \leq \alpha \leq 1, \tag{10c} $$
which minimizes the sum of squared errors so that it produces a unique optimal solution for $\alpha$. There are two approaches to solving (10), depending upon whether $\beta = (1 - \alpha)\beta_A + \alpha\beta_B$ can be assumed to satisfy $S$ for every $\alpha \in [0, 1]$. This is often a reasonable assumption because both $\beta_A$ and $\beta_B$ must satisfy $S$ when (7) and (8) are solved; it holds, for instance, whenever $S$ is a convex set. In the case where $\beta$ may violate $S$ for some $\alpha \in [0, 1]$, (10) needs to be solved by conventional quadratic programming algorithms [4]. Meanwhile, if the assumption can be maintained, (10) can be solved by the following simple computation: first, substitute (10b) into (10a) and then set the derivative of (10a) with respect to $\alpha$ to zero to produce

$$ \alpha = \left( \sum_{i \in G_A} \eta_i + \sum_{i \in G_B} \eta_i \right) \Bigg/ \left( \sum_{i \in G_A} \Delta_i^2 + \sum_{i \in G_B} \Delta_i^2 \right), \tag{11} $$

where $\eta_i = (y_i - X_i\beta_A)(X_i\beta_B - X_i\beta_A)$ and $\Delta_i = X_i\beta_A - X_i\beta_B$.
The optimal $\alpha^*$ for (10a) can be determined by simple rules as follows:

$$ \alpha^* = \begin{cases} \alpha & \text{if } 0 \leq \alpha \leq 1, \\ 1 & \text{if } \alpha > 1, \\ 0 & \text{if } \alpha < 0. \end{cases} \tag{12} $$
(See [37] for a detailed description of (11) and (12).) The determination of $\alpha^*$ marks the end of the CRM algorithm.

It is important to note that the robustness of CRM might be limited when one or more outlying points exist in the independent variable space. Such outlying points, referred to as 'leverage points' in the predictor ($X$) direction, may seriously distort the CRM solution, leading to misinterpretation of the CRM hyperplane. Therefore, careful screening for leverage points must be carried out before CRM is applied. For this screening purpose, the $R^2$ scores of the LS and LAV methods need to be examined in a preliminary study. Moreover, Least Median of Squares (LMS) regression may be useful because the LMS method is robust not only to outliers but also to leverage points. (See [34] and [35] for a detailed description of the LMS method; see also [29] for comments on the use of $R^2$ scores in the conventional regression context.)
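The closed-form step in (11) and the clipping rule in (12) can be sketched as follows, under the assumption that the convex combination of the two bounding hyperplanes satisfies S for every α in [0, 1]. The function name `optimal_alpha` is hypothetical; the two bounding lines and the data are those reported for the illustrative example in Section 4.1.

```python
def optimal_alpha(y, fit_a, fit_b):
    """Closed-form alpha of (11), clipped to [0, 1] as in (12).

    fit_a and fit_b hold the fitted values of the two bounding hyperplanes
    for every observation. The two hyperplanes must not coincide, otherwise
    the denominator is zero.
    """
    eta_sum = sum((yi - fa) * (fb - fa) for yi, fa, fb in zip(y, fit_a, fit_b))
    delta_sq_sum = sum((fa - fb) ** 2 for fa, fb in zip(fit_a, fit_b))
    alpha = eta_sum / delta_sq_sum
    return min(1.0, max(0.0, alpha))


# Illustrative data of Section 4.1.
y = [2700, 2400, 3100, 2800, 4000, 3300, 4200, 3600, 5000, 4000,
     5200, 4600, 5800, 3000, 7000, 5500, 6800, 6000, 8000, 5500]
x = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9, 10, 10]

fit_a = [2050 + 525 * xi for xi in x]  # bounding line obtained from (7)
fit_b = [1950 + 450 * xi for xi in x]  # bounding line obtained from (8)

alpha_star = optimal_alpha(y, fit_a, fit_b)

# CRM line as the convex combination in (9).
crm_intercept = (1 - alpha_star) * 2050 + alpha_star * 1950
crm_slope = (1 - alpha_star) * 525 + alpha_star * 450
```

With these inputs the computation reproduces the values reported for the example, α* ≈ 0.6368 and the CRM line 1986.32 + 477.24x (to two decimals).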
2.2. Importance of CRM for measuring possible salary discrimination

As discussed previously, a unique feature of CRM is that it clearly separates all the employees into $G_A$ and $G_B$, using the dual variables derived from (4). The resulting CRM hyperplane always lies between $G_A$ and $G_B$ and can therefore ameliorate the influence of outliers. Furthermore, CRM may satisfy various requirements regarding parameter estimates on a priori grounds. Admittedly, it is arbitrary to use CRM as a basis for classification into high and low salary groups. Hence, the side constraints ($S$) incorporated in CRM need to be justified from a legal perspective. The development of CRM is acceptable as long as the independent variables are known a priori to be related to salary in some definite linear fashion. To the extent that this relationship does not hold, the CRM classification could be misleading.

In addition to these statistical and judicial perspectives, the result of CRM may be used to construct a 2 × 2 contingency table which can serve as an empirical basis for measuring possible salary discrimination. That is, each employee is naturally categorized as being either female ($g = 1$) or male ($g = 0$). Furthermore, each individual can be classified into either $G_A$ or $G_B$. Thus, using the variable indicating gender ($g = 1$ or $0$) and the dual variables of (4), this research can construct a 2 × 2 contingency table in which $n$ is broken down into $n_{11}$, $n_{12}$, $n_{21}$ and $n_{22}$, as depicted in Table 1. Here, the first subscript indicates whether each employee belongs to $G_A$ or $G_B$, while the second denotes whether the individual is female or male. It always holds in Table 1 that $n_{11} + n_{12} = n_{21} + n_{22} = \tfrac{1}{2}n$. All the marginal frequencies of Table 1 may be considered fixed because $n_1$, $n_2$ and $n$ are all known before the EEO examination. Only one of the four numbers (i.e., $n_{11}$, $n_{12}$, $n_{21}$ and $n_{22}$) can vary independently.
Therefore, when investigating salary discrimination against female employees, $n_{11}$ can be selected as an independent random variable. Meanwhile, $n_{12}$ becomes a random variable in the case of discrimination against male employees. In Table 1 the null hypothesis of no salary discrimination may be expressed by the condition $n_{11} = n_{21} = \tfrac{1}{2}n_1$ and $n_{12} = n_{22} = \tfrac{1}{2}n_2$. Therefore, the degree of salary discrimination can be measured by the difference between $n_{11}$ and $\tfrac{1}{2}n_1$.

Table 1
Proportions of female and male employees in high and low salary groups

Group                 Female      Male        Total
High salary (G_A)     n_11        n_12        n/2
Low salary (G_B)      n_21        n_22        n/2
Total                 n_1         n_2         n

The approach proposed in this study has three important features in measuring EEO violation. First, the proposed method can deal with the well-known problem of multicollinearity that may frequently exist between $g$ and some job qualification variable ($x$), because $g$ is excluded in fitting $X\beta$ to an EEO data set. Gender is used only in the 2 × 2 contingency table. This analytical feature is important because women have only recently entered traditionally male-dominated professional areas (such as medical doctors and corporate executives), so that gender and rank may be strongly correlated in many EEO data sets. As a consequence of excluding $g$ (i.e., gender) from the parameter estimation, the proposed approach can ameliorate the influence of multicollinearity on the examination of possible salary discrimination. (See [8] for a description of the use of (4) for dealing with the problem of multicollinearity.)

Second, the result of CRM expressed by a 2 × 2 contingency table may be easily explained to, and understood by, lawyers, judges and other individuals involved in a discrimination issue. Thus, the proposed approach makes the EEO examination practical.

Finally, the 2 × 2 contingency table can serve as an empirical basis for initiating an EEO investigation. The Office of Federal Contract Compliance in the United States uses a '4/5' rule by which the selection rate of a protected group is required to be more than 80% of that of a majority group (see Miner and Miner [31] and Graham-Moore and Seiford [21]). This 4/5 rule can be applied to investigate possible salary discrimination against female employees. The rule can be expressed by $n_{11}/n_1 > 0.8\, n_{12}/n_2$ in Table 1. If an organization does not satisfy the 4/5 rule, a protected group can claim the need for an investigation regarding possible salary discrimination.
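The construction of the 2 × 2 table can be sketched in a few lines. The function name `contingency_table` and the gender labels below are hypothetical and serve only to show the counting; the group membership follows the illustrative example, in which odd-numbered employees form the high salary group.

```python
def contingency_table(in_high_group, is_female):
    """Count employees into the cells n11, n12, n21, n22 of Table 1.

    Rows: high salary group (first row), low salary group (second row).
    Columns: female (first column), male (second column).
    """
    n11 = n12 = n21 = n22 = 0
    for high, female in zip(in_high_group, is_female):
        if high and female:
            n11 += 1
        elif high:
            n12 += 1
        elif female:
            n21 += 1
        else:
            n22 += 1
    return [[n11, n12], [n21, n22]]


# Twenty employees; odd-numbered ones are in the high salary group.
in_high_group = [i % 2 == 1 for i in range(1, 21)]

# Hypothetical gender labels (pattern F, F, M, M repeated), for illustration only.
is_female = [(i - 1) % 4 < 2 for i in range(1, 21)]

table = contingency_table(in_high_group, is_female)
column_totals = [table[0][0] + table[1][0], table[0][1] + table[1][1]]
```

With these hypothetical labels every cell holds five employees, which corresponds to the situation of no discrimination described in the text.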
2.3. Extension: Discriminant regression
In an effort to provide additional EEO evidence, this study extends CRM to discriminant analysis. While regression has been commonly employed in the study of salary discrimination, discriminant analysis has also been commonly used in both parametric and nonparametric forms (see, e.g., [3,17,18,30,32,33] for descriptions of many different forms of discriminant analysis). It is important to note that the proposed approach is, strictly speaking, not discriminant analysis. Conventional discriminant analysis uses only $X$ as characteristic variables (i.e., independent variables in regression) in order to estimate a discriminant hyperplane. Thus, it excludes $y$ from its estimation process. Meanwhile, the CRM-based discriminant analysis proposed here is designed to utilize both $X$ and $y$ in order to predict whether an observation belongs to $F$ or $M$. Thus, the proposed approach needs to incorporate statistical functions of both regression analysis and discriminant analysis. Therefore, it is referred to as 'Discriminant Regression' (DR) in this article. In the use of DR for measuring salary discrimination, gender is used as a classification variable, and salary, along with the other characteristic variables, is used to predict gender.

The use of DR in investigating salary discrimination may be defined as the following set of analytical steps. First, a salary data set is classified into male and female groups ($M$ or $F$). (CRM separates a data set into two groups using the information on dual variables derived from (4). Meanwhile, the data classification of DR can be accomplished on an a priori basis.) Second, the following two models are applied to $M$ and $F$, respectively:

$$ \text{Minimize} \quad \sum_{i \in M} \delta_i \tag{13a} $$

$$ \text{subject to} \quad X_i\beta + \delta_i = y_i, \quad i \in M, \tag{13b} $$

$$ \beta \in S, \quad \delta_i \geq 0, \tag{13c} $$

and

$$ \text{Minimize} \quad \sum_{i \in F} \delta_i \tag{14a} $$

$$ \text{subject to} \quad X_i\beta - \delta_i = y_i, \quad i \in F, \tag{14b} $$

$$ \beta \in S, \quad \delta_i \geq 0. \tag{14c} $$
Here, (13) and (14) correspond to (7) and (8), respectively. Model (13) produces a bottom hyperplane on or above which all the data points in $M$ lie. Model (14) produces the opposite result, a top hyperplane on or below which all the data points in $F$ lie. This combination of (13) and (14) is to be used for an employment situation where, given the same job qualification indicators ($X$), the salaries of male employees are higher than those of female employees. If the employment situation is the opposite, $F$ is applied to (13) while $M$ is used for (14). Now, let $\beta_M$ and $\beta_F$ be the vectors of parameter estimates obtained from (13) and (14), respectively. Using $\beta_M$ and $\beta_F$, DR can determine whether there is some kind of overlap between $M$ and $F$ in $(X, y)$-space. That is, if

$$ \beta_M \geq \beta_F \tag{15} $$

is satisfied by all the parameter estimates of the two vectors, then DR can conclude that there is no overlap between the $M$ and $F$ groups. (Relationship (15) occurs with salary discrimination against women.) Conversely, if at least one of the parameter estimates violates (15), then DR may conclude that there is some kind of overlap between $M$ and $F$. As a consequence of this overlap, DR cannot clearly separate $M$ and $F$ in $(X, y)$-space using a linear regression hyperplane.

Finally, when no overlap is observed between $M$ and $F$, the following model is applied to determine a DR hyperplane ($X\beta_D$) between $X\beta_M$ and $X\beta_F$:

$$ \text{Minimize} \quad \sum_{i \in M} \delta_i + \sum_{i \in F} \delta_i \tag{16a} $$

$$ \text{subject to} \quad X_i\beta + \delta_i = y_i, \quad i \in M, \tag{16b} $$

$$ X_i\beta - \delta_i = y_i, \quad i \in F, \tag{16c} $$

$$ \beta \in S \ \text{and} \ \delta_i \geq 0. \tag{16d} $$
The completion of (16) marks the end of DR. (When some kind of overlap is found by (15), model (16) usually turns out to be infeasible in a computer run.) As mentioned before, DR is used for predicting gender, using $y$ and $X$. Based upon the parameter estimates of (13), (14) and (16), a newly sampled observation $(X_r, y_r)$ related to the $r$-th employee may be characterized as being either $M$ or $F$ by the following simple rule:

(a) First, in the case of no overlap between $M$ and $F$ in $(X, y)$-space:

$$ \text{if } y_r \geq X_r\beta_D, \text{ then } r \in M, \text{ or if } y_r < X_r\beta_D, \text{ then } r \in F. \tag{17} $$

(b) Second, when some kind of overlap is found between $M$ and $F$:

$$ \text{if } y_r \geq X_r\beta_M \text{ and } y_r > X_r\beta_F, \text{ then } r \in M, \tag{18a} $$

$$ \text{if } y_r \leq X_r\beta_M \text{ and } y_r \leq X_r\beta_F, \text{ then } r \in F, \tag{18b} $$

$$ \text{if } y_r > X_r\beta_M \text{ and } y_r < X_r\beta_F, \text{ then } r \in M \cup F, \text{ or} \tag{18c} $$

$$ \text{if } y_r < X_r\beta_M \text{ and } y_r > X_r\beta_F, \text{ then } r \in M \cup F, \tag{18d} $$

where $r \in M \cup F$ indicates that the $r$-th employee cannot be characterized as being either $M$ or $F$.
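The decision rule can be sketched as a small function. This is an illustration under assumptions rather than the author's implementation: the function name `classify_dr` is hypothetical, the inequality directions follow rules (17) and (18) as read here, and any boundary case that the rule leaves open is reported as undetermined. The fitted values in the usage lines come from the two lines 2300 + 100x and 2200 + 600x reported for the third illustrative data set, evaluated at x = 5.

```python
def classify_dr(y_r, fit_m, fit_f, fit_d=None):
    """Classify a new observation as 'M', 'F' or 'M or F' (undetermined).

    fit_m, fit_f: fitted values of the M and F hyperplanes at the new X_r.
    fit_d: fitted value of the DR hyperplane; pass it only when there is no
           overlap between the two groups, in which case rule (17) applies.
    """
    if fit_d is not None:
        # Rule (17): a single separating hyperplane exists.
        return "M" if y_r >= fit_d else "F"
    if y_r >= fit_m and y_r > fit_f:
        return "M"  # rule (18a)
    if y_r <= fit_m and y_r <= fit_f:
        return "F"  # rule (18b)
    # Rules (18c) and (18d); remaining boundary cases are also treated as
    # undetermined here, which is an assumption of this sketch.
    return "M or F"


# Overlap case with the lines reported for the third illustrative data set.
x_new = 5
fit_m = 2300 + 100 * x_new  # 2800
fit_f = 2200 + 600 * x_new  # 5200

high_salary = classify_dr(6000, fit_m, fit_f)
low_salary = classify_dr(2000, fit_m, fit_f)
in_between = classify_dr(4000, fit_m, fit_f)
```

A salary between the two fitted values falls into the overlap region, so the rule cannot attribute it to either group.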
3. Discrimination tests
3.1. 4/5 rule

The 4/5 rule used in this study can be formally defined as the following decision rule:

$$ n_{11}/n_1 < 0.8\, n_{12}/n_2 \tag{19} $$

or

$$ n_{12}/n_2 < 0.8\, n_{11}/n_1. \tag{20} $$
The rule determines whether an employment situation needs some kind of investigation regarding salary discrimination. If the proportion of female employees in $G_A$ ($n_{11}/n_1$) is less than 80% of the proportion of male employees in $G_A$ ($n_{12}/n_2$), as in (19), this is usually regarded as evidence of salary discrimination against women. In this case, the group of female employees can request an EEO investigation. Condition (20) describes the opposite situation, in which male employees may be the disadvantaged group.

3.2. Degree of discrimination

The 4/5 rule may be used to initiate an EEO investigation of possible salary discrimination. However, it does not reflect how seriously salary discrimination occurs in an investigated organization. In order to measure the degree of discrimination, this study proposes the use of the following score ($D$):

$$ D = (n_{11}n_{22} - n_{12}n_{21}) / (n_{11}n_{22} + n_{12}n_{21}). \tag{21} $$

Using the $D$ score, salary discrimination can be characterized by the following three special cases:
(a) First, $D = 0$ if $n_{11}n_{22} = n_{12}n_{21}$ or $n_{11}/n_{21} = n_{12}/n_{22}$ in (21). The condition indicates that the ratio of the number of female employees in $G_A$ to that in $G_B$ equals the corresponding ratio for male employees. This employment situation obviously involves no discrimination.
(b) Second, $D = 1$ if $n_{12} = 0$ and/or $n_{21} = 0$ in (21). This employment situation is considered salary discrimination against the male group because $n_{12} = 0$ indicates that no male employee is in $G_A$, while $n_{21} = 0$ indicates that all the female employees are in $G_A$.
(c) Third, $D = -1$ if $n_{11} = 0$ and/or $n_{22} = 0$ in (21). This employment situation is considered salary discrimination against the female group because $n_{11} = 0$ indicates that no female employee is in $G_A$, while $n_{22} = 0$ indicates that all the male employees are in $G_A$.

The $D$ score lies between $-1$ and $1$. The magnitude of $D$ indicates the level of discrimination in percentage form and its sign indicates the type of discrimination as characterized above. That is, a positive sign represents salary discrimination against the male group (reverse discrimination), while a negative sign indicates discrimination against the female group. The $D$ score in combination with the 4/5 rule can serve as empirical evidence for initiating an EEO investigation regarding salary discrimination. (See, e.g., [1] and [31].)

The $D$ score has two limitations that need to be discussed from the viewpoint of examining salary discrimination. First, it does not fully uncover possible salary discrimination. It may be used as a signal to request an EEO investigation that might provide a more detailed explanation of the current employment situation. Second, $D$ might not be as easy to interpret as the traditional regression $\beta$ coefficients for measuring the level of discrimination.

3.3. Extension: Chi-square test

Under the hypothesis of no discrimination, the 2 × 2 contingency table could be represented as specified in Table 2.
Table 2
Proportions of female and male employees in high and low salary groups in the case of no discrimination

Group                 Female              Male                Total
High salary (G_A)     n_1/2 = z_11        n_2/2 = z_12        n/2
Low salary (G_B)      n_1/2 = z_21        n_2/2 = z_22        n/2
Total                 n_1                 n_2                 n
For descriptive simplicity, the frequencies at the four combinations of the two rows and two columns are denoted by $z_{11} = \tfrac{1}{2}n_1$, $z_{12} = \tfrac{1}{2}n_2$, $z_{21} = \tfrac{1}{2}n_1$ and $z_{22} = \tfrac{1}{2}n_2$, respectively. The deviation of an observed frequency from the situation of no discrimination is measured by

$$ d_{rc} = n_{rc} - z_{rc}, \tag{22} $$

where the subscripts $r$ and $c$ indicate the row and column of the 2 × 2 contingency table ($r, c = 1, 2$), respectively. Following the conventional statistical approach, the sum of squared contingency values can be expressed by

$$ \chi^2 = \sum_{r=1}^{2} \sum_{c=1}^{2} d_{rc}^2 / z_{rc}, \tag{23} $$

which is assumed to follow the chi-square distribution with one degree of freedom. It is straightforward to examine the null hypothesis of no discrimination using (23) and the chi-square test.
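The three tests of this section can be collected in a short sketch. The function names are hypothetical. The cell counts in the usage lines are the ones implied by the discussion of the third illustrative data set in Section 4.1 (8 of 10 women and 2 of 10 men in the high salary group), and the constant 6.635 is the standard 1% critical value of the chi-square distribution with one degree of freedom.

```python
def four_fifths_flags(n11, n12, n21, n22):
    """Evaluate conditions (19) and (20) of the 4/5 rule."""
    n1, n2 = n11 + n21, n12 + n22  # numbers of female and male employees
    female_rate, male_rate = n11 / n1, n12 / n2
    return {
        "against_female": female_rate < 0.8 * male_rate,  # condition (19)
        "against_male": male_rate < 0.8 * female_rate,    # condition (20)
    }


def d_score(n11, n12, n21, n22):
    """Degree of discrimination D of (21); undefined if both products are zero."""
    return (n11 * n22 - n12 * n21) / (n11 * n22 + n12 * n21)


def chi_square(n11, n12, n21, n22):
    """Statistic (23), with expected counts n1/2 and n2/2 as in Table 2."""
    n1, n2 = n11 + n21, n12 + n22
    observed = [n11, n12, n21, n22]
    expected = [n1 / 2, n2 / 2, n1 / 2, n2 / 2]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


CHI2_CRITICAL_1PCT_1DF = 6.635  # 1% critical value, one degree of freedom

cells = (8, 2, 2, 8)  # n11, n12, n21, n22
flags = four_fifths_flags(*cells)
d_percent = round(100 * d_score(*cells), 2)
chi2 = chi_square(*cells)
reject_null = chi2 > CHI2_CRITICAL_1PCT_1DF
```

For these counts the 4/5 rule flags possible discrimination against male employees, D is 88.24%, and the statistic of 7.2 exceeds the 1% critical value, in line with the findings reported for that data set.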
4. Application

This section first presents an illustrative data set to describe how to use CRM for measuring possible salary discrimination. It then documents the empirical results of CRM applied to a real EEO data set, comparing them with the conventional LS approach. Thus, the purpose of this section is to describe the use of CRM in detail from the perspective of examining possible salary discrimination. (Simulation studies related to CRM can be found in [37] and [38].)
4.1. Illustrative example

In order to describe the use of CRM in a step-by-step manner, three distinct data sets were artificially generated for regressing the monthly wage of twenty employees on their job qualifications and gender. Table 3 exhibits the three illustrative data sets. As presented in the three subtables, the data sets have the same values of $y_i$ and $x_i$ but three different combinations of $g_i$. Thus, the three data sets are designed to demonstrate intuitively the relationship between the structure of each data set and the results of the discrimination tests.

As described previously, CRM may be broken down into three algorithmic processes, whose results are summarized as follows. First, $y = 2050 + 525x$ is obtained as a linear regression line by applying (4) to the data sets. The result of (4) is summarized in Table 4. Using the dual variables exhibited in Table 4, all twenty employees are classified into the following two groups:

$$ G_A = \{\, i \mid i \text{ is an odd number} < 20 \,\} \quad \text{and} \quad G_B = \{\, i \mid i \text{ is an even number} \leq 20 \,\}. $$

Two important findings can be identified in Table 4. The third and eleventh observations (employees) are on the estimated regression hyperplane, and the two observations have 1.0 as their dual variables. This result indicates the occurrence of degeneracy (see [37] for a detailed description of degeneracy). Moreover, the two observations are categorized as being in $G_A$ because of these dual variables ($\omega_3 = \omega_{11} = 1.0$). As a consequence of this classification, all the employees are equally separated into $G_A$ and $G_B$. Thus, the examination of the dual variables of (4) provides information on how to dichotomize an EEO data set into $G_A$ and $G_B$.

Second, (7) has yielded $y = 2050 + 525x$ as the bottom line of $G_A$, while (8) has yielded $y = 1950 + 450x$ as the upper line of $G_B$. Finally, (11) has determined $\alpha^* = 0.6368$ and, therefore, the estimated CRM line becomes $y = 0.3632(2050 + 525x) + 0.6368(1950 + 450x) = 1986.32 + 477.24x$.

Using the CRM line, the three data sets exhibited in Tables 3a, 3b and 3c can be transformed into three 2 × 2 contingency tables, as depicted in Tables 5a, 5b and 5c, respectively. As described previously, the most important feature of the contingency table is that it visually describes a current employment situation related to possible salary discrimination. As a case in point, Table 5a shows that five employees belong to each of the four cells. This employment situation clearly indicates no discrimination.

Table 6 documents a summary of the different discrimination tests applied to the three employment situations characterized in Tables 5a, 5b and 5c. For instance, the last column indicates the results of the EEO examination applied to Table 5c (the third data set). These can be characterized by the following four findings:
(a) First, the 4/5 rule indicates that the employment situation in Table 5c needs some kind of EEO investigation and explanation concerning salary discrimination against male employees, because $n_{12}/n_2$ (2/10) is less than 80% of $n_{11}/n_1$ (8/10).
(b) Second, the discrimination score ($D$) applied to the result in Table 5c becomes 88.24%, exhibiting a considerably high degree of discrimination against male employees.
(c) Third, the chi-square test rejects the null hypothesis of no discrimination at the 1% significance level.
(d) Finally, $X\beta_M = 2300 + 100x$ and $X\beta_F = 2200 + 600x$ are derived from (13) and (14), respectively. The results indicate that there is overlap between $M$ and $F$ in $(X, y)$-space. Therefore, the gender of a new employee is determined by rule (18). (Note that the second data set belongs to the case of no overlap and, therefore, the gender of a new employee is determined by (17).)

4.2. Real application

A data set (presented in Table 7) was sampled from the purchasing department of a Japanese subsidiary firm that is currently operating in the midwest region of the United States. An EEO question raised by an American associate (employee) was whether being Japanese, ceteris paribus, predicted a higher salary than being American. The individual believes that there is salary discrimination against American associates. (The firm calls each employee an 'associate'.) Applying both CRM and LS to the above data set, this study examines possible salary discrimination against American associates in the Japanese firm. The methodological features of CRM will be characterized by comparing it with LS.

The EEO issue illustrated in this article is now becoming a serious social problem in the USA. As broadcast in the ABC program '20/20' (September 27, 1991), many Japanese firms have recently been sued by American employees. Many white-collar employees working in Japanese firms believe that high-ranking positions are reserved for Japanese only, and that Americans cannot achieve the position of corporate executive. (See, e.g., [11,26], which report recent surveys on this type of discrimination case in Japanese firms.) As presented in Table 7, 15 Japanese and 40 American associates are working for the department.
Each associate is characterized by (a) his/her education level (1: high school, 2: junior college, 3: college, and 4: graduate), (b) working period, i.e., the number of years the associate has worked for the firm, (c) work experience measured in years, (d) management rank in the firm (1: associate, 2: senior associate, 3: manager, and 4: general manager), (e) nationality (1: Japanese, and 0: American), and (f) salary in US dollars.

As a precondition of any satisfactory estimation scheme, the following requirements are imposed in advance of this study: (a) the estimation results must be consistent with a company-wide consensus, in the sense that they do not violate the ranked position hierarchy of the firm (e.g., an associate must not receive more compensation than a senior associate), and (b) negative weights should be avoided for all the parameter estimates except nationality (e.g., the salary increases, or at least does not decrease, with every factor except nationality). (The estimated coefficient of nationality becomes zero only if no discrimination exists.)

Table 3

Employee (i)    y_i     x_i    g_i
11              5200     6     M
12              4600     6     F
13              5800     7     F
14              3000     7     M
15              7000     8     F
16              5500     8     M
17              6800     9     F
18              6000     9     M
19              8000    10     F
20              5500    10     M
Table 4 Errors and dual variables

Employee (i)    y_i     x_i    ŷ_i (a)    Error (y_i - ŷ_i)    Dual variable (ω_i)
 1              2700     1     2575          125                  1.0
 2              2400     1     2575         -175                 -1.0
 3              3100     2     3100            0                  1.0
 4              2800     2     3100         -300                 -1.0
 5              4000     3     3625          375                  1.0
 6              3300     3     3625         -325                 -1.0
 7              4200     4     4150           50                  1.0
 8              3600     4     4150         -550                 -1.0
 9              5000     5     4675          325                  1.0
10              4000     5     4675         -675                 -1.0
11              5200     6     5200            0                  1.0
12              4600     6     5200         -600                 -1.0
13              5800     7     5725           75                  1.0
14              3000     7     5725        -2725                 -1.0
15              7000     8     6250          750                  1.0
16              5500     8     6250         -750                 -1.0
17              6800     9     6775           25                  1.0
18              6000     9     6775         -775                 -1.0
19              8000    10     7300          700                  1.0
20              5500    10     7300        -1800                 -1.0

(a) ŷ_i is determined by ŷ_i = 2050 + 525 x_i.
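As a small check of Table 4, the following sketch recomputes the errors from the footnoted line ŷ_i = 2050 + 525x_i and splits the 20 employees into a high and a low salary group. Assigning the two observations that lie exactly on the line (zero error) to the high group is an assumption made here for illustration, suggested by their reported dual variables of 1.0; the function name `split_by_line` is hypothetical and does not come from the article.

```python
# Data from Table 4: observed salary y_i and qualification x_i for 20 employees.
y = [2700, 2400, 3100, 2800, 4000, 3300, 4200, 3600, 5000, 4000,
     5200, 4600, 5800, 3000, 7000, 5500, 6800, 6000, 8000, 5500]
x = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9, 10, 10]


def split_by_line(y, x, intercept=2050.0, slope=525.0):
    """Return (errors, high, low) with errors[i] = y_i - (intercept + slope * x_i).

    Assumption for illustration: points on the line (zero error) join the high group.
    """
    errors = [yi - (intercept + slope * xi) for yi, xi in zip(y, x)]
    high = [i + 1 for i, e in enumerate(errors) if e >= 0]  # employee numbers
    low = [i + 1 for i, e in enumerate(errors) if e < 0]
    return errors, high, low


errors, high, low = split_by_line(y, x)
print("errors:", errors)
print("high salary group:", high)
print("low salary group:", low)
```

Under this assumption each group contains ten employees, which is the equal split that the regression median is designed to produce.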
Table 5 Proportions of female and male employees in high and low salary groups

                              Gender
Group                     Female    Male    Total
(a) First data set:
  High salary (G_A)          5        5       10
  Low salary (G_B)           5        5       10
  Total                     10       10       20
(b) Second data set:
  High salary (G_A)          0       10       10
  Low salary (G_B)          10        0       10
  Total                     10       10       20
(c) Third data set:
  High salary (G_A)          8        2       10
  Low salary (G_B)           2        8       10
  Total                     10       10       20
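The D scores and chi-square statistics reported for the three illustrative data sets can be recomputed from the counts in Table 5. The sketch below assumes that D has the form (ad - bc)/(ad + bc) for a 2 x 2 table; this form is inferred here from the reported values and is not a restatement of the definition given earlier in the article. The chi-square statistic is the standard Pearson statistic without continuity correction. The function names are hypothetical.

```python
def d_score(a, b, c, d):
    """Assumed form of the D score (in %) for a 2 x 2 table, where
        a = protected group, high salary;  b = other group, high salary;
        c = protected group, low salary;   d = other group, low salary.
    """
    return 100.0 * (a * d - b * c) / (a * d + b * c)


def chi_square(a, b, c, d):
    """Pearson chi-square statistic (1 degree of freedom, no continuity correction)."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom


# Table 5 with women as the protected group:
# (female high, male high, female low, male low)
tables = {"first": (5, 5, 5, 5), "second": (0, 10, 10, 0), "third": (8, 2, 2, 8)}
results = {name: (round(d_score(*t), 2), round(chi_square(*t), 2))
           for name, t in tables.items()}

CRITICAL_1PCT = 6.635  # chi-square critical value for 1 d.f. at the 1% level
for name, (d_val, chi2) in results.items():
    verdict = "reject" if chi2 > CRITICAL_1PCT else "not reject"
    print(name, d_val, chi2, verdict)
```

With these inputs the sketch gives D = 0, -100 and 88.24 and chi-square values of 0, 20 and 7.2, in line with the figures quoted for the three data sets.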
4.2.1. Results of LS

Table 8 summarizes the computer output of the LS regression analysis applied to the data set shown in Table 7. In interpreting the LS output of Table 8, this study finds that work experience and rank are the most important factors in determining salary, both being significant at the 1% level. According to the LS result, nationality is an insignificant variable for determining salary. This indicates that we cannot reject the hypothesis of no discrimination. However, the output of Table 8 has a problem: two parameter estimates (the intercept and the working period) have negative signs. The negative estimate of the working period violates the assumption regarding seniority. Thus, the LS result is unacceptable as empirical evidence for examining the EEO issue because it does not satisfy the precondition of the estimation scheme required by the firm. (The high $R^2$ score (0.9488) indicates that the data set presented in Table 7 does not have any problem with leverage points.)

As an extension of the LS method explored in Table 8, this study transformed $x_1$ (education level) into four dummy variables in the following way: $x_1 = d_1 + d_2 + d_3 + d_4$. Here, these dummy variables represent high school, junior college, college and graduate levels, respectively. This transformation is important because $x_1$ is represented in a discrete form. The LS result applied to the data set with
Table 6 Discrimination tests

Test                             First data set                   Second data set                  Third data set
4/5 rule                         5/10 ≥ (4/5)(5/10)               0/10 < (4/5)(10/10)              2/10 < (4/5)(8/10)
Degree of discrimination (D%)    0                                -100.00                          88.24
Type of discrimination           No discrimination                Discrimination against women     Discrimination against men
Chi-square test                  0                                20.00** (a)                      7.20** (a)
                                 Not reject                       Reject                           Reject
Xβ_M                             2125 + 337.5x                    2050 + 525x                      2300 + 100x
Xβ_F                             1800 + 650x                      1950 + 450x                      2200 + 600x
                                 Infeasible solution (overlap)    2050 + 525x (no overlap)         Infeasible solution (overlap)

(a) The number with ** indicates the χ² score and ** denotes the 1% significance level.
dummy variables of education is summarized in Table 9, which still shows a negative sign on the working period. The negative signs on the three dummy variables need to be discussed. The regression coefficients of the incorporated dummy variables represent the difference between the mean salary of each included educational level and the mean salary of the excluded educational level (i.e., the omitted dummy variable). Therefore, the negative sign indicates a decrease in salary when the education level changes from level 4 (i.e., graduate) to one of the other three levels. The result implies a lower salary when the educational level is reduced. If the dummy variable for high school were omitted and the others were included, then the coefficients would be positive. Thus, the negative signs on the dummy variables are consistent with the managerial requirement of the firm.

Finally, this study examined the pairwise correlations among the five independent variables and salary as the dependent variable. These simple correlations are shown in Table 10. It can be seen from Table 10 that the pair of regressors ($x_3$, $x_4$) is highly correlated, since $r_{34} = 0.883$. Furthermore, ($x_3$, y) and ($x_4$, y) are strongly correlated because $r_{36} = 0.960$ and $r_{46} = 0.912$. These results fit the wage system of the Japanese firm well. That is, a central perspective of Japanese management is that salary is usually determined by each employee's seniority. Here, seniority means how long each individual has stayed in the same firm (not his/her education). All the Japanese employees worked for the firm in Japan before coming to its US subsidiary as managers. Therefore, work experience can be considered to represent seniority for the Japanese employees in this data set. As a result, the correlation between work experience and management rank ($r_{34}$) is high. Moreover, $r_{46} = 0.912$ indicates that the management rank has a strong relation with the salary.

The correlation is easily acceptable because it fits the wage policy of the firm. However, such strong correlation frequently produces the problem of multicollinearity. The negative LS estimates might be due to the influence of multicollinearity. (A detailed discussion of the relationship between correlation and multicollinearity can be found in [8].)
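One way to quantify the multicollinearity suggested by the correlations in Table 10 is the variance inflation factor (VIF), a standard diagnostic that the article itself does not use. The sketch below is only an illustration under that substitution: it inverts the 5 x 5 correlation matrix of the regressors and reads the VIFs off the diagonal of the inverse. The helper `invert` is a hypothetical name, written in plain Python so that no external library is assumed.

```python
# Correlations among the five regressors, copied from Table 10.
NAMES = ["education", "working period", "work experience",
         "management rank", "nationality"]
R = [
    [1.000, 0.056, 0.163, 0.183, 0.219],
    [0.056, 1.000, 0.452, 0.620, -0.103],
    [0.163, 0.452, 1.000, 0.883, 0.563],
    [0.183, 0.620, 0.883, 1.000, 0.452],
    [0.219, -0.103, 0.563, 0.452, 1.000],
]


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


# The j-th VIF equals the j-th diagonal element of the inverse correlation matrix.
inverse = invert(R)
vif = [inverse[j][j] for j in range(len(R))]
for name, value in zip(NAMES, vif):
    print(f"{name}: VIF = {value:.2f}")
```

Because work experience and management rank have a pairwise correlation of 0.883, each of their VIFs must exceed 1/(1 - 0.883^2), roughly 4.5, regardless of the other regressors.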
4.2.2. Results of CRM

A set of side constraints (S), representing the prescribed management requirements, was incorporated in the CRM estimation process as follows. First, in order to maintain the company-wide consensus regarding the hierarchy of the firm, this study selected the highest and lowest combinations of independent variables from each rank. From the two combinations, this study could obtain the highest and lowest compensation estimates for each rank. For example, the salary of the general manager (rank = 4) can be expressed by $3\beta_1 + 3\beta_2 + 22\beta_3 + 4\beta_4$ in a linear regression model, using the seventh row of Table 7. Here, the four coefficients represent education level, working period, work experience and management rank, respectively. The highest possible salary estimate of the manager group (rank = 3) can be expressed by $4\beta_1 + 9\beta_2 + 15\beta_3 + 3\beta_4$ from Table 7. Therefore, the relationship $(3 - 4)\beta_1 + (3 - 9)\beta_2 + (22 - 15)\beta_3 + (4 - 3)\beta_4 = -\beta_1 - 6\beta_2 + 7\beta_3 + \beta_4 \ge 0$ needs to be maintained, so that the firm can obtain consensus between the general manager and the managers. Thus, the management requirements concerning the company-wide consensus can be expressed by the following three inequalities:

$$-\beta_1 - 6\beta_2 + 7\beta_3 + \beta_4 \ge 0, \tag{24a}$$

$$-2\beta_1 - 4\beta_2 - \beta_3 + \beta_4 \ge 0, \tag{24b}$$

$$-\beta_1 - 2\beta_2 - 5\beta_3 + \beta_4 \ge 0. \tag{24c}$$

These constraints are derived by comparing the highest and lowest salary estimates of the four management groups. Furthermore, the requirement of nonnegative weights can be achieved by restricting the CRM estimates in the following way:

$$\beta_1, \beta_2, \beta_3, \beta_4 \ge 0. \tag{25}$$

Constraints (24) and (25) constitute S, which needs to be incorporated in CRM.
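The construction of the hierarchy constraints can be traced with a short script. The sketch below takes, for each rank, the componentwise highest and lowest values of education, working period and work experience observed in Table 7, and subtracts the highest profile of the lower rank from the lowest profile of the rank above it. The function names are hypothetical, and treating the "highest and lowest combinations" as componentwise maxima and minima is an interpretation adopted here because it matches the general manager example in the text.

```python
# Columns of Table 7 for the 55 associates, in employee order.
education = [3, 3, 3, 3, 3, 3, 3, 4, 3, 3, 3, 3, 3, 3, 3, 3, 3, 2, 2, 3,
             3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 2, 3, 3, 3, 3, 3, 1, 2, 3, 3,
             3, 3, 3, 3, 2, 3, 3, 3, 3, 3, 2, 3, 3, 2, 3]
working_period = [3, 1, 1, 4, 3, 1, 3, 4, 4, 5, 5, 3, 3, 3, 2, 9, 7, 8, 9, 3,
                  2, 4, 5, 5, 5, 5, 4, 5, 5, 6, 4, 5, 4, 2, 2, 1, 1, 1, 1, 2,
                  2, 2, 3, 3, 2, 2, 1, 1, 2, 2, 3, 2, 2, 3, 3]
experience = [8, 6, 5, 7, 6, 3, 22, 15, 13, 12, 15, 9, 7, 7, 7, 12, 8, 8, 11, 9,
              10, 4, 5, 5, 5, 6, 4, 5, 5, 6, 6, 5, 4, 4, 2, 1, 1, 8, 1, 2,
              2, 2, 3, 3, 2, 2, 1, 1, 2, 4, 3, 2, 2, 2, 4]
rank = [2] * 6 + [4] + [3] * 4 + [2] * 4 + [3] * 6 + [2] * 12 + [1] * 22


def extreme_profile(level, pick):
    """Componentwise max (or min) of the qualifications observed at one rank."""
    rows = [i for i, r in enumerate(rank) if r == level]
    profile = tuple(pick(col[i] for i in rows)
                    for col in (education, working_period, experience))
    return profile + (level,)


def hierarchy_constraints():
    """Coefficients on (beta_1, ..., beta_4) of 'lowest profile of a rank minus
    highest profile of the rank below it'; each expression must be >= 0."""
    out = []
    for upper in (4, 3, 2):
        low = extreme_profile(upper, min)
        high = extreme_profile(upper - 1, max)
        out.append(tuple(lo - hi for lo, hi in zip(low, high)))
    return out


constraints = hierarchy_constraints()
for label, coeffs in zip(("24a", "24b", "24c"), constraints):
    print(label, coeffs)
```

Run on the Table 7 data, this gives the coefficient vectors (-1, -6, 7, 1), (-2, -4, -1, 1) and (-1, -2, -5, 1) for (24a), (24b) and (24c), respectively.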
Table 7 A real data set regarding salary discrimination

Employee   Education   Working   Work         Management   Nationality   Salary ($)
           level       period    experience   rank
 1         3           3          8           2            1              64900
 2         3           1          6           2            1              59500
 3         3           1          5           2            1              56000
 4         3           4          7           2            1              67800
 5         3           3          6           2            1              55000
 6         3           1          3           2            1              51200
 7         3           3         22           4            1             185000
 8         4           4         15           3            1             120000
 9         3           4         13           3            1              96000
10         3           5         12           3            1              95000
11         3           5         15           3            1             104000
12         3           3          9           2            1              72000
13         3           3          7           2            1              66000
14         3           3          7           2            1              66000
15         3           2          7           2            1              64000
16         3           9         12           3            0              93000
17         3           7          8           3            0              77000
18         2           8          8           3            0              76000
19         2           9         11           3            0              87000
20         3           3          9           3            0              72000
21         3           2         10           3            0              73000
22         3           4          4           2            0              49000
23         3           5          5           2            0              54000
24         3           5          5           2            0              54000
25         3           5          5           2            0              54000
26         3           5          6           2            0              57000
27         3           4          4           2            0              49000
28         3           5          5           2            0              54000
29         4           5          5           2            0              57000
30         4           6          6           2            0              62000
31         2           4          6           2            0              52000
32         3           5          5           2            0              54000
33         3           4          4           2            0              49000
34         3           2          4           1            0              35000
35         3           2          2           1            0              34400
36         3           1          1           1            0              31000
37         1           1          1           1            0              25000
38         2           1          8           1            0              32800
39         3           1          1           1            0              29800
40         3           2          2           1            0              31200
41         3           2          2           1            0              32000
42         3           2          2           1            0              32100
43         3           3          3           1            0              34000
44         3           3          3           1            0              34000
45         2           2          2           1            0              30600
46         3           2          2           1            0              31700
47         3           1          1           1            0              31000
48         3           1          1           1            0              30000
49         3           2          2           1            0              30600
50         3           2          4           1            0              35000
51         2           3          3           1            0              31000
52         3           2          2           1            0              30800
53         3           2          2           1            0              35400
54         2           3          2           1            0              32100
55         3           3          4           1            0              34000
Table 8 Least squares estimates

Variable          Degree of   Parameter    Standard   T for H0:        Prob. > |T|
                  freedom     estimate     error      Parameter = 0
Intercept         1            -384.56     5749.73    -0.067           0.9469
Education         1            3659.45     1907.99     1.918           0.0610
Working period    1            -469.09      713.66    -0.657           0.5141
Experience        1            4447.46      494.17     9.000           0.0001
Rank              1           11377.00     2914.68     3.903           0.0003
Nationality       1            2655.12     2892.80     0.918           0.3632

R-square = 0.9488; adjusted R-square = 0.9435.
Table 9 Least squares estimates with dummy variables

Variable          Degree of   Parameter    Standard   T for H0:        Prob. > |T|
                  freedom     estimate     error      Parameter = 0
Intercept         1           15865.00     4973.19     3.190           0.0025
Working period    1            -366.33      750.00    -0.488           0.6275
Experience        1            4497.73      512.92     8.769           0.0001
Rank              1           11044.00     3019.44     3.658           0.0006
Nationality       1            2649.93     2928.35     0.905           0.3701
Dummy 1                       -6040.81     8208.88    -0.736           0.4655
Dummy 2                      -10144.00     4776.21    -2.124           0.0390
Dummy 3                       -5364.72     4204.09    -1.276           0.2082
Dummy 4                           0

R-square = 0.9501; adjusted R-square = 0.9427.
Table 10 Correlation matrix

Variable                 E        WP       WE       MR       N        S
Education (E)            1.000    0.056    0.163    0.183    0.219    0.236
Working period (WP)      0.056    1.000    0.452    0.620   -0.103    0.470
Work experience (WE)     0.163    0.452    1.000    0.883    0.563    0.960
Management rank (MR)     0.183    0.620    0.883    1.000    0.452    0.912
Nationality (N)          0.219   -0.103    0.563    0.452    1.000    0.566
Salary (S)               0.236    0.470    0.960    0.912    0.566    1.000
Table 11 CRM estimates

                  LAV estimation                          CRM estimation
Variable          Model (4)    Model (7)    Model (8)
Intercept          2454.54       248.27      7800.00       2060.68
Education          2254.55      2862.07       366.66       2263.12
Working period        0            0            0             0
Experience         2818.18      2786.21      3133.33       2869.52
Rank              16345.46     16793.00     16033.33      16610.68
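The management requirements can be checked directly against the published estimates. The sketch below evaluates the left-hand sides of (24a) to (24c) and the nonnegativity requirement (25) for the LS slopes of Table 8 and the CRM slopes of Table 11. It takes the coefficient of $\beta_1$ in (24b) as -2, the value implied by the Table 7 data, and allows a small tolerance for the two-decimal rounding of the published figures; both choices, and the name `check`, are assumptions of this illustration.

```python
# Slope estimates (education, working period, experience, rank).
LS = (3659.45, -469.09, 4447.46, 11377.00)   # Table 8
CRM = (2263.12, 0.0, 2869.52, 16610.68)      # Table 11, CRM estimation

# Left-hand-side coefficients of (24a)-(24c); each expression must be >= 0.
HIERARCHY = {
    "24a": (-1, -6, 7, 1),
    "24b": (-2, -4, -1, 1),
    "24c": (-1, -2, -5, 1),
}
TOL = 0.5  # slack for the rounding of the published estimates


def check(beta):
    """Return a dict mapping each requirement to True (satisfied) or False."""
    report = {name: sum(c * b for c, b in zip(coeffs, beta)) >= -TOL
              for name, coeffs in HIERARCHY.items()}
    report["25"] = all(b >= 0 for b in beta)
    return report


print("LS :", check(LS))
print("CRM:", check(CRM))
```

For the CRM slopes the left-hand side of (24c) evaluates to about -0.04, i.e., zero up to rounding, which suggests that this constraint is binding at the reported solution. The LS slopes fail (25) because of the negative working period estimate and also fail (24c).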
Table 12 Proportion of American and Japanese employees in high and low groups

                              Race
Group                     American      Japanese    Total
High salary (G_A)         15            12          27
Low salary (G_B)          24 (1) (a)     3          27 (1)
Total                     39 (1)        15          55

(a) (1) indicates an American employee belonging to both high and low salary groups. The data point is a median point on the CRM regression hyperplane.
Table 11 summarizes the parameter estimates derived from (4), (7), (8) and CRM. The CRM estimates are determined by (9) and $a^* = 0.240$ of (10). As presented in the last column of Table 11, the CRM estimates are nonnegative and acceptable to all the members of the department, because these parameter estimates satisfy all the prescribed management requirements. Table 12 is a 2 × 2 contingency table derived from the resulting CRM estimates. This study finds the following results related to salary discrimination:

(a) First, the 4/5 rule ($\frac{15}{39} < \frac{4}{5} \cdot \frac{12}{15}$) and the D score (D = -72.97%) indicate the existence of possible salary discrimination against American associates. The American associates may ask the firm to investigate possible salary discrimination or to provide some kind of explanation of this EEO measure.

(b) Second, the chi-square score ($\chi^2 = 7.45$) also rejects the null hypothesis of no discrimination at the 1% significance level. (Japanese employees, as foreign businessmen, are subject to much heavier income tax burdens than American employees in the US tax system. Consequently, this study finds this EEO evidence. A serious policy issue that needs to be discussed is that the US tax system is not consistent with the perspective of EEO. As long as the US government and Congress apply the current tax system to foreign business persons, we can find many salary discrimination cases in foreign-owned firms. This issue has long been neglected in US public policy.)

(c) Third, the bottom hyperplane of the Japanese group ($\beta_M$) becomes

$$X\beta_M = 90.91x_1 + 3409.09x_3 + 17136.36x_4,$$

and the upper hyperplane of the American group ($\beta_F$) becomes

$$X\beta_F = 3155.11 + 3591.84x_1 + 1545.58x_2 + 1670.75x_3 + 15036.73x_4,$$

where $x_1$, $x_2$, $x_3$, and $x_4$ indicate education level, working period, work experience, and management rank, respectively. Since there is overlap between Japanese and American associates in (X, y)-space, the nationality of a new employee is determined by (18).

(d) Finally, the comparison between the results of LS and CRM provides an important finding: LS cannot identify salary discrimination in the data set of Table 7, while CRM discovers a high likelihood of its existence. This evidence indicates that CRM has greater analytical capability than LS in detecting the existence of salary discrimination.
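The 4/5 rule comparison and the D score in item (a) can be recomputed from the counts in Table 12. The sketch below leaves out the single median observation that belongs to both groups, which is an assumption about how the counts were used, and it again assumes the form (ad - bc)/(ad + bc) for D, inferred from the reported value. The function name is hypothetical.

```python
from fractions import Fraction

# Counts from Table 12 (the median observation on the hyperplane is left out).
american_high, american_low = 15, 24
japanese_high, japanese_low = 12, 3


def four_fifths_rule(protected_high, protected_total, other_high, other_total):
    """Return True when the protected group's high-salary share falls below
    4/5 of the other group's share, i.e., the rule signals possible discrimination."""
    protected_share = Fraction(protected_high, protected_total)
    other_share = Fraction(other_high, other_total)
    return protected_share < Fraction(4, 5) * other_share


flagged = four_fifths_rule(american_high, american_high + american_low,
                           japanese_high, japanese_high + japanese_low)

# Assumed form of the D score (in %), with American associates as the protected group.
ad = american_high * japanese_low
bc = japanese_high * american_low
d_score = round(100 * (ad - bc) / (ad + bc), 2)

print("4/5 rule signals possible discrimination:", flagged)
print("D score (%):", d_score)
```

Here 15/39 is about 0.38, which is below (4/5)(12/15) = 0.64, and the D score evaluates to -72.97, the value reported in item (a).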
5. Conclusion and future extensions
This article presents a new use of CRM for empirically examining possible salary discrimination. An important feature of CRM is that it classifies an observed data set equally into two subgroups, without making any assumption about its sample size or error distribution. Then, a CRM regression hyperplane is set between the two subgroups. This feature increases the robustness of CRM to outliers and/or non-normal error distributions often found in EEO data sets. Furthermore, CRM can incorporate various prior information into its estimation process. The a priori information is derived from management requirements so as to obtain consensus among all the members evaluated by the CRM results. Thus, CRM is designed to incorporate statistical, judicial and managerial perspectives related to measuring salary discrimination. (Of course, the study is aware of the following fact: the ability to easily incorporate side constraints is an advantage possible with both LAV and RM methods, and even with LS regression if it is formulated as a mathematical programming problem. Therefore, this presumed advantage is a consequence of the fact that LS regression is typically not solved in a mathematical programming context, as opposed to some intrinsic advantage of the proposed CRM procedure.)

After CRM is estimated, its result is transformed into a 2 × 2 contingency table. Information in the form of the contingency table serves as an empirical basis for measuring the degree of salary discrimination and for conducting the $\chi^2$ test of the null hypothesis of no discrimination. All the measures explored in this study can be easily explained to and understood by lawyers, judges and other individuals involved in a discrimination case.

There are three research issues to be explored as extensions of this study. (a) First, there is no simple measure of the effectiveness of the fit of the CRM model to a data set. As an extension of this research, the CRM approach needs to incorporate some kind of method, for instance the bootstrap method proposed in [13], so as to measure the shape of error distributions. Using the bootstrap method, we can examine whether the CRM method provides an adequate fit to sample data or a statistically generalizable fit to the population from which the sample is drawn. (b) The other issue to be addressed is the development of measures of type I and type II fairness. Type I fairness indicates the level of salary discrimination between two groups that are the same for a given qualification. Meanwhile, type II fairness indicates the level of salary discrimination between two groups that are the same for a given level of compensation (see, e.g., the discussion of this topic in [12]). CRM needs to incorporate the two types of fairness in its analytical process, so that it can provide additional empirical evidence regarding salary discrimination. (c) The last research topic is the extension of DR. Focusing upon DR in the $L_1$ form, this article did not present many different DR models. As proposed for discriminant analysis, different DR forms can be easily developed and these may produce different hyperplanes between dichotomized groups. As a result of this extension, we can predict the gender of a new employee from its X and y, using a DR hyperplane, even if there is some kind of overlap between the two groups. This is an important future issue.

Finally, it is hoped that this study makes a small step towards improving the opportunity for equal employment. We anxiously await further CRM development and its applications along the lines indicated in this article.
References

[1] Agresti, A., Categorical Data Analysis, Wiley, New York, 1990.
[2] Bassett, G., and Koenker, R., "An empirical quantile function for linear models with IID errors", Journal of the American Statistical Association 77 (1982) 407-415.
[3] Bajgier, S.M., and Hill, A.V., "An experimental comparison of statistical and linear programming approaches to the discriminant problem", Decision Sciences 13 (1982) 604-618.
[4] Bazaraa, M.S., and Shetty, C.M., Nonlinear Programming, Wiley, New York, 1979.
[5] Charnes, A., and Cooper, W.W., Management Models and Industrial Applications of Linear Programming, Vols. I and II, Wiley, New York, 1961.
[6] Charnes, A., and Cooper, W.W., "Goal programming and constrained regression: A comment", Omega 4 (1975) 403-409.
[7] Charnes, A., Cooper, W.W., and Ferguson, R.O., "Optimal estimation of executive compensation by linear programming", Management Science 1 (1955) 138-151.
[8] Charnes, A., Cooper, W.W., and Sueyoshi, T., "Least squares/ridge regression and goal programming/constrained regression alternatives", European Journal of Operational Research 27 (1986) 147-157.
[9] Charnes, A., Cooper, W.W., and Sueyoshi, T., "A goal programming/constrained regression review of the Bell System breakup", Management Science 34 (1988) 1-26.
[10] Charnes, A., Cooper, W.W., and Sueyoshi, T., "A programming/constrained regression analysis of AT&T as natural monopoly", in: O. Davis (ed.), The Practice of Policy Analysis: Mutual Implications of Context and Methodology, (Forthcoming).
[11] Chikushi, T., and Tsurumi, Y., Rise and Fall of Japan, Gakushu Kenkyu Sha, Tokyo, Japan, 1990.
[12] Conway, D.A., and Roberts, H.V., "Reverse regression, fairness and employment discrimination", Journal of Business and Economic Statistics 1 (1983) 75-85.
[13] Dielman, T.E., and Pfaffenberger, R.C., "Bootstrapping in least absolute value regression: An application to hypothesis testing", Communications in Statistics. Simulation and Computation 17 (1988) 843-856.
[14] Finkelstein, M.O., "The judicial reception of multiple regression studies in race and sex discrimination cases", Columbia Law Review 80 (1980) 737-754.
[15] Finkelstein, M.O., "Multiple regression models in employment discrimination cases - The problem of imperfect proxies", Jurimetrics 31 (1990) 109-124.
[16] Fisher, F.M., "Multiple regression in legal proceedings", Columbia Law Review 80 (1980) 702-736.
[17] Freed, N., and Glover, F., "Simple but powerful goal programming models for discriminant problems", European Journal of Operational Research 7 (1981) 44-60.
[18] Freed, N., and Glover, F., "Evaluating alternative linear programming models to solve the two-group discriminant problem", Decision Sciences 17 (1986) 151-162.
[19] Gilmartin, K., "Identifying similarly situated employees in employment discrimination cases", Jurimetrics 31 (1991) 429-440.
[20] Goldberger, A.S., "Reverse regression and salary discrimination", Journal of Human Resources 19 (1984) 293-318.
[21] Graham-Moore, B.E., and Seiford, L.M., "A DSS for managing EEO/affirmative action requirement", Journal of Information and Optimization Sciences 9 (1988) 151-158.
[22] Greene, W.H., "Reverse regression: The algebra of discrimination", The Journal of Business and Economic Statistics 2 (1984) 117-121.
[23] Hashimoto, M., and Kochin, L., "A bias in the statistical estimation of the effects of discrimination", Economic Inquiry 43 (1980) 478-486.
[24] Hogg, R.V., "Estimates of percentile linear regressions using salary data", Journal of the American Statistical Association 70 (1975) 56-59.
[25] Huber, P.J., Robust Statistics, Wiley, New York, 1981.
[26] Japan Society Research Report, Japanese Companies in American Communities, Nippon Keizai Shinbun, Tokyo, Japan, 1990.
[27] Kamalich, R.A., and Polacheck, S.W., "Discrimination: Fact or fiction? An examination using an alternative approach", Southern Economic Journal 49 (1982) 450-461.
[28] Koenker, R., and Bassett, G., "Regression quantile", Econometrica 46 (1978) 33-50.
[29] Kusabe, I., Tokei Teki Hoho Enshu, Nikagiren, Tokyo, Japan, 1974.
[30] Lee, C.K., and Ord, J.K., "Discriminant analysis using least absolute deviations", Decision Sciences 21 (1990) 86-96.
[31] Miner, M.G., and Miner, J.B., Employee Selection within the Law, Bureau of National Affairs, Washington, DC, 1979.
[32] Markowski, E.P., and Markowski, C.A., "Concept, theory and techniques: Some difficulties and improvements in applying linear programming formulations to the discriminant problem", Decision Sciences 16 (1985) 237-247.
[33] Nath, R., and Jones, T.W., "A variable selection criterion in the linear programming approaches to discriminant analysis", Decision Sciences 19 (1988) 554-563.
[34] Rousseeuw, P.J., and Leroy, A., Robust Regression and Outlier Detection, Wiley, New York, 1987.
[35] Rousseeuw, P.J., and von Zomeren, B.C., "Unmasking multivariate outliers and leverage points", Journal of the American Statistical Association 85 (1990) 633-639.
[36] Sueyoshi, T., "Estimation of stochastic frontier cost function using data envelopment analysis: An application to the AT&T divestiture", Journal of the Operational Research Society 42 (1991) 463-477.
[37] Sueyoshi, T., "Empirical regression quantile", Journal of the Operational Research Society of Japan 34 (1991) 250-262.
[38] Sueyoshi, T., and Chang, Y., "Goal programming approach for regression median", Decision Sciences 20 (1989) 700-714.