Fuzzy Sets and Systems 72 (1995) 1-26
Statistical tests for fuzzy data
Christoph Römer, Abraham Kandel*
Computer Science and Engineering Department, University of South Florida, Tampa, FL 33620-5350, USA
Received May 1993; revised June 1994
Abstract In this paper we investigate the impact of vague and imprecise data on the statistical task of hypotheses testing. Test results are evaluated with respect to this ambiguity. The issue of how to aggregate fuzzy data to a fuzzy sample vector is discussed, different options are considered, and their consequences for the test results are compared. The domain of the test statistics used is extended to fuzzy sample vectors by the means provided in the first part of this paper. Different methods for the defuzzification of the obtained fuzzy test results are discussed and the fuzzy nature of the classical verification task for statistical hypotheses is studied. In this context, we propose a way to overcome the general limitation that statistical tests are conclusive only in one direction, namely by admitting fuzzy results.
Keywords: Fuzzy data; Hypotheses testing; Ambiguity; Defuzzification; Fuzzy sample vectors
1. Introduction One of the prime objectives of statistics is to check presumptions about the general behavior of a population. To this end, the statistician draws a sample from the population in question and uses the degree to which this specific sample supports the assumption as a probabilistic verification. Because of its probabilistic nature, this verification does not claim to represent the objective truth. Furthermore, in most cases statistical hypotheses can be neither verified nor falsified in a classical sense. However, for a given sample and a properly chosen test procedure one can classify a hypothesis in terms of degrees of confidence. The choice of the test procedure depends on the kind of hypothesis and on the additional information about the population, if there is any. Hence, two large classes of statistical tests are distinguished. The first one deals with hypotheses referring to the form of an unknown distribution function, called non-parametric hypotheses. A procedure for testing a non-parametric hypothesis is called a non-parametric test. The second class takes advantage of some prior information about the population in question. It deals with hypotheses, called parametric hypotheses, which refer only to the numerical values of unknown parameters of a random variable, whereas the functional form
* Corresponding author. 0165-0114/95/$09.50 © 1995 - Elsevier Science B.V. All rights reserved SSDI 0165-0114(94)00270-3
C. Römer, A. Kandel / Fuzzy Sets and Systems 72 (1995) 1-26
of its distribution is assumed to be known. A corresponding test for a parametric hypothesis is called a parametric test. What all statistical tests have in common is that verification is performed on the basis of a sample, regarded as a miniature of the population in question. In classical statistics observed data are represented by real numbers or real-valued vectors in $\mathbb{R}^n$, respectively. In fact, observed data are often imprecise and contain some ambiguity caused by the way they have been obtained. Origins of this kind of ambiguity may be
• inaccuracy of the devices used, involving an error of measurement of fuzzy nature [1],
• linguistic nature of observed data,
• subjective nature of observed data (events have been judged by a human since no objective device was available).
Since this kind of ambiguity is of fuzzy nature rather than of probabilistic nature, both the random variation of the data and their vagueness have to be taken into account in the test procedure. Basically, a fuzzy framework would be necessary for all statistical concepts dealing with continuous quantities, since observed data pertaining to a continuous quantity always contain ambiguity, however small. It is up to the statistician to decide whether the vagueness of the data is relevant in relation to their random variation. Once the statistician has decided not to neglect the vagueness of the sample data, a proper representation for the vague data has to be found. This is usually done by modeling the observed data by properly chosen fuzzy numbers. The shape of the fuzzy numbers modeling the vague data should reflect the nature of the ambiguity in the data. The crucial point then is to aggregate the obtained fuzzy numbers to a fuzzy sample vector. The way this is performed influences the results of statistical inference. In this paper we present some impacts of vague data on the task of statistical hypotheses testing and stress the crucial step of aggregating the fuzzy sample vector. We demonstrate how the statistician can gain a possibilistic interpretation of the test result besides its probabilistic interpretation, and we indicate the fuzzy nature of a statistical hypothesis itself. Both parametric and nonparametric tests are considered. Existing work in the field of statistics with vague data is concerned, among others, with descriptive statistics, limit theorems and some aspects of statistical inference [11], estimation problems and Bayes methods [21,24], and reliability estimation [22-23]. Ref. [20] touches upon the general problem of the necessity to develop fuzzy Bayesian inference. Ref. [7] provides a basic survey of fuzzy techniques with applications of the above, and potentially related applications of the techniques discussed here are investigated in [8-10]. Ref. [18] provides a survey of the impacts of vague data on descriptive statistics and estimation problems, and [19] uses fuzzy data in linear dynamic systems.
2. Fuzzy samples As mentioned in the preceding section, the nature of sample data pertaining to the state of nature in question may require a proper modeling of these data by fuzzy subsets of the real line. Because most vague data can be properly modeled by normalized and convex fuzzy subsets of the real line and, not least, because such fuzzy sets allow an easier handling within the mathematical framework, we follow the presentation of fuzzy numbers and fuzzy intervals given by Zadeh [25] and Dubois and Prade [3], respectively. In favor of a faster computation, L-R representations of fuzzy numbers and fuzzy intervals are used in this paper:
\[
(\delta)(x) =
\begin{cases}
L[(m_1 - x)/a_1], & x \le m_1, \\
R[(m_2 - x)/a_2], & x \ge m_2, \\
1, & \text{otherwise.}
\end{cases}
\tag{1}
\]
Clearly, $\delta$ is a fuzzy number in the sense of [25, 3] iff $m_1 = m_2$; otherwise, $\delta$ is said to be a fuzzy interval. In this paper, if not otherwise stated, fuzzy data are understood as fuzzy sets with characteristic functions given in Eq. (1). Two common instances of fuzzy data modeled in this way are
1. linear fuzzy data $T(m_1, m_2, a_1, a_2)$:
\[
L[(m_1 - x)/a_1] = \max\left[\frac{x - m_1 + a_1}{a_1},\, 0\right], \qquad
R[(m_2 - x)/a_2] = \max\left[\frac{m_2 + a_2 - x}{a_2},\, 0\right],
\tag{2}
\]
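2. exponential fuzzy data $E(m_1, m_2, a_1, a_2, p_1, p_2)$:
\[
L[(m_1 - x)/a_1] = \exp\left[-\left(\frac{m_1 - x}{a_1}\right)^{p_1}\right], \qquad
R[(m_2 - x)/a_2] = \exp\left[-\left(\frac{x - m_2}{a_2}\right)^{p_2}\right].
\tag{3}
\]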
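The two families of fuzzy data can be evaluated numerically along the following lines. This is a minimal sketch, not the authors' implementation; the function names are hypothetical, and the left branch of the exponential form is assumed to mirror the right branch with parameters $a_1$ and $p_1$.

```python
import math

def linear_membership(x, m1, m2, a1, a2):
    """Membership of x in the linear fuzzy datum T(m1, m2, a1, a2), cf. Eq. (2)."""
    if x <= m1:
        return max((x - m1 + a1) / a1, 0.0)
    if x >= m2:
        return max((m2 + a2 - x) / a2, 0.0)
    return 1.0  # plateau between m1 and m2

def exponential_membership(x, m1, m2, a1, a2, p1, p2):
    """Membership of x in the exponential fuzzy datum E(m1, m2, a1, a2, p1, p2).

    Assumption: the left branch mirrors the right branch of Eq. (3).
    """
    if x <= m1:
        return math.exp(-(((m1 - x) / a1) ** p1))
    if x >= m2:
        return math.exp(-(((x - m2) / a2) ** p2))
    return 1.0

# Symmetrical cases T(m, a) and E(m, a, p) are obtained with m1 = m2, a1 = a2, p1 = p2.
half_way = linear_membership(4.0, 5.0, 5.0, 2.0, 2.0)
one_spread = exponential_membership(7.0, 5.0, 5.0, 2.0, 2.0, 2.0, 2.0)
```

Both functions return 1 on the core $[m_1, m_2]$, so the modeled data are normalized as required in the text.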
For symmetrical linear fuzzy numbers $T(m, m, a, a)$ we abbreviate $T(m, a)$, and symmetrical exponential fuzzy numbers $E(m, m, a, a, p, p)$ are referred to as $E(m, a, p)$ hereafter. Once we have obtained $n$ fuzzy data $\delta_i$, $i = 1, \ldots, n$, the question arises how to aggregate them to a fuzzy sample vector $\boldsymbol{\delta} = (\delta_1, \ldots, \delta_n)$, which is supposed to be a fuzzy subset of $\mathbb{R}^n$. A general definition may be given as follows.
Definition 1. Let $\delta_i$, $i = 1, \ldots, n$, be $n$ fuzzy data; then a corresponding fuzzy sample vector $\boldsymbol{\delta}$ is a fuzzy subset of $\mathbb{R}^n$ given by its membership function
\[
(\boldsymbol{\delta})(x) = A_n((\delta_1)(x_1), \ldots, (\delta_n)(x_n)) \quad \forall x = (x_1, \ldots, x_n) \in \mathbb{R}^n,
\tag{4}
\]
with $A_n: [0, 1]^n \to [0, 1]$ being an aggregation operator conditioned on parameter $n$ such that
(1) $A_n((\delta_1)(x_1), \ldots, (\delta_n)(x_n)) = 1$ if $(\delta_i)(x_i) = 1$, $i = 1(1)n$,
(2) $A_n = \mathrm{id}$ if $n = 1$,
(3) $A_n(\mathbb{I}_{\{x_1\}}, \ldots, \mathbb{I}_{\{x_n\}}) = \mathbb{I}_{\{(x_1, \ldots, x_n)\}}$,
(4) $A_n(a_1, \ldots, a_n)$ is a non-decreasing function in $a_i \in [0, 1]$, $i = 1(1)n$,
where $\mathbb{I}_{\{\cdot\}}$ denotes the classical characteristic function. A fuzzy sample vector $\boldsymbol{\delta}$ is said to be convex if
(5) $\forall x_1, x_2 \in S_X^n$, $\forall \lambda \in [0, 1]$: $(\boldsymbol{\delta})(\lambda x_1 + (1 - \lambda) x_2) \ge \min[(\boldsymbol{\delta})(x_1), (\boldsymbol{\delta})(x_2)]$.
Property (1) of the aggregation function provides normalization of the fuzzy sample vector, property (2) ensures that for samples of size 1 the fuzzy sample vector is represented by the single fuzzy datum itself, and property (3) implies that no fuzziness is obtained by applying the aggregation function to crisp data. Item (4) means, in other words, that the fuzziness of an aggregated sample vector cannot decrease if the fuzziness of the corresponding data is increasing. It remains to give a principle of how to extend statistical inference to fuzzy sample vectors, which are fuzzy subsets of the sample space rather than elements of it. Most concepts in statistical inference are based on mappings from the sample space $S_X^n \subseteq \mathbb{R}^n$ to the real line. The following definition provides us with a tool to deal with the fuzzification of the domain of statistical inference.
Definition 2. Let $\boldsymbol{\delta}$ be a fuzzy sample vector and $s: S_X^n \to \mathbb{R}$ a mapping from the sample space to the real line; then the image $\rho = s(\boldsymbol{\delta})$ of $\boldsymbol{\delta}$ under the map $s$ is given by
\[
(\rho)(z) = \sup_{x \in S_X^n :\, z = s(x)} (\boldsymbol{\delta})(x),
\tag{5}
\]
with $(\rho)(z) = 0$ if $z \ne s(x)$ $\forall x \in S_X^n$.
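The supremum in Definition 2 can be approximated by brute force on a finite grid. The sketch below is an illustration under stated assumptions, not a method from the paper: the helper names, the grid, and the tolerance `tol` are hypothetical, and the result is only as accurate as the grid is fine.

```python
import itertools

def triangular(m, a):
    """Membership function of the symmetrical linear fuzzy number T(m, a)."""
    return lambda x: max(1.0 - abs(x - m) / a, 0.0)

def image_membership(s, data, grids, aggregate, z, tol=1e-9):
    """Approximate (rho)(z) = sup{(delta)(x) : s(x) = z} over a finite grid."""
    best = 0.0  # stays 0 when no grid point is mapped to z
    for x in itertools.product(*grids):
        if abs(s(x) - z) <= tol:
            degree = aggregate([d(xi) for d, xi in zip(data, x)])
            best = max(best, degree)
    return best

def sample_mean(x):
    return sum(x) / len(x)

# Two fuzzy data T(0, 1) and T(2, 1), aggregated with the minimum operator.
data = [triangular(0.0, 1.0), triangular(2.0, 1.0)]
grid = [i / 20 for i in range(-20, 61)]  # -1.0, -0.95, ..., 3.0
rho_at_1_5 = image_membership(sample_mean, data, [grid, grid], min, 1.5)
```

The grid search grows exponentially with the sample size $n$, which is one reason closed-form or cut-based representations are preferable when they are available.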
The rule given above allows us to induce from each fuzzy sample vector a fuzzy subset of the real line. The result of statistical inference therefore depends on the aggregation function used for constructing the fuzzy sample vector from the given fuzzy data. Among others, the following two aggregation functions can be considered:
1. minimum aggregation [11]:
\[
A_n((\delta_1)(x_1), \ldots, (\delta_n)(x_n)) = \min_{i = 1, \ldots, n} [(\delta_i)(x_i)],
\tag{6}
\]
2. product aggregation [20]:
\[
A_n((\delta_1)(x_1), \ldots, (\delta_n)(x_n)) = \prod_{i=1}^{n} [(\delta_i)(x_i)].
\tag{7}
\]
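A minimal sketch of the two aggregation operators follows; the function names are hypothetical. It operates on a list of membership degrees $(\delta_i)(x_i)$ that is assumed to have been computed beforehand.

```python
import math

def aggregate_min(degrees):
    """Minimum aggregation of the membership degrees, cf. Eq. (6)."""
    return min(degrees)

def aggregate_prod(degrees):
    """Product aggregation of the membership degrees, cf. Eq. (7)."""
    return math.prod(degrees)

degrees = [0.5, 0.8, 1.0]
by_min = aggregate_min(degrees)    # driven by the smallest degree only
by_prod = aggregate_prod(degrees)  # every degree below 1 lowers the result
```

Because all degrees lie in $[0, 1]$, the product can never exceed the minimum, which is the pointwise form of the inclusion between the two resulting fuzzy sample vectors.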
The minimum operator is the commonly used aggregation operator in fuzzy set theory; it provides the big advantages of fast computation and a simple representation of the corresponding $\alpha$-cuts. The product operator implicitly supposes some interactivity or compensation between the fuzzy data $\delta_i$, whereas the minimum operator does not assume any interactivity. Note that the following holds for the fuzzy sample vector $\boldsymbol{\delta}_{\min}$, aggregating fuzzy data via the minimum operator, and $\boldsymbol{\delta}_{\mathrm{prod}}$, applying the product operator to the same data:
\[
\boldsymbol{\delta}_{\mathrm{prod}} \subseteq \boldsymbol{\delta}_{\min},
\tag{8}
\]
where $\subseteq$ denotes the fuzzy inclusion. It is quite clear that both operators meet the requirements (1)-(4) of Definition 1 and hence are valid aggregation functions. But while the minimum operator always generates convex fuzzy sample vectors, regardless of the specific shape of the fuzzy data to be aggregated, the convexity of $\boldsymbol{\delta}_{\mathrm{prod}}$ cannot be shown in general. However, there exists a class of fuzzy data where convexity is attained also via product aggregation.
Lemma 1. (1) $\boldsymbol{\delta}_{\min}$ is convex regarding any fuzzy data, (2) $\boldsymbol{\delta}_{\mathrm{prod}}$ is convex regarding the exponential fuzzy data $\delta_i = E(m_i, a_i, p)$, $p > 0$.
Proof. Let $x_1, x_2 \in \mathbb{R}^n$ and $x_3 := \lambda x_1 + (1 - \lambda) x_2$, $\lambda \in [0, 1]$.
(1) For any fuzzy data the following holds for some index $k$:
\[
(\boldsymbol{\delta}_{\min})(x_3) = \min_{i = 1(1)n} [(\delta_i)(x_{3i})] = (\delta_k)(x_{3k}) \ge \min[(\delta_k)(x_{1k}), (\delta_k)(x_{2k})]
\]
\[
\ge \min\left[\min_{i = 1(1)n} (\delta_i)(x_{1i}),\ \min_{i = 1(1)n} (\delta_i)(x_{2i})\right] = \min[(\boldsymbol{\delta}_{\min})(x_1), (\boldsymbol{\delta}_{\min})(x_2)].
\]
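(2) Consider $\delta_i = E(m_i, a_i, p)$, $i = 1(1)n$. Thus,
\[
(\boldsymbol{\delta}_{\mathrm{prod}})(x_k) = \prod_{i=1}^{n} [(\delta_i)(x_{ki})] = \exp\left(-\sum_{i=1}^{n} \left|\frac{x_{ki} - m_i}{a_i}\right|^p\right), \quad k \in \{1, 2, 3\}.
\]
Now suppose $(\boldsymbol{\delta}_{\mathrm{prod}})(x_3) < \min_{k \in \{1,2\}} (\boldsymbol{\delta}_{\mathrm{prod}})(x_k)$. Hence,
\[
\exp\left(-\sum_{i=1}^{n} \left|\frac{x_{3i} - m_i}{a_i}\right|^p\right) < \min_{k \in \{1,2\}} \exp\left(-\sum_{i=1}^{n} \left|\frac{x_{ki} - m_i}{a_i}\right|^p\right),
\]
that is,
\[
\sum_{i=1}^{n} \left|\frac{x_{3i}}{a_i} - \frac{m_i}{a_i}\right|^p > \max_{k \in \{1,2\}} \sum_{i=1}^{n} \left|\frac{x_{ki}}{a_i} - \frac{m_i}{a_i}\right|^p.
\]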
By using the notation $y_k = (x_{k1}/a_1, \ldots, x_{kn}/a_n)$ for $k \in \{1, 2, 3\}$ and $m' = (m_1/a_1, \ldots, m_n/a_n)$, we can write in terms of $p$-norms
\[
\|y_3 - m'\|_p > \max_{k \in \{1,2\}} \|y_k - m'\|_p.
\]
But since, by the triangle inequality for the $p$-norm (Minkowski's inequality, which requires $p \ge 1$), $\|y_3 - m'\|_p = \|\lambda (y_1 - m') + (1 - \lambda)(y_2 - m')\|_p \le \lambda \|y_1 - m'\|_p + (1 - \lambda) \|y_2 - m'\|_p$, we derive that $\lambda \|y_1 - m'\|_p + (1 - \lambda) \|y_2 - m'\|_p > \max_{k \in \{1,2\}} \|y_k - m'\|_p$, which is a contradiction. □
The purpose of statistical inference, in general, is to draw conclusions about the distribution of the state of nature under consideration from a drawn sample vector pertaining to the population. Some concepts of inference require complex functions of the sample vector. Particularly in the field of hypotheses testing, test statistics can be very burdensome to compute when applied to fuzzy sample vectors. Only in rare cases can an analytical representation of their membership functions be achieved. The concept of the $\alpha$-cut representation can help to interpolate reasonable results. For this reason we take a closer look at the $\alpha$-cut representation of fuzzy sample vectors. The $\alpha$-cut of a fuzzy sample vector $\boldsymbol{\delta}$ is denoted by $[\boldsymbol{\delta}]_\alpha = \{x \in \mathbb{R}^n : (\boldsymbol{\delta})(x) \ge \alpha\}$, and the membership function of a fuzzy sample vector can be recovered from its $\alpha$-cut representation by $(\boldsymbol{\delta})(x) = \sup_{\alpha \in (0,1]} \min[\alpha, \mathbb{I}_{[\boldsymbol{\delta}]_\alpha}(x)]$. The inclusion $[\boldsymbol{\delta}_{\mathrm{prod}}]_\alpha \subseteq [\boldsymbol{\delta}_{\min}]_\alpha$, $\forall \alpha \in (0, 1]$, follows immediately from Eq. (8). The shape of the $\alpha$-cut $[\boldsymbol{\delta}]_\alpha$ depends on the aggregation function used for modeling the fuzzy sample vector $\boldsymbol{\delta}$. It is easy to see that the $\alpha$-cuts of $\boldsymbol{\delta}_{\min}$ equal the Cartesian products of the $\alpha$-cuts of the corresponding fuzzy data:
\[
[\boldsymbol{\delta}_{\min}]_\alpha = [\delta_1]_\alpha \times \cdots \times [\delta_n]_\alpha \quad \forall \alpha \in (0, 1].
\tag{9}
\]
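The $\alpha$-cuts $[\boldsymbol{\delta}_{\mathrm{prod}}]_\alpha$ of fuzzy sample vectors aggregated by the product operator cannot, in general, be expressed in terms of $\alpha$-cuts of the fuzzy data. However, in the case of exponential fuzzy data $\delta_i = E(m_i, a_i, 2)$, $i = 1(1)n$, $[\boldsymbol{\delta}_{\mathrm{prod}}]_\alpha$ turns out to be bounded by the hyperellipsoid circumscribed by $[\boldsymbol{\delta}_{\min}]_\alpha$:
\[
[\boldsymbol{\delta}_{\mathrm{prod}}]_\alpha = \left\{x \in \mathbb{R}^n : \sum_{i=1}^{n} \left(\frac{x_i - m_i}{a_i}\right)^2 \le -\ln \alpha\right\} \subseteq [\boldsymbol{\delta}_{\min}]_\alpha \quad \forall \alpha \in (0, 1].
\tag{10}
\]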
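For exponential fuzzy data $E(m_i, a_i, 2)$ the two kinds of cuts can be tested pointwise as sketched below. The function names are hypothetical; the box test assumes that each datum has the cut $[m_i - a_i\sqrt{-\ln\alpha},\, m_i + a_i\sqrt{-\ln\alpha}]$, which follows from solving $\exp[-((x - m_i)/a_i)^2] \ge \alpha$.

```python
import math

def in_min_cut(x, m, a, alpha):
    """True if x lies in the alpha-cut of the minimum-aggregated vector (a box)."""
    radius = math.sqrt(-math.log(alpha))
    return all(abs(xi - mi) <= ai * radius for xi, mi, ai in zip(x, m, a))

def in_prod_cut(x, m, a, alpha):
    """True if x lies in the alpha-cut of the product-aggregated vector (an ellipsoid)."""
    total = sum(((xi - mi) / ai) ** 2 for xi, mi, ai in zip(x, m, a))
    return total <= -math.log(alpha)

m, a, alpha = [0.0, 0.0], [1.0, 2.0], 0.5
corner_like = (0.8, 1.6)  # close to a corner of the box
central = (0.5, 1.0)      # well inside both cuts
```

Points near the corners of the box belong to the minimum-based cut but not to the product-based one, which is how the assumed interactivity of the data shows up geometrically.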
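The shape of $[\boldsymbol{\delta}_{\mathrm{prod}}]_\alpha$ reflects the assumed interactivity of the underlying fuzzy data. The following proposition describes the general shape of $\alpha$-cuts of fuzzy sample vectors $\boldsymbol{\delta}$ and derives unimodality of $\boldsymbol{\delta}$ from it.
Proposition 1. The $\alpha$-cuts $[\boldsymbol{\delta}]_\alpha$ of fuzzy sample vectors $\boldsymbol{\delta}$ aggregating the fuzzy data $\delta_i$, $i = 1(1)n$, are star-shaped subsets of $\mathbb{R}^n$ and therefore connected sets. In particular, $[\boldsymbol{\delta}]_\alpha$, $\alpha \in (0, 1]$, are convex sets iff $\boldsymbol{\delta}$ is a convex fuzzy sample vector. Hence, $\boldsymbol{\delta}$ are unimodal fuzzy subsets of $\mathbb{R}^n$.
Proof. Let $x_0 = (x_{01}, \ldots, x_{0n}) \in \mathbb{R}^n$ such that $(\delta_i)(x_{0i}) = 1$, $i = 1(1)n$. Thus, $(\boldsymbol{\delta})(x_0) = A_n((\delta_1)(x_{01}), \ldots, (\delta_n)(x_{0n})) = 1$ and $x_0 \in [\boldsymbol{\delta}]_1 \subseteq [\boldsymbol{\delta}]_\alpha$, $\forall \alpha \in (0, 1]$. To prove that $[\boldsymbol{\delta}]_\alpha$ is star-shaped we have to find a reference point $x \in [\boldsymbol{\delta}]_\alpha$ such that for all $y \in [\boldsymbol{\delta}]_\alpha$ and $\lambda \in [0, 1]$, $\lambda x + (1 - \lambda) y = z \in [\boldsymbol{\delta}]_\alpha$. If we take $x = x_0$ as reference point, we get
\[
(\boldsymbol{\delta})(z) = (\boldsymbol{\delta})(\lambda x_0 + (1 - \lambda) y) = A_n((\delta_1)(\lambda x_{01} + (1 - \lambda) y_1), \ldots, (\delta_n)(\lambda x_{0n} + (1 - \lambda) y_n))
\]
\[
\ge A_n(\min[(\delta_1)(x_{01}), (\delta_1)(y_1)], \ldots, \min[(\delta_n)(x_{0n}), (\delta_n)(y_n)]) = A_n((\delta_1)(y_1), \ldots, (\delta_n)(y_n)) \ge \alpha.
\]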
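It remains to prove the relation between convex $\alpha$-cuts and convex fuzzy sample vectors. Suppose $\boldsymbol{\delta}$ is convex. Hence, for $x_1, x_2 \in [\boldsymbol{\delta}]_\alpha$ and $x_3 = \lambda x_1 + (1 - \lambda) x_2$, the following holds: $(\boldsymbol{\delta})(x_3) \ge \min[(\boldsymbol{\delta})(x_1), (\boldsymbol{\delta})(x_2)] \ge \alpha$. Thus $x_3$ belongs to $[\boldsymbol{\delta}]_\alpha$, which implies the convexity of the $\alpha$-cuts of $\boldsymbol{\delta}$. Now assume convexity of the $\alpha$-cuts of $\boldsymbol{\delta}$ and, without loss of generality, let $x_1, x_2 \in S_X^n$ with $\alpha = (\boldsymbol{\delta})(x_1) \le \beta = (\boldsymbol{\delta})(x_2)$, so that $x_1 \in [\boldsymbol{\delta}]_\alpha$ and $x_2 \in [\boldsymbol{\delta}]_\beta$. Hence, $x_2 \in [\boldsymbol{\delta}]_\alpha$ and thus $x_3 \in [\boldsymbol{\delta}]_\alpha$, since $[\boldsymbol{\delta}]_\alpha$ is convex. Thus $(\boldsymbol{\delta})(x_3) \ge \alpha = \min[(\boldsymbol{\delta})(x_1), (\boldsymbol{\delta})(x_2)]$, which implies the convexity of $\boldsymbol{\delta}$. □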
Next we investigate the representation of $\alpha$-cuts of images of fuzzy sample vectors under certain mappings. Once the shapes of the $\alpha$-cuts $[\boldsymbol{\delta}]_\alpha$ of the fuzzy sample vector itself are known, we are able to use this information for constructing the $\alpha$-cuts $[s(\boldsymbol{\delta})]_\alpha$ of the image $s(\boldsymbol{\delta})$. But while
\[
s([\boldsymbol{\delta}]_\alpha) \subseteq [s(\boldsymbol{\delta})]_\alpha
\tag{11}
\]
holds for all mappings $s(\cdot)$ applied to any fuzzy sample vector $\boldsymbol{\delta}$, $s([\boldsymbol{\delta}]_\alpha) = [s(\boldsymbol{\delta})]_\alpha$ is not true in general. In the following we show that there are closer relations between $s([\boldsymbol{\delta}]_\alpha)$ and $[s(\boldsymbol{\delta})]_\alpha$ than the simple inclusion of Eq. (11). In particular, we also give some conditions for $\boldsymbol{\delta}$ and $s(\cdot)$ under which Eq. (11) turns into an equality. But first a useful lemma is given.
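Lemma 2. Let $\boldsymbol{\delta}$ be any fuzzy sample vector and $([\boldsymbol{\delta}]_\alpha)_{\alpha \in (0,1]}$ its family of $\alpha$-cuts; then
\[
[\boldsymbol{\delta}]_\beta = \bigcap_{\alpha < \beta} [\boldsymbol{\delta}]_\alpha, \quad \forall \beta \in (0, 1].
\]
Proof. The inclusion $[\boldsymbol{\delta}]_\beta \subseteq \bigcap_{\alpha < \beta} [\boldsymbol{\delta}]_\alpha$ holds since the $\alpha$-cuts are nested. Conversely, if $(\boldsymbol{\delta})(x) \ge \alpha$ for all $\alpha < \beta$, then $(\boldsymbol{\delta})(x) \ge \beta$. In particular, for any sequence $(\alpha_k)_{k \in \mathbb{N}}$ with $\alpha_k \uparrow \beta$,
\[
[\boldsymbol{\delta}]_\beta = \bigcap_{k \in \mathbb{N}} [\boldsymbol{\delta}]_{\alpha_k}, \quad \forall \beta \in (0, 1]. \qquad \Box
\]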
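Proposition 2. Let $s: X \to Y$, and $\boldsymbol{\delta}$ a fuzzy sample vector defined on $X \subseteq \mathbb{R}^n$. Then
(1) $s([\boldsymbol{\delta}]_{\bar{\alpha}}) = [s(\boldsymbol{\delta})]_{\bar{\alpha}}$, $\forall \alpha \in [0, 1)$,
(2) $[s(\boldsymbol{\delta})]_\beta = \bigcap_{\alpha < \beta} s([\boldsymbol{\delta}]_\alpha)$, $\forall \beta \in (0, 1]$,
where $[\cdot]_{\bar{\alpha}}$ denotes the strong $\alpha$-cut, i.e. the set of points with membership strictly greater than $\alpha$.
Proof. (1) If $y \in s([\boldsymbol{\delta}]_{\bar{\alpha}})$, then $\exists x \in s^{-1}(y)$: $(\boldsymbol{\delta})(x) > \alpha$. Thus $(s(\boldsymbol{\delta}))(y) = \sup_{x \in s^{-1}(y)} (\boldsymbol{\delta})(x) > \alpha$. Hence $y \in [s(\boldsymbol{\delta})]_{\bar{\alpha}}$. If $y \notin s([\boldsymbol{\delta}]_{\bar{\alpha}})$, then $\forall x \in s^{-1}(y)$: $(\boldsymbol{\delta})(x) \le \alpha$. Thus $(s(\boldsymbol{\delta}))(y) = \sup_{x \in s^{-1}(y)} (\boldsymbol{\delta})(x) \le \alpha$. Hence $y \notin [s(\boldsymbol{\delta})]_{\bar{\alpha}}$.
(2) If $y \in s([\boldsymbol{\delta}]_{\bar{\alpha}})$ $\forall \alpha < \beta$, then $y \in [s(\boldsymbol{\delta})]_{\bar{\alpha}}$ $\forall \alpha < \beta$. Thus $(s(\boldsymbol{\delta}))(y) = \sup\{\alpha : y \in [s(\boldsymbol{\delta})]_{\bar{\alpha}}\} \ge \beta$. Hence $y \in [s(\boldsymbol{\delta})]_\beta$. If $y \in [s(\boldsymbol{\delta})]_\beta$, then $y \in [s(\boldsymbol{\delta})]_{\bar{\alpha}}$ $\forall \alpha < \beta$. Thus $y \in s([\boldsymbol{\delta}]_{\bar{\alpha}})$ $\forall \alpha < \beta$. Hence $y \in s([\boldsymbol{\delta}]_\alpha)$ $\forall \alpha < \beta$. □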
Corollary 1. If $s: X \to \mathbb{R}$ is a continuous function and $\boldsymbol{\delta}$ a fuzzy sample vector defined on $X \subseteq \mathbb{R}^n$, then $s(\boldsymbol{\delta})$ is a convex normalized fuzzy set in $\mathbb{R}$, i.e. a fuzzy number or a fuzzy interval.
Proof. Since $[\boldsymbol{\delta}]_\alpha$ are connected subsets of $X$ and $s(\cdot)$ is a continuous function, $s([\boldsymbol{\delta}]_\alpha)$ are connected subsets and thus convex subsets of $\mathbb{R}$ for all $\alpha \in (0, 1]$. Because of Proposition 2(2), $[s(\boldsymbol{\delta})]_\beta$ are intersections of convex sets $\forall \beta \in (0, 1]$, and therefore convex themselves. Hence $s(\boldsymbol{\delta})$ is a convex fuzzy set in $\mathbb{R}$. Clearly, normalization holds. □
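Proposition 3. Let $s: X \to Y$ be a continuous mapping and $\boldsymbol{\delta}$ a fuzzy sample vector defined on $X \subseteq \mathbb{R}^n$ such that $[\boldsymbol{\delta}]_\alpha$ are compact subsets of $X$ for all $\alpha \in (0, 1]$. Then
\[
s([\boldsymbol{\delta}]_\alpha) = [s(\boldsymbol{\delta})]_\alpha, \quad \forall \alpha \in (0, 1].
\]
Proof. Let $y \in [s(\boldsymbol{\delta})]_\beta$; then $y \in s([\boldsymbol{\delta}]_\alpha)$ $\forall \alpha \in (0, \beta)$. Thus $\forall \alpha \in (0, \beta)$ $\exists x_\alpha \in [\boldsymbol{\delta}]_\alpha$: $s(x_\alpha) = y$. If there exists an $\alpha_0 < \beta$ with $x_{\alpha_0} \in [\boldsymbol{\delta}]_\beta$, then $s(x_{\alpha_0}) = y \in s([\boldsymbol{\delta}]_\beta)$. If there is no such $\alpha_0$, then $x_\alpha \notin [\boldsymbol{\delta}]_\beta$ $\forall \alpha \in (0, \beta)$. Let $(\alpha_k)_{k \in \mathbb{N}} \subset (0, \beta)$ such that $\alpha_k \uparrow \beta$; then $(x_{\alpha_k})_{k \in \mathbb{N}}$ is a sequence in $X$ such that $x_{\alpha_k} \in [\boldsymbol{\delta}]_{\alpha_k}$ $\forall k \in \mathbb{N}$. Since $[\boldsymbol{\delta}]_{\alpha_k}$ are compact subsets of $X$, $\forall k \in \mathbb{N}$, there exists a convergent subsequence $(x_{\alpha_{k_i}})_{i \in \mathbb{N}}$ such that $x_{\alpha_{k_i}} \to x$ with $x \in [\boldsymbol{\delta}]_{\alpha_{k_i}}$ $\forall i \in \mathbb{N}$. Thus $x \in \bigcap_{i \in \mathbb{N}} [\boldsymbol{\delta}]_{\alpha_{k_i}} = [\boldsymbol{\delta}]_\beta$. By taking advantage of the continuity of $s(\cdot)$, we derive $s(x) = s(\lim_{i \to \infty} x_{\alpha_{k_i}}) = \lim_{i \to \infty} s(x_{\alpha_{k_i}}) = y$, and finally $y \in s([\boldsymbol{\delta}]_\beta)$. □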
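Proposition 4. Let $s: X \to Y$ be a map from $X \subseteq \mathbb{R}^n$ to the finite set $Y = \{a_i : i = 1, \ldots, m\}$ and $\boldsymbol{\delta}$ a fuzzy sample vector defined on $X$. Then
\[
s([\boldsymbol{\delta}]_\alpha) = [s(\boldsymbol{\delta})]_\alpha \quad \text{for almost all } \alpha \in (0, 1].
\]
(There are at most $m$ different $\alpha \in (0, 1]$ such that $s([\boldsymbol{\delta}]_\alpha) \ne [s(\boldsymbol{\delta})]_\alpha$.)
Proof. Since $s([\boldsymbol{\delta}]_\alpha) \subseteq s([\boldsymbol{\delta}]_\beta) \subseteq Y$, $\forall\, 0 < \beta \le \alpha \le 1$, the family $\{s([\boldsymbol{\delta}]_\alpha) : \alpha \in (0, 1]\}$ consists of at most $m$ different subsets of $Y$, denoted by $A_k \supset A_{k+1}$, $k = 1, \ldots, m - 1$. Let $\alpha_k = \sup\{\alpha : A_k \subseteq s([\boldsymbol{\delta}]_\alpha)\}$, $k = 1, \ldots, m - 1$. Now suppose $\exists\, \alpha_k < \beta < \alpha < \alpha_{k+1}$ such that $s([\boldsymbol{\delta}]_\alpha) \subset s([\boldsymbol{\delta}]_\beta)$; then $A_{k+1} = s([\boldsymbol{\delta}]_\alpha) \subset s([\boldsymbol{\delta}]_\beta) = A_k$, which contradicts the fact that $\{s([\boldsymbol{\delta}]_\alpha) : \alpha \in (0, 1]\} = \{A_k : k = 1, \ldots, m\}$. The same justification applies to all $\alpha, \beta$ with $\alpha_0 = 0 < \beta < \alpha < \alpha_1$. Note that, due to normalization of $\boldsymbol{\delta}$, $\alpha_m = 1$. Hence $s([\boldsymbol{\delta}]_\alpha) = s([\boldsymbol{\delta}]_\beta)$ $\forall \alpha, \beta \in (\alpha_k, \alpha_{k+1})$, $k = 0, \ldots, m - 1$.
On the other hand, by taking advantage of Proposition 2(2), we know that $s([\boldsymbol{\delta}]_\alpha) \subseteq [s(\boldsymbol{\delta})]_\alpha \subseteq s([\boldsymbol{\delta}]_\beta)$ $\forall\, 0 < \beta < \alpha \le 1$. Hence $s([\boldsymbol{\delta}]_\alpha) = [s(\boldsymbol{\delta})]_\alpha$, $\forall \alpha \in (\alpha_k, \alpha_{k+1})$, $k = 0, \ldots, m - 1$, i.e.
\[
s([\boldsymbol{\delta}]_\alpha) = [s(\boldsymbol{\delta})]_\alpha, \quad \forall \alpha \ne \alpha_k, \quad k = 1, \ldots, m. \qquad \Box
\]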
The following example shows how different aggregation functions may affect the results of statistical inference with fuzzy data.
Example 1. One of the prime characteristics of a random variable is its expected value. On the basis of a sample one can estimate the expected value by the unbiased statistic
\[
\bar{x}_n(x) = \frac{1}{n} \sum_{i=1}^{n} x_i,
\]
where $x = (x_1, \ldots, x_n)$ represents the sample vector. The substantial relevance of this statistic does not only stem from its important role in estimation problems, but also from its capacity as a test statistic.
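Now suppose we have a fuzzy sample vector $\boldsymbol{\delta}$ with compact $\alpha$-cuts. Hence, since $\bar{x}_n: \mathbb{R}^n \to \mathbb{R}$ is a continuous function, we can apply Proposition 3 and thus derive for the $\alpha$-cuts of the fuzzy sample mean
\[
[\bar{x}_n(\boldsymbol{\delta})]_\alpha = \bar{x}_n([\boldsymbol{\delta}]_\alpha) = \left[\min_{x \in [\boldsymbol{\delta}]_\alpha} \bar{x}_n(x),\ \max_{x \in [\boldsymbol{\delta}]_\alpha} \bar{x}_n(x)\right].
\tag{12}
\]
Instances of fuzzy sample vectors with compact $\alpha$-cuts are, among others:
(i) $\boldsymbol{\delta}_{\min}$, aggregating fuzzy data with compact $\alpha$-cuts $[\delta_i]_\alpha = [\underline{\delta}_{i\alpha}, \overline{\delta}_{i\alpha}]$, $i = 1, \ldots, n$,
(ii) $\boldsymbol{\delta}_{\mathrm{prod}}$, aggregating the exponential fuzzy data $E(m_i, a_i, 2)$, $i = 1, \ldots, n$.
See Eqs. (9) and (10), respectively. The corresponding $\alpha$-cut of the fuzzy sample mean, given in Eq. (12), then assumes the following forms:
\[
\text{(a)} \quad [\bar{x}_n(\boldsymbol{\delta}_{\min})]_\alpha = \left[\frac{1}{n} \sum_{i=1}^{n} \underline{\delta}_{i\alpha},\ \frac{1}{n} \sum_{i=1}^{n} \overline{\delta}_{i\alpha}\right],
\]
\[
\text{(b)} \quad [\bar{x}_n(\boldsymbol{\delta}_{\mathrm{prod}})]_\alpha = \left[\frac{1}{n} \sum_{i=1}^{n} m_i - \frac{1}{n} \sqrt{-\ln \alpha \sum_{i=1}^{n} a_i^2},\ \frac{1}{n} \sum_{i=1}^{n} m_i + \frac{1}{n} \sqrt{-\ln \alpha \sum_{i=1}^{n} a_i^2}\right].
\]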
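This result can be proved as follows: Item (a) is easily derivable from Eqs. (9) and (12). To prove item (b), we have to maximize and minimize the function $\bar{x}_n: \mathbb{R}^n \to \mathbb{R}$ on a subset of $\mathbb{R}^n$ bounded by the hyperellipsoid
\[
\sum_{i=1}^{n} \left(\frac{x_i - m_i}{a_i}\right)^2 = -\ln \alpha,
\tag{13}
\]
respectively. Since $\bar{x}_n(\cdot)$ is a monotone function in $x_i$, $i = 1, \ldots, n$, the maximum and the minimum will be assumed on the hyperellipsoid. By solving Eq. (13) for any $x_i$ (here $x_n$),
\[
x_n = m_n \pm a_n \sqrt{-\ln \alpha}\, \sqrt{1 - \sum_{i=1}^{n-1} \left(\frac{x_i - m_i}{a_i \sqrt{-\ln \alpha}}\right)^2},
\tag{14}
\]
and inserting the result into the sample mean function $\bar{x}_n(x) = \frac{1}{n} \sum_{i=1}^{n} x_i$, we get after partial differentiation
\[
\frac{\partial \bar{x}_n}{\partial x_j} = \frac{1}{n} \left[1 \mp \frac{a_n (x_j - m_j)}{a_j^2 \sqrt{-\ln \alpha}\, \sqrt{1 - \sum_{i=1}^{n-1} \left((x_i - m_i)/(a_i \sqrt{-\ln \alpha})\right)^2}}\right] = 0, \quad j = 1(1)n - 1.
\tag{15}
\]
Thus,
\[
\frac{x_j - m_j}{a_j^2} = \pm \frac{\sqrt{-\ln \alpha}}{a_n} \sqrt{1 - \sum_{i=1}^{n-1} \left(\frac{x_i - m_i}{a_i \sqrt{-\ln \alpha}}\right)^2}, \quad j = 1(1)n - 1,
\tag{16}
\]
and hence, by applying Eq. (13),
\[
\frac{x_j - m_j}{a_j^2} = \frac{x_n - m_n}{a_n^2}, \quad j = 1(1)n - 1.
\tag{17}
\]
Summing up leads to
\[
\sum_{j=1}^{n-1} \left(\frac{x_j - m_j}{a_j \sqrt{-\ln \alpha}}\right)^2 = \left(\frac{x_n - m_n}{a_n^2 \sqrt{-\ln \alpha}}\right)^2 \sum_{j=1}^{n-1} a_j^2,
\tag{18}
\]
which can be substituted in Eq. (14), such that
\[
\left(\frac{x_n - m_n}{a_n \sqrt{-\ln \alpha}}\right)^2 = 1 - \left(\frac{x_n - m_n}{a_n^2 \sqrt{-\ln \alpha}}\right)^2 \sum_{i=1}^{n-1} a_i^2
\tag{19}
\]
follows. Thus,
\[
x_n = m_n \pm \sqrt{\frac{-a_n^4 \ln \alpha}{\sum_{i=1}^{n} a_i^2}}.
\tag{20}
\]
The same applies to $i = 1, \ldots, n - 1$. Inserting these values into the sample mean yields item (b). □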
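The effect of the aggregation rule on the fuzzy sample mean can be checked numerically for symmetrical exponential data $E(m_i, a_i, 2)$. The following sketch uses hypothetical function names and rests on two assumptions: each datum has the cut $m_i \pm a_i\sqrt{-\ln\alpha}$, and the extreme points on the hyperellipsoid are those given by Eq. (20).

```python
import math

def mean_cut_min(m, a, alpha):
    """Alpha-cut of the fuzzy sample mean under minimum aggregation."""
    n = len(m)
    radius = math.sqrt(-math.log(alpha))
    center = sum(m) / n
    half_width = radius * sum(a) / n  # mean of the individual half-widths
    return center - half_width, center + half_width

def mean_cut_prod(m, a, alpha):
    """Alpha-cut of the fuzzy sample mean under product aggregation."""
    n = len(m)
    center = sum(m) / n
    half_width = math.sqrt(-math.log(alpha) * sum(ai ** 2 for ai in a)) / n
    return center - half_width, center + half_width

m = [1.0, 2.0, 3.0, 6.0]  # modal values of four fuzzy observations
a = [2.0, 2.0, 2.0, 2.0]  # equal spreads
lo_min, hi_min = mean_cut_min(m, a, 0.5)
lo_prod, hi_prod = mean_cut_prod(m, a, 0.5)
```

With equal spreads $a$, the half-width is $a\sqrt{-\ln\alpha}$ under minimum aggregation but only $a\sqrt{-\ln\alpha}/\sqrt{n}$ under product aggregation, so the product-based mean becomes crisper as the sample grows.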
Fig. 1 illustrates the different impacts of minimum-rule and product-rule aggregated fuzzy sample vectors on the fuzzy sample mean. An artificial fuzzy sample of exponential fuzzy data $E(r_i, 2, 2)$, in which $r_i$ is exponentially distributed with expected value 5, is employed in this example. One can see that the fuzzy sample mean adopts the shape of fuzziness of the underlying fuzzy data in the case of minimum aggregation, whereas in the case of product aggregation the fuzzy sample mean tends to be more crisp than the underlying data, which reflects the assumed interaction between the data.
[Figure 1: three panels showing the fuzzy sample, the fuzzy sample mean via product aggregation, and the fuzzy sample mean via minimum aggregation; horizontal axes range from -5 to 30, membership from 0 to 1.]
Fig. 1. Comparison between minimum-rule-based and product-rule-based fuzzy sample means.
3. On the fuzzy nature of statistical hypotheses testing Aside from the classification of statistical tests into parametric and nonparametric tests, as mentioned in the introduction, another categorization can be performed by the distinction whether a test is a significance test or not. In this section, we outline the fuzzy potential of both significance and nonsignificance tests, tacitly approved by statisticians so far.
3.1. Significance tests Generally speaking, significance tests are statistical tests which provide a certain numerical value $\alpha$, representing a measure of reliability for the corresponding test procedure. Basically, it is defined as the probability of the error which consists of rejecting the hypothesis in question when the hypothesis is true. Such an error is said to be an error of the first kind. In effect, $\alpha$ is the known extent to which we commit such an error; in essence, $\alpha$ can be referred to as the size of the test, and usually it coincides with the significance level, thus
\[
\alpha = \mathcal{P}(x \in \mathcal{R} \mid \mathcal{H}_0 \text{ is true}),
\tag{21}
\]
where $\mathcal{R}$ is the subset of the sample space $S_X^n$ supporting the rejection of the null hypothesis $\mathcal{H}_0$, called the corresponding rejection space. Then $\alpha$ is called the level of significance of test $\mathcal{T}$, which is determined by the partition $(\mathcal{R}, \mathcal{R}^c)$ of the sample space. $\mathcal{R}^c$ denotes the complement of $\mathcal{R}$ in $S_X^n$ and is also referred to as the acceptance space $\mathcal{A}$ of test $\mathcal{T}$. It should be noted that it is quite useful to consider randomized tests, since there are conceptual differences between the aim of the critical or test function of a randomized test [12] and our discussion here. Those differences as well as the extension of the present work to the case of randomized tests are discussed in another paper of ours soon to appear. Now, on the basis of a drawn sample $x \in \mathcal{R}$, we may reject the null hypothesis $\mathcal{H}_0$ if the error of the first kind seems to be reasonably small (for instance, lower than or equal to 0.01). Such a decision can be justified by the fact that, if we were not to reject the null hypothesis, we would have to admit that we had observed a very rare event. As we can see, the implication
\[
x \in \mathcal{R} \Rightarrow \text{rejecting } \mathcal{H}_0
\]
relies on a proper definition of the predicates reasonably small or very rare. But since premises like reasonably small and very rare are fuzzy in nature rather than probabilistic in nature, we have to acknowledge that, though we are working within a probabilistic framework, the final step in such a decision process remains fuzzy. Those statisticians who would not neglect this fact may employ fuzzy sets at this point, which might be done in the following way. Since the domain of $\alpha$ is the real interval [0, 1], we have to model the linguistic variable reasonably small by a fuzzy subset $A$ of [0, 1]. Note that this embraces the classical approach of defining the set of reasonably small $\alpha$'s as a classical subset of [0, 1] (for instance [0, 0.01]). Then the fuzzy extension of the coupling
between the truth value of the named implication and the distinction whether $\alpha$ belongs to the set of reasonably small $\alpha$'s or not becomes $\tau(x \in \mathcal{R} \Rightarrow \text{rejecting } \mathcal{H}_0) = \tau(\alpha \in A) = (A)(\alpha)$, where $\tau$ is the truth evaluating function and $(A)$ the membership function of the fuzzy set $A$. Thus, by taking advantage of the fuzzy logic relation $\tau(P \Rightarrow Q) = \max[1 - \tau(P), \tau(Q)]$, we get $\tau(\text{rejecting } \mathcal{H}_0) = (A)(\alpha)$, since $\tau(x \in \mathcal{R}) = 1$. In other words, the hypothesis $\mathcal{H}_0$ will be rejected to the same extent as $\alpha$ belongs to $A$. Now the question arises of how to perform a rejection with respect to a certain degree of possibility. A proper approach to this issue might be formulated in terms of fuzzy sets. Since we are rejecting the hypothesis $\mathcal{H}_0$ not to the full extent but to the degree $(A)(\alpha)$, we might forward this ambiguity to the hypothesis itself in the sense that we say we are rejecting the fuzzy hypothesis $\mathcal{H}_0^*$, being the fuzzy subset of hypothesis $\mathcal{H}_0$ with degree of membership $(A)(\alpha)$. Now, since the transition from being a member of the set of reasonably small $\alpha$'s to not being a member of it is smooth rather than abrupt, the alteration of our decision in dependence on $\alpha$ is of quantitative nature rather than of qualitative nature as in the classical approach (rejection if $\alpha \in [0, 0.01]$; no decision otherwise). This forces us to consider the following: Due to the nature of hypotheses, there must be some other alternatives that may be true (at least one), usually called the set of alternative hypotheses. Together with the null hypothesis, they form the set of admissible hypotheses. As we saw, the event $x \in \mathcal{R}$ leads to the rejection of the fuzzy hypothesis $\mathcal{H}_0^*$. But what happens to the alternative hypotheses? There is no reason why we should neglect the fact that the same test also rejects the fuzzy hypotheses $\mathcal{H}_j^*$, being fuzzy subsets of the alternative hypotheses $\mathcal{H}_j$, respectively, with degree of membership $(A)(\alpha_j)$, such that
\[
\alpha_j = \mathcal{P}(x \in \mathcal{R} \mid \mathcal{H}_j \text{ is true}).
\tag{22}
\]
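The graded rejection described above can be sketched in a few lines. The piecewise linear membership function for "reasonably small" used here, with its breakpoints 0.01 and 0.10, is purely a hypothetical modeling choice and is not taken from the paper; the function names are likewise illustrative.

```python
def reasonably_small(alpha, full=0.01, none=0.10):
    """Hypothetical membership (A)(alpha): 1 up to `full`, linearly down to 0 at `none`."""
    if alpha <= full:
        return 1.0
    if alpha >= none:
        return 0.0
    return (none - alpha) / (none - full)

def implication(p, q):
    """Truth value of P => Q using the relation max[1 - tau(P), tau(Q)]."""
    return max(1.0 - p, q)

def degree_of_rejection(alpha):
    """Degree to which the hypothesis is rejected once x is observed in the rejection space.

    With tau(x in R) = 1 the implication reduces to tau(rejecting), so the
    degree of rejection equals (A)(alpha).
    """
    return reasonably_small(alpha)

classical_case = degree_of_rejection(0.005)  # clearly significant
borderline = degree_of_rejection(0.055)      # only partially "reasonably small"
```

Replacing `reasonably_small` by the indicator of $[0, 0.01]$ recovers the classical all-or-nothing decision.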
As a matter of fact, we can say that the event $x \in \mathcal{R}$ forces us to reject a fuzzy set $\mathcal{H}^*$ of the set of all admissible hypotheses. At this point we would like to recall that the rejection of a hypothesis $\mathcal{H}_0$ corresponds to the acceptance of $\neg \mathcal{H}_0$, where $\neg$ operates on the set of admissible hypotheses. Thus, in terms of fuzzy sets, a rejection of $\mathcal{H}^*$ is performed by the fuzzy complementation operator $\overline{\phantom{x}}$, and the final result can be described by "accepting $\overline{\mathcal{H}^*}$". To illustrate the meaning of the above results, we consider the case of parameter hypotheses, that is, when the functional form $F(\theta, \cdot)$ of the distribution of the population in question is supposed to be known and the hypotheses pertain to the unknown parameter of the distribution function, i.e. $\mathcal{H}_j = \mathcal{H}_j(\theta = \theta_j)$. Thus, Eqs. (21) and (22) assume the form
\[
\alpha_j = \int_{\mathcal{R}} dF(\theta_j), \quad j \in J,
\tag{23}
\]
with $\{\mathcal{H}_j(\theta = \theta_j),\ j \in J\}$ representing the set of admissible hypotheses. Hence, with respect to the preceding considerations, we can say that $x \in \mathcal{R}$ forces us to reject the fuzzy set $\theta^*$ with
\[
(\theta^*)(\theta_j) = (A)(\alpha_j), \quad j \in J,
\tag{24}
\]
and finally to accept $\overline{\theta^*}$. It should be noted that in the case of modeling the fuzzy set $A$ = {reasonably small $\alpha$} by the membership function
\[
(A)(\alpha) = 1 - \alpha, \quad \forall \alpha \in [0, 1],
\tag{25}
\]
the membership function of $\theta^*$ equals a function well known as the operating characteristic function (OC function) of test $\mathcal{T}$. The OC function of a test is closely connected to the power function $P$ of test $\mathcal{T}$, namely by the relation $1 - \mathrm{OC} = P = (\overline{\theta^*})$, where the overline denotes the fuzzy complement. In general, we have
\[
(\theta^*) \equiv (A) \circ P.
\tag{26}
\]
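As an illustration of the relation between the rejected fuzzy set and the power function, consider a one-sided test for the mean of a normal population with known standard deviation. The test itself, its parameters, and the function names are assumptions chosen for this sketch; they do not appear in the paper.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()

def power(theta, theta0, sigma, n, alpha):
    """Power P(theta) of the one-sided z-test rejecting for large sample means."""
    critical = STANDARD_NORMAL.inv_cdf(1.0 - alpha)
    shift = (theta - theta0) * n ** 0.5 / sigma
    return 1.0 - STANDARD_NORMAL.cdf(critical - shift)

def one_minus_alpha(alpha):
    """Membership (A)(alpha) = 1 - alpha."""
    return 1.0 - alpha

def rejected_membership(theta, theta0, sigma, n, alpha, membership=one_minus_alpha):
    """Membership of theta in the rejected fuzzy set, i.e. (A) composed with P."""
    return membership(power(theta, theta0, sigma, n, alpha))

size = power(0.0, 0.0, 1.0, 25, 0.05)                   # power at theta0 equals the level
oc_far = rejected_membership(1.0, 0.0, 1.0, 25, 0.05)   # OC value far from theta0
```

With the membership function $1 - \alpha$, `rejected_membership` coincides with the OC function $1 - P$ of the assumed test; other choices of `membership` give the general composition.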
Thus far, we have considered only the case of rejecting a hypothesis. The major shortcoming of significance tests is that, in general, they allow us to make decisions only in one direction. It is crucial to stress the fact that in the case of the event $x \in \mathcal{A}$ it is not reasonable, from a probability theoretical point of view, to accept the hypothesis to be tested. In the following we will demonstrate how this insufficiency can be fixed by employing fuzzy sets in the test procedure, as shown in the preceding paragraphs, but first we recall the origin of this problem. Although a significance test provides a value of reliability, given by the level of significance, we have no knowledge about the error we would commit by accepting the hypothesis when it is not true. Such an error, measured by the probability of the acceptance space under the assumption that $\mathcal{H}_0$ is not true, is called an error of the second kind. It is obvious that, as long as there is more than one alternative hypothesis, such an error is not uniquely defined. For that reason the theory of hypotheses testing provides the term "the power of test $\mathcal{T}$ against the alternative hypothesis $\mathcal{H}_j$", which is described in terms of
\[
\beta_j = \mathcal{P}(x \in \mathcal{A} \mid \mathcal{H}_j \text{ is true}), \quad j \in J,
\tag{27}
\]
or, in terms of parameter tests,
\[
\beta_j = \int_{\mathcal{A}} dF(\theta_j), \quad j \in J.
\tag{28}
\]
The quality of the performance of a test for accepting a hypothesis is then represented by either the power function $P(\theta_j) = 1 - \beta_j = \alpha_j$ or the OC function $\mathrm{OC}(\theta_j) = 1 - P(\theta_j) = \beta_j$. The ideal power function would be, of course, a function such that
\[
P(\theta_0) = 0, \qquad P(\theta_j) = 1 \quad \text{for all } j \in J \setminus \{0\}.
\]
However, such powerful tests do not exist in general. Even worse, in most cases not even a "reasonably powerful" test, that is
\[
\mathrm{OC}(\theta_0) \ge 1 - \alpha, \qquad \mathrm{OC}(\theta_j) \le \beta \quad \text{for all } j \in J \setminus \{0\},
\]
with $\alpha$ and $\beta$ reasonably small, can be obtained, such that one may accept the hypothesis $\mathcal{H}_0$ for $x \in \mathcal{A}$. The only justifiable conclusion is that $x$ does not contradict $\mathcal{H}_0$. For example, consider $F(\theta, \cdot)$ to be continuous in the unknown parameter $\theta \in \Theta \subseteq \mathbb{R}$ and $S_X^n \subseteq \mathbb{R}^n$, with $n < \infty$. Let the set of admissible hypotheses be the set of all hypotheses pertaining to a connected subset of $\Theta$. Hence, since the OC function is continuous, there always exist some $j \in J$ such that $1 - \alpha \approx \beta_j$. Thus, either $\alpha$ or $\beta_j$ is not reasonably small. In other words, there is no reason to accept hypothesis $\mathcal{H}_0$ by neglecting all other admissible hypotheses, since there are alternative hypotheses which are nearly as acceptable as $\mathcal{H}_0$.
From the theory of most powerful tests and uniformly most powerful tests, respectively, we know that in particular classes of tests there exists an upper threshold for the quality of the performance of a test for accepting the null hypothesis. Once again, this quality of performance is expressed in terms of a function rather than by a numerical value. We refer the reader to the classical literature in the field of mathematical statistics, such as Fisz [5] or Schmetterer [17], for further details. The theory of statistical hypotheses testing is due to Lehmann [12] as well as to Neyman and Pearson [14, 15] and has been the subject of investigations by several researchers. In
particular, power functions have been studied by Ferris et al. [4], Neyman and Tokarska [16], and Johnson and Welch [6], to mention just a few. From what has been said so far, it is quite obvious how this deficiency of expressibility in significance tests can be overcome by admitting fuzzy results. Consider a test $\mathcal{T} = (\mathcal{R}, \mathcal{A})$ with a level of significance $0 < \alpha \ll 1$ pertaining to the null hypothesis $\mathcal{H}_0$ and a drawn sample $x$ belonging to the acceptance space $\mathcal{A}$. Hence, in terms of classical statistical hypotheses testing, $\mathcal{H}_0$ is not contradicted by sample $x$ and test $\mathcal{T}$. But is there really no contradiction at all? Consider the inverted test $\mathcal{T}' = (\mathcal{A}, \mathcal{R})$, where $x$ belongs to the rejection space and whose level of significance, with respect to the same null hypothesis, is $\alpha' = 1 - \alpha$. In the classical approach $\mathcal{T}'$ is of no relevance, since its level of significance is not reasonable at all, but in the fuzzy approach every $\alpha$ has to be taken into account, since its significance is encoded in the fuzzy set $A$ anyhow. This leads to the contradiction of hypothesis $\mathcal{H}_0$ to an extent of $A(\alpha')$ or, in other words, to the rejection of the fuzzy hypothesis $\mathcal{H}_0^* = (\mathcal{H}_0, A(\alpha'))$. Furthermore, since the fuzzy approach urged us to drop the distinction between the null hypothesis and the alternative hypotheses by putting them together in the set of admissible hypotheses, we have to apply the same reasoning to all other admissible hypotheses; thus, in terms of parameter hypotheses, a fuzzy set $\theta^*$ with membership function $(\theta^*) \equiv (A) \circ P'$ has to be rejected. Finally, since $P'$, denoting the power function of test $\mathcal{T}'$, equals the operating characteristic function OC of test $\mathcal{T}$, we may accept the fuzzy set $\theta^*$ with
$$ (\theta^*) \equiv (A) \circ \mathrm{OC}. \tag{29} $$
Specifically, for $(A) \equiv 1 - \alpha$, we accept a fuzzy set whose membership function equals the operating characteristic function OC of test $\mathcal{T}$. That this is a reasonable result can also be justified by the fact that, by accepting $\theta^*$ rather than the single hypothesis $\mathcal{H}_0$, we no longer neglect other admissible hypotheses which are nearly as acceptable as the null hypothesis.

We can now summarize the results of this section as follows. Statistical significance tests suffer not only from the major insufficiency of being, in general, conclusive in only one direction, but also from the subjective judgment of whether a result is "significant enough" to contradict a hypothesis. We saw that, by incorporating this ambiguity into the test procedure and by allowing fuzzy results, which manifest themselves as fuzzy predictions about the unknown parameter, we can overcome this deficiency. Such a fuzzy approach boils down to a proper fuzzy modeling of "reasonable significance", for which the statistician is responsible.

3.2. Nonsignificance tests

The fuzzy nature of statistical tests becomes even more obvious in the case of tests which do not provide any numerical value of reliability. Such tests, called nonsignificance tests in this paper, rely completely on the subjective judgment of the statistician. The prime representative of such tests is the so-called probability paper, whose task is to decide, on the basis of a sample, whether or not a population is normally distributed. The main idea of this test is based on the linear relation between the fractiles $u_p$ of the standard normal distribution $N(0, 1)$ and those of an arbitrary normal distribution $N(\mu, \sigma^2)$, here denoted by $x_p$:
$$ x_p = \sigma u_p + \mu. \tag{30} $$
Hence, by transforming an equidistant system of $(x, y)$-coordinates into a system of $(x, z)$-coordinates such that $z = \Phi^{-1}(y)$, where $\Phi(\cdot)$ denotes the distribution function of the standard normal distribution, we obtain a coordinate plane in which all normal distribution functions appear as straight lines. A test for normal distribution is then performed by the following steps:
(1) draw a sample pertaining to the population in question;
(2) approximate the underlying distribution function with respect to this sample (for instance, calculate the empirical distribution function);
(3) enter a representative set of points into the system of $(x, z)$-coordinates;
(4) decide whether or not they lie approximately on a straight line;
(5) accept normal distribution in the affirmative, reject otherwise.
There is no doubt that this test procedure involves uncertainty of a fuzzy nature, for example in step (4), where we have to decide whether or not a set of points forms approximately a straight line. Since we do not expect to see these points lying exactly on a straight line, and because we do not have any measure for the reliability of our decision, we can justify any conclusion only with subjective arguments. This involves the inconsistency that even the same statistician with the same sample might come to different conclusions at different times. Such variation in time may itself be regarded as a kind of randomness. In any case, the statisticians may express their decision in step (4) in a fuzzy way. In the next section we see that, when using fuzzy data, we are able to provide a numerical value of a possibilistic nature, which characterizes the corresponding fuzzy decision and which ensures the reproducibility of such a decision.
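Steps (2) and (3) of the classical procedure can be sketched in a few lines of Python. This is a minimal illustration, assuming the empirical distribution function is used in step (2); the function name `probability_paper_points` is hypothetical. Points at which the empirical distribution function equals 0 or 1 are skipped, since $\Phi^{-1}$ is not finite there.

```python
from statistics import NormalDist


def probability_paper_points(sample):
    """Return the points (x, z) with z = Phi^{-1}(F_n(x)), where F_n is the
    empirical distribution function of the sample."""
    std_normal = NormalDist()
    xs = sorted(sample)
    n = len(xs)
    # For tied observations the largest rank wins, which is the value of F_n there.
    levels = {}
    for rank, x in enumerate(xs, start=1):
        levels[x] = rank / n
    points = []
    for x, y in levels.items():
        if 0.0 < y < 1.0:  # Phi^{-1}(0) and Phi^{-1}(1) are not finite
            points.append((x, std_normal.inv_cdf(y)))
    return points


if __name__ == "__main__":
    for x, z in probability_paper_points([3.0, 1.0, 2.0, 4.0]):
        print(f"x={x:.1f}  z={z:+.3f}")
```

By Eq. (30), the fractiles of a normal distribution satisfy $z = (x - \mu)/\sigma$, so for a normal population the computed points should scatter around a straight line. Step (4), the judgment of whether they do, is deliberately left out because it is the subjective part discussed above.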
4. Examples of statistical tests for fuzzy data

In this section we illustrate the impact of fuzzy data on the task of statistical hypotheses testing by taking advantage of the results of Section 2. First we study the consequences for a representative of nonparametric as well as nonsignificance tests, namely the probability paper, whose test algorithm was reviewed in Section 3.2. Subsequently, we investigate the impact of fuzzy data on parametric significance tests by carrying out some examples.

4.1. The fuzzy probability paper

As mentioned in Section 3.2, the so-called probability paper is a test for normal distribution which is performed in five steps, the second of which demands an approximation of the underlying distribution function with respect to the drawn sample. The empirical distribution function was proposed for this purpose. Now, since we have to deal with fuzzy data, the question arises how to calculate an approximation of the distribution function on the basis of a fuzzy sample vector, or in particular, what is the fuzzy empirical distribution function? In this section we use the fuzzy empirical distribution function as the pointwise fuzzy extension of the classical empirical distribution function
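$$ \hat{F}_n(t) = \frac{1}{n} \sum_{i=1}^{n} 1_{(-\infty, t]}(x_i), \quad \forall t \in \mathbb{R}, \ \forall x \in S_X^n. \tag{31} $$
Since $\hat{F}_n(t) = \hat{F}_t(x)$ is a mapping from the sample space to the real line, in particular to the finite subset $Y_n = \{0, 1/n, 2/n, \ldots, 1\}$, for every fixed $t \in \mathbb{R}$, the image of a fuzzy sample vector $\xi$ under the map $\hat{F}_t: S_X^n \to Y_n$ is given by
$$ (\hat{F}_t(\xi))(y) = \sup_{x \in X_{t,y}} (\xi)(x), \quad X_{t,y} = \Bigl\{ x \in S_X^n \Bigm| \frac{1}{n} \sum_{i=1}^{n} 1_{(-\infty, t]}(x_i) = y \Bigr\}, \quad \forall y \in Y_n, \ \forall t \in \mathbb{R}. \tag{32} $$
However, for easier handling of such a fuzzy image within a mathematical framework, we have to find the $\alpha$-cut representation of $\hat{F}_t(\xi)$. Since $\hat{F}_t: S_X^n \to Y_n$ is a map from a subset of $\mathbb{R}^n$ to a finite set, $\hat{F}_t$ meets the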
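requirements of Proposition 4 and hence
$$ [\hat{F}_t(\xi)]_\alpha = \{ y \in Y_n \mid y = \hat{F}_t(x), \ x \in [\xi]_\alpha \} \tag{33} $$
for almost all $\alpha \in (0, 1]$. In particular, for a minimum-rule-based fuzzy sample vector $\xi_{\min}$, aggregating fuzzy data with compact $\alpha$-cuts $[\delta_i]_\alpha = [\underline{\delta}_{i,\alpha}, \overline{\delta}_{i,\alpha}]$, we have
$$ [\hat{F}_t(\xi_{\min})]_\alpha = \Bigl\{ y \in Y_n \Bigm| \frac{1}{n} \sum_{i=1}^{n} 1_{(-\infty, t]}(\overline{\delta}_{i,\alpha}) \le y \le \frac{1}{n} \sum_{i=1}^{n} 1_{(-\infty, t]}(\underline{\delta}_{i,\alpha}) \Bigr\} \tag{34} $$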
for almost all $\alpha \in (0, 1]$. Eqs. (33) and (34), respectively, can be used as a pointwise fuzzy approximation of the underlying distribution function with respect to a certain degree of possibility, namely $\alpha \in (0, 1]$. For a representative set of $t$'s in $\mathbb{R}$, say $T \subseteq \mathbb{R}$, we are now able to resume the test procedure in the following way:
(3') determine the minimum and maximum of $[\hat{F}_t(\xi)]_\alpha$ for all $t \in T$, hereafter denoted by $\underline{y}_t$ and $\overline{y}_t$, respectively;
(4') enter the points $\underline{z}_t = (t, \underline{y}_t)$ and $\overline{z}_t = (t, \overline{y}_t)$ for all $t \in T$, as far as $\underline{y}_t > 0$ and $\overline{y}_t < 1$, respectively, into the system of $(x, z)$-coordinates;
(5') connect the points $\underline{z}_t$, $t \in T$, as well as the points $\overline{z}_t$, $t \in T$, by polygonal lines;
(6') check whether a straight line fits in between the two polygonal lines;
(7') accept normal distribution with respect to the degree of possibility $\alpha$ in the affirmative, reject otherwise.
This algorithm can be executed for all $\alpha \in (0, 1]$, which leads to a set of decisions $(D_\alpha)_{\alpha \in (0, 1]}$. Owing to the nested nature of $\alpha$-cuts, $D_{\alpha_0} = 0$, i.e. rejection of the hypothesis "normally distributed" at the degree of possibility $\alpha_0$, implies $D_\alpha = 0$ for all $\alpha > \alpha_0$. Of course, the same applies in the other direction: acceptance at $\alpha_0$ implies acceptance for all $\alpha < \alpha_0$. Thus, there might exist a $\beta \in (0, 1]$ where the decision switches from acceptance to rejection. If not, we are forced to say that this fuzzy sample either completely supports or completely contradicts the hypothesis, so that in these cases we may assign 1, respectively 0, to the critical value $\beta$. Hence, the test result can be summarized by a statement like: normal distribution acceptable up to the degree of possibility $\beta$. It should be noted that step (6'), which replaces step (4) of the classical test procedure, no longer depends on subjective judgment, which makes this test procedure, although we are using fuzzy data, in a certain way more precise than its classical relative. The fuzzy test result is reproducible.

Fig. 2 depicts two fuzzy samples and their corresponding fuzzy probability papers in the $\alpha$-cuts 0.1, 0.5, 0.9 and 0.3, 0.7, 1, constructed with respect to the minimum aggregation. For the sake of clarity, the $\alpha$-cuts are illustrated in two different pictures. The artificial fuzzy samples represent (A) 25 linear fuzzy data of the form $T(m_i, 5)$, $i = 1, \ldots, 25$ (the corresponding test result is $\beta = 0.5$); (B) 20 exponential fuzzy data of the form $E(m_i, 3, 2)$, $i = 1, \ldots, 20$ (the corresponding test result is $\beta = 0$).
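Step (3') under minimum aggregation can be sketched as follows. The Python code below is an illustration only: it assumes that the fuzzy data have interval-valued $\alpha$-cuts and, for the example, that linear fuzzy data are symmetric triangular numbers whose $\alpha$-cut is `[center - spread*(1 - alpha), center + spread*(1 - alpha)]`. That parametrization and the names `triangular_cut` and `fuzzy_edf_cut` are assumptions made here, since the definition of $T(m_i, c)$ is given elsewhere in the paper. The bounds follow Eq. (34): the lower bound counts the data whose whole $\alpha$-cut lies at or below $t$, the upper bound counts the data whose $\alpha$-cut reaches at or below $t$.

```python
def triangular_cut(center, spread, alpha):
    """Alpha-cut of a symmetric triangular fuzzy number (assumed parametrization)."""
    half_width = spread * (1.0 - alpha)
    return (center - half_width, center + half_width)


def fuzzy_edf_cut(cuts, t):
    """Minimum and maximum of the alpha-cut of the fuzzy empirical distribution
    function at t, given the interval alpha-cuts (lo, hi) of the fuzzy data."""
    n = len(cuts)
    # Smallest value: every datum sits at the upper end of its alpha-cut.
    lower = sum(1 for lo, hi in cuts if hi <= t) / n
    # Largest value: every datum sits at the lower end of its alpha-cut.
    upper = sum(1 for lo, hi in cuts if lo <= t) / n
    return lower, upper


if __name__ == "__main__":
    centers = (1.0, 2.0, 3.0, 4.0)
    for alpha in (0.1, 0.5, 0.9, 1.0):
        cuts = [triangular_cut(m, 2.0, alpha) for m in centers]
        print(alpha, fuzzy_edf_cut(cuts, 2.5))
```

Because $\alpha$-cuts are nested, the interval returned for a larger $\alpha$ lies inside the one for a smaller $\alpha$, which is why the band between the two polygonal lines in steps (5') and (6') narrows as $\alpha$ grows.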
4.2. Parameter tests for fuzzy data

In this section we study the impact of fuzzy data on parameter significance tests. As we saw in Section 3.1, such tests manifest themselves as partitions of the sample space, and the test result depends on whether a drawn sample belongs to the rejecting space or to the acceptance space. Aside from the issue of conclusiveness, which has been discussed already, we have to face an additional problem when dealing with fuzzy data: we cannot expect in general that a drawn fuzzy sample, regardless of the aggregation function used, belongs exclusively to the rejecting or the acceptance space. In other words, a fuzzy sample might support the rejection as well as the acceptance of a hypothesis. At this point we would like to stress once more that the conclusion "acceptance" has to be treated with appropriate care.
Fig. 2. The fuzzy probability paper.
In order to deal with this ambiguity, we have to extend the domain of the classical test function $\varphi_{\mathcal{T}}: S_X^n \to \{0, 1\}$, which provides the test result for each element of the sample space $S_X^n$ with respect to a certain test $\mathcal{T} = (\mathcal{R}, \mathcal{A})$ such that $\varphi_{\mathcal{T}}(x) = 1_{\mathcal{R}}(x)$, to fuzzy subsets of the sample space. Hence our fuzzy test result, given a fuzzy sample vector $\xi$, is a fuzzy subset of the two alternatives 0 and 1, which means that we have to expect a certain degree of acceptance as well as a certain degree of rejection of the hypothesis. Hereafter we refer to these values as the acceptance and rejecting indices $\varphi_a(\xi)$ and $\varphi_r(\xi)$, respectively, specified by the fuzzy test function $\varphi_{\mathcal{T}}^*: \mathfrak{F}(S_X^n) \to [0, 1] \times [0, 1]$ such that $\varphi_{\mathcal{T}}^*(\xi) = (\varphi_r(\xi), \varphi_a(\xi))$, where $\mathfrak{F}(S_X^n)$ denotes the set of all fuzzy sample vectors. They are given by
$$ \varphi_a(\xi) = (\varphi_{\mathcal{T}}(\xi))(0) = \sup_{x \in \mathcal{A}} (\xi)(x), \qquad \varphi_r(\xi) = (\varphi_{\mathcal{T}}(\xi))(1) = \sup_{x \in \mathcal{R}} (\xi)(x), \tag{35} $$
or, in terms of possibilities,
$$ \varphi_a(\xi) = \mathrm{Poss}[\xi \in \mathcal{A}], \qquad \varphi_r(\xi) = \mathrm{Poss}[\xi \in \mathcal{R}]. \tag{36} $$
From the property of normalization of fuzzy sample vectors, we can conclude that
$$ \max[\varphi_a(\xi), \varphi_r(\xi)] = 1 \quad \forall \xi \in \mathfrak{F}(S_X^n). \tag{37} $$
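Eqs. (35)-(37) can be illustrated on the real line. The following Python sketch assumes, purely for illustration, that the fuzzy quantity being tested is a symmetric triangular fuzzy number and that the rejecting space is the half-line `[critical, inf)`; the function names and the triangular shape are hypothetical choices and not the paper's construction. Each index is the supremum of the membership function over the respective space.

```python
def triangular_membership(y, center, spread):
    """Membership degree of y in a symmetric triangular fuzzy number."""
    return max(0.0, 1.0 - abs(y - center) / spread)


def fuzzy_test_indices(center, spread, critical):
    """Return (phi_r, phi_a) for the rejecting space [critical, inf) and the
    acceptance space (-inf, critical)."""
    at_boundary = triangular_membership(critical, center, spread)
    # The supremum is 1 on the side that contains the peak; on the other side
    # it is the membership degree reached at the boundary.
    phi_r = 1.0 if center >= critical else at_boundary
    phi_a = 1.0 if center < critical else at_boundary
    return phi_r, phi_a


if __name__ == "__main__":
    print(fuzzy_test_indices(10.0, 2.0, 11.0))  # peak in the acceptance space
    print(fuzzy_test_indices(12.0, 2.0, 11.0))  # peak in the rejecting space
```

Since the triangular number is normalized, one of the two indices always equals 1, in agreement with Eq. (37), so only the smaller index carries information.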
In practice, test results of significance tests are usually observed on the real line, where the sample space and the real line are linked by a certain statistic $s: S_X^n \to S_Y \subseteq \mathbb{R}$, such that $\varphi_{\mathcal{T}}(x) = \varphi_{s(\mathcal{T})}(s(x))$, where $s(\mathcal{T}) = (s(\mathcal{R}), s(\mathcal{A}))$ is a partition of the sample space $S_Y$ and $Y = s(X^n)$. This allows us to infer the test result of test $\mathcal{T}$ from the test result of the auxiliary test $s(\mathcal{T})$ and vice versa, since they are fully equivalent. The question now arises whether this equivalence still holds when employing fuzzy data, or in other words, does
$$ \varphi_{\mathcal{T}}^*(\xi) = \varphi_{s(\mathcal{T})}^*(s(\xi)) \tag{38} $$
hold in general? The answer is in the affirmative, since
$$ \mathrm{Poss}[s(\xi) \in s(\mathcal{R})] = \sup_{y \in s(\mathcal{R})} (s(\xi))(y) = \sup_{x \in \mathcal{R}} (s(\xi))(s(x)) = \sup_{x \in \mathcal{R}} (\xi)(x) = \mathrm{Poss}[\xi \in \mathcal{R}]. $$
The same applies to the acceptance index. We have to note that the identity of Eq. (38) is not trivial, since it depends on the way the classical test function has been generalized to a fuzzy test function. The proposed way of evaluating acceptance and rejecting indices is only one possible extension to a fuzzy domain among others. Suppose that we define a fuzzy test function $\psi_{\mathcal{T}}^*: \mathfrak{F}(S_X^n) \to \{0, 1\}$ such that
$$ \psi_{\mathcal{T}}^*(\xi) = \varphi_{\mathcal{T}}(g(\xi)), $$
where $g(\xi)$ specifies the sample space coordinates of the center of gravity of the fuzzy sample vector $\xi$. This kind of defuzzification is known in fuzzy control as the center of area method (COA); applying it here provides the advantage of exact test results in lieu of fuzzy results such as those of the preceding fuzzy test function. The disadvantages of such a fuzzy test function can be described as follows: the test result equals the classical test result for all symmetrically aggregated fuzzy sample vectors, and hence the fuzzy approach was a redundant effort; the test result cannot be observed by an auxiliary test $s(\mathcal{T})$, since
$$ \psi_{\mathcal{T}}^*(\xi) = \psi_{s(\mathcal{T})}^*(s(\xi)) $$
does not hold in general. Of course, one might define a fuzzy test function $\psi_{\mathcal{T}}^*(\cdot)$ on the basis of the auxiliary fuzzy test function $\psi_{s(\mathcal{T})}^*(s(\cdot))$, but then there is no geometrical interpretation of $\psi_{\mathcal{T}}^*(\xi)$ in the sample space itself, which may or may not be considered crucial.

We now give some examples of parameter tests with fuzzy data. In particular, we demonstrate in the first example the impact of different aggregation functions on the fuzzy test result, whereas the test results of
the subsequent examples are all due to minimum aggregation. The corresponding figures illustrate the fuzzy data used, which represent artificial fuzzy samples, as well as the fuzzy test result observed on the real line. The presented test results are calculated on the basis of Proposition 3, since the fuzzy data used as well as the specific test statistics meet all requirements of Proposition 3. Only the lower index of the fuzzy test function $\varphi^*$ is specified as a numerical test result, since the other one is given by Eq. (37). The first test also uses the results of Example 1. All tests have been constructed with respect to a level of significance $\alpha = 0.01$. Since it is not relevant to our discussion, the derivations of the test statistics used are omitted here. We refer the reader to the classical literature in the field of mathematical statistics, such as Fisz [5] or Schmetterer [17], for details in this matter.

Test 1. A test for the mean value of a normally distributed quantity where the variance is known. The test is given by its rejecting space
$$ \mathcal{R} = \Bigl\{ x \in S_X^n \Bigm| \frac{\bar{x} - \mu_0}{\sigma} \sqrt{n} \ge u_{1-\alpha} \Bigr\}, $$
where $\mu_0$ denotes the null-hypothesis value of the mean and $\sigma$ the standard deviation, $u_{1-\alpha}$ is the $(1 - \alpha)$-fractile of the standard normal distribution and $\alpha$ the level of significance. Fig. 3(a) illustrates a sample of exponential fuzzy data $E(m_i, 2, 2)$, $i = 1, \ldots, 5$, being tested for the null-hypothesis $\mathcal{H}_0$ ($\mu = 35$) against the contra-hypothesis $\mathcal{H}_1$ ($\mu > 35$). Fig. 3(b) illustrates 20 fuzzy data of the same shape, being tested for the null-hypothesis $\mathcal{H}_0$ ($\mu = 40$) against the contra-hypothesis $\mathcal{H}_1$ ($\mu > 40$). Both minimum and product aggregation have been considered. The test results are $\varphi_r^{\min} = 0.6$, $\varphi_r^{\mathrm{prod}} = 0.1$ and $\varphi_r^{\min} = 0.9$, $\varphi_r^{\mathrm{prod}} = 0.02$, respectively. As expected, the test results pertaining to product aggregation are more crisp than the corresponding test results with respect to minimum aggregation, whereas the characters of both results do not differ, since both have acceptance index 1. We note that this relation is valid in general, because of Definition 1(1) of fuzzy sample vectors. We can also see how an increase of the sample size affects the fuzzy test statistic in the case of product aggregation.

Fig. 3. Test 1.

Test 2. A test for the mean value of a normally distributed quantity where the variance is unknown. The test is given by its rejecting space
$$ \mathcal{R} = \Bigl\{ x \in S_X^n \Bigm| \frac{|\bar{x} - \mu_0|}{S_n(x)} \sqrt{n} \ge t_{n-1;1-\alpha/2} \Bigr\}, $$
where $\mu_0$ denotes the null-hypothesis value of the mean and $S_n^2$ the unbiased statistic for the unknown variance, $t_{n-1;1-\alpha/2}$ is the $(1 - \alpha/2)$-fractile of Student's $t$-distribution with $(n - 1)$ degrees of freedom and $\alpha$ the level of significance. Fig. 4(a) illustrates a sample of linear fuzzy data $T(m_i, 2)$, $i = 1, \ldots, 10$, being tested for the null-hypothesis $\mathcal{H}_0$ ($\mu = 50$) against the contra-hypothesis $\mathcal{H}_1$ ($\mu \ne 50$). The test result is $\varphi_r = 0$. Fig. 4(b) illustrates a sample of exponential fuzzy data $E(m_i, 2, 1)$, $i = 1, \ldots, 10$, being tested for the null-hypothesis $\mathcal{H}_0$ ($\mu = 70$) against the contra-hypothesis $\mathcal{H}_1$ ($\mu \ne 70$). The test result is $\varphi_r = 0$. Fig. 4(c) illustrates a sample of exponential fuzzy data $E(m_i, 3, 3)$, $i = 1, \ldots, 10$, being tested for the null-hypothesis $\mathcal{H}_0$ ($\mu = 60$) against the contra-hypothesis $\mathcal{H}_1$ ($\mu \ne 60$). The test result is $\varphi_r = 0.86$.

Fig. 4. Test 2.

Test 3. A test for the variance of a normally distributed quantity where the mean value is unknown. The test is given by its rejecting space
$$ \mathcal{R} = \Bigl\{ x \in S_X^n \Bigm| \frac{(n-1) S_n^2(x)}{\sigma_0^2} \notin \bigl[ \chi^2_{n-1;\alpha/2}, \chi^2_{n-1;1-\alpha/2} \bigr] \Bigr\}, $$
where $\sigma_0$ denotes the null-hypothesis value of the standard deviation and $\chi^2_{n-1;\cdot}$ the corresponding fractiles of the $\chi^2$-distribution with $(n - 1)$ degrees of freedom. Fig. 5(a) illustrates a sample of linear fuzzy data $T(m_i, 2)$, $i = 1, \ldots, 10$, being tested for the null-hypothesis $\mathcal{H}_0$ ($\sigma^2 = 100$) against the contra-hypothesis $\mathcal{H}_1$ ($\sigma^2 \ne 100$). The test result is $\varphi_r = 0$. Fig. 5(b) illustrates a sample of exponential fuzzy data $E(m_i, 2, 1)$, $i = 1, \ldots, 10$, being tested for the null-hypothesis $\mathcal{H}_0$ ($\sigma^2 = 64$) against the contra-hypothesis $\mathcal{H}_1$ ($\sigma^2 \ne 64$). The test result is $\varphi_r = 0.36$. Fig. 5(c) illustrates a sample of exponential fuzzy data $E(m_i, 3, 3)$, $i = 1, \ldots, 10$, being tested for the null-hypothesis $\mathcal{H}_0$ ($\sigma^2 = 25$) against the contra-hypothesis $\mathcal{H}_1$ ($\sigma^2 \ne 25$). The test result is $\varphi_a = 0.8$.

Fig. 5. Test 3.

Test 4. A test for the identity of the variances of two normally distributed quantities. The test is given by its rejecting space
$$ \mathcal{R} = \Bigl\{ (x, y) \in S_X^n \times S_Y^m \Bigm| \frac{S_n^2(x)}{S_m^2(y)} \notin \bigl[ F_{n-1,m-1;\alpha/2}, F_{n-1,m-1;1-\alpha/2} \bigr] \Bigr\}, $$
where $F_{n-1,m-1;\cdot}$ denotes the corresponding fractiles of the $F$-distribution with degrees of freedom $(n - 1)$ and $(m - 1)$. Fig. 6(a) illustrates two samples of exponential fuzzy data $E(m_{ik}, 2, 2)$, $i = 1, \ldots, 10$; $k \in \{1, 2\}$. The test result for the hypothesis of identical variances is $\varphi_r = 0$. Fig. 6(b) illustrates two samples of linear fuzzy data $T(m_{ik}, 2)$, $i = 1, \ldots, 10$; $k \in \{1, 2\}$. The test result for the hypothesis of identical variances is $\varphi_r = 0.93$.

Fig. 6. Test 4.

Test 5. The asymptotic test for the independence of two normally distributed quantities by means of Fisher's Z. The test is given by its rejecting space
$$ \mathcal{R} = \Bigl\{ z \in S_Z^n \Bigm| \frac{\sqrt{n-3}}{2} \Bigl| \ln \frac{1 + R(z)}{1 - R(z)} \Bigr| \ge u_{1-\alpha/2} \Bigr\}, $$
where $z$ is a sample of the two-dimensional normally distributed quantity $Z \sim N(\mu_x, \mu_y, \sigma_x, \sigma_y, \rho)$, $R$ denotes the sample correlation coefficient and $u_{1-\alpha/2}$ is the $(1 - \alpha/2)$-fractile of the standard normal distribution. In the following figures, the $\alpha$-cuts 0.1, 0.5 and 0.9 of the two-dimensional fuzzy data $\delta_z$, obtained by combining the two feature data $\delta_x$ and $\delta_y$ by means of the minimum operator, are depicted in order to illustrate the shape of the underlying data. Fig. 7(a) illustrates a sample of two-dimensional fuzzy data, where both underlying feature data are linear fuzzy data of the shape $T(m_{ik}, 2)$, $i = 1, \ldots, 10$; $k \in \{1, 2\}$. The test result for the hypothesis of independence is $\varphi_a = 0.6$. Fig. 7(b) illustrates a sample of two-dimensional fuzzy data, where both underlying feature data are exponential fuzzy data of the shape $E(m_{ik}, 2, 3)$, $i = 1, \ldots, 10$; $k \in \{1, 2\}$. The test result for the hypothesis of independence is $\varphi_r = 0$. Fig. 7(c) illustrates a sample similar to the one used in Fig. 6(b). Only the shape of the underlying feature data is different: $E(m_{ik}, 2, 1)$, $i = 1, \ldots, 10$; $k \in \{1, 2\}$. Hence a different test result, $\varphi_r = 0.2$, is obtained.

Fig. 7. Test 5.
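To indicate how a rejecting index of the kind reported for Test 1 might be computed, the following Python sketch treats a simplified case. It assumes symmetric triangular fuzzy data with common spread and minimum aggregation, so that the $\alpha$-cut of the fuzzy sample mean is the interval between the means of the lower and of the upper $\alpha$-cut bounds, and the fuzzy mean is again triangular with the same spread. The triangular shape, the function name and the numbers are assumptions made for illustration; the sketch does not reproduce the exponential fuzzy data or the results shown in Fig. 3.

```python
from math import sqrt
from statistics import NormalDist, fmean


def rejection_index_mean_test(centers, spread, mu0, sigma, alpha=0.01):
    """Rejecting index of the one-sided test of mu0 against mu > mu0 (known
    sigma) for symmetric triangular fuzzy data under minimum aggregation."""
    n = len(centers)
    # Critical value of the sample mean at level of significance alpha.
    critical = mu0 + sigma * NormalDist().inv_cdf(1.0 - alpha) / sqrt(n)
    mean_center = fmean(centers)
    if mean_center >= critical:
        return 1.0  # the peak of the fuzzy mean lies in the rejecting space
    # Otherwise the supremum over the rejecting space is the membership
    # degree of the triangular fuzzy mean at the critical value.
    return max(0.0, 1.0 - (critical - mean_center) / spread)


if __name__ == "__main__":
    print(rejection_index_mean_test([36.0, 36.0, 36.0, 36.0], 2.0, 35.0, 1.0))
```

In this simplified setting the acceptance index equals 1 whenever the rejecting index is below 1, so reporting the smaller index alone, as done for the tests above, loses no information.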
5. Summary and conclusion

We investigated the impact of vague and imprecise data on the task of statistical hypotheses testing. In particular, we discussed the crucial step of how to aggregate fuzzy data into a fuzzy sample vector and, in this regard, suggested some properties that an appropriate aggregation operator should have. Two qualified aggregation operators, namely the minimum operator and the product operator, have been studied in more detail. The minimum operator turned out to be the more fuzzy approach, whereas the product operator implicitly presumes some interaction between the underlying fuzzy data. It should be noted that this interaction is, of course, possibilistic rather than probabilistic in nature. Consequently, test statistics evaluated with respect to the minimum operator showed more ambiguity than those due to product aggregation. The test statistics have been evaluated on the basis of the results given in Section 2. We also discussed the issue of defuzzification as a means of providing numerical values as a test result, and in doing so we saw that there are defuzzifications which may have a geometrical interpretation in the sample space but not on the real line, where test results are usually observed. We proposed a two-valued test result
by introducing the acceptance and the rejecting indices, provided by the fuzzy extension of the classical test function. Such a fuzzy test result has a probabilistic as well as a possibilistic interpretation, which can be used for inferring a final decision as follows:
• the maximum of both indices indicates whether the corresponding hypothesis is more likely to be accepted or rejected from a probability-theoretical point of view;
• the minimum of both indices and its difference from the maximum indicate whether or not such an acceptance or rejection is justifiable from a possibility-theoretical point of view.
In other words, if the possibility of the acceptance (space) and the possibility of the rejection (space), with respect to the possibility distribution of the fuzzy sample vector, are not significantly different, then there is no evidence for inferring a crisp conclusion.

As a representative of nonparametric and nonsignificance tests, we considered the so-called probability paper and adapted its test procedure to deal with fuzzy data. The test result was expressed in terms of a numerical value, representing a degree of possibility for accepting the hypothesis (normal distribution) on the basis of a drawn fuzzy sample.

In Section 3 we discussed the issue of conclusiveness in classical statistical hypotheses testing. We disclosed the fuzzy potential of statistical hypotheses and, by taking advantage of this, proposed a way to overcome the general deficiency of being conclusive in only one direction. We saw how one can infer a fuzzy subset of the set of all possible hypotheses, representing the most likely verified hypotheses.

Finally, we can say that employing fuzzy data in the test procedure and defuzzifying the obtained fuzzy test result can lead to conclusions other than those inferred by classical test procedures applied to defuzzified data. The statistician has to decide whether the fuzziness of the data is significant enough in relation to their random variation, so that he can expect such deviations.
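The two decision criteria listed above can be written down as a small rule. The Python sketch below is one possible reading only: the threshold `min_gap`, which stands in for "significantly different", is a hypothetical parameter that the paper does not specify, and the function name is likewise an assumption.

```python
def summarize_fuzzy_test(phi_a, phi_r, min_gap=0.5):
    """Turn the acceptance index phi_a and the rejecting index phi_r into a
    verbal decision; min_gap is a hypothetical threshold chosen by the user."""
    # The difference between maximum and minimum index measures how clearly
    # one of the two alternatives is favored.
    gap = abs(phi_a - phi_r)
    if gap < min_gap:
        return "inconclusive"
    # The larger index indicates the direction of the decision.
    return "accept" if phi_a > phi_r else "reject"


if __name__ == "__main__":
    print(summarize_fuzzy_test(1.0, 0.1))
    print(summarize_fuzzy_test(0.2, 1.0))
    print(summarize_fuzzy_test(1.0, 0.9))
```

How large the gap has to be remains a matter for the statistician, in line with the closing remark that the fuzziness of the data has to be weighed against their random variation.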
Acknowledgements

The first author gratefully acknowledges the encouragement and ideas given by Professor R. Viertl and Dr. S. Schnatter, presently at Institut für Statistik und Wahrscheinlichkeitstheorie, Technische Universität Wien, Austria and Institut für Statistik, Abt. experimentelle Mathematik und Statistik, Wirtschaftsuniversität Wien, Austria, respectively. Thanks are also due to R. Rehling for his keen interest and support shown in the development of this paper. The second author would like to acknowledge the partial support of this research by USF DSR research grant number 7903-902RO.
References

[1] H. Bandemer and A. Kraut, A case study on modelling impreciseness and vagueness of observations to evaluate a functional relationship, in: W.H. Janko, Ed., Progress in Fuzzy Sets and Systems (Kluwer Academic, Netherlands, 1990) pp. 7-21.
[2] D. Dubois and H. Prade, Operations on fuzzy numbers, Internat. J. System Sci. 9 (1978) 613-626.
[3] D. Dubois and H. Prade, Fuzzy Sets and Systems: Theory and Applications (Academic Press, New York, 1985).
[4] C.D. Ferris, F.E. Grubbs and C.L. Weaver, Operating characteristics for the common statistical tests of significance, Ann. Math. Statist. 17 (1946) 178.
[5] M. Fisz, Probability Theory and Mathematical Statistics (Wiley, New York, 1963).
[6] N.L. Johnson and B.L. Welch, Applications of the non-central t-distribution, Biometrika 31 (1939) 362.
[7] A. Kandel, Fuzzy Mathematical Techniques with Applications (Addison Wesley, Reading, MA, 1986).
[8] A. Kandel, Ed., Fuzzy Expert Systems (CRC Press, Boca Raton, FL, 1991).
[9] A. Kandel and G. Langholz, Eds., Architectures for Hybrid Intelligent Systems (CRC Press, Boca Raton, FL, 1992).
[10] A. Kandel and G. Langholz, Eds., Fuzzy Control Systems (CRC Press, Boca Raton, FL, 1994).
[11] R. Kruse and K.D. Meyer, Statistics with Vague Data, Theory and Decision Library, Series B, Mathematical and Statistical Methods (Reidel, Dordrecht/Boston, 1987).
[12] E.L. Lehmann, Testing Statistical Hypotheses (Wiley, New York, 1986).
[13] H.T. Nguyen, A note on the extension principle for fuzzy sets, J. Math. Anal. Appl. 64 (1978) 369-380.
[14] J. Neyman and E.S. Pearson, On the problem of the most efficient tests of statistical hypotheses, Phil. Trans. Roy. Soc. London Ser. A 231 (1933) 289.
[15] J. Neyman and E.S. Pearson, Contributions to the theory of testing statistical hypotheses, Statist. Res. Memoir 1 (1936) 1; 2 (1938) 25.
[16] J. Neyman and B. Tokarska, Errors of the second kind in testing Student's hypothesis, J. Amer. Statist. Assoc. 31 (1936) 318.
[17] L. Schmetterer, Introduction to Mathematical Statistics (Springer, Berlin, 1974).
[18] S. Schnatter, Linear dynamic systems and fuzzy data, in: R. Trappl, Ed., Cybernetics and Systems '90 (World Scientific, Singapore, 1990).
[19] S. Schnatter, Descriptive statistics and statistical estimation for fuzzy data, Research Report RIS-1989-8, Institut für Statistik und Wahrscheinlichkeitstheorie, Technische Universität Wien (1989).
[20] R. Viertl, Is it necessary to develop a fuzzy Bayesian inference? in: R. Viertl, Ed., Probability and Bayesian Statistics (Plenum, New York, 1987) 471-475.
[21] R. Viertl, On statistical estimation for fuzzy data, Research Report RIS-1989-3, Institut für Statistik und Wahrscheinlichkeitstheorie, Technische Universität Wien (1989).
[22] R. Viertl, Modelling of fuzzy measurements in reliability estimation, in: V. Colombari, Ed., Reliability Data Collection and Use in Risk and Availability Assessment (Springer, Berlin, 1989).
[23] R. Viertl, Estimation of the reliability function using fuzzy life time data, in: P.K. Bose, S.P. Mukerjee, K.G. Ramamurthy, Eds., Quality for Progress and Development (Wiley, New Delhi, 1989).
[24] R. Viertl and H. Hule, On Bayes' theorem for fuzzy data, Research Report RIS-1988-6, Institut für Statistik und Wahrscheinlichkeitstheorie, Technische Universität Wien (1988).
[25] L.A. Zadeh, Fuzzy sets versus probability, IEEE Proc. 68 (3) (1980) 421.