Fuzzy data dependencies and implication of fuzzy data dependencies


Fuzzy Sets and Systems 92 (1997) 341-348

Wei-Yi Liu
Department of Computer Science, Yunnan University, Kunming, Yunnan, PR China
Received May 1995; revised June 1996

Abstract

In this paper, we give the definitions of fuzzy functional dependency (FFD), fuzzy multivalued dependency (FMVD) and fuzzy join dependency (FJD). We show that the inference rules for FFDs and FMVDs, which are similar to Armstrong's Axioms for the classical case, are sound and complete. Furthermore, we show that a classical data dependency satisfies the definition of a fuzzy data dependency. We also show that an FFD can be viewed as an FMVD and an FMVD can be viewed as an FJD. We discuss the test for implication of fuzzy data dependencies. A series of results similar to the classical results is obtained in this paper. © 1997 Elsevier Science B.V.

Keywords: Semantic proximity; Fuzzy data dependency; Implication of data dependencies

1. Introduction

In real-world applications, data are often partially known or ambiguous. For example, we may say that the height of Zhang is around 190 cm, or simply that Zhang is tall. The classical relational model cannot represent such information about Zhang's height. How can the relational model be extended to handle fuzzy data? Recently, many good results have been obtained in fuzzy information research [2-5, 7, 9]. Data dependencies are among the most important topics in the theory of relational databases. Raju defined fuzzy functional dependencies in terms of the membership function of the elements of a fuzzy relation [7]. The inference rules for fuzzy functional dependencies in [7] are incomplete: they form a complete set only when certain conditions hold (Theorem 5.1 in [7]).

Tripathy defined fuzzy multivalued dependencies in terms of the fuzzy Hamming weight, but did not discuss the completeness of the inference rules [9]. Up to now, there has been no complete and unified theory of fuzzy data dependencies. In this paper, we give the concept of the semantic proximity between two fuzzy attribute values. The semantic connections between two attribute values are more natural than Raju's and Tripathy's. This paper gives a unified description of null values, fuzzy values and classical values. Based on the semantic proximity, we give the definitions of fuzzy functional dependencies (FFDs), fuzzy multivalued dependencies (FMVDs) and fuzzy join dependencies (FJDs). The inference rules for FFDs and FMVDs in this paper, which are similar to Armstrong's Axioms for the classical case, are complete. Furthermore, we show that a classical data dependency satisfies the definition of a fuzzy data dependency. We also show that an FFD can be viewed as an FMVD and an FMVD can be viewed as an FJD. In addition, we discuss the test for implication of a data dependency by a set $\tilde{C}$ of FFDs and FJDs. We obtain a series of results similar to the theorems for the classical case.

0165-0114/97/$17.00 © 1997 Elsevier Science B.V. All rights reserved. PII S0165-0114(96)00173-X

2. Basic concepts [6,11]

A universal relation $U$ is a finite set of attributes. An instance $r$ of a relation scheme $R$ over $U$ is a two-dimensional table in which columns correspond to attributes and rows correspond to tuples.

A functional dependency (FD) on $U$ is a statement $X \to Y$, where $X, Y \subseteq U$. The FD $X \to Y$ holds in an instance $r$ on $U$ if, for all tuples $t_1, t_2 \in r$, $t_1[X] = t_2[X]$ implies $t_1[Y] = t_2[Y]$.

A multivalued dependency (MVD) on $U$ is a statement $X \twoheadrightarrow Y$, where $X, Y \subseteq U$ and $Z = U - XY$. The MVD $X \twoheadrightarrow Y$ holds in an instance $r$ on $U$ if, for any two tuples $t_1, t_2 \in r$ with $t_1[X] = t_2[X]$, there exists a tuple $t_3 \in r$ with $t_3[X] = t_1[X]$, $t_3[Y] = t_1[Y]$, and $t_3[Z] = t_2[Z]$.

Let $R = \{R_1, \ldots, R_n\}$ be a set of relation schemes over $U$. A join dependency (JD) on $U$ is a statement of the form $*[R_1, \ldots, R_n]$. The JD $*[R_1, \ldots, R_n]$ holds in an instance $r$ on $U$ if $r$ decomposes losslessly onto $R_1, R_2, \ldots, R_n$, that is,

$$r = \Pi_{R_1}(r) \bowtie \Pi_{R_2}(r) \bowtie \cdots \bowtie \Pi_{R_n}(r).$$

We also write $*[R_1, \ldots, R_n]$ as $*R$.

Let $C$ be a set of data dependencies (i.e., FDs and JDs) and let $p$ be a data dependency on $U$. The dependency $p$ is implied by $C$ if and only if, whenever $C$ holds in an instance $r$ on $U$, $p$ also holds in $r$. Let $C$ and $C'$ be two sets of data dependencies on $U$. If both $C$ implies $C'$ (i.e., every dependency of $C'$ is implied by $C$) and $C'$ implies $C$, then $C$ and $C'$ are equivalent.

The tableau $T_R$ associated with the JD $*R = *[R_1, \ldots, R_n]$ is a two-dimensional table in which the columns correspond to the attributes of $U$. There are $n$ rows in $T_R$; each $R_i$ in $R$ corresponds to a row $w_i$. The entries of $T_R$ are variables chosen from a set $V$, which is the union of two sets, $V_d$ and $V_n$. $V_d$ is the set of distinguished variables, denoted by subscripted $a$'s, and $V_n$ is the set of nondistinguished variables, denoted by subscripted $b$'s. Row $w_i$ contains $a$'s in the $R_i$-columns and $b$'s in the $(U - R_i)$-columns. Let $U = \{A_1, \ldots, A_m\}$. The distinguished variable appearing in the $A_i$-column is $a_i$. The nondistinguished variables in $T_R$ are all distinct.

Table 1
$T_R$

        A1    A2    A3    A4    A5    A6
w1      a1    a2    a3    b1    b2    b3
w2      b4    b5    a3    a4    b6    b7
w3      b8    b9    b10   a4    a5    a6

Example 1. Let $R = *[A_1A_2A_3, A_3A_4, A_4A_5A_6]$. See Table 1.

An $X$-column tableau, written $T_X$, has two rows, $W_d$ and $W_x$. Row $W_d$ consists entirely of distinguished symbols; $W_x$ has distinguished symbols in the $X$-columns and distinct nondistinguished symbols elsewhere. That is, $T_X = T_R$ for $R = \{U, X\}$.

Let $C$ be a set of FDs and JDs and let $*R$ be a JD. The chase of $T_R$ under $C$, denoted by $\mathrm{CHASE}_C(T_R)$, is the tableau obtained from $T_R$ by applying the following two rules until no rule can be applied any more:

(1) FD-rule. Let $X \to Y$ be an FD in $C$ and let $A$ be in $Y$. Let $W_i$ and $W_j$ be two rows of $T_R$ that agree in all the $X$-columns but disagree in the $A$-column, and let $W_j$ have a nondistinguished variable in the $A$-column. The FD-rule for $X \to Y$ replaces all occurrences of the variable appearing in the $A$-column of $W_j$ with the variable appearing in the $A$-column of $W_i$.

(2) JD-rule. A JD-rule associated with a JD $*R$ in $C$, where $*R = *[R_1, \ldots, R_n]$, adds rows to $T_R$ as follows. Let $W_1, \ldots, W_n$ be $n$ rows (not necessarily distinct) of $T_R$. If there exists a mapping $W$ defined on $R$ such that $W_i[R_i] = W[R_i]$ for each $i$, $1 \le i \le n$, and $W$ is not already in $T_R$, then row $W$ is added to $T_R$.
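The tableau construction and the two chase rules can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the paper's procedure verbatim: the names `tableau`, `chase` and `has_all_distinguished_row` are hypothetical, variables are encoded as pairs such as `("a", 3)` and `("b", 10)`, and every JD passed in `jds` is assumed to cover all attributes.

```python
from itertools import combinations


def tableau(attrs, schemes):
    """Build T_R: row i gets a_j in the columns of R_i and a fresh b elsewhere."""
    rows, fresh = [], 0
    for scheme in schemes:
        row = []
        for j, attr in enumerate(attrs, start=1):
            if attr in scheme:
                row.append(("a", j))      # distinguished variable a_j
            else:
                fresh += 1
                row.append(("b", fresh))  # nondistinguished, used only once
        rows.append(tuple(row))
    return rows


def _fd_step(rows, col, fds):
    """Apply the FD-rule once; return the new row set, or None if nothing applies."""
    for lhs, rhs in fds:
        for w1, w2 in combinations(rows, 2):
            if all(w1[col[a]] == w2[col[a]] for a in lhs):
                for a in rhs:
                    v1, v2 = w1[col[a]], w2[col[a]]
                    if v1 != v2:
                        # ("a", j) sorts before ("b", k), so a distinguished
                        # variable always survives the renaming.
                        keep, drop = min(v1, v2), max(v1, v2)
                        return {tuple(keep if v == drop else v for v in w)
                                for w in rows}
    return None


def _jd_step(rows, attrs, col, jds):
    """Apply the JD-rule once: add the rows obtained by joining the projections."""
    for schemes in jds:
        partial = [{}]
        for scheme in schemes:
            projections = {tuple((a, w[col[a]]) for a in attrs if a in scheme)
                           for w in rows}
            partial = [{**p, **dict(proj)}
                       for p in partial for proj in projections
                       if all(p.get(a, v) == v for a, v in proj)]
        new = {tuple(p[a] for a in attrs)
               for p in partial if len(p) == len(attrs)} - rows
        if new:
            return rows | new
    return None


def chase(attrs, rows, fds=(), jds=()):
    """Apply the FD-rule and the JD-rule until neither changes the tableau."""
    col = {a: k for k, a in enumerate(attrs)}
    rows = set(rows)
    while True:
        step = _fd_step(rows, col, fds) or _jd_step(rows, attrs, col, jds)
        if step is None:
            return rows
        rows = step


def has_all_distinguished_row(rows):
    """True if some row consists of distinguished variables only."""
    return any(all(kind == "a" for kind, _ in w) for w in rows)
```

Calling `tableau` with the attributes A1, ..., A6 and the three schemes of Example 1 yields the rows of Table 1. For the classical scheme {AB, BC} with the FD B → C, `chase` produces an all-distinguished row, which is the usual lossless-join test.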

3. The semantic proximity of fuzzy values and the fuzzy relational operations

Each attribute $A_i \in U$ has associated with it a domain denoted by $\mathrm{dom}(A_i)$. The fuzzy set theory and fuzzy logic proposed by Zadeh [11] provide a requisite mathematical framework for dealing with such fuzzy data values. Let $\mathrm{dom}(A_i)$ be a universe of discourse. A fuzzy subset $X$ in $\mathrm{dom}(A_i)$ is characterized by a membership function $f_X: \mathrm{dom}(A_i) \to [0, 1]$, where $f_X(u)$ for each $u \in \mathrm{dom}(A_i)$ denotes the grade of membership of $u$ in the fuzzy subset $X$. The interval number description and the center number description are the two most common descriptions of fuzzy values [3, 4].

3.1. The interval number description

A fuzzy subset $X$ in $\mathrm{dom}(A_i)$ is characterized by an ordered couple $[a, b]/p$. $[a, b]$ is called the interval number, where $a, b$ are real numbers; it expresses that the fuzzy subset lies between $a$ and $b$. $p$ ($0 \le p \le 1$) is the degree of confidence.

3.2. The center number description

A fuzzy subset $X$ in $\mathrm{dom}(A_i)$ is characterized by $(c, r)/p$. It expresses that the fuzzy subset lies in a spherical region, where $c$ is the center of the sphere and $r$ is its radius. For instance, we say that the length of a string is $10.34 \pm 0.02$ cm.

In this paper, our discussion uses the interval number description. Of course, the same method can be used to deal with the center number description and Zadeh's description. For convenience, we sometimes omit the confidence degree of fuzzy values, i.e., the confidence degree of every fuzzy value is normalized to 1: $[a, b]/p$ is simplified to $[a - \beta, b + \beta]$, where $\beta = |b - a| \cdot (1 - p)$. For example, $[3, 5]/0.9$ is denoted $[2.8, 5.2]$. This is a simple and intuitive method; an exact method is not discussed in this paper.

We can extend the description of fuzzy values to deal with null values (i.e., values that exist but are unknown). A null value can be viewed as a special case of a fuzzy value whose range is the whole universe of discourse. The null value is denoted by $[l, u]$, where $u$ is the upper bound and $l$ is the lower bound. Of course, a classical value $b$ is denoted $[b, b]$. The confidence degree of a classical value is 1.

The degree of proximity between two fuzzy values is described by the semantic proximity. Based on the interval number description, we discuss the semantic proximity of fuzzy values, written $SP(f_1, f_2)$ ($0 \le SP(f_1, f_2) \le 1$). Let $f_1 = [a_1, b_1]$, $f_2 = [a_2, b_2]$, $g_1 = [c_1, d_1]$, $g_2 = [c_2, d_2]$. The following properties ought to be satisfied by $SP$:

1. $SP(f_1, f_2) = 1$ if and only if $a_1 = b_1 = a_2 = b_2$.
2. $SP(f_1, f_2) = 0$ if and only if $f_1 \cap f_2 = \emptyset$.
3. If $a_1 = a_2$, $b_1 = b_2$, $c_1 = c_2$, $d_1 = d_2$ and $|d_1 - c_1| > |b_1 - a_1|$, then $SP(f_1, f_2) \ge SP(g_1, g_2)$.
4. If $|a_2 - b_2| = |a_1 - b_1|$ and $f_1 \cap g_1 \supseteq f_2 \cap g_1$, then $SP(f_1, g_1) \ge SP(f_2, g_1)$.

We give a concrete instance:

$$SP(f_1, f_2) = \frac{\|f_1 \cap f_2\|}{\|f_1 \cup f_2\|} - \frac{\|f_1 \cap f_2\|}{\alpha},$$

where $\|h\|$ is the modulus of the interval $h$:

$$\|h\| = \begin{cases} 0, & h = \emptyset, \\ \delta, & h = [a, a], \\ |b - a|, & h = [a, b] \text{ and } a \ne b, \\ \alpha, & h = \infty. \end{cases}$$

Here $\alpha$ is a given coefficient for the universe of discourse, $\alpha \ge \|f_1 \cup f_2\|$, and $\delta$ is a very small number. We select $\delta = \alpha/10000$.

Let $t_1 = (x_{11}, \ldots, x_{1n})$, $t_2 = (x_{21}, \ldots, x_{2n})$ be two tuples in an instance. Then $SP(t_1, t_2) = \mathrm{MIN}_i \{SP(x_{1i}, x_{2i})\}$. Semantic proximity on the basis of the other descriptions of fuzzy values can be defined similarly.

Example 2. Let $\alpha = 300$ and $\delta = \alpha/10000$.

1. Suppose $f_1 = [10, 10]$ and $f_2 = [10, 10]$; then $SP(f_1, f_2) = \delta/\delta - \delta/300 = 0.9999 \approx 1$.
2. Suppose $f_1 = [2, 5]$ and $f_2 = [6, 7]$; then $SP(f_1, f_2) = 0/4 - 0/\alpha = 0$.
3. Suppose $f_1 = [2, 5]$, $f_2 = [2, 5]$, $g_1 = [2, 8]$, $g_2 = [2, 8]$. Then $SP(f_1, f_2) = 3/3 - 3/300 = 0.99$ and $SP(g_1, g_2) = 6/6 - 6/300 = 0.98$. We have $SP(f_1, f_2) \ge SP(g_1, g_2)$.
4. Suppose $f_1 = [3, 5]$, $f_2 = [2, 4]$, and $g_1 = [3, 6]$. Then $SP(f_1, g_1) = 2/3 - 2/300 = 0.66$ and $SP(f_2, g_1) = 1/4 - 1/300 \approx 0.25$. We have $SP(f_1, g_1) \ge SP(f_2, g_1)$.
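The concrete semantic proximity can be checked numerically. The following Python sketch reproduces the values of Example 2 under a few assumptions that are not in the paper: intervals are given as pairs, exact rational arithmetic (`fractions.Fraction`) is used to avoid rounding noise, and the case $h = \infty$ of the modulus is not handled. The function names `norm`, `sp` and `widen` are hypothetical.

```python
from fractions import Fraction


def _frac(x):
    # str() keeps decimal literals such as 2.8 exact (Fraction(2.8) would not).
    return Fraction(str(x))


def norm(h, delta):
    """Modulus ||h|| of a closed interval h = (lo, hi); None is the empty set."""
    if h is None:
        return Fraction(0)
    lo, hi = h
    return delta if lo == hi else hi - lo


def sp(f1, f2, alpha):
    """SP(f1, f2) = ||f1 n f2|| / ||f1 u f2|| - ||f1 n f2|| / alpha."""
    alpha = _frac(alpha)
    delta = alpha / 10000
    a1, b1 = map(_frac, f1)
    a2, b2 = map(_frac, f2)
    lo, hi = max(a1, a2), min(b1, b2)
    inter = norm((lo, hi) if lo <= hi else None, delta)
    if inter == 0:
        return Fraction(0)  # disjoint intervals
    # The intervals overlap, so their union is the enclosing interval.
    union = norm((min(a1, a2), max(b1, b2)), delta)
    return inter / union - inter / alpha


def widen(interval, p):
    """Fold the confidence degree p into the interval: [a - beta, b + beta]."""
    a, b = map(_frac, interval)
    beta = abs(b - a) * (1 - _frac(p))
    return a - beta, b + beta
```

For instance, `sp((2, 5), (2, 5), 300)` evaluates to 99/100 and `widen((3, 5), 0.9)` to (14/5, 26/5), i.e. [2.8, 5.2] as in Section 3.2.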

Table 2

        A1        A2        A3         A4        A5
t1      [4,12]    [5,12]    [5,12]     [6,13]    [8,11]
t2      [3,6]     [3,8]     [10,10]    [4,6]     [6,10]
t3      [3,6]     [3,10]    [10,10]    [4,6]     [6,11]

In a fuzzy environment, the problem of duplicate tuples must be considered. The duplicate tuples must be deleted from the fuzzy instance. Intuitively, two tuples are duplicate in a fuzzy instance if the semantic proximity of each attribute value in two tuples is greater than the given value.

Definition 1. Let $r$ be a fuzzy instance and let $t_i(d_{i1}, d_{i2}, \ldots, d_{in})$, $t_j(d_{j1}, d_{j2}, \ldots, d_{jn})$ be two tuples in $r$. $t_i, t_j$ are duplicate with respect to level $L$ ($0 \le L \le 1$) if $SP(t_i[d_k], t_j[d_k]) \ge L$ for $k = 1, 2, \ldots, n$. For the classical case, if $SP(t_i[d_k], t_j[d_k]) = 1$ ($k = 1, 2, \ldots, n$), we say that $t_i, t_j$ are duplicate.

Example 3. Let $U = (A_1, A_2, \ldots, A_5)$ and Level = 0.7. $r$ is a fuzzy instance on $U$ (see Table 2). We notice that $\mathrm{MIN}(SP(t_2[A_1], t_3[A_1]), \ldots, SP(t_2[A_5], t_3[A_5])) = SP(t_2[A_2], t_3[A_2]) = 0.702 > 0.7$, so $t_2, t_3$ are duplicate tuples.

Under the new definition of duplicate tuples, the classical operations carry over almost unchanged to fuzzy relations.

(1) Projection. We select some columns from instance $r$ and arrange them into a new instance. If $t_i, t_j$ in the new instance are duplicate, then one of them is deleted from the new instance.

(2) Union. The union of fuzzy instances $r$ and $s$ is the set of tuples that are in $r$ or $s$. There are no duplicate tuples in the union of $r$ and $s$.

(3) Cartesian product. It is the same for fuzzy instances as for classical instances.

(4) Selection. Fuzzy instances are similar to classical instances. For example, $\sigma_{SP(f, t[A]) > \varepsilon}(r)$ denotes the set of tuples in $r$ whose attribute $A$ satisfies $SP(f, t[A]) > \varepsilon$. The given value $f$ may be a fuzzy value.

(5) Set difference. The difference of instances $r$ and $s$ is denoted $r - s$: $r - s = \{t \mid t \in r \text{ and } t \ne_L t' \text{ for every } t' \in s\}$, where $t \ne_L t'$ means that $t$ and $t'$ are not duplicate with respect to level $L$.

(6) Join. The $\theta$-join of $r$ and $s$, written $r \bowtie_{i \theta j} s$, consists of the tuples in $r \times s$ such that the $i$th component of $r$ stands in relation $\theta$ to the $j$th component of $s$. For example, if the expression $i \theta j$ is $SP(t[A_i], t'[A_i]) > \varepsilon$ ($t \in r$, $t' \in s$), this operation is called the natural join, written $r \bowtie s$.
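Duplicate detection in the sense of Definition 1 can be sketched as follows. The paper does not state the coefficient $\alpha$ used in Example 3; $\alpha = 400$ is an assumption chosen here because it reproduces the reported value 0.702. The names `tuple_sp`, `duplicate` and `dedupe` are hypothetical, and `dedupe` keeps the first tuple of each duplicate group, which is one possible reading of the projection rule.

```python
from fractions import Fraction


def sp(f1, f2, alpha):
    """Concrete SP of Section 3 for closed intervals given as pairs."""
    alpha = Fraction(str(alpha))
    delta = alpha / 10000
    (a1, b1), (a2, b2) = [tuple(Fraction(str(v)) for v in f) for f in (f1, f2)]
    lo, hi = max(a1, a2), min(b1, b2)
    if lo > hi:
        return Fraction(0)                   # empty intersection
    inter = (hi - lo) or delta               # a single point has modulus delta
    union = (max(b1, b2) - min(a1, a2)) or delta
    return inter / union - inter / alpha


def tuple_sp(t1, t2, alpha):
    """SP of two tuples: the minimum over their attribute values."""
    return min(sp(x, y, alpha) for x, y in zip(t1, t2))


def duplicate(t1, t2, level, alpha):
    """Definition 1: every attribute pair reaches the level L."""
    return tuple_sp(t1, t2, alpha) >= Fraction(str(level))


def dedupe(rows, level, alpha):
    """Keep a tuple only if it duplicates none of the tuples kept so far."""
    kept = []
    for t in rows:
        if not any(duplicate(t, k, level, alpha) for k in kept):
            kept.append(t)
    return kept


# Rows of Table 2.
T1 = [(4, 12), (5, 12), (5, 12), (6, 13), (8, 11)]
T2 = [(3, 6), (3, 8), (10, 10), (4, 6), (6, 10)]
T3 = [(3, 6), (3, 10), (10, 10), (4, 6), (6, 11)]
ALPHA = 400  # assumption: not given in the paper for this example
```

With these values, `tuple_sp(T2, T3, ALPHA)` is 393/560 (about 0.702), so `dedupe([T1, T2, T3], 0.7, ALPHA)` keeps only T1 and T2.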

4. Fuzzy data dependencies

Data dependencies of fuzzy relations involve fuzzy concepts, such as 'the salary almost depends on the experience and the job'. Based on the concept of the semantic proximity, we give the definitions of the data dependencies.

Definition 2. A fuzzy functional dependency (FFD) $X \rightsquigarrow Y$ with $X, Y \subseteq U$ holds in a fuzzy instance $r$ on $U$ if, for all $t_i, t_j \in r$, we have $SP(t_i[X], t_j[X]) \le SP(t_i[Y], t_j[Y])$.
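Definition 2 can be tested directly on a small instance. The sketch below checks the condition $SP(t_i[X], t_j[X]) \le SP(t_i[Y], t_j[Y])$ for all pairs of tuples, using the columns D, Dg and A of Table 3 (Example 4) with $\alpha = 10$. The function names are hypothetical, and exact rational arithmetic is an implementation choice of this sketch.

```python
from fractions import Fraction


def sp(f1, f2, alpha):
    """Concrete SP of Section 3 for closed intervals given as pairs."""
    alpha = Fraction(str(alpha))
    delta = alpha / 10000
    (a1, b1), (a2, b2) = [tuple(Fraction(str(v)) for v in f) for f in (f1, f2)]
    lo, hi = max(a1, a2), min(b1, b2)
    if lo > hi:
        return Fraction(0)                   # empty intersection
    inter = (hi - lo) or delta               # a single point has modulus delta
    union = (max(b1, b2) - min(a1, a2)) or delta
    return inter / union - inter / alpha


def tuple_sp(t1, t2, attrs, alpha):
    """SP of two tuples restricted to attrs: minimum over those attributes."""
    return min(sp(t1[a], t2[a], alpha) for a in attrs)


def ffd_holds(r, X, Y, alpha):
    """Definition 2: SP on X never exceeds SP on Y, for every pair of tuples."""
    return all(tuple_sp(ti, tj, X, alpha) <= tuple_sp(ti, tj, Y, alpha)
               for ti in r for tj in r)


# Columns D, Dg and A of Table 3 (Example 4).
R = [
    {"D": (1, 2), "Dg": (4, 6), "A": (2.8, 4.4)},
    {"D": (1, 5), "Dg": (5, 9), "A": (3.4, 7.4)},
    {"D": (4, 6), "Dg": (3, 5), "A": (3.4, 5.4)},
    {"D": (6, 9), "Dg": (6, 8), "A": (6, 8.4)},
]
```

Here `ffd_holds(R, ["D", "Dg"], ["A"], 10)` is true, in agreement with Example 4, whereas `ffd_holds(R, ["A"], ["D"], 10)` is false because the second and third tuples are closer on A (0.3) than on D (0.1).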

Theorem 1. A classical FD satisfies the definition of FFD.

Proof. If $t_i[X] = t_j[X]$ implies $t_i[Y] = t_j[Y]$, then $SP(t_i[X], t_j[X]) = SP(t_i[Y], t_j[Y]) = 1$. If $t_i[X] \ne t_j[X]$, some attribute of $X$ carries two distinct classical values, so $SP(t_i[X], t_j[X]) = 0$ and the condition of Definition 2 holds trivially. □

Definition 3. Let $X, Y \subseteq U$ and $Z = U - XY$. A fuzzy multivalued dependency (FMVD) $X \rightsquigarrow\rightsquigarrow Y$ holds in a fuzzy instance $r$ on $U$ if, for any two tuples $t_i, t_j \in r$ with $SP(t_i[X], t_j[X]) = \alpha$, there exists a tuple $t$ in $r$ with $SP(t[X], t_i[X]) \ge \alpha$, $SP(t[Y], t_i[Y]) \ge \alpha$, and $SP(t[Z], t_j[Z]) \ge \alpha$.

Theorem 2. A classical MVD satisfies the definition of FMVD.

Proof. The conclusion follows immediately from the definitions of MVD and FMVD. □
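A brute-force check of Definition 3 can be sketched as follows. The names are hypothetical; the threshold that the definition calls $\alpha$ is named `level` in the code to keep it apart from the universe coefficient `alpha` of Section 3, and treating the proximity on an empty attribute set as 1 is an assumption of this sketch.

```python
from fractions import Fraction


def sp(f1, f2, alpha):
    """Concrete SP of Section 3 for closed intervals given as pairs."""
    alpha = Fraction(str(alpha))
    delta = alpha / 10000
    (a1, b1), (a2, b2) = [tuple(Fraction(str(v)) for v in f) for f in (f1, f2)]
    lo, hi = max(a1, a2), min(b1, b2)
    if lo > hi:
        return Fraction(0)                   # empty intersection
    inter = (hi - lo) or delta               # a single point has modulus delta
    union = (max(b1, b2) - min(a1, a2)) or delta
    return inter / union - inter / alpha


def tuple_sp(t1, t2, attrs, alpha):
    """Minimum SP over attrs; 1 for an empty attribute set (assumption)."""
    return min((sp(t1[a], t2[a], alpha) for a in attrs), default=Fraction(1))


def fmvd_holds(r, X, Y, U, alpha):
    """Definition 3: every pair (ti, tj) needs a witness tuple t in r."""
    Z = [a for a in U if a not in X and a not in Y]
    for ti in r:
        for tj in r:
            level = tuple_sp(ti, tj, X, alpha)
            witness = any(
                tuple_sp(t, ti, X, alpha) >= level
                and tuple_sp(t, ti, Y, alpha) >= level
                and tuple_sp(t, tj, Z, alpha) >= level
                for t in r)
            if not witness:
                return False
    return True


def crisp(x, y, z):
    """A classical tuple over (X, Y, Z), each value b stored as [b, b]."""
    return {"X": (x, x), "Y": (y, y), "Z": (z, z)}
```

On classical data this behaves like the ordinary MVD test: the four tuples (1,1,1), (1,1,2), (1,2,1), (1,2,2) satisfy the dependency of Y on X, while the two tuples (1,1,1) and (1,2,2) alone do not.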

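We now give the inference rules for FFDs and FMVDs, which are the same as Armstrong's Axioms for the classical case.

FA1. If $Y \subseteq X$, then $X \rightsquigarrow Y$.
FA2. If $X \rightsquigarrow Y$, then $XW \rightsquigarrow YW$.
FA3. If $X \rightsquigarrow Y$ and $Y \rightsquigarrow Z$, then $X \rightsquigarrow Z$.
FA4. If $X \rightsquigarrow\rightsquigarrow Y$, then $X \rightsquigarrow\rightsquigarrow (U - XY)$.
FA5. If $X \rightsquigarrow\rightsquigarrow Y$ and $V \subseteq W$, then $XW \rightsquigarrow\rightsquigarrow YV$.
FA6. If $X \rightsquigarrow\rightsquigarrow Y$ and $Y \rightsquigarrow\rightsquigarrow Z$, then $X \rightsquigarrow\rightsquigarrow (Z - Y)$.
FA7. If $X \rightsquigarrow Y$, then $X \rightsquigarrow\rightsquigarrow Y$.
FA8. If $X \rightsquigarrow\rightsquigarrow Y$, $Z \subseteq Y$, $Y \cap W = \emptyset$ and $W \rightsquigarrow Z$, then $X \rightsquigarrow Z$.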
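Since FA1-FA3 have the same form as Armstrong's Axioms, the classical attribute-closure computation can be reused to decide whether an FFD follows from a set of FFDs. The Python sketch below assumes that only FFDs are given (no FMVDs, so FA4-FA8 play no role); the names `closure` and `implies` are hypothetical. The example set is $\tilde{F}$ from Example 4.

```python
def closure(attrs, ffds):
    """Attributes derivable from attrs with FA1-FA3 (classical closure loop)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in ffds:
            # Fire a dependency whose left side is already in the closure.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def implies(ffds, X, Y):
    """True if Y lies in the closure of X, i.e. the FFD from X to Y is derivable."""
    return set(Y) <= closure(X, ffds)


# The FFDs of Example 4.
F = [
    ({"G", "Pw"}, {"W"}),
    ({"D", "Dg"}, {"A"}),
    ({"W", "A", "S"}, {"P"}),
    ({"W", "A", "P"}, {"An"}),
]
```

For example, `closure({"G", "Pw", "D", "Dg", "S"}, F)` contains all nine attributes of Example 4, so these five attributes determine every other one.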

Theorem 3. Rules FA1-FA8 are sound.

Proof. The proof follows directly from the proofs of Armstrong's Axioms for the classical case [1, 9]. □

Theorem 4. Rules FA1-FA8 are complete. Let $F$, $G$ be sets of FFDs and FMVDs on $U$, respectively. All FFDs and FMVDs that are logically implied by $F$ and $G$ can be deduced from $F$, $G$ by FA1-FA8.

Proof. We prove the following proposition. Let $(F, G)^+$ be the closure of $F$ and $G$. For a given FFD $f = A \rightsquigarrow B$ or FMVD $g = C \rightsquigarrow\rightsquigarrow D$ that does not belong to $(F, G)^+$, there exists an instance $r$ on $U$ such that all the dependencies in $F$ and $G$ are valid in $r$ but $A \rightsquigarrow B$ or $C \rightsquigarrow\rightsquigarrow D$ is invalid in $r$. Let $F'$, $G'$ be the two sets of classical dependencies that correspond to $F$ and $G$ (i.e., $F' = \{X \to Y \mid X \rightsquigarrow Y \in F\}$, $G' = \{X \twoheadrightarrow Y \mid X \rightsquigarrow\rightsquigarrow Y \in G\}$). Let $f' = A \to B$ and $g' = C \twoheadrightarrow D$. We construct an instance $r'$ that satisfies $F'$ and $G'$ but does not satisfy $f'$ and $g'$. By Theorems 1 and 2, we know that $r'$ satisfies $F$ and $G$ but does not satisfy $f$ and $g$. The problem is thus transformed into the corresponding classical problem [1]. □

Theorem 5. An FFD can be viewed as a special case of an FMVD.

Proof. By FA7, if $r$ satisfies the FFD $X \rightsquigarrow Y$, then $r$ satisfies the FMVD $X \rightsquigarrow\rightsquigarrow Y$. □

Let $R = \{R_1, \ldots, R_n\}$ be a set of relation schemes over $U$ and $r$ be an instance on $U$. $\Pi_{R_1}(r) \bowtie \cdots \bowtie \Pi_{R_n}(r)$ is denoted by $m_R(r)$. We have the following conclusions with respect to the project-join mappings, which are the same as in the classical case:

$$r \subseteq m_R(r), \qquad \Pi_{R_i}(m_R(r)) = \Pi_{R_i}(r), \qquad m_R(m_R(r)) = m_R(r).$$

Based on the above conclusions, we define the fuzzy join dependency.

Definition 4. Let $R = \{R_1, \ldots, R_n\}$ be a set of relation schemes over $U$ and $r$ be an instance on $U$. The fuzzy join dependency (FJD) $*[R_1, \ldots, R_n]$ holds in $r$ if $r = m_R(r)$.

Theorem 6. A classical JD satisfies the definition of FJD.

Proof. The conclusion follows immediately from the definitions of FJD and the fuzzy natural join. □

We can also define FJD in a manner similar to the definition of FMVD.

Definition 5. Let $R = \{R_1, \ldots, R_n\}$ be a set of relation schemes over $U$ and $r$ be an instance on $U$. Let $t_1, t_2, \ldots, t_n$ be tuples (not necessarily distinct) of $r$. We say that the FJD $*[R_1, \ldots, R_n]$ holds in $r$ if, for any two tuples $t_i, t_j \in r$ with $SP(t_i[R_i \cap R_j], t_j[R_i \cap R_j]) = \alpha_{ij}$, there exists a tuple $t$ in $r$ with $SP(t[R_i], t_i[R_i]) = a_i$, where $a_i \ge \mathrm{MIN}\{\alpha_{i1}, \ldots, \alpha_{in}\}$ ($i = 1, \ldots, n$).

Theorem 7. Definitions 4 and 5 are equivalent.

Proof. 1. Suppose an instance $r$ on $U$ satisfies Definition 5; we shall show that $r = m_R(r)$. We denote $r_i = \Pi_{R_i}(r)$. Let $s_i(a_1, a_2, \ldots, a_n) \in r_i$ and $s_j(b_1, b_2, \ldots, b_m) \in r_j$ be two tuples with $SP(a_n, b_1) \ge \varepsilon$, where $\varepsilon$ is the coefficient of the natural join between $r_i$ and $r_j$. We denote $s_i \frown s_j = (a_1 \ldots a_{n-1}, a_n, b_2 \ldots b_m)$.

Let $s$ be an arbitrary tuple in $r_1 \bowtie \cdots \bowtie r_n$. By the definition of the fuzzy natural join, there are tuples $s_1 \in r_1$, $s_2 \in r_2$, ..., $s_n \in r_n$ such that $s = s_1 \frown s_2 \frown \cdots \frown s_n$. Let $SP(s[R_1], s_1) = a_1$, ..., $SP(s[R_n], s_n) = a_n$. We know that $a_i \ge \mathrm{MIN}(\alpha_{i1}, \alpha_{i2}, \ldots, \alpha_{in})$, where $\alpha_{ij} = SP(s_i[R_i \cap R_j], s_j[R_i \cap R_j])$ is the coefficient of the natural join between $r_i$ and $r_j$. By the definition of the projection, there are $t_1, t_2, \ldots, t_n$ in $r$ such that $t_1[R_1] = s_1$, ..., $t_n[R_n] = s_n$. Obviously, $t_1, t_2, \ldots, t_n$ satisfy $SP(t_i[R_i \cap R_j], t_j[R_i \cap R_j]) \ge \alpha_{ij}$. Since $r$ satisfies Definition 5, there exists a tuple $t \in r$ such that $SP(t[R_1], t_1[R_1]) \ge \mathrm{MIN}(\alpha_{11}, \alpha_{12}, \ldots, \alpha_{1n})$, ..., $SP(t[R_n], t_n[R_n]) \ge \mathrm{MIN}(\alpha_{n1}, \alpha_{n2}, \ldots, \alpha_{nn})$. Therefore $SP(t, s)$ is close to 1, and we regard $s$ as being in $r$. Hence $r$ satisfies $r = r_1 \bowtie \cdots \bowtie r_n$.

2. Suppose $r = r_1 \bowtie \cdots \bowtie r_n$; we shall show that $r$ satisfies Definition 5. Let $t_1, t_2, \ldots, t_n$ be tuples in $r$ and $SP(t_i[R_i \cap R_j], t_j[R_i \cap R_j]) = \alpha_{ij}$ ($i, j = 1, \ldots, n$). By the definition of the projection, there are $s_1 \in r_1$, $s_2 \in r_2$, ..., $s_n \in r_n$ such that $s_1 = t_1[R_1]$, $s_2 = t_2[R_2]$, ..., $s_n = t_n[R_n]$. We know that there exists a tuple $s = s_1 \frown s_2 \frown \cdots \frown s_n$ in $r_1 \bowtie \cdots \bowtie r_n$. Thus $SP(s[R_1], t_1[R_1]) = a_1 \ge \mathrm{MIN}(\alpha_{11}, \alpha_{12}, \ldots, \alpha_{1n})$, ..., $SP(s[R_n], t_n[R_n]) = a_n \ge \mathrm{MIN}(\alpha_{n1}, \alpha_{n2}, \ldots, \alpha_{nn})$. Since $r = r_1 \bowtie \cdots \bowtie r_n$, $s$ is a tuple in $r$. Hence $r$ satisfies Definition 5. □

Theorem 8. An FMVD can be viewed as a special case of an FJD.

Proof. By Definitions 3 and 5, we notice that if $R = \{R_1, R_2\}$, then $*[R_1, R_2]$ is $R_1 \cap R_2 \rightsquigarrow\rightsquigarrow R_1 - R_2$. That is, $X \rightsquigarrow\rightsquigarrow Y$ can be regarded as $*[XY, XZ]$, where $Z = U - XY$. □

5. Testing implication of data dependencies

Based on the definitions of the fuzzy data dependencies, we discuss the implication among data dependencies. We obtain a series of theorems similar to the classical ones.

Theorem 9. Let $R = \{R_1, \ldots, R_p\}$, $S = \{S_1, \ldots, S_q\}$ be two sets of relation schemes over $U$. If for each $S_j \in S$ there is an $R_i \in R$ such that $R_i \supseteq S_j$, we say $R$ covers $S$, written $R \ge S$. Then $m_R(r) \supseteq m_S(r)$ for all instances $r$ on $U$ if and only if $R \le S$.

Proof. 1. Suppose $R \le S$. We shall show that $m_S(r) \subseteq m_R(r)$ for all instances $r$ on $U$. Let $t$ be a tuple in $m_S(r)$. We suppose $t = t_1[S_1] \frown t_2[S_2] \frown \cdots \frown t_q[S_q]$, where $t_1, t_2, \ldots, t_q \in r$ (i.e., $SP(t_i[S_i \cap S_j], t_j[S_i \cap S_j]) = \alpha_{ij}$ and $SP(t[S_i], t_i[S_i]) = a_i \ge \mathrm{MIN}\{\alpha_{i1}, \alpha_{i2}, \ldots, \alpha_{iq}\}$ ($i, j = 1, 2, \ldots, q$)). Since $R \le S$, for any $R_l, R_k \in R$ there are $S_i, S_j \in S$ such that $R_l \subseteq S_i$, $R_k \subseteq S_j$. If $R_k \cap R_l \ne \emptyset$, then $S_j \cap S_i \supseteq R_k \cap R_l$. From the definition of the semantic proximity, there are tuples $t_k, t_l \in r$ such that $SP(t_l[R_l \cap R_k], t_k[R_l \cap R_k]) \ge SP(t_l[S_j \cap S_i], t_k[S_i \cap S_j]) = \alpha_{ij}$. Thus $SP(t[R_l], t_l[R_l]) \ge \varepsilon$ ($l = 1, \ldots, p$). That is, $t = t_1[R_1] \frown t_2[R_2] \frown \cdots \frown t_p[R_p]$, so $t$ is a tuple in $m_R(r)$.

2. Suppose $m_R(r) \supseteq m_S(r)$ for all instances $r$ on $U$; we shall show that $R \le S$. We construct an instance $r$ as follows: $r$ has $q$ tuples $t_1, t_2, \ldots, t_q$. The tuple $t_i$ is defined as

$$t_i[A] = \begin{cases} 0 & \text{if } A \in S_i, \\ i & \text{otherwise,} \end{cases} \qquad 1 \le i \le q.$$

We denote $t_0 = (0, 0, \ldots, 0)$. It is not hard to see that $t_0$ must be in $m_S(r)$. Therefore, $t_0$ is in $m_R(r)$. By the nature of $m_R$, for each relation scheme $R_i$ in $R$, there has to be a tuple $t_j$ in $r$ such that $t_j[R_i] = t_0[R_i]$. Thus $R_i \subseteq S_j$ and $R \le S$. □

Making use of the chase tableau, we have a test for implication of an FJD by a set $\tilde{C}$ of FFDs and FJDs.

Theorem 10. Let $\tilde{C}$ be a set of FFDs and FJDs over $U$. Let $\tilde{R} = *[R_1, R_2, \ldots, R_n]$ be an FJD over $U$. $\tilde{C}$ implies $\tilde{R}$ if and only if $\mathrm{CHASE}_{\tilde{C}}(T_{\tilde{R}})$ contains a row $W_d = (a, a, \ldots, a)$.

Proof. Let $C$ be the classical set of FDs and JDs that corresponds to $\tilde{C}$, and let $R$ be the classical JD on $U$ that corresponds to $\tilde{R}$.

1. We shall show that if $\tilde{R}$ is implied by $\tilde{C}$, then $\mathrm{CHASE}_{\tilde{C}}(T_{\tilde{R}})$ contains a row $W_d = (a, a, \ldots, a)$. $\mathrm{CHASE}_{\tilde{C}}(T_{\tilde{R}})$ is denoted by $T^*$. The above proposition is equivalent to the following proposition: if there is no $W_d = (a, a, \ldots, a)$ in $T^*$, then there exists an instance $r$ on $U$ such that all dependencies in $\tilde{C}$ are valid in $r$ but $\tilde{R}$ is invalid in $r$.

Suppose there is no $W_d = (a, a, \ldots, a)$ in $T^*$. We look upon $T^*$ as a concrete instance on $U$. From the classical case (Fact 1 in [8]), we know that $T^*$ satisfies $C$ but does not satisfy $R$. By Theorems 1 and 6, $T^*$ satisfies $\tilde{C}$. Since $T^*$ is a classical instance and $T^*$ does not satisfy $R$, $T^*$ does not satisfy $\tilde{R}$.

2. We shall show that if there is a row $W_d = (a, a, \ldots, a)$ in $T^*$, then $\tilde{R}$ is implied by $\tilde{C}$. The above proposition is equivalent to the following proposition: if there exists an instance $r$ on $U$ such that all dependencies in $\tilde{C}$ are valid in $r$ but $\tilde{R}$ is invalid in $r$, then there is no $W_d = (a, a, \ldots, a)$ in $T^*$. We look upon $T^*$ as a concrete instance on $U$. From the classical case (Fact 1 in [8]), we know that if $T^*$ satisfies $C$ but does not satisfy $R$, then there is no $W_d = (a, a, \ldots, a)$ in $T^*$. By Theorems 1 and 6, $T^*$ satisfies $\tilde{C}$. Since $T^*$ does not satisfy $R$ and $T^*$ is a classical instance, $T^*$ does not satisfy $\tilde{R}$. □

We now turn to a test for implication of an FFD by $\tilde{C}$.

Theorem 11. Let $\tilde{C}$ be a set of FFDs and FJDs over $U$ and $XA \subseteq U$. $\tilde{C}$ implies $X \rightsquigarrow A$ if and only if $\mathrm{CHASE}_{\tilde{C}}(T_X)$ has only distinguished symbols 'a' in the $A$-column.

Proof. 1. We shall show that if $\tilde{C}$ implies $X \rightsquigarrow A$, then $\mathrm{CHASE}_{\tilde{C}}(T_X)$ has only distinguished symbols 'a' in the $A$-column ($\mathrm{CHASE}_{\tilde{C}}(T_X)$ is denoted by $T'$). The above proposition is equivalent to the following proposition: if there is a row $W_i$ with $W_i[A] = b$ in $T'$, then there exists an instance $r$ on $U$ such that $r$ satisfies $\tilde{C}$ but does not satisfy $X \rightsquigarrow A$. We look upon $T'$ as an instance on $U$. Since $T'$ is the result of applying the chase rules, $\tilde{C}$ holds in $T'$. From the construction of $T'$, we know that every row of $T'$ has the symbol 'a' in the $X$-columns. On the supposition of $W_i[A] = b$, we know $W_i[A] \ne W_d[A]$. Hence $T'$ does not satisfy $X \rightsquigarrow A$.

2. We shall show that if every row of $T'$ has the symbol 'a' in the $A$-column, then $X \rightsquigarrow A$ is implied by $\tilde{C}$. The above proposition is equivalent to the following proposition: if there is an instance $r$ on $U$ such that $r$ satisfies $\tilde{C}$ but does not satisfy $X \rightsquigarrow A$, then there is a row $W_i$ with $W_i[A] = b$. We look upon $T'$ as a concrete instance on $U$. Obviously, $T'$ satisfies $\tilde{C}$. Since $T'$ is a classical instance and $T'$ does not satisfy $X \to A$, we have $W_d[X] = W_i[X]$ but $W_d[A] \ne W_i[A]$. From the construction of $T'$, we know $W_d[A] = a$. Therefore $W_i[A] = b$. □

6. An application example

The above theory can be applied to practical problems. Here is an example concerning environmental pollution.

Example 4. Assume that there are nine elements (groundwater (G), potable water (Pw), water (W), dust (D), destructive gas (Dg), air (A), soil (S), plant (P), animal (An)) in every environmental region. The polluted state of an environmental region is described by the pernicious matter in the nine elements. Assume that $r$ is an instance (see Table 3). Let $\alpha = 10$ for all attributes. From Table 3, we have $SP(t_1[D], t_2[D]) = 1/4 - 1/10 = 0.15$; $SP(t_1[Dg], t_2[Dg]) = 1/5 - 1/10 = 0.1$; $SP(t_1[A], t_2[A]) = 1/4.6 - 1/10 = 0.117$. Similarly, we have $\mathrm{MIN}\{SP(t_i[D], t_j[D]), SP(t_i[Dg], t_j[Dg])\} \le SP(t_i[A], t_j[A])$ ($i, j = 1, 2, \ldots, 4$). According to the definition of FFD, $D, Dg \rightsquigarrow A$. From Table 3, we notice that the elements satisfy the following FFDs: $\tilde{F} = \{G, Pw \rightsquigarrow W;\ D, Dg \rightsquigarrow A;\ W, A, S \rightsquigarrow P;\ W, A, P \rightsquigarrow An\}$. According to the classical 3NF-decomposition algorithm (Algorithm 5.4 in [10]), we obtain the decomposition $\tilde{R} = *[(G, Pw, W), (D, Dg, A), (W, A, S, P, An), (G, Pw, D, Dg, S)]$. We notice that $\mathrm{CHASE}_{\tilde{F}}(T_{\tilde{R}})$ contains a row $W_d = (a \ldots a)$. This means that if $r$ satisfies $\tilde{F}$, then $r$ has a lossless decomposition in $\tilde{R}$.

Table 3
$r$

      Pw      G       W           D       Dg      A           S       P       An
t1    [1,3]   [2,4]   [1.3,3.3]   [1,2]   [4,6]   [2.8,4.4]   [2,4]   [3,4]   [4,6]
t2    [2,4]   [3,5]   [2.3,4.3]   [1,5]   [5,9]   [3.4,7.4]   [3,6]   [3,6]   [3,7]
t3    [5,9]   [6,8]   [6.2,8.7]   [4,6]   [3,5]   [3.4,5.4]   [5,7]   [3,5]   [4,7]
t4    [4,8]   [5,6]   [4.3,7.4]   [6,9]   [6,8]   [6,8.4]     [6,8]   [3,8]   [4,8]

7. Conclusion

This paper deals with fuzzy data dependencies in databases. Based on the semantic proximity, we give the definitions of FFD, FMVD and FJD. It has been shown that an FFD can be viewed as an FMVD and an FMVD as an FJD. Furthermore, we discuss the inference rules for FFDs and FMVDs and the implication of fuzzy data dependencies. An integrated system of fuzzy data dependencies has thus been set up.

References

[1] C. Beeri, R. Fagin and J.H. Howard, A complete axiomatization for functional and multivalued dependencies in database relations, ACM SIGMOD Conf. (1977) 47-61.

[2] B.P. Buckles and F.E. Petry, A fuzzy representation of data for relational database, Fuzzy Sets and Systems 7 (1982) 213-226.
[3] X.G. He, Semantic distance and fuzzy user's view in fuzzy database, Chinese J. Comput. 10 (1989) 757-764.
[4] X.G. He, Data models of the fuzzy relational databases, Chinese J. Comput. 2 (1989) 120-126.
[5] W.Y. Liu, The reduction of the fuzzy data domain and fuzzy consistent join, Fuzzy Sets and Systems 50 (1992) 89-96.
[6] D. Maier, The Theory of Relational Databases (Computer Science Press, Rockville, MD, 1983).
[7] K.V.S.V.N. Raju and A.K. Majumdar, Fuzzy functional dependencies and lossless join decomposition of fuzzy relational database systems, ACM Trans. Database Systems 13 (1988) 129-166.
[8] D. Sacca, Closures of database hypergraphs, J. ACM 32 (1985) 774-803.
[9] R.C. Tripathy and P.C. Saxena, Multivalued dependencies in fuzzy relational databases, Fuzzy Sets and Systems 38 (1990) 267-279.
[10] J.D. Ullman, Principles of Database Systems (Computer Science Press, Potomac, MD, 1979).
[11] L.A. Zadeh, Fuzzy sets, Inform. and Control 8 (1965) 338-353.