Fuzzy Information and Construction of Questionnaires


Copyright © IFAC Theory and Application of Digital Control, New Delhi, India 1982

B. Bouchon
CNRS, "Structures de l'Information", Université Paris VI, Tour 45, 4, place Jussieu, 75230 Paris Cedex 05, France

Abstract. We consider questions admitting semantic answers such as "the size is small, the weight is heavy, the temperature is low ...", which enable us to characterize the elements of a given population or the conditions of a studied process. By using the concept of information, we define a fuzzy or crisp partition of this population or of the states of this process, which allows a non-fuzzy decision to be made. A choice between several questions is also possible, by maximizing the information processed by each of them.

Keywords. Decision theory; identification; information retrieval; optimal search techniques; process control.

QUESTIONS AND FUZZY PARTITIONS

Definition of fuzzy partitions. We consider a sequence of questions concerning variables which belong to a set $C = \{C_i, i \in I\}$, $C_i$ taking values in a finite universe of discourse $U_i$. Every question is defined by a family $G_i = \{g_i^j, j \in J(i)\}$ of fuzzy subsets of $U_i$, for $i \in I$, which is supposed to constitute a fuzzy partition of $U_i$. These questions enable us to characterize the elements of a given population $E$. If $\mu_i^j$ is the membership function defining $g_i^j$, we have:

$$\forall u \in U_i, \quad \sum_{j \in J(i)} \mu_i^j(u) = 1, \quad \text{and } \exists j \in J(i) \text{ such that } \mu_i^j(u) > 0.$$

Let $p_i(u)$ denote the probability of an element of $E$ to be characterized by the value $C_i = u$. The information deduced from the question "Is $C_i$ $g_i^1$ or $g_i^2$ or ...?" can be evaluated by the fuzzy information processed by the fuzzy partition $G_i$ (Bouchon, 1980).

We suppose that, for at least one value $\eta_i \in [0,1]$, there exists a crisp partition $G_i^{\eta_i}$, $\eta_i$-associated with $G_i$, for every $i \in I$: its classes are the $\eta_i$-level sets of $g_i^j$ for $j \in J(i)$, whose elements have a value of the membership function not less than $\eta_i$: $\mu_i^j(u) \ge \eta_i$.

We consider the $\eta_i$-probability of every class $g_i^j$ of $G_i$ defined (Bouchon, 1980) by

$$P_{\eta_i}(g_i^j) = \sum_{u \in U_i,\ \mu_i^j(u) \ge \eta_i} \mu_i^j(u)\, p_i(u).$$

It represents the expected value of the grades of membership greater than $\eta_i$ in the definition of $g_i^j$. From the existence of $G_i^{\eta_i}$, we deduce that the index $j \in J(i)$ for which $\mu_i^j(u) \ge \eta_i$ is unique for every $u \in U_i$. If the union of all the classes $g_i^j$, $j \in J(i)$, is again denoted by $G_i$, the above $\eta_i$-probability can be regarded as the average share of the grade of membership of $G_i$ due to $g_i^j$. Then the $\eta_i$-probability of $G_i$ is

$$P_{\eta_i}(G_i) = \sum_{j \in J(i)} P_{\eta_i}(g_i^j),$$

and the information processed by $G_i$ is evaluated by the following quantity, where $L(x)$ is written for $-x \log x$:

$$I(G_i) = \sum_{j \in J(i)} L\big(P_{\eta_i}(g_i^j)\big) \,/\, P_{\eta_i}(G_i).$$

More explicitly, let us consider the simple question q: "Is the size small, medium or large?" concerning a given population. It can be represented by the family of membership functions given in Fig. 1.
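As a minimal numerical sketch of the $\eta_i$-probability and of the information $I(G_i)$ defined above, we may take a question with three classes on a small finite universe. The universe, the uniform distribution and the grades of membership below are hypothetical values chosen only for illustration, as are the function names; they are not taken from Fig. 1.

```python
import math

def eta_probability(mu, p, eta):
    """Sum of mu(u) * p(u) over the values u whose grade mu(u) is at least eta."""
    total = 0.0
    for u, prob in p.items():
        grade = mu.get(u, 0.0)  # grades not listed are taken as 0
        if grade >= eta:
            total += grade * prob
    return total

def L(x):
    """L(x) = -x log x, extended by L(0) = 0."""
    return 0.0 if x <= 0.0 else -x * math.log(x)

def fuzzy_information(classes, p, eta):
    """I(G) = sum_j L(P_eta(g^j)) / P_eta(G), with P_eta(G) = sum_j P_eta(g^j)."""
    probs = [eta_probability(mu, p, eta) for mu in classes.values()]
    return sum(L(x) for x in probs) / sum(probs)

# Hypothetical universe of sizes 0..6 with a uniform distribution.
p = {u: 1 / 7 for u in range(7)}

# Hypothetical grades of membership; for every u they sum to 1 over the classes,
# and exactly one class has a grade of at least 0.5.
classes = {
    "small":  {0: 1.0, 1: 0.8, 2: 0.3},
    "medium": {1: 0.2, 2: 0.7, 3: 1.0, 4: 0.6},
    "large":  {4: 0.4, 5: 1.0, 6: 1.0},
}

info = fuzzy_information(classes, p, eta=0.5)
print(info)
```

With the level 0.5, each value of the universe contributes to exactly one class, which is the uniqueness property deduced above from the existence of the associated crisp partition.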


Fig. 1: Membership functions defining question q (horizontal axis $U_{size}$; vertical axis: grade of membership, with the levels 0.5 and 1 marked).

A crisp partition of $U_{size}$ may be deduced from these functions, by taking $\eta_{size} = 0.5$ if $b_1 \notin \mathbb{N}$ and $b_2 \notin \mathbb{N}$:

$$G_{size}^{0.5} = \{[0, b_1[,\ ]b_1, b_2[,\ ]b_2, b]\},$$

which means that an element of $E$ is said to be small if it corresponds to a size less than $b_1$, medium if its size is between $b_1$ and $b_2$, and large if it is greater than $b_2$. The graph of the membership function defining $G_{size}$ is indicated by dots and dashes in Fig. 1. The 0.5-probability of "small", for example, is defined by

$$P_{0.5}(g^1) = \sum_{u \in [0, b_1[\, \cap\, \mathbb{N}} \mu_{small}(u)\, p_i(u),$$

and the fuzzy information obtained by asking question q equals

$$I(q) = - \sum_{1 \le j \le 3} P_{0.5}(g^j) \log P_{0.5}(g^j) \,/\, P_{0.5}(G_{size}),$$

with

$$P_{0.5}(G_{size}) = \sum_{u \in [0, b]\, \cap\, \mathbb{N}} \mu(u)\, p_i(u),$$

$\mu$ being the membership function defining $G_{size}$.

Let us remark that an ambiguity may occur in the definition of the crisp partition if $\eta_i$ corresponds to the intersection of the graphs representing the membership functions, for the value $u_0 \in U_i$. It would be the case in the previous example if $b_1$ or $b_2 \in \mathbb{N}$. We choose the crisp partition $\eta_i$-associated with $G_i$ which produces the maximum of information $I(G_i)$, by putting $u_0$ in the class of smallest $\eta_i$-probability. For instance, if $b_1 \in \mathbb{N}$ and $b_2 \notin \mathbb{N}$, two crisp partitions are possible: $[0, b_1],\ ]b_1, b_2[,\ ]b_2, b]$ and $[0, b_1[,\ [b_1, b_2[,\ ]b_2, b]$. The obtained fuzzy information is greater if we use the first one in the case where $P_{0.5}([0, b_1[) < P_{0.5}(]b_1, b_2[)$, as can easily be verified.

Refinement of partitions. The refinement of the fuzzy partition $G_i$ describing a question increases the corresponding fuzzy information: we divide the fuzzy subset $g_i^j$ of $U_i$ into two fuzzy subsets ${}^{(1)}g_i^j$ and ${}^{(2)}g_i^j$, defined by membership functions respectively equal to $\mu_i^j$ when $u \in U_i^{(1)}$ and $u \in U_i^{(2)}$, and null elsewhere. Let $G_i'$ denote the new fuzzy partition of $U_i$. Then the $\eta_i$-probabilities of $G_i'$ and $G_i$ are equal and the inequality of Polya (Hardy, Littlewood, Polya 1952) implies:

$$-\Big(\sum_{\mu_i^j(u) \ge \eta_i} \mu_i^j(u)\, p_i(u)\Big) \log \Big(\sum_{\mu_i^j(u) \ge \eta_i} \mu_i^j(u)\, p_i(u)\Big) \le - \sum_{k=1,2} \Big(\sum_{u \in U_i^{(k)},\ \mu_i^j(u) \ge \eta_i} \mu_i^j(u)\, p_i(u)\Big) \log \Big(\sum_{u \in U_i^{(k)},\ \mu_i^j(u) \ge \eta_i} \mu_i^j(u)\, p_i(u)\Big),$$

and, consequently: $I(G_i) \le I(G_i')$.
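Both the tie-breaking rule for the ambiguous value $u_0$ and the effect of a refinement can be checked numerically. The sketch below works directly on hypothetical $\eta_i$-probabilities of the classes (the numbers and the helper names are assumptions made for illustration) and relies only on the definition $I(G_i) = \sum_j L(P_{\eta_i}(g_i^j)) / P_{\eta_i}(G_i)$.

```python
import math

def L(x):
    """L(x) = -x log x, extended by L(0) = 0."""
    return 0.0 if x <= 0.0 else -x * math.log(x)

def information(probs):
    """Sum of L over the class eta-probabilities, divided by their total."""
    return sum(L(x) for x in probs) / sum(probs)

# Tie-breaking: hypothetical eta-probabilities of small, medium and large
# computed without the boundary value u0, and the weight d = mu(u0) * p(u0)
# that u0 brings to whichever class receives it.
small, medium, large, d = 0.20, 0.40, 0.25, 0.05
u0_in_small = information([small + d, medium, large])
u0_in_medium = information([small, medium + d, large])

# Refinement: the medium class is split into two parts with the same total weight.
coarse = [0.20, 0.40, 0.25]
refined = [0.20, 0.25, 0.15, 0.25]
i_coarse = information(coarse)
i_refined = information(refined)

print(u0_in_small, u0_in_medium)
print(i_coarse, i_refined)
```

In this example "small" has the smaller probability, and assigning $u_0$ to it yields the larger information; the refined partition carries at least as much information as the coarse one because $L(a + b) \le L(a) + L(b)$ for non-negative $a$ and $b$.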

In the above example, we may say that the size is tiny if it corresponds to a value u [...]

It is clear that, for a given $\eta_i$-probability of $G_i$, the greatest information corresponds to fuzzy subsets $g_i^j$ such that $\mu_i^j(u) \ge \eta_i$ if and only if $u = u_j \in U_i$, $\forall j \in J(i)$.

SEQUENCE OF QUESTIONS

Information of a pair of questions. It would be satisfying to use a sequence of two questions providing more information than each question taken separately, since the acquaintance with population $E$ increases as questions are asked. We suppose that we first use a question described by $G_i$, $i \in I$, and, after having obtained an answer, we ask a question described by $G_k$, $k \in I$. If the questions are independent, the final results are described by fuzzy classes $g_i^j \times g_k^m$, $j \in J(i)$ and $m \in J(k)$, and we consider that their membership functions are:

$$\mu_{ik}^{jm}(u, v) = \mu_i^j(u) \wedge \mu_k^m(v), \quad u \in U_i,\ v \in U_k,$$

where $\wedge$ denotes the infimum. The probability of an element of $E$ to be characterized by the values $C_i = u$ and $C_k = v$ is obviously $p_i(u)\, p_k(v)$.

Let us denote by $C_i^j$ and $C_k^m$, $j \in J(i)$ and $m \in J(k)$, the classes of the crisp partitions $G_i^{\eta_i}$ and $G_k^{\eta_k}$ of $U_i$ and $U_k$ corresponding to $G_i$ and $G_k$. In the case where a crisp partition $(G_i, G_k)^{\eta}$ of $U_i \times U_k$ can be defined from the classes $C_i^j \times C_k^m$, we evaluate the fuzzy information of the pair of questions represented by $G_i$ and $G_k$ by using the values of the membership functions not less than $\eta$. This condition will in particular be verified if $\eta_i$ equals $\eta_k$ and thus equals $\eta$; for example, this common value can equal 0.5. We measure the $\eta$-probability of $g_i^j \times g_k^m$, $j \in J(i)$, $m \in J(k)$:

$$P_{\eta}(g_i^j \times g_k^m) = \sum_{u \in C_i^j} \sum_{v \in C_k^m} \mu_{ik}^{jm}(u, v)\, p_i(u)\, p_k(v).$$

Then:

$$P_{\eta_i}(G_i) \wedge P_{\eta_k}(G_k) \ge P_{\eta}(G_i, G_k) \ge P_{\eta_i}(G_i)\, P_{\eta_k}(G_k).$$

We get:

$$I(G_i, G_k) = \sum_{j \in J(i)} \sum_{m \in J(k)} \frac{L\big(P_{\eta}(g_i^j \times g_k^m)\big)}{P_{\eta}(G_i, G_k)} \ge \sum_{j \in J(i)} \sum_{m \in J(k)} \frac{L\big(P_{\eta}(g_i^j \times g_k^m)\big)}{P_{\eta_i}(G_i)},$$

and:

$$L\big(P_{\eta}(g_i^j \times g_k^m)\big) \ge -\Big(\sum_{u \in C_i^j} \sum_{v \in C_k^m} \mu_i^j(u)\, \mu_k^m(v)\, p_i(u)\, p_k(v)\Big) \times \log \Big(\sum_{u \in C_i^j} \mu_i^j(u)\, p_i(u) \sum_{v \in C_k^m} p_k(v)\Big)$$

$$\ge -\Big(\sum_{u \in C_i^j} \mu_i^j(u)\, p_i(u)\Big) \log \Big(\sum_{u \in C_i^j} \mu_i^j(u)\, p_i(u)\Big) \times \sum_{v \in C_k^m} \mu_k^m(v)\, p_k(v).$$

The latter quantity is not greater than $L\big(P_{\eta_i}(g_i^j)\big)$. When summing up the previous results, we conclude that $I(G_i, G_k)$ is not less than a quantity at most equal to $I(G_i)$. The only case where we are sure that $I(G_i, G_k)$ is greater than $I(G_i)$ occurs when $\eta_i \wedge \eta_k \ge 1/e$, because we then have $L\big(P_{\eta}(g_i^j \times g_k^m)\big) \ge L\big(P_{\eta_i}(g_i^j)\big)$, as a consequence of the decrease of $L(x)$ for $x \ge 1/e$.

A similar proof would have been obtained concerning $G_k$ instead of $G_i$, and the information $I(G_i, G_k)$ is independent of the order of the questions.

Choice of a partition associated with questions. Let us note that the fuzzy information provides a way to determine a partition associated with a sequence of questions, in the case when there does not exist any crisp partition $(G_i, G_k)^{\eta}$ associated with the sequence of questions described by $G_i$ and $G_k$. If an element $(u, v)$, $u \in U_i$, $v \in U_k$, corresponds to a membership function greater than $\eta$ in two fuzzy classes $g_i^j \times g_k^m$ and $g_i^{j'} \times g_k^{m'}$, we put it in the crisp class of smallest $\eta$-probability, in order to maximize the information of the fuzzy partition, as explained for the element $u_0$ in the first section.

In the example described in Fig. 2, with $U_i = \{1, 2, 3, 4\}$ and $U_k = \{1, 2, 3, 4, 5\}$, two fuzzy classes of $U_i$ and $U_k$ are defined for $\eta_i = \eta_k = 0.5$. They correspond to the crisp partitions $C_i^1 = \{1, 2\}$, $C_i^2 = \{3, 4\}$ of $U_i$, and $C_k^1 = \{1, 2\}$, $C_k^2 = \{3, 4, 5\}$ of $U_k$. It is easy to see that $C_i^j \times C_k^m$ constitute [...]
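The quantities attached to a pair of questions can be sketched on the universes of the example, $U_i = \{1, 2, 3, 4\}$ and $U_k = \{1, 2, 3, 4, 5\}$, with the crisp classes given in the text. The grades of membership and the uniform distributions below are hypothetical (the values plotted in Fig. 2 are not recoverable here); they are only chosen to be compatible with these crisp classes at the level 0.5.

```python
import math

def L(x):
    """L(x) = -x log x, extended by L(0) = 0."""
    return 0.0 if x <= 0.0 else -x * math.log(x)

U_i, U_k = [1, 2, 3, 4], [1, 2, 3, 4, 5]
p_i = {u: 1 / len(U_i) for u in U_i}   # hypothetical uniform distribution
p_k = {v: 1 / len(U_k) for v in U_k}   # hypothetical uniform distribution

# Hypothetical grades of membership of the two classes of each question.
mu_i = {1: {1: 1.0, 2: 0.7, 3: 0.2, 4: 0.0},
        2: {1: 0.0, 2: 0.3, 3: 0.8, 4: 1.0}}
mu_k = {1: {1: 1.0, 2: 0.6, 3: 0.4, 4: 0.0, 5: 0.0},
        2: {1: 0.0, 2: 0.4, 3: 0.6, 4: 1.0, 5: 1.0}}

# Crisp classes at the level 0.5, as given in the text.
C_i = {1: [1, 2], 2: [3, 4]}
C_k = {1: [1, 2], 2: [3, 4, 5]}

def pair_probability(j, m):
    """eta-probability of g_i^j x g_k^m, the infimum being the minimum of the grades."""
    return sum(min(mu_i[j][u], mu_k[m][v]) * p_i[u] * p_k[v]
               for u in C_i[j] for v in C_k[m])

P_pair = {(j, m): pair_probability(j, m) for j in C_i for m in C_k}
P_joint = sum(P_pair.values())
P_Gi = sum(mu_i[j][u] * p_i[u] for j in C_i for u in C_i[j])
P_Gk = sum(mu_k[m][v] * p_k[v] for m in C_k for v in C_k[m])

# Information of the pair of questions.
I_pair = sum(L(x) for x in P_pair.values()) / P_joint

print(P_Gi, P_Gk, P_joint, I_pair)
```

On these values the joint probability lies between the product and the minimum of the two separate $\eta$-probabilities, in agreement with the double inequality stated above.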

Fig. 2: Membership functions defining two questions (axes $U_i$, with values 1 to 4, and $U_k$, with values 1 to 5).

GRANULAR QUESTIONS

We suppose that the answer to the question represented by $G_i$ is granular (Zadeh, 1979):

"$C_i$ is $g_i^1$ is $\lambda_1$, $C_i$ is $g_i^2$ is $\lambda_2$, ..."

where $\lambda_1, \lambda_2, \ldots$ are fuzzy probabilities. Coming back to the first example, we consider the following answer: "the size is small is unlikely, the size is medium is likely, the size is large is very unlikely". If $\varphi_j$ denotes the possibility distribution function of $\lambda_j$, $j \in J(i)$, the possibility of the probability of $g_i^j$ is

$$\Pi(g_i^j) = \varphi_j\big(P_{\eta_i}(g_i^j)\big), \quad j \in J(i).$$

Consequently, obtaining a granular answer for a question concerning $G_i$ provides the following weighted information:

$$J_{\Pi}(G_i) = - \sum_{j \in J(i)} P_{\eta_i}(g_i^j)\, \Pi(g_i^j) \log P_{\eta_i}(g_i^j) \,\Big/\, \Big(\sum_{j \in J(i)} P_{\eta_i}(g_i^j)\Big).$$


It is obvious that $J_{\Pi}(G_i) \le I(G_i)$, which means that a granular answer processes less information than a simple answer.
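The comparison between the weighted information of a granular answer and the information of a simple answer can be illustrated as follows. The possibility distribution functions chosen for "unlikely", "likely" and "very unlikely" are hypothetical (in particular, modelling "very" by squaring is only one common convention), and so are the $\eta_i$-probabilities of the three classes.

```python
import math

# Hypothetical possibility distribution functions of the fuzzy probabilities.
def unlikely(x):
    return max(0.0, 1.0 - 2.0 * x)

def likely(x):
    return min(1.0, 2.0 * x)

def very_unlikely(x):
    return unlikely(x) ** 2  # assumption: "very" modelled by squaring

# Hypothetical eta-probabilities of the classes of the question on the size.
probs = {"small": 0.25, "medium": 0.40, "large": 0.20}
phi = {"small": unlikely, "medium": likely, "large": very_unlikely}

total = sum(probs.values())

# Information of a simple answer.
I = -sum(x * math.log(x) for x in probs.values()) / total

# Weighted information of the granular answer: each term is multiplied
# by the possibility of the probability of the corresponding class.
J = -sum(x * phi[name](x) * math.log(x) for name, x in probs.items()) / total

print(I, J)
```

Since every possibility lies between 0 and 1 and every term $-x \log x$ is non-negative for $x \in\ ]0, 1]$, the weighted sum cannot exceed the unweighted one.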

Nevertheless, we may use this quantity, as previously, to choose the crisp partition associated with the different possible answers when there exists an ambiguity. This situation happens for instance in the first example, when $b_1 \in \mathbb{N}$.

More generally, we consider the case where $\{g_i^j, j \in J(i)\}$ does not represent a fuzzy partition of $U_i$, an element $u$ of $U_i$ belonging to both of the fuzzy classes $g_i^j$ and $g_i^m$ with grades of membership at least equal to $\eta_i$. We need to define a true fuzzy partition for $G_i$ in order to evaluate the fuzzy information that it processes. Let $U_i^j$ (resp. $U_i^m$) denote the subset of $U_i - \{u\}$ corresponding to a grade of membership at least equal to $\eta_i$ in $g_i^j$ (resp. $g_i^m$). As $P_{\eta_i}(g_i^j)$ depends on the level $\eta_i$ and on the crisp subset of $U_i$ represented by the $\eta_i$-level set of $g_i^j$, we may associate $u$ with either $U_i^j$ or $U_i^m$, the choice being made by calculating the information in each case. We evaluate the probabilities

$$P^{(1)}(g_i^{\ell}) = \sum_{v \in U_i^{\ell}} \mu_i^{\ell}(v)\, p_i(v) \quad \text{and} \quad P^{(2)}(g_i^{\ell}) = P^{(1)}(g_i^{\ell}) + \mu_i^{\ell}(u)\, p_i(u), \quad \ell = j \text{ or } m,$$

corresponding to two values of $P_{\eta_i}(G_i)$:

$$P^{(\alpha)} = \sum_{k \in J(i),\ k \ne j, m} P_{\eta_i}(g_i^k) + P^{(\alpha)}(g_i^j) + P^{(\beta)}(g_i^m),$$

for $\alpha = 1$ or 2, $\beta = 1$ or 2, $\alpha \ne \beta$.

Consequently, we compare the following informations:

$$J(\alpha, \beta) = -\frac{1}{P^{(\alpha)}} \Big[ \sum_{k \in J(i),\ k \ne j, m} P_{\eta_i}(g_i^k)\, \varphi_k\big(P_{\eta_i}(g_i^k)\big) \log P_{\eta_i}(g_i^k) + P^{(\alpha)}(g_i^j)\, \varphi_j\big(P^{(\alpha)}(g_i^j)\big) \log P^{(\alpha)}(g_i^j) + P^{(\beta)}(g_i^m)\, \varphi_m\big(P^{(\beta)}(g_i^m)\big) \log P^{(\beta)}(g_i^m) \Big].$$

We associate $u$ with $U_i^j$ if $J(2,1) \ge J(1,2)$, and with $U_i^m$ if $J(1,2) > J(2,1)$.

CONCLUSION

If we consider a series of questions as the representation of a decision process or a sequence of tests resolving an identification problem, the fuzzy information may have two distinct utilizations.

The first one is the choice of a question among several questions giving results which concern the same variable. We choose the question processing the greatest quantity of fuzzy information, corresponding to the maximum of "accuracy" in the obtained indications.

The second utilization is the distribution of the studied states of a phenomenon, or of the members of a population, into crisp classes defined by means of a fuzzy partition associated with the results of a given question. We propose to use the crisp partition deduced from this fuzzy partition which provides the maximum of fuzzy information.

Other utilizations of fuzzy information may be considered, concerning decision processes or identification problems, which will be studied in a forthcoming paper.

REFERENCES

Bezdek, J.C., and Harris, J.D. (1978). Fuzzy partitions and relations; an axiomatic basis for clustering. Fuzzy Sets and Systems, 1, 111-127.
Bouchon, B. (1980 a). Information contenue dans un système d'événements flous. Proc. Table Ronde C.N.R.S. sur le Flou, Lyon (to appear).
Bouchon, B. (1980 b). Information transmitted by a system of fuzzy events. Proc. Int. Congr. on Applied Systems Research and Cybernetics, Acapulco (to appear).
Hardy, G., Littlewood, J., and Polya, G. (1952). Inequalities. Cambridge University Press.
Okuda, T., Tanaka, H., and Asai, K. (1979). A formulation of fuzzy decision problems with fuzzy information using probability measures of fuzzy events. Inf. & Control, 38, 135-147.
Zadeh, L.A. (1979). Fuzzy sets and information granularity. In M.M. Gupta, R.K. Ragade, R.R. Yager (Eds.), Advances in Fuzzy Set Theory and Applications, North-Holland.