Experiments in evidence composition in a Speech Understanding System



Int. J. Man-Machine Studies (1983) 19, 19-31

LORENZA SAITTA

Istituto di Scienza dell'Informazione, Universita' di Torino, Italy

A method for composing partial evidences in pattern recognition problems is presented and experimental results, referring to speech understanding, are also discussed. The method is well suited for real-time problems, where speed and parallelism in taking decisions are fundamental requirements. The case study presented in the paper is a simple one, for the sake of clarity, but a generalization to complex production systems can be easily obtained.

1. Introduction

Automatic Speech Recognition and Understanding offer a good opportunity for designing complex parallel processing systems, consistent with human perception models like those of Massaro (1980), Massaro & Oden (1978), Marslen-Wilson (1975) and Klatt (1979). Central to the organization of a Speech Understanding System (SUS) are representation of knowledge, structured on several levels of abstraction, and control strategies which use the knowledge efficiently. The speech signal is, in general, interpreted with errors and ambiguities; nevertheless, there is often enough redundancy in it that an unambiguous global interpretation of the uttered sentence can be obtained.

A search methodology that is powerful and efficient enough to deal with the complexity of the speech understanding task is based on the so-called Hypothesize-and-Test paradigm (Reddy, 1976). This methodology consists of generating and evaluating alternative hypotheses, at a given level of data abstraction, and then building up from these hypotheses more complex ones, at a higher level of abstraction. Several Knowledge Sources (KSs) co-operate in this hypothesization process by supplying information about a particular aspect of the recognition task (acoustic, phonetic, lexical knowledge, etc.) (Erman, Hayes-Roth, Lesser & Reddy, 1980; Lesser & Corkill, 1981).

In the system presented in this paper (De Mori, 1982; De Mori, Laface & Piccolo, 1976; De Mori & Laface, 1980) each KS consists of a set of context-dependent, task- and speaker-independent rules, which can be, to some extent, automatically inferred (De Mori, 1977; De Mori & Saitta, 1980; De Mori, Giordana & Laface, 1982a). These rules can be expressed as sets of Production Rules (Van Melle, 1979; Weiss & Kulikowski, 1979; Michalski, 1980). Moreover, semantic rules are associated with the syntactic ones for evaluating numerically the degree of worthiness of the symbols occurring in the rules and corresponding to partial hypotheses about the content of speech intervals. These semantic rules are based on Zadeh's Theory of Possibility (Zadeh, 1978; De Mori, Laface & Saitta, 1982b). An alternative formalization of the knowledge contained in the KSs can be obtained by means of Attributed Grammars (You & Fu, 1979).

0020-7373/83/070019 + 13 $03.00/0

© 1983 Academic Press Inc. (London) Limited


The interaction among the various KSs thus results in a complex set of competing and co-operating processes, which require a very large amount of computational power. For this reason a Distributed Processing approach seems to be the most suitable (Chandrasekaran, 1981; Fox, 1981; Smith & Davis, 1981; Giordana, Laface & Saitta, 1980, 1982; De Mori et al., 1982a; Lesser & Erman, 1979; Giordana & Saitta, 1981). However, there is still a lack of adequate formal models and of extensive experimental work for conceiving and implementing systems that perform this kind of processing.

Moreover, in real world data analysis, the strategies of process invocation depend strongly on the way partial hypotheses are combined in order to establish priorities among hypothesis-growing processes. The method of assigning priorities must take into account the peculiarities of the problem at hand, especially Uncertainty, both in the data and in the knowledge, and Complexity, both in computation and in information handling. Furthermore, the requirements of concurrent computation, minimum computation time and non-deterministic behaviour also impose strong constraints on the hypothesis evaluation techniques.

In this paper a set of experiments in partial hypotheses evaluation and combination is presented, using a method that meets the preceding requirements. In order to clarify the motivations of the method, the computation scheme which has been adopted is briefly described in the following section.

2. Overview of the computation scheme

Given an unknown input sentence, the computation to be performed on it has the goal of finding a global interpretation of the speech signal consistent with all the constraints imposed by the KSs present in the system (Lexicon, Syntax, Semantics, etc.). Each KS communicates with the others in some way, but each one uses, in general, different procedures and data structures, depending on its specific domain of expertise. As mentioned in section 1, the knowledge of each KS can be represented by means of a set of Production Rules, which reflect the conceptual hierarchy of the KSs; in fact, each rule at a given level of abstraction contains atomic terms (conditions and actions) which can be, in turn, expanded in terms of rules of lower level of abstraction (Fox, 1981; Kanal, 1979; Stockman, 1977; Nilsson, 1971). Each rule is context-independent and semantic rules are supplied to handle context dependency.

In order to model the behaviour of the KSs and their interactions, a computation scheme, considering each KS as a Society of (micro-)Experts, has been adopted. In this model an Expert is associated with each atomic term occurring in the rules and it is provided both with declarative and procedural knowledge. All the Experts are connected in a loosely coupled, hierarchically organized interconnection network, in which the control is completely decentralized, because each Expert has its own control power. A scheme of the interconnection network can be seen in Fig. 1. Each Expert communicates directly only with the Experts connected to it in the network. The communication is realized by means of two flows of information, one Top-Down (model-driven) and the other Bottom-Up (data-driven). In the top-down processing the Experts at a given level activate other Experts at lower level, whereas the contrary happens in the bottom-up processing.

FIG. 1. Example of hierarchical interconnection network among Experts. Each Expert knows the relationships existing among those Experts (at lower level) which are connected to it. Each Expert is supplied with procedural knowledge, allowing it to take decisions without the supervision of a centralized global control.

During the activations contexts must be handled, because speech interpretation is a strongly context-dependent task. As an example, the rules for detecting a given consonant must take into account the vocalic context in which the consonant occurs; furthermore, the presence of a given word in a sentence may depend upon the presence of adjacent words. Hypothesized contexts are then transmitted (as inherited attributes) from higher level to lower level Experts during top-down activations; on the contrary, already verified contexts are sent (as synthesized attributes) from lower level to higher level Experts during bottom-up activations. The contexts are described by means of Context Descriptors. Each Expert uses context information in order to select a few alternative hypotheses among all those allowed by the syntactic context-independent rules known by it. Multiple instantiations of the same Expert with different contexts may exist concurrently and they are considered different hypothesization processes.

Due to the great number of alternative hypotheses which would be generated, a Focus-of-Attention mechanism must be provided: this mechanism is realized by letting the top-down and bottom-up strategies effectively interact with each other, according to the scheme described in the following. At the beginning, the most reliable intervals of the input sentence are used to stimulate, bottom-up, some first level Experts. These Experts require, in general, some conditions to be verified in order to generate a consistent hypothesis; then, they send to lower level Experts requests of verification, starting thus also the top-down processing. When both kinds of processing occur, an Expert could be frequently activated both top-down and bottom-up, because of the asynchronism of the two strategies. It is then necessary to provide each Expert with tools for co-ordinating the two activities, in order to avoid wasteful duplication of instantiation processes. In Giordana & Saitta (1981) an implementation model satisfying this requirement was presented.

The stimulation mechanism, coupled with that of context handling, proved to be a very powerful tool for focusing the attention of the system: in fact, only a small subset of the possible Experts is actually fired and the top-down strategy is mainly used to suggest hypotheses about the interpretation of speech intervals showing low bottom-up evidence.

The described processes also offer the possibility of using Non-Deterministic constructs (Giordana et al., 1980, 1982): if an Expert activates a given number of lower level Experts, it does not need to wait for all the responses, but can take its decisions on the basis of only the first-arrived ones, provided the information carried by them is considered to be sufficient. If a better response arrives after the choice has been made, it can be sent to the higher levels as a stimulus, thus recovering the error due to the non-deterministic behaviour. Such a recovery possibility allows one to use less-sophisticated (and hence less-complex) decision algorithms. Experimental results (Giordana et al., 1982) indicate a time reduction of about 30% for reaching the same final solution, by allowing the described non-determinism.
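The non-deterministic decision with later recovery can be pictured with a small sketch. It is only an illustration under stated assumptions: the function names, the `sufficient` criterion and the sample evidences are hypothetical and do not come from the system described in the paper.

```python
from typing import List, Optional, Tuple

# A response from a lower level Expert: (hypothesis label, evidence).
# The list order stands for arrival order (an assumption of this sketch).
Response = Tuple[str, float]


def decide_early(responses: List[Response],
                 sufficient: float = 0.8) -> Optional[Tuple[int, Response]]:
    """Return (arrival index, response) of the first response judged sufficient.

    The fixed threshold `sufficient` is a hypothetical stand-in for whatever
    criterion an Expert uses to consider the received information sufficient.
    """
    for index, response in enumerate(responses):
        if response[1] >= sufficient:
            return index, response
    return None


def late_stimuli(responses: List[Response], chosen_index: int) -> List[Response]:
    """Responses arriving after the choice that are better than it.

    In the scheme described above these would be sent to the higher levels
    as stimuli, so that the early decision can be corrected.
    """
    chosen_evidence = responses[chosen_index][1]
    return [r for r in responses[chosen_index + 1:] if r[1] > chosen_evidence]


if __name__ == "__main__":
    arrivals = [("/m/", 0.40), ("/n/", 0.85), ("/l/", 0.92)]
    choice = decide_early(arrivals)
    if choice is not None:
        index, response = choice
        print("decided on", response, "after", index + 1, "responses")
        print("later stimuli:", late_stimuli(arrivals, index))
```

The point of the sketch is only that the decision and the recovery are separate steps, which is what lets the Expert stop waiting early.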

3. Evidence composition

In order to allow the evaluation of hypotheses and the computation of priorities according to the scheme described in section 2, a mechanism for measuring and combining evidences of partial interpretations has to be supplied (Shafer, 1976; Duda, Hart & Nilsson, 1976; Ishizuka, Fu & Yao, 1981; Barnett, 1981). This mechanism must allow concurrency in evaluating alternative solution paths and, moreover, it must let the right hypothesis quickly emerge among all the generated ones. The method presented in this paper has been tested in complex cases (Giordana et al., 1982); nevertheless, a simple application will be described here for the sake of clarity. The case considered corresponds to the hierarchical phoneme classification (for the Italian language) represented in Fig. 2. The activation of a non-terminal node in the

FIG. 2. Phonetic feature tree used in the recognition of consonants. The symbols in the tree have the following meaning: CONS--Consonant; SON--Sonorant; NONSON--Nonsonorant; LIQ--Liquid; NAS--Nasal; AFFR--Affricate; CONT--Continuant; INT--Interrupted; AL--Affricate Lax; AT--Affricate Tense; IL--Interrupted Lax; IT--Interrupted Tense; CL--Continuant Lax; CT--Continuant Tense.

tree corresponds to the right-hand side of a production rule, whose left-hand side contains, as condition, the presence of the father node in the data under analysis. The activation of node $A_k$ corresponds to the process of measuring a feature vector $s$ in the input data (in a given context); on the basis of a set of fuzzy linguistic variables describing $s$, the node $A_k$ generates hypotheses about the presence of its sons in the unknown sentence and evaluates their possibility values according to Zadeh (1978). The possibility value of a node $A_j$ resulting from the preceding evaluation process will be indicated as $\mu_j$ in the following and will be referred to as the local evidence of $A_j$ in the input sentence.

As the details of the relationships between phonetic features and acoustic cues have been published elsewhere (De Mori et al., 1976; De Mori, Laface & Saitta, 1982b; De Mori & Laface, 1980), they will not be recalled here. Rather, the problem of evidence composition will be discussed. In fact, in the problem at hand, what is important is not only the values $\mu_j$, but an evaluation of the worthiness of $A_j$ based on the whole path ranging from the root of the tree to $A_j$. To this end, let us indicate a generic node of Fig. 2 as $A_j$ ($1 \le j \le M$) and consider the following quantities:

$\mu_j$: local evidence of $A_j$;

$\sigma_{j,j-1}$: probability that $\mu_j$ is higher than a given threshold (for example, 0.5) when $A_j$ is false (and hence $\bar{A}_j$ is true) in the case that $A_j$ has been activated by the father $A_{j-1}$;

$\tau_{j,j-1}$: probability that $\mu_j$ is lower than the threshold when $A_j$ is true, in the case that $A_j$ has been activated by the father $A_{j-1}$.

The probabilities $\sigma_{j,j-1}$ and $\tau_{j,j-1}$ can be expressed as

$$\sigma_{j,j-1} = \Pr\{\mu_j \ge 0.5 \mid \bar{A}_j[A_{j-1}]\} \qquad (1 \le j \le M) \tag{1}$$

$$\tau_{j,j-1} = \Pr\{\mu_j < 0.5 \mid A_j[A_{j-1}]\} \qquad (1 \le j \le M). \tag{2}$$
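The two error probabilities of equations (1) and (2) can be estimated by simple counting over labelled activations of a node. The following is a minimal sketch of such an estimate, assuming a list of (local evidence, ground truth) pairs collected when $A_j$ was activated by its father; the function name and the sample data are hypothetical and the paper does not describe its own estimation procedure.

```python
from typing import Iterable, Tuple


def estimate_error_rates(samples: Iterable[Tuple[float, bool]],
                         threshold: float = 0.5) -> Tuple[float, float]:
    """Estimate (sigma, tau) for one node from labelled activations.

    samples: pairs (mu_j, a_j_is_true) observed when the node was activated
             by its father.
    sigma:   relative frequency of mu_j >= threshold among the false cases
             (false alarm), as in equation (1).
    tau:     relative frequency of mu_j < threshold among the true cases
             (missed hypothesis), as in equation (2).
    """
    false_total = false_high = 0
    true_total = true_low = 0
    for mu, is_true in samples:
        if is_true:
            true_total += 1
            if mu < threshold:
                true_low += 1
        else:
            false_total += 1
            if mu >= threshold:
                false_high += 1
    # With no cases of a given kind the estimate is undefined; 0.0 is returned
    # here purely as a convention of this sketch.
    sigma = false_high / false_total if false_total else 0.0
    tau = true_low / true_total if true_total else 0.0
    return sigma, tau


if __name__ == "__main__":
    observed = [(0.9, True), (0.4, True), (0.8, True),
                (0.7, False), (0.2, False), (0.1, False), (0.3, False)]
    sigma, tau = estimate_error_rates(observed)
    print(f"sigma = {sigma:.2f}, tau = {tau:.2f}")
```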

For every node, the following a priori probabilities can be defined:

$$q_{j,j-1} = \Pr\{A_j[A_{j-1}]\} = \Pr\{A_j \mid A_{j-1}\}\,\Pr\{A_{j-1}[A_{j-2}]\} \quad (2 \le j \le M), \qquad q_{1,0} = \Pr\{A_1\}. \tag{3}$$


Equation (3) represents the a priori probability of the node $A_j$, conditioned by the whole path $\gamma$. Let, moreover,

$$p_{j,j-1} = \Pr\{A_j \mid A_{j-1}\} \quad (2 \le j \le M), \qquad p_{1,0} = \Pr\{A_1\}. \tag{4}$$

Expression (3) now becomes

$$q_{j,j-1} = p_{j,j-1}\, q_{j-1,j-2} \quad (2 \le j \le M), \qquad q_{1,0} = p_{1,0}. \tag{5}$$

From equation (3) one gets

$$\Pr\{\bar{A}_j[A_{j-1}]\} = 1 - \Pr\{A_j[A_{j-1}]\} = 1 - q_{j,j-1} \quad (2 \le j \le M), \qquad \Pr\{\bar{A}_1\} = 1 - q_{1,0}, \tag{6}$$

and from equations (1) and (2) the following joint probabilities can be obtained:

$$U_{j,j-1} = \Pr\{\mu_j \ge 0.5,\ \bar{A}_j[A_{j-1}]\} = (1 - q_{j,j-1})\,\sigma_{j,j-1} \qquad (1 \le j \le M) \tag{7}$$

$$V_{j,j-1} = \Pr\{\mu_j < 0.5,\ A_j[A_{j-1}]\} = q_{j,j-1}\,\tau_{j,j-1} \qquad (1 \le j \le M). \tag{8}$$

$U_{j,j-1}$ and $V_{j,j-1}$ represent global error probabilities when $A_j$ is false or true, respectively. They take into account two types of information.

The accuracy of the recognition system: in fact $\sigma_{j,j-1}$ and $\tau_{j,j-1}$ express, respectively, the probability of false alarm and of missing the right hypothesis.

The a priori probabilities $p_{j,j-1}$ of the paths in the tree.

Let $\Pi(A_m)$ be the global evidence of $A_m$, given the path $\gamma$ from the root to $A_m$. In order to account for the a priori information, a global evaluation of the path can be defined as in the following. Let

$$R(\gamma) = \frac{-\sum_{j=1}^{m} w_j\,(b - a\mu_j)\,\ln \mu_j^{*}}{\sum_{j=1}^{m} w_j}, \tag{9}$$

where

$$w_j = \frac{1}{\beta + \mathrm{Max}\,(U_{j,j-1},\ V_{j,j-1})} \tag{10}$$

and

$$\mu_j^{*} = \begin{cases} \alpha & \text{if } \mu_j = 0, \\ \mu_j & \text{otherwise.} \end{cases} \tag{11}$$

Let, moreover,

$$\Pi(A_m) = \begin{cases} 0 & \text{if } R(\gamma) = -b \ln \alpha, \\ e^{-R(\gamma)} & \text{otherwise.} \end{cases} \tag{12}$$
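A direct evaluation of the path measure can be sketched as follows. The sketch assumes that equation (12) reads $\Pi = e^{-R(\gamma)}$, with $\Pi = 0$ in the degenerate case, which is the reading adopted here for a formula that is hard to recover from the printed text; the function names and the sample path are hypothetical, and the parameter values are those quoted in section 4.

```python
import math
from typing import Sequence

# Parameter values quoted in section 4 of the paper.
A, B = 4.0, 5.0
ALPHA, BETA = 1e-10, 0.05


def node_weight(u: float, v: float, beta: float = BETA) -> float:
    """Weight of a node from its two global error probabilities, equation (10)."""
    return 1.0 / (beta + max(u, v))


def path_uncertainty(mus: Sequence[float], weights: Sequence[float],
                     a: float = A, b: float = B, alpha: float = ALPHA) -> float:
    """Weighted average R of -(b - a*mu) * ln(mu*) along a path, equation (9)."""
    numerator = 0.0
    for mu, w in zip(mus, weights):
        mu_star = alpha if mu == 0.0 else mu          # equation (11)
        numerator += w * (-(b - a * mu) * math.log(mu_star))
    return numerator / sum(weights)


def global_evidence(mus: Sequence[float], weights: Sequence[float],
                    a: float = A, b: float = B, alpha: float = ALPHA) -> float:
    """Global evidence of the last node of the path (assumed reading of eq. 12)."""
    # R equals -b*ln(alpha) when every local evidence on the path is zero;
    # that case is tested directly instead of comparing floating-point values.
    if all(mu == 0.0 for mu in mus):
        return 0.0
    return math.exp(-path_uncertainty(mus, weights, a, b, alpha))


if __name__ == "__main__":
    # Hypothetical local evidences and weights for a three-node path.
    mus = [1.0, 0.85, 0.90]
    weights = [node_weight(0.0, 0.0), node_weight(0.006, 0.038), node_weight(0.019, 0.001)]
    print(f"R = {path_uncertainty(mus, weights):.4f}")
    print(f"Pi = {global_evidence(mus, weights):.4f}")
```

Because a node with zero local evidence contributes a term of order $-b \ln \alpha$, a single such node drives the global evidence of the whole path practically to zero under this reading.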


$R(\gamma)$ is related to the measure of the average content of uncertainty (De Luca & Termini, 1972) along the path $\gamma$, weighted by a priori knowledge, also taking into account other sources of ambiguity not contained in the evaluation of the $\mu_j$. We notice that the use of Shannon's function $S(\mu_j)$ is not well suited here, because symmetry with respect to $\mu_j = 1/2$ must be eliminated: in fact, values of $\mu_j$ near 0 or 1, respectively, must produce different effects on the recognition process. Moreover, the use of a function $F(x) = -(b - ax) \ln x$ instead of the simpler one $-x \ln x$ (which can be obtained by setting $b = 0$ and $a = -1$) is motivated by the need for more freedom in evidence combination. In fact, the philosophy underlying equations (9) and (12) requires that the combination of two local evidences $\mu_1$ and $\mu_2$, both greater than a given threshold, must produce a global evidence greater than $\mathrm{Max}\,(\mu_1, \mu_2)$. A theoretical analysis of equations (9)-(12) showed that this result can be achieved by controlling the parameter $(b - 1)/a$. In this way, the most promising paths quickly differentiate themselves from the wrong ones. Moreover, the parameters $\alpha$ and $\beta$ have been introduced in equations (11) and (10), respectively, in order to keep the logarithm and $w_j$ finite when $\mu_j$ or, respectively, $U_{j,j-1}$ and $V_{j,j-1}$ are zero.

By defining

$$S_k = \sum_{i=1}^{k} w_i \tag{13}$$

equation (9) can be rewritten in a recursive form, useful for computational purposes:

$$S_1 = w_1, \qquad S_k = S_{k-1} + w_k \quad (k \ge 2) \tag{14}$$

and

$$R_1 = -(b - a\mu_1) \ln \mu_1^{*}, \qquad R_k = \frac{S_{k-1} R_{k-1} - w_k\,(b - a\mu_k) \ln \mu_k^{*}}{S_k} \quad (k \ge 2). \tag{15}$$
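The recursive form lends itself to an incremental implementation in which each Expert only needs the pair $(S_{k-1}, R_{k-1})$ received from its father. The class below is a minimal sketch of that idea; the class and method names and the sample values are hypothetical, and the parameters default to the values quoted in section 4.

```python
import math


class PathEvidence:
    """Running state (S_k, R_k) of one path, updated as in equations (13)-(15)."""

    def __init__(self, a: float = 4.0, b: float = 5.0, alpha: float = 1e-10):
        self.a, self.b, self.alpha = a, b, alpha
        self.s = 0.0  # S_0: no node composed yet
        self.r = 0.0  # value is irrelevant while S is 0

    def extend(self, mu: float, w: float) -> float:
        """Compose one more node with local evidence mu and weight w; return R_k."""
        mu_star = self.alpha if mu == 0.0 else mu
        term = -(self.b - self.a * mu) * math.log(mu_star)
        s_new = self.s + w                              # equation (14)
        self.r = (self.s * self.r + w * term) / s_new   # equation (15)
        self.s = s_new
        return self.r


def direct_r(mus, weights, a=4.0, b=5.0, alpha=1e-10):
    """Non-recursive evaluation of equation (9), used only as a cross-check."""
    total = sum(w * (-(b - a * mu) * math.log(alpha if mu == 0.0 else mu))
                for mu, w in zip(mus, weights))
    return total / sum(weights)


if __name__ == "__main__":
    # Hypothetical local evidences and weights along a five-node path.
    mus = [1.0, 0.85, 0.85, 0.90, 0.65]
    weights = [20.0, 11.4, 20.0, 14.5, 20.0]

    forward = PathEvidence()
    for mu, w in zip(mus, weights):
        forward.extend(mu, w)

    backward = PathEvidence()
    for mu, w in zip(reversed(mus), reversed(weights)):
        backward.extend(mu, w)

    print(math.isclose(forward.r, direct_r(mus, weights)))  # recursion agrees with (9)
    print(math.isclose(forward.r, backward.r))              # composition order is irrelevant
```

Since $R_k$ is a weighted mean, the update needs constant time and memory per node, and the final value does not depend on the order in which the nodes are composed, up to floating-point rounding.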

The main characteristics of the proposed method for composing evidences can be summarized as follows.

The value of the global evidence of a path does not depend on the order in which the partial evidences are considered. This is important when the condition of a left-hand side of a production rule contains more than one term; in this case a graph of reasoning chains must be considered rather than a tree. Moreover, this fact allows application of the method to AND/OR graphs, where AND and OR nodes are all at the same levels and can be searched concurrently, without worrying about the time they take to be evaluated.

The composition of evidences along a path is largely independent of the length of the path itself. This makes the evaluations along paths of different lengths comparable. This feature is very important for the described computation scheme; in fact, decisions are taken by each Expert only on the basis of the information it receives from its neighbours, without knowing the state of computation along other paths.

The evidence composition along a path does not depend on the composition of evidences along other paths; this feature makes this method well suited for parallel search algorithms. In fact, to maximize parallelism, the evaluation of a path must give, in some sense, an absolute measure of evidence rather than a relative one (like, for instance, in Bayes classification). This aspect, coupled with that of independence of the order in composition, is also fundamental for allowing the non-deterministic behaviour mentioned in section 2; the possible errors may be recovered by the stimulation mechanism briefly described in the same section.

This composition method does not constrain the sum of evidences over a set of alternative and exclusive hypotheses to be equal to one (as many methods assume); in fact, several competing hypotheses are allowed to have high evidence and are analysed in parallel or, on the contrary, all the possible alternative hypotheses existing at a given level may have very low evidences, indicating thus that, possibly, an error occurred earlier in the computation. In this way the control strategies have at their disposal very detailed information on the state of matching between data and interpretations, in order to assign priorities to competing processes. As a matter of fact, the described situation corresponds to common perceptual phenomena; for instance, it is possible that both the nasal consonants /m/ and /n/ receive high evidences (say 0.9), but it is not possible to distinguish between them on the basis of the features considered. Moreover, the search can be quickly focused on a small set of alternatives, rapidly emerging among the remaining ones.

The evaluation of a path accounts both for evidence of the data and for a priori knowledge. This fact allows one to consider separately all the sources of uncertainty and, moreover, it makes possible the extension of this evaluation method to complex systems of weighted production rules. In this case an algorithm is also supplied to infer the weights of the rules automatically (Lesmo, Saitta & Torasso, 1982).

The method gives good results even with a crude estimation of probabilities, which can be obtained from a small set of experiments.

The proposed method is computationally inexpensive and suitable for real-time applications.

4. Example of application

Let us consider the phoneme classification tree of Fig. 2. Both theoretical and experimental results suggested the following values for the parameters:

$$a = 4, \qquad b = 5, \qquad \alpha = 10^{-10}, \qquad \beta = 0.05.$$

TABLE 1
Example of parameter values for some nodes occurring in the tree of Fig. 2

Node A_j    σ_{j,j-1}   τ_{j,j-1}   p_{j,j-1}   q_{j,j-1}   U_{j,j-1}   V_{j,j-1}   w_j
SON         0.10        0.01        0.39        0.24        0.076       0.002        7.9
NONSON      0.01        0.10        0.61        0.38        0.006       0.038       11.4
INT         0.00        0.00        0.50        0.19        0.000       0.000       20.0
IL          0.02        0.02        0.38        0.07        0.019       0.001       14.5
/g/         0.00        0.00        0.05        0.01        0.000       0.000       20.0
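As a check on how the last three columns of Table 1 follow from the others, the sketch below recomputes $U_{j,j-1}$, $V_{j,j-1}$ and $w_j$ from the tabulated $\sigma_{j,j-1}$, $\tau_{j,j-1}$ and $q_{j,j-1}$ using equations (7), (8) and (10) with $\beta = 0.05$. The dictionary layout and function name are hypothetical. Because the tabulated inputs are rounded to two decimals, the recomputed values agree with the table only to within rounding (the weight of IL, for instance, differs in the last printed digit).

```python
BETA = 0.05

# (sigma, tau, q, tabulated w) for each node, copied from Table 1.
TABLE_1 = {
    "SON":    (0.10, 0.01, 0.24, 7.9),
    "NONSON": (0.01, 0.10, 0.38, 11.4),
    "INT":    (0.00, 0.00, 0.19, 20.0),
    "IL":     (0.02, 0.02, 0.07, 14.5),
    "/g/":    (0.00, 0.00, 0.01, 20.0),
}


def derived_columns(sigma: float, tau: float, q: float, beta: float = BETA):
    """Return (U, V, w) from equations (7), (8) and (10)."""
    u = (1.0 - q) * sigma          # false alarm weighted by Pr{A_j false}
    v = q * tau                    # miss weighted by Pr{A_j true}
    w = 1.0 / (beta + max(u, v))   # large weight for reliable nodes
    return u, v, w


if __name__ == "__main__":
    for node, (sigma, tau, q, w_table) in TABLE_1.items():
        u, v, w = derived_columns(sigma, tau, q)
        print(f"{node:7s} U={u:.3f}  V={v:.3f}  w={w:5.2f}  (table: {w_table})")
```

Nodes whose error probabilities are both zero receive the maximum weight $1/\beta = 20$, which is the role of $\beta$ described in section 3.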

As an example, let us consider the recognition of the consonants /l/ and /g/ in the vocalic contexts /ela/ and /ega/. The consonants are extracted from continuous speech. Table 1 contains, as an example, the values of the parameters for some nodes in the tree. Figures 3 and 4 contain, respectively, the nodes of the classification tree which have been activated in the two cases. Finally, Table 2 contains the global evidences obtained in the two cases for the resulting competing hypotheses: in Table 2 the results obtained

FIG. 3. Competing hypotheses fired during the recognition process of the consonant /g/ in the context /ega/, extracted from continuous speech. (Each node in the figure is labelled with its local evidence $\mu$ and its global evidence $\Pi$.)

FIG. 4. Competing hypotheses fired during the recognition process of the consonant /l/ in the context /ela/, extracted from continuous speech. (Global evidences of the terminal nodes shown in the figure: /r/ 0.000, /l/ 0.917, /gl/ 0.000, /m/ 0.070, /n/ 0.254, /gn/ 0.000.)

TABLE 2
Results of the recognition of the consonants /g/ and /l/ by means of the proposed method for combining evidences. The results obtained by taking the minimum and the product of the evidences along a given path are also given

Context   Competing     Global      Minimum value over    Product of path's
          hypotheses    evidence    path's evidences      evidences
/ega/     /r/           0.723       0.680                 0.578
          /g/           0.735       0.650                 0.428
/ela/     /l/           0.917       0.800                 0.760
          /m/           0.070       0.070                 0.047
          /n/           0.254       0.230                 0.153


by taking the minimum value or the product of the evidences along a path are also reported.

As a node will not be activated when its global evidence is below a given threshold, a large number of experiments have been performed in order to analyse the behaviour of the system with respect to this threshold. A set of 830 consonants was used in the experiments; they were extracted from continuous speech, uttered by four male speakers and one female speaker, and their exact classification was the following.

Liquid        /r/, /l/, /gl/                    230
Nasal         /m/, /n/, /gn/                     90
Affricate     /dz/, /dʒ/, /tʃ/, /tz/             50
Continuant    /v/, /z/, /f/, /s/, /ʃ/           140
Interrupted   /p/, /t/, /k/, /b/, /d/, /g/      320

By setting the threshold equal to zero, all the nodes in the tree are always activated. This case allows computation of the lowest error rate. By considering that an error occurs when the right hypothesis obtains a total evidence less than or equal to that of a wrong one, the error rates obtained in different steps of the classification are given in Table 3.

TABLE 3
Error rates obtained in different steps of phoneme classification in continuous speech (five speakers)

Feature                                  Error rate (%)
Sonorant/Nonsonorant                      4.2
Continuant/Interrupted/Affricate          8.0
Interrupted Tense/Interrupted Lax         4.5
Nasals (/m/, /n/)                         6.8
Interrupted Tense (/p/, /t/, /k/)        13.2
Interrupted Lax (/b/, /d/, /g/)          11.0
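The error criterion stated above (the right hypothesis must obtain strictly more evidence than every wrong one) can be written down in a few lines. As an illustration it is applied here to the /ega/ figures transcribed from Table 2; the function name and the data layout are hypothetical.

```python
from typing import Dict


def is_error(scores: Dict[str, float], correct: str) -> bool:
    """True when the right hypothesis scores less than or equal to a wrong one."""
    best_wrong = max(value for label, value in scores.items() if label != correct)
    return scores[correct] <= best_wrong


# Evidences of the two competing hypotheses for the consonant /g/ in /ega/,
# as listed in Table 2 for the three composition rules.
TABLE_2_EGA = {
    "global evidence": {"/r/": 0.723, "/g/": 0.735},
    "minimum":         {"/r/": 0.680, "/g/": 0.650},
    "product":         {"/r/": 0.578, "/g/": 0.428},
}

if __name__ == "__main__":
    for rule, scores in TABLE_2_EGA.items():
        outcome = "error" if is_error(scores, "/g/") else "correct"
        print(f"{rule:16s} -> {outcome}")
```

On these figures only the proposed composition ranks /g/ above its competitor; both the minimum and the product of the path evidences would count as errors under the criterion.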

Let us now define, for each terminal node /y/, the length $\lambda(/y/)$ as the number of non-terminal nodes along the path from the root $A_1$ to /y/. The length $\lambda(/y/)$ is the minimum number of nodes which must be activated for generating the hypothesis /y/. Let $\phi(/y/)$ be the number of nodes which have actually been activated during a successful recognition of /y/. Table 4 summarizes the experimental results; the values of $\phi(/y/)/\lambda(/y/)$ are averaged over all the cases in which a correct recognition has been obtained.
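The ratio $\phi(/y/)/\lambda(/y/)$ can be made concrete with a small sketch. The parent links below are an assumption read off the caption of Fig. 2 and the groupings in Table 3 (the exact tree shape, and the tense/lax assignment of the affricate and continuant phonemes, are not recoverable from the text, so those leaves are omitted); the function names are hypothetical.

```python
# Child -> parent links of the phonetic feature tree (assumed shape).
PARENT = {
    "SON": "CONS", "NONSON": "CONS",
    "LIQ": "SON", "NAS": "SON",
    "AFFR": "NONSON", "CONT": "NONSON", "INT": "NONSON",
    "AL": "AFFR", "AT": "AFFR",
    "CL": "CONT", "CT": "CONT",
    "IL": "INT", "IT": "INT",
    # Terminal nodes whose father is clear from the text.
    "/l/": "LIQ", "/r/": "LIQ", "/gl/": "LIQ",
    "/m/": "NAS", "/n/": "NAS", "/gn/": "NAS",
    "/b/": "IL", "/d/": "IL", "/g/": "IL",
    "/p/": "IT", "/t/": "IT", "/k/": "IT",
}


def path_length(terminal: str) -> int:
    """lambda(/y/): number of non-terminal nodes from the root down to /y/."""
    count = 0
    node = terminal
    while node in PARENT:      # climb towards the root, counting each ancestor
        node = PARENT[node]
        count += 1
    return count


def activation_ratio(activated_nodes: int, terminal: str) -> float:
    """phi(/y/) / lambda(/y/) for one successful recognition."""
    return activated_nodes / path_length(terminal)


if __name__ == "__main__":
    print(path_length("/g/"))          # CONS, NONSON, INT, IL
    print(activation_ratio(6, "/g/"))  # six nodes fired for a four-node path
```

A ratio of 1 means that only the nodes on the correct path were activated; larger values measure how much extra search the chosen threshold allowed.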

5. Conclusions

A method for evaluating the global evidence of a pattern on the basis of the evidences of sub-patterns is presented. The method is well suited for real-time systems, where speed and parallelism in taking decisions are fundamental requirements. The method

TABLE 4
Experimental results of consonant recognition obtained by varying the threshold that the evidence of a node must exceed in order for the node to be activated. The value of $\phi(/y/)/\lambda(/y/)$ has been averaged over all the cases in which a right recognition occurred

Threshold   Number of times the right   Recognition error   Average value of
            hypothesis was missed (%)   rate (%)            φ(/y/)/λ(/y/)
0.000       0.000                       12.2                3.950
0.100       0.000                       12.2                2.782
0.200       0.000                       12.2                2.278
0.300       0.000                       12.2                2.278
0.400       0.041                       13.7                2.150
0.500       0.071                       14.7                1.969
0.600       0.127                       17.8                1.861
0.700       0.240                       27.6                1.500
0.750       0.352                       37.3                1.467
0.850       0.523                       53.0                1.292
0.900       0.949                       95.2                1.270

can also be easily generalized to complex systems of production rules, supplying a powerful tool for assigning priorities to concurrent production activations, even in the presence of incomplete knowledge. The need to introduce this new method for composing evidences into our Speech Understanding System was suggested by the fact that existing methods proved unsatisfactory from the performance (error rate) point of view and/or from the computational complexity point of view.

The speech system is implemented in Fortran and PASCAL on a VAX 11/780 and facilities for simulating concurrent computations are also supplied. Excluding the time required for Fast Fourier Transformation and formant tracking, the time required by the presented algorithm for generating the phonemic interpretation of a given speech interval can be quite different from case to case, according to the kind of uttered phoneme and to the quality of the speech; on average, the phonemic labelling takes about 3 ms per 100 ms of speech, with Fortran procedures. Examples of recognition rates in other speech systems may be found in Erman & Smith (1981), Kohda & Nakatsu (1978), Woods et al. (1976), Jelinek (1976), Reddy (1976), Davis & Mermelstein (1982) and Brown & Rabiner (1982).

The good experimental results obtained suggest further analysis of the theoretical background underlying equations (9)-(12) and clarification of the possible relationships between the present formulation and others in the literature.

The author is indebted to Professors R. De Mori, A. Giordana and P. Laface for their criticism and useful suggestions.

References

BARNETT, J. A. (1981). Computational methods for a mathematical theory of evidence. Proceedings of 7th International Joint Conference on Artificial Intelligence, Vancouver, pp. 868-875.
BROWN, M. K. & RABINER, L. R. (1982). An adaptive, ordered, graph search technique for dynamic time warping for isolated word recognition. IEEE Transactions on Acoustics, Speech and Signal Processing, ASSP-30, 535.
CHANDRASEKARAN, B. (1981). Natural and social system metaphors for distributed problem solving: introduction to the issue. IEEE Transactions on Systems, Man and Cybernetics, SMC-11, 1.
DAVIS, S. B. & MERMELSTEIN, P. (1982). Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. IEEE Transactions on Acoustics, Speech and Signal Processing (to appear).
DE LUCA, A. & TERMINI, S. (1972). A definition of the non-probabilistic entropy in the setting of fuzzy sets. Information and Control, 20, 301.
DE MORI, R. (1972). Syntactic recognition of speech patterns. In FU, K. S., Ed., Syntactic Pattern Recognition--Applications. Berlin: Springer-Verlag.
DE MORI, R. (1982). Computer Model of Speech using Fuzzy Algorithms. New York: Plenum Press.
DE MORI, R. & LAFACE, P. (1980). Use of fuzzy algorithms for phonetic and phonemic labelling of continuous speech. IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-2, 136.
DE MORI, R. & SAITTA, L. (1980). Automatic learning of fuzzy naming relations over finite languages. Information Science, 21, 93.
DE MORI, R., LAFACE, P. & PICCOLO, E. (1976). Automatic detection and description of syllabic features in continuous speech. IEEE Transactions on Acoustics, Speech and Signal Processing, ASSP-24, 365.
DE MORI, R., GIORDANA, A. & LAFACE, P. (1982a). Parallel algorithms for interpreting speech patterns. In PRESTON, K. & UHR, L., Eds, Multicomputers and Image Processing, pp. 193-205. New York: Academic Press.
DE MORI, R., LAFACE, P. & SAITTA, L. (1982b). Use of fuzzy rules in a speech understanding system. In SANCHEZ, E. & GUPTA, M., Eds, Fuzzy Information and Decision Processes, pp. 177-189. Amsterdam: North-Holland.
DIETTERICH, T. & MICHALSKI, R. S. (1981). Inductive learning of structural descriptions: evaluation criteria and comparative review of selected methods. Artificial Intelligence, 16, 257.
DUDA, R. O., HART, P. E. & NILSSON, N. J. (1976). Subjective Bayesian methods for rule based inference system. SRI Technical Report 124, Menlo Park, California.
ERMAN, L. D. & SMITH, A. R. (1981). NOAH--a bottom-up word hypothesizer for large vocabulary speech understanding systems. IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-3, 41.
ERMAN, L. D., HAYES-ROTH, F., LESSER, V. R. & REDDY, D. R. (1980). The HEARSAY II speech understanding system. Integrating knowledge to resolve uncertainty. Computing Surveys, 12, 213.
FOX, M. (1981). An organizational view of distributed systems. IEEE Transactions on Systems, Man and Cybernetics, SMC-11, 70.
GIORDANA, A. & SAITTA, L. (1981). A distributed knowledge representation for real world data analysis. Proceedings of IEEE International Conference on Cybernetics and Society, Atlanta, Georgia, pp. 133-138.
GIORDANA, A., LAFACE, P. & SAITTA, L. (1980). Modelling control strategies for Artificial Intelligence applications. Proceedings of International Conference on Parallel Processing, Boyne Highlands, Michigan, pp. 347-349.
GIORDANA, A., LAFACE, P. & SAITTA, L. (1982). PUZZLE--a system oriented to real world data analysis. Report ISI-23, Universita' di Torino, Italy.
ISHIZUKA, M., FU, K. S. & YAO, J. T. (1981). Inexact inference for rule based damage assessment of existing structures. Proceedings of 7th International Joint Conference on Artificial Intelligence, Vancouver, pp. 837-842.
JELINEK, J. (1976). Continuous speech recognition by statistical methods. Proceedings IEEE, 64, 532.
KANAL, L. N. (1979). Problem-solving models and search strategies for pattern recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-1, 193.
KLATT, D. H. (1979). Speech perception: a model of acoustic phonetic analysis and lexical access. Journal of Phonetics, 7, 279.
KOHDA, M. & NAKATSU, R. (1978). An acoustic processor in a conversational speech system. Review of the Electrical Communication Laboratories (NTT), 26, 1486.
LESMO, L., SAITTA, L. & TORASSO, R. (1982). Learning of fuzzy production rules for medical diagnosis. In SANCHEZ, E. & GUPTA, M., Eds, Fuzzy Information and Decision Processes, pp. 249-260. Amsterdam: North-Holland.
LESSER, V. & CORKILL, D. (1981). Functionally accurate, cooperative distributed systems. IEEE Transactions on Systems, Man and Cybernetics, SMC-11, 81.
LESSER, V. & ERMAN, L. (1979). An experiment in distributed interpretation. Report CMU-CS-79-120.
MARSLEN-WILSON, W. D. (1975). Sentence perception as an interactive parallel processing. Science, 189, 487.
MASSARO, D. W. (1980). Letter and Word Perception. New York: Elsevier/North-Holland.
MASSARO, D. W. & ODEN, G. C. (1978). Integration of featural information in speech perception. Psychological Review, 85, 172.
MICHALSKI, R. S. (1980). Pattern recognition as rule guided inductive inference. IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-2, 349.
NILSSON, N. J. (1971). Problem Solving Methods in Artificial Intelligence. New York: McGraw-Hill.
REDDY, D. R. (1976). Speech recognition by machine: a review. Proceedings IEEE, 64, 501.
SHAFER, G. (1976). A Mathematical Theory of Evidence. Princeton, New Jersey: Princeton University Press.
SMITH, R. & DAVIS, R. (1981). Framework for cooperation in distributed problem-solving. IEEE Transactions on Systems, Man and Cybernetics, SMC-11, 61.
STOCKMAN, G. C. (1977). A problem reduction approach to the linguistic analysis of waveforms. Ph.D. Thesis, University of Maryland.
VAN MELLE, W. (1979). A domain independent production rule system for consultation program. Proceedings of 6th International Joint Conference on Artificial Intelligence, Tokyo, Japan, pp. 923-925.
WEISS, S. M. & KULIKOWSKI, C. A. (1979). EXPERT: a system for developing consultation models. Proceedings 6th International Joint Conference on Artificial Intelligence, Tokyo, Japan, pp. 942-947.
WOODS, W. A., BATES, M., BROWN, G., BRUCE, B., COOK, R., KLOVSTAD, J., MAKHOUL, J., NASH-WEBBER, B., SCHWARTZ, R., WOLF, J. & ZUE, V. (1976). Speech understanding systems. Final Technical Report, vols I-V. Report No. 3438, Bolt Beranek and Newman Co., Cambridge, Massachusetts.
YOU, K. C. & FU, K. S. (1979). A syntactic approach to shape recognition using attributed grammars. IEEE Transactions on Systems, Man and Cybernetics, SMC-9, 334.
ZADEH, L. A. (1978). Fuzzy sets as a basis for a theory of possibility. Fuzzy Sets and Systems, 1, 3.
