Computers and Electronics in Agriculture, 2 (1988) 277-300 Elsevier Science Publishers B.V., Amsterdam - - Printed in The Netherlands
A Rule-based Inference System for Animal Production Management
N. WAIN, C.D.F. MILLER and R.H. DAVIS
Department of Computer Science, Heriot-Watt University, 79 Grassmarket, Edinburgh EH1 2HJ (Great Britain)
(Accepted 30 October 1987)
ABSTRACT
Wain, N., Miller, C.D.F. and Davis, R.H., 1988. A rule-based inference system for animal production management. Comput. Electron. Agric., 2: 277-300.
In Scotland, sheep production constitutes a major proportion of agricultural output, and a substantial number of producers lose money as a result of poor management practices. Government-run advisory services offer help to combat ignorance and inefficiency, but the necessary expertise is scarce. This paper describes a system which accepts raw data of relevance to the domain of animal production, together with production rules in the form of arithmetic expressions acquired from an expert. It derives information which is immediately useful in formulating constructive advice, and which is readily available to many users. The design is achieved through a series of logically progressing stages. The first of these is to construct a working prototype of the system's interface, the second to structure the knowledge of the domain, the third to devise a simple inference mechanism, the fourth to add an information-gathering module and the final stage to cloak the system with a user-friendly interface.
INTRODUCTION
The production of lamb meat for the domestic market is subject to many factors which govern its viability as a farm enterprise, and the variation in profitability found between producers is substantial. Due to the seasonality of the breeding cycle of the sheep, producers are continually fighting to balance the feed requirements of their stock with availability of forage resources on the farm, and an alarming proportion of them fail to achieve this, consequently running at an unnecessary loss. The government-run agricultural advisory services operate at a regional level throughout the United Kingdom, and offer help to the farmer in coordinating and realising the potential of his assets. In Scotland, sheep production constitutes a major proportion of agricultural output, and therefore, it is of
0168-1699/88/$03.50
© 1988 Elsevier Science Publishers B.V.
particular importance that management advice is of a consistent and high quality, to combat the inefficiency of the poorer producers.
The aim of this project was to formalise the expertise of a sheep production specialist and encapsulate it in the form of a general-purpose rule-based 'expert system', implemented as a computer program, for eventual use as an advisory aid. The result was a system which accepts raw data concerning the nature of a sheep enterprise, and produces information in a form that is immediately useful as a foundation for taking important management decisions.
Breeding sheep
The domestic ewe sheep reproduces annually, mating with the ram in the early autumn. After a gestation period of about 5 months, she gives birth to her progeny in the early spring, and feeds them on her own milk until they can consume forage a few weeks later at about 2 months of age. The growing lambs are then left to feed on grass until they have gained sufficient bodyweight, or body 'condition', to be sold for slaughter. For the majority of animals, the period when this is achieved falls in the summer months and, as a consequence, the supply of meat is most abundant at this time: the market price of lamb responds to this by dropping as the consumer demand is fulfilled.
The producer may strive to mate his ewes earlier in the autumn, so as to be ahead of the majority of his competitors, and therefore capitalise on the higher pre-season lamb prices. However, there is a limit to how early in the year he can mate his animals due to the genetic characteristics of the sheep which have evolved with time. The ewe comes into season only in the autumn, in order that after the winter months have passed and new grass begins to grow in spring, the ewe will have an adequate supply of forage to sustain milk production for her young, until the lambs are weaned and move on to feeding from grass themselves.
Government subsidies
In October 1980, the EEC introduced a price support system on the sale of sheep meat, to assist European trade and thereby ensure a common price throughout the community. Initially, it enforced a seasonal imbalance in favour of winter meat production, to attempt to stem the glut in supply during the summer months. In 1984/5, the guide prices were revised, and the discrepancy between support for lamb production during the winter months, as opposed to early summer and autumn, was widened. The seasonal variation in lamb meat prices is illustrated in Fig. 1 (Dickson et al., 1984). Now, as well as satisfying the consumer demand, the lowland producer who has surplus resources is encouraged by the new scale to buy and finish immature lambs from the hill and upland sector, to complement his
existing lamb-breeding system. In this way, the U.K. sheep industry benefits due to an extended lamb finishing season, rather than from the production of more lambs to be slaughtered during the summer months.
Fig. 1. U.K. sheep market support: seasonal scale guide prices, 1984/85 (vertical axis: pence/kg, 190 to 260; horizontal axis: months, April to March).
Scottish hill farmer
The Scottish hill farmer, whose livelihood depends greatly upon sheep production, is at a marked disadvantage. The harsh winter conditions of the North prevent him from gaining an early start to the breeding season, as lamb fatality due to the cold weather and absence of adequate forage for the ewe both become a considerable limitation. In fact, by the time that the annual grass crop has ceased growing in usable quantities, a large proportion of lambs have failed to reach a saleable body condition. According to Speedy (1980), 40% of lowland flock lambs, 60% of upland lambs and 95% of hill lambs require further finishing in the subsequent autumn or winter months. These animals are referred to as store-lambs, since they have to be 'stored' and brought up to the correct body weight and finish after the normal growing season is over. If a 'breeder' cannot finish the lambs from his breeding flock, he can sell them to a 'feeder' who will take on that task and finish them himself over the winter.
Frequently, it is the task of the district agricultural advisory unit to help producers assess the merit of over-wintering their lamb stock, although where the sheep enterprise is merely one component of a varied farming mix, it cannot be viewed in isolation. The strategy must be correct for the system as a whole, and often one enterprise must be allowed to perform at less than its optimum potential, in order that another can be exploited for a greater return. Management skill is required to balance all the components to the greatest overall financial gain, and therefore, the decision to over-winter store lambs may not be as simple as it might initially appear.
Management considerations
Finishing of weaning lambs is an enterprise which shows very great variation in profitability. A survey conducted in 1979, over farms in the South East of Scotland, showed that the average gross margin per hectare was £161. However, the top third of farmers achieved a gross margin of £450/ha, the middle third made £96/ha, but the bottom third lost £58/ha (Lloyd, 1983b). Results such as these suggest that there is considerable scope for improvement in management, especially in the bottom third of producers, bearing in mind the potential returns that it is possible to achieve. Unfortunately, success depends to a large extent on the buying and marketing skills of the farmer, and most of the higher trading margins in the best units appear to be due to a higher price per kg obtained for the lambs, which in turn depends on successful price forecasting.
The producer (or breeder) must ensure that the lambs which are retained on the farm are suitable for the supply of feed he has available, and that the potential value of the finished store-lamb will exceed the returns to be gained from immediate sale. For the finisher (or feeder), the price paid for the store-lamb must take into account its growth potential, the costs of feeding and the time taken to finish it to a saleable condition. To all intents and purposes, the producer's considerations are identical to those of the feeder at the point when he is deciding to sell or store his lambs.
Finished store lambs are criticised by the meat trader for inappropriate levels of finish, poor conformation and carcass yield, and experimental trials have clearly shown that breed, liveweight and condition score together are good prior indicators of subsequent performance. The quality of feed used also has direct influence, though, in practice, the overall structure of the farm business will dictate the type of crops which are available as feed for finishing lambs.
Therefore, although this also is a factor of considerable importance to the success of the enterprise, it is not one over which the farmer has any degree of flexibility. Apart from the buying and selling price considerations, the key to success is in selecting the right lambs to suit the existing system. The most important points for success in store lamb finishing can be summarised as follows:
- buying price related to expected returns
- high forage crop yield
- high stocking rate
- good lamb performance
- effective health control
- minimum costs.
DESIGN
Expert system
The emergence of expert systems in the field of computer artificial intelligence has led to wide, if naive, interest in many areas of manufacture and production (Forsyth, 1984), but often the potential beneficiaries are unaware of the types of application to which the expert system technique is best suited. The head of the advisory unit, at the Edinburgh School of Agriculture, was approached and presented with the opportunity of having a small pilot expert system implemented on the college's DEC VAX computer.
It was agreed that there was a requirement to formalise the reasoning processes that the expert manager employed when deciding whether he should opt to finish store-lambs, either as bought-in stock, or as residual lambs from his own breeding flock. By documenting this reasoning structure in some way, it was felt that a system could then be devised that would allow the processes to be understood and adopted by others who lacked the same level of expertise. It was clear that this would lead to a more consistent quality of store-finishing management within the industry, if it could be applied in a regular and routine manner, either directly by the farm manager or indirectly through his regional advisory unit. It was decided that the most appropriate area to study would be the overwintering of store-lambs.
Knowledge acquisition
As mentioned above, there are a handful of key considerations which influence the decision-making process, and it was agreed that the model rule-base for an expert system should be kept as simple as possible. Therefore, in a preliminary interview with the subject (Gammack and Young, 1985), only a central core of information was discussed (Lloyd, 1983a), which pertained to those factors affecting the growth performance of lambs and the feed quality of forage crops. The goal of the system is to answer the question: "Can I feed my lambs up to marketable body condition on the forage crops I have at my disposal?" and in doing so, the user must provide information about his circumstances, such as:
Breed of sheep. Different breeds grow at different rates and eat different volumes of forage. They also have varying 'finished' weights: a light breed lamb will have eaten less, and will weigh less when finished, than a heavy breed lamb, and consequently, will fetch a lower price (Cassie, 1982).
Crop yield. Different crops yield different quantities of feed matter per area grown. The comparative measure is the yield of dry matter per unit area, which is directly related to the nutritional value of the crop for the lamb consuming it. In addition, the time of year that crops become available as forage varies considerably. Grass is traditionally fed to lambs early in the autumn months, followed by rape around mid-October, or swedes to take them through the winter period.
Stocking rate. The quantity of feed available has to be equated with the requirements of the lambs, and the stocking rate, i.e. the number of lambs per hectare of forage (or more fundamentally, per tonne of dry matter), brings the two together. If forage is abundant, then more lambs can be brought in and fed up to saleable body condition, and, alternatively, if the stocking rate has to be restricted and the forage is consequently underutilized, a calculable proportion of it can then be released for use in other ways.
The user will also want to be able to invert the line of questioning. For example, assuming that his stock will finish, then
- What breed of lambs should he buy in?
- What starting weight should they be?
- What is the minimum level of daily forage DM he must provide?
- How many lambs can he keep on the land available?
After the subject domain had been discussed over several interviews, clarifying and refining ideas, it was desirable to see the human expert actually work through a real-life problem. Surprisingly, her method followed a rigid application of tables of data and formulae, to yield mathematical solutions. Her own judgement as an expert was only apparent in two phases of the process. Firstly, the accuracy of the figures she was using as input to the formulae was inherently unstable, and as an example, she explained that the liveweight gain of the Suffolk cross lamb, recorded as 110 g per day, could vary, in her experience, from 80 to 220 g per day. She was obviously using personal judgement to adjust the tabulated data to fit the prevailing local conditions of the case study. Secondly, having derived the information about lamb stocking-rates, breed, days to finish, etc., she was then in a position to view the enterprise in relation to the other concerns on the farm, and pinpoint any shortcomings from an overall stance.
It was clear that the arithmetical aspect of the process was a fundamental base, albeit primitive, on which to found the expert reasoning, and that this was a step which was usually ignored by the average farmer. Subsequently, a more detailed study of the calculations involved showed that one formula, and various inversions of it, was at the heart of the process. It is expressed as follows:
$$\text{Stocking Rate (lambs/ha)} = \frac{\text{Crop Dry Matter Yield (DM kg/ha)}}{\text{Ave Lamb Liveweight (kg)} \times \text{Intake Utilisation Factor} \times \text{Ave Grazing Days per Lamb}}$$
Crop dry matter (DM) yields and average lamb liveweights are those at the start of the grazing period on any crop. The intake utilisation factor assumes a DM requirement of 3.5% of lamb body weight and a 70% crop utilisation. In reality, forage yields are notoriously variable, and it is essential to calculate stocking densities on the basis of crop DM yield and lamb weight. The traditional finishing systems in Northern Britain characteristically involve a succession of forage crops, and lambs receive approximately equal crop DM allowances per lamb grazing day.
To summarize, the range of crops and animal breeds used is relatively wide (there are 50 purebred and over 300 crossbred types of sheep in the U.K.), and since current scientific models are crude, this is justification for allowing the expert to override any built-in database and use alternative values. However, where information of a local relevance is sparse, there is also a need to access national standard figures quickly. In addition, scientific information is being gathered continuously, and therefore, easy updating of the database is essential, all of which emphasises a need for inherent flexibility within the program structure.
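The stocking-rate formula and two of its inversions can be sketched as follows. This is an illustration in Python rather than the Prolog of the original system, and it rests on one assumption not stated explicitly in the text: that the intake utilisation factor is the daily DM requirement (3.5% of liveweight) divided by the crop utilisation (70%), i.e. about 0.05. The function names are hypothetical.

```python
# Assumption: intake utilisation factor = DM requirement / crop utilisation.
DM_REQUIREMENT = 0.035    # kg DM eaten per kg liveweight per day (3.5%)
CROP_UTILISATION = 0.70   # fraction of the crop actually consumed (70%)
INTAKE_UTILISATION_FACTOR = DM_REQUIREMENT / CROP_UTILISATION


def stocking_rate(dm_yield, liveweight, grazing_days):
    """Lambs/ha from crop yield (kg DM/ha), liveweight (kg) and grazing days."""
    return dm_yield / (liveweight * INTAKE_UTILISATION_FACTOR * grazing_days)


def grazing_days_supported(dm_yield, liveweight, lambs_per_ha):
    """Inversion: grazing days a crop can support at a given stocking rate."""
    return dm_yield / (liveweight * INTAKE_UTILISATION_FACTOR * lambs_per_ha)


def dm_yield_required(lambs_per_ha, liveweight, grazing_days):
    """Inversion: crop yield (kg DM/ha) needed to carry the given lambs."""
    return lambs_per_ha * liveweight * INTAKE_UTILISATION_FACTOR * grazing_days


# Illustrative inputs only; these figures are not taken from the paper.
example_rate = stocking_rate(dm_yield=5000, liveweight=30, grazing_days=100)
print(round(example_rate, 1))
```

Because all three functions share the same factor, an expert who wishes to override the 3.5% or 70% assumptions for local conditions needs to change only the two constants.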
Prototype
Prototyping for information systems development is a relatively new procedure (Alavi, 1984), and, simply defined, a prototype is an early version that exhibits all the essential features of the later operational system. The traditional 'life-cycle' approach normally adopted by systems designers follows a 'once-through' path from analysis and specification to installation and, almost invariably, maintenance. In contrast, the prototyping approach aims at getting a working model of the system, albeit in perhaps a very crude form, running as soon as possible so that the end user can inspect and criticise it. Having done so, the designer uses this feedback to enhance and, if necessary, even conceptually alter the prototype with the minimum of delay, and at relatively little cost, to produce a second version. This is again presented to the user for further examination, and the cycle of improvements and modifications continues until both parties are confident that they have a common understanding of the proposed system's objectives. A schematic comparison of the prototype and life-cycle approaches is shown in Fig. 2.
For a prototype to be most effective, it is important to appreciate the advantages it has over the life-cycle approach, and whether these advantages can be applied to the current situation.
Fig. 2. Comparison of prototype and 'life-cycle' development systems. (Prototype: identify initial user requirements; develop a prototype; use and evaluate the prototype; revise the prototype. Life-cycle: systems analysis; requirements specification; system design; system development; installation and review; maintenance.)
Users too often have a preconceived notion of what they want an information system to do and assume the designers have the same idea. As users often have a difficult time visualising what results an information system will produce, designers need to develop ways of communicating and demonstrating such results before actually implementing the information system. Prototyping seems to be effective in coping with undecided users and clarifying 'fuzzy' requirements. In addition, prototyping represents a common reference point for both users and designers to exchange ideas about potential problems and opportunities early in the development process. As well as developing better communication and rapport between designers and users, it helps ensure that the nucleus of a system is right, and performs as expected, before the expenditure of resources for development of the entire system.
The prototype also provides the user with a tangible means of comprehending and evaluating the proposed system, and the user can then reciprocate with more meaningful feedback in terms of his needs and requirements. Often the user is well capable of criticising an existing system, even though not too good at specifying or anticipating his needs.
It is important to emphasise that the development of the prototype must be
managed and controlled by planning the effort in advance, in terms of cost, resources and time, and deadlines for modifying and experimenting with the prototype should be set. In summary, the prototyping approach facilitates fast response to user needs, allows clarification of user requirements, and offers an opportunity for experimentation. Although there are pitfalls and shortcomings, none seem troublesome enough to outweigh the potential benefits.
Implementation
It was decided that building a prototype would be an essential beginning to the project, because, from the interviews that had been conducted to acquire the base of mathematical relationships, it was clear that the reasoning processes which the expert adopted were still sketchy. It was important to establish that the information thought to be required of the program was indeed what the expert needed, in order that she could then proceed to make useful advisory decisions.
The prototype was written in the Prolog language (Clocksin and Mellish, 1981; Warren et al., 1985), which proved to be the most convenient to alter, in what was effectively a trial and error design approach. The most important consideration was that the program should present an interface similar to that of the final system, so that the user could immediately see what information was going to be produced, and what data would be required for it to do so. The resulting program, which was menu-driven throughout, consisted of two separate sections. The first displayed a list of all the questions which the user might want to have answered, and the second provided the user with the means to alter or add to the database of facts which were used in the calculations.
A prototype which is designed only to give an impression of the appearance of the final system need not contain data in any usable form. Results can be calculated by hand and then written into the prototype so that each time a question is asked of it, the same answer is given, regardless of changing circumstances. This dispenses with the need to develop the application itself, and programming time is appreciably reduced. However, for this project, it was decided that the prototype must be seen to perform in an intelligent manner, and demonstrate its ability to ask the user for missing information when appropriate, and, in addition, to remember that information from one question to the next.
Therefore, the prototype was given a partial base of knowledge and a simple set of functions, with each one specifically designed to answer but a single question from the menu. Although it was capable of asking the user for unknown data, which it put into a common pool for continual reference, no attempt was made to design the inference mechanism in a generalised way. The agricultural context was wholly integrated into the logic of the program, and this meant that it was completely inflexible: it could not be extended to answer any questions, other than those in the menu, without having the appropriate Prolog code explicitly added. However, since this was merely the prototype, it was of no consequence for the purposes of conveying the 'feel' of the final system to the user.
Feedback from the user
When the prototype was demonstrated to the agricultural expert, the reaction to its content was most encouraging. It appeared that the general format of the prototype was much as she had expected, and the results that it produced were generally useful to her. However, she pointed out that one or two questions that it offered to answer were of no practical value, since in real-life situations, either they would not arise, or their outcome could be predicted automatically. In addition, she proposed extra questions which should be added to the list, and which had not arisen in the discussions prior to the prototype's implementation. Apart from obvious criticism of the poor facilities for maintaining the database of knowledge which was used for the calculations, the only other fault raised was in the somewhat ambiguous wording of a couple of the questions displayed.
All comments were noted down while the session was in progress, and afterwards a study was made of how best to implement the required changes into the prototype. On consideration, the problems that had been encountered were either too time-consuming or too trivial to rectify. So it was decided that although the prototype had proved its worth, it would be inappropriate to spend further time in correcting its deficiencies, and that work should be started immediately on implementing the final system described in the next section.
IMPLEMENTATION
Data, data sets and equations
Initially, the core of parameters and their relationships with one another was studied, to discover in general terms the extent of the mathematical functions involved.
In addition, it was clear that there were varying types of data which had to be worked with, and it was for this reason that Prolog was selected as the language in which to write not only the prototype but also the final system. Because of the symbolic way it represents its data, there is no type checking involved until it actually has occasion to use the values. The rest of the time, they are merely passed around in variables and lists, or substituted into equations, without the need to evaluate them (lazy evaluation).
From the study, it emerged that there were four types of data which needed to be represented. The first of these was the 'fact', which merely associated a legitimate value with a parameter name. Numerous facts could exist for a parameter, as is demonstrated by, for example, 'sheep breed' which has two possible values associated with it as facts: 'suffolk cross' and 'scottish black face'. Independent parameters such as 'breed' and 'crop' and 'desired carcass composition (fat class)' stood as the base from which all other datatypes were built or derived.
The second type was similar in as much as it was a fact stating the value of a parameter, but, in addition, it was dependent for this value upon the states of other independent parameters, and as such, had to encapsulate their respective values as well. An example is the 'finish weight' of an animal, i.e. the weight at which it is deemed marketable. The finish weight varies from breed to breed: the value for the suffolk cross is greater than that for the scottish black face, because the latter is a smaller breed. Combined with this is the dependency of finish weight upon fat class, which can take any one of a range of six values, according to muscle/fat ratio. Therefore, typical values might be:
finish wt = 44 kg while breed = suffolk cross and fat class = 3L
finish wt = 38 kg while breed = sc. black face and fat class = 2
There arose a need to represent sets of values and manipulate them in various ways, so the third type was a list structure, whose elements constituted a subset of the existing values for some specified parameter. One particular example came to light in the representation of the range of crops which a producer might have at his disposal to feed lambs with. The parameter 'crop' may take any one of the values 'grass', 'rape', 'swedes', 'turnips', etc., but the subset 'crop sequence' is a list of those crops which the producer actually intends to use, e.g. [grass,rape], or [grass,swedes].
Certain parameters were represented with explicit values, and others with conditional values, but some could only be defined in terms of relationships between, or as functions of, other parameters.
Therefore, to complete the small set of types allowed, the 'equation' was defined. A parameter of this type is associated with some algebraic expression, consisting of other known parameters, as well as operators to manipulate them. An example of this was the parameter 'grazing days', that is, the number of days for which an animal needs to feed in order to reach marketable condition. It was represented by the equation:
$$\text{Grazing days} = \frac{\text{finish weight} - \text{start weight}}{\text{liveweight gain}}$$
where finish weight and liveweight gain were defined as dependent variables, and start weight as an independent variable.
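The four data types described above might be pictured as follows. This is a hypothetical Python rendering for illustration only; the system itself holds these as Prolog terms whose exact form is not shown in the text, and every name below is an assumption.

```python
# 1. Facts: legitimate values for independent parameters.
facts = {
    "breed": ["suffolk cross", "scottish black face"],
    "crop": ["grass", "rape", "swedes", "turnips"],
}

# 2. Conditional facts: a value that holds only for given states of other
#    parameters, here keyed by (breed, fat class). Weights are in kg.
finish_weight = {
    ("suffolk cross", "3L"): 44.0,
    ("scottish black face", "2"): 38.0,
}

# 3. Subsets: a list drawn from the existing values of some parameter.
crop_sequence = ["grass", "rape"]


# 4. Equations: a parameter defined in terms of other parameters.
def grazing_days(finish_wt, start_wt, liveweight_gain):
    """Days to grow from start_wt to finish_wt (kg), gaining kg per day."""
    return (finish_wt - start_wt) / liveweight_gain


# Usage: the 33 kg start weight is an assumed figure; 0.11 kg/day is the
# tabulated Suffolk cross liveweight gain of 110 g per day quoted earlier.
days = grazing_days(finish_weight[("suffolk cross", "3L")], 33.0, 0.11)
print(round(days))
```

Substituting the expert's observed range of 80 to 220 g per day for the tabulated gain changes the result considerably, which is why the system allows tabulated values to be overridden.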
Operators
As mentioned above, the algebraic equations are expressed in terms of operators acting upon parameters to give some resulting value. In order to solve the numerous goals represented by this system, two different types of operator had to be described. Prolog itself has a range of mathematical operators defined internally, and for the sake of simplicity, those which can legally be utilised in an equation are the operators for addition, subtraction, multiplication and division (+, -, *, /), which maintain their normal precedence. Additional ones, such as 'sqrt(X)' and 'X ^ Y' (square root of X and X to the power of Y), can easily be added to the system, but as yet have not been found necessary.
Together with these is a more complicated set of operators which Prolog does not provide. On the whole, this set was designed in response to a need to manipulate lists of data in various ways. The notion of lists in equations stems from the use of the 'subset' datatype. For instance, the operator 'sum' takes all the elements of a list and adds them together to yield one value, as would be expected. The operator 'average' is similarly self-explanatory, dividing the 'sum' of a list by the number of its elements. Others may be more complicated, such as 'min' or 'max', but all of these operate on a list of data, reducing them to a single value.
In contrast, the operator 'foreach' was defined to construct lists from individual occurrences of data taken from the library of facts already established. It requires two arguments, one of which is a parameter that has either been defined as a subset of some other, or is represented by an equation which itself yields a list. This parameter is associated with a label, e.g. 'X', and takes the form:
X <- parameter
X <- crop_sequence (crop_sequence is a list, e.g. [grass,rape])
The preceding argument to 'foreach' is itself an algebraic expression, whose component parameters may be bound to the label X:
e.g. foreach(area(X)/2.5, X <- crop_sequence)
The expression is evaluated once for each element in the list, where X takes the value of the element.
The parameter which is bound to X will therefore be solved for that value of X as a constraint. A verbal interpretation of this expression is 'for each crop in the crop sequence, find its area and divide that by 2.5'. To illustrate this, assume the areas of different crops that a producer has are known to be 100 acres of grass, 25 of rape and 50 of barley, and also that the crops which he intends to feed to his stock (crop_sequence) are grass and rape. The above expression will yield a list with two values, namely [40,10] (a conversion from acres to hectares!).
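The behaviour of the list operators can be imitated in a few lines. The sketch below is in Python, not the Prolog operator syntax of the system; the names op_sum, op_average and foreach are stand-ins chosen for this illustration, and the expression argument is passed as a function of the label rather than as an algebraic term.

```python
def op_sum(values):
    """Reduce a list to the total of its elements."""
    return sum(values)


def op_average(values):
    """Reduce a list to its 'sum' divided by the number of elements."""
    return op_sum(values) / len(values)


def foreach(expression, values):
    """Evaluate expression once per element, binding the label to it."""
    return [expression(x) for x in values]


# The worked example from the text: areas in acres, divided by 2.5.
area = {"grass": 100, "rape": 25, "barley": 50}
crop_sequence = ["grass", "rape"]

hectares = foreach(lambda x: area[x] / 2.5, crop_sequence)
print(hectares)          # [40.0, 10.0]
print(op_sum(hectares))  # 50.0
```

Barley is ignored because it is not in the crop sequence, which mirrors the way the label X constrains which facts are consulted.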
Inference engine
The program as a whole revolves around a single predicate called, simply, 'calculate' and this is primarily responsible for solving the various goals that are referenced by each command in the menu of options presented to the user.
It takes three arguments, the first of which is the name of the goal to be solved, the second is a label referencing details of any constraints which may have been imposed on the way in which the solution is found, and the third is the returned solution itself. Figure 3 illustrates the general flow of the algorithm involved.
Fig. 3. Predicate 'calculate'. (Flowchart; legible labels include: 'calculate' each subgoal and return list of values; return the number; 'calculate' the expression representing it; return the expression value; 'calculate' arguments and evaluate user-defined operation; is goal represented by data?; ask user for missing data and store it; ask user to select value and label it; return the item as solution.)
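A much-simplified reading of this control flow is sketched below in Python. It is not the authors' algorithm: the constraint argument is omitted, the solution is returned rather than bound to a third argument, and the knowledge-base layout (a dict with 'facts' and 'equations') and the ask callback are assumptions made for the illustration.

```python
def calculate(goal, kb, ask):
    """Solve goal from the knowledge base, asking the user as a last resort."""
    if isinstance(goal, (int, float)):       # a number is its own solution
        return goal
    if isinstance(goal, list):               # a list: solve each subgoal
        return [calculate(g, kb, ask) for g in goal]
    if goal in kb["equations"]:              # defined by an equation
        arg_names, function = kb["equations"][goal]
        args = [calculate(a, kb, ask) for a in arg_names]
        return function(*args)
    if goal not in kb["facts"]:              # missing data: ask and store it
        kb["facts"][goal] = ask(goal)
    return kb["facts"][goal]


# Hypothetical knowledge base: two stored facts and one equation.
kb = {
    "facts": {"finish weight": 44.0, "liveweight gain": 0.11},
    "equations": {
        "grazing days": (
            ["finish weight", "start weight", "liveweight gain"],
            lambda finish, start, gain: (finish - start) / gain,
        ),
    },
}

asked = []


def ask(name):
    """Stand-in for the interactive prompt; records what was asked."""
    asked.append(name)
    return 33.0


days = calculate("grazing days", kb, ask)
days_again = calculate("grazing days", kb, ask)  # start weight is now stored
print(asked)
```

Storing the answer in the pool of facts is what lets the system remember information from one question to the next, as the prototype was required to do.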
Utilities
Together with the inference mechanism, data structures and operators, there were a great many predicates which had to be written in Prolog to support the program as a whole. These included routines to display, select from and move between menus, to accept and validate user input, to retrieve and archive files of data, to process lists of information (since Prolog is a language which works well with list structures), to generate formatted output of solutions, to maintain the database, to facilitate debugging, and so on. Some were general Prolog utilities which would be of use no matter what the application, and others were specific to the subject domain. A handful were imported from the work of others (Byrd and O'Keefe, 1983) and occasionally modified to knit well with this system, but apart from these, all were written from scratch.
Knowledge base
It was fully appreciated that the end user of the system was likely to be largely unfamiliar with using computer software of any sort, be he agricultural advisor or farmer, and, moreover, it had always been a specific intention that this and additional knowledge bases could be constructed by the user, unaided. Therefore, due to the terse nature of the Prolog language, it was felt unrealistic to expect the novice to be able either to understand an existing rule set or to create one of his own, and that some kind of interface to the rule set was required. Whilst the application was being developed to a working level, the knowledge remained in Prolog form, which, of course, was necessary for it to be usable. However, once this was complete, attention was directed to creating an interface to the rule set.
A number of design alternatives arose in response to satisfying three basic requirements. Firstly, there had to be a facility for printing off the knowledge base so that it could be taken away from the computer itself and studied at leisure; not all people can visualise a large volume of text when only a small proportion of it can actually be displayed on a screen at any one time (Maguire, 1982). Secondly, the knowledge base itself had to be described in a way that would be easy for the user to read and understand. This involved breaking down the Prolog representation into English-like sentences, with predicates and arguments being qualified by some meaningful, if redundant, descriptors, and spread out in a more conceptually acceptable manner. Finally, there had to be an interactive module that would guide the user in constructing new, syntactically correct rule sets of his own.
The next issue was that of defining a suitable syntax for the rules in the text file. A study of other similar systems such as TESS (S. Davis, 1984) and Faultfinder (Hammond, 1982) demonstrated that it was not necessary to construct
a set of Prolog grammar rules for parsing the text, and that the task could be easily achieved by declaring a set of unique operators and interspersing them between descriptors and data. An interpreter would look for well-formed text sentences, break them down into lists of properties and associated values, and then build up the corresponding rules in Prolog. The transition back to text would be virtually the same process but in reverse.
The final part of the knowledge-base design was to create an interface to it which could take the user through a selective question and answer session, and acquire the definitions of new parameters as well as gather factual data about existing ones.
INTERFACE
Menu system
The design of an interface to an interactive computer system must be regarded as equally important as the internal logic and correctness of the system itself. If a user cannot fully comprehend its intended function, or cannot easily establish how to make it operate as it was designed, then that system will not be used to its full potential, if at all. Moreover, the user, who is normally a most valuable source of feedback on the success of a system, cannot constructively assess it and propose modification and development for the future. Therefore, it is imperative that the interface is both meaningful and precise, and also that the type of interface adopted reflects the level of computer expertise that the typical user will possess.
One can distinguish three alternative methods of interacting with a computer which are currently used in many software packages: the menu system, the command-language system and the natural-language system (Hauptmann and Green, 1983).
In the menu system, whenever the program needs to know something, it displays a menu of options that describe the things that can be done, or are available at this point. The user selects an option of his choice, by typing in the number or keyword that represents it, and then the program branches to the subroutine that corresponds to that option. This process repeats itself until the program only needs some specific information to complete the task, which it can elicit from the user directly. The main advantages are that the user only has to know and understand the options available at the current point, and is not overloaded with a complete list of information. Yet, just a few consecutive choices can lead him to hundreds of alternatives. Also, both program and user know exactly what inputs are appropriate at each point, and error checking can be greatly simplified.
However, the drawbacks are that the system has to be organised into menus within the program, and that the experienced user often becomes frustrated with having to tediously leaf through many pages of
menus before arriving at his required choice. Hence, the menu system has become popular in applications mainly for the novice.
The command language is the most universally popular of the three input systems, although it is the least accommodating to the user. The 'command' is an instruction typed in to the computer, telling it exactly what to do in a very precise syntax and format, and a user must learn the correct way of expressing the job he wants performed before there can be any dialogue with the system. The vocabulary of a command language usually consists of abbreviations for English words, as, for example, in the UNIX* operating system, where 'man X' means 'display the on-line manual entry for program X'. This terse language leads to simple error checking, and is fast and convenient for the expert, but the novice or infrequent user tends to find the syntax confusing and awkward to remember.
The third method of interacting with the computer can be viewed as an 'English-like, easy to use' command language. Here, the computer accepts phrases in a form that is natural for the user, and asks clarifying questions where the meaning is ambiguous. The disadvantages lie in the extreme complexity of the program that is necessary to analyse the English input. It seems that experts still prefer terse command languages to verbose English instructions, but interactive natural-language programs are generally considered preferable for inexperienced users, because they lift the restrictions imposed by the rigid syntax of a formal language.
Of all the methods, it was decided that the menu system was most appropriate for this application, due to the ease of implementation and the anticipated low computer proficiency of the end user. A module was written which would allow the user to define his own menu pages and their commands, link them together with labels, and get them to initiate appropriate routines.
Each knowledge base has its own unique menu set, and the two reside in separate files, but with associated filenames so they can be referenced as a pair. Unfortunately, menu sets must be created using the standard operating system editor, since there was not sufficient time to develop a suitable menu-maintenance interface. The command lines in a menu are displayed in a vertical column on the screen and are numbered from one upwards; to select an option, the user simply types in the corresponding number. Each command line in a menu page is internally represented as a Prolog fact with three arguments. The first is the name of the page on which the command is to appear, the second is the command string itself, and the third is a label. This label indicates the function of the command, which is either to call another menu page, or to attempt to solve a specific goal. The following depicts the appearance and representation of a typical menu page together with its Prolog format:
*UNIX is a trademark of Bell Laboratories.
Menu: Select a Question to answer
1) Change the value of any variables
2) How long is forage available for (days)?
3) What is the total dm of the forage (tonnes)?
4) How many days grazing is needed for the sheep?
5) What level of dm is available per grazing day (tonnes/day)?
6) What level of dm is available per day overall (tonnes/day)?
7) For how long can each crop be fed (days)?
8) On what day will the stock be fat?
9) What is the total stock I can finish?
Which number [E]?

menu(top,'Select a Question to answer',title).
menu(top,'Change the value of any variables',change).
menu(top,'How long is forage available for (days)?',total.crop_dur).
menu(top,'What is the total dm of the forage (tonnes)?',total_dm).
menu(top,'How many days grazing is needed for the sheep?',grazing.days).
menu(top,'What level of dm is available per grazing day (tonnes/day)?',min_dm).
menu(top,'What level of dm is available per day overall (tonnes/day)?',overall_dm).
menu(top,'For how long can each crop be fed (days)?',crop_days).
menu(top,'On what day will the stock be fat?',finish.day).
menu(top,'What is the total stock I can finish?',total_stock).

As an aside, although the system is menu-driven, there is, as has already been mentioned, a limited reliance on the operating system of the computer to provide editing and printing facilities. The machine that will house the application is a DEC VAX 11/750 with the VMS operating system, and since this has a command-language interface, it is envisaged that the user will have to be given a small subset of appropriate commands to learn.
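To illustrate how three-argument menu facts of this kind could drive the numbered display and the selection step, the following is a rough sketch only. The predicates page_options/2, number_from/3, show_page/1 and selected_label/3 are hypothetical names, not taken from the system, and the code assumes SWI-Prolog's format/2, forall/2 and list library. Only a few of the menu facts are repeated.

```prolog
:- use_module(library(lists)).

% A subset of the menu facts: menu(Page, CommandText, Label).
menu(top, 'Select a Question to answer', title).
menu(top, 'Change the value of any variables', change).
menu(top, 'What is the total dm of the forage (tonnes)?', total_dm).
menu(top, 'What is the total stock I can finish?', total_stock).

% page_options(+Page, -Options): Options is a list of N-Text-Label
% terms numbered from one upwards; the title line is not numbered.
page_options(Page, Options) :-
    findall(Text-Label,
            ( menu(Page, Text, Label), Label \== title ),
            Pairs),
    number_from(Pairs, 1, Options).

number_from([], _, []).
number_from([Text-Label|Rest], N, [N-Text-Label|Numbered]) :-
    N1 is N + 1,
    number_from(Rest, N1, Numbered).

% show_page(+Page): print the title, then the numbered command lines.
show_page(Page) :-
    menu(Page, Title, title),
    format("Menu: ~w~n", [Title]),
    page_options(Page, Options),
    forall(member(N-Text-_, Options),
           format("~w) ~w~n", [N, Text])).

% selected_label(+Page, +Number, -Label): label of the chosen option,
% i.e. the page to call or the goal to solve.
selected_label(Page, Number, Label) :-
    page_options(Page, Options),
    memberchk(Number-_-Label, Options).
```

In this sketch, typing 2 on page `top` would yield the label `total_dm`, which a dispatcher could then hand to the goal-solving predicate; deciding whether a label names another page or a goal is left out.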
Gathering information
The purpose of the application is to find solutions to mathematical problems that are interesting to the user. The wording of a particular problem is displayed as a line of text in the menu, and is selected by typing in the number that refers to it. The predicate 'calculate' attempts to solve the goal which has been associated with that problem description, and hopefully returns an answer. Whilst finding the solution, the program will refer to the database for relevant information, which may or may not be present. For every parameter that is represented by a specific value, as opposed to an algebraic expression, one of three situations may arise. Firstly, if an appropriate value cannot be found, then a question is displayed which asks the user to provide one. The wording for this is laid out in the full definition of the parameter itself. Where the parameter is dependent upon the values of others, these are displayed first so that the user knows exactly what to type in. For example, suppose the program needs to know the finishing weight of some lambs. It knows that finish weight is dependent upon breed and fat class, and it has values for both of these. It would ask a question in the following way:
If...
   breed = sbf
   fat_class = 3L
then...
What is the finish wt. (kg)? 42
Secondly, where only one value exists for a parameter, then it is used unreservedly, and no intervention is required. Finally, however, where there are numerous possible values, the user is asked to choose between them. Again, the menu-selection method is adopted, where each value appears alongside a number, and the user simply types in the right one. For example, the 'fat class' of an animal's body composition falls into one of six categories, and the program will have to ask which class the user wants his lambs to finish in. The display would appear as follows:
To answer the question...
What is the fat class?
1) 1
2) 2
3) 3H
4) 3L
5) 4
6) 5
Which number [E]? 4
As an aside, the '[ ]' brackets are used to contain any input which is valid in addition to the numbers one to six. In this case, 'E' is valid, and means 'Exit'. Where numerous strings are contained in the brackets, the first in the list is the default, and is obtained by simply pressing "RETURN". This simply illustrates how menu-selection has been incorporated whenever possible, in order to make it easier for the user to operate, and to reduce the incidence of erroneous input.
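The three situations described in this section (no value, exactly one value, several values) can be told apart by counting the stored values for a parameter. The sketch below is a hedged illustration: value/2, lookup/2 and classify/2 are hypothetical names, the sample facts are invented for the example, and the actual prompting of the user is omitted.

```prolog
:- dynamic value/2.

% Hypothetical stored values: value(Parameter, Value).
value(breed, sx).
value(fat_class, '1').
value(fat_class, '3L').
value(fat_class, '3H').

% lookup(+Parameter, -Outcome): classify the three situations.
%   ask         no value is stored, so the user must be asked for one
%   use(V)      exactly one value, used without intervention
%   choose(Vs)  several values, to be offered as a numbered menu
lookup(Parameter, Outcome) :-
    findall(V, value(Parameter, V), Values),
    classify(Values, Outcome).

classify([], ask).
classify([V], use(V)).
classify([V1, V2|Vs], choose([V1, V2|Vs])).
```

Here `lookup(breed, O)` gives `use(sx)`, `lookup(fat_class, O)` gives a `choose/1` term listing the three stored classes, and a parameter with no stored value, such as a hypothetical `finish_wt`, gives `ask`.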
Example session
To initiate the session, the user simply types 'go' at the terminal as the command to the operating system. The following text is the trace of a typical question and answer dialogue between user and program:
%go
C Prolog version 1.5a.edai
What data-file do you want to use [sheep,E]? sheep
sheep reconsulted 4532 bytes 1.4 sec.
sheep.menu reconsulted 560 bytes 0.41667 sec.
finished!
Hit "RETURN" to continue

Menu: Sheep Over-Wintering Decision Aid
1) Select a Question to answer
2) Forget all temporary constraints
3) Define new parameters
Which number [E]? 1

Menu: Select a Question to answer
1) Change the value of any variables
2) How long is forage available for (days)?
3) What is the total dm of the forage (tonnes)?
4) How many days grazing is needed for the sheep?
5) What level of dm is available per grazing day (tonnes/day)?
6) What level of dm is available per day overall (tonnes/day)?
7) For how long can each crop be fed (days)?
8) On what day will the stock be fat?
9) What is the total stock I can finish?
Which number [E]? 3

Option: What is the total dm of the forage (tonnes)?
To answer the question...
What is the sequence of crops for feeding?
1) swedes
2) rape
3) grass
Which numbers [E]? 31
If...
   crop = grass
then...
What is the area of the crop (hectares)? 20
If...
   crop = grass
then...
What is the dm yield of the crop (tonnes/ha)? 1.2
If...
   crop = swedes
then...
What is the dm yield of the crop (tonnes/ha)? 10
The total dm available is 84 tonnes
Hit "RETURN" to continue

Menu: Select a Question to answer
Which number [E]? 7

Option: For how long can each crop be fed (days)?
If...
   crop = grass
then...
How soon after 1 September is the crop available as feed (days)? 0
If...
   crop = grass
then...
What is the duration of the crop (days)? 125
If...
   crop = swedes
then...
How soon after 1 September is the crop available as feed (days)? 70
If...
   crop = swedes
then...
What is the duration of the crop (days)? 155
Crops? can be fed for ? days to achieve 0.37333 dm (tonnes/day)
grass, 64.286,
swedes, 160.71,
Hit 'RETURN' to continue

Menu: Select a Question to answer
Which number [E]? 8

Option: On what day will the stock be fat?
On what day will you start feeding the stock? 20
To answer the question...
What is the breed abbreviation?
1) sbf
2) sx
3) txl
Which number [E]? 2
To answer the question...
What is the fat class (1,2,3L,3H,4,5)?
1) 1
2) 2
3) 3H
4) 3L
5) 4
6) 5
Which number [E]? 4
If...
   breed = sx
   fat_class = 3L
then...
What is the finish wt. (kg)? 42
What is the ave. starting wt. of the lambs (kg)? 30
If...
   breed = sx
then...
What is the liveweight gain (kg/day)? .11
The stock will be fat on day 129.09
Hit "RETURN" to continue

CONCLUSION
Work on the project was brought to a halt when it was considered that a useful working system had been achieved, which could be installed on the agricultural advisory unit's computer and operated without the help of the student. From the beginning, it was felt that the overriding priority was to get something up and running, in whatever form, and leave the development of "bells and whistles" until later. As a consequence, the project was conceived as being 'open ended' and, rather than as a whole entity, it was designed and implemented in logically progressing stages. The first of these was to construct a crude, working prototype of the system's interface, the second was to structure the knowledge of the domain, the third to devise a simple inference mechanism, the fourth to add an information-gathering module, and the fifth and final stage was to cloak the system with a user-friendly interface.
The final result was demonstrated to the agricultural specialist, and its achievements discussed at length. It was obvious that the range of tasks it could perform were, as yet, of limited use, but the specialist appeared to grasp the scope of the underlying mechanism, and enthusiastically led the conversation towards a critical re-appraisal of her own methodology in dealing with the subject domain. New objectives were raised, and to fulfil them, additional tasks were defined for the system to solve. Some of these were within the scope of the existing range of functions that the system could handle, whilst others required its flexibility to be expanded considerably.
Throughout the development period, various design problems arose, and these were solved in the simplest and most immediately effective way. However, when the project was completed and objectively reviewed, it was clear that, if development was to be resumed at a later date, then not only would new methods have to be incorporated, but some of the existing ones would have to be completely redesigned to cope effectively.
For instance, knowledge bases can be constructed in either of two ways. Firstly, the user can enter new information for them with the help of an interface, which conducts a question and answer dialogue to glean all the details it requires. Secondly, he can learn to use the editor that resides on the native operating system of the computer itself. The latter is terse and unhelpful, but at least it does exist and does work correctly, should the interface fail. Together with each knowledge base is a dedicated menu file that contains Prolog code.
Every line of code represents an option on a page in the menu, and the user constructs the menu file according to how he wants it to appear and function. Unfortunately, there is no specific interface to help him do this, and so he must resort to using the editor. In fact, it is doubly important that, in the future, an interface be written, because not only would it ensure that the syntax of the resulting Prolog code was correct, but it would also dispense with the need for the user to understand Prolog in the first place.
What is also required is a module that will allow the user to ask the program for some justification of its reasoning process, where he finds the output too terse. This is a feature that is considered essential in the design of expert systems, where conclusions inferred from the prevailing conditions can be even more abstract. It is hoped that a suitable implementation will be installed before the application is finally submitted to the advisory unit, but as yet, the exact form it will take remains unresolved.
There are a couple of alternatives to presenting an explanation system. Firstly, where the user is presented with a question to answer, he could respond by typing in the word 'why', meaning 'why do you want to know this?'. The program would then explain the immediate significance of the question within
the context of the goal being solved. Repeatedly typing 'why' would take the level of explanation deeper and deeper, to a point where the user understands the reasoning. This is a useful forward-chaining method, which explains how basic pieces of information are combined in solving the goal. However, the program may not always give the user an opportunity to type the 'why' command before it prints an answer, so a backward-chaining method should also be incorporated. This would work in the opposite way, starting at the goal and explaining, to increasing depth, how it derived the final solution. Secondly, the simplest and probably least informative kind of explanation system would be to attach a 'trace' facility, which would crudely plot the progress of the inference mechanism throughout the problem, printing out the names and values of sub-goals (and sub-sub-goals) as they are found on the way.
If the program has to ask the user for information in order to solve a problem, it will store what it is given in the database, in case the same problem arises again some time in the future. The user may, however, delete it in the meantime if he so desires, or, if the session is terminated, it may be saved permanently, depending upon how it has been defined. To the user, therefore, it appears to learn and remember new information in an intelligent way. However, it does not have a mechanism for remembering anything it derives without the user's help.
At present, the program cannot make the necessary transformation from one equation to any other, and consequently, each form must be individually specified. By incorporating a module to perform this function, the program would gain flexibility and be able to handle a more generalised set of production rules.
The technique has already been explored in the work of others, such as PRESS (Prolog Equation Solving System) (Bundy and Welham, 1981) and SMCS (Symbolic Mathematical Computation System) (Conkie, 1986), and simply needs to be reduced and adapted to fit this system.
At the outset of the project, the intention was to construct an expert system for the advisory unit at the Edinburgh School of Agriculture, consisting of an inference engine, a knowledge base of facts and production rules, and a database of relevant information. The subject domain was to be an aspect of sheep production in the East of Scotland, with expertise being provided by a specialist from the unit itself. The final system was intended for use as an aid to making advisory management decisions in direct response to the needs of the producer-customer, and also as a learning tool for colleagues at the unit with insufficient expertise. The provision of an appropriate database, however, proved to be a complex task, and it was to this end that the project eventually addressed itself. Consequently, the program that was forthcoming constitutes only a part of the expert system which was originally conceived, and although it can take raw data and transform it into useful information, it is unable to formulate any notion of how or why that information might be useful.
It is hoped that in the future, development will be extended to provide a complete application, where the information thus generated from this database module will fuel a true inference engine as it manipulates its production rules in solving real problems.
REFERENCES
Alavi, M., 1984. An assessment of the prototyping approach to information systems development. Commun. ACM, 27: 556-563.
Bundy, A. and Welham, B., 1981. Meta-level inference for application of rewrite rules in algebraic manipulation. Artif. Intell., 16: 189-211.
Byrd, L. and O'Keefe, R.A., 1983. Prolog set and input utilities. Edinburgh University, 25 pp.
Cassie, M.M., 1982. The influence of breed, start weight and body condition on the growth performance of store lambs. Edinburgh University, 35 pp.
Clocksin, W.F. and Mellish, C.S., 1981. Programming in Prolog. Springer, Berlin, 225 pp.
Conkie, A., 1986. A symbolic mathematical computation system. Masters thesis, Heriot-Watt University, Edinburgh, 70 pp.
Davis, S., 1984. TESS - the expert system shell. Masters thesis, Heriot-Watt University, Edinburgh, 58 pp.
Dickson, I., Lloyd, M., McClelland, H., Vipond, J. and Volans, K., 1984. Finishing Store Lambs. The Scottish Agricultural Colleges, Edinburgh, 35 pp.
Forsyth, R., 1984. Expert Systems: Principles and Case Studies. Chapman and Hall Computing, London, 195 pp.
Gammack, J.G. and Young, R.M., 1985. Psychological techniques for eliciting expert knowledge. In: M.A. Bramer (Editor), Research and Development in Expert Systems. British Computer Society Workshop Series, Cambridge University Press, Cambridge, 20 pp.
Hammond, P., 1982. Faultfinder - a simple expert system. Masters thesis, Edinburgh University, 55 pp.
Hauptmann, A.G. and Green, B.F., 1983. A comparison of command, menu-selection and natural-language computer programs. Behav. Inf. Technol., 2: 163-178.
Lloyd, M., 1983a. An evaluation of commercial store lamb finishing systems. Paper to JCO Sheep Committee, Edinburgh School of Agriculture, 25 pp.
Lloyd, M., 1983b. Making the most of your store lambs. East of Scotland College of Agriculture, 25 pp.
Maguire, M., 1982. Evaluation of published recommendations on design of man-computer dialogues. Computer Laboratory, University of Leicester, 35 pp.
Speedy, A.W., 1980. Lamb finishing on forage crops. Paper to the Annual Conference, Ingliston, East of Scotland College of Agriculture, Edinburgh, 30 pp.
Warren, D., Bowen, D., Byrd, L. and Pereira, L., 1985. C-Prolog users manual, 1.4d.edai. Department of Artificial Intelligence, Edinburgh University, 115 pp.