Computers and Electronics in Agriculture 25 (2000) 271 – 293 www.elsevier.com/locate/compag
Pl@nteInfo® — a web-based system for personalised decision support in crop management Allan Leck Jensen *, Peter S. Boll, Iver Thysen, B.K. Pathak Department of Agricultural Systems, Danish Institute of Agricultural Sciences, P.O. Box 50, DK-8830 Tjele, Denmark
Abstract
Pl@nteInfo® (www.planteinfo.dk) is a decision support system that uses the World Wide Web to supply farmers and agricultural advisers with just-in-time information and decision support for crop management. A subscription system enables personalised information. Background data are collected from different sources and processed by decision support models, and the results are integrated into personalised web pages with embedded graphics, expert interpretations and links to additional information. This article presents the system with its decision support facilities and subscription system, the architectural design of the system using collaborating web servers and the technical solutions for creating personalised information in real time. Through the example of Pl@nteInfo®, the article shows that it is possible to build web-based decision support systems, where personalised advice is given in real time, based on user profiles together with distributed data and decision models. The article also analyses the user acceptance of the system. This analysis showed that the farmer and adviser subscribers are very dedicated users. Both the activity patterns and the subject preferences differ significantly between these subscriber types, with farmers generally searching for specific advice and advisers using the system to keep their knowledge up-to-date. © 2000 Elsevier Science B.V. All rights reserved.
Keywords: World Wide Web; Personalisation; Decision support; Subscription; User analysis
* Corresponding author. Tel.: +45-8999-1662; fax: +45-8999-1200. E-mail address: [email protected] (A.L. Jensen)
0168-1699/00/$ - see front matter © 2000 Elsevier Science B.V. All rights reserved. PII: S0168-1699(99)00074-5
A.L. Jensen et al. / Computers and Electronics in Agriculture 25 (2000) 271–293
1. Introduction
The agricultural sector in Denmark, along with other sectors of the national economy, is undergoing a transition towards ‘informatisation’ after having achieved a high level of mechanisation and automation in the past decades. In addition, the focus of agricultural production is changing from quantity towards quality and sustainability. These transitions force farmers and agricultural advisers to deal with increasing volumes of information. In order to make precise decisions, they need to access and analyse vast and widely scattered information resources. Often, the task of selecting, combining and analysing the information is demanding, even for agricultural advisers. For this reason the development of personal computer (PC)-based decision support systems (DSS), with one or many embedded components like numerical models, expert systems and databases, has been a major activity of the agricultural scientific community. The main objective of developing such systems has been to provide an effective conduit for the transfer of scientific knowledge from research institutions to the end-users in order to enable them to make efficient decisions. Normally, DSSs are based on non-static facts, or they require large amounts of up-to-date input data. Consequently, PCs do not appear to be the optimal platform for such dynamic DSSs.
The recent advancement in the field of the Internet has opened up new challenges as well as opportunities to fulfil the increasing needs for up-to-date and precise information. With the emergence of the World Wide Web (WWW), the advantage of the Internet as a medium for global access to information has been generally acknowledged. Since the evolution of the web during 1993–1994, its application has grown among scientists, technologists and end users in the field of agriculture.
A number of agricultural web sites have emerged on the Internet providing on-line information in the form of HyperText Markup Language (HTML) pages with static embedded images and, in some cases, databases. With the dramatically increasing number of web sites, users of web-based information face the problem of locating sources of relevant information, and of assessing the reliability and quality of the information. This problem is addressed by a ‘web directory’, like the Finnish Agronet (www.agronet.fi) and the German DAINet (www.dainet.de), where selected reliable information services from independent organisations are made available within the same web site. Normally, the information services in a web directory do not collaborate with each other, so it is up to the user to combine the information. If, for example, one service provides a model requiring certain data and another service provides this kind of data, the user has no guarantee that the data are in a form that can be used directly by the model.
The potential of using the Internet to give user-tailored advice based on real-time processing of distributed data and models was described by Jensen et al. (1997). The concept of a ‘Web-based collaborative information system’ was launched, where the web is used as a computing infrastructure to combine different information components (models, data, texts, graphics) distributed across collaborating, but autonomous organisations.
In order to carry out practical investigations on combining the advantages of DSSs for knowledge transfer with the advantages of the internet for information transfer, Pl@nteInfo® was launched in 1996. The system is continuously evolving, utilising the emerging facilities of the web. This article describes Pl@nteInfo® in the 1998 version, and the main focus is on the decision support facilities and the personalisation of information. It demonstrates how information components from different distributed sources can be processed and integrated into a single (and in most cases personalised) web page in real time. The article also discusses the user acceptance of the system.
2. System description
2.1. Organisation and objectives
Pl@nteInfo® (www.planteinfo.dk) was developed in collaboration between the Danish Institute of Agricultural Sciences (DIAS) and the Danish Agricultural Advisory Centre (DAAC). DIAS is a research institute under the Danish Ministry of Food, Agriculture and Fisheries, while DAAC is owned by the farmers’ organisations with the main task to give specialist information to the local agricultural advisers. Since 1996, Pl@nteInfo® has delivered daily updated information and decision support for crop production. The primary target groups for the information in Pl@nteInfo® are farmers and crop advisers. The objectives for developing Pl@nteInfo® include the following.
1. To conduct research on the possibilities of using the Internet for real-time advice and to illustrate the research results through a practical example (the main objective).
2. To collect and visualise the formalised agricultural knowledge that is used for practical advice in Denmark.
3. To build a user-friendly and valuable information and decision support system for farmers and advisers.
2.2. Architectural design
The architectural design of the Pl@nteInfo® system involves a main WWW server at the DIAS and a collaborating server at the DAAC. The data, models and information of Pl@nteInfo® are distributed between these two WWW servers. The main web server at DIAS contains the entire logical framework to extract and compile information from the collaborating organisations. Most of the application logic is performed on the server side, and only static or assembled HTML documents with graphics and JavaScript code are sent to the clients. The dynamic HTML documents are assembled in real time from static or real-time-produced components using SAS programs and Perl CGI scripts at the DIAS server. The algorithmic code of the decision support models and the procedures for graphical presentation of the model output were written as SAS programs and stored on a SAS application server. The web server communicates with the SAS application server using the SAS/IntrNet™ component (SAS Institute Inc., 1998).
As most of the dynamic information is based on weather data, databases with original and derived weather data are essential. For efficient model execution, the weather databases are stored on the Pl@nteInfo® server together with the model programs. The database with original weather data is therefore updated every morning by File Transfer Protocol (FTP) transfer from the Danish Meteorological Institute (DMI). The database contains a selection of weather parameters from different weather stations with a temporal resolution of 1, 3 or 24 h, depending on the weather parameter. Also, weather data and weather forecasts interpolated to 10-km squares are transferred from DMI. Weather radar images are transferred every 10 min for animations to predict local precipitation.
2.3. Facilities
The dynamic information in Pl@nteInfo® is created either from database look-up (e.g. field recordings of diseases and pests) or from the activation of decision support models. Naturally, Pl@nteInfo® also contains static (but up-to-date) HTML documents, as seen in Table 1. The different facilities are located on the WWW servers of the two collaborating organisations, but in general, the origin of the information is not visible to the user.

Table 1
Different categories of subjects in Pl@nteInfo® 1998 (a)

Category          Subject/subject group                          Personalised  Source
Static documents  Latest news on crop production                 −             DAAC
                  Latest news on crop production from advisers   +             Advisers
                  Biological information about diseases and pests −            DAAC
                  Help facility                                  −             DIAS
                  Agricultural discussion group                  −             DAAC
Databases         Weather information                            +/−           DIAS
                  Field recordings of diseases and pests         −             DAAC
                  Experimental data from field trials            −             DAAC
                  User-specific databases                        +             DIAS
Decision models   Advice for diseases and pests                  +/−           DIAS
                  Corn yield prediction                          +             DIAS
                  Nitrogen fertilisation                         −             DIAS
                  Irrigation                                     +             DIAS

(a) The subjects come from different sources and may or may not be personalised (+/−, some subjects in the group are personalised).

The first two categories of items in Pl@nteInfo®, static documents and databases, are self-explanatory. In web-based information systems, these two components are quite often used. The third category, the decision support models, is the most interesting in Pl@nteInfo®, since this type of agricultural decision support through the web is still unusual. As outlined in Table 1, Pl@nteInfo® contained four groups of decision support models in 1998: advice for various diseases and pests, corn yield prediction, and field-specific need for nitrogen fertilisation and irrigation. The different decision support models are listed with references in Table 2.

Table 2
The decision support models of Pl@nteInfo® 1998

Decision support model                             Personalised  References
Risk of frit fly in oats and corn                  Somewhat      Lindblad & Sigvald (1996)
Risk of brassica pod midge in oilseed rape         Somewhat      Hansen (1994)
Risk of Septoria spp. in winter wheat              Somewhat      Hansen et al. (1994), Secher et al. (1995)
Risk of leaf and net blotch in winter barley       Somewhat      Secher & Jørgensen (1997)
Risk of leaf and net blotch in spring barley       Somewhat      Secher & Jørgensen (1997)
Risk of potato late blight                         Somewhat      Hansen (1995)
Recommendation for Septoria spp. in winter wheat   Much          Nielsen (1998)
Recommendation for eyespot in winter grain crops   No            Jørgensen et al. (1996)
Prediction of corn yield                           Somewhat      Brown (1969)
Nitrogen fertilisation                             No            Anonymous (1997)
Irrigation                                         Very much     Plauborg & Olesen (1995)

None of the decision support models have been developed for Pl@nteInfo®. They have all been in operational use with dissemination of the information through traditional sources: PC programs, fax, mail, teletext and newspapers. While implementing the decision support models to the WWW platform, the associated facilities of hypertext and graphics have been utilised. The output from the decision support models for risks of diseases and pests, and for corn yield prediction follows the same overall design, with a graphical map of Denmark representing the result of the decision model, an associated selection of expert interpretations, and with links to additional information. This type of output is the most common in Pl@nteInfo®, and how it is generated and presented in real time will be described later. The outputs from the decision support models for nitrogen fertilisation needs and for irrigation needs are different in nature, partly because these facilities are field specific. The output from these facilities will not be described in this article.
2.4. Personalisation
A large part of the information in Pl@nteInfo® is public. The public information often uses fill-out forms and clickable maps to allow the user to specialise the information to his/her own conditions, e.g. his/her geographical location, his/her field and crop, etc.
In contrast to specialised information, personalised information is automatically tailored to the user by the use of pre-recorded data about the user and his/her particular conditions. The user data are stored centrally on the main Pl@nteInfo® server, and include name and address of the subscriber, co-ordinates of geographical location, preferences, fields, crops, locally measured precipitation and applied irrigations. The personalised information is organised through a subscription system, where subscribers identify themselves through a log-in procedure. Because Pl@nteInfo® is a multi-user system, it must be able to recognise the subscriber and his/her rights and settings at every request after log-in. Therefore, a unique session identification number is attached to all URLs in pages sent to the subscriber to ensure that each subsequent request from the subscriber contains this identification number. A list of active subscriber sessions is maintained, to ensure fast look-up of subscriber information and rights.
The subscription system has three subscription types:
• Farmer. The farmer subscriber is entitled to weather data, weather forecasts and decision support facilities for his/her own geographical location. He/she also has access to up-to-date weather radar animations and more detailed recordings of diseases and pests.
• Adviser. The adviser subscriber is entitled to the same facilities as the farmer subscriber, and is allowed to use the information professionally. Furthermore, he/she can obtain permission to set up an agricultural news service within Pl@nteInfo®, where he/she can supply clients with current, local news and current, local comments to Pl@nteInfo® facilities, e.g. the pages with risks of crop diseases and pests.
• Guest. The guest subscriber is only entitled to a certain personalisation of the pages. Like other subscribers, the guest can personalise the menu and can select between the available news services.
The rights of these subscription types are designed to fit the corresponding professions, but they are not restricted to them. For example, some advisers have farmer subscription. In 1998, the yearly charge for farmer, adviser and guest subscription was DKK 400, 1600 and 0, respectively (about 53, 212 and 0 ECU). On 12 August 1998, Pl@nteInfo® had 1084 subscribers, distributed as 333 farmer subscribers, 59 adviser subscribers and 692 guest subscribers. Eleven active news services have been established, with seven of them being administrated by local agricultural advisers. A total of 458 documents with news, local interpretations to Pl@nteInfo® facilities, etc. have been uploaded by the news service administrators. Subscribers can select which news services they wish to receive information from, and the seven local news services had an average of 67.0 (standard deviation, 22.4) readers. The news service administrators can see who subscribe to their news service, and they may decide that they wish to approve the readers. In 1998, all news services were free, but the system is prepared to allow news service administrators to charge the readers for the information.
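The session mechanism described in this section (a unique session identification number attached to every URL, plus a list of active sessions for fast look-up) can be sketched as follows. This is a minimal illustration in Python, not the system's actual implementation, which the article describes as SAS programs and Perl CGI scripts. The function names, the `sid` query parameter and the in-memory dictionary are all assumptions made for the example.

```python
import secrets
import time
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse

# Hypothetical in-memory list of active sessions: session ID -> subscriber data.
active_sessions = {}


def log_in(subscriber_id, subscription_type):
    """Register a session after a successful log-in and return its unique ID."""
    session_id = secrets.token_hex(8)
    active_sessions[session_id] = {
        "subscriber_id": subscriber_id,
        "type": subscription_type,  # "farmer", "adviser" or "guest"
        "started": time.time(),
    }
    return session_id


def tag_url(url, session_id):
    """Append the session ID to a URL so that the next request carries it."""
    parts = urlparse(url)
    query = parse_qsl(parts.query)
    query.append(("sid", session_id))  # "sid" is an assumed parameter name
    return urlunparse(parts._replace(query=urlencode(query)))


def identify(url):
    """Return the session data for an incoming request URL, or None."""
    query = dict(parse_qsl(urlparse(url).query))
    return active_sessions.get(query.get("sid"))


# Usage: every link in a page sent to the subscriber is tagged before sending.
session = log_in(subscriber_id=42, subscription_type="farmer")
link = tag_url("/risk?crop=wheat", session)
```

Carrying the identifier in the URL means that any request without it is simply treated as coming from a visitor who has not logged in.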
Fig. 1. An example of real-time decision support for diseases and pests in Pl@nteInfo®.
2.5. System output
The output from Pl@nteInfo® is web pages consisting of different information components: graphics, text explanations, hyperlinks to additional and related information, and forms for the user to personalise the information. Fig. 1 gives an example of a web page in Pl@nteInfo®. The page has been slightly edited to fit a printed medium in English. Fig. 1 illustrates how risk values are presented graphically by maps of Denmark (left screen). Each municipality of Denmark is coloured green, yellow or red, according to increasing risk levels. By clicking on a particular region of the map, the user can obtain local information about the development of the risk value over time and compare the development with the previous two growing seasons (right screen).
It is important that the maps of calculated risk levels of diseases and pests do not stand alone. They are combined with textual interpretations and general advice from selected experts on the particular subject. The experts may be from DAAC, DIAS or local advisory centres. The subscriber selects which experts he/she wants to receive comments from, and only the most current comments on the particular subject are shown. The particular web page in Fig. 1 is public, and apart from the selection of expert comments, there is no personalisation to subscribers in this example.
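The colour coding of municipalities by increasing risk level can be illustrated with a small Python sketch. The article does not state the actual cut-off values or the scale of the risk values, so the thresholds, the 0 to 1 scale and the example values below are purely hypothetical.

```python
# Hypothetical thresholds on an assumed 0-1 risk scale; the article does not
# give the real cut-off values used by the decision models.
LOW_LIMIT = 0.3
HIGH_LIMIT = 0.7


def risk_colour(risk):
    """Map a risk value to the map colour for increasing risk levels."""
    if risk < LOW_LIMIT:
        return "green"
    if risk < HIGH_LIMIT:
        return "yellow"
    return "red"


def colour_map(risk_by_municipality):
    """Return municipality -> colour for a dictionary of risk values."""
    return {name: risk_colour(r) for name, r in risk_by_municipality.items()}


# Made-up risk values for three municipalities, for illustration only.
colours = colour_map({"Tjele": 0.12, "Viborg": 0.45, "Skive": 0.81})
```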
2.6. Real-time compilation of dynamic web pages
The essential information components in Pl@nteInfo® web pages like Fig. 1 are the graphics (here, the risk map) and the associated selection of expert interpretations. The map is generated at DIAS as output from a decision model. The decision model is implemented at the SAS application server, and it uses weather data from the weather database to calculate response values and a graphical component to present the response values graphically. The news service preferences of the subscriber are read from the subscriber database and used to extract the relevant expert interpretations for the subject from the database of expert comments.
Fig. 2. Information flow to create a web page in real time.
The flow of information applied to integrate the map, the expert comments and the other information components into the web page is illustrated in Fig. 2. All background data for the information components are stored on the main Pl@nteInfo® Internet server for fast execution. The weather database is updated daily using FTP transfer, as explained earlier. The comment database contains file system addresses of text files with comments stored as ASCII text with HTML codes. The comment database is updated through web-based HTTP file upload technology. The personalisation system ensures that news service administrators have an ‘upload new comment’ link on the web pages they are allowed to give comments to. This guarantees that the comment is connected to the correct page automatically. The news service administrator writes the comment locally in his/her favourite text editor and uploads it to the Internet server. He/she is shown the comment within its context before authorising its publication. Hence, the updating of expert comments is distributed entirely to the source and does not involve any administration at the server side.
The integration of information components into a web page is done with programs mainly written in SAS. When a user clicks a link in Pl@nteInfo® in order to receive decision support, a program call is made to the SAS application server, with reference to a specific SAS program and to a sequence of relevant parameters. The SAS application server invokes the specified program with the parameters; the program processes the user request and generates an output HTML document from pre-processed or real-time-processed components. The output HTML document is sent to the user’s web browser. One of the principal information components in many web pages in Pl@nteInfo® is the graphical map. This is embedded into the output document using the standard HTML image tag IMG.
The source of the image is not a pre-processed image and the SRC attribute of the IMG tag is not a normal URL. Rather, the SRC attribute is yet another SAS program call to the SAS application server, with reference to a general SAS program for map drawing. In this hierarchy of program calls, the first call leaves space for the map image on the web page and, after termination, the output HTML document is sent to the user’s browser. Here, the processing will reveal the call to the SAS application server for the source of the IMG tag. The SAS application server activates the SAS program, which opens the proper weather database and processes the data with the decision support model, which is specified by the program with appended parameters. After the data processing, the results are presented graphically as a graphical map, and this output is returned to the user’s browser and incorporated into the HTML document. The programs for integration of information components have one additional task, which is not directly visible on the web pages. The maps in these web pages are clickable, resulting in a graph of the development over time for the clicked region. In order to make this possible, a client-side map and a JavaScript defining the actions for a click on the map are included in the web page by the program. In the case of Fig. 1, the client-side map contains the pixel co-ordinates of polygons representing the Danish municipalities. The JavaScript calls a second general
program, likewise with an embedded IMG tag with a SAS program call. This program will create a web page, like the one shown on the right part of Fig. 1, based on the appended parameters, including an identification (ID) for the clicked municipality.
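The hierarchy of program calls described in this section can be sketched in Python. The first call returns an HTML document whose IMG source is itself a second program call, together with a client-side map for the clickable regions. The endpoint path, the parameter names and the program names below are hypothetical stand-ins, since the article does not give the actual URL scheme. As a simplification, the regions link directly through `href` rather than through the JavaScript handler the article mentions.

```python
from html import escape
from urllib.parse import urlencode

# Hypothetical endpoint standing in for the program-call interface of the
# application server; the real URL scheme is not given in the article.
APP_SERVER = "/cgi-bin/appserver"


def program_call(program, **params):
    """Build a URL that asks the application server to run a named program."""
    return APP_SERVER + "?" + urlencode({"program": program, **params})


def build_page(model, regions, comments):
    """First call: return the HTML document; the map image is NOT drawn yet.

    regions  -- region ID -> flat list of polygon pixel co-ordinates
    comments -- expert comments already selected for this subscriber
    """
    # The image source is a second program call, resolved later by the browser.
    img_src = program_call("draw_map", model=model)
    areas = []
    for region_id, coords in regions.items():
        points = ",".join(str(c) for c in coords)
        target = program_call("region_graph", model=model, region=region_id)
        areas.append(
            f'<area shape="poly" coords="{points}" href="{escape(target)}">'
        )
    paragraphs = "\n".join(f"<p>{escape(c)}</p>" for c in comments)
    return (
        "<html><body>\n"
        f'<img src="{escape(img_src)}" usemap="#regions" alt="Risk map">\n'
        '<map name="regions">\n' + "\n".join(areas) + "\n</map>\n"
        + paragraphs + "\n</body></html>"
    )


page = build_page(
    "septoria_risk",
    {"101": [10, 10, 40, 10, 25, 35]},  # made-up polygon for one region
    ["Example expert comment."],
)
```

Deferring the map drawing to the image request lets the text of the page reach the browser before the model has run, at the cost of a second round trip to the server.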
3. User acceptance
3.1. Methods
The use of Pl@nteInfo® was analysed for the 3-month period from 6 May to 5 August 1998 (both dates inclusive). The Pl@nteInfo® server at DIAS hosts most facilities in Pl@nteInfo®. Facilities hosted by servers other than the Pl@nteInfo® server, mainly by the server at DAAC, were not included in the analysis.
Different information sources were used:
• The web server log, which logs data for all requests to the web server.
• The subscriber log, which logs data for log-in of subscribers, i.e. time of log-in and subscriber ID.
• The subscriber database with background information about subscribers, e.g. subscription type.
Different measures of the use of Pl@nteInfo® were applied:
• number of requests (hits);
• number of visits;
• number of log-ins.
The number of hits of a web site is a commonly used measure of visitor access. It is not a good measure, however, since it often encompasses all requests, including images, icons and unsuccessful transfers. Another measure of the use of a web site is the number of visits. Unfortunately, this measure cannot be determined exactly from the web server log, since the web server does not record information about the visitor, but about the host server used by the visitor. Different visitors may be served by the same host server, in which case they cannot be distinguished in the web server log. Obviously, when log-in sessions are analysed, information about the visitor is available, but the analysis is restricted to requests made by subscribers after log-in. An estimate of the number of visits was made by defining a visit as a series of requests from the same host server. After 60 min without requests from the host server, the visit was considered completed. The limitations of this estimate of visits were demonstrated when the web server log and the subscriber log-in log were merged, since this revealed cases of multiple subscriber log-ins within the same visit.
One example comprised 34 log-ins from 21 different subscribers within the same visit. The explanation was that the 21 subscribers were all advisers using the same Internet access supplier. Therefore, they were recorded as the same client in the web server log, and since no 60-min break occurred between requests from this client, the 34 log-in sessions were considered as one visit almost 9 h long. This could only be revealed due to the subscription system, demonstrating the danger of relying too much on the web server log alone.
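The visit estimate defined above (a series of requests from the same host, closed after 60 min without requests) can be sketched in Python. The log format and host names are assumptions made for the example, and the sketch treats a gap of exactly 60 min as closing the visit, which the article leaves unspecified.

```python
from datetime import datetime, timedelta

VISIT_TIMEOUT = timedelta(minutes=60)


def count_visits(requests):
    """Count visits in an iterable of (host, timestamp) log entries.

    A new visit starts when a host is seen for the first time or when at
    least 60 min have passed since its previous request.
    """
    last_seen = {}
    visits = 0
    for host, stamp in sorted(requests, key=lambda entry: entry[1]):
        previous = last_seen.get(host)
        if previous is None or stamp - previous >= VISIT_TIMEOUT:
            visits += 1
        last_seen[host] = stamp
    return visits


# Made-up log entries: several people behind one proxy look like one host.
day = datetime(1998, 6, 1)
log = [
    ("proxy.example", day.replace(hour=8, minute=0)),
    ("proxy.example", day.replace(hour=8, minute=30)),
    ("proxy.example", day.replace(hour=9, minute=15)),
    ("farm.example", day.replace(hour=8, minute=10)),
    ("proxy.example", day.replace(hour=11, minute=0)),
]
visits = count_visits(log)
```

Because the key is the host rather than the person, the first three proxy requests collapse into one visit no matter how many subscribers made them, which is the limitation the merged logs exposed.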
Table 3
Geographical distribution of visits

Region                Number of visits  Visits with known origin (%)
Denmark               16 855            91.2
Rest of Scandinavia   1441              7.8
Rest of Europe        169               0.9
Rest of World         26                0.1
com, org, edu, net    1629              –
Unknown               2909              –
Total                 23 029            100.0
3.2. General data for the use of Pl@nteInfo®
According to the web server log, Pl@nteInfo® had 617 340 external hits (hits from developers not included) in the time period, corresponding to an average of 6710 hits per day. Using the measure of a visit already defined, Pl@nteInfo® had 23 029 visits in the period, or 250 visits on average per day. This is a marked increase from the previous seasons, since Pl@nteInfo® had an average of 25 and 125 visits per day in 1996 and 1997, respectively. It should be noted, however, that the definition of a visit and the time period of the calculations have not been exactly the same for the three seasons. To analyse the geographical distribution of the users of Pl@nteInfo®, the number of visits is the best measure, since almost all subscribers are Danish. The distribution of visits across geographical regions is shown in Table 3. Table 3 shows that Pl@nteInfo® is a very local service, with 91% of visits with known geographical origin being domestic. If the number of requests is considered instead of the number of visits, the result is even more explicit, since 96% of requests from known geographical origin are from Denmark.
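The daily averages and the regional shares quoted above follow directly from the reported totals. The short Python check below recomputes them from the figures in the text and in Table 3. The variable names are chosen for the example only.

```python
from datetime import date

# Investigation period: 6 May to 5 August 1998, both dates inclusive.
period_days = (date(1998, 8, 5) - date(1998, 5, 6)).days + 1

hits = 617_340
visits = 23_029
hits_per_day = hits / period_days
visits_per_day = visits / period_days

# Visits with known geographical origin, from Table 3.
known = {
    "Denmark": 16_855,
    "Rest of Scandinavia": 1_441,
    "Rest of Europe": 169,
    "Rest of World": 26,
}
known_total = sum(known.values())
shares = {region: 100 * n / known_total for region, n in known.items()}

for region, share in shares.items():
    print(f"{region}: {share:.1f}%")
```

The percentages are taken over the 18 491 visits with a known origin, not over all 23 029 visits, which is why the generic-domain and unknown rows have no percentage in Table 3.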
3.3. Subscriber activity patterns
Within the investigation period, Pl@nteInfo® had 877 active subscribers (i.e. they had at least one log-in within the investigation period). During the 92-day period, 11 294 log-ins were recorded, or 123 per day on average. Of these, 9982 (88%) log-ins could be identified in the web server log, and 285 735 requests (46% of all requests in the time period) were served to subscribers after log-in. Table 4 shows information about the subscribers and their log-ins. It follows that, on average, subscribers had 11.4 log-in sessions in Pl@nteInfo®, with an average duration of 13 min (796 s) and with an average time interval between log-ins of 2.9 days. There seems to be a distinctive difference in activity pattern between the three subscriber types. In order to verify this, a statistical analysis was performed.
Table 4
Data for log-ins averaged by subscription type (averages, with standard deviations in parentheses)

Subscription  Active       Log-ins  Number of    Duration of   Time between
type          subscribers  (total)  log-ins      log-ins (s)   log-ins (days)
Farmer        315          5603     17.8 (20.3)  694 (1111)    2.8 (5.3)
Adviser       58           2387     41.2 (38.3)  1106 (1652)   1.6 (3.4)
Guest         504          1992     4.0 (5.7)    711 (1098)    5.4 (9.2)
Total         877          9982     11.4 (19.2)  796 (1271)    2.9 (5.9)
For two of the variables, the duration of log-ins and the number of log-ins by a subscriber, the data fitted a model where the logarithm of the variable follows the normal distribution. Given this model, the mean as well as the variance of the distribution was different for the three subscriber types (P < 0.0001). The third variable, the time since the previous log-in, did not fit this model satisfactorily. Therefore, non-parametric methods, where no assumptions about model distributions are required, were considered. Owing to the large amount of data, the use of non-parametric methods was considered safe. The non-parametric methods Wilcoxon, Median and Van der Waerden Scores (SAS Institute Inc., 1996) all showed that the level of observed time since previous log-in is significantly different for the three subscription types. The distributions for the subscription types were compared both two by two and all together, and the test probabilities were all less than 0.01, and normally less than 0.0001. Finally, the first two variables were likewise tested with the non-parametric methods. The results were similar, so the number of log-ins, the duration of log-ins and the time between log-ins are significantly different between subscription types.
Hence, it is significant that the average guest subscriber has far fewer log-in sessions than the paying subscribers do, and with longer intervals between log-ins. The average duration of log-in sessions for guest subscribers is close to the average for all subscribers. The average farmer subscriber has shorter log-in sessions than the other subscriber types, but a medium number of log-in sessions with a medium time interval between them. The average adviser is the most active of the three types, with more log-in sessions, longer sessions and shorter time intervals between sessions.
The number of log-ins for the average adviser is more than double that for the average farmer and more than 10 times higher than that for the average guest.
Fig. 3 shows the distribution of log-ins by the day of week. For all subscriber types, Monday–Friday are more or less equally active, with Tuesday being slightly the most active day of the week. There is a difference between the subscriber types in the activity level during weekends. Farmers are almost as active at weekends as on weekdays, with 21% of their log-ins, whereas advisers are almost inactive, with only 4% of their log-ins. Guests show almost the same pattern as farmers with 16% of their log-ins at weekends. To test whether the distributions of log-in times were different for the subscriber types, a two-way contingency table with number of log-ins on the various weekdays by the various subscriber types was made. The hypothesis that the weekday distribution is independent of subscription type was rejected by a standard chi-square test (P < 0.0001), both when the subscription types were compared two by two, and when they were compared all together. This means that the patterns for log-in over the week shown in Fig. 3 are significantly different for the three subscription types.
Fig. 4 shows how the activity pattern differs between the subscription types over the hour of day. The most remarkable result is the peak of activity among advisers between 8:00 and 9:00 h. The advisers have more than 20% of their log-ins during this hour. With 84% of their log-ins between 8:00 and 16:00 h, the advisers are mainly active within normal working hours. The activity patterns of the farmers and the guests are quite similar to each other. They are more evenly distributed
over the day, with a peak in the morning and another in the evening, apparently with the farmers being a bit earlier than the guests are. The evening peaks can be explained by a reduction in rates of telephone calls. The timing of the morning peak is surprising, since weather data and, hence model results based on weather data, are normally not updated until 9:00 h in Pl@nteInfo®.
Fig. 3. Time distribution of log-ins over weekdays for each subscription group.
Fig. 4. The distribution of log-ins over hour of day for each subscription group.
As for the distribution of log-ins over weekdays, a standard χ²-test showed that the distribution of log-ins over the hour of day is significantly different for the three subscription types (P < 0.0001).
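The tabulation behind Figs. 3 and 4 can be illustrated with a minimal Python sketch. It assumes that each log-in is available as a pair of subscriber type and timestamp; the sample records, the function names and the record layout are hypothetical and are not taken from the Pl@nteInfo® logs.

```python
from collections import Counter
from datetime import datetime

# Hypothetical log-in records (subscriber type, timestamp); not real Pl@nteInfo data.
LOGINS = [
    ("farmer", datetime(1998, 6, 1, 7, 30)),   # Monday
    ("farmer", datetime(1998, 6, 6, 20, 15)),  # Saturday
    ("farmer", datetime(1998, 6, 7, 6, 45)),   # Sunday
    ("adviser", datetime(1998, 6, 2, 8, 10)),  # Tuesday
    ("adviser", datetime(1998, 6, 3, 8, 40)),  # Wednesday
    ("adviser", datetime(1998, 6, 4, 13, 5)),  # Thursday
    ("guest", datetime(1998, 6, 5, 21, 0)),    # Friday
    ("guest", datetime(1998, 6, 6, 9, 20)),    # Saturday
]


def contingency(logins, key):
    """Count log-ins per (subscriber type, key(timestamp)) cell."""
    table = Counter()
    for subscriber_type, stamp in logins:
        table[(subscriber_type, key(stamp))] += 1
    return table


def share(logins, subscriber_type, predicate):
    """Fraction of one subscriber type's log-ins that satisfy the predicate."""
    stamps = [stamp for kind, stamp in logins if kind == subscriber_type]
    return sum(1 for stamp in stamps if predicate(stamp)) / len(stamps)


# Two-way tables: weekday (0 = Monday ... 6 = Sunday) and hour of day.
by_weekday = contingency(LOGINS, lambda stamp: stamp.weekday())
by_hour = contingency(LOGINS, lambda stamp: stamp.hour)

# Shares comparable to the weekend and working-hours percentages in the text.
weekend_share_farmer = share(LOGINS, "farmer", lambda s: s.weekday() >= 5)
office_share_adviser = share(LOGINS, "adviser", lambda s: 8 <= s.hour < 16)
```

The resulting cell counts are the input that a test of independence, such as the chi-square test used here, would operate on.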
3.4. Subscriber preferences
An analysis of the popularity of subjects in terms of number of requests was performed. A comparison between the subjects within a limited time period is not completely fair, because the topicality of the subjects varies over the season. The differences in access are another reason to be careful with comparisons between subjects: if the number of requests for a subject with public access is low among subscribers, it may be because they access the subject before log-in. However, the differences between subscriber types in popularity for a given subject can be compared safely. The analysis comprised 82 285 requests from the web server log, which were primary requests in relation to the subjects in Pl@nteInfo®. The remaining portion of the 617 340 requests in the web server log was mainly related to the menu, the opening page, graphics and icons. The analysis revealed that the linking of the web server log and the subscriber log is quite good, although not perfect, since 226 (0.8%) requests were associated with a subscriber type without access to the subject of the request. The low frequency of detected errors in the identification was obtained by using a careful merging method, where requests were not associated with the assumed subscriber in cases of doubt. Hence, the subscriber was only identified for 88% of the requests requiring log-in. Table 5 shows that the popularity of the subjects depends on subscriber type. Among farmer subscribers, the irrigation category is clearly the most popular with 35.2% of all requests from farmer subscribers. Among advisers, crop pests and diseases exceeds the irrigation category in popularity, while the most popular category for guests is the subscription category, where information for personalisation is given.
The high popularity of the subscription category, together with the low average number of log-ins for guests (Table 4), could indicate that many users establish a guest subscription to investigate the possibilities and then either change to a paid subscription or stop being active. Table 5 is constructed from a two-way contingency table with the number of requests classified by subject category and subscriber type. A chi-square test rejected the hypothesis of independence of the classification criteria (P < 0.001), both when the subscriber types were compared two by two and when they were compared all together. Hence, the preferences of the subscriber types are statistically different in Table 5. When the popularity of individual subjects is analysed, the differences between subscriber types become more evident. Table 6 shows the subjects which are among the five most popular within one of the subscriber types. For example, the most popular subject among farmer subscribers is the web page from the irrigation DSS with calculated irrigation needs for all the fields of the subscriber. This subject, with ranking 1 in the column for farmer subscribers in Table 6, comprised 8.5% of all requests by farmer subscribers.
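The cautious linking of the web server log with the subscriber log mentioned in this section can be illustrated as follows. The article does not spell out the merging rule, so the rule below is an assumption made purely for illustration: a request is attributed to a subscriber type only when exactly one log-in session from the same host covers the time of the request, and it is left unidentified otherwise. The `Session` and `Request` records, their fields and the host names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class Session:
    """One log-in session from the subscriber log (hypothetical layout)."""
    subscriber_type: str
    host: str
    start: datetime
    end: datetime


@dataclass(frozen=True)
class Request:
    """One primary request from the web server log (hypothetical layout)."""
    host: str
    time: datetime
    path: str


def identify(request: Request, sessions: List[Session]) -> Optional[str]:
    """Return the subscriber type, or None in cases of doubt (assumed rule)."""
    matches = [
        s for s in sessions
        if s.host == request.host and s.start <= request.time <= s.end
    ]
    # Zero matches: nobody was logged in from this host.
    # Several matches: e.g. a shared proxy host, so the request stays unidentified.
    return matches[0].subscriber_type if len(matches) == 1 else None


def at(hour: int, minute: int) -> datetime:
    return datetime(1998, 6, 2, hour, minute)


SESSIONS = [
    Session("farmer", "a.example", at(8, 0), at(8, 30)),
    Session("adviser", "proxy.example", at(8, 0), at(9, 0)),
    Session("guest", "proxy.example", at(8, 30), at(9, 15)),
]

unique = identify(Request("a.example", at(8, 10), "/irrigation"), SESSIONS)
early = identify(Request("proxy.example", at(8, 15), "/weather"), SESSIONS)
doubtful = identify(Request("proxy.example", at(8, 45), "/weather"), SESSIONS)
unknown = identify(Request("b.example", at(8, 10), "/help"), SESSIONS)
```

A conservative rule of this kind trades coverage for accuracy, which is consistent with the observation that only part of the requests requiring log-in could be identified.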
Table 5
Distribution within subscriber types of requests divided into subject categories (%)

Subject category          Farmer  Adviser   Guest  Unidentified   Total
Subscription                11.7     16.0    33.6           9.4    13.5
Crops and cultivation        0.7      1.5     2.7           1.4     1.3
Crop pests and diseases     11.5     26.0    20.9          31.9    22.3
Irrigation                  35.2     21.7     6.8          11.8    21.7
Weather data                17.5     10.7     4.8           6.6    11.2
The weather                 15.8     12.4    13.2          17.1    15.5
About Pl@nteInfo®            1.9      4.0     5.5           8.2     4.9
Help                         5.7      7.7    12.5          13.8     9.7
Total                      100.0    100.0   100.0         100.0   100.0
Total requests            30 720   13 306    7610        30 649  82 285
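The test of independence applied to Table 5 can be sketched in Python. The sketch rebuilds approximate request counts from the published percentages and totals for the three subscriber types, so the counts are subject to rounding and are not the original data. The function names are illustrative, and the critical value 36.123 is the tabulated chi-square quantile for 14 degrees of freedom at the 0.001 level.

```python
# Percentages from Table 5 for (farmer, adviser, guest) subscribers.
PERCENT = {
    "Subscription": (11.7, 16.0, 33.6),
    "Crops and cultivation": (0.7, 1.5, 2.7),
    "Crop pests and diseases": (11.5, 26.0, 20.9),
    "Irrigation": (35.2, 21.7, 6.8),
    "Weather data": (17.5, 10.7, 4.8),
    "The weather": (15.8, 12.4, 13.2),
    "About Pl@nteInfo": (1.9, 4.0, 5.5),
    "Help": (5.7, 7.7, 12.5),
}
TOTALS = (30720, 13306, 7610)  # total requests per subscriber type


def approximate_counts(percent, totals):
    """Rebuild request counts from percentages (approximate, due to rounding)."""
    return [
        [round(p * total / 100) for p, total in zip(row, totals)]
        for row in percent.values()
    ]


def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for a two-way table."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    n = sum(row_sums)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_sums[i] * col_sums[j] / n
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


counts = approximate_counts(PERCENT, TOTALS)
statistic, dof = chi_square_independence(counts)

CRITICAL_0_001_DF14 = 36.123  # tabulated chi-square quantile, df = 14, alpha = 0.001
reject_independence = statistic > CRITICAL_0_001_DF14
print(f"chi-square = {statistic:.0f}, df = {dof}, reject: {reject_independence}")
```

With the reconstructed counts the statistic lies far above the critical value, in line with the reported P < 0.001.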
For the guest subscribers, the five most popular subjects are related to subscription, help, and a demo, i.e. not directly applicable agricultural information. The adviser subscribers also use the help documents, as well as weather information and decision support for two diseases. The priority of the farmer subscribers is rather different from this, even though the ranking for weather information is also high among farmers. The farmers prefer the irrigation decision support and use the help documents as much as the other types. Most remarkable, however, is the low interest in decision support for diseases and pests among farmer subscribers, at least after log-in. Table 7 shows the distribution of requests for decision support subjects within the subscription groups. Like Table 6, Table 7 shows the popularity of the system for irrigation needs among farmers, and the higher popularity of subjects for pest and disease risk among advisers. Table 7 was tested for independence of subscriber type using the same methods as for Table 5. The results were also similar, leading to the conclusion that the preferences for decision support facilities are statistically different between the subscriber types (P < 0.001). Relative to the time period of the analysis, prediction of corn yield is quite early and calculation of nitrogen fertilisation need is quite late, which may explain the low request numbers for these facilities. Within the time period, the 172 requests for nitrogen fertilisation needs led to calculations for 121 fields, which should be compared with the total of 2651 calculated fields for 1998.
Table 6
Subjects with a top-five ranking within one of the subscriber types (a)

Subject                                      Farmer        Adviser       Guest         Total
                                             Rank     %    Rank     %    Rank     %    Rank     %
Help documents                                  9   4.5       2   6.9       2  11.1       1   8.1
Demo of weather radar animation                10   4.2       9   3.7       3   9.9       2   7.4
Risk of Potato Late Blight                     12   3.7       1   6.9      10   3.7       3   5.8
Introduction to weather data subjects           3   5.9       4   4.6       8   4.2       4   5.3
Risk of Septoria spp.                          20   1.9       3   4.8       6   5.9       5   4.6
Geographical information about subscriber       5   5.5      11   3.1      11   2.8       7   4.4
Irrigation need for subscriber’s fields         1   8.5       7   3.9       –     –       8   4.3
Weather radar animation                         2   6.4       5   4.3       –     –      11   3.5
Field-specific irrigation need                  4   5.5      12   2.9       –     –      12   2.8
Selection of news services from advisers       19   2.0       6   4.1       4   7.1      13   2.4
Personal information about subscriber          18   2.1      31   1.0       5   6.5      21   1.9

(a) Ranking and percentage of all requests within the subscriber type is given for each subject.
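The selection rule behind Table 6 (rank the subjects within each subscriber type, then keep every subject that reaches the top five for at least one type) can be sketched as below. The request counts and the shortened subject names are invented for illustration and do not reproduce the figures in the table.

```python
# Hypothetical request counts per subscriber type and subject (not the paper's data).
REQUESTS = {
    "farmer": {
        "irrigation": 50, "radar": 40, "weather intro": 30, "help": 20,
        "geography": 10, "demo": 5, "news": 2, "nitrogen": 1,
    },
    "guest": {
        "demo": 30, "help": 25, "news": 20, "weather intro": 10,
        "geography": 8, "nitrogen": 1,
    },
}


def rank_subjects(counts):
    """Map each subject to (rank, percentage of requests) within one subscriber type."""
    total = sum(counts.values())
    # Sort by descending count; ties are broken alphabetically for determinism.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return {
        subject: (rank, 100 * n / total)
        for rank, (subject, n) in enumerate(ordered, start=1)
    }


def top_five_in_any_type(requests):
    """Rank within each type and collect subjects ranked 1 to 5 in at least one type."""
    ranked = {kind: rank_subjects(counts) for kind, counts in requests.items()}
    selected = {
        subject
        for ranks in ranked.values()
        for subject, (rank, _) in ranks.items()
        if rank <= 5
    }
    return ranked, selected


ranked, selected = top_five_in_any_type(REQUESTS)
for subject in sorted(selected):
    farmer = ranked["farmer"].get(subject)
    guest = ranked["guest"].get(subject)  # None where the type has no requests
    print(subject, farmer, guest)
```

A subject without requests from a subscriber type, for example because that type has no access to it, simply has no rank for that type, which corresponds to the dashes in the guest column.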
Pl@nteInfo® has two examples of recommendation systems, giving field-specific advice on the dose of fungicide treatments. The recommendation system for eyespot is not personalised, while for Septoria spp., the system uses local weather data (own observations when available, otherwise interpolated). The use of the recommendation systems is somewhat disappointing, according to Table 7. One reason for the extremely low use of the system for eyespot is that there was no access to the system from the menu. In addition, experts assessed the need for treatment of eyespot to be very low in 1998, based on winter and spring climate and on early field recordings.
Table 7
The distribution of requests on decision support subjects (%)

Decision support subject                 Farmer  Adviser   Guest  Unidentified   Total
Risk of frit fly                            2.2     10.2     7.2          10.4     7.8
Risk of brassica pod midge                  4.3      9.2    11.7          12.8     9.5
Risk of Septoria spp.                      10.4     18.2    35.0          25.7    20.5
Risk of leaf/net blotch, winter barley      3.7      9.8    14.3          12.5     9.5
Risk of leaf/net blotch, spring barley      2.6      3.9     5.6           4.2     3.8
Risk of potato late blight                 20.7     26.1    22.1          29.6    25.8
Recommendation for Septoria                 6.7      3.5       –             –     2.6
Recommendation for eyespot                  0.1      0.0     0.3           0.3     0.2
Prediction of corn yield                    1.2      3.3     3.0           3.0     2.5
Nitrogen fertilisation                      0.3      0.9     0.7           1.4     0.9
Irrigation                                 47.8     14.9       –             –    16.9
Total                                     100.0    100.0   100.0         100.0   100.0
Total requests                             5472     3518    1283          8337  18 610

4. Discussion
In order to discuss how the experience gained from developing Pl@nteInfo® can be used generally, it is important to remember the objectives listed in Section 2.1, since these objectives are quite different and sometimes contradicting. According to
the objectives, Pl@nteInfo® was developed at the same time as an example to illustrate research results, as a framework to visualise the formalised knowledge within the domain of crop production, and as a tool for agricultural decision support.
Considering Pl@nteInfo® as an informatics research example, the system has succeeded in demonstrating that information from different sources can be collected, processed and integrated into the same information unit (web page) in real time. Since some of the information components (e.g. the graphical map and the expert interpretation of it) are directly related to each other and not available separately, the integration of the information components increases the value of the information for the end user. The example is technically quite simple, since all information components used for the integration are, although independently updated from the sources, readily available on the same server at the time of web page generation. Future development of Pl@nteInfo® will investigate the integration of information components from different servers, or from the user’s PC. Future development will also increase the value of information by combining the information from different information components. For example, predictions of the risk of diseases and pests can be improved by including information from field recordings of the diseases and pests.
Considering Pl@nteInfo® as a framework for agricultural decision models, the system has helped to focus on the existing knowledge within the domain, to pinpoint deficiencies and limitations, and thereby to direct the research into improved knowledge and better decision models. The decision models applied in Pl@nteInfo® are generally very simple, and research projects are planned to make some of the models more sophisticated and to develop new models, for example using probabilistic modelling methods.
As a framework, Pl@nteInfo® is prepared to include new models, hence promoting a fast and secure dissemination of research results to the end user.
Considering Pl@nteInfo® as a tool for agricultural decision support, the routine tasks of maintaining, updating and servicing an applied system have sometimes conflicted with the research tasks. Often, the development of facilities in Pl@nteInfo® has been driven more by the need to illustrate new research results than by the needs of the users. This conflict between the roles of the developers as researchers was pointed out in a master’s thesis analysing the human–computer interaction for Pl@nteInfo® (Raunow and Jensen, 1997). Procedures for daily data updating, system maintenance and user contact by non-researchers have been implemented and are under continuous improvement. Pl@nteInfo® as an applied system has given a direct contact between researchers and end users, which has been very advantageous. Often, questions or comments from users have revealed insufficient information on the web pages, or even errors in the programs. Because updates to a centralised system are fast and take effect immediately, the communication with users often resulted in fast improvements of early versions of facilities, together with a feeling of influence among users.
An increased number of news services in Pl@nteInfo® in the future will also facilitate improved contact between advisers and farmers. Pl@nteInfo® can be used to organise networks consisting of a news service and its readers, and with this infrastructure available, the direct electronic communication between advisers/experts and farmers/customers, either one-to-one or one-to-many, is simple. For example, an adviser with specialised knowledge of an unusual crop can disseminate information directly to farmers growing this crop, and the farmers can exchange experience within the group. This development has commenced already, since in 1998, news services for potato growing and for bee keeping were established. The analysis of the user acceptance gave valuable information about preferences and activity patterns of the users. Among the results were also surprises which will have to be analysed further, e.g. with a questionnaire, and which may affect the further development of Pl@nteInfo®. One experience of general interest from analysing the user acceptance of Pl@nteInfo® has been that, even though the standard web log contains enormous amounts of data (here, more than 1 MB per day), the level of information to be gained from it is very low. This is because the only information about the user in the web log is the host name of the server to which he/she is connected by the Internet access supplier. The subscription system gave an opportunity to retrieve more detailed information about how the subscribers use the system. In the future, Pl@nteInfo® will make a more targeted logging of the use of the system in order to gain even more detailed information. The analysis of the web log together with the subscription log gave a rather detailed impression of the three subscriber types and their preferences in Pl@nteInfo®. Statistical tests showed clearly significant differences between the subscriber types. In the following, a profile of each subscriber type is given.
The profiles will be used to guide the development of Pl@nteInfo®.
4.1. Guest
The average guest subscriber is not very dedicated. He/she has few, relatively short log-ins. The number of days between log-ins is higher than for the other subscriber types, yet still quite low. This indicates a rather short period with relatively high activity. The most popular subjects among guest subscribers are about subscription, which is not surprising, since these subjects are mainly what the guest subscription gives access to. All this indicates that the average guest subscriber is indeed a guest, testing the system. He/she could be a farmer considering a farmer subscription or just an interested Internet surfer. Even though the profile of the average guest subscriber fits the intention of the subscription type, the analysis raises the question of whether the guest subscriber is told clearly enough which information and facilities a guest subscription does not give access to.
4.2. Farmer
The average farmer subscriber is quite dedicated. With respect to the weekly and diurnal activity patterns, he/she is quite similar to the average guest subscriber. The high proportion of log-ins during weekends among farmers and guests demonstrates the importance of daily updating of the system. The average farmer has much more frequent, although slightly shorter, log-ins than the guest. The subjects most often selected by farmers are different from those of the other subscriber types. For example, the number of requests for information about diseases and pests is quite low. It cannot be inferred from this that the farmer subscriber does not want this information, since it does not require log-in in general. One of the subjects that the farmer subscription gives access to is the irrigation DSS, and this is the most popular subject for the average farmer subscriber. It is an experience of general interest that the farmers have adopted the irrigation DSS, since it is by far the most complex decision model in Pl@nteInfo®. For the personalisation of information, it contains several screens for first-time establishment of fields, crops and source of precipitation data, together with screens for reporting of crop development stages and applied irrigations. The analysis shows that farmer subscribers are not averse to using complex decision models if they find the trouble worthwhile, and the user feedback gives no indication that they are unable to use them correctly.
4.3. Adviser
The average adviser subscriber is very dedicated. He/she has relatively long and frequent log-ins, and seems to have more fixed routines for information gathering in Pl@nteInfo® than the other subscriber types. The log-ins are almost entirely within normal working hours on working days, with the hour from 8:00 to 9:00 h being a distinctive peak. The average adviser uses Pl@nteInfo® more to keep his/her professional information about crops, diseases, pests and weather up to date than to obtain specific advice. Only seven out of 59 adviser subscribers have decided to establish a news service in Pl@nteInfo®. A planned questionnaire among subscribers will investigate why they seem to refrain from using their subscription fully.
5. Conclusion
This article has discussed the conceptualisation, design, development and implementation of a collaborative decision support system using distributed World Wide Web based architecture. The objectives for developing Pl@nteInfo® have been met so far, as described in the article.
1. Pl@nteInfo® illustrates how information system components distributed across heterogeneous platforms in different organisations can be assembled and transformed into a universally accessible DSS application, giving personalised advice to farmers and advisers in real time.
2. Most of the relevant and available decision models, which are applied under practical conditions in Denmark, have been implemented in Pl@nteInfo®. Thereby, focus has been given to the limitations of these models and to areas with incomplete knowledge. Pl@nteInfo® serves as a framework for future web-based decision models, offering a common user interface, common input data and the possibility to apply the same data in several applications. The framework will ensure fast dissemination of research results to the end users.
3. The analysis of the user acceptance of Pl@nteInfo® has shown that the system offers valuable information and decision support for farmers and advisers. Both farmer and adviser subscribers are generally dedicated users with frequent log-ins. The two subscriber types have significantly different activity patterns and preferences. The farmers use the system for specific advice, and the advisers use it to keep their professional knowledge updated. The analysis also showed that farmers are capable of using even quite complex decision support systems, like the personal irrigation management system.
Pl@nteInfo® is a continuously evolving web-based DSS. In the future development of the system, the decision models will be distributed across organisations. For example, the plant pathologists responsible for the disease risk models will be able to maintain their models without central administration. Pl@nteInfo® will also be improved for even more personalised information through stored user profiles at the server side and allowed access to production data at the client side. Finally, additional decision support models are under development. Pl@nteInfo® is becoming increasingly popular among farmers and agricultural advisers, as Internet access is becoming more common among farmers. Understanding and adoption of web-based applications will become imperative both at farm level and in scientific premises.
All this holds great promise for a new and more efficient way of developing models and decision support tools for the food and agricultural systems.
Acknowledgements
This work has been sponsored in part through Dina, Danish Informatics Network in the Agricultural Sciences. The authors wish to thank Asger Roer Pedersen, Biometry Unit, DIAS, for support with the statistical analysis.
References
Anonymous, 1997. Fertilisation According to the Nitrogen Mineralisation Method. Internal Report. Danish Agricultural Advisory Centre, 14 pp. (in Danish).
Brown, D.M., 1969. Heat Units for Corn in Southern Ontario. Factsheet, AGDEX 111/31. Ontario Ministry of Agriculture and Food, Ontario.
Hansen, L.M., 1994. Brassica pod midge in winter oilseed rape – strategy for control. Dan. Inst. Plant Soil Sci. 7, 167–173.
Hansen, J.G., 1995. Meteorological dataflow and management for potato late blight forecasting in Denmark. Dan. Inst. Plant Soil Sci. 10, 57–63.
Hansen, J.G., Secher, B.J.M., Jørgensen, L.N., Welling, B., 1994. Thresholds for control of Septoria spp. in winter wheat. Plant Pathol. 43, 183–189.
Jensen, A.L., Thysen, I., Secher, B.J.M., 1997. Decision support in crop production via the Internet. Petria 7 (Suppl. 1), 147–154.
Jørgensen, L.N., Secher, B.J.M., Schultz, H., Elkjær, K., 1996. Eyespot – status on thresholds, distribution and control. Dan. Inst. Plant Soil Sci. 4, 167–184 (English summary).
Lindblad, M., Sigvald, O., 1996. A degree-day model for regional prediction of first occurrence of frit flies in oats in Sweden. Crop Protection 15, 559–565.
Nielsen, G.C., 1998. Recommended Treatment Thresholds 1998. Newsletter Number 15-054, Danish Agricultural Advisory Centre to Crop Advisers, 3 pp. (in Danish).
Plauborg, F., Olesen, J.E., 1995. MVTOOL version 1.10 for developing MARKVAND. Dan. Inst. Plant Soil Sci., Report No. 27, 64 pp.
Raunow, R., Jensen, H.B., 1997. Human Computer Interface Evaluation of PlanteInfo. Master’s thesis, Computer Science. University of Aarhus, Denmark, 151 pp. (in Danish).
SAS Institute Inc., 1996. SAS/STAT™ Software: Changes and Enhancements through Release 6.11. SAS Institute Inc., Cary, NC.
SAS Institute Inc., 1998. SAS/IntrNet™ Software: Delivering Web Solutions. SAS Institute Inc., Cary, NC, 40 pp.
Secher, B.J.M., Jørgensen, L.N., 1997. Control of net blotch (Pyrenophora teres) and adjustments in PC-Plant Protection 1997. Dan. Inst. Plant Soil Sci. 8, 73–82 (English summary).
Secher, B.J.M., Welling, B., Jørgensen, L.N., Hansen, J.G., 1995. The role of precipitation for the development of Septoria tritici in winter wheat. Dan. Inst. Plant Soil Sci. 4, 139–148 (English summary).