Technological Forecasting & Social Change 73 (2006) 679–704
Eliciting experts' knowledge: A comparison of two methods☆

Fabiana Scapolo a,*, Ian Miles b

a DG JRC-IPTS, Edificio Expo, C/Inca Garcilaso, E-41092 Sevilla, Spain
b Manchester Business School, The University of Manchester, UK
Received 13 May 2004; received in revised form 27 February 2006; accepted 2 March 2006
Abstract

This paper reports on a detailed comparison of the practical application of two well-known forecasting methods—a surprisingly rare exercise. Delphi and cross-impact analyses are among the best-known methods that apply quantitative approaches to derive forecasts from expert opinion. Despite their prominence, there is a marked shortage of clear guidance as to when and where (and how) particular methods can be useful, or as to what their costs and benefits are. This study applied the two methods to the same area, future European transport systems, using the same expert knowledge base. The results of the implementation of the two techniques were assessed and evaluated, in part through two evaluation questionnaires completed by the experts who participated in the study. This paper describes these encounters with methodology and evaluation, presents illustrative results of the forecasting study, and draws lessons as to good practice in use of these specific methods, as well as concerning methodological good practice in general—for example, stressing the need for systematic documentation, and the scope for debate about established practices.

© 2006 Elsevier Inc. All rights reserved.

Keywords: Cross-impact; Delphi; Forecasting; Transport
☆ The views expressed in this article are those of the author only and may not in any circumstances be regarded as stating an official position of the European Commission.
* Corresponding author. E-mail addresses: [email protected] (F. Scapolo), [email protected] (I. Miles).
0040-1625/$ - see front matter © 2006 Elsevier Inc. All rights reserved. doi:10.1016/j.techfore.2006.03.001

1. Introduction

Expert opinion is often taken into consideration in policy making. Expert views can give policy makers added information and insight into the fields where they lack sufficient knowledge to
comprehend complex issues. This is especially (but by no means exclusively) the case in areas of science and technology (S&T), where policy makers are highly dependent on the quality and reliability of the information they have at their disposal. They require the advice and guidance of experts, often relying upon particular experts to tell them about what expert communities think. However, politicians are liable to be over-reliant on a few 'pet' experts, with whom incestuous relationships can be established. This was graphically clear in the relations between politicians and intelligence sources concerning Iraq's supposed Weapons of Mass Destruction capabilities in the run up to the second Gulf war.

Technological forecasting is often undertaken to produce information on the supply of and demand for new technologies, in the context of evolving capabilities and social and physical contexts. Expert opinion can be solicited by including the experts in meetings or by interviewing them; in-depth interviews are particularly valuable. But such techniques are restricted to the inputs of relatively few people, and there are opportunities for the loudest or most prestigious voices (which are not necessarily the most informed ones) to dominate affairs—though new Information Technology (IT) assisted methods of group interaction allow for a great many more views to be generated and reviewed than traditional methods permit. (There have been problems of information overload when it comes to the use of large volumes of qualitative information captured in these ways, though new text mining techniques may help matters here.) Techniques such as the Delphi method were developed to circumvent such problems. These techniques have been developed with the aim of sampling the opinion of fairly large numbers of experts, and avoiding dominance by particularly assertive individuals.
In the past decades, the literature on using such methodologies has proliferated [1],1 though—as we shall see—there are many gaps in these descriptions. But there is a lack of evidence-based advice on how to make an informed choice as to which technique is more appropriate to use when undertaking technological forecasting. Different forecasting methods can be expected to yield somewhat different outputs, even when they are drawing on evidence and perspectives that should be yielding broadly similar results. We would hope at a minimum that the results will not be contradictory, when they draw on similar perspectives and differ only in methodological assumptions, though even this may be in question (for example if some approaches place more emphasis on feedback loops). Different methods will require different inputs—varying in terms of data requirements, forecaster time and skills, technology support, etc. With a great deal at stake in the decision to undertake one or other method, what are useful criteria to help us select among methods?

The present study set out to elucidate how to select the most appropriate methodology to apply when undertaking a technological forecasting study. The starting point was the idea of a comparative evaluation of two methods applied to the same topic, to determine how the processes of gathering information differ, and to compare the types of knowledge developed from the application of the technique. However, a simple comparative evaluation proved very difficult to undertake, for reasons that will be described below. This study undertook a comparative analysis and evaluative implementation of Delphi and cross-impact analysis methods, applied to the same area, and drawing on the same expert knowledge base.
The two techniques applied here were the Delphi method (using the standard anonymous questionnaire survey, with two rounds of iteration), and a well-known method of cross-impact analysis, the SMIC (a French acronym for Cross Impact Systems and Matrices). Both involve questionnaire-based approaches,
1 This source provides a fair description of both the Delphi method and cross-impact analysis, as well as examples of implementations.
and we used them to elicit expert views of future European transport systems. This raised questions about how we can elicit what sorts of knowledge from experts, and how the results of this elicitation process can be applied so as to (hopefully) improve the quality of information inputs into the decision-making process. The study thus casts light on the requirements for the implementation of these techniques, both in terms of practical steps and concerning the content necessary for their applications.

The task of comparing and contrasting methods proved to be extremely complicated. Numerous challenges had to be confronted in implementing the methods so that they would be comparable. It would clearly be inappropriate to attempt to contrast the two methods while implementing them at different levels of quality. In principle, each method should be implemented at a best practice level. But the key difficulties revolved around the immense lacunae in documentation of these two methods. The nature of good practice has been very poorly explicated—in terms of all sorts of features of design and application of the methods. Much remains largely a matter of tacit knowledge, so that newcomers to the use of either method are liable to find themselves adrift, unless they can gain apprenticeship with experienced practitioners.

The study provides insight as to the requirements of the methods, when each of them is given a 'good run' at addressing the issue of future transport systems. The results of the implementation of the two techniques were assessed and evaluated, in part through two evaluation questionnaires completed by the experts who participated in the study. This examined their views of the usefulness of the results for the field of application, and obtained their assessment of the methods as implemented.
Though we sought to construct implementations of comparably high quality, to avoid weighting the dice in favour of one or other approach, the study cannot provide a definitive test of Delphi vis-à-vis cross-impact analysis. What we have been able to assess here is one particular implementation of each method, and while we would hope that these are representative of good practice, it is risky to generalise from so few cases.
2. Implementation of the two forecasting techniques

It was hoped that this assessment would produce knowledge useful for decision making about forecasting approaches. The objectives of this study2 were to establish the extent, nature and importance of:

a) Generic issues concerning elicitation of expert opinion (e.g. selection of experts, boundaries of tasks, etc.);
b) Specific issues posed by this class of methods—methods based on the use of questionnaires, and with qualitative and quantitative inputs and outputs (e.g. selection of questions, coding of questions and responses, response rates, etc.);
c) Specific issues related to the specific techniques—considering whether it is possible to overcome particular shortcomings of the techniques; and
d) Specific issues related to implementation (e.g. possible items of good practice of implementation).

2 The complete study this article refers to is a PhD dissertation: Scapolo, F. (1999), Prospective on Environmental Consequences of Changes in Urban Transportation—Comparison of Two Methods Eliciting Experts' Knowledge, Faculty of Economics and Social Sciences, University of Manchester, Manchester.
The Delphi and SMIC methods are both based on eliciting expert knowledge. Experts' contribution is seen as a help in areas of research where an explicit conceptual framework may not exist or where data are very impoverished (i.e. where formal methodologies which make use of any existing theory and data are not available, or underdeveloped, or not widely accepted). Those expert in a particular problem area may possess unstated 'mental models' and knowledge of the causal structure of a particular system, and are likely to have reasonably well-grounded appraisals of the state of affairs in the topics of concern. This approach attempts to externalise, materialise and manipulate such informalised 'expert' opinion as a basis for forecasts (or at least for informing thinking about the future). The first serious attempt to systematically implement this approach—beyond traditional means of group discussion—was the Delphi technique.

Over the years, the Delphi method has seen a tremendous number of applications across a huge range of topics—the method has a high level of flexibility. A conventional Delphi application asks when, if ever, a series of events will occur; often this will be supplemented by estimates as to the desirability of the events being forecast. One variation is where the experts are asked to give probability estimates of an event's occurrence by a certain date in the future, rather than just guessing when an event might occur. This particular variation of the Delphi technique may be most suitable when we are looking at a particular point in (future) time, or seeking to estimate the likelihood of an event, when mutually exclusive events may occur. Moreover, it gives a picture of a particular year, rather than a future history.
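The iteration between Delphi rounds rests on simple statistical feedback: after round one, the moderator typically reports the median and interquartile range of the panel's estimates back to all experts, and those outside the interquartile range are asked to revise their estimate or justify holding an extreme position. A minimal sketch of this aggregation step, using hypothetical estimates from eight experts, might look like this:

```python
from statistics import median, quantiles

# Hypothetical round-1 estimates from eight experts: the probability (%)
# each assigns to a given event occurring by the forecast horizon.
round1 = [20, 35, 40, 40, 50, 55, 70, 90]

med = median(round1)
q1, _, q3 = quantiles(round1, n=4)  # quartiles across the panel

# This summary is fed back to the whole panel with the round-2 questionnaire.
print(f"median = {med}, interquartile range = ({q1}, {q3})")

# Experts whose estimates fall outside the interquartile range are asked
# either to move towards the group view or to justify their position.
outliers = [x for x in round1 if x < q1 or x > q3]
print("estimates flagged for justification:", outliers)
```

In the second round the same statistics are recomputed on the revised estimates; a narrowing interquartile range is then read as a working 'consensus'.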
The Delphi methodology typically generates forecasts for many events, without providing any direct information as to the respondents' views as to whether the questioned events are interrelated [2].3 Thus, it is possible that the outcomes of a Delphi implementation could produce a forecast of events mutually reinforcing or exclusive of each other, and that an artificial 'consensus' may be reached. The initial work of Helmer and Gordon on cross-impact analysis [3] (see also [4]), in contrast, aimed to determine interdependence between events. The cross-impact technique attempts to provide the probabilities of occurrence of an item, adjusted in view of the occurrence of related items with potential interactions on its occurrence. The developments and analysis of cross-impact methods have concentrated their attention on the manipulation and refinement of probability estimates. It is suggested that these techniques can help in understanding and clarifying the complex causality of dynamic socio-economic systems.

Cross-impact methods, applied together with other available techniques, like the Delphi method, produce not just forecasts, but also better comprehension and evaluation of views about how a system works. If applied in this context, cross-impact methods can be seen not as a technique capable of providing exact and neat solutions to the future, but as an aid to learning, a heuristic device which might be able to improve the understanding of complex problem areas which are already tackled by forecasting. In this study we tried to establish whether cross-impact analysis, employed in such a manner, would be able to highlight the differences between the 'mental models' used by experts. In this way, the implementation of cross-impact models can be relevant to the process of decision making, providing its users with a method for synthesising a wide range of beliefs and opinions concerning future developments. The final outcome of the cross-impact method used for this specific implementation (SMIC) [5] is a cardinal sequence of possible scenarios, supplying decision makers with an additional dimension for problem evaluation. Of course, the scenarios that stem from cross-impact analysis are only images of what the future could look like, if the events considered do, or do not, occur. If all the possible combinations are generated, then the right one should be present—though in principle it might be judged as having low probability. But it is always possible that the formulation of statements leaves something to be desired, and that in the future we will look back and say that the future has turned out to be neither black nor white. In qualitative scenario workshops, we usually see that the most likely future is some (unknown) combination of the various specific futures discussed. The same could well be true for SMIC 'scenarios'.

3 By examining correlations between the opinions of experts about different topics, it may be possible to identify varying scenarios or sets of expectations across the pool of respondents. This approach has not been much used in practice—for one example of building scenarios from expert views on a range of topics, see Rush and Miles [2], 'Surveying the social implications of information technology', Futures 21(3): 249–262.

2.1. Sequence of procedural steps for the implementation of the Delphi method and the SMIC method

A series of decisions had to be made in order to implement the two forecasting techniques in such a way as to allow any type of comparison. Fig. 1 outlines the main steps undertaken in the course of this study. Once the objectives of the study were defined, we had to take a number of decisions related to the technologies to forecast. Another set of decisions related to the implementation of the two techniques—including decisions about how particular elements of the methods should be shaped so as to allow a comparison between them. Below we discuss some of these main decisions.

2.1.1. Choice of the subject

One of the first decisions had to do with the technological area whose future was to be examined. We set out to contribute to an important and topical theme, the future evolution of the urban transport sector in Europe. The focus was on the role and development of Advanced Transport Telematics (ATT) technologies that are being developed to help improve the current traffic situation in European cities.
Often a forecasting study would begin with a literature review or with polling a set of experts as to important emerging technologies in the area of study. In this case, however, we were able to identify specific technologies to forecast from the European Commission programme on ATT (Advanced road Transport Telematics), undertaken in the course of the 3rd Framework Programme of the European Union. This programme piloted projects addressing seven different areas of major operational interest. (These areas spanned private and public transport, and the collection, processing and distribution of travel and traffic information, including such services as traveller information, trip planning and route guidance. In addition the programme included technologies to assist the driver and communicate information between vehicles.) The pilot projects of the Programme involved several European cities [6–11]4—cities are the focus of the present forecasting study—and some projects also covered 'arteries'. (These are only parts of the transport network, and one challenge for the future lies in the application of ATT to much wider portions of the transport network.) The study examined views of the future level of development of ATT applications, the forces that facilitate or impede this, and how far these developments are liable to achieve goals of reducing transport-related problems. Thus we set out to address:

a) the likely level of use of transport telematics technologies and their impact on the transport system in the future in Europe;
b) the consequential direct impact of these technologies on the level of congestion and environmental issues; and
c) how far factors of an economic, social, legislative and political nature may impede the development of ATT technologies.

[Fig. 1 shows the overall flow: the objective of the study leads to the choice of the subject to forecast and the selection of the experts, who were divided into three Expert Groups; the Delphi strand (1st round Delphi survey, analysis of results, 2nd round Delphi, analysis of results, evaluation questionnaire) and the SMIC strand (SMIC questionnaire, analysis of results, evaluation questionnaire) then converge on the final outcomes.]

Fig. 1. The main steps required for the Delphi and SMIC implementations.

4 For further detail see CEC [6–11]: Advanced Transport Telematics—1993 Annual Project Review, Parts 1–6. Brussels, Commission of the European Communities—DG XIII.

Specification of this general content of the forecasting exercise is only the first of a series of decisions that proved necessary for the implementation of the two forecasting tools. To best compare the two techniques, it would be most helpful to maintain the same structure for the implementations of the Delphi and the SMIC method. But the precise characteristics of the two methods impose limits on these common elements.

2.1.2. Number of topics

First, one major constraint concerns the number of questions (or events) that can readily be included in the two implementations.
The Delphi method is fairly flexible in this respect. It could in principle contain as many questions as the moderator would like to ask, though a lengthy questionnaire will reduce participation levels. Thus one authority [12] suggests that 25 topics (questions) should be the upper limit, to make sure that the exercise remains manageable, both for the moderator and for the experts. (This is very much a matter of precise organisation and use of topics, we would suggest.) The effect of adding further topics to a Delphi is straightforwardly additive. However, cross-impact studies require each new topic to be assessed against each existing topic, so the effect of adding topics is combinatorial: the number of conditional-probability questions grows quadratically, and the underlying scenario space grows exponentially. The SMIC method limits the number of events handled, generally to a maximum of six, so as to constrain the maximum number of questions that can reasonably be put to the experts [13]. This limitation is not inherent in the method itself: cross-impact analysis and its dedicated software could be expanded to handle more than six events. Rather, the constraint on the number of events exists mainly to avoid experts having to spend a great deal of time completing large matrices of conditional probability questions. This implementation of the SMIC method asked experts to complete 12 matrices, for a total of 60 conditional probability questions.

This constraint on the number of questions that can be used in the implementation of the two techniques immediately leads to problems when we try to apply them to a broad area such as the future of transport telematics technologies. The ATT Programme covers not only various telematics technologies, but also policy measures helping to reduce traffic, and organisational and social considerations related to the provision of information and the response of users. It is hard to capture these in 25 topic items, let alone 6! So as to compare results, and to allow SMIC to be applied to the areas of most significance, the decision was made to select, in effect, the events to be included in the SMIC method on the basis of the results of the first round of the Delphi exercise. The SMIC was applied to the most prominent transport telematics technologies identified. Furthermore, two implications of those technologies (on congestion and environmental issues) were studied across both techniques.

2.1.3. Time horizon and focus

While the methods differed sharply in terms of the number of topics that could be selected, we were able to work in much more common ways in respect of certain other facets of the study. Where it came to the time horizon to be studied, both methods could be brought to bear on the developments that could be anticipated by a particular point in time. Given that ATTs are at quite different stages of development, the decision was made to enquire as to the degree of penetration in the market of different systems at a particular point in time (about 20 years ahead).
This is a 'how far' question, rather than one asking when a certain technology will be available on the market. The SMIC method asks experts to give probability estimates of the occurrence of events by a certain date in the future. This is a 'how likely' question. The approaches are thus reasonably comparable, and their results can be directly compared—to the extent that the questions formulated are also similar.

2.1.4. Experts

The selection of experts for the implementation of the two techniques was made comparable by drawing in each case on:

- participants in recent international conferences on Advanced Transport Telematics technologies;
- experts co-nominated by others;
- experts on transport selected from national research centres and academia (also with the support of the Internet);
- contacts made in the course of the transport research.

This procedure identified about 300 experts who would be appropriate to involve in the exercise. This population of experts was divided into three sub-groups: the aim was to send one group of experts the Delphi questionnaire only, another the SMIC questionnaire only, while the last group was sent both the Delphi and the SMIC questionnaires. This would enable us to examine (i) how far the results are coherent; (ii) whether expert responses and participation levels are affected by receiving only one or both questionnaires; and (iii) how far they perceive the issue in a different manner, and apply different degrees of effort to it, in the different
contexts. (We could also see if the experts might assess the methods and their outputs differentially, according to whether they had participated in them.)
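The workload figures quoted above for SMIC follow directly from the combinatorics of the method: with n events, each expert provides n simple probabilities plus two conditional estimates, P(i | j occurs) and P(i | j does not occur), for every ordered pair of distinct events, while the scenario space the method must rank contains 2^n on/off combinations. A quick sketch of this arithmetic (illustrative only, independent of any SMIC software):

```python
from itertools import product

n_events = 6  # SMIC's usual upper limit, as used in this study

# Each expert supplies one simple probability per event, plus two
# conditional estimates for every ordered pair (i, j) with i != j:
# P(i | j occurs) and P(i | j does not occur).
simple_questions = n_events
conditional_questions = 2 * n_events * (n_events - 1)
print(simple_questions, conditional_questions)  # 6 and 60, matching the text

# Adding a seventh event would already raise the conditional questions
# to 2 * 7 * 6 = 84: growth is quadratic in the number of events.

# The scenario space SMIC must rank is every combination of events
# occurring (1) or not occurring (0), i.e. 2**n_events possible futures.
scenarios = list(product((0, 1), repeat=n_events))
print(len(scenarios))  # 64 scenarios for six events
```

The contrast with Delphi is visible in these two numbers: a Delphi question list grows by one item per added topic, whereas each added SMIC event adds a new row and column of conditional estimates and doubles the scenario space.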
3. Operationalising the Delphi study

3.1. The background information of the exercise

The Delphi survey was conducted in two rounds. The structure of the questionnaire was divided into three stages corresponding to those indicated above in the text by points (a), (b), and (c). The package sent to respondents also included a general background discussion which embodied a scenario and a definition of a European medium-sized city. This city definition was intended to reduce the propensity for informants to make forecasts for the cities with which they are most familiar. They were encouraged, then, to provide more of a 'baseline' forecast. This general background was the same for the Delphi and for the SMIC implementation.

This background information also included a baseline scenario that was intended to help experts place themselves in a common context. The selected scenario could be defined as one of 'moderated change', lying between 'business as usual' and 'accelerated innovation' scenarios. The starting point was the economic, social and political trends of the last 20 years, which have favoured the use of personal transport and encouraged the use of cars, to the detriment of the use and development of Public Transport.

3.2. The selection of transport telematics technologies from the DRIVE/ATT Programme

The selection of the transport telematics technologies to include in the questionnaire, as well as the formulation of the questions to the experts, was one of the most crucial points of the design of the questionnaire. Examination of the Transport Telematics Programme allowed us to list, in the first place, 42 ATT systems which can be applied in the context of urban transport. This was reduced to 24 systems to be included in the Delphi exercise—a reduction achieved by eliminating similar technologies that were implemented in different pilot projects for distinct applications and with different purposes.
It is good practice in Delphi, as with other surveys, to pilot the questionnaire before sending it out to the experts. This process permitted us (as will be described later) to further reduce the technologies covered from 24 to 19 ATT systems. Table 1 illustrates the final list of the 19 technologies examined in the Delphi inquiry.

Table 1
List of the ATT technologies included in the Delphi inquiry

T1 Detector systems to improve safety conditions for pedestrians and cyclists at junctions and crossings.
T2 Computer Vision Technology, to detect congestion automatically.
T3 Autonomous collision avoidance systems on-board vehicles.
T4 Automated systems to monitor traffic in real-time and which are able to provide real-time on-board journey information, as well as congestion level and incident warnings.
T5 Inductive loops to detect congestion and incidents, such as accidents and breakdowns.
T6 Telematics systems able to automatically provide real-time information about the Origin/Destination of traffic streams, including detectors and VMS (Variable Message Signs) for re-routing recommendations.
T7 System architecture for smart card technologies linked to Urban Transport access rights and traffic, which enables the integration of electronic payment systems for road vehicles.
T8 In-vehicle devices which can provide continuous dynamic updating of digital road maps for the use of route guidance systems.
T9 Dual mode route guidance system in city centre areas, involving the combination of in-vehicle route guidance systems together with infrastructure-based systems.
T10 Home-based and office-based advanced traveller information systems through terminals, which provide accurate and reliable information on all the transport modes available for a journey.
T11 Integrated system architecture to support portable traveller information services.
T12 Dynamic route guidance system of vehicles based on European information architecture which uses the Global System for Mobile communications (GSM).
T13 Intelligent on-board telematics systems based on radar and/or thermal imaging, allowing driver support so that cars can travel safely at normal speed and in adverse weather conditions.
T14 Remote sensing of high occupancy of vehicles and of vehicle type to allow variable road charging and controlled access on most of the urban road network.
T15 Information exchanging and processing systems based on microwave and smart card for a multitude of ADS (Automatic Debiting Systems, such as parking management, non-stop tolling, multilane road pricing, congestion metering and pricing).
T16 Transponder-based systems and Automatic Vehicle Location (AVL) systems being used to give priority to Public Transport through adjustment of traffic signal timings.
T17 Public Transport information systems providing real-time information on park-and-ride, trip costs, timetables, and duration for different modes.
T18 Public Transport monitoring systems for real-time information on Public Transport availability and delays at bus stops.
T19 Dynamic telematics traffic control systems, using real-time traffic data for managing queues of traffic, giving priority to Public Transport.

Before reaching the final version of the Delphi exercise it was necessary to take many other small decisions, in order to have the clearest possible questionnaire, especially in view of the desired comparison with the SMIC method. These decisions referred mainly to the topic of the forecast—we sought to ensure that the information generated would actually provide new insight into the anticipated development of the technologies. The challenge of ATT systems lies in their widespread use on a large part of the existing transport infrastructure. At the time this study was being designed, such use was foreseen—but it was very hard to predict when such systems might be implemented on a wide scale, as opposed to the limited areas involved in pilot projects. Various constraints become apparent as ATT systems pass from the research and development step to that of roll-out and, in some cases, commercialisation. These constraints are not only of a technological nature. The complexity of inserting ATT technologies into the existing institutional and legal frameworks is a major challenge. There is considerable diversity across European countries, where the roles of the authorities vary enormously in terms of competencies. These factors can make for very uneven uptake (and diverse implementation) of these technologies.

For these reasons, experts were asked, in the first step of the Delphi questionnaire, what in their opinion the likely level of use of the selected ATT systems applied to urban transport will be, in European medium-sized cities, in the year 2015. In order to facilitate this task, experts had to indicate, for each technology, what proportion of European medium-sized cities will adopt the technology by the time horizon considered (i.e. they had to indicate the level of diffusion of the forecasted technologies).
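As a simple illustration of how such diffusion responses can be coded and aggregated, the sketch below tallies hypothetical answers from ten experts for a single technology. The response bands shown are invented for the example and are not the questionnaire's actual categories:

```python
from collections import Counter

# Illustrative response bands only; the questionnaire's actual answer
# categories are not reproduced here.
BANDS = ["0-20%", "21-40%", "41-60%", "61-80%", "81-100%"]

# Hypothetical answers from ten experts for one technology: the share of
# European medium-sized cities expected to adopt it by 2015.
responses = ["21-40%", "41-60%", "41-60%", "0-20%", "41-60%",
             "61-80%", "21-40%", "41-60%", "21-40%", "41-60%"]

assert all(r in BANDS for r in responses)  # basic response-coding check

tally = Counter(responses)
modal_band, votes = tally.most_common(1)[0]
print(f"modal diffusion band: {modal_band} ({votes}/{len(responses)} experts)")
```

In a full implementation the whole distribution per technology, not just the modal band, would be reported back to the panel for the second round.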
The second step of the Delphi inquiry aimed to obtain more qualitative answers from the experts. The aim here was to gain some knowledge of the direct effects and implications of the implementation and development of this range of technologies on the transport system. In the third step of the Delphi exercise, experts were asked to qualitatively assess factors which might hinder the widespread use of ATT technologies. Experts were asked to assess whether social acceptance of technologies, economic viability, technological feasibility and availability of European standards had no influence, or were considered a low, medium or high constraint to the widespread use of the forecasted technologies. Table 2 gives an overview of some decisions taken at the level of Delphi implementation for this study, and distinguishes these from alternative choices available for the implementation of the method.

3.3. Delphi practice reviewed

Over the years, much has been published about the Delphi method, especially when compared to the amount of literature related to other forecasting methods such as cross-impact analysis. This literature addresses several issues relevant to the problems confronted in implementing the methodology and evaluating its outcomes.

3.3.1. Participation of experts in a Delphi exercise

A first point refers to the lack of precise indications on the number of experts to involve in a Delphi inquiry [14]. There are indications in the literature that the minimum size of a panel of experts to involve in a Delphi exercise should be no less than 8 to 10 members [15], though usually the number of experts to involve in the exercise is a choice left completely to those undertaking the forecasting exercise. The inherent risk is that not all the competencies required to investigate the inquired subject may be represented in an adequate manner.
Table 2
Summary of some decisions for the Delphi implementation

Issue in structure of Delphi | 'Typical' implementation or choices available | Implementation in this study
Overall framework | Often diffuse | Background scenario + medium-sized city defined
Nature of experts | Typically national | Cross-national
Topic formulation | Few words (recommended at most 25) | Relatively many words (from 7 to 29 words); in part this reflects the next point
Empirical focus | Often rather vague, though good practice has developed around use of terms such as 'first demonstration' and 'widespread use' | Precisely targeted on specific technology applications in cities of a specific size in the EU
Nature of the forecast | When 'X' will happen? To a given extent | How far 'X' will happen in a given time
Time horizon | Different time horizons are most common, though there are many experiences of other types of Delphi | Fixed time horizon

For our specific implementation we decided upon a substantial number of transport experts, rather than a small sample. In doing so, we assume that we increase the possibility of representing, in a balanced way, the wide range of competencies on the future of ATT technologies in European cities by the year 2015. However, we can confirm that the lack of precise criteria on the number of experts to include in the Delphi survey–and the lack of criteria on
how to determine the number of experts in a specific case–was perceived as a drawback of the methodology.

A second point related to the use of experts in the Delphi method concerns the validity of expert judgement compared to non-expert judgement. The literature contains some controversy about this. Sackman [16] argues that a panel of experts composed of people with similar backgrounds and interests may tend to comprise an elite with a vested interest in promoting the area under Delphi investigation. He argues that the participation of prestigious individuals does not guarantee improved accuracy: it can even be counterproductive, because they may not provide responses as carefully considered as those stemming from, say, younger panelists. In relation to this point, the Delphi literature provides examples of research where experts and non-experts were used, without any substantial underlying differences in the results [17]. But since our study's focus is on technologies that are often not known to non-experts, a panel of experts with at least some knowledge of ATT systems was appropriate. Informants were not required to be specialised in this field, but they were required to work at least indirectly in the transport field.

3.3.2. Construction of Delphi topic statements

The description of the event statement [18,19]5 is crucial, since the expert has first to decide what interpretation to give to the event, and then try to translate this interpretation into a prediction. To the extent that there is more or less conflict in the information describing the event, the operator can expect that the expert will have to add or subtract information to derive his/her interpretation. This process varies, and is very subjective. The event statements need to be both clear and concise. Out of the 19 technologies considered in our inquiry, only one event statement was composed of more than 30 words (i.e. event 15).
The formulation of all other events ran from 7 to 29 words. Nevertheless, this very concise description of the ATT systems was criticised by some experts, as it was seen as giving the questionnaire too technical an orientation. It was more difficult to avoid overlap between topics, owing to the fact that some ATT systems, based on different technologies, serve similar applications (and vice versa). Since it is not yet clear which system is going to lead the market in the future, it was of interest to assess the opinion of experts on the likely level of use of those systems.

3.3.3. The problem of dropout rate

A problem with the Delphi method discussed in the literature is the tendency for high panel attrition [20]. The influence of, and reasons for, dropout have not been widely investigated in the literature. Sackman [16] suggests that several factors operate to determine a hard-core group that sticks with a Delphi study through all iterations, for instance strong motivation and interest in the target area. Dropout may reflect strong disagreement with the design and content of the questionnaire, or a critical attitude towards the utility and methodology of the study and towards the 'design content' of the technique applied.
5 The problem of construction of Delphi event statements was raised by Salancik et al. [18], 'The construction of Delphi event statements', Technological Forecasting and Social Change 3: 65–73, and further examined by Martino [9], Technological Forecasting for Decision Making, 3rd ed., New York, McGraw-Hill. Salancik noted that the goal has to be to describe the potential event so that all respondents interpret it in an identical way.
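The conciseness criterion discussed above (statements of 7 to 29 words, with only one exceeding 30) can be checked mechanically. The following is a minimal sketch of such a check; the sample statement is an invented placeholder, not one of the study's 19 actual event statements.

```python
# Trivial quality check on Delphi event statements, reflecting the
# conciseness criterion discussed above: flag statements over 30 words.
# The sample statement below is a placeholder, not the study's wording.

def word_count(statement: str) -> int:
    return len(statement.split())

def flag_long(statements, limit=30):
    """Return the statements whose word count exceeds the limit."""
    return [s for s in statements if word_count(s) > limit]

sample = ("Automated systems monitor traffic in real time and provide "
          "on-board journey information to drivers")
print(word_count(sample))   # 14
print(flag_long([sample]))  # [] (within the limit)
```

Such a check catches overlong statements, but of course not ambiguity or overlap between topics, which require human review.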
The first round of the Delphi exercise was sent to 224 experts. Of the 224 inquiries, 87 experts (more than one third) returned completed questionnaires. Only one expert sent back the first round of the Delphi questionnaire uncompleted; he explained that he did not believe in the results of forecasting exercises: previous experiences had led him to believe that such exercises are a waste of time. We do not have any information on the 136 other experts who did not reply to the first round of the Delphi exercise. Presumably some proportion of these experts were in disagreement with the content and methodology applied for this study. However, we can be fairly sure that factors such as lack of time will have played a role. The second round was sent to the 87 experts who took part in round one, and obtained 58 responses, a dropout rate of 33%. The repetitiveness of the Delphi procedure may lead to some weariness among respondents, who might think that the procedure of the Delphi method is too demanding. Others may feel no need to amend their earlier replies.

3.3.4. The issue of consensus

The Delphi method was designed as a tool to overcome the unwarranted dominance of particular individuals in face-to-face group discussion. Its goal is to get a better grasp on the common opinions within the group. (Note that Delphi can be used to examine different clusters of opinion; it does not have to emphasise points of convergence.) The iterative process, with controlled feedback, should ensure that the experts can exercise autonomous judgement on their opinions over successive rounds. However, the pressure to reconsider their opinion over successive rounds can constitute a pressure for conformity, leading to an apparent convergence of opinion. The most critical reviewers of the Delphi technique believe that the method (as usually implemented) manipulates responses towards the convergence of opinions so as to achieve consensus.
The supporters of the technique believe that the method does not force the achievement of consensus, but seeks to facilitate it [14]. Nevertheless, Martino [19], who is favourable to the Delphi method, has pointed out that there is ample evidence, from a number of experiments, that if the panelists feel that the questionnaires are an imposition on them, or if they feel rushed and do not have time to give adequate thought to the questions, they will agree with the majority simply to avoid having to explain differences. In this respect, therefore, the Delphi procedure is not an absolute guarantee against the influences of the 'bandwagon effect' and fatigue. Much may depend on how information is fed back to respondents. If it is presented in a graphical form that suppresses information on outliers and highlights the median value or some other measure of central tendency, then this might be interpreted by some participants as the 'correct' answer: the experts may consider it appropriate to move their estimates closer to the median value. In our implementation, we restricted the number of rounds to two, in order to limit as much as possible the 'bandwagon effect' and the weariness of respondents. We provided information on the complete pattern of judgements in the first round, so as to reduce this conformity pressure.

3.3.5. The accuracy of results

It is impossible to assess (yet!) whether the outcomes of our implementation of the Delphi method on the future of ATT technologies in European cities in the year 2015 are accurate. Nevertheless, the point of the method is to determine the collective knowledge of expert views on the issues. This should in any case have some utility, regardless of accuracy. Additionally, we were able to collate information, by
Table 3
Structure of the Delphi questionnaire (for European medium-sized cities in the year 2015)

Your view on the likely level of use of the technology: please indicate which of the percentage ranges you think is most likely, by checking the appropriate box: 0–10%, 11–30%, 31–50%, 51–70%, 71–90%, more than 90%.

Your view on the direct effect of the technology on each of the two transport problems (congestion; traffic volume of passenger vehicles): from −3 = strong/high decrease, through 0 = no effect, to +3 = strong/high increase.

Your view on factors which could restrict the occurrence of the widespread use of the corresponding technology (social acceptance, economic viability, technological feasibility, availability of European standards): from 0 = no influence, through to +3 = high constraint.

For each of the event statements, the informants were asked to provide judgements in terms of the criteria depicted above.
means of the evaluation questionnaire, which tells us how the experts involved in the study perceived the forecast obtained. Once the content of the questionnaire has been established, other features can be designed–and other decisions made–which can facilitate the experts in their task of replying to the Delphi exercise. All the material that accompanies the questionnaire can influence the experts: it is not enough to define the content of the questionnaire. Other elements, such as guidance on how to answer the questionnaire, are also extremely important. A clear explanation of how to complete the questionnaire can help respondents to better understand the aim of the study. If the questionnaire is presented in a clear way, the dropout rate can be reduced: the respondent is liable to feel that the designer has taken care, so care is required in return. In general, if a questionnaire is complicated, experts see it as too time consuming, and this may prompt a decision not to participate in the exercise. Table 3 gives an overview of the implementation of the Delphi questionnaire. For each technology listed in Table 1, the expert had to provide information on the likely level of use of the technology, its direct effect on the level of congestion and traffic volume, and which of the four listed factors could restrict the widespread use of the technology.
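As an illustration of the questionnaire structure just summarised, one expert's answers for a single technology, together with the first-round feedback of the complete pattern of judgements described in Section 3.3.4, might be represented as follows. This is a minimal sketch; all names and the coding scheme are our own illustration, not the study's actual data format.

```python
# Minimal sketch (our illustration) of one expert's Delphi answers for
# one ATT technology, and of first-round feedback that reports the full
# response pattern rather than only the median, so that outlying
# opinions remain visible to respondents.
from collections import Counter
from dataclasses import dataclass
from statistics import median

USE_RANGES = ("0-10%", "11-30%", "31-50%", "51-70%", "71-90%", ">90%")

@dataclass
class TopicResponse:
    use_range: int               # index into USE_RANGES (step 1)
    effect_congestion: int       # -3 (strong decrease) .. +3 (step 2)
    effect_traffic_volume: int   # -3 .. +3 (step 2)
    constraints: dict            # factor name -> 0 (none) .. 3 (high)

def round_one_feedback(responses):
    """Return the median category plus the complete pattern of
    judgements for one topic."""
    idxs = [r.use_range for r in responses]
    pattern = Counter(USE_RANGES[i] for i in idxs)
    return USE_RANGES[int(median(idxs))], pattern

answers = [TopicResponse(i, -1, -1, {"social acceptance": 2})
           for i in (1, 2, 2, 3, 1, 2)]
med, pattern = round_one_feedback(answers)  # med == "31-50%" here
```

Feeding back the whole `pattern` rather than only `med` is one way of operationalising the decision, discussed above, to reduce conformity pressure by not suppressing outliers.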
4. Operationalising the SMIC

Use of the SMIC method has often involved a postal inquiry, as in the present study. The implementation of the SMIC method followed the computation of the results of the first round of the Delphi questionnaire. One goal of the SMIC assessment was to validate the information, stemming from the results of the Delphi study, on the future evolution of ATT technologies. One approach followed was to send the SMIC questionnaire to a group of experts who were also involved in the Delphi inquiry. This was done to check the reliability and consistency of the outcomes of the forecast on the future of ATT in European cities in the year 2015. The SMIC method, and its dedicated software, provides an established way of structuring the questionnaire. This means that implementations of SMIC are highly constrained–the person applying the methodology has to design the sequence of decisions involved in a less flexible way than in the Delphi method–but design decisions still play a very large role. As was the case for the Delphi method, this specific implementation of the SMIC method was applied to the future of ATT technologies in European cities in the year 2015. The SMIC method has a limitation in the handling of events, which in general are reduced to six. In order to have some common points for the comparison of the results achieved by the two methodologies, it was decided to design the questionnaire to include four events concerning the adoption of ATT technologies by European medium-sized cities in the year 2015. The remaining two events addressed the effect of these technologies on two transport problems, again the same as those included in the Delphi questionnaire: the level of congestion and the level of traffic volume of passenger vehicles in urban areas. It will be noted that there were no events involving factors which could influence the widespread use of ATT technologies in European cities.
This reflected space limitations. The background information provided with the SMIC questionnaire comprised (a) the 'background' scenario and (b) the definition of medium-sized cities; these were identical to those used for the implementation of the Delphi questionnaire.
4.1. Structure of the SMIC questionnaire

The SMIC questionnaire was applied in only one round, but the inquiry was divided into two steps. In the first stage, experts are asked to estimate the simple probability of occurrence of the selected events, by means of a probability scale from 1 (i.e. very low probability) to 5 (i.e. very high probability). The assumption is that at this stage the conditional probabilities are not implicitly included in the experts' thinking. In the second stage, experts are asked to estimate, in the form of conditional probabilities, the likelihood of an event coming true as a function of another event. The events are considered in pairs. The experts express their opinion through the same probability scale as used in the first stage, but an extra option is available that allows them to indicate the independence of two events. Basically, in the second stage experts have to fill in two matrices for each event. The first matrix refers to the probability that each of the other five events would occur given that event En turns out to be true. In the second matrix they indicate the probability that each of the other five events will occur if event En does not turn out to be true. Table 4 summarises some of the characteristics of the design of the Delphi and the SMIC inquiries, underlining the common items used for the implementation of the two methodologies.

4.1.1. Criteria for the selection of the technologies

The selection of the four technologies to include in the SMIC method was made on the basis of the results of the first round of the Delphi questionnaire. This was done especially to allow some comparison of the results the two techniques were able to generate on the same subject.
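The two-stage elicitation described in Section 4.1 can be sketched as a simple data layout. This is our illustration only: the variable names, the use of 0 as an independence marker, and the matrix layout are assumptions for exposition, not the SMIC software's actual internal representation.

```python
# Sketch of the two-stage SMIC elicitation described in Section 4.1.
# Layout and names are our illustration, not the SMIC software's API.
N = 6                      # SMIC handles at most six events
SCALE = (1, 2, 3, 4, 5)    # 1 = very low probability ... 5 = very high
INDEPENDENT = 0            # extra option marking two events independent

# Stage 1: one simple-probability estimate per event (indices 0..5).
simple = [None] * N

# Stage 2: two N x N matrices per expert.
# cond_if[i][j]  = rating of P(Ej occurs | Ei occurs)
# cond_not[i][j] = rating of P(Ej occurs | Ei does not occur)
cond_if = [[None] * N for _ in range(N)]
cond_not = [[None] * N for _ in range(N)]

# Example: event 0 judged fairly likely; event 1 very likely if event 0
# occurs, unlikely if it does not; events 0 and 2 judged independent.
simple[0] = 4
cond_if[0][1], cond_not[0][1] = 5, 2
cond_if[0][2] = INDEPENDENT
```

Filling both matrices for all pairs is what makes the second stage considerably more demanding for respondents than a simple Delphi estimate: up to 2 × 6 × 5 = 60 conditional judgements per expert.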
Table 4
Characteristics of the Delphi and SMIC inquiries

Item | Delphi method | SMIC method (a)
Background scenario | Same | Same
Description of the city | Same | Same
Number of topics | 19 | 6 (4 taken from 1st round Delphi)
Inputs to questionnaire | Same | Same
Number of rounds | 2 | 1
Probability | Occurrence by date | Occurrence by date + conditional probability
Number of experts | 224 (first round); 87 (second round) | 126: 77 (only SMIC); 49 (SMIC and Delphi)

(a) The SMIC method was conducted in one round only.

One criterion used was to select, from among the most representative technologies, four for which the level of convergence of opinions achieved in the course of the first round of the Delphi was relatively high on all the questions asked. This was also done to see whether the results achieved in the context of the Delphi method could be confirmed by the SMIC implementation. A second criterion was to include four ATT technologies belonging to the three different main groups considered for the implementation of the Delphi method. These were:

! technologies able to measure the traffic level on the network;
! technologies that, on the basis of the existing traffic level, are able to adjust traffic signals to give priority to Public Transport vehicles;
! those technologies that can assess the level of congestion in real time, and can communicate with the vehicle (i.e. re-routing of vehicles).

4.1.2. Formulation of the event statements

The main goals of the implementation of the SMIC method were, among others, to:

! assess the validity of the results achieved from the Delphi exercise;
! assess if and how far the application of two different forecasting methodologies to the same subject produces similar or different results;
! examine the interdependency among questions.

Experts had to assess the simple probability of the events in the first stage of the SMIC inquiry, whereas in the second stage they estimated the probability of pairs of events. In designing these questions, the researcher was aware that the formulation of the event statements would lead to some loss of information. In fact, the decision to include a percentage range of European medium-sized cities in the formulation of the event statement involved posing a clear threshold.

Box 1. Example of a SMIC statement
51–70% of European medium-sized cities use automated systems to monitor traffic in real time, which are able to provide real-time on-board journey information, as well as congestion level and incident warnings.
The reaction of experts to this formulation of event statements may vary from one expert to another. If an expert disagreed strongly with the threshold posed, his/her response might be to decide not to answer the questionnaire. We suspect that this option is most likely if the expert thinks that the indicated percentage range is too optimistic. There are always some drawbacks to consider when designing a questionnaire. Any formulation of questions or events can be criticised on the grounds that it restricts the information that can be elicited. Unfortunately, there is no clear indication of 'best practice' for implementing this type of forecasting tool, based on eliciting expert knowledge. Moreover, it is sheer utopianism to try to determine a 'general best practice' for implementing forecasting tools, since they are applied to completely different topics. The decisions taken by the person applying a forecasting and/or prospective methodology will have some drawbacks, or pitfalls. Nevertheless, there is little existing advice in the literature, or indication of how to overcome this problem. The arbitrary components of any implementation of this range of methodologies, and their consequences, should be taken into consideration, both by the person conducting the forecast and by those participating in such exercises.

4.1.3. Presentation of the SMIC inquiry to the experts

In order to complete the implementation of the SMIC method, material such as an explanation to the experts of how to complete the questionnaire, and an accompanying letter with a description of the aim of the study, was necessary. The presentation of the SMIC questionnaire to the experts, once the content had been established, was articulated in the same manner as for the Delphi implementation.
The explanation of how to use the SMIC method was important, since this methodology has been less widely applied than the Delphi method. The guide to filling in the questionnaire indicated that it was divided into two steps. Examples of answers were given, with clarification of the meaning of the probability scale that experts had to use to complete the exercise. The SMIC inquiry was sent to two different expert groups. The first group of experts was not contacted beforehand, and received the SMIC inquiry only. The second group of experts had already been contacted in the framework of the Delphi inquiry; this group received both the Delphi and the SMIC inquiries. The main reason for sending two different inquiries to experts was to assess:

! whether results are consistent on an individual basis;
! whether the attitude of the experts is different when they receive only one as opposed to both questionnaires (e.g. whether there is a higher level of dropout from the exercise);
! whether using both methods leads to a different response to one or other of them (e.g. thinking about interdependency in SMIC might change judgement on a single issue in Delphi).

Compared with the literature on the Delphi method, much less has been written about the way these techniques should be implemented. However, especially in relation to the SMIC method, it is possible to identify some common implementation issues that have also been considered in the framework of the Delphi method. Since the two methodologies are both based on eliciting experts' knowledge by means of inquiry systems, there are some common procedures, such as the selection of the experts, which can also influence the outcomes of this technique. The critical literature concerning the SMIC method was most active from 1974, when the first application of the SMIC method was presented, to 1976 [21–24].6 This literature was principally concerned with two major issues: !
the problem of inconsistency in cross-impact probabilities; and
! the use of cross-impact analysis, and the SMIC method in particular, to forecast the future of social systems.

Our application of the SMIC method aimed to deepen and extend the knowledge achieved within the Delphi exercise. We were especially keen to check the interdependencies among events and to examine scenarios, which were not considered in the framework of the Delphi method. The literature reviewed gives little insight into 'best practice' for implementing cross-impact analysis. The emphasis has often been on how the various techniques refine and manipulate probability estimates, underlining the computational aspect rather than the conceptual one. Some of the considerations discussed in the context of Delphi are also applicable to the SMIC method. For instance, there has been little deliberation, for the SMIC method, on issues related to the role
6 There has been a series of articles in the journal Futures. For further detail not discussed in this text see Mitchell, R.B. and Tydeman, J. [21], 'A note on SMIC 74', Futures 8 (February 1976) 64–67; Godet, M. (1976), 'SMIC 74—a reply from the authors', Futures 8(4) 336–340; Mitchell, R.B. and Tydeman, J. (1976), 'A further comment on SMIC 74', Futures 8 (August 1976) 340–341; Kelly, P. (1976), 'Further comments on cross-impact analysis', Futures 8(4) 341–345; and McLean, M. (1976), 'Does cross-impact analysis have a future?', Futures 8(4) 345–349.
of experts, the influence of the size of the panel on the results, and the validity of experts' judgement. In relation to the number of experts to whom the inquiry should be sent, Godet [13] indicates that the group should be composed of 40 to 60 experts; it is not clear whether there is a rationale behind this indication. Little indication is given on how to adequately represent all the competencies relevant to the subject under inquiry. As with the Delphi method, the lack of criteria on how to compose the group of experts is a drawback of the methodology. Concerning the involvement of experts in Delphi exercises, the literature has been critical of the use of experts' judgements, since these can be biased. Similar discussion of the possibility of obtaining biased forecasts when a cross-impact method is applied is absent. But if this has been considered in the framework of the Delphi method, it should be considered also for cross-impact analysis, since the support of mathematical–statistical computation certainly cannot eliminate biased judgements. We believe that if the arguments raised for the Delphi method–already described above–are considered valid, the same should apply to the SMIC method, and to cross-impact analysis more generally. In addition, it was not possible to locate any substantive discussion of how to construct SMIC event statements (there is guidance on this for the Delphi method). Yet it is evident that for the SMIC method, which can comprise at most six events, the construction of the event statements plays an essential role. For example, in the Delphi method an expert can skip an ambiguous event without seriously distorting the final outcomes; for the SMIC method the implications are completely different. An ambiguous description of an event statement does not allow the expert to reply to the cross-impact questions.
It will thus invalidate the entire questionnaire. For our implementation of the SMIC method we followed the criteria suggested in the framework of the Delphi method, derived from the literature review and discussed above. Thus, we tried to construct a 'best practice' approach. However, the lack of specific criteria for the SMIC method was considered a drawback, in particular the lack of indication of which types of event statements it is preferable not to combine in the same inquiry. It is not specified, for example, whether it is better to avoid mixing, among the six events, some that are technologically oriented with others that are more economically or socially oriented. Finally, the fact that the SMIC method takes the form of a single questionnaire to the experts means that the debate surrounding the Delphi method, in relation to the evolution of forecasts over successive rounds, does not apply. This does not necessarily imply that the forecasts which stem from cross-impact analysis–including the sometimes 'mysterious' mathematical and statistical computation and manipulation of data–are any more accurate or credible than those which stem from a Delphi exercise.
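The first issue raised in the critical literature above, inconsistency in cross-impact probabilities, can be made concrete. The sketch below is our illustration of the underlying coherence requirement, not of the SMIC software's actual adjustment algorithm; probabilities are on a 0–1 scale, i.e. after conversion from the questionnaire's 1–5 ratings. The point is that an expert's marginal and conditional estimates must jointly satisfy the law of total probability.

```python
# Coherence requirement behind the 'inconsistency' critique of
# cross-impact probabilities: an expert's direct estimate of P(Ej)
# should match the value implied by his/her conditional estimates.
# SMIC's computations adjust raw answers towards coherent values;
# the functions below merely measure the raw incoherence.

def implied_marginal(p_i, p_j_given_i, p_j_given_not_i):
    """P(Ej) implied by P(Ei) and the two conditionals,
    via the law of total probability."""
    return p_j_given_i * p_i + p_j_given_not_i * (1 - p_i)

def incoherence(p_i, p_j, p_j_given_i, p_j_given_not_i):
    """Gap between the direct estimate of P(Ej) and the implied one."""
    return abs(p_j - implied_marginal(p_i, p_j_given_i, p_j_given_not_i))

# Example: P(E1)=0.6, P(E2)=0.5, P(E2|E1)=0.8, P(E2|not E1)=0.1
# implies P(E2) = 0.8*0.6 + 0.1*0.4 = 0.52, so the direct estimate
# of 0.5 is incoherent by 0.02.
gap = incoherence(0.6, 0.5, 0.8, 0.1)
```

Raw expert answers almost never satisfy this identity exactly, which is precisely why cross-impact techniques include a probability-adjustment step, and why that step attracted the criticism reviewed above.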
5. Results

The study set out to compare two different approaches to using experts' opinions of future developments in the policy decision process. It was organised so as to explicate:

! the decision sequence for the implementation of the two selected techniques;
! the actual operationalisation of the methodologies;
! the analysis, reporting and comparison of the results of the implementation of the two techniques.

The study involved different levels of implementation decisions for the two techniques, which can be summarised as follows: (a) decisions about the operationalisation of the inquiry; (b) decisions about the analysis of the collected data; (c) decisions about the presentation and evaluation of results; (d) decisions about the dissemination of the final results. The four levels of decisions are related to each other, with decisions taken at the first level having implications for the other levels.

An evaluation of the final results achieved by the implementation of the two methods was obtained directly from the experts participating in the various phases of the study, through an evaluation questionnaire. Table 5 provides a synthesis of the follow-up questions asked of the respondents to the Delphi and SMIC inquiries. The evaluation questionnaire addressed issues related to the specific implementation of the two techniques, the usefulness of the results achieved for transport policy makers and, in the case of the Delphi implementation, for the experts replying to the inquiry. We will now synthesise what can be generalised from the experience of the specific implementation of the two techniques, focusing especially on:

! what the evaluation shows about how useful each method was, and about the relative merits of each approach;
! what the practical experience of the operator was in using the two methods.

5.1. Evaluation of the implementation of the two forecasting methods

Comparing Delphi and cross-impact approaches requires many concrete decisions. The implementation of the two techniques for this study involved using a particular form of the Delphi method, and a pre-given cross-impact technique, the SMIC method. Inevitably, this study is evaluating and comparing these specific implementations and their results.
The effort was made to apply 'best practice' within those versions of the methods. Obviously, the conclusions of this study have greatest bearing on the particular versions applied, but we can consider how far the conclusions reached here are liable to apply to other versions and implementations of these methods, and to methods based on eliciting experts' knowledge more generally.

5.1.1. The usefulness of the methods

In order to gather some insights into the usefulness of the two methods, the experts were asked to complete an evaluation questionnaire. Table 5 summarises the content of the evaluation questionnaires. With respect to the Delphi method, none of the experts reported that the outcomes of the Delphi inquiry included major gaps in how the topic was addressed in this implementation. Thus, the experts assessed that there were few gaps in the operationalisation of the Delphi inquiry, and in how the inquiry was elaborated and managed. This positive assessment indicates
Table 5
Results from the evaluation questionnaires on Delphi and SMIC

General questions on the questionnaire (specific implementation of the method)
Delphi method:
– Easy to answer;
– Difficult to answer;
– Time consuming;
– Lack of alternative scenario as restricting factor;
– Fixed time horizon as restricting factor;
– Implementation was closing off too many subjects prematurely;
– Lack of possibility to consider interdependencies among questions as limiting or enhancing factor;
– Indication of percentage of answers changed in the 2nd round;
– Specific implementation causing serious gaps in overall results.
SMIC method:
– Easy to answer;
– Difficult to answer;
– Time consuming;
– Lack of alternative scenario as restricting factor;
– Fixed time horizon as restricting factor;
– Implementation was closing off too many subjects prematurely;
– Specific implementation causing serious gaps in overall results.

For overall qualitative assessment of the method
Delphi and SMIC methods: listing three advantages and three disadvantages.

Contribution of results for transport policy makers
Delphi method:
– Providing insights at European, national, city level;
– Useful for background information;
– Highlighting key technologies;
– Identifying obstacles;
– Helping to set priorities;
– Useful for transport research planning;
– Useful for regional and urban transport planning.
SMIC method:
– Providing insights at European, national, city level.

Provision of knowledge for respondents (Delphi method; results of the inquiry)
– Confirming existing views;
– Suggesting new insights;
– Providing issues to think about;
– Revealing disagreement among experts;
– Providing better understanding of technology trends;
– Providing better understanding of transport policy trends.
that this implementation and the results achieved were rather comprehensive. Furthermore, the experts assessed the inquiry as a valuable means of communication for exchanging opinions on the topic. This was especially true of the Delphi method, which provides results through statistics of group responses, in which the opinions of the entire sample of experts are represented. The experts found that the Delphi method makes it possible to assess the existence of broad views on the future of a topic, increases the credibility of expectations, and acts as a tool that helps to reinforce personal opinions. Finally, such results can encourage action and investments in a field (in our case, in ATT technologies). On the other hand, the experts felt constrained–in this particular implementation–since there was no space to forecast topics other than the ones selected by the designer of the method. (This would not be
the case in a more traditional form of Delphi, where topics for study would initially be defined by the pool of experts. In this respect the study fell short of good practice, but this was a flaw also reproduced in the SMIC. Of course, the existence of a major research programme on the topic areas covered suggested that we would be duplicating effort by asking for topics as if from a blank slate. But there is always the possibility of new topics arising after the programme was formulated, and there might also be alternative views of key topics.) Then, not all participants appreciated one of the defining characteristics of Delphi, namely that the inquiry takes place over successive rounds; this was seen as something of a forced way to reach convergence of opinions, and might have led some experts to drop out after the first round. However, avoiding successive rounds would have made it impossible to evaluate the methodological principles of the Delphi technique. Finally, the evaluation of the Delphi demonstrates that a limitation of this exercise was seen as being the lack of consideration of possible interrelations and interdependencies among different items. This is a problem with practically all Delphi exercises: overcoming it means moving well beyond standard Delphi practice. The experts considered that this Delphi implementation was particularly useful for improving understanding of future trends of ATT technologies at European level, less so at National or Local level. They reported that the results of the inquiry formed useful information with which to support transport policy decisions and to help guide new research on ATT technologies. A remarkable result of the Delphi evaluation was that all participants felt that the results would be useful for transport policy makers.
The degree of perceived usefulness varied among experts, who again expected the results to be more useful for policy makers dealing with transport policy at European level rather than at National or Local level. The results were seen as especially helpful as a source of background information for transport policy, for identifying key technologies, and for setting priorities in the field of ATT technologies. The evaluation of the implementation and results of the SMIC method was undertaken by a much smaller sample of experts. (Of the 126 experts contacted for the SMIC inquiry we received 69 valid responses, but only 13 of these 69 returned a completed evaluation questionnaire.) This could be a source of bias (for example, perhaps only the experts who were particularly favourable to the SMIC method replied to the evaluation questionnaire; those who had disliked the experience might have been disinclined to complete yet another survey). As with the Delphi, none of those who did reply reported major gaps either in this implementation of SMIC or in the results achieved. As a general observation, experts considered the task of completing the SMIC questionnaire more difficult than the Delphi. This was compounded by the formulation of the selected event statements, which included reference to the percentage of European medium-sized cities using a certain transport telematics technology. Nevertheless, the experts regarded one of the major advantages of the technique to be its precision: the method provided very clear instructions and explanation of its purposes. As with the Delphi method, the study was evaluated by experts as a useful occasion for a large sample of experts to provide and share an overview of their opinions on future trends of ATT technologies.
Further, the presentation of results by means of possible scenarios, with an explanation of the more likely ones, was considered an interesting way to evaluate the outputs of the method (independently of whether the experts agreed with its outcomes). On the other hand, one of the main goals and supposed advantages of the method, the assessment of interrelationships among events, was evaluated by experts as complex and time consuming. The major
problem was that experts sometimes reported difficulty in assessing cross-impacts, because they had the feeling they were contradicting opinions expressed in previous answers. This reflects two problems of the method. The first is cognitive: the difficulty of keeping in mind the factors that led to previous decisions. The second is methodological: the method only allows pairwise cross-impacts to be assessed, but more complex interactions may occur as a result of several factors co-occurring. Though not explicitly addressed in the method, this phenomenon may well enter respondents' consciousness.

5.1.2. The two methodologies in retrospect

A general observation is that these (and other) forecasting methods have generally been treated separately in the literature, with little comparative analysis. The techniques have somewhat different methodological principles, and may be intended to provide information for different purposes, though the literature is not very clear about what these are. But both use survey tools to elicit and share the knowledge of experts, and comparison is thus both cogent and desirable. Indeed, during the implementation of the two methods, the two techniques raised many similar issues and questions. For example, both Delphi and SMIC confront the problem of how many experts should receive the inquiry. Issues such as how to build a clear event statement, and what the 'best' time horizon is for a reasonable forecast, have to be considered in the application of both Delphi and SMIC. There is scope for trying to harmonise practice in respect of these commonalities. Further research could examine how far it is possible to have common 'rules' for the implementation of expert-based methods. Compared to the Delphi method, relatively little has been written about using the cross-impact method, which seems to have been used much less frequently.
This might reflect its limited flexibility when applied to complex topics, as compared to the Delphi method. The mushrooming of the number of estimates required as new statements are added to a cross-impact study proves a very serious limit to analysis. However, cross-impact analysis does focus on interrelationships among events, which may make it a useful tool when forecasting well-defined topics. Guidelines that can help in developing 'best practice' implementations of these methods are sorely needed. There is little indication in the literature of which types of event statement might best be combined or kept well apart in a cross-impact inquiry; for example, should we avoid mixing events that are technology-related with others that are more economics-oriented? (Such a mixture might invalidate the outcomes by making it impossible to assess, or for one's pool of experts legitimately to assess, the cross-impacts among events.) Another drawback that emerged in the implementation of the SMIC method, and where there is still scope for further research, is the difficulty of establishing the consistency of the technique. The underlying mathematical structure and computational routines are not easy for non-experts to understand, and it is thus unclear whether more or less arbitrary 'technical' assumptions are influencing the results. The explanations in the SMIC literature and software handbook effectively expect the user to trust a 'black box'. Notwithstanding the numerous applications of the Delphi method, few studies have been undertaken to clarify aspects of the implementation of this technique, and most of these date from the early days of the method. Of course, there have been many attempts and discussions in the literature concerning the advantages, drawbacks and pitfalls of the technique. But we lack a technical template that provides answers to the various questions the user faces when implementing this method.
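The mushrooming of estimates, and the sense of self-contradiction that respondents reported, can both be made concrete. The sketch below is illustrative only (the function names and the numbers are ours, not drawn from the original study); it assumes the standard SMIC elicitation pattern of simple probabilities plus two conditional probabilities for every ordered pair of events.

```python
# Sketch (not from the original study): the elicitation burden and the
# coherence problem behind a SMIC-style cross-impact inquiry.

def smic_estimate_count(n: int) -> int:
    """Number of probabilities an expert must supply for n event statements:
    n simple probabilities P(i), plus, for every ordered pair (i, j),
    the conditionals P(i|j) and P(i|not j): n + 2*n*(n-1) in total."""
    return n + 2 * n * (n - 1)

def coherence_gap(p_i: float, p_j: float,
                  p_i_given_j: float, p_i_given_not_j: float) -> float:
    """By the law of total probability, coherent answers must satisfy
    P(i) = P(i|j)*P(j) + P(i|not j)*(1 - P(j)).
    Returns the discrepancy; raw expert answers are rarely exactly zero
    here, which is why SMIC applies a correction step."""
    implied = p_i_given_j * p_j + p_i_given_not_j * (1 - p_j)
    return p_i - implied

# The burden grows quadratically: 5 events already need 45 estimates,
# 10 events need 190.
for n in (5, 6, 10):
    print(n, smic_estimate_count(n))

# A typical "contradiction": a direct estimate P(i) = 0.6 cannot be squared
# with P(i|j) = 0.9 and P(i|not j) = 0.2 when P(j) = 0.5 (these imply 0.55).
print(round(coherence_gap(0.6, 0.5, 0.9, 0.2), 3))  # 0.05
```

The quadratic growth of `smic_estimate_count` is one plausible reading of why SMIC implementations are restricted to very few event statements, and the non-zero `coherence_gap` mirrors the experts' reported feeling of contradicting their earlier answers.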
To create such a template would be a large task. But some more detailed general guidance notes would at least ease (and harmonise) the implementation of Delphi. For example, the evaluation of our Delphi implementation made apparent that the experts would have liked responses to be weighted. In the past, self-rating of expertise has been assessed as a tool for weighting responses. Some practitioners are dubious about any formal weighting scheme, preferring simply to exclude non-experts, or at least to contrast the judgements of experts with those of the whole sample. However, the literature does not provide enough evidence that self-rating is an appropriate metric for weighting responses, or show how such a metric should be applied in practice. It would be desirable to investigate further whether a standard tool for weighting expert responses can be found, and under what circumstances it might be applicable. Another aspect of Delphi that suggests further research is the issue of consensus. What degree of consensus is required to make a forecast believable or useful in specific circumstances? Delphi researchers appear to use more or less arbitrary criteria to assess convergence of opinions and to report on it. Theory-based or statistically derived standards could be helpful. However, we also need to be sensitive to the argument that, as already mentioned, the search for convergence of opinion should not be an overriding objective of a Delphi study. Knowledge that there are disagreements can be valuable, especially where there are informed outliers, or where there is a clear divergence of opinion among many respondents.
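As an illustration of what a statistically derived convergence criterion might look like, the sketch below measures the shrinkage of the interquartile range (IQR) of responses between two Delphi rounds. This is one common heuristic, not the criterion used in this study, and the forecast data are invented for the example.

```python
# Sketch of one possible convergence statistic for a Delphi inquiry:
# the fractional reduction of the interquartile range between rounds.
from statistics import quantiles

def iqr(values):
    """Interquartile range of a list of numeric estimates."""
    q1, _, q3 = quantiles(values, n=4)
    return q3 - q1

def convergence(round1, round2):
    """Fractional reduction of the IQR from round 1 to round 2.
    1.0 = full consensus; 0.0 = no narrowing; negative = divergence."""
    before, after = iqr(round1), iqr(round2)
    if before == 0:
        return 0.0
    return 1 - after / before

# Hypothetical forecast dates for one event statement from ten experts:
r1 = [2005, 2008, 2010, 2010, 2012, 2015, 2015, 2018, 2020, 2030]
r2 = [2008, 2010, 2010, 2011, 2012, 2013, 2015, 2015, 2016, 2020]
print(round(convergence(r1, r2), 2))  # 0.42
```

A designer would still have to choose a threshold (for instance, stopping when the IQR reduction exceeds some fraction), and that choice is exactly the kind of arbitrary decision a theory-based standard could replace; note too that a convergence statistic deliberately ignores informed outliers, which the text above argues can themselves be valuable.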
6. Conclusions and implications

Nothing in this study challenges the view that there is a vital role for forecasting in informing the S&T policy decision process, and that methods based on soliciting experts' opinions are important tools here. However, we are still far from having enough knowledge about these tools to show conclusively how they can best be systematically fed into (let alone used in) the policy decision process. A striking lesson from the present study, relevant to these particular futures studies methodologies based on eliciting expert opinions, quite plausibly applies to many other methods too. It is that, in the published literature at least, very little explicit progress has been made since the 1970s in analysing the implementation of such forecasting methods, and the implications of different implementation and design decisions. Though there is almost endless debate on the philosophy and epistemology of Futures Studies, there is a remarkable lack of reflection and research on the practical application of futures methods. This is surprising, given the boost to the field of Futures Studies since the beginning of the 1990s, when Foresight exercises have been undertaken very widely, at very broad levels, with considerable resources, and with striking policy impacts in several cases. Often these studies have made considerable use of the Delphi method. This study has not been able to draw firm conclusions as to how to select the most appropriate forecasting methodology for a particular task. It is evident that the choice of methodology is highly related to both the topic and the precise objectives of a study. In principle, methodologies based on soliciting experts' opinions are applicable to any topic. In practice, the degree of flexibility of these techniques varies, and the boundary conditions associated with different methods can constrain the freedom of the user.
(The results of the implementation of the two methods in the present study demonstrated this. Both approaches were limited in the number of topics that could practically be considered; SMIC especially was limited to very few topics. Similarly, only a limited set of relationships between topics, if any, could be elaborated.) The input data elicited from experts will be affected by
these boundary conditions, and by the way data are processed and filtered in the specific implementation. What information is captured (and how it is framed) depends heavily on these factors. Different techniques, then, even when based on soliciting experts' opinions through inquiries and applied to the same topic, are likely to achieve different levels and types of knowledge and information, because each technique has a different process for gathering and analysing information. The study shows that it is possible to use and combine different methodologies to confirm results and to achieve different types and degrees of detail of knowledge and information. The user of a technique has to take many decisions, in which a substantial degree of tacit knowledge is necessary. Not only is knowledge about the mechanics of the method necessary; the designer's comprehension of the topic of enquiry is also liable to shape the formulation of the questions. The same technique may be applied to the same topic, even using the same experts, but may be implemented in different ways by different designers, drawing on different sets of tacit knowledge (about the topic and about the technique). A sequence of many decisions is required for the implementation of expert-based methods, and the precise decisions taken may well influence the usefulness of a forecast. Some of these decisions will be taken on the basis of tacit knowledge; some reflect specific circumstances. A technical template, with some ground-rules on implementation, should facilitate users' decision-making: at the very least by helping them avoid the pitfalls of the technique, avoid suboptimal applications, and identify where there are tricky decisions to be made. Finally, the study is not only a methodological contribution.
Its results are intended to help inform decisions concerning Research and Development (R&D) and related issues in the area of transport telematics systems and their implications. The transport experts involved in the study generally considered the (particular implementations of the) methods to be useful tools for sharing knowledge. The specific results of the study can be used as a basis for further work and research in the transport field; for example, the results could be examined more specifically at national level, and an inquiry along these lines could be undertaken nationally, focusing on cities of a selected country (or cities of different types). This could be useful for examining the constraints on, and the requirements for, the adoption of ATT systems within countries and across different sorts of city. Further work on these lines could valuably extend to consultation exercises directly with transport policy makers (or those engaged in making decisions in any field in which such methods are employed). For example, a policy workshop could be organised, with the results of the forecasting work used as a basis for debate; or personal interviews with policy makers could explore their reactions to a forecasting study. Such approaches would enable assessment of user opinions on the practical value of the outcomes, both in terms of providing useful information about the policy domain and in terms of helping to develop actions, priorities or recommendations for further study.

Appendix A. Other references

Alter, S. (1979). "The evaluation of generic cross-impact models." Futures 11(2): 132-150.
Bardecki, M.J. (1984). "Participants' response to the Delphi method: an attitudinal perspective." Technological Forecasting and Social Change 25(3): 281-292.
This list of references is relevant for the issues debated in this article and forms part of the bibliography of the completed study.
Beasley, J.E. and R. Johnson (1983). "Some cross-impact refinements." Futures 15(3): 226-228.
Blackman, A.W. (1973). "A cross-impact model applicable to forecasts for long-range planning." Technological Forecasting and Social Change 5(3): 233-242.
Dalkey, N.C. (1975). "Toward a theory of group estimation." In H.A. Linstone and M. Turoff (eds), The Delphi Method: Techniques and Applications. Reading, Massachusetts and London, Addison-Wesley: 236-261.
Dalkey, N.C., B. Brown, et al. (1969). The Delphi Method III: Use of Self-Ratings to Improve Group Estimates. The Rand Corporation.
Dalkey, N.C. and D.L. Rourke (1971). Experimental Assessment of Delphi Procedures with Group Value Judgements. ARPA.
Ducos, G. (1983). "Delphi et analyse d'interactions." Futuribles 71 (November 1983): 37-44.
Duperrin, J.C. and M. Godet (1975). "SMIC 74: a method for constructing and ranking scenarios." Futures 7(August): 302-312.
Duval, A., E. Fontela, et al. (1974). Cross Impact: A Handbook on Concepts and Applications. Geneva, Battelle Geneva Research Centre.
Enzer, S. and S. Alter (1978). "Cross-impact analysis and classical probability: the question of consistency." Futures 10(3): 227-239.
Evans, J.S.B.T. (1987). "Beliefs and expectations as causes of judgmental bias." In G. Wright and P. Ayton (eds), Judgmental Forecasting. Chichester, John Wiley and Sons: 31-47.
Fischer, R.G. (1978). "The Delphi method: a description, review and criticism." Journal of Academic Librarianship 4(2): 64-70.
Helmer, O. (1977). "Problems in futures research: Delphi and causal cross-impact analysis." Futures 9(1): 17-31.
Kane, J. (1975). "A primer for a new cross-impact language: KSIM." In H.A. Linstone and M. Turoff (eds), The Delphi Method: Techniques and Applications. Reading, Massachusetts and London, Addison-Wesley: 369-382.
Kaya, Y., M. Ishikawa, et al. (1979). "A revised cross-impact method and its applications to the forecast of urban transportation technology." Technological Forecasting and Social Change 14(3): 243-257.
Kelly, P. (1976). "Further comments on cross-impact analysis." Futures 8(4): 341-345.
Makridakis, S., S. Wheelwright, et al. (1983). Forecasting: Methods and Applications. New York, John Wiley and Sons.
Masini, E. Barbieri (1993). Why Futures Studies? London, Grey Seal Books.
Parenté, F.J., J.K. Anderson, et al. (1984). "An examination of factors contributing to Delphi accuracy." Journal of Forecasting 3(2): 173-182.
Scapolo, F. (1999). Prospective on Environmental Consequences of Changes in Urban Transportation: Comparison of Two Methods Eliciting Experts' Knowledge. Faculty of Economics and Social Sciences. Manchester, University of Manchester: 397.
Stover, J. (1973). "Suggested improvements to Delphi/cross-impact technique." Futures 5(3): 308-313.
Turoff, M. (1972). "An alternative approach to cross-impact analysis." Technological Forecasting and Social Change 3(3): 309-339.
Woudenberg, F. (1991). "An evaluation of Delphi." Technological Forecasting and Social Change 40: 131-150.
References

[1] J.C. Glenn, T.J. Gordon (Eds.), Futures Research Methodology, The United Nations University, Washington, 2003.
[2] H. Rush, I. Miles, Surveying the social implications of information technology, Futures 21 (3) (1989) 249-262.
[3] O. Helmer, Reassessment of cross-impact analysis, Futures 13 (5) (1981) 389-400.
[4] T.J. Gordon, H. Hayward, Initial experiments with the cross impact matrix method of forecasting, Futures 1 (2) (1968) 100-116.
[5] M. Godet, F. Bourse, et al., Problèmes et méthodes de prospective: Boîte à outils, ADITECH-Futuribles, Paris, 1990.
[6] CEC, Advanced Transport Telematics: 1993 Annual Project Review, Part 1, Commission of the European Communities, DG XIII, Brussels, 1993.
[7] CEC, Advanced Transport Telematics: 1993 Annual Project Review, Part 2, Commission of the European Communities, DG XIII, Brussels, 1993.
[8] CEC, Advanced Transport Telematics: 1993 Annual Project Review, Part 3, Commission of the European Communities, DG XIII, Brussels, 1993.
[9] CEC, Advanced Transport Telematics: 1993 Annual Project Review, Part 4, Commission of the European Communities, DG XIII, Brussels, 1993.
[10] CEC, Advanced Transport Telematics: 1993 Annual Project Review, Part 5, Commission of the European Communities, DG XIII, Brussels, 1993.
[11] CEC, Advanced Transport Telematics: 1993 Annual Project Review, Part 6, Commission of the European Communities, DG XIII, Brussels, 1993.
[12] F.J. Parenté, J.K. Anderson Parenté, Delphi inquiry systems, in: G. Wright, P. Ayton (Eds.), Judgmental Forecasting, John Wiley & Sons, Chichester, 1987, pp. 128-156.
[13] M. Godet, From Anticipation to Action: A Handbook of Strategic Prospective, UNESCO, Paris, 1993.
[14] G. Marbach, C. Mazziotta, et al., Le previsioni: Fondamenti logici e basi statistiche, ETAS Libri, Milano, 1991.
[15] V.M. Mitchell, The Delphi technique: an exposition and application, Technol. Anal. Strateg. Manag. 3 (4) (1991) 333-352.
[16] H. Sackman, Delphi Critique, Rand Corporation, 1975.
[17] M.T. Bedford, The future of communications services into the home, Bell Canada Business Planning, 1972 (September).
[18] J.R. Salancik, W. Wenger, et al., The construction of Delphi event statements, Technol. Forecast. Soc. Change 3 (1971) 65-73.
[19] J.P. Martino, Technological Forecasting for Decision Making, 3rd ed., McGraw-Hill, New York, 1993.
[20] K.Q. Hill, J. Fowles, The methodological worth of the Delphi forecasting technique, Technol. Forecast. Soc. Change 7 (1975) 179-192.
[21] R.B. Mitchell, J. Tydeman, A note on SMIC 74, Futures 8 (February) (1976) 64-67.
[22] M. Godet, SMIC 74: a reply from the authors, Futures 8 (4) (1976) 336-340.
[23] R.B. Mitchell, J. Tydeman, A further comment on SMIC 74, Futures 8 (August) (1976) 340-341.
[24] M. McLean, Does cross-impact analysis have a future? Futures 8 (4) (1976) 345-349.
Dr. Fabiana Scapolo leads the European Foresight team at the European Commission Directorate General Joint Research Centre, Institute for Prospective Technological Studies (DG JRC-IPTS), Seville, Spain. She is in charge of several projects aiming at contributing to, and consolidating, the European Foresight knowledge base.

Ian Miles is Professor of Technological Innovation and Social Change at Manchester Business School, The University of Manchester (UK). He is co-Director of the research centres CRIC and PREST at that University, where he works on Foresight studies, innovation research (especially concerning Information Technology and service sectors), and the role of knowledge-intensive business services.