Comparing willingness-to-pay: bidding game format versus open-ended and payment scale formats


Health Policy 68 (2004) 289–298

Emma J. Frew (a), Jane L. Wolstenholme (b), David K. Whynes (c,*)

(a) Health Economics Facility, University of Birmingham, Birmingham, UK
(b) Health Economics Research Centre, University of Oxford, Oxford, UK
(c) School of Economics, University of Nottingham, Nottingham NG7 2RD, UK
(*) Corresponding author. Tel.: +44-115-951-5463; fax: +44-115-951-4159. E-mail address: [email protected] (D.K. Whynes).

Accepted 14 October 2003

Abstract

The willingness-to-pay technique is being used increasingly in the economic evaluation of new health care technologies. Clinical trials of two methods of screening for colorectal cancer are currently being conducted in the UK and willingness-to-pay for screening has already been estimated by means of a questionnaire survey, using open-ended (OE) and payment scale (PS) formats. This paper addresses the same medical issue, although it elicits willingness-to-pay values by means of a bidding game in an interview setting. Interviews were conducted with 106 subjects in Nottingham. The bidding game format produced considerably higher valuations than had either of the previous questionnaire formats, whilst the significant differences between agreed valuations obtained using different initial bids supported the existence of starting-point bias in the bidding game. As with the questionnaire study, the majority of interview subjects offered relative valuations of tests at variance with their expressed preferences over the same tests. Given the significant difference in valuations generated by different formats, it follows that the economic case for preferring any one technology over others will depend considerably upon whichever format happens to have been used to generate the valuations. © 2003 Elsevier Ireland Ltd. All rights reserved.

Keywords: Bidding game; Colorectal cancer; Screening; Valuation; Willingness-to-pay

1. Introduction

Faced with resource constraints, decision-makers in public health care systems look towards economic evaluation as a guide when selecting, from amongst innovative health care technologies, those most appropriate for implementation. The willingness-to-pay (WTP) technique—originally developed by environmental economists—is now well established as one possible method for such evaluation [1–3]. In essence, WTP attempts to assess the worth of a prospective intervention directly, by asking subjects to nominate their monetary valuations of the benefits perceived or expected to result, were the intervention to be made available. Whilst applications of WTP in health evaluation have proliferated, methodological controversies remain, not least with respect to the method or format

through which the WTP valuations might be elicited. In part, this controversy persists because of lack of evidence. It is rare for researchers to use more than one format when undertaking any one evaluation, with the consequence that the comparative performance of different formats used to value the same intervention remains largely unresearched. Specifically, it remains unclear whether the WTP valuations obtained are invariant to elicitation format. Colorectal cancer is one of the most commonly occurring cancers in the industrialised world and two alternative screening protocols have been the subjects of major randomised controlled trials in the UK. These are biennial faecal occult blood (FOB) testing for those aged 50–74 years [4] and once-only flexible sigmoidoscopy (FS) at age 60 years or thereabouts [5]. As a component of the UK evaluation of colorectal cancer screening, a WTP study has already been undertaken [6]. Data on WTP for both FOB and FS screening were captured by means of a survey, using questionnaires designed for self-completion without supervision and distributed to a general population drawn from across the Trent region (east-central UK). The survey elicited WTP valuations under two different formats (offered to subjects on a random basis), the open-ended (OE) and the payment scale (PS). Under the OE format, the subject is invited to choose his/her own WTP valuation, unbounded and unprompted, whereas, under PS, subjects choose a value from the same pre-specified and ordered list. Although each of these formats has been used extensively in other health care contexts [7], neither is without criticism. For example, in deliberately failing to provide subjects with any clues as to plausible values, the OE valuation question can prove difficult to answer. A survey using the OE format might therefore be expected to yield a low response rate, and with valuations possibly ill-considered. Conversely, by explicitly indicating a range, the PS format makes the valuation task more comprehensible to the subject, although the scale itself might influence the subjects’ decisions. Thus, a very high scale endpoint might lead a subject to infer that such high levels are actually the most appropriate valuations and responses may accordingly be biased upwards [8]. Whilst being repeatedly employed, both the OE and the PS formats are pre-dated, in environmental economics, by a third. The first WTP format used in

that area was bargaining or the “bidding game”. Bidding elicits WTP by means of interaction between investigator and respondent. The investigator offers the subject a WTP value which is accepted or rejected, and continues to make higher or lower offers depending upon whether the subject accepts or rejects the previous offers. By the format’s very nature, data collection by means of a survey is inappropriate for the bidding game format, which requires a sample of one-to-one interviews. The proponents of the bidding format argue that the process of iteration towards an agreed WTP value requires subjects to consider their responses more carefully than with other formats. Unfortunately, eliciting WTP by means of bidding is suspected of being prone to its own specific bias; final valuations may be conditional on the starting-point chosen to initiate the bid sequence [8]. Having already obtained WTP valuations for colorectal cancer screening using the OE and PS formats administered by questionnaire survey, we initiated an interview-based bidding study to value the same intervention. This paper presents the findings of that study and compares the results with those of the questionnaire formats. The specific research questions are the following. First, do the WTP valuations elicited by interview using the bidding game differ significantly from those obtained by questionnaire using the OE and PS formats? Second, do the bidding game valuation results themselves provide evidence of starting-point bias?

2. Method

The subjects in the bidding study were all registered with one large general practice in the suburbs of Nottingham, east-central UK. After obtaining full ethical approval for the research, invitation letters were mailed to a random sample of individuals so registered. With guidance from the practice, we excluded three classes of individual from the randomisation. These were, first, persons under 25 years of age, on the grounds of the likely perceived irrelevance of colorectal screening to that age group. Second, we excluded individuals with a recent diagnosis of colorectal cancer in the family, on the grounds of wishing to minimise distress. Finally, persons with substantial reading, learning or language difficulties

were excluded, on the grounds of their potential incapacity to participate fully. The earlier questionnaire study had been issued to subjects via general practitioners and identical a priori exclusions had been applied. Each invitation letter, issued via the practice, described colorectal cancer as a health problem, and individuals were asked whether they would be willing to come to the practice for a confidential discussion with an independent (non-clinical) interviewer about aspects of screening. Specifically, they were told that university researchers were “trying to get an idea of how people would value a national screening programme for colorectal cancer” and that “they would appreciate your opinion”. Interested persons completed response forms and returned them in postage-pre-paid envelopes. All those so responding received follow-up letters, thanking them for their interest and requesting them to indicate, on an attached timetable, those dates and times at which they would find it convenient to attend for interview. All interviews were conducted by the same researcher (E.J.F.). At the start of each, the interviewer described and illustrated the nature of colorectal cancer and the purpose of the research. This was followed by explanations of each of the screening methods or tests, FOB and FS. Verbal descriptions were augmented by visual aids, including large-type written displays of the test protocols and illustrations of the equipment to be used. At this stage, the interviewer offered to respond to any of the participant’s queries regarding the disease or the screening processes. Each subject was then asked whether or not they would be willing to take one or other of the CRC screening tests. Those so willing were invited to express a preference (FOB, FS or indifferent). Participants agreeing in principle to take a screening test were then told that the researchers were concerned to identify people’s valuations of the two screening methods. The bidding game approach to valuation was explained. For each screening method in turn, the subject was offered an initial bid, and asked whether this was an amount she/he would be willing to pay. If the subject accepted the test at this bid, the interviewer then offered a higher value, and again sought approval from the subject. When a specific bid was rejected, a lower value was offered and acceptance was then sought.
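For illustration only, the sketch below (ours, not the study instrument) expresses this iterative negotiation in Python, using a simplified bid ladder and stopping rule; the offer paths actually used, including the request to specify an amount above £1,000, were pre-specified as shown in Fig. 1.

```python
# Illustrative sketch of an iterative bidding game: the offer is raised after
# each acceptance and lowered after each rejection, and negotiation stops once
# a bid has been accepted and the next higher bid refused. Ladder values and
# stopping rule are simplified for illustration.

def run_bidding_game(ladder, start_value, accepts_bid):
    """Return the highest accepted bid, or 0 if even the lowest bid is refused.

    ladder      -- ascending list of bid levels, e.g. [10, 50, 100, 200, 500, 800, 1000]
    start_value -- the randomly assigned starting bid (must appear in the ladder)
    accepts_bid -- callback standing in for the respondent: bid -> True/False
    """
    i = ladder.index(start_value)
    if accepts_bid(ladder[i]):
        # Climb the ladder until a higher bid is refused or the top is reached.
        while i + 1 < len(ladder) and accepts_bid(ladder[i + 1]):
            i += 1
        return ladder[i]
    # Descend the ladder until a bid is accepted or every level has been refused.
    while i > 0:
        i -= 1
        if accepts_bid(ladder[i]):
            return ladder[i]
    return 0  # all bids refused: recorded as a zero valuation


if __name__ == "__main__":
    ladder = [10, 50, 100, 200, 500, 800, 1000]
    # A respondent whose (unobserved) maximum WTP is £84 accepts any bid up to that amount.
    respondent = lambda bid: bid <= 84
    print(run_bidding_game(ladder, start_value=200, accepts_bid=respondent))  # -> 50
```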

As noted earlier, a suspected weakness of the bidding game is its vulnerability to starting-point bias, i.e. higher starting bids tend to produce higher accepted bids, ceteris paribus [3]. To test this proposition, subjects were assigned at random to one of three different starting bids, subject to ensuring that approximately equal numbers eventually received each bid. The choice of these values was informed by the earlier questionnaire study [6]. The OE and PS results combined had produced a median WTP of £50, for both FOB and FS. The first two values selected for the bidding game, £10 and £200, represented amounts approximately equi-distant and towards the two tails of the earlier WTP distribution, at the 13th and 86th percentile for FOB, respectively (15th and 89th for FS). Both values had appeared on the PS instrument. In an effort to maintain comparability by providing subjects with similar value ranges, the third starting bid—£1000—was the same as the highest value available to PS subjects. Following a principle established by previous researchers [9], the paths of offers following each starting bid were pre-specified by algorithms, to ensure that all subjects who had received the same initial bid were also subject to the same negotiation process. These algorithms are illustrated in Fig. 1. For valuations in excess of £1000, subjects were asked to specify an amount (the same had applied to the earlier PS subjects). The negotiation ended with the subject agreeing to the bid arrived at via the appropriate algorithm.

One of the fundamental concerns about applying the WTP approach in a hypothetical setting is whether or not respondents fully understand the problem being asked of them. The interviewer asked each subject to rate how easy or difficult they had found the valuation exercise (five-point scale, easy to hard), with a view to assessing self-perception of comprehension. To encourage the respondents to think about WTP in the context of real-world behaviour, each was asked to nominate a recently-purchased good or service of equal value to the agreed bid. The subject was then faced with the WTP questions again, this time with the nominated good in place of the WTP value. For example, if a subject had agreed a WTP value of £200 for the FOB test and declared they had recently spent that amount on a short holiday, she/he was asked if she/he would indeed be willing to sacrifice such a holiday for the test. This comparison is a form of framing [10],

[Figure 1 appears here as a flow diagram. It shows the three pre-specified bidding algorithms: Algorithm 1 starts at £10, Algorithm 2 at £200 and Algorithm 3 at £1,000. Depending on the subject's yes/no responses, offers move through the levels £10, £50, £100, £200, £500, £800 and £1,000; a value of £0 is recorded if every bid is refused, and subjects accepting £1,000 are asked to specify a higher amount.]

Fig. 1. Bidding algorithms (values in £).

and we believed it would confront the subject with a more concrete notion of opportunity cost than would a monetary value requested in isolation. If, on reflection, the subject declared the nominated good was not equivalent to the valuation, the WTP bidding process was repeated, until subjects reached a revised WTP valuation, and an equivalent good they would agree to forego. Finally, each subject was requested to supply personal information. The social data collected comprised gender, age and age on leaving full-time education. Economic data comprised employment status and annual household income (as one of four bands, starting at zero, band-width £10,000, ending at £30,000 and above). Perceived own-health status (four levels: poor, fair, good, excellent) and smoking status were recorded. Respondents were asked for the number of visits to their dentist in the past two years. Frequency of visits to the dentist was deemed relevant as it had proved in the past to be a robust predictor of actual compliance with FOB screening in the UK trial [11,12]. Subjects were asked to note whether they were particularly worried about CRC

(four levels: not at all, a bit, quite, very) and to identify their perceived chances of eventually suffering the condition, compared with the average for men or women of their own age (five levels: much lower, lower, same, higher, much higher). Irrespective of screening, both FOB tests and sigmoidoscopy occur routinely in the medical management of bowel conditions, and subjects were asked whether they had previously experienced either of these investigations.
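The characteristics collected here are compared across the three starting-bid sub-samples in Table 1 (χ2 tests for proportions, one-way F tests for means). As a rough illustration of how such comparisons can be computed, the sketch below assumes the individual-level data were held in a data frame; the data frame and column names are our own illustrative assumptions, not the study dataset.

```python
# Sketch of sub-sample comparisons of the kind summarised in Table 1:
# chi-squared tests for categorical characteristics and a one-way ANOVA F test
# for means, across the three starting-bid groups. Column names are
# illustrative assumptions, not the study's data.
import pandas as pd
from scipy import stats

def compare_by_starting_bid(df: pd.DataFrame):
    # Categorical characteristic (e.g. smoking status): chi-squared test on the
    # starting-bid x category contingency table.
    table = pd.crosstab(df["starting_bid"], df["smoking"])
    chi2, p_chi2, dof, _ = stats.chi2_contingency(table)

    # Continuous characteristic (e.g. age): one-way F test across the groups.
    ages = [group["age"] for _, group in df.groupby("starting_bid")]
    f_stat, p_f = stats.f_oneway(*ages)

    return {"smoking (chi2, P)": (chi2, p_chi2), "age (F, P)": (f_stat, p_f)}


if __name__ == "__main__":
    toy = pd.DataFrame({
        "starting_bid": [10, 10, 200, 200, 1000, 1000],
        "smoking": ["current", "never", "never", "ex", "never", "ex"],
        "age": [54, 61, 58, 49, 63, 57],
    })
    print(compare_by_starting_bid(toy))
```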

3. Results

A total of 106 interviews were conducted. After being provided with descriptions of colorectal cancer and the screening procedures, six subjects indicated that they would be unwilling to undertake either test. Accordingly, they were not asked for a preference or a WTP valuation. Table 1 displays descriptive statistics of the 100 interview subjects indicating a preference, and for the sub-samples subsequently offered each of the three bidding algorithms. There is no evidence of significant differences between the three sub-samples

Table 1
Descriptive statistics

                                                             Full sample   By starting bid (£)              χ2/F    P-value
Characteristic                                                             £10      £200     £1000
n                                                            100           33       34       33
Gender (%, male)                                             49.0          51.5     47.1     48.5            0.14    0.93
Mean age (years)                                             55.5          53.7     57.1     55.7            0.58    0.56
Mean age on leaving full-time education (years)              16.1          16.6     15.4     16.2            2.63    0.08
Employment (%, employed)                                     51.1          54.8     48.5     50.0            0.28    0.87
Annual household income
  <£10K (%)                                                  17.6          14.3     23.3     14.8            4.81    0.57
  £10–20K (%)                                                37.6          28.6     40.0     44.4
  £20–30K (%)                                                27.1          39.3     16.7     25.9
  >£30K (%)                                                  17.6          17.9     20.0     14.8
Health (%, reporting poor or fair)                           27.0          18.2     26.5     36.4            2.78    0.25
Mean visits to dentist in previous 2 years                   2.9           2.9      2.9      2.9             0.01    0.99
Smoking
  Current (%)                                                12.0          12.1     17.6     6.1             2.84    0.59
  Ex-smoker (%)                                              43.0          48.5     38.2     42.4
  Never smoked (%)                                           45.0          39.4     44.1     51.5
No worries about colorectal cancer expressed (%)             33.0          24.2     41.2     33.3            2.17    0.34
Chances of colorectal cancer perceived above average (%)     11.0          6.1      11.8     15.2            1.42    0.49

for any of the comparisons of characteristics. As the proportion of the UK population in receipt of incomes in excess of £20,000 per annum is approximately 40% [13], the interview sample appears reasonably representative of the UK in income terms. Twelve interview subjects had prior experience of the FOB test, eight had undergone flexible sigmoidoscopy and five had experienced both. There was no evidence of disproportionate representation of experienced subjects across the three algorithm sub-samples. From the earlier questionnaire survey, 2637 subjects had expressed screening preferences. Comparing the characteristics identified in Table 1, the questionnaire and interview samples differed significantly in only three respects. First, the questionnaire subjects were, on average, younger (mean age 49.4 years, Z = 4.62, P = 0.00), and second, they had left full-time education later (at mean age 17.3 years, Z = 4.71, P = 0.00). Third, the questionnaire sample contained a smaller proportion of males (36.6%, Z = 2.52, P = 0.01). A regression analysis of the questionnaire data [6] had revealed that, whilst males tended to offer higher WTP values than did females, those leaving full-time education earlier offered lower values, and

the absolute age of respondents appeared to exhibit no significant effect. Based on our questionnaire experience, therefore, the interview sample had two salient differences in characteristics, one of which might have been predicted to increase mean WTP in relation to the questionnaire sample (gender composition) and one of which might have lowered it (years of education).

Virtually all of the interview subjects were initially unfamiliar with the WTP methodology. Their immediate responses to being set the task were, for example: "I've never thought of treatments like these in terms of how much I'm willing to pay for them" and "once I'd thought about it for a while I realised what you were getting at". In spite of this unfamiliarity, most subjects completed the task, although a minority were unable or unwilling. Some of these simply declined to answer on the grounds that the question was too difficult, whilst others clearly found valuation intrinsically alien: "how can I possibly do that?—you can't attach money to people's lives". Three individuals evidently interpreted the WTP exercise as a precursor to the introduction of charges, and protested against this. In comparison with many other contingent valuation surveys, this proportion of "protestors"

Table 2
Willingness-to-pay for screening: comparison of formats

                                   Distribution of WTP values (%)     WTP values
Format              n              ≤£200       ≥£1000                 Range (£)    Mean (£)    95% CI      Median (£)
FOB
  Interview (all)   95             38.9        36.8                   0–2000       588         496–681     500
    Start £10       32             65.6        12.5                   0–1000       341         218–464     200
    Start £200      31             35.5        35.5                   10–2000      607         439–774     500
    Start £1000     32             15.6        62.5                   10–2000      818         659–976     1000
  Open-ended        955            88.8        2.6                    0–12000      122         92–153      30
  Payment scale     1201           91.3        0.8                    0–2000       90          82–99       50
FS
  Interview (all)   93             32.3        35.5                   10–2500      615         523–706     500
    Start £10       31             48.4        19.4                   10–1000      439         306–572     500
    Start £200      30             33.3        30.0                   10–2500      625         440–810     500
    Start £1000     32             15.6        56.3                   10–1500      774         625–923     1000
  Open-ended        895            96.2        1.9                    0–20000      84          40–129      30
  Payment scale     1084           91.7        0.9                    0–1500       92          84–100      50

appears small [14]. Of those offering a WTP value, 29% rated the WTP exercise exactly at the mid-point of the easy–difficult comprehension scale. Thirty-two percent selected points above the mid-point (towards the "difficult" end), and 39% selected points lower than the mid-point. The goods and services nominated by the subjects as representing a value equivalent to that of the screening tests were drawn from a narrow range. At the low-value end, with subjects agreeing a WTP of up to around £100, the most common selection was along the lines of, in the words of several subjects, "something for myself". Examples included items of clothing, entertainment and "dining out". For valuations of £500 and above, the reference good was almost invariably either a holiday or a major item of household expenditure, such as furniture or electrical equipment.

Table 2 displays summary statistics of the WTP valuations for the interview subjects and from the previously-obtained questionnaire samples. Considering, first, the distribution of WTP values, it is evident that the full interview sample, in comparison with the OE and PS formats and for both FOB and FS, produced considerably fewer values at or below £200 (equivalent to the middle bid level) and far more at or above £1000 (equivalent to the highest). Within the interview sample, for FOB, the proportion offering agreed WTP valuations ≤£200 fell as starting bid


increased (χ2 = 17.05, d.f. = 2, P = 0.00), whilst the proportion accepting ≥£1000 rose (χ2 = 17.23, d.f. = 2, P = 0.00). Similarly, for FS, the proportion accepting ≤£200 fell as starting bid increased (χ2 = 7.76, d.f. = 2, P = 0.02), whilst the proportion accepting ≥£1000 rose (χ2 = 9.95, d.f. = 2, P = 0.01). In effect, the starting bid skewed the corresponding WTP distribution.

Not surprisingly, these differences in distributions were translated into substantial differences in average WTP. The two means from the full-sample interview data are several times higher than those from OE and PS, and the confidence intervals suggest that the differences between the bidding game and questionnaire-based results are highly significant. The median WTPs differ by an order of magnitude, despite each of the questionnaire-based formats yielding a small number of extremely high valuations well beyond those agreed by any of the interview subjects. For the FOB interview data, comparison of all three pairs of valuations by starting bid produced significant differences (Mann–Whitney: £10 versus £200, U = 2.46, P = 0.01; £200 versus £1000, U = 2.10, P = 0.04; £10 versus £1000, U = 4.29, P = 0.00). The differences for FS can only be accepted with reduced confidence, except for the extreme comparison (Mann–Whitney: £10 versus £200, U = 1.65, P = 0.10; £200 versus £1000,

U = 1.82, P = 0.07; £10 versus £1000, U = 3.09, P = 0.00). Amongst the interview data, only one zero WTP value was agreed for one of the tests (FOB); all other agreed values were £10 or higher. In the questionnaire study [6], around one-in-ten respondents had offered zero WTP values, ranging from 8.7%, in the case of the FOB test using the OE format, to 11.6%, for the FS test using the PS format. In the 93 interview cases where WTP values for both FS and FOB had been agreed, a comparison of the valuations accepted for each by each subject enabled us to derive three classes of relative valuation, namely, whether the subject’s WTP valuation for FOB was higher than, less than, or equal to, that of FS. A cross-tabulation of these relative valuations against stated preferences revealed insignificant association (χ2 = 6.02, d.f. = 4, P = 0.20). Although 80.3% of subjects had indicated a definite preference for one or other of the two screening methods, 59.3% offered identical WTP values for both and only 37.4% offered relative valuations completely consistent with their relative preferences. In ten cases, preferences and values reversed, i.e. one method was indicated as preferred yet the other received a higher WTP valuation. The proportions had been broadly similar in the earlier questionnaire study: 52.5% had offered identical WTP values, and only 38.0% had offered valuations completely consistent with preferences. Before reflecting on our findings, we comment on a methodological issue. The WTP technique aims to elicit the maximum valuation from each subject, analogous to the highest price which would be acceptable to a subject faced with an actual purchasing decision. Formats differ, however, in how close to this maximum they can expect to reach, for purely technical reasons. Consider how a maximum WTP of, say, £84 might be represented under the different formats. Because OE respondents are free to select any value they choose, £84 could be nominated directly on an OE instrument. With PS, subjects whose valuations do not appear on the scale are not accommodated precisely, and must accordingly choose either the nearest scale value or, more likely, the lower of the range in which their maximum lies. Faced with a payment scale which offers only the levels of £80 and £90, therefore, this subject would presumably select £80. In the interview

study, our concern for the standardisation of bidding, plus the desire not to subject the interviewee to an excessively-long negotiation, meant that the number of bid levels for subjects to accept or reject was limited. Negotiation ceased when a bid was accepted and a higher bid refused. We should therefore presume that, faced with the bidding game, the above subject would have accepted a bid of £50, having refused one of £100. The bidding game’s valuation would be well short of the actual maximum. It accordingly follows that our reported WTP estimates, which have been based on the highest accepted bid, are likely to be conservative, and more protracted and persistent negotiation could have raised the agreed valuation further. For each subject, the higher, rejected, bid must certainly lie above their maximum WTP, yet the bid which they have accepted might well lie below it. In fact, the maximum WTP could lie anywhere between the accepted bid (as it were, the “minimum maximum”) and marginally (say, £1) below the higher, rejected, bid, or the “maximum maximum”. The WTP means for the maximum maxima of the full interview sample are 49 and 23% higher than those reported in Table 2, for FOB and FS, respectively. Taking WTP as the mid-point between the lower, accepted, bid and a value marginally below the higher, rejected, bid yields WTP means 7 and 12% higher than those reported in Table 2, for FOB and FS, respectively.
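The three summaries discussed above (the accepted bid, the mid-point, and a value marginally below the rejected bid) can be stated directly; the short sketch below, which is ours, reproduces the worked example of a subject with a true maximum of £84 who accepts £50 and refuses £100.

```python
# Sketch of the three WTP summaries discussed in the text, for a subject who
# has accepted one bid and refused the next higher one on the ladder:
#   conservative    -- the highest accepted bid (the "minimum maximum")
#   maximum_maximum -- marginally (here £1) below the lowest refused bid
#   midpoint        -- halfway between the two bounds
def wtp_bounds(accepted_bid: float, refused_bid: float, step: float = 1.0):
    conservative = accepted_bid
    maximum_maximum = refused_bid - step
    midpoint = (conservative + maximum_maximum) / 2.0
    return conservative, maximum_maximum, midpoint

# Worked example from the text: true maximum £84, accepted £50, refused £100.
print(wtp_bounds(50, 100))  # -> (50, 99.0, 74.5)
```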

4. Discussion

The first conclusion to be drawn from this study relates to starting-point bias. As noted earlier, the existence of such bias in bidding games remains the subject of debate amongst those concerned with contingent valuation in health care. Some analysts [9,15] have failed to detect it in their results, although, it seems, slightly more researchers have [16–19]. Our findings place us in the latter camp. Our interview study standardised the medical interventions under consideration, the information flow to subjects and the bidding processes, whilst the characteristics of the three starting bid sub-samples were essentially similar. It therefore seems difficult to attribute the sizeable differences between the distributions of, and the average, WTP valuations in the sub-samples to anything other than the starting bids themselves.
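To illustrate the kind of non-parametric comparison on which this conclusion rests, agreed valuations in the three starting-bid sub-samples can be compared pairwise (Mann–Whitney) or jointly (Kruskal–Wallis). The sketch below uses invented placeholder values rather than the study data, and is intended only to show the form of the tests reported in Section 3.

```python
# Sketch of non-parametric tests for starting-point bias: pairwise
# Mann-Whitney comparisons and a joint Kruskal-Wallis test across the three
# starting-bid groups. The WTP vectors below are invented placeholders.
from itertools import combinations
from scipy import stats

wtp_by_start = {
    10:   [10, 50, 50, 100, 200, 200, 500, 500],
    200:  [50, 200, 200, 500, 500, 500, 800, 1000],
    1000: [200, 500, 500, 800, 1000, 1000, 1000, 1000],
}

# Joint test: are the three distributions of agreed WTP the same?
h_stat, p_joint = stats.kruskal(*wtp_by_start.values())
print(f"Kruskal-Wallis H = {h_stat:.2f}, P = {p_joint:.3f}")

# Pairwise tests, of the kind reported for the FOB and FS valuations in Section 3.
for a, b in combinations(wtp_by_start, 2):
    u_stat, p = stats.mannwhitneyu(wtp_by_start[a], wtp_by_start[b],
                                   alternative="two-sided")
    print(f"start £{a} vs £{b}: U = {u_stat:.1f}, P = {p:.3f}")
```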


Our second conclusion is that the bidding game format, administered via interview, has produced significantly higher average WTP valuations than did either the OE or PS formats, administered via questionnaire. We offer four potential candidates for an explanation as to why the average WTP results differed between formats. The first is sample selection bias. In asking subjects to expend time and effort to participate, it is natural to suspect that the agenda of those volunteering to be interviewed might be at variance with that of the wider population. Volunteers for interview might be expected a priori to have a greater-than-average interest in, or concern about, cancer and screening. On the basis of the characteristics of the samples, however, this explanation does not appear entirely convincing. With respect to those independent characteristics which had disposed individuals towards higher WTP valuations in the questionnaire study [6], the interview sample differed only with respect to gender and age on leaving full-time education. Whilst male gender was “over-represented” in the interview sample, education was “under-represented”, and we might thus expect the effects to compensate in some degree. We feel that explaining such a large valuation difference purely on the grounds of sample bias is asking rather a lot. Second, the high values obtained from the bidding study might possibly be attributed to a bias engendered by our framing requirement. It has been shown that providing subjects with value-good equivalences for goods which they are known to enjoy produces lower valuations than for equivalences for goods which subjects are known to find unappealing [20]. Implicitly, the opportunity cost of, or the value of money for, an enjoyable good is higher and, if subjects’ reference goods had been low generators of utility, then we should have predicted high WTP valuations. We find this explanation unconvincing. Framing is usually employed deliberately by investigators, to manipulate and bias subject responses. Our use was quite different. In the first place, it was the subjects, and not the investigators, who proposed the equivalences, and the former were free to re-negotiate if they felt, on reflection, that they had reached the wrong decision. In the second, for subjects to reach high WTP valuations simply as a result of framing, they would have to have chosen, as reference goods, items offering low utility. Whilst we cannot judge subjects’ utility assignments

to their reference goods with precision, the range offered makes it difficult to believe that recent purchases offering poor value for money were being chosen consistently as comparators. To accept the proposition that framing per se had inflated values would require us, for example, to translate the oft-cited reference good, "something for myself", as "something unpleasant for myself".

The third, and more probable, candidate for explaining high valuations under bidding is interviewer bias, whereby respondents "shape their answers in a way that they think will either please the interviewer or will increase their status in the interviewer's eyes" ([8] p. 239). In particular, respondents might be unwilling to offer low values (especially zero) to an interviewer evidently positively disposed towards the service being valued, or to protest against the idea of valuation. They might fear giving offence or being thought of as "cheap". With an anonymous questionnaire, of course, no such pressures exist. With respect to our earlier remarks about maxima, the bidding format is contributing towards the creation of the WTP value, not simply enabling it to be manifested.

Finally, and in complete contrast, it is entirely possible that it is the information flow within the interview setting which accounts for the variation in value by format. In the questionnaire study, every respondent had received the same written descriptions of colorectal cancer and of the two alternative screening methods, descriptions essentially similar to those which had been provided to persons invited to join the clinical trials. Information flow was therefore uniform, limited, uni-directional and impersonal. In the interview setting, however, subjects received more comprehensive and vivid descriptions. They were able to question and to delve further, and the confirmatory check entailing the selection of a reference good required them to reflect beyond money values to opportunity costs. Arguably, these factors caused subjects to reflect on their WTP decisions more deeply than had been the case for the questionnaire subjects, and their higher valuations followed from a better appreciation of the "true" value of screening. By implication, the lack of data available to the questionnaire subjects had caused them to "under-value" screening. To a large extent, this is conjecture, as we have no basis for analysing how questionnaire subjects actually reached their decisions. Furthermore, the necessary

association between richer information and higher valuation is itself dubious, because it ignores the significance of the information content. Previous studies have shown that richer information might increase perceived value [21], lower it [22] or leave it unchanged [23]. In practical terms, the character of the elicitation format locks these last two potential explanations together. In the absence of the interview there would have been no interviewer bias, but neither would there have been the provision of richer information. In the questionnaire study, the use of the OE and PS formats in estimating WTP yielded a considerable degree of inconsistency between relative valuations for the two screening protocols and stated preferences. Our third conclusion is that the bidding game format seems to possess the same property. In other words, we find no evidence that valuation-preference inconsistency, unlike average WTP valuation, varies with format. Inconsistency is often reported in WTP studies [24,25] and preference reversal—whereby individuals declare they prefer A to B yet offer to pay more for B than for A—has been identified by experimental economists since the 1960s [26]. Explanations for reversal vary, from cognitive illusion, analogous to optical illusion, to the belief that establishing preferences and establishing values actually entails the use of different mental processes [27]. To explain why the preponderance of subjects in our study offered identical WTP values, despite exhibiting distinct preferences, we conjecture that, whilst preferences are determined by one or more salient attributes of a particular screening method, WTP is more predominantly determined by a valuation of screening per se, independent of screening method. Although the WTP formats tested in this and our earlier study made reference to a specific intervention, we see no reason to suppose that our findings must be confined to colorectal cancer screening. This being the case, comparative evaluations of other interventions would be likely to produce equivalent results, although the degree of WTP “uplift” in bidding over OE or PS formats must itself be a function of the character of the specific interviewer and the bid structure. Of course, this supposition will only be confirmed when further format comparisons have been undertaken. The general applicability of our specific finding would be a cause for concern, however. If health care decision makers use WTP results as guides in selecting from

amongst a portfolio of rival technologies, the economic case for preferring any one technology over others will depend considerably upon whichever format happens to have been used to generate the valuations. An intervention where WTP has been elicited using a bidding game, especially one using high starting-bids, will appear more valuable than interventions where questionnaire-based OE or PS formats have been used, ceteris paribus.

Acknowledgements

This research was undertaken as a part of the UK Flexible Sigmoidoscopy Screening Trial, funded by the Imperial Cancer Research Fund, the UK Medical Research Council, NHS R&D and Keymed Ltd. None of the funding bodies had any involvement in the writing of this report or in the decision to submit it for publication.

References

[1] Diener A, O'Brien B, Gafni A. Health care contingent valuation studies: a review and classification of the literature. Health Economics 1998;7:313–26.
[2] Blumenschein K, Johannesson M. Use of contingent valuation to place a monetary value on pharmacy services: an overview and review of the literature. Clinical Therapeutics 1999;21(8):1402–17.
[3] Klose T. The contingent valuation method in health care. Health Policy 1999;47:97–123.
[4] Scholefield JH, Moss S, Sufi F, Mangham CM, Hardcastle JD. Effect of faecal occult blood screening on mortality from colorectal cancer: results from a randomised controlled trial. Gut 2002;50:840–4.
[5] UK Flexible Sigmoidoscopy Screening Trial Investigators. Single flexible sigmoidoscopy screening to prevent colorectal cancer: baseline findings of a UK multicentre randomised trial. Lancet 2002;359:1291–300.
[6] Frew E, Wolstenholme JL, Whynes DK. Willingness-to-pay for colorectal cancer screening. European Journal of Cancer 2001;37:1746–51.
[7] Smith RD. The discrete-choice willingness to pay question format in health economics: should we adopt environmental guidelines? Medical Decision Making 2000;20:194–206.
[8] Mitchell RC, Carson RT. Using surveys to value public goods: the contingent valuation method. Washington DC: Resources for the Future; 1989.
[9] O'Brien BJ, Goeree RAG, Gafni A, Torrance GW, Pauly MV, Erder H, et al. Assessing the value of a new pharmaceutical: a feasibility study of contingent valuation in managed care. Medical Care 1998;36(3):370–84.
[10] Tversky A, Kahneman D. The framing of decisions and the psychology of choice. Science 1981;211:453–8.
[11] Farrands PA, Hardcastle JD, Chamberlain J, Moss S. Factors affecting compliance with screening for colorectal cancer. Community Medicine 1984;6:12–9.
[12] Neilson AR, Whynes DK. Determinants of persistent compliance with screening for colorectal cancer. Social Science and Medicine 1995;41:365–74.
[13] Office of National Statistics. Distribution of household income, 1995–1998: Regional Trends Dataset RT34802. In: http://www.statistics.gov.uk/statbase/xsdataset.asp; 2000.
[14] Jorgenson BS, Syme GJ. Protest responses and willingness to pay: attitude toward paying for stormwater pollution abatement. Ecological Economics 2000;33:251–65.
[15] Onwujekwe O, Nwagbo D. Investigating starting-point bias: a survey of willingness to pay for insecticide-treated nets. Social Science and Medicine 2002;55(12):2121–30.
[16] Stalhammar NO. An empirical note on willingness to pay and starting point bias. Medical Decision Making 1996;16(3):242–7.
[17] Kartman B, Andersson F, Johannesson M. Willingness to pay for reductions in angina pectoris attacks. Medical Decision Making 1996;16:248–53.
[18] Phillips KA, Homan RK, Luft HS, Hiatt PH, Olson KR, Kearney TE, et al. Willingness to pay for poison centres. Journal of Health Economics 1997;16:343–57.

[19] Easthaugh SR. Willingness to pay in treatment of bleeding disorders. International Journal of Technology Assessment in Health Care 2000;16(2):706–10.
[20] Bonini N, Biel A, Garling T, Karlsson N. Influencing what the money is perceived to be worth: framing and priming in contingent valuation studies. Journal of Economic Psychology 2002;23:655–63.
[21] Werner P, Schnaider-Beeri M, Aharon J, Davidson M. Family caregivers' willingness to pay for drugs indicated for the treatment of Alzheimer's disease. Dementia 2002;1(1):59–74.
[22] Domenighetti G, Grillia R, Maggi JR. Does provision of an evidence-based information change public willingness to accept screening tests? Health Expectations 2000;3:145–50.
[23] Philips Z, Johnson S, Avis M, Whynes DK. Human papillomavirus and the value of screening: young women's knowledge of cervical cancer. Health Education Research 2003;18(3):318–28.
[24] Olsen JA. Aiding priority setting in health care: is there a role for the contingent valuation method? Health Economics 1997;6:603–12.
[25] Ryan M, San Miguel F. Testing for consistency in willingness to pay experiments. Journal of Economic Psychology 2000;21:305–17.
[26] Roth AE. Introduction to experimental economics. In: Kagel JH, Roth AE, editors. The handbook of experimental economics. Princeton NJ: Princeton University Press; 1995. p. 3–109.
[27] Hargreaves Heap S, Hollis M, Lyons B, Sugden R, Weale A. The theory of choice. Oxford: Blackwell; 1992.