Asking open-ended consumer questions to aid program planning


Evaluation and Program Planning, Vol. 15, pp. 1-6, 1992. Printed in the USA. All rights reserved.

JACK McKILLIP, KATIE MOIRS, and CHRISTINE CERVENKA
Southern Illinois University at Carbondale

The effects of question length and explicitness of directions (format) were examined for open-ended consumer satisfaction questions at nine health promotion workshops presented on a college campus. Expert judges rated 271 answers written by 97 program participants. Rating dimensions included usefulness for program planning, number of complimentary comments, and number of critical comments. Question format, but not question length, was related to the usefulness of answers. More explicit response directions elicited more useful answers. This relationship was mediated by the ability of explicit directions to elicit critical comments. Implications for the use and the wording of open-ended questions in consumer research are discussed.

The authors thank Marc Cohen, Pat Fabiano, David Elam, and Barb Fijolek for rating open-ended responses. An earlier version of the paper was read at the meeting of the American Evaluation Association, Washington, D.C., October, 1990. Requests for reprints should be sent to Jack McKillip, Applied Experimental Psychology, Southern Illinois University at Carbondale, Carbondale, IL 62901-6502.

A staple task for those involved in program planning and evaluation is the measurement of consumer reactions. Typically, program participants respond to a standardized series of closed-ended questions about services and service providers (Pascoe, 1983). The procedure has several attractive characteristics. The instruments are easy to administer since participants are usually a cooperative and often a captive audience. Questions have strong face validity. Who could better comment on services than those who have received them? Data analysis of closed-ended questions is straightforward, especially with computerized scoring of machine-readable answer sheets.

A characteristic finding of consumer studies is that program participants are happy with services. Reviewing studies of mental health settings, Lebow (1983a) reported that 75% of clients in 34 outpatient programs and 76% of clients in 13 inpatient programs were satisfied. In 221 studies of patient reactions to medical care, Hall and Dornan (1988) found a median of 78-84% of participants satisfied with services. Ease of data collection and analysis, and apparent face validity do not always translate into information useful for program planning. Complications arise because participants make only gross discriminations among program components (Larsen, Attkisson, Hargreaves, & Nguyen, 1979; Smith, Falvo, McKillip, & Pitz, 1984). Analyses reveal that responses to items covering diverse content, whether about parking or individual staff members, reflect only a general valuative factor. Participants see multidimensional program inputs through a single lens (or very few lenses).

The present study was spurred by program developers' assertions that responses to standardized consumer reaction questions did not provide useful information (Lebow, 1983b). An important drawback of standardized consumer reaction instruments is their dependence on the traditional psychometric criteria of reliability and validity (Pascoe, 1983). Instruments are evaluated positively if they yield stable scores that truly reflect consumer attitudes, not if they yield information that can be used to improve program performance. Larsen, et al. (1979) suggest that useful consumer reactions can be gained by supplementing standardized scales with answers to open-ended questions. Unfortunately, literature reviews reveal little research-based guidance on how to ask program consumers open-ended questions. Effort and complexity of scoring open-ended responses may account for the lack of discussion (e.g., Sheatsley, 1983). This study investigated how open-ended questions might be asked to increase the usefulness of answers for program planning and development.

Not until recently has empirical work appeared in the wider social science literature examining open-ended questions (Converse, 1984). Schuman and Presser (1981) reported a series of split-ballot comparisons of open- and closed-ended questions in surveys with national probability samples of adults. Closed-ended questions produced a more uniform frame of reference among respondents than did open-ended questions. Schuman and Presser (1981) suggest that answers to open- and closed-ended questions differ not only because the latter provide specific response options, but also because these options model appropriate answers. Closed-ended questions give directions both for response content and for response format. Open-ended questions may let response format vary freely from respondent to respondent. Responses to open-ended consumer questions may be more useful if question stems give explicit directions about how to respond. This study investigated the hypothesis that open-ended consumer questions with explicit directions for responding lead to more useful answers than questions that do not give explicit directions.

Research by Blair, Sudman, Bradburn, and Stocking (1977) found that longer open-ended questions elicited more accurate answers about socially less desirable behaviors than did shorter closed-ended questions. Sudman and Bradburn (1982) theorize that increased accuracy results because respondents find longer questions less threatening. Schuman and Presser's (1981) research also suggests that the length of open-ended questions may have an effect, by serving as a model for answers. Longer, more thoughtful questions may lead to longer, more thoughtful answers. This study investigated the second hypothesis, that longer consumer questions lead to responses more useful for program planning and development than do shorter questions.

USEFULNESS OF ANSWERS

Analysis of the quality of consumers' reactions to open-ended questions is made more difficult by the lack of an indicator of the "usefulness" of an answer. This study used composite ratings of the usefulness of responses for workshop planning and development made by four expert program developers as the criterion measure.

METHOD

Subjects

Subjects were voluntary participants in half-hour health promotion workshops presented by the professional staff of the Southern Illinois University at Carbondale (SIUC) Student Health Program Wellness Center (Cohen, 1980). All 97 subjects were university students. Sixty-two percent were women and 38% were men. The mean age of participants was 22.8 years.

A single workshop was selected for each of nine staff members from 19 workshops presented during the spring of 1988. Topics included nutrition (2 staff members), stress management (2), self-care (1), sex (1), exercise (1), test taking (1), and drug use (1). Between 16 and 42 people participated in each workshop. Within workshops, 8 to 12 participants were selected for the study. Restrictions were: (a) that participants have answered at least two of the three open-ended questions, and (b) that at least one subject come from each study condition in each workshop. The restrictions were meant to yield a manageable number of answers while providing reliable (multiple) measurements on individual participants. With these restrictions, protocols were selected randomly.

Of a total of 225 participants, 175 were eligible for the study. Analyses did not reveal differences in age or gender between those selected and those not selected. However, selected participants were slightly more satisfied with the workshops than were others [F(1,223) = 3.74, p < .06, r = .13]. Mean satisfaction was 1.67 for selected participants and 1.86 for those not selected, on a scale from 1 (very satisfied) to 7 (very dissatisfied).

Multiple versions of the consumer questionnaire were constructed, varying the wording of three open-ended questions. Closed-ended questions were the same on all versions.

Closed-Ended Questions. Workshop participants first completed 10 closed-ended, machine-scored, valuative questions. Responses were averaged for an overall Satisfaction score. Homogeneity reliability (coefficient alpha) for this score was .87. Additional questions asked for participants' gender and age.
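The homogeneity reliability reported for the Satisfaction score is a coefficient alpha. As a rough illustration of how such a value is obtained, the sketch below computes alpha from a respondents-by-items table. The function name `coefficient_alpha` and the small `ratings` table are hypothetical and are not data from the study.

```python
from statistics import pvariance


def coefficient_alpha(scores):
    """Cronbach's coefficient alpha for a respondents-by-items score table.

    scores: list of rows, one per respondent, each a list of k item scores.
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each item (column) across respondents.
    item_vars = [pvariance([row[j] for row in scores]) for j in range(k)]
    # Variance of the summed scale score across respondents.
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical ratings: 5 respondents x 4 items on a 1-7 scale (not study data).
ratings = [
    [1, 2, 1, 2],
    [2, 2, 3, 2],
    [1, 1, 1, 1],
    [3, 4, 3, 3],
    [2, 3, 2, 2],
]
alpha = coefficient_alpha(ratings)
print(f"coefficient alpha = {alpha:.2f}")
```

Alpha rises as items covary more strongly relative to their individual variances, which is why it is read as an index of how homogeneous a set of items is.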

Open-Ended Questions. Three open-ended questions were asked of all workshop participants. A General question asked for participants' likes and dislikes about the workshop. A Topic question asked for topics to be added and dropped from those covered by the workshop. The third open-ended question asked for reactions to the workshop Presenter. The order of open-ended questions was the same for all participants. Participants generated a total of 271 written answers to the 3 open-ended questions.

Question Format. The first hypothesis predicted that explicitness of response directions influences the usefulness of answers. Explicitness was manipulated by Format: (a) the Free format asked for respondents' feelings about the workshop, the topics covered, and the presenter; (b) the Explicit format asked for a specific response to each part of the open-ended questions. According to Patton (1990), use of words like "feelings" reflects true open-ended questions. The short version of each question is presented in Table 1.

Question Length. The second hypothesis predicted that length of the open-ended question influences the usefulness of answers. Length was manipulated by combining the wording given in Table 1 with an additional sentence giving a rationale for the open-ended question. Two forms of the Long sentences were used, supposedly varying in strength of the reason given. For example, for the Topic question one rationale sentence stated: "People who attend this workshop have a wide range of interests, needs, and backgrounds." The other stated: "Some topics covered in this workshop are interesting to most people, some to none, and some topics are not covered." Analyses revealed no differences between the two Long question forms, nor interactions between this factor and any study variable. Analyses of differences between Long question forms are not presented. Mean length of the Short questions was 14.5 words, and mean length of the Long questions was 28.8 words.

TABLE 1
EXPLICIT AND FREE QUESTION FORMATS FOR OPEN-ENDED QUESTIONS

Format     Question Type   Open-Ended Question Wording
Free       General         Please give us your feelings about what you liked and did not like about the workshop.
Free       Topic           Please give us your feelings about what topics could be added or dropped from those covered in the workshop.
Free       Presenter       Please give us your feelings about the presenter.
Explicit   General         List one thing you liked and one thing you did not like about this workshop.
Explicit   Topic           List one topic that should be added to those covered by this workshop and one topic that could be dropped.
Explicit   Presenter       List one positive point and one negative point about the presenter.

Versions of the consumer questionnaire were randomly ordered and distributed by presenters to workshop participants. Completion of the questionnaire was voluntary and was not monitored by the presenter. Experience indicates that between 70% and 100% of those remaining for most of the workshop complete some part of the consumer questionnaire. Participants left the questionnaire at their seats when they left the workshop. Presenters later collected the questionnaires and turned them in to the study's authors for analysis.

For the study sample, the verbatim text of answers to each open-ended question was typed on an index card. Information codes identifying the question, the Format x Length conditions, the workshop, and a subject identification number were placed on the reverse side of the card. At the time of transcribing, the number of words in the answer was counted. Answers averaged 10.4 words in length.

Usefulness Ratings. Ratings of the usefulness of the open-ended answers for workshop planning were made by four expert workshop presenters from the Wellness Center. All raters were full-time professional health educators who had been presenting workshops for the Wellness Center for at least four years. All had responsibility for planning and development of health-related educational workshops. Index cards were grouped by question and presented in a different random order for each rater. Raters placed answers in one of four categories of "usefulness for planning and development of a workshop": most useful (4), fairly useful (3), somewhat useful (2), and least useful (1). Ratings of each open-ended question were made at a separate sitting, and question order was counterbalanced.

The rating procedure was designed to keep raters blind to the Format x Length conditions, subject characteristics, the ratings of the other raters, and the particular workshop. The complexity of the procedure and the number of versions of the consumer questionnaire contributed to this effort. While individual answers inevitably included clues to the workshop, discussion with the raters indicated that they were blind to the study hypotheses.

Interrater reliability was .74, .88, and .84 for the General, the Topic, and the Presenter questions, respectively. Usefulness ratings were averaged over the four expert judges for each question for each respondent. The mean Usefulness score for all answers was 2.56, about the midpoint of the rating scale.

Answer Characteristics. Ratings of the number of complimentary comments and the number of critical comments expressed in each answer were made by the study's authors, using the same general procedures as used for the Usefulness ratings. The three raters read each answer and rated the number of complimentary comments and the number of critical comments. A complimentary comment indicated that the participant liked something, was happy about something, or wanted something to stay the same. A critical comment indicated that the participant disliked something or wanted some aspect of the workshop changed. Raters were blind to the Format x Length conditions, subject characteristics, the ratings of the other raters, and workshop. Interrater reliability ranged from .95 to .96 for the number of complimentary comments and from .95 to .98 for the number of critical comments. The number of complimentary comments and the number of critical comments were averaged over the three judges for each question for each respondent. Answers averaged 1.01 complimentary comments and .53 critical comments.

The dependent variables of Usefulness, number of words, number of complimentary comments, and number of critical comments were analyzed by the SPSSX MANOVA procedure. Workshop (N = 9) was a blocking factor with subject nested within workshop, Format (2) and Length (2) were between-subject factors, and question (General, Topic, Presenter) was a within-subject factor. The two Long question forms were collapsed. Separate analyses were conducted for each of the dependent variables. While main effects for question were noted on all dependent variables, question did not interact with Workshop, Format, or Length. In addition, no interaction was found between Format and Length. Because no interaction term was found to be significant, analyses of interactions are not reported.

Because of the multiple significant effects, process analysis was conducted to model the relationships among the independent and dependent variables as suggested by Judd and Kenny (1981). Format and Length were considered input variables. Usefulness was the output or criterion variable. The number of words, the number of complimentary comments, and the number of critical comments were investigated as mediating or process variables. Since question differences did not interact with the input variables, analysis collapsed over questions. Analyses by question presented essentially the same pattern as was observed overall.

Analyses of variance were computed for Usefulness and for each answer characteristic: number of words, number of complimentary comments, and number of critical comments. Since interactions were not observed among the study variables, only main effects are presented.

Question Length

Table 2 presents characteristics of answers to all open-ended questions for the Length and the Format conditions. Longer questions did not lead to more useful answers, as had been hypothesized. Longer questions did lead to longer answers [F(1,45) = 5.53, p < .05, r = .33]. Long questions elicited answers averaging 11.74 words. Short questions elicited answers averaging 8.05 words. Reliable differences were not observed for the number of complimentary comments and the number of critical comments.

TABLE 2
MEAN ANSWER USEFULNESS AND CHARACTERISTICS FOR QUESTION LENGTH AND FORMAT, ALL QUESTIONS

                                  Answer Rating
Condition      N    Usefulness   # Words   # Complimentary   # Critical
All           97       2.56       10.40         1.01            0.53
Length
  Short       33       2.56        8.05*        0.96            0.48
  Long        64       2.57       11.74*        1.04            0.58
Format
  Free        48       2.38*      11.31         1.10*           0.42*
  Explicit    49       2.78*       9.41         0.91*           0.65*

* Paired means differ at p < .05 (see text).

Question Format

Answers to Explicit Format questions were judged to be more useful than answers to Free Format questions [F(1,45) = 4.10, p < .05, r = .29]. Mean Usefulness ratings were 2.78 and 2.38 for the Explicit and Free Formats, respectively. The hypothesized effect of Format on Usefulness was observed.

Free and Explicit formats also differed on the number of comments elicited. Free Format questions elicited significantly more complimentary comments (M = 1.10) than Explicit Format questions [M = .91, F(1,45) = 5.75, p < .05, r = .34]. Explicit Format questions elicited significantly more critical comments (M = .65) than Free Format questions [M = .42, F(1,45) = 4.15, p < .05, r = .29]. No differences were observed in the number of words per answer.

[Figure 1: path diagram linking the input variables (Length, Format) through the mediating variables (# Words, # Complimentary Comments, # Critical Comments) to the output variable (Usefulness).]

Figure 1. Significant standardized regression coefficients for the Length and Format input variables, answer characteristic mediating variables, and Usefulness output variable. Analysis collapsed over open-ended questions.
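The text does not say how the effect sizes r that accompany the F tests were derived. One common conversion for an F test with a single numerator degree of freedom is r = sqrt(F / (F + df_error)), and the sketch below applies it to the F values reported in this section. The choice of formula and the helper name `r_from_f` are assumptions made here for illustration, not something stated by the authors.

```python
import math


def r_from_f(f_value, df_error):
    """Effect size r for an F test with one numerator degree of freedom."""
    return math.sqrt(f_value / (f_value + df_error))


# F statistics and error degrees of freedom as reported in the text.
reported = {
    "Length -> number of words": (5.53, 45),
    "Format -> Usefulness": (4.10, 45),
    "Format -> complimentary comments": (5.75, 45),
    "Format -> critical comments": (4.15, 45),
}

for label, (f_value, df_error) in reported.items():
    print(f"{label}: r = {r_from_f(f_value, df_error):.2f}")
```

Under this assumption the computed values come out close to the r values printed alongside each F test, which suggests the reported effect sizes are of this kind.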

Because significant effects were found for Usefulness ratings as well as for the answer characteristics of number of words, number of complimentary comments, and number of critical comments, path analytic process analysis was conducted to examine whether answer characteristics mediated the impact of the input variables of Format and Length on the output variable of Usefulness (Judd & Kenny, 1981). Usefulness of responses was regressed on answer characteristics, the input variables, and the subject variables of age, gender, and Satisfaction, as measured by the closed-ended satisfaction questions. Only the number of complimentary comments and the number of critical comments predicted the Usefulness of answers. Although both types of comments were positively related to usefulness, critical comments were much more strongly related to usefulness than complimentary comments. Significant standardized regression coefficients for mediating and output variables are shown in the right half of Figure 1.

Next, number of words, number of complimentary comments, and number of critical comments were regressed separately on the input variables and the subject variables of age, gender, and Satisfaction. Question Length and Satisfaction predicted the number of words per answer. Less satisfied respondents and those presented with Long questions wrote longer answers. Only question Format predicted the number of comments. The positive relationship between Format and the number of critical comments reflects the finding that the Explicit Format elicited more critical comments. The negative relationship between Format and the number of complimentary comments reflects the finding that the Free Format elicited more complimentary comments. Significant standardized regression coefficients for the input and mediating variables are shown on the left side of Figure 1.

The results of the process analysis support the finding that question Format but not Length influences answer Usefulness. The absence of a reliable relationship between Format and Usefulness when the number of critical and complimentary comments were included in the regression equation suggests a mediating role for the number of comments. Format is related to Usefulness, but indirectly. The size of the regression coefficients for critical compared to complimentary comments suggests that the effect of Format on Usefulness was primarily due to the ability of the Explicit Format to elicit critical comments from workshop participants.
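The regression logic behind this kind of process analysis can be illustrated with a small simulation. The sketch below is not the authors' analysis and uses no study data: it generates synthetic scores in which an input variable affects an outcome only through a mediator, then runs the three regressions typically used to check for mediation. The variable names, sample size, and effect sizes are all assumptions chosen for illustration, and the hand-rolled `ols` helper stands in for a statistics package.

```python
import random


def ols(y, *predictors):
    """Ordinary least squares with an intercept.

    Returns [intercept, b1, b2, ...] by solving the normal equations
    (X'X) b = X'y with Gaussian elimination. Adequate for a few predictors.
    """
    rows = [[1.0, *values] for values in zip(*predictors)]
    p = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(p)]
    # Forward elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda k: abs(a[k][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, p):
            factor = a[r][col] / a[col][col]
            for c in range(col, p + 1):
                a[r][c] -= factor * a[col][c]
    # Back substitution.
    coefs = [0.0] * p
    for i in reversed(range(p)):
        known = sum(a[i][j] * coefs[j] for j in range(i + 1, p))
        coefs[i] = (a[i][p] - known) / a[i][i]
    return coefs


# Synthetic data (hypothetical): format affects usefulness only via critical comments.
rng = random.Random(0)
n = 4000
explicit = [i % 2 for i in range(n)]                        # 0 = Free, 1 = Explicit
critical = [0.5 * x + rng.gauss(0, 1) for x in explicit]    # mediator
useful = [0.8 * m + rng.gauss(0, 1) for m in critical]      # outcome, no direct path

total = ols(useful, explicit)[1]                      # step 1: outcome on input
a_path = ols(critical, explicit)[1]                   # step 2: mediator on input
_, direct, b_path = ols(useful, explicit, critical)   # step 3: outcome on both

print(f"total effect of format:   {total:.3f}")
print(f"format -> critical (a):   {a_path:.3f}")
print(f"critical -> useful (b):   {b_path:.3f}")
print(f"direct effect of format:  {direct:.3f}")
print(f"indirect effect (a * b):  {a_path * b_path:.3f}")
```

In this setup the coefficient for the input variable shrinks toward zero once the mediator enters the equation, while the product of the two path coefficients accounts for the total effect. That is the same pattern the text describes for Format, critical comments, and Usefulness.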

The wording of open-ended questions can affect the usefulness of answers for program planning, As suggested by Schuman and Presser (1981), the utihty of answers to open-ended questions was increased by making explicit the format appropriate responses. should take, Questions that asked for a specific number of client (%~e like and one dislike”) produeed more useful an-

swers than questions that asked for “fedings about what is liked and dishked,” Longer questions led to longer, but not more useful, answers. A similar resuit was found for the closedended measure of satisfaction. Satisfaction with programming i~~ue~c~d the length but not the ~sef~~~~ss of answers. Less satisfied respondents wrote longer an-

tion,As iathe presentstudy, the ~~rn~~~~~~u~of

swers~In this study, longer q~~s~io~~ provided a rationale for the open-ended question. Another question length manipulation, perhaps expanding on the question itself, may find an effect on answer usefufulness. While positively related to the number of both complimentary and critical comments, usefufnc;ss of answers was tied mrsst closely to the number of crlticaI comments. PKSCS analysis indicated that the ~~~~~i~ene~ of the expficir: ~~~st~on format was due to its ~~~~~t~to elicit su~~s~~~~s for workshop and ~r~s~~t~r imgrovement from atherwise very saMi& r~s~o~de~ts. F~us~~g on reasons for di~atisf~~o~ was one of the

closedand open-ended consumer reaction questions can avoid these problems, Closed-ended questions gave evidence that consumers were happy with the workshops, and open-ended questions gave program planners information useful far psogrtim improvement.

~~~~~~~~5~~

Subjects were colXegestudents who are d~s~~~guished fram many cansumers by high verbal ability, Results may be gcxresaXizedbeyond this popuIat.ion. Za a review of national probability survey of adults, Glrrr (1988) found that nmst respondents answered opxr-ended questions and that interest in the topic was much more important than ~~~at~~~ level in predicting whether an open-ended re~pcmse WC&~ be made. Since ~~~~~~ ~~tici~~t~ zze o&33 interested in comrnen~~~ on services they have recei%Td recent&% co~s~rn~~ ~~~~~~ stndies may be a partidady fertik as-es Ear use of odes-ended qu~s~i~~s.

made

Z.IY Larsen,

utility

CYTclient

et at.

ff979)

for

increasing

reaction studies. The results of this study support that suggestion and are consistent with the basic research findings that negative iafarmatian influences judgments more than positive information (e.g., Anderson, 1981), Eliciting examples of ccmsumer diss~tisf~et~~~ has disadvantages. Where e~asmner reactions ale used to evaluate a program rather than to improve it, ~~~~~ctio~of criticai program ~~~o~rnatio~ may br& ~drn~~~st~t~v~dis~I~~~ fpasavaCzL Carey, I%$). ~~e~~~tat~on of cons~mers~ critic4 comments may give the fafse impression of c~~surne~ dissatisfacthe