Journal of Pragmatics 42 (2010) 2430–2448
The effect of apologetic error messages and mood states on computer users’ self-appraisal of performance
Mahir Akgun a,*, Kursat Cagiltay a, Deniz Zeyrek b
a Computer Education and Instructional Technology Department, Middle East Technical University, 06351, Ankara, Turkey
b Department of Foreign Language Education & Cognitive Science, Middle East Technical University, 06351, Ankara, Turkey
Article history: Received 31 August 2009
Abstract
This study, in which 310 university students participated, was designed to investigate whether computer interfaces that offer human-like apologetic error messages influence users’ self-appraisals of performance in the computerized environment. The study consists of three phases. In the first phase, using the CCSARP (cross-cultural study of speech act realization patterns) coding manual, apology strategy sequences were elicited from Turkish participants. Two of these apology strategy sequences were selected for the second phase, which is a test including experimental and control groups. The experimental groups were presented with the two apology sequences, and the control group was given a plain computer message. The second phase investigated whether any of these three messages were perceived as apologies. The results indicate that the two apology messages were perceived as apologies, but the plain computer message was not perceived as one. The third phase investigated the relationship between the users’ moods and their self-appraisals of performance after the transmission of the apologetic error messages. The findings show that the influence of apology messages on the users’ self-appraisals of performance depends on the participants’ mood state and the content of the apology messages. © 2010 Published by Elsevier B.V.
Keywords: Self-appraisal of performance; Apologetic error message; Mood state; Human-like computer interface
1. Introduction

One of the major goals in human–computer interaction (HCI) is the employment of designs that allow more human-like communication (Burgoon et al., 2000; Ritter and Young, 2001). Suchman (1987) mentioned the term “sociability of computers” and advised that properties of human–human interaction (HHI) (e.g., dialogue, conversation, etc.) should be considered when describing what goes on between people and machines. Lisetti and Schiano (2000) more recently noted a conspicuous paradigm shift in HCI from the design-centered approach, i.e., adapting people to computers, to the user-centered approach, i.e., adapting computers to people.

In the literature, there has been little discussion concerning which users’ needs should be considered in the development of more human-centered interfaces that interact with users when their intended tasks cannot be executed because of the incapacity of the product. One example of this situation is a computer’s presentation of an error message when a problem occurs in the interface. Most of the error messages commonly utilized in computer interfaces are short and plainly technical; these messages straightforwardly reveal the nature of computer-centered designs to the users when they encounter a problem during their interactions with the interface (Tzeng, 2004). With regard to error messages, an alternative to the
* Corresponding author. 0378-2166/$ – see front matter © 2010 Published by Elsevier B.V. doi:10.1016/j.pragma.2009.12.011
computer-centered interface is the human-centered interface, which presents error messages that include emotional expressions, such as apologies (Neilsen, 1998; Tzeng, 2004). In HHI, apologies are generally used to express regret (Leech, 1983; Schlenker and Darby, 1981) or to alleviate individuals’ anger caused by their disapproval of others’ actions. In other words, apologies mitigate frustration and anger when attempted interactions fail. Similarly, Neilsen (1998) argues that error messages responding to the computer user’s action should include a simple apologetic statement when the reason for the error is the inability of the computer interface to perform the intended task.

Tzeng (2006) conducted a study investigating users’ perceptions of online systems containing three different error messages, each of which includes different politeness strategies. In the study, users’ politeness orientations were first elicited, and participants were then asked to interact with websites containing pre-determined problems. When users encountered problems, the system provided certain error messages representing one positive politeness strategy (i.e., a joke), one negative politeness strategy (i.e., a simple apology), and a mechanical message for the error (i.e., the page is temporarily unavailable). The findings of the study showed that users who deal with social events with polite expressions preferred to receive apologetic messages significantly more than mechanical or joke messages, and they preferred apologetic messages significantly more than those messages that are less oriented to polite expressions.

Tzeng (2004) examined whether apologetic feedback affects users’ performance perception in the computerized environment. A computer-based word-guessing game was designed for the study in which participants were expected to guess the correct word after the computer provided a key word.
Tzeng reported that the subjects in apologetic feedback groups did not perceive their performance or their ability to play the game more positively than those in non-apologetic groups.

The reason for the non-significant results in Tzeng’s (2004) study of the effect of polite expressions (i.e., apologies) in the computerized environment might be the neglect of certain factors influencing the need for polite expressions. Mood, for example, can be such a factor. Forgas (1999) indicated the importance of mood in the level of politeness people use. That study found that, when producing requests, individuals experiencing negative affect produced greater politeness than those experiencing positive affect. In addition, a number of different theories have explained the relation between mood and cognition (e.g., Bower, 1981; Forgas, 1995). The findings of the studies examining this relation indicate that mood can influence judgments, decisions and evaluations in at least two ways.

First, mood can influence what individuals think (the content of cognition). In his Associative Network Theory, Bower (1981) claims that mood state influences cognitive processing related to the retrieval of information from long-term memory. According to this theory, memory consists of networks of associated concepts and each cognitive network includes emotion-related memories and cognitions. Within each cognitive network, certain emotion nodes, which are defined as the groups of associated concepts, stand for specific emotions such as sadness, anger and fear. The activation of a particular node triggers the related network of connections to retrieve emotion-related memories and cognitions. Bower (1981) found that individuals recalled most of the experiences that were affectively congruent with the mood they were in during recall.
Stated differently, an individual in a positive mood retrieves positive thoughts and associations more easily from memory, whereas recalling negative thoughts and associations is more effortless for an individual in a negative mood. Beyond associative network theory, Isen (1985) suggested that most individuals retrieve positive thoughts and associations more easily than negative thoughts and associations because positive materials are better integrated and more extensive.

Second, mood can have an impact on how individuals think (the process of cognition). Fielder (2000) indicates some differences between positive and negative moods in terms of the process of thinking. Internally driven, top-down, flexible and generative processing styles are more related to positive mood, while externally oriented, bottom-up and systematic thinking styles are more related to negative mood. Forgas (1994, 1995) developed the Affect Infusion Model in order to contribute to the discussion concerning the process of cognition. According to the model, four alternative processing strategies might be used when making judgments. Two of them, (1) the direct access of a preexisting evaluation and (2) motivated processing toward a preexisting goal, involve little generative and constructive processing; during these processing strategies, mood-incongruent processing may occur. On the other hand, the Affect Infusion Model suggests that mood-congruent processing occurs during (3) heuristic processing and (4) substantive processing. These processing strategies involve constructive processing of information. Theories of mood-congruent processing support the idea that positive moods lead to more favorable judgments, whereas negative moods lead to less favorable judgments.
Previous studies aiming to understand the effect of human-like messages on computer users generally concentrated on actual performance, performance perception, self-appraisal and interaction time. The activities related to the variables measured in previous studies require cognitive effort, and so the factors (i.e., mood state) influencing cognitive processes also affect the level of success in these kinds of activities. Therefore, mood state, as a factor influencing both cognitive processes and preferences for polite expressions, should be considered when the effect of polite expressions (i.e., apologies) on cognitive variables (i.e., self-appraisal) is the focus of investigation.

Consistent with the purpose of creating human-like computer interfaces in HCI, the properties of HHI have been expected to be valid for computer interfaces. As stated before, one of the crucial aspects of HHI is the use of apology to maintain the relationship between the interactants by alleviating the frustration and anger of the frustrated interactant whenever the interaction fails for any reason. Therefore, HCI researchers have widely accepted the practice of using apologetic error messages in the computerized environment when users encounter a problem caused by the computer’s inability to carry out a task. Based on the models explaining the effect of mood state on cognitive processes (e.g., Bower, 1981; Forgas, 1995) and the findings indicating the role of mood state in preferences for polite expressions (e.g., Forgas, 1999), it can be proposed that
mood state should be taken into account when the effect of polite expressions on cognitive variables is investigated. In particular, the use of apologetic error messages in relation to users’ mood states in the computerized environment may influence users’ self-appraisals of performance by mitigating the users’ frustration and anger when their intended tasks cannot be executed. In order to achieve a better understanding of this, the present study investigates whether apologetic error messages and the users’ mood states significantly affect computer users’ self-appraisals of performance.

2. Significance of the study

Most studies that have focused on computer interfaces that present human-like messages investigated the effect of human-like messages on computer users without considering other important factors, such as the users’ mood states and the way the messages are formulated. There is a significant need for empirical research on the effects of human-like messages on computer users in relation to their mood states. The findings of this study will hopefully reveal the significance of users’ mood states and help to determine which forms of human-like apologetic error messages are most effective in computer interfaces.

The main goal of research concerning human-like interfaces is to develop interactive computer systems which have certain characteristics that resemble those of human beings. According to Nass et al. (1994), the social rules and dynamics which guide HHI can be applied equally well to HCI. This idea has become one of the main areas of focus in research concerning human-like interfaces. However, only a few studies have concentrated on the dynamics within HCI. This study has been designed to fill this gap with the help of the field of Pragmatics, which is defined as a science focusing on the language-using human (Mey, 1993).
The researchers conducted a pragmatic analysis to determine which apology strategies are preferred in the computerized environment and how these selected apology strategy sequences interact with the users’ mood states. The results yield several significant findings which the authors hope will be an important contribution to future research in the field of the Pragmatics of Human–Computer Interaction.

3. Apologies

Olshtain and Cohen (1983) proposed an apology taxonomy consisting of five main strategies and many sub-strategies (not mentioned in this paper)¹:

A. Illocutionary force indicating devices (IFIDs). E.g., I apologize, I am sorry.
B. An explanation or account (Exp). E.g., The traffic was terrible, the bus was late.
C. Take on responsibility (ToR). E.g., It is my mistake, I didn’t mean to.
D. Offer of repair (OoR). E.g., I will pay for the damage.
E. Promise of forbearance (PoF). E.g., This won’t happen again.
While two of these apology strategies, namely the IFID and the ToR, are general, the other three strategies are situation specific (Olshtain, 1989). It has been proposed that the two general strategies can be used in all situations and in all languages (Olshtain, 1989). IFIDs are “formulaic, routinized expressions in which the speaker’s apology is made explicit” (Blum-Kulka et al., 1989, p. 290). The ToR strategy refers to accepting responsibility for an act in order to placate the hearer. The use of the remaining three strategies (the Exp, the OoR, and the PoF) depends on the context. When the speaker uses the Exp strategy, s/he describes the situation which caused him or her to commit the offense. The OoR may be chosen by the speaker for situations in which the inconvenience influencing the hearer can be repaired. Finally, the PoF is employed by the speaker in order to assure the hearer that the action which created the need to apologize will not happen in the future.

Olshtain (1989) presented seven situations in a discourse completion questionnaire, and then analyzed the responses given by the participants in Hebrew, Australian English and Canadian French. The findings of the study showed that the Hebrew speakers preferred to use the IFID and the ToR in all situations, although the level of preference varied considerably across situations. Olshtain also found that in the three languages studied, the respondents employed IFIDs between 60% and 75% of the time and the ToR strategy between 65% and 70% of the time. These findings showed that these two strategies are used frequently in all three languages. The other three strategies are used much less often than the IFID and the ToR.
¹ Throughout the paper, the abbreviations shown in parentheses will be used to refer to these apology strategies.
Vollmer and Olshtain (1989) used the discourse completion test to investigate apology preferences in German. In their study, eight different apology situations were developed and participants’ choices of apology strategies across situations were investigated. Their findings showed that the participants employed the IFID and the ToR in all situations: the use of the IFID across situations ranged from 37.5% to 84%, and the use of the ToR across situations was between 30% and 94%.

Hatipoğlu (2003) investigated the apology strategies utilized by British and Turkish people. The findings of that study revealed that the IFID and the ToR together accounted for half of the collected apologies in both Turkish and British culture (33% + 17% = 50% for British; 24% + 26% = 50% for Turkish). The percentages of the strategies for each culture were as follows: the IFID = 33% for British and 24% for Turkish, the Exp = 10% for British and 9% for Turkish, the ToR = 17% for British and 26% for Turkish, the OoR = 24% for British and 22% for Turkish, and the PoF = 1% for British and 1% for Turkish.

In another study, Hatipoğlu (2004) investigated the form and type of apologies used in e-mail messages. She found that e-mail apologies have some distinctive properties that distinguish them from those utilized in spoken and written languages. For example, in the one-to-one dialogue group including personal e-mails intended for only one receiver, 11.1% of the total apologies were the apologies employed for ‘an offer of apology’ (i.e., I apologize) and 85.7% of them were the apologies utilized for ‘an expression of regret’ (i.e., I am sorry). Since ‘an offer of apology’ is generally used for formal expressions and ‘an expression of regret’ is mostly employed for less formal expressions, this finding could be taken as support for the idea that the language of e-mails is less formal than that of traditional writing (Hatipoğlu, 2004).
Furthermore, in e-mail communications, apologies are used if there is a possibility of multiple posting (e.g., “apologies for any duplicate e-mails/multiple posting”, “apologies if N has already sent this” and “(with) apologies for (any) cross-posting”).

As will be explained in section 5.2.4, the taxonomy proposed by Olshtain and Cohen (1983) was used to analyze apology strategy sequences elicited from computer users. In the following section, a definition of “human-like computer interface” and the results of studies which investigated the effectiveness of this kind of interface on users are presented.

4. Human-like interfaces

A human-like computer interface is a kind of interactive computer system which has certain characteristics that resemble characteristics found in human beings. Based on the idea that human emotional expressions are important in interpersonal communication, interactive computer systems have been built to respond to users who experience frustration and other negative emotions with emotionally supportive response interactions. Many studies have shown that people can apply the social rules which regulate HHI to HCI. As an example of these studies, Fogg and Nass (1997) reported that users who received flattery from a computer viewed the interaction as more enjoyable than those who received plain computer feedback. This study also showed that users who received flattery from the computer displayed a much greater interest in continuing to work with the computer than those who received plain computer feedback. Klein et al. (1999) reported the same results as Fogg and Nass.

One of the important concepts that shape human-like interfaces is the media equation (i.e., “media equals real life”). Media equation research concentrates on social rules and norms, such as politeness, reciprocity, flattery, assignment of roles, and criticism (Reeves and Nass, 1996).
Nass et al. (1994) reported that people’s interactions with computers are social, and that this is not the result of a conscious belief that computers are human-like. In other words, users treat computers as if they are human during their interactions with the computers, even though they know that the computers have no human motivations, such as feelings. Nass and Moon (2000) reported that people tend to rely on social categories and mindlessly apply social rules to their computer interactions. Resnik and Lammers (1985) suggested that high self-esteem subjects who received human-like error messages (e.g., ‘I don’t understand these letters’) performed significantly better on computerized tasks than high self-esteem subjects who received less personal, computer-like feedback. Nass et al. (1994) stated that people tend to treat computers as social actors; they call this the CASA paradigm: “Computers Are Social Actors.” CASA studies demonstrate that the social rules and dynamics guiding human–human interaction are applied equally well to HCI. Many studies have confirmed people’s tendency to view computers as social actors. Picard (2000) demonstrated that participants who interacted with an emotion-support agent played a computer game significantly longer than those who interacted with similar agents that ignored their emotions. Johnson et al. (2004) indicated that highly computer-literate participants who received feedback that included flattery from the computer tended to treat the computer in a manner resembling people’s reactions to flattery from other people.

User frustration with computer-provided information and technology is a pervasive problem caused by factors such as the crashing of computers and poor user interfaces (Lazar et al., 2005). Lazar et al. (2005) investigated whether there are commonalities between student and workplace user frustration during interactions with computers.
The study reported that there are three important factors which influence frustration levels: time lost, time needed to repair, and the importance of the task. Another important finding of the study was that when the participants were asked to write down the specific causes of their frustration, the most cited cause was the way computer error messages were presented. Klein et al. (2002) reported that users in their study who interacted with an affect-support agent (this agent helps users to recover from frustration by using active listening, empathy, and sympathy) engaged with the system significantly longer than users in their two control studies. Similarly, Hone (2006) reported that text-based agents can be effective in reducing user frustration. In addition to this finding, Hone examined the effects of embodied agents. The results of this investigation revealed that embodied agents that provide emotional feedback can be more effective than text-based agents for reducing user frustration caused by the computer.
In an empirical study similar to the present one, De Laere et al. (1998) compared human-like versus machine-like interaction styles of computer interfaces. They did not observe a significant difference between the different styles in terms of users’ self-appraisals. The findings of the study also demonstrated that negative feedback affects the self-perceptions of the participants differently than positive feedback. Tzeng (2004) examined whether apologetic feedback affects users’ performance perception in the computerized environment. This study suggested that users may not expect computers to be polite, but apologetic statements made the subjects feel better about their interactions with the program.

The findings presented in sections 3 and 4 support the idea that apologies preferred in HHI might be different from those in the mediums of HCI. Therefore, results in HCI research examining computer apologies must be compared to results in HHI research in order to develop more effective human-like computer interfaces. We shall contextualize our findings against the background of these studies below in section 6.

5. Method

5.1. Overview of the present study

This study consists of three phases (see Fig. 1). The first phase of this study was conducted to identify what kinds of apology strategy sequences (APSSs) are preferred by users during their interactions with a computer. Of the APSSs elicited in the first phase, two were selected for use in the second phase (we named these APSSs Apology 1 and Apology 2). The purpose of the second phase was to investigate whether these two messages are actually perceived as apologies; if so, they would be utilized in the third phase. The third phase was conducted in accordance with the following question: Do the two apologetic error messages influence users’ self-appraisals of performance in the computerized environment?

5.2. Phase 1

5.2.1. Participants

Eighty-six university students from METU (43 male and 43 female, aged 20–24 years) participated in the first phase of the study. All of the participants had taken the course “Educational Technology and Material Development,” on the basis of which online instructional materials were designed (see section 5.2.2). Among the participants, 16 were from Foreign Language Education, 15 were from Elementary Science Education, 18 were from Elementary Mathematics Education, 17 were from Early Childhood Education, and 20 were from Computer Education and Instructional Technology.

5.2.2. Materials

5.2.2.1. The online instructional materials. In the procedural part, a web environment was used. In order to simulate a human–computer interaction environment in which a problem occurs because of the computer’s inability to carry out a task, a Macromedia Flash-based e-learning environment was developed. The environment consisted of two parts: the lesson part (see Fig. 2) and the test part (see Fig. 3). The system first presented the lesson, in which brief information about educational technology and instructional material development was presented. For example, Fig. 2 presents three approaches to learning, namely behaviorism, cognitivism and constructivism, and how they define learning, the learning process, and the role of learning. After the lesson, the system activated a multiple-choice test including seven questions about the topic presented in the lesson. For example, in Fig. 3, a test item about the cognitivist learning process is presented along with the alternatives. The test material incorporated a deliberate design problem, intended to create frustration for the users (see section 5.2.3).
Fig. 1. Three phases of the study.
Fig. 2. The lesson part of the online instructional materials.
Fig. 3. The test part of the online instructional materials.
5.2.2.2. Discourse Completion Test. In the Discourse Completion Test (DCT) used for the present study, the problem which the participants had encountered during their interaction with the computer was retold. The participants were asked to write which kind of apology message they would prefer if they encountered the same problem again in a computerized environment. Our reason for using this method in the study is that simulating the problem occurring in a computerized environment is more realistic than writing a scenario that explains the problem.

5.2.3. Procedure

The interface was designed so that the users would encounter an important problem caused by the computer’s inability to carry out a task. In this study, a realistic problem was created. Since special Turkish characters need special handling in some authoring programs, the garbled display of Turkish characters is a frequently observed problem. For this reason, this case was used in this study. The participants did not know that they would encounter a problem while interacting with the computer interface. They first completed the lesson part of the e-learning materials, and then they started to do the test consisting of seven questions. After the 2nd question, they encountered a problem and received an error message simply saying that there was a problem in the system. The problem had to do with character encoding, i.e., Turkish alphabetical characters (e.g., ç [the letter for the English letter combination ch], ş [the letter for the English letter combination sh]) were not presented properly (see Fig. 4). More specifically, the letter ş appeared in an unintelligible form such as = FE. This problem was not fixed but the system allowed the users to continue their interaction with the computer. At the end of the interaction, the DCT was distributed to the participants.

5.2.4. Coding and frequency analysis

The statements elicited by the DCT were analyzed according to the CCSARP coding manual (Blum-Kulka and Olshtain, 1984). All the sentences were independently analyzed and coded by two individuals (one from the field of HCI, and the other from the field of Pragmatics) and the analyses were compared. In cases where there was no agreement between the coders,
Fig. 4. The test part of the online instructional materials after the problem.
they adjudicated either by choosing one of the codings or by jointly proposing a new one. The following example shows how an apology was analyzed:

“Unfortunately, because of the system, there has been an error. Rest assured that this error is not going to be repeated again. We’re sorry”
Segment                              Strategy
because of the system                ToR
is not going to be repeated again    PoF
We’re sorry                          IFID
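The coding step illustrated above, together with the percentage calculation used in this section (frequency divided by number of participants), can be sketched roughly as follows. This is only an illustrative sketch, not the authors' procedure: the function names `apss` and `apss_proportions` are hypothetical, and the sample in which the remaining 60 of 80 participants state IFID-ToR is invented purely to make the arithmetic concrete.

```python
from collections import Counter

# Strategy labels from the Olshtain and Cohen (1983) taxonomy used in the paper.
STRATEGIES = {"IFID", "Exp", "ToR", "OoR", "PoF"}


def apss(coded_segments):
    """Join the strategy labels of a coded message, in order, into an APSS string."""
    labels = [strategy for _segment, strategy in coded_segments]
    unknown = set(labels) - STRATEGIES
    if unknown:
        raise ValueError(f"unknown strategy label(s): {sorted(unknown)}")
    return "-".join(labels)


def apss_proportions(sequences):
    """Proportion of participants stating each APSS (frequency / number of participants)."""
    counts = Counter(sequences)
    n = len(sequences)
    return {seq: count / n for seq, count in counts.items()}


# The coded example message from this section.
example = [
    ("because of the system", "ToR"),
    ("is not going to be repeated again", "PoF"),
    ("We're sorry", "IFID"),
]
example_apss = apss(example)  # "ToR-PoF-IFID"

# Hypothetical sample (an assumption for illustration only):
# 20 of 80 participants state ToR-PoF-IFID, the rest IFID-ToR.
hypothetical = ["ToR-PoF-IFID"] * 20 + ["IFID-ToR"] * 60
proportions = apss_proportions(hypothetical)  # ToR-PoF-IFID -> 0.25
```

Keeping the coded message as an ordered list of (segment, strategy) pairs preserves the order of strategies, which is what distinguishes one APSS from another.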
The APSS of the apologetic message is therefore ToR-PoF-IFID. At the next step, the frequency of stated preference for each of the APSSs was calculated. To obtain the percentage for each APSS, its frequency was divided by the number of individuals who participated in the first phase of the study. For example, if the ToR-PoF-IFID sequence is preferred by 20 of 80 individuals, then the proportion for this APSS is 20/80 = .25, i.e., 25%.

5.2.5. Results

Table 1 shows that 17 APSSs were elicited from the data obtained in phase 1. The results showed that in general gender had an effect on the preferences for apology messages. For example, the APSS consisting of the IFID and the ToR strategies
Table 1
The APSSs derived from phase 1.

APSSs                          Gender (%)          Total (%)
                               Male     Female
1. IFID–ToR                    16.28    23.26      19.77
2. IFID–ToR–OoR                23.26     9.30      16.28
3. ToR–OoR                     16.28     6.98      11.63
4. IFID                         4.65    13.95       9.30
5. ToR                         13.95     4.65       9.30
6. IFID–ToR–EXP                 6.98     4.65       5.81
7. IFID–OoR                      .00    11.63       5.81
8. OoR                          4.65     4.65       4.65
9. IFID–EXP                     2.33     6.98       4.65
10. ToR–EXP–OoR                 2.33     4.65       3.49
11. IFID–EXP–OoR                2.33     2.33       2.33
12. IFID–ToR–EXP–OoR            2.33      .00       1.16
13. ToR–EXP                     2.33      .00       1.16
14. EXP–OoR                      .00     2.33       1.16
16. IFID–ToR–PoF                2.33      .00       1.16
15. IFID–INTENS (a)              .00     2.33       1.16
17. IFID–ToR–PoF–INTENS          .00     2.33       1.16

(a) INTENS represents an intensified apology, such as “I am very sorry” or “I am terribly sorry.”
Table 2
The apology strategies.

Strategy    Percentage          Total (%)
            Male     Female
IFID        60.47    76.74      68.60
Exp         18.60    20.93      19.77
ToR         86.05    55.81      70.93
OoR         51.16    41.86      46.51
PoF          2.33     2.33       2.33
(no. 1) was preferred by 16.28% of the male participants (Ms), and by 23.26% of the female participants (Fs). This difference, however, was not significant, t(83) = .861, p = .392. The only significant difference found between the male and female participants is for the APSS including the IFID and the OoR strategies (no. 7) (Ms = 0% < Fs = 11.63%, t(41) = 2.354, p < .05).

The percentage value of stated preference for each individual apology strategy was also analyzed (see Table 2). We found that the ToR strategy was the most widely preferred apology strategy (70.93% of the respondents preferred it in their responses). The second most frequently preferred strategy was the IFID (68.60% of the respondents preferred this strategy in their responses). The third most frequently preferred strategy was the OoR, preferred by 46.51% of the respondents. The remaining strategies, namely the Exp and the PoF, were preferred by 19.77% and 2.33% of the participants respectively. In addition, the results indicated that 66.10% of the respondents who preferred the IFID also preferred the ToR strategy. Similarly, 63.93% of the participants who preferred to receive the ToR strategy also preferred the IFID strategy.

In terms of the frequency of preference for individual strategies, we found that the percentage of the males’ selection of the ToR apology strategy was much higher than that of the females (males = 86.05% > females = 55.81%), and that this difference was significant, t(60) = 4.392, p < .01. For the IFID strategy, 60.47% of the males preferred it, whereas 76.74% of the females preferred it in their responses; this difference was not significant, t(81.43) = 1.605, p = .112. The percentage of the OoR apology strategy preferred by the males was 51.16%, while the percentage of the strategy preferred by the females was 41.86%. This difference was not significant either, t(83) = .761, p = .449.
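The strategy-level percentages and the IFID/ToR overlap reported above can be cross-checked against Table 1 with a short sketch. It rests on an assumption that is not stated in the paper: the respondent counts below are reconstructed from the Table 1 percentages, taking 43 male and 43 female respondents (for example, 16.28% of 43 is about 7). The names `TABLE_1_COUNTS`, `count_with`, `pct` and `strategy_row` are hypothetical.

```python
# Respondent counts (male, female) per APSS. ASSUMPTION: reconstructed from
# the Table 1 percentages with 43 men and 43 women; the paper itself
# reports percentages only.
TABLE_1_COUNTS = {
    "IFID-ToR": (7, 10),
    "IFID-ToR-OoR": (10, 4),
    "ToR-OoR": (7, 3),
    "IFID": (2, 6),
    "ToR": (6, 2),
    "IFID-ToR-EXP": (3, 2),
    "IFID-OoR": (0, 5),
    "OoR": (2, 2),
    "IFID-EXP": (1, 3),
    "ToR-EXP-OoR": (1, 2),
    "IFID-EXP-OoR": (1, 1),
    "IFID-ToR-EXP-OoR": (1, 0),
    "ToR-EXP": (1, 0),
    "EXP-OoR": (0, 1),
    "IFID-ToR-PoF": (1, 0),
    "IFID-INTENS": (0, 1),
    "IFID-ToR-PoF-INTENS": (0, 1),
}
N_MALE, N_FEMALE = 43, 43


def count_with(*strategies):
    """Count (male, female) respondents whose APSS contains all given strategies."""
    males = females = 0
    for apss, (m, f) in TABLE_1_COUNTS.items():
        if set(strategies) <= set(apss.split("-")):
            males += m
            females += f
    return males, females


def pct(count, n):
    """Percentage rounded to two decimals, as in the tables."""
    return round(100 * count / n, 2)


def strategy_row(strategy):
    """Male, female and total percentages for one strategy (one Table 2 row)."""
    m, f = count_with(strategy)
    return pct(m, N_MALE), pct(f, N_FEMALE), pct(m + f, N_MALE + N_FEMALE)


table_2 = {s: strategy_row(s) for s in ("IFID", "EXP", "ToR", "OoR", "PoF")}

ifid_total = sum(count_with("IFID"))          # respondents using the IFID
tor_total = sum(count_with("ToR"))            # respondents using the ToR
both_total = sum(count_with("IFID", "ToR"))   # respondents using both
tor_given_ifid = pct(both_total, ifid_total)  # share of IFID users who also chose ToR
ifid_given_tor = pct(both_total, tor_total)   # share of ToR users who also chose IFID
```

Under these reconstructed counts, the sketch reproduces every row of Table 2 (for example, 86.05, 55.81 and 70.93 for the ToR) as well as the 66.10% and 63.93% overlap figures. It does not attempt to reproduce the reported t-tests.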
The results showed that the percentages of the Exp strategy preferred by both sexes in the study were almost identical (males = 18.60% < females = 20.93%). The percentages of the PoF apology strategy preferred by both sexes in the study were identical (males = 2.33% = females = 2.33%).

These findings are consistent with those of Hatipoğlu (2003) in that most of the apologies preferred by participants who were natives of the Turkish culture contain the ToR and the IFID strategies. Moreover, the findings of this study are also consistent with those of Olshtain (1989), and Vollmer and Olshtain (1989) in that the IFID and the ToR were mainly preferred in all situations. The present study also indicated that the PoF was the least preferred strategy by users in the HCI context. Similarly, Hatipoğlu (2003) indicated that it was the least preferred strategy in Turkish culture.

Overall, it seems clear that the IFID and the ToR strategies are mainly preferred both in HHI and HCI. This finding also supports Cohen and Olshtain’s suggestion that the IFID and the ToR can be used in all situations, whereas the other three strategies are situation specific. It can be inferred that the apologies that one expects to get from a computer are similar to the apologies utilized in HHI. Therefore, it can be said that the findings of this study support Nass et al.’s (1994) arguments that people tend to treat computers as social actors, and that the social dynamics for HHI can be applied equally well to HCI. It is therefore possible to say that computer interactions may trigger users’ social schemata that are relevant to situations in which the use of apologies is normally thought to be required.

5.2.6. The rationale to choose two APSSs to be used in the third phase
In order to determine two APSSs for the second phase, the following criteria were considered.

1. According to Olshtain (1989), IFIDs are direct apologies and they can be used in any situation for every language. Therefore, APSSs should be selected among those including the IFID strategy so that participants could easily perceive the messages as apologies.
2. The APSSs to be chosen should be preferred by more than one individual. Stated differently, the percentage values of the APSSs should be more than 1.16.
3. The two APSSs to be chosen should be maximally different in terms of the strategies they include. In other words, the APSSs should not consist of the same apology strategies except the IFID.

When the three criteria were considered together, two candidate pairs of APSSs were identified for use in the second phase. The first pair included IFID–ToR and IFID–Exp–OoR, while the second pair consisted of IFID–ToR–OoR and IFID–Exp.
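The three selection criteria can be read as a simple filter over candidate sequences. The sketch below is only an illustration of that reading: the preference percentages are invented placeholders rather than values from the study, and the function names (`eligible`, `share_only_ifid`, `candidate_pairs`) are hypothetical.

```python
from itertools import combinations

# Placeholder preference percentages: illustrative only, NOT data from the study.
candidates = {
    ("IFID", "ToR"): 20.0,
    ("IFID", "Exp", "OoR"): 5.0,
    ("IFID", "ToR", "OoR"): 10.0,
    ("IFID", "Exp"): 4.0,
    ("ToR", "OoR"): 8.0,     # fails criterion 1: no IFID
    ("IFID", "PoF"): 1.16,   # fails criterion 2: not more than 1.16%
}

MIN_PERCENT = 1.16  # threshold stated in criterion 2


def eligible(sequence, percent):
    """Criteria 1 and 2: contains the IFID and exceeds the 1.16% threshold."""
    return "IFID" in sequence and percent > MIN_PERCENT


def share_only_ifid(a, b):
    """Criterion 3: the two sequences have no strategy in common except the IFID."""
    return set(a) & set(b) == {"IFID"}


def candidate_pairs(table):
    """Return every pair of eligible sequences that satisfies criterion 3."""
    pool = [seq for seq, pct in table.items() if eligible(seq, pct)]
    return [(a, b) for a, b in combinations(pool, 2) if share_only_ifid(a, b)]


pairs = candidate_pairs(candidates)
for first, second in pairs:
    print("–".join(first), "+", "–".join(second))
```

With these placeholder numbers the filter returns three pairs, among them the two pairs named above. The sketch does not attempt to model any further consideration that may have narrowed the choice in the study.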
5.3. Phase 2

5.3.1. Participants
Thirty-two students who did not participate in the first phase (17 male and 15 female, aged from 20 to 24 years) participated in the second phase of the study. All of the participants were chosen from a pool of students who had taken the course “Educational Technology and Material Development.”

5.3.2. Apology perception rating
A rating scale was designed to see whether the participants perceived either of the two messages as apologies. In this test, the two apology messages and one plain message formulated by the researchers were presented to a different group of participants, who were asked to judge the value of the messages in terms of their effectiveness as apologies. Table 3 shows the selected apology messages and the strategies they contain.

5.3.3. Procedure
The procedure of the second phase was basically the same as that of the first phase. Additionally, the rating scale explained in section 5.3.2 was employed in order to investigate whether the messages listed in Table 3 were perceived as apologies or not.

5.3.4. Analysis
To determine whether there is a significant difference between the apology messages and the plain computer message, a repeated-measures ANOVA was conducted. In the analysis, the three messages were treated as three conditions to which the same participants were exposed.

5.3.5. Results
The repeated-measures ANOVA results indicated that there is a significant difference among the messages in terms of apology perception, F(2, 62) = 70.541, p < .05. When we investigated the paired comparisons, the two apology messages (M apology 1 = 4.750 and M apology 2 = 4.125) were found to be significantly different from the plain computer message (M plain = 1.594), but there was no significant difference between the two apology messages (see Table 4). These results indicated that the plain computer message was not perceived as an apology message, whereas the two apology messages were perceived as apologies.
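As a small arithmetic check on the Phase 2 results, the sketch below derives the degrees of freedom expected for a one-way repeated-measures ANOVA with 32 participants and three messages, and recomputes the pairwise mean differences from the three reported means. It assumes sphericity (no correction to the degrees of freedom); the function names are hypothetical, and the code does not reproduce the ANOVA itself.

```python
def repeated_measures_df(n_participants, n_conditions):
    """Degrees of freedom (effect, error) for a one-way repeated-measures ANOVA.

    Assumes sphericity, i.e. no Greenhouse-Geisser or similar correction.
    """
    df_effect = n_conditions - 1
    df_error = (n_conditions - 1) * (n_participants - 1)
    return df_effect, df_error


# Mean apology-perception ratings as reported in the text.
means = {"apology_1": 4.750, "apology_2": 4.125, "plain": 1.594}


def mean_difference(a, b):
    """Difference between two reported condition means, rounded to 3 decimals."""
    return round(means[a] - means[b], 3)


print(repeated_measures_df(32, 3))               # (2, 62)
print(mean_difference("apology_1", "apology_2"))  # 0.625
print(mean_difference("apology_1", "plain"))      # 3.156
print(mean_difference("apology_2", "plain"))      # 2.531
```

The recomputed differences agree with the mean differences listed in Table 4.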
Table 3
APSSs and messages.

| Message group | Apology strategy sequence | Message (translated into English) |
|---|---|---|
| Apology 1 | IFID–ToR–OoR | The problem caused by the system could not be fixed. We apologize for this. Any negative effects of the problem on your performance will be compensated during the evaluation of the test. |
| Apology 2 | IFID–Exp | Turkish characters could not be printed appropriately due to an error. We apologize for this. |
| Control message | (Plain message with no apology in it) | The problem could not be fixed. |

Table 4
Apology message comparisons.

| Comparisons | Mean difference | Std. error | Sig. |
|---|---|---|---|
| Apology 1–Apology 2 | .625 | .276 | .092 |
| Apology 1–plain message | 3.156 | .330 | .000 |
| Apology 2–plain message | 2.531 | .229 | .000 |

5.4. Phase 3
In the third phase, answers to the main research question of the study were sought (i.e., what are the effects of the participants’ mood states and of the apology message type on their self-appraisals of performance?).

5.4.1. Participants
Two hundred and twenty university students were involved in the third phase of the study. Forty of them dropped out (12 were eliminated because of missing data and 28 were excluded because their response times were outside of the confidence interval designated for an acceptable response time). The reason for setting an acceptable response time was to eliminate the participants who responded very quickly or very slowly. Of the remaining 180 students, 105 were female and 75 were male. The participants’ age ranged from 20 to 24 years. All of the participants had taken the course “Educational Technologies and Material Development.” Among the participants, 21 were from Foreign Language Education, 29 were from Elementary Science Education, 36 were from Elementary Mathematics Education, 32 were from Early Childhood Education, and 62 were from Computer Education and Instructional Technology.

5.4.2. Materials

5.4.2.1. Mood questionnaire
Watson and Tellegen (1985) indicated the stability and robustness of positive and negative affect in self-reports. Based on this finding, classifying participants into positive and negative affect groups was chosen as a way to understand the role of mood state in students’ self-appraisals of performance. We chose a questionnaire capable of revealing positive and negative affect. The questionnaire selected was the one used in Efklides and Petkaki’s study (2005), which includes 10 adjectives and comprises two main factors: one for negative affect (sad, melancholic, anxious, pessimistic, and disappointed) and one for positive affect (good, calm, happy, excited, and pleased). In our study, an exploratory factor analysis was conducted to reveal the underlying factors. Our findings were the same as those in Efklides and Petkaki’s study (2005). According to our results, two factors were found, positive and negative affect. Good, calm, happy, excited, and pleased loaded on positive affect. Sad, melancholic, anxious, pessimistic, and disappointed loaded on negative affect. The scoring of the questionnaire was conducted separately for the negative and positive moods. If a participant’s positive mood score was greater than his/her negative mood score, s/he was considered to be in a positive mood. In the reverse situation, s/he was considered to be in a negative mood before the task.

5.4.2.2. The online instructional materials
In phase 3, online instructional materials similar to those used in the first phase were utilized. Recall that these materials consisted of two parts: a lesson part, in which brief information about educational technology and material development was presented, and a test part, which included seven questions. The materials used in the third phase differed in three respects. First, they included the mood state questionnaire, presented before the lesson. Second, there were 14 questions in the test instead of seven. Third, the online environment had a database connection that allowed the researchers to record the collected data in the database. In other words, the participants’ answers to the questions and their responses to the mood questionnaire were recorded. In addition, the amount of time spent on each question by each participant was recorded in order to eliminate those participants whose response times were outside of the acceptable confidence interval.

5.4.2.3. Test questions
Multiple-choice questions were utilized in the test part of this phase because this type of question increases objectivity and decreases the effort required of the researchers during the evaluation of the responses. Seven different pairs of multiple-choice questions (14 questions in total) were prepared by a subject-matter expert. In each pair, the questions had the same educational objective, and they were approximately of the same level of difficulty. Then the questions in each pair were separated and re-paired in a different set of matches in order to form two sets of questions. After the two sets had been prepared, the questions were checked by two other subject-matter experts in terms of difficulty level, intended educational objectives, and correctness. Based on these experts’ suggestions, corrections were made to finalize the questions for use in the study.

5.4.3. Experimental design
There were two experimental groups and one control group. The experimental groups were the Apology 1 group and the Apology 2 group. The participants in these groups received error messages including the apologetic statements shown in Table 3. The participants were randomly assigned to the groups by the system. The outcome of the random assignment is listed in Table 5.

5.4.4. Procedure
In the procedural part of the study, a web environment was used. After the log-on procedure, the questionnaire about mood states was presented. After the participants had filled in the questionnaire, the system presented the short lesson about educational technologies. After completing the lesson, the participants took an exam related to the lesson. The participants were instructed to answer each question by choosing one out of five alternatives, one of which was always correct.

Table 5
Random assignment of the participants.

| Variable | Level | Apology 1 group | Apology 2 group | Control group |
|---|---|---|---|---|
| Gender | Female | 40 | 36 | 29 |
| Gender | Male | 26 | 21 | 28 |
| Mood state | Positive affect | 41 | 34 | 28 |
| Mood state | Negative affect | 25 | 23 | 29 |
| Total | | 66 | 57 | 57 |
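The mood-state split reported in Table 5 rests on the scoring rule described for the mood questionnaire in section 5.4.2.1: a participant counts as being in a positive mood when the positive score exceeds the negative score, and vice versa. A minimal sketch of that rule follows. It assumes that each factor score is the sum of the ratings of its five adjectives and that ratings are numeric; neither the response scale nor the handling of ties is stated in the text, so the function `classify_mood` and its tie behaviour are hypothetical.

```python
POSITIVE_ITEMS = ("good", "calm", "happy", "excited", "pleased")
NEGATIVE_ITEMS = ("sad", "melancholic", "anxious", "pessimistic", "disappointed")


def classify_mood(ratings):
    """Classify a participant as 'positive' or 'negative' from adjective ratings.

    `ratings` maps each of the ten adjectives to a numeric rating.
    Assumption: factor scores are simple sums of the item ratings.
    Returns None on a tie, because the text does not say how ties were treated.
    """
    positive = sum(ratings[item] for item in POSITIVE_ITEMS)
    negative = sum(ratings[item] for item in NEGATIVE_ITEMS)
    if positive > negative:
        return "positive"
    if negative > positive:
        return "negative"
    return None


# Invented example ratings on an assumed 1-5 scale.
example = {
    "good": 4, "calm": 3, "happy": 4, "excited": 2, "pleased": 4,
    "sad": 2, "melancholic": 1, "anxious": 3, "pessimistic": 2, "disappointed": 1,
}
print(classify_mood(example))  # positive (17 versus 9)
```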
Fig. 5. Diagram of the experimental design of the third phase.
After each question in the exam, the participants gave a confidence rating of how sure they were that they had answered the question correctly. The scale on which the participants gave this confidence rating ranged from 1 to 5. It was explained to them that 1 meant that s/he was absolutely not sure about the correctness of his/her answer, and 5 meant that s/he was completely certain concerning the correctness of the answer. The questions were given in a fixed order. After answering a question, it was not possible to go back to the previous one. When the participants completed the 7th question, they encountered a problem deliberately caused by the system. The problem was the same for every participant: Turkish alphabetical characters (e.g., ç, ş) did not come through properly. Hence, it was very difficult to read the rest of the questions and the related choices. When the problem occurred, the system showed an error message and informed the participants that the problem could not be fixed by the system. The system showed a different error message (i.e., Apology 1 or Apology 2) to each group. After the 10th question, these messages were presented again (see Fig. 5). The aim of presenting the second error message was to ascertain whether there is a short-term or long-term effect of the message on the participants’ self-appraisals of performance.

5.4.5. Analysis
A mixed ANOVA was conducted to evaluate the relationship between message type and self-appraisals of performance. In the analysis, message type was considered as a between-subject variable. While self-appraisal of performance was considered the dependent variable, mood state was used as the quasi-independent variable in our study. One of the aims of the study was to determine whether there are significant decreases or increases in the different levels of the dependent variables according to the message type.
Specifically, the study investigated both within-subject variables and between-subject variables simultaneously. For this reason, a mixed ANOVA was chosen to analyze the data in this study.

5.4.6. Independent variables

5.4.6.1. The message type (between-subject variable)
Recall that based on the APSSs derived in the first phase, the following apology messages were prepared. These messages were used in the experimental groups.

Apology 1 group: The problem caused by the system could not be fixed. We apologize for this. Any negative effects of the problem on your performance will be compensated during the evaluation of the test.

Apology 2 group: Turkish characters could not be printed appropriately due to an error. We apologize for this.

In addition to these apology messages, one plain computer message was prepared for use in the control group.

5.4.6.2. The mood state (between-subject variable)
One way to incorporate emotions into research is to collect data about “naturally occurring, temporary emotional reactions,” which means emotional states that are not manipulated before starting the experiment. Rather, the groups, both experimental and control, are formed based on emotional states that are obtained in a natural way. These naturally acquired states can serve either as a quasi-independent variable or as a dependent variable (Parrott and Hertel, 1999). Hence, the mood states were used as a quasi-independent variable in our study, because they were not manipulated before starting the experiment.

5.4.7. The dependent variable

5.4.7.1. The self-appraisals of performance (within-subject variable)
This variable was measured to determine whether the apologetic error message affects the participants’ self-appraisals of performance. Levels of self-appraisal of performance are generally determined by questions such as “how well do you think you performed on this test?” or “how would you rate your performance?” (De Laere et al., 1998). Similarly, in our study, the participants were asked to judge their performance for each question by means of a rating scale. In order to measure the participants’ self-appraisals of performance, our procedure was to quantify the self-appraisal of performance for each question and then to compute an overall self-appraisal of performance score by summing up the quantified scores. The total scores for the participants’ self-appraisals of performance were based on their confidence ratings. The self-appraisal of performance score assigned to each question was simply the participant’s confidence rating for the question, regardless of the correctness of the answer. Table 6 shows the self-appraisal of performance score for each confidence rating. As can be seen from Table 6, if a participant was completely certain concerning the correctness of his/her answer for a question, 5 was assigned as the self-appraisal of performance score for the question. On the other hand, if he/she was definitely not sure about the correctness of his/her answer, 1 was assigned as the self-appraisal of performance score. Three sub-variables related to this dependent variable were constructed according to the points where the errors occurred. The sub-variables and the ways in which the self-appraisal of performance score for each sub-variable was obtained are explained below.

Table 6
Self-appraisal of performance scores.

| Confidence rating | Self-appraisal of performance score |
|---|---|
| Certainly correct | 5 |
| Probably correct | 4 |
| Uncertain | 3 |
| Probably incorrect | 2 |
| Certainly incorrect | 1 |

5.4.7.1.1. Self-appraisal 1 variable
(Mean score of self-appraisal of performance scores of the first seven questions, before the first error message.) The self-appraisal of performance scores of the first seven questions were summed up and then the total score was divided by seven to get the mean score.

5.4.7.1.2. Self-appraisal 2 variable
(Mean score of self-appraisal of performance scores of the questions from 8 to 10, between the first error message and the second error message.) The self-appraisal of performance scores of the questions from 8 to 10 were summed up and then the total score was divided by three to get the mean score.

5.4.7.1.3. Self-appraisal 3 variable
(Mean score of self-appraisal of performance scores of the questions from 11 to 14, after the second error message.) The self-appraisal of performance scores for the questions from 11 to 14 were summed up and then the total score was divided by four to get the mean score.

5.4.7.2. Real task performance (within-subject variable)
The main reason for using actual task performance as a dependent variable in this study is to understand whether the changes in participants’ self-appraisals of performance are caused by the changes in their actual task performance. Participants’ actual performance score is defined as the number of questions correctly answered in the test. Three sub-variables related to this dependent variable were constructed according to the points where the errors occurred. The sub-variables and the ways in which the actual performance score for each sub-variable was obtained are explained below.

5.4.7.2.1. Performance 1 variable
(Mean score of actual performance scores of the first seven questions, before the first error message.) Actual performance scores of the first seven questions were summed up and then the total score was divided by seven to get the mean score.

5.4.7.2.2. Performance 2 variable
(Mean score of actual performance scores of the questions from 8 to 10, between the first error message and the second error message.) Actual performance scores of the questions from 8 to 10 were summed up and then the total score was divided by three to get the mean score.

5.4.7.2.3. Performance 3 variable
(Mean score of actual performance scores of the questions from 11 to 14, after the second error message.) Actual performance scores for the questions from 11 to 14 were summed up and then the total score was divided by four to get the mean score.
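The six sub-variables defined above all follow the same recipe: split the 14 per-question scores at the two error messages (after questions 7 and 10) and average within each block. The sketch below illustrates that computation. The example ratings and correctness flags are invented, actual performance is assumed to be coded 1 for a correct and 0 for an incorrect answer, and the names `BLOCKS` and `block_means` are hypothetical.

```python
# Zero-based question indices for the three occasions.
BLOCKS = {
    "1": range(0, 7),    # questions 1-7, before the first error message
    "2": range(7, 10),   # questions 8-10, between the two error messages
    "3": range(10, 14),  # questions 11-14, after the second error message
}


def block_means(per_question_scores):
    """Return the mean score per occasion for one participant's 14 scores."""
    if len(per_question_scores) != 14:
        raise ValueError("expected one score for each of the 14 questions")
    return {
        name: sum(per_question_scores[i] for i in indices) / len(indices)
        for name, indices in BLOCKS.items()
    }


# Invented example data for a single participant.
confidence = [5, 4, 4, 5, 3, 4, 3, 2, 3, 1, 4, 4, 3, 5]  # ratings on the 1-5 scale
correct = [1, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1]     # assumed 1 = correct, 0 = incorrect

self_appraisal = block_means(confidence)  # Self-appraisal 1, 2, 3
performance = block_means(correct)        # Performance 1, 2, 3
print(self_appraisal)  # {'1': 4.0, '2': 2.0, '3': 4.0}
print(performance["3"])  # 0.75
```

Using one function for both kinds of score mirrors the parallel definitions in the text: only the input (confidence ratings versus correctness) differs.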
Table 7
Item loadings, communalities, eigenvalues, and proportion of variance explained.

| Item | Positive affect | Negative affect | Communalities |
|---|---|---|---|
| Pleased | .91 | | .96 |
| Calm | .87 | | .90 |
| Good | .86 | | .78 |
| Excited | .86 | | .86 |
| Happy | .82 | | .85 |
| Disappointed | | .85 | .80 |
| Sad | | .81 | .72 |
| Anxious | | .73 | .72 |
| Melancholic | | .72 | .55 |
| Pessimistic | | .69 | .63 |
| Eigenvalues | 6.58 | 1.20 | |
| Explained variance (%) | 42.70 | 35.14 | |
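The eigenvalues and explained-variance figures in Table 7 can be cross-checked against each other. The sketch below rests on two assumptions that the text does not state explicitly: that the eigenvalues are the unrotated values from an analysis of the correlation matrix of the 10 items (so each item contributes one unit of variance), and that the explained-variance percentages are the values after varimax rotation. Under those assumptions the two factors should account for the same total variance before and after rotation, even though the split between them changes.

```python
N_ITEMS = 10  # number of adjectives in the mood questionnaire

# Values as listed in Table 7.
eigenvalues = {"positive_affect": 6.58, "negative_affect": 1.20}
rotated_variance = {"positive_affect": 42.70, "negative_affect": 35.14}


def percent_of_variance(eigenvalue, n_items=N_ITEMS):
    """Share of total variance for one component, assuming standardized items."""
    return 100.0 * eigenvalue / n_items


unrotated_total = sum(percent_of_variance(v) for v in eigenvalues.values())
rotated_total = sum(rotated_variance.values())

print(round(unrotated_total, 1))  # 77.8
print(round(rotated_total, 2))    # 77.84
```

The two totals agree to within rounding, which is consistent with the two-factor solution explaining about 77.84% of the total variance.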
5.4.8. Results

5.4.8.1. Exploratory factor analysis for the mood questionnaire
Factor analysis is a statistical method for identifying groups or clusters of variables in order to understand the structure of variables that cannot be measured directly. In this study, exploratory factor analysis was conducted in order to ascertain how many factors there are in the mood state questionnaire. The results of the factor analysis revealed that the questionnaire comprised two factors: one for positive affect and one for negative affect. In order to determine whether factor analysis can be applied to a particular set of data, adequacy of sample size has to be checked. One way of doing so is to conduct the KMO (Kaiser–Meyer–Olkin) test. The KMO index of sampling adequacy for our data was .90, indicating that the data represented a homogeneous collection of variables that were suitable for factor analysis. Bartlett’s test of sphericity was significant for the sample (χ²(45) = 2267.89; p < .001), indicating that the set of correlations in the correlation matrix was significantly different from zero and suitable for factor analysis. To determine which variables relate most to which factor, following Reise et al.’s (2000) suggestion, a principal component extraction with varimax rotation was run to estimate the number of factors. Prior analysis indicated a two-factor solution, explaining 77.84% of the total variance. Eigenvalues and the scree plot suggested a two-factor solution as well. Furthermore, a parallel analysis (Reise et al., 2000) revealed the same result. The factors were internally consistent and well defined by the variables. The first extracted factor, labeled positive affect, was composed of five items and explained 42.70% of the variance. The positive affect included positive mood instances such as satisfied, calm, happy, etc.
The second factor, termed negative affect, consisted of five items and explained 35.14% of the variance. Recall that negative affect refers to negative mood states such as disappointment, depression, and anxiety. Table 7 indicates item loadings, eigenvalues, and proportion of variance explained. Each factor, namely the positive and negative affects, had satisfactory internal consistency (α = .96, and α = .88, respectively).

5.4.8.2. The participants’ self-appraisals of their performance
In order to understand how the participants’ self-appraisals of performance change according to message type and mood state, self-appraisals of performance scores (repeated-measures variable), message type (between-group variable) and mood state (between-group variable) should be entered into the same analysis. To do so, a mixed ANOVA was conducted, because this test is used when there are both repeated-measures and between-group variables in the same research design. The mixed ANOVA showed that the main effect of message type on self-appraisals of performance was significant when repeated measures of three sequenced occasions were considered, F(4, 354) = 2.644, p < .05. On the other hand, when the between-subjects effects were investigated, the results did not show significance among the groups, F(2, 177) = 1.894, p = .154. The descriptive statistics of self-appraisals of performance are shown in Table 8.

Table 8
Descriptive statistics of the self-appraisals of performance. Self-appraisal 1: before 1st error message; Self-appraisal 2: between 1st and 2nd error message; Self-appraisal 3: after 2nd error message.

| Message groups | Self-appraisal 1 Mean | Self-appraisal 1 S.D. | Self-appraisal 2 Mean | Self-appraisal 2 S.D. | Self-appraisal 3 Mean | Self-appraisal 3 S.D. |
|---|---|---|---|---|---|---|
| Apology 1 group | 4.00 | .66 | 3.48 | 1.00 | 3.56 | 1.07 |
| Apology 2 group | 4.06 | .56 | 3.57 | .88 | 3.85 | .90 |
| Control group | 4.09 | .55 | 3.91 | .75 | 3.74 | .93 |

Fig. 6. Self-appraisals of performance.

Table 8 and Fig. 6 indicate that the self-appraisals of performance of all groups were nearly the same before the time the 1st error message was presented. Self-appraisal of performance scores of all groups began to descend after the first error message. At the time the 2nd error message was presented, the mean difference of self-appraisal of performance scores between the Apology 1 group and the control group and the mean difference between the Apology 2 group and the control group were the greatest. However, these differences are not statistically significant. At the end of the test, the mean difference between the Apology 1 group and the Apology 2 group was the greatest. However, this difference is also not statistically significant.

When the main effect of mood state on self-appraisals of performance was investigated, it was found that there was no significant main effect of mood state on the self-appraisals of performance between groups, F(1, 178) = 1.901, p = .170. The results of the test of within-subjects effects show that there was a significant interaction effect exerted by the message type and mood state on the self-appraisals of performance, F(10, 348) = 3.327, p < .05. Moreover, when the results of the test for the between-subject effects were investigated, the interaction between the message type and mood state revealed a significant effect on the self-appraisals of performance between groups, F(5, 174) = 3.033, p < .05, r = .13. This effect indicates that the influence of apology messages on self-appraisals of performance depends on the participants’ mood state (see Table 9, Fig. 7, and Fig. 8).

Table 9
Descriptive statistics of the interaction effects of message type and mood state. Self-appraisal 1: before 1st error message; Self-appraisal 2: between 1st and 2nd error message; Self-appraisal 3: after 2nd error message.

| Message group | Mood state | Self-appraisal 1 Mean | Self-appraisal 1 S.D. | Self-appraisal 2 Mean | Self-appraisal 2 S.D. | Self-appraisal 3 Mean | Self-appraisal 3 S.D. |
|---|---|---|---|---|---|---|---|
| Apology 1 group | Positive affect | 4.00 | .61 | 3.75 | .86 | 3.55 | 1.02 |
| Apology 1 group | Negative affect | 4.00 | .76 | 3.05 | 1.08 | 3.57 | 1.15 |
| Apology 1 group | Total | 4.00 | .66 | 3.48 | 1.00 | 3.56 | 1.07 |
| Apology 2 group | Positive affect | 4.20 | .40 | 3.70 | .79 | 4.17 | .70 |
| Apology 2 group | Negative affect | 3.84 | .70 | 3.38 | .98 | 3.36 | .95 |
| Apology 2 group | Total | 4.06 | .56 | 3.57 | .88 | 3.85 | .90 |
| Control group | Positive affect | 3.93 | .60 | 3.76 | .63 | 3.71 | 1.03 |
| Control group | Negative affect | 4.25 | .46 | 4.05 | .84 | 3.78 | .82 |
| Control group | Total | 4.09 | .55 | 3.91 | .75 | 3.74 | .93 |

Fig. 7. Self-appraisals of performance in negative moods.

Fig. 7 shows that the self-appraisal of performance of the control group participants in a negative mood decreased without sharp changes between the time the 1st error message was presented and the time the 2nd error message was presented, and between the time the 2nd error message was presented and the end of the test. The self-appraisal of performance of the participants in the Apology 2 group also decreased from time 1 to time 2, but with a sharp decline. After the second apology message, this decline continued, but the slope of the decline was much smaller than the previous one. In the Apology 1 group, the participants in a negative mood did not think that they performed better in time 2 compared to time 1, and the slope of this decline was greater than that of the other groups. When they received the second apology message, their self-appraisal of performance increased. Finally, at the end of the application, the participants’ self-appraisal of performance scores were close to each other.

Fig. 8. Self-appraisals of performance in positive moods.

On the other hand, for positive mood, the changes among occasions in terms of self-appraisal of performance were different from those for negative mood. Fig. 8 indicates that the participants’ self-appraisals of performance decreased in all groups after the problem occurred. When they received the second message, the participants’ self-appraisals of performance continued to descend in the Apology 1 and control groups. This decline was similar to that between time 1 and time 2. However, after the second message, the participants’ confidence level increased in the Apology 2 group. This means that the participants in the Apology 2 group responded to questions with more confidence after they had received the second apology message.
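The pattern described for Figs. 7 and 8 can be recovered from the cell means in Table 9 by taking the change between consecutive occasions. The sketch below does only that descriptive arithmetic; it says nothing about statistical significance, and the helper names (`changes`, `rebounds`) are hypothetical.

```python
# Cell means from Table 9: (Self-appraisal 1, Self-appraisal 2, Self-appraisal 3).
table9_means = {
    ("Apology 1", "positive"): (4.00, 3.75, 3.55),
    ("Apology 1", "negative"): (4.00, 3.05, 3.57),
    ("Apology 2", "positive"): (4.20, 3.70, 4.17),
    ("Apology 2", "negative"): (3.84, 3.38, 3.36),
    ("Control", "positive"): (3.93, 3.76, 3.71),
    ("Control", "negative"): (4.25, 4.05, 3.78),
}


def changes(means):
    """Return (change after 1st message, change after 2nd message), rounded."""
    first, second, third = means
    return round(second - first, 2), round(third - second, 2)


def rebounds(table):
    """List the cells whose mean rose after the second error message."""
    return [cell for cell, means in table.items() if changes(means)[1] > 0]


for cell, means in table9_means.items():
    print(cell, changes(means))

print(rebounds(table9_means))
# [('Apology 1', 'negative'), ('Apology 2', 'positive')]
```

Every cell declines after the first error message, and only two cells rise after the second one: negative-mood participants who received Apology 1 and positive-mood participants who received Apology 2.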
5.4.8.3. Correlation between dependent variables
The results of correlation analysis using Pearson’s correlation coefficient indicated that actual performance was positively related to self-appraisal of performance (r = .180 for questions 1–7; r = .312 for questions 8–10; r = .260 for questions 11–14). When R² was used to understand how much of the variability of self-appraisal of performance is accounted for by actual performance, the following results were obtained: the value of R² for questions 1–7 was .032, that for questions 8–10 was .097, and that for questions 11–14 was .067. These results showed that actual performance (1) accounted for 3.2% of the variability in the self-appraisal of performance before the first error message, (2) explained 9.7% of the variability in the self-appraisal of performance between the first and second error messages, and (3) accounted for 6.7% of the variability in the self-appraisal of performance after the second error message. Given these findings, it is difficult to say that change in self-appraisal of performance is solely caused by the change in actual performance or vice versa. Because actual performance explained only a small percentage of the variability in the self-appraisal of performance, the effects of mood state and of message type on self-appraisal of performance are discussed regardless of the actual performance.

6. Discussion
In this study, the main effect of apology type on self-appraisals of performance, and the interaction effect of apology type and mood state on self-appraisals of performance in human–computer interaction were examined.
The findings showed that the main effect and the interaction effect on self-appraisals of performance were significant when repeated measures of the three sequenced occasions (the occasion before the first error message, the occasion between the first and the second error messages, and the occasion after the second error message) were considered. When the difference between the three groups (two experimental and one control group) was examined, a significant difference was found only for the interaction effect of apology type and mood state on self-appraisals of performance. The results of the study further showed that the use of apologetic error messages in the computerized environment did not, by itself, influence the users’ self-appraisals of performance. These results are consistent with those of De Laere et al. (1998), who found no significant difference between the participants who received human-like feedback and those who received machine-like feedback in terms of self-appraisals of performance in the computerized environment. A relevant study is that by Forgas (1999), who found that individuals in negative affect demonstrate greater politeness than those in positive affect while making requests. In other words, the level of politeness people use depends on their affective state. From this point of view, one interpretation of our results might be that apologetic error messages used to influence users’ self-appraisals of performance in the computerized environment may not be enough if the users’ affective state is not also considered. This interpretation is supported by another result of this study, namely that the interaction effect of message type and mood state on self-appraisals of performance was significant. The interaction effect of message type and mood state showed that the influence of apologetic messages on self-appraisals of performance depends on the participants’ mood state.
The apologetic message that includes the IFID, the ToR, and the OoR strategies (Apology 1) was more effective for users in a negative mood state than the message including the IFID and the Exp strategies (Apology 2). The Apology 2 message, on the other hand, was more effective for users in a positive mood state than the Apology 1 message. These findings can be interpreted in light of the results of Forgas (1999). Forgas (1999) found that sad moods increase and happy moods decrease the level of request politeness, and the effect of mood on the level of politeness is greater when the subject makes more risky and unconventional requests that require more elaborate processing strategies. Based on Forgas’s (1999) findings, we can speculate that the users’ apology preferences might depend on their mood states. It is possible to say that the Apology 1 message was more effective for users in a negative mood state than the Apology 2 message because this message might be one of the messages preferred by users in a negative mood state. On the other hand, the Apology 2 message was more effective for users in a positive mood state than the Apology 1 message because this message might be one of the messages preferred by users in a positive mood state. This idea should be tested in the future, because in the first phase of this study (in which the participants’ apology preferences were elicited), their mood states were not considered. As stated above, the Apology 1 message was more effective for subjects in a negative mood state than the Apology 2 message. Another possible reason behind the effectiveness of the former message might be that it informed the participants that the system was responsible for the problem, and that the negative effects of the problem would be compensated. 
People in a negative mood state tend to produce more negative self-assessments and their self-confidence decreases, which leads to more self-depreciating attributions (Cervone et al., 1994; Mayer and Hanson, 1995). Therefore, users in a negative mood state would most likely tend to assess their own performance negatively after they encountered the problem. After they received the message including the IFID, the ToR, and the OoR (Apology 1), they learned that the system was responsible for the problem, and that the negative effects of the problem on the task performance would be compensated. Knowing this might have relieved these participants, and this in turn might have increased their self-confidence and produced more positive assessments about their own performances. As can be seen in Fig. 7, the self-appraisal of performance scores of participants in a negative mood state did not increase after they received the Apology 2 message. Stated differently, employing the message consisting of the IFID and the Exp strategies (Apology 2) did not influence the subjects’ self-appraisals of their performance. The reason for this result might be that in this error message the system did not accept the responsibility for the problem. Considering this fact and the theory
that people in a negative mood state tend to make more self-depreciating attributions, it may be assumed that subjects might attribute the reason for the problem to certain internal factors. The message including the source of the problem (Exp) and the direct expression of an apology (IFID) might not cause subjects in a negative mood state to feel relieved, hence their self-confidence will not be increased and no positive assessments about their own performance will be produced. This may explain the finding of the non-significant effect of the message on self-appraisals of performance. On the other hand, the Apology 2 message was more effective for subjects in a positive mood state than the Apology 1 message (see Fig. 8). People in a positive mood are more confident, ambitious, and helpful, so they tend to form more positive and more confident inferences (Forgas, 1999; Forgas, 1995; Mayer et al., 1995). Since the subjects in a positive mood are more self-confident, they can produce more positive evaluations of their own performance. Therefore, even if subjects in a positive mood state received the message which omitted the explanation that the computer was responsible for the problem, they might still have attributed the source of the problem to certain external factors, such as the computer. This external attribution for the cause of the problem might cause those subjects to assess their own performance more positively. This might explain the finding that self-appraisal of performance scores of subjects in a positive mood state increased after they received the second message including the IFID and the Exp strategies. Employing the Apology 1 message did not have any influence on the self-appraisal of performance scores of the subjects in a positive mood. The following reasoning of Schwarz and Clore (1983) sheds some light on this finding. 
They found that the effects of mood on judgments disappeared when respondents attributed their feelings to irrelevant, situational causes. According to this finding, the positive-mood users who received the error message which expressed the responsibility of the system may have attributed the responsibility of the problem to certain irrelevant internal or external factors, namely to causes other than the system’s inability to carry out a task. Such a possible irrelevant attribution might remove the effects of mood on users’ self-appraisals of performance. There is a crucial point related to the finding concerning the interaction effects of apology type and mood state. As can be seen in Figs. 6 and 7, after the first error message, the self-appraisal of performance scores decreased in all groups. After the second error message was presented by the system, the control groups’ self-appraisal of performance scores continued to decrease both in the negative and positive mood state conditions. However, the use of the Apology 1 message increased the self-appraisal of performance scores for subjects in a negative mood state, whereas the use of the message including the IFID and the Exp increased the self-appraisal of performance scores for those in a positive mood state. Weiner’s (1979) study may help us to explain why apologetic error messages showed their influence on self-appraisals of performance not after the first presentation, but only after the second presentation of these messages. Weiner (1979) differentiated the causal attributions according to whether the perceived causes of events are stable (unchanging over time) or variable (changing over time). The individual’s expectancy of future success or failure depends on the perceived stability of the cause of the previous outcome. 
The attribution of an outcome to stable factors leads to greater typical shifts in expectancy (increases in expectancy after success and decreases in expectancy after failure) than the attribution of an outcome to unstable factors. In our study, when the participants encountered the first error message, they might have expected that the system would fix the problem. Between the first and the second error messages, the participants might not have made any attribution to either a stable or a variable cause. Once the participants received the second error message, they might have realized that the problem was stable and might then have attributed their perceived success or failure to a stable factor (e.g., the computer’s inability). After the failure was attributed to a stable factor, the control group’s self-appraisals of performance continued to decrease. Following Weiner’s hypothesis, the same decline might also be expected in the experimental groups. However, the self-appraisals of performance of the participants in a negative mood state in the Apology 1 group increased after the presentation of the second error message, as did those of the participants in a positive mood state in the Apology 2 group. Based on these findings and Weiner’s hypothesis, we can speculate that the effects of apologetic error messages on self-appraisals of performance will be observed after the users attribute the failure to stable factors. This hypothesis should be tested in future studies.

7. Conclusion

The findings of this study showed that the degrees of preference for certain apology strategies in a computerized environment are similar to those preferred in a social context. This finding supports the idea that the social dynamics which guide HHI are equally applicable to HCI.
In addition, different apologetic error messages, containing different APSSs, have different effects on the users’ self-appraisals of performance in relation to their mood states (positive versus negative). Since the field of the Pragmatics of Human–Computer Interaction does not have a long history, the topic of which dynamics of HHI can be applied to HCI has not been studied in detail. This study presents a new point of view to guide future studies in the field by making two original contributions based upon the following findings: (a) The error message including a direct apology, an expression of responsibility, and a promise to offer a repair was useful for users in a negative mood state, and (b) the error message consisting of a direct apology and a statement concerning the source of the problem was useful for users in a positive mood state. Therefore, the first contribution is that the use of different strategies in an apology leads to different effects on users. The second contribution is that this study showed the significance of the users’ mood state in understanding the effects of human-like messages (i.e., apologies) on users’ self-appraisals of performance in a computerized environment. The effects of
mood states are particularly significant in relation to evaluative judgments of the users (i.e., self-appraisals of performance). Further studies on the users’ mood state may well be useful for designers of human-like interfaces. This study may also contribute to the field of affective computing, the goal of which is to create affective interfaces, such as human-like interfaces, in which apologetic error messages are presented instead of plain computer messages. Picard (1997) discussed certain applications with various affective abilities, such as a computer tutor that recognizes the users’ affect to individualize its teaching strategy. In the same vein, the findings of this study showed the importance of developing affective interfaces which recognize the users’ affect and help the computer to personalize its apologetic error messages according to the user’s affective state.

8. Limitations of the study and directions for future research

The participants of the first phase and the third phase were not the same participants. The APSSs preferred by the individuals who participated in the first phase might have been different from those preferred by the individuals who participated in the third phase. Therefore, different results might have been found if the same participants had been involved in both the first phase and the third phase. In the first phase, the DCT was used to reveal the apology preferences of the participants. However, the participants’ mood states were not considered in this phase. Forgas (1999) found that the requests preferred by individuals changed with respect to their mood states. Similar results might be obtained if participants’ mood states are considered while eliciting their apology preferences, which would be a natural extension of this study. The mood states were elicited before the participants sat down to the online tasks.
Since the online system involved a test part, in which participants encountered certain problems, their mood states might have changed after they encountered the problems. Therefore, their mood states at the beginning of the online application might be different from their mood states at the end of the application or after the lesson but before the test. In addition, the apologetic error messages received by users might influence participants’ mood states. Hence, their mood states before the first error message and those after the first and second error messages might be different from each other. Because of these possibilities, and because only the mood states measured at the beginning of the application were considered, it is very difficult to generalize the results of this study. These points should be considered in future studies. After eliciting the APSSs with respect to participants’ mood states, another study may be conducted in order to answer the following questions: Does the apologetic message preferred by users in a positive mood state influence the self-appraisals of performance of users in a positive mood state better than the message preferred by users in a negative mood state? Does the apologetic message preferred by users in a negative mood state influence the self-appraisals of performance of users in a negative mood state better than the message preferred by users in a positive mood state? In order to answer these questions, two different apologetic messages might be considered during the design phase: a message preferred by users in a positive mood state and a message preferred by users in a negative mood state.

Acknowledgements

The authors gratefully acknowledge the insightful comments of two anonymous reviewers, and the help of Didar Akar, who presented an early version of this work at the 10th International Pragmatics Conference.
We would also like to thank Annette Hohenberger, Şükriye Ruhi, Bilge Say and Sacip Toker for their helpful and encouraging comments at various stages of this work.

References

Blum-Kulka, Shoshana, Olshtain, Elite, 1984. Requests and apologies: a cross-cultural study of speech act realization patterns (CCSARP). Applied Linguistics 5 (3), 196–213.
Blum-Kulka, Shoshana, House, Juliane, Kasper, Gabriele, 1989. Investigating cross-cultural pragmatics: an introductory overview. In: Blum-Kulka, S., House, J., Kasper, G. (Eds.), Cross-cultural Pragmatics: Requests and Apologies. Ablex, Norwood, pp. 1–36.
Bower, Gordon H., 1981. Mood and memory. American Psychologist 36, 129–148.
Burgoon, J.K., Bonito, J.A., Bengtsson, B., Cederberg, C., Lundeberg, M., Allspach, L., 2000. Interactivity in human-computer interaction: a study of credibility, understanding and influence. Computers in Human Behavior 16, 553–574.
Cervone, Daniel, Kopp, Deborah A., Schaumann, Linda, Scott, Walter D., 1994. Mood, self-efficacy and performance standards. Journal of Personality and Social Psychology 67 (3), 499–512.
De Laere, Kevin H., Lundgren, David C., Howe, Steven R., 1998. The electronic mirror: human-computer interaction and change in self-appraisals. Computers in Human Behavior 14 (1), 43–59.
Efklides, Anastasia, Petkaki, Chryssoula, 2005. Effect of mood on students’ metacognitive experiences. Learning and Instruction 15, 415–431.
Fiedler, Klaus, 2000. Towards an integrative account of affect and cognition phenomena using the BIAS computer algorithm. In: Forgas, J.P. (Ed.), Feeling and Thinking: The Role of Affect in Social Cognition. Cambridge University Press, New York, pp. 223–252.
Fogg, B.J., Nass, Clifford, 1997. Silicon sycophants: the effects of computers that flatter. International Journal of Human-Computer Studies 46, 551–561.
Forgas, Joseph P., 1995. Mood and judgment: the Affect Infusion Model (AIM). Psychological Bulletin 117, 39–66.
Forgas, Joseph P., 1999. Feeling and speaking: mood effects on verbal communication strategies. Personality and Social Psychology Bulletin 25 (7), 850–863.
Hatipoğlu, Çiler, 2003. Culture, Gender and Politeness: Apologies in Turkish and British English. Unpublished PhD Thesis. University of the West of England, Bristol, UK.
Hatipoğlu, Çiler, 2004. Do apologies in e-mails follow spoken or written norms? Some examples from British English. Studies About Languages 5, 21–29.
Hone, Kate, 2006. Empathic agents to reduce user frustration: the effects of varying agent characteristics. Interacting with Computers 18 (2), 227–245.
Isen, Alice M., 1985. Asymmetry of happiness and sadness in effects on memory in normal college students: comments on Hasher, Rose, Zacks, Sanft, and Doren. Journal of Experimental Psychology: General 114, 388–391.
Johnson, Daniel, Gardner, John, Wiles, Janet, 2004. Experience as a moderator of the media equation: the impact of flattery and praise. International Journal of Human-Computer Studies 61 (3), 237–258.
Klein, J., Moon, Y., Picard, R.W., 1999. This computer responds to user frustration. In: Paper presented at the Conference on Human Factors in Computing CHI’99, Pittsburgh, Pennsylvania.
Klein, J., Moon, Y., Picard, R.W., 2002. This computer responds to user frustration: theory, design and the results. Interacting with Computers 14, 119–140.
Lazar, Jonathan, Jones, Adam, Hackley, Mary, Shneiderman, Ben, 2005. Severity and impact of computer user frustration: a comparison of student and workplace users. Interacting with Computers 18, 187–207.
Leech, Geoffrey N., 1983. Principles of Pragmatics. Longman, London.
Lisetti, Christine L., Schiano, Diane J., 2000. Automatic facial expression interpretation: where human-computer interaction, artificial intelligence and cognitive science intersect. Pragmatics and Cognition (Special Issue on Facial Information Processing: A Multidisciplinary Perspective) 8 (1), 185–235.
Mayer, John D., McCormick, Laura J., Strong, Sara E., 1995. Mood-congruent memory and natural mood: new evidence. Personality and Social Psychology Bulletin 21, 736–746.
Nass, Clifford, Moon, Youngme, 2000. Machines and mindlessness: social responses to computers. Journal of Social Issues 56 (1), 81–103.
Nass, Clifford, Steuer, Jonathan, Tauber, Ellen R., 1994. Computers are social actors. In: Paper presented at the Conference on Human Factors in Computing, CHI’94, Boston, Massachusetts.
Nielsen, Jakob, 1998. Improving the Dreaded 404 Error Message. Retrieved September 12, 2006, from http://www.useit.com/alertbox/404_improvement.html.
Olshtain, Elite, 1989. Apologies across languages. In: Blum-Kulka, S., House, J., Kasper, G. (Eds.), Cross-cultural Pragmatics: Requests and Apologies. Ablex, Norwood, pp. 155–174.
Olshtain, E., Cohen, Andrew, 1983. Apology: a speech act set. In: Wolfson, N., Judd, E. (Eds.), Sociolinguistics and Language Acquisition. Newbury House, Rowley, pp. 18–35.
Parrott, W. Gerrod, Hertel, Paula, 1999. Research methods in cognition and emotion. In: Dalgleish, T., Power, M.J. (Eds.), Handbook of Cognition and Emotion. John Wiley and Sons, Chichester, pp. 61–77.
Picard, Rosalind W., 1997. Affective Computing. The MIT Press, Cambridge.
Picard, Rosalind W., 2000. Toward computers that recognize and respond to user emotion. IBM Systems Journal 39, 705–719.
Reeves, Byron, Nass, Clifford, 1996. The Media Equation: How People Treat Computers, Televisions, and New Media Like Real People and Places. Cambridge University Press, Cambridge.
Reise, Steven P., Waller, Niels G., Comrey, Andrew L., 2000. Factor analysis and scale revision. Psychological Assessment 12 (3), 287–297.
Resnik, Paula V., Lammers, H. Bruce, 1985. The influence of self-esteem on cognitive responses to machine-like versus human-like computer feedback. The Journal of Social Psychology 125 (6), 761–769.
Ritter, Frank E., Young, Richard M., 2001. Embodied models as simulated users: introduction to the special issue on using cognitive models to improve interface design. International Journal of Human-Computer Studies 55, 1–14.
Schlenker, Barry R., Darby, Bruce W., 1981. The use of apologies in social predicaments. Social Psychology Quarterly 44 (3), 271–278.
Schwarz, Norbert, Clore, Gerald L., 1983. Mood, misattribution, and judgments of well-being: informative and directive functions of affective states. Journal of Personality and Social Psychology 45, 513–523.
Suchman, Lucy A., 1987. Interactive artifacts. In: Suchman, L.A. (Ed.), Plans and Situated Actions. Cambridge University Press, New York, pp. 5–26.
Tzeng, Jeng-Yi, 2004. Toward a more civilized design: studying the effects of computers that apologize. International Journal of Human-Computer Studies 61, 319–345.
Tzeng, Jeng-Yi, 2006. Matching users’ diverse social scripts with resonating humanized features to create a polite interface. International Journal of Human-Computer Studies 64, 1230–1242.
Vollmer, Helmut J., Olshtain, Elite, 1989. The language of apologies in German. In: Blum-Kulka, S., House, J., Kasper, G. (Eds.), Cross-cultural Pragmatics: Requests and Apologies. Ablex, Norwood, pp. 197–218.
Watson, David, Tellegen, Auke, 1985. Toward a consensual structure of mood. Psychological Bulletin 98, 219–235.
Weiner, Bernard, 1979. A theory of motivation for some classroom experiences. Journal of Educational Psychology 71 (1), 3–25.

Mahir Akgun is a PhD candidate at the Department of Instructional Technology at the Middle East Technical University (METU), Ankara, Turkey, from where he also earned his BS in Instructional Technology and MS in Cognitive Science. His research focuses on technology planning, cognitive tools and Human–Computer Interaction.

Dr. Kursat Cagiltay is Associate Professor at the Department of Instructional Technology at the Middle East Technical University (METU), Ankara, Turkey. He earned his BS in Mathematics and MS in Computer Engineering from Middle East Technical University. He holds a double Ph.D. in Cognitive Science and Instructional Systems Technology from Indiana University. His research focuses on Human–Computer Interaction, Instructional Technology, social and cognitive issues of electronic games, socio-cultural aspects of technology, distance learning, and Human Performance Technologies.

Dr. Deniz Zeyrek is Professor of linguistics with research interests in Turkish, discourse, pragmatic development in second language learners, and cognitive science.