
Expert Systems With Applications, Vol. 9, No. 4, pp. 441-455, 1995 Copyright © 1995 Elsevier Science Ltd Printed in the USA. All rights reserved 0957-4174/95 $9.50 + .00

Pergamon 0957-4174(95)00015-1

Assessing the Suitability of Judgmental Auditing Tasks for Expert Systems Development: An Empirical Approach

VIJAY KARAN
Department of Accounting, College of Business and Management, University of Maryland

UDAY S. MURTHY AND AJAY S. VINZE
Department of Accounting, College of Business Administration and Graduate School of Business, Texas A&M University, College Station, TX 77843-4353

Abstract--The "Big 6" public accounting firms have invested considerable resources in the development of expert systems (ESs) for a variety of auditing tasks. However, the tasks for most existing auditing ESs appear to have been selected based on accessibility and cooperation of experts, and/or the judgmental evaluation of the developer, rather than a careful selection of the task from among a number of viable alternatives. A critical aspect of task suitability is the degree to which the characteristics of a candidate task match the capabilities of ES technology. In this study, a questionnaire was developed to obtain task-related information from practicing auditors in order to distinguish among candidate auditing tasks in terms of their suitability for ES development. Auditors in the "Big 6" public accounting firms were asked to provide ratings of the knowledge, data, and task characteristics of nine judgmental auditing tasks. The analysis of the data obtained from fifty-nine auditors revealed that the nine tasks were distinguishable in terms of their suitability for ES application. Two tasks, determining compliance with generally accepted accounting principles and audit work program development, were relatively better suited for ES application, while determination of the adequacy of an allowance and going concern evaluation were the tasks least suited for ES application. The fact that actual ES development efforts in the "Big 6" firms have emphasized the compliance and audit work program development tasks provides a degree of validation of our results.

1. INTRODUCTION

OVER THE LAST few years, the applicability of expert systems (ES) technology for judgmental auditing tasks has been abundantly demonstrated. Several auditing ESs have been developed as academic prototypes and as commercial systems by the "Big 6" accounting firms.¹ The tasks that these ESs deal with have been selected primarily based on accessibility and cooperation of experts, and/or the judgmental evaluation of the developer, rather than a careful selection of the task from among a number of viable alternatives (Connell, 1987). Since accounting firms typically have several potential auditing tasks that can be addressed by ES technology, but limited resources, choosing among alternative tasks becomes especially critical.²

In the ES literature, score-sheet approaches for determining the likelihood of successful ES development for a particular task have been proposed (Krcmar, 1988; Prerau, 1985; Laufmann, DeVaney, & Whiting, 1990; Slagle & Wick, 1988). These approaches consider factors such as time and cost for development, impact on other operational systems, task characteristics, and potential system benefits. Although other considerations are relevant, successful system development is said to largely depend on the extent of "fit" between the characteristics of the task and the capabilities of ES technology (Brachman et al., 1983; Hayes-Roth, Waterman, & Lenat, 1983; Waterman, 1986). Several tasks in auditing are judgmental in nature and therefore may appear to be well suited for ES technology. However, successful ES development hinges on making fine distinctions among tasks that are prima facie well suited for ES technology. Thus, it is important to correctly rate the relevant characteristics of alternative auditing tasks to determine their relative suitability for ES application. This rating of task characteristics is best done by the domain experts, i.e., practicing auditors, rather than by system developers.³ The auditors' task ratings can then be used to draw inferences about the appropriateness of alternative tasks for ES development.

This paper presents an approach for differentiating among candidate auditing tasks in terms of task characteristics of relevance for ES development. The objectives of this study are (1) to develop a questionnaire which can be used to distinguish between candidate auditing tasks in terms of the characteristics which determine their suitability for ES development, (2) to validate the questionnaire by addressing concerns of content and construct validity (Kerlinger, 1986; O'Leary, 1987), (3) to demonstrate the utility of this questionnaire by obtaining data from practicing auditors, and (4) to create profiles of candidate auditing tasks to determine the dimensions along which tasks better suited for ES development differ from tasks not well suited. The questionnaire developed in this study was pilot-tested and then targeted to a sample of auditors from each of the Big 6 public accounting firms. The analysis of the data obtained from 59 auditors demonstrated the ability to distinguish among the nine tasks included in the questionnaire.

The paper is organized as follows. The next section reviews the literature on existing auditing ESs and prior efforts aimed at identifying appropriate tasks for ES development. Based on this literature, in the following section we discuss the development of the questionnaire and its pilot testing. Thereafter, we present the results indicating the tasks which were significantly different in terms of their suitability for ES development. The analysis of the profiles of tasks better and less suited and the comparison with existing Big 6 ESs are also presented. We then discuss the implications of the findings, and the final section summarizes the paper and presents our conclusions.

Requests for reprints should be sent to Uday S. Murthy, Department of Accounting, College of Business Administration and Graduate School of Business, Texas A&M University, College Station, TX 77843-4353, U.S.A.
¹O'Leary and Watkins (1989) provide a good review of both academic and commercial systems. Brown and Murphy (1990) and Brown (1991) have reviewed recent ES developments in the Big 6 firms.
²In a survey of partners in the Big 6 accounting firms, Abdolmohammadi and Bazaz (1991) identified 45 tasks from among 332 as being suited for ES development.

2. PRIOR RESEARCH

In the artificial intelligence literature, there has been some effort to identify the criteria for evaluating the appropriateness of a candidate task for ES application (Laufmann et al., 1990; Prerau, 1985; Slagle & Wick, 1988). We review this research, and also other efforts specifically in auditing, as a basis for our questionnaire development.

³An initial attempt at distinguishing among candidate auditing tasks was made by Havelka and Sutton (1990) using Slagle and Wick's (1988) questionnaire. However, the tasks were rated by the authors themselves and not by practicing auditors, who are the real domain experts.

2.1. Prior Research on Identifying Appropriateness of a Task for ES Application

According to Prerau (1985), the attributes for successful expert systems projects fall into these categories: basic requirements, problem type, expert, problem bounds, and domain area personnel. A total of 54 attributes across these categories were identified by Prerau. The focus of Prerau's categorization was on choosing tasks with the highest potential for ES development success within a specific organization, assuming that an expert could be identified and that the same development team would be used.

Slagle and Wick (1988) extended Prerau's work by reorganizing the list of attributes for successful ES development along two dimensions: an essential-desirable features axis and a users-task-expert axis. According to the authors, an "essential feature" is a feature that current expert system technology requires in order for the application to be a success (1988, p. 46). In contrast, "desirable features" are defined as those which are not necessarily required by current expert system technology, but whose absence would make the ES development task significantly more difficult. Both essential and desirable features relate to different aspects of users and their management, the task, and the expert. The Slagle and Wick evaluation method involves assigning scores to candidate tasks based on how each task rates on each of the essential and desirable features, with tasks earning higher scores being better suited for ES development. Although the Slagle and Wick approach represents one possible approach to evaluating candidate tasks, exactly how the "correct" scores for the various features of each task are elicited remains an open question.

Laufmann et al. (1990) present a methodology for qualitatively analyzing the applicability of ES technology to specific tasks within specific organizational environments. They too propose a "scoring worksheet" approach, with plus and minus scores being assigned to various features of each task. Four major topics are addressed in their worksheet: (1) project goals and objectives and how success is defined, (2) appropriateness of the application of ES technology for the task, (3) resource requirements and availability, and (4) nontechnical considerations. These four topics are scored at two levels: a top-level analysis focused on organizational factors independent of candidate tasks and a detailed analysis focused on specific candidate tasks. With reference to task appropriateness, the worksheet questions relate to the source of the task-related knowledge, the nature of the knowledge, the characteristics of the data required to perform the task, and task-specific characteristics. The Laufmann et al. score-sheet is intended to be used by ES developers to rate a task, perhaps in consultation with one or more domain experts. However, when several candidate tasks exist, a more appropriate approach would be to obtain ratings about task-specific features from a large pool of domain experts.
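The score-sheet approaches reviewed above share a common mechanical core: rate each candidate task on a set of features, weight the ratings, and rank tasks by the weighted total. The following minimal Python sketch illustrates that idea only; the feature names, weights, and ratings are invented for illustration and are not taken from Prerau, Slagle and Wick, or Laufmann et al.

    # Hypothetical score-sheet evaluation: weighted feature ratings per task.
    FEATURE_WEIGHTS = {
        "expert_available": 3.0,       # "essential" features carry larger weights
        "task_well_bounded": 3.0,
        "knowledge_stable": 2.0,
        "solutions_enumerable": 1.0,   # "desirable" features carry smaller weights
    }

    def task_score(ratings):
        """Weighted sum of feature ratings (each rating on a 1-9 scale)."""
        return sum(FEATURE_WEIGHTS[f] * r for f, r in ratings.items())

    candidate_tasks = {
        "audit work program development":
            {"expert_available": 8, "task_well_bounded": 7,
             "knowledge_stable": 7, "solutions_enumerable": 6},
        "going concern evaluation":
            {"expert_available": 6, "task_well_bounded": 3,
             "knowledge_stable": 4, "solutions_enumerable": 5},
    }

    # Rank candidate tasks from highest to lowest weighted score.
    for task, ratings in sorted(candidate_tasks.items(),
                                key=lambda kv: task_score(kv[1]), reverse=True):
        print(f"{task}: {task_score(ratings):.1f}")

The open question raised above remains visible even in this toy form: the ranking is only as good as the scores fed into it, which is why the present study turns to practicing auditors for the ratings.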

2.2. Task Suitability Assessments for ESs in Auditing

In the auditing domain, several ESs have been developed as academic prototypes and as commercial systems by the Big 6 auditing firms (see O'Leary and Watkins, 1989, for a review). As Brown (1991, p. 6) notes, there is considerable diversity among the Big 6 firms in terms of which specific audit judgment tasks were emphasized for ES development. Some Big 6 ESs are comprehensive and address composite tasks such as audit planning, while others address elemental tasks such as inherent risk assessment. Many systems were developed on the basis of in-house proposals that were approved for specific auditing tasks (Abdolmohammadi & Bazaz, 1991, p. 100). Other systems were developed by supporting academic efforts through various Big 6 research funding programs, principally through KPMG Peat Marwick's Research Opportunities in Auditing program.

Given the judgmental nature of most auditing decisions, ES technology could potentially be applied to several auditing tasks. However, few attempts have been made in the auditing literature at identifying tasks better and less suited for ES development. A preliminary attempt at identifying auditing tasks appropriate for ES application was made by Hansen and Messier (1986). As part of a larger study, they asked practicing auditors to indicate which of several auditing tasks they felt were suited for ES development. However, this approach presupposes that domain experts also have sufficient understanding of ES technology to accurately indicate which tasks are better suited for that technology.

Havelka and Sutton (1990) used Slagle and Wick's (1988) method for classifying auditing tasks as being either more or less appropriate for ES technology. They modified the attributes identified by Slagle and Wick by omitting some as being irrelevant for differentiating between tasks in the auditing domain and adding others of specific relevance to the audit environment. The resulting categories were: (1) audit-oriented features, (2) task/problem features, and (3) expert features. After assigning weights to the features in each of these three categories, the authors assigned scores for each feature for each of 13 auditing tasks. The resulting scores were, according to the authors, an indication of the appropriateness of each task for the application of ES technology (higher scores representing tasks more appropriate for ES development). However, it is important to note that the scores for each feature were assigned based on the authors' subjective evaluation of each task--no input was obtained from the "real" experts, i.e., practicing auditors.

Abdolmohammadi and Bazaz (1991) solicited input from practicing auditors in an attempt to identify tasks appropriate for ES development. Subjects were provided with definitions of several decision aids, including expert systems. Then, from a list of 332 audit tasks provided to them, auditors were asked to indicate which tasks, in their opinion, were suited for ES development. Forty-five tasks were identified as being suitable for ES development. A critical concern is whether auditors can correctly identify tasks suited for ES development based simply on an understanding of ES technology from a definition provided to them. Practicing auditors do, however, have extensive knowledge about various characteristics of auditing tasks, such as the nature of knowledge, the kinds of data required for the task, and the type of reasoning used. A more appropriate approach would be to obtain input from expert auditors about the characteristics of various auditing tasks of relevance for ES development and to analyze their answers to determine which tasks are better suited for ES development.

3. QUESTIONNAIRE DEVELOPMENT

The questionnaire used in the study was developed based on Laufmann et al.'s (1990) listing of features relevant to ascertaining the appropriateness of candidate tasks for the application of ES technology. Laufmann et al.'s (1990) scoresheet was used as the basis for developing the questionnaire because of its comprehensive listing of task characteristics relevant to establishing a task's appropriateness for ES technology. Both Prerau (1985) and Slagle and Wick (1988) identified far fewer task-specific attributes. Laufmann et al.'s (1990) worksheet devotes a total of 67 questions to determining whether a candidate task is appropriate for the application of ES technology. The questions are organized in the following categories: (1) problem-solving knowledge: the source of the knowledge, (2) problem-solving knowledge: the nature of the knowledge, (3) problem-solving knowledge: case-specific data, (4) problem-solving knowledge: a knowledge/data comparison, (5) task characterization: definition and bounding, and (6) task complexity and characterization.

The thrust of Laufmann et al.'s (1990) worksheet approach was to propose a comprehensive set of questions that need to be addressed for task selection. We analyzed these questions and developed a questionnaire by modifying certain questions and deleting others to suit business decision tasks, in particular, audit judgment tasks. For example, some questions were deleted because their answers were apparent, given the nature of the audit environment ("Does the knowledge exist? Is the expert currently practicing in the selected task?"). Other questions did not apply to the audit decision-making environment ("Does the task involve significant real-time interaction with other programs or devices? Will results be reviewed for each case, or will they automatically trigger other actions or events?").


TABLE 1
Breakdown of Questionnaires Mailed and Returned, by Firm

                   Firm 1   Firm 2   Firm 3   Firm 4   Firm 5   Firm 6   Total
Mailed                 40       40       20       20       20       20     160
Returned               21       14        3        4       11        6      59
Response rate       52.5%      35%      15%      20%      55%      30%   36.9%

Some questions were somewhat ambiguous and needed to be specified more clearly to be understandable by auditors ("To what extent is the knowledge diverse? Are critical task decisions normally made from among very close alternatives, or made on the basis of levels of belief?").

The resulting questionnaire had 29 statements addressing the following four concerns in ES development: source of knowledge (9 statements), nature of knowledge (5 statements), data characteristics (4 statements), and task characteristics (11 statements). Respondents were instructed to indicate the degree of their agreement with each statement on a nine-point scale, from "1" for "strongly disagree" to "9" for "strongly agree". About half the statements were worded such that a higher degree of agreement with the statement indicated a higher degree of suitability of that task for ES development. The remaining statements were worded such that a higher degree of agreement with the statement indicated that the task was less suited for ES development. These statements were reverse coded during the statistical analysis such that higher scores always indicated a greater degree of suitability of the task for ES development. The questionnaire used in this study is included in the Appendix. Statements for which responses were reverse coded are indicated with an "R" in parentheses after the statement.

The next step was to select audit judgment tasks for which practicing auditors would indicate the degree of their agreement with the 29 task-related statements. Representative elemental tasks requiring extensive judgment at all stages of the audit process were included in the questionnaire. Our objective was to analyze auditors' responses to the statements for each of these auditing tasks to identify tasks more appropriate for ES development. Based on the reports by Brown and Murphy (1990) and Brown (1991), six judgmental auditing tasks were identified for which expert systems had been developed by at least one of the Big 6 firms. These tasks were: (1) inherent risk assessment, (2) control risk assessment, (3) developing an audit work program (audit plan), (4) determining the adequacy of a client allowance or reserve, (5) determining whether a client accounting treatment conforms to generally accepted accounting principles, and (6) evaluating the existence and adequacy of electronic data processing (EDP) controls. In addition, we selected the following three audit judgment tasks for which ESs have not as yet been developed by the Big 6 firms:

a "going concern," (2) determining planning stage and evaluation materiality levels, and (3) interpreting the results of analytical review procedures. Academic prototype ESs have been developed for the going concern decision (Selfridge & Biggs, 1989) and for the materiality judgment (Steinbart, 1987). Thus, the selected tasks facilitated the identification of more appropriate tasks for ES technology with respect to both existing and future ESs. The tasks included in the questionnaire comprised elemental tasks rather than comprehensive tasks such as audit planning. To assess the content validity of the questionnaire, pilot tests were performed using university professors and practicing auditors. Two auditing professors teaching at a major university were requested to scrutinize the instrument. The feedback provided by the professors led to the rewording of many statements that appeared somewhat ambiguous or unclear. The revised questionnaire was then mailed to senior managers in three of the Big 6 accounting firms. A total of five auditing professionals who held the rank of "manager" or above in their firms reviewed the questionnaire. A few additional statements were rephrased in the light of feedback obtained from these expert auditors. These independent reviews of the questionnaire provided some assurance that the respondents would interpret the statements in a reasonably consistent manner.

4. RESULTS

Participation in the study was solicited from all of the Big 6 firms in a large city in Texas. Additionally, the national offices of two of the Big 6 firms agreed to participate. Twenty questionnaires were mailed to a key "facilitator" in each of these eight offices with a request to distribute the questionnaires to partners, managers, and seniors with significant audit experience. Fifty-nine questionnaires were returned over a two-month period. A breakdown of questionnaires given to each office and those returned is shown in Table 1. As Table 1 reveals, the overall response rate was 36.9%. The response rates from individual firms ranged from a low of 15% to a high of 55%.⁴ Firms 1 and 2 were the firms to which 20 questionnaires were mailed to the national office in addition to the 20 questionnaires mailed to the regional office in a large city in Texas.

⁴A test was conducted for non-response bias using late respondents as a surrogate for non-respondents. No systematic differences were noted between early and late respondents in any of the nine tasks at the 0.05 level of significance.

TABLE 2
Positions of Respondents Within Each Firm and Average Years of Experience

                      Firm 1   Firm 2   Firm 3   Firm 4   Firm 5   Firm 6   Total   Average experience (years)
Partner                    3        2        1        -        -        1       7   15.4
Manager                   17        8        2        3        3        3      36    8.8
Supervisor                 1        1        -        1        1        -       4    4.5
In-charge acct.            -        3        -        -        7        2      12    3.6
Total                     21       14        3        4       11        6      59
Average experience
  (years)               10.5      7.9     10.6      6.2      5.1      6.8            8.2

Table 2 shows the positions of respondents within each firm and the average years of experience of the respondents. The respondents had an average of 8.2 years of experience, ranging from an average of 15.4 years for partners to 3.6 years for in-charge accountants (seniors). A vast majority of respondents were very experienced auditors, with 43 of the 59 respondents being either managers or partners.

4.1. Questionnaire Validation

To discern the underlying constructs in the 29-item questionnaire, a principal components factor analysis with varimax rotation was performed. Four item groupings were formed based on the factor loadings using a cut-off of 0.30. These four groupings were labeled as follows: (1) complexity of the task, (2) expertise requirements, (3) task manageability, and (4) task objectives. Statements focusing on the difficulty of the judgment at hand were grouped together as the "task complexity" factor. The second factor was labeled "expertise requirements" given its emphasis on the knowledge and data characteristics of the task at hand. The third factor, "task manageability," highlighted the scope or breadth of the task being addressed. The final factor took a solution perspective focusing on the objectives of the task and was therefore called "task objectives." All but six of the questions loaded on these four factors (the six questions had factor loadings less than 0.30). The four factors and the factor loadings are shown in Table 3.

Given the exploratory nature of this study, and that nine tasks were being considered, a high degree of variance in responses was expected. Therefore, to determine the construct validity of the four factors, a reliability analysis was performed. The resulting Cronbach's alpha (α) scores were as follows: 0.80 for "task complexity," 0.70 for "expertise requirements," 0.51 for "task manageability," and 0.48 for "task objectives." Although Nunnally (1967) suggests that α levels of at least 0.80 are required to affirm construct validity, researchers in information systems have indicated that, for exploratory studies such as this, much lower α levels are acceptable. For example, Treacy (1985) indicated that an α maintained at a 0.70 level would be reasonable, and Srinivasan (1985) suggested that an even lower level of 0.50 was acceptable.
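The two validation steps just described, principal components analysis with varimax rotation and Cronbach's alpha, can be sketched as follows. This is an illustrative implementation under the assumption that the coded item responses are held in a matrix X with one column per statement; it is not the analysis code used in the study.

    import numpy as np

    def varimax(loadings, gamma=1.0, max_iter=100, tol=1e-6):
        """Kaiser varimax rotation of a p x k loading matrix."""
        p, k = loadings.shape
        R = np.eye(k)
        var = 0.0
        for _ in range(max_iter):
            L = loadings @ R
            u, s, vt = np.linalg.svd(
                loadings.T @ (L ** 3 - (gamma / p) * L @ np.diag((L ** 2).sum(axis=0))))
            R = u @ vt
            new_var = s.sum()
            if new_var < var * (1 + tol):
                break
            var = new_var
        return loadings @ R

    def principal_component_loadings(X, n_factors=4):
        """Unrotated loadings from the correlation matrix of the items."""
        corr = np.corrcoef(X, rowvar=False)
        eigvals, eigvecs = np.linalg.eigh(corr)
        order = np.argsort(eigvals)[::-1][:n_factors]
        return eigvecs[:, order] * np.sqrt(eigvals[order])

    def cronbach_alpha(items):
        """items: n_obs x n_items array for one scale."""
        items = np.asarray(items, dtype=float)
        k = items.shape[1]
        item_var = items.var(axis=0, ddof=1).sum()
        total_var = items.sum(axis=1).var(ddof=1)
        return k / (k - 1) * (1 - item_var / total_var)

    # Usage (X is the hypothetical response matrix):
    # loadings = varimax(principal_component_loadings(X, n_factors=4))
    # grouping = np.abs(loadings) >= 0.30   # the 0.30 cut-off used in the paper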

4.2. Comparison of the Nine Tasks

Mean scores and standard deviations for each of the four factors for the nine tasks are reported in Table 4. The means and standard deviations for each statement in the questionnaire are shown along with each statement in the Appendix. The "total score" comprises the sum of the means of the four factors for each task. In effect, a higher total score for a task indicates a greater degree of suitability for ES development. To determine whether there was a significant difference in the total scores among the nine tasks, a within-subjects multivariate analysis of variance (MANOVA) was performed. The total score for each task can be considered to be an indication of its appropriateness for ES development. Since all nine tasks were rated by each of the 59 respondents, a within-subjects analysis was employed. The null hypothesis tested was as follows:

H0: The total scores for the nine candidate tasks are not different from one another.

The MANOVA resulted in a Wilks' lambda value of 0.33 and an F statistic of 12.90, indicating that the null hypothesis of no task effect can be rejected (P < 0.001). Next, to distinguish between candidate tasks, we compared pairs of tasks in an attempt to identify which pairs of tasks were significantly different from one another. The null hypothesis tested was as follows:

H0: No two tasks taken as a pair are significantly different from one another.

This hypothesis was tested using contrasts for each pair of tasks in the within-subjects MANOVA. Table 5 shows which pairs of tasks were found to be significantly different.
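A rough analogue of the within-subjects comparison described above can be sketched as follows, assuming a hypothetical long-format table named scores with columns auditor, task, and total_score (59 auditors by 9 tasks). A univariate repeated-measures ANOVA and a paired-contrast helper stand in for the paper's within-subjects MANOVA and contrasts; they are illustrative only.

    import pandas as pd
    from scipy import stats
    from statsmodels.stats.anova import AnovaRM

    def overall_task_effect(scores: pd.DataFrame):
        """Repeated-measures ANOVA: do total scores differ across the nine tasks?"""
        return AnovaRM(scores, depvar="total_score",
                       subject="auditor", within=["task"]).fit()

    def pairwise_contrast(scores: pd.DataFrame, task_a: str, task_b: str):
        """Paired comparison of two tasks rated by the same auditors."""
        wide = scores.pivot(index="auditor", columns="task", values="total_score")
        res = stats.ttest_rel(wide[task_a], wide[task_b])
        # Crude Bonferroni correction for the 36 possible task pairs.
        return res.statistic, min(1.0, res.pvalue * 36)

    # print(overall_task_effect(scores))
    # print(pairwise_contrast(scores, "GAAP", "ALL"))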


TABLE 3
Factors and Their Loadings

Factor 1: Complexity of task
  5. Different knowledge sources (e.g., AICPA guidelines, human experts, firm policies) take significantly different approaches to the task.
  6. There is a high degree of agreement among multiple experts about the knowledge required to perform the task.
  11. The knowledge available for performing the task is incomplete.
  13. The knowledge available for performing the task is unclear.
  14. The knowledge available for performing the task is unreliable.
  16. All data required for the task are usually available.
  17. The data available are usually reliable.
  18. Data provided by different sources are often conflicting or contradictory.
  23. The task often presents unique situations that experts have never encountered before.
  29. Auditors are likely to disagree with each other's decisions or solutions for this task.

Factor 2: Expertise requirements
  4. Multiple experts, possibly with varying specialties, are required to perform the task.
  9. Many years of experience are required to acquire the knowledge to perform the task.
  10. The knowledge requirements for performing the task are extensive.
  12. The task-related knowledge is rapidly changing.
  15. Large amounts of data are required to perform the task.

Factor 3: Task manageability
  1. The knowledge required for the task is well defined by AICPA guidelines, GAAS, textbooks, firm policies, etc.
  2. The knowledge required for the task resides primarily in human experts rather than in AICPA guidelines, GAAS, textbooks, firm policies, etc.
  7. Different knowledge sources (e.g., AICPA guidelines, human experts, firm policies) each contain only a part of the total knowledge required for the task.
  19. The task can easily be divided into subtasks.

Factor 4: Task objectives
  24. The possible task solutions are predefined and the objective is to select from among this set of solutions.
  25. The possible task solutions are unclear and the objective is to construct unique solutions.
  26. The task involves finding either one solution, a best solution, or all plausible solutions. (Enter "1" for one solution, "5" for best solution, and "9" for all plausible solutions.)
  28. Task decisions or solutions are generally evaluated as either right or wrong.

Note: statements are listed under the factor on which they loaded; factor loadings below 0.30 are not shown.


TABLE 4
Average Total Scores and Average Scores for Each Scale for the Nine Tasks (In Ascending Order of Average Total Score)

Task                                                 Total score    Task complexity   Expertise requirements   Task manageability   Task objectives
Determining the adequacy of an allowance             18.85 (3.78)   5.66 (1.42)       4.67 (1.59)              4.17 (1.21)          4.35 (1.40)
Going concern evaluation                              19.13 (4.18)   5.52 (1.45)       4.21 (1.33)              4.35 (1.70)          5.06 (2.13)
Interpreting the results of analytical procedures    20.37 (3.11)   6.19 (1.14)       5.13 (1.15)              4.73 (1.15)          4.32 (1.45)
Evaluating the adequacy of EDP controls              20.48 (2.92)   6.33 (1.17)       4.06 (1.46)              4.98 (1.16)          5.11 (1.32)
Establishing materiality levels                       21.23 (4.21)   6.28 (1.27)       5.72 (1.41)              4.32 (1.75)          4.91 (1.60)
Inherent risk assessment                              21.29 (3.19)   6.14 (1.14)       5.43 (1.17)              4.58 (1.40)          5.13 (1.45)
Control risk assessment                               21.31 (2.90)   6.20 (1.13)       5.02 (1.17)              4.87 (1.32)          5.23 (1.37)
Audit work program development                        23.05 (3.23)   6.76 (1.07)       5.53 (1.31)              5.74 (1.26)          5.03 (1.51)
Determining compliance with GAAP                      23.28 (3.38)   7.00 (1.05)       4.19 (1.25)              6.45 (1.15)          5.64 (1.73)

Note: values in parentheses are standard deviations.

TABLE 5
Comparison of Mean Total Scores Among the Nine Tasks: Within-Subjects MANOVA Using Contrasts for Pairs of Tasks

Mean total scores: ALL 18.85, GC 19.13, ARP 20.37, EDP 20.48, MAT 21.23, IR 21.29, CR 21.31, WP 23.05, GAAP 23.28.

*denotes pairs of groups significantly different at the 0.01 level.
**denotes pairs of groups significantly different at the 0.05 level.
Tasks: ALL = Determining the adequacy of an allowance. GC = Going concern evaluation. ARP = Interpreting the results of analytical review procedures. EDP = Evaluating the adequacy of EDP controls. MAT = Establishing materiality levels. IR = Inherent risk assessment. CR = Control risk assessment. WP = Audit work program development. GAAP = Determining compliance with GAAP.


The most consistent pattern emerging from Table 5 is that the tasks of going concern evaluation and determining the adequacy of an allowance are clearly less well suited for ES development, while audit work program development and determining compliance with GAAP are the two tasks clearly better suited for ES development.

Since the comparison of the total scores demonstrated that several tasks were distinguishable from one another, and given that the total score is comprised of four scales (task complexity, expertise requirements, task manageability, and task objectives), at issue is whether these four scales vary significantly among the nine tasks in a multivariate sense. The null hypothesis is as follows:

H0: The nine candidate tasks cannot be distinguished using the four factors of task complexity, expertise requirements, task manageability, and task objectives taken together.

To test this hypothesis, a doubly multivariate within-subjects MANOVA was performed. This analysis considered four dependent variables (i.e., the four factors) for each of the nine tasks in a within-subjects fashion. The objective was to determine whether the four factors were significantly different in a multivariate sense among the nine tasks. The results of the MANOVA analysis are shown in Table 6. The null hypothesis of no difference in the four scales across the nine tasks can be rejected (see Panel A of Table 6). Further, each of the four scales taken individually is significantly different across the nine tasks (see Panel B of Table 6).

4.3. Profile Analysis of the Nine Tasks

Having established that the tasks were distinguishable, considering both the total scores as well as the four factors, we next analyzed the aspects of each task which contributed to a greater or lower degree of suitability for ES development. One approach to achieve this objective is to develop "profiles" of each task comprising standardized Z scores on each factor, as shown in Figure 1. Factors contributing to greater task suitability have bars on the positive side of the scale, whereas factors that detract from a task being suitable for ES application have bars on the negative side of the scale. For example, the task of evaluating the adequacy of EDP controls has slightly positive ratings for the "task complexity," "task manageability," and "task objectives" factors. However, the overall rating for this task in terms of its suitability for ES development was relatively low (see Table 4). As Figure 1 reveals, the main factor detracting from the suitability of the evaluating EDP controls task was its negative rating on the dimension of "expertise requirements."

What the profiles in Figure 1 reveal is that when comparing tasks for ES development, a "total score" measure is not an adequate means of comparison. For example, although the task of determining compliance with GAAP was best suited based on the total score, it received a relatively negative rating in terms of its expertise requirements. Figure 1 reveals the dimensions along which each task is either better or less suited for ES development. The two tasks at opposite ends of the "task suitability" scale (using total score) were determining the adequacy of an allowance (rated lowest) and determining compliance with GAAP (rated highest). The task of determining the adequacy of an allowance was the only task to be rated negatively on each of the four factors. The only task to receive positive ratings on all four factors was audit work program development. The determining compliance with GAAP task, which was rated as being the best suited for ES development among the nine tasks, had overwhelmingly positive ratings on three of the four factors (task complexity, task manageability, and task objectives). As was noted earlier, certain factors may be more critical than others in certain situations, implying that the total score would not accurately reflect the suitability of a candidate task.
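The profile construction described above amounts to standardizing each factor's mean score across the nine tasks. The following minimal sketch assumes a hypothetical DataFrame named factor_means holding the Table 4 factor means (one row per task, one column per factor); it is illustrative, not the authors' code.

    import pandas as pd

    def z_score_profiles(factor_means: pd.DataFrame) -> pd.DataFrame:
        """Standardize each factor across the nine tasks (mean 0, std 1)."""
        return (factor_means - factor_means.mean()) / factor_means.std(ddof=1)

    # profiles = z_score_profiles(factor_means)
    # Positive values mark factors favouring ES development for a task,
    # negative values mark factors detracting from it (cf. Figure 1).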

TABLE 6
Comparison of the Four Scales Across the Nine Tasks: Multivariate Analysis of Variance Results

Panel A--Multivariate tests of significance
Test name     Value    F       Sig. of F
Pillais       0.819    14.95   0.000*
Hotellings    1.356    19.48   0.000*
Wilks         0.355    17.22   0.000*

Panel B--Univariate F-tests for the four scales
Variable                 F       Sig. of F
Task complexity          21.76   0.000*
Expertise requirements   21.47   0.000*
Task manageability       28.46   0.000*
Task objectives           6.25   0.000*

* = P < 0.001


FIGURE 1. Z score profiles of the nine tasks (bars show standardized Z scores for the task complexity, expertise requirements, task manageability, and task objectives factors for each task; horizontal axis from -0.6 to 1.2).

For example, in the case of determining compliance with GAAP, although its total score was the highest, in terms of the expertise dimension it rated lower than all other tasks except the task of evaluating EDP controls. Thus, if expertise requirements were of particular concern for an organization, the task of determining compliance with GAAP may not be the obvious choice. However, if all four dimensions are to be equally weighted, then the total score would be an appropriate measure of task suitability.

4.4. Cluster Analysis

The profiles shown in Figure 1 appeared to indicate that the nine tasks could be roughly grouped into three categories: those clearly well suited for ES development, those clearly not suited, and the remaining tasks showing no clear indication in either direction. In an attempt to determine whether three groups of tasks could be formed into distinct "clusters" based on the four factors, a cluster analysis was performed. The vertical icicle plot resulting from the cluster analysis is shown in Figure 2. On scanning the vertical icicle plot, at the high end indicating greater task suitability only one task stood out--determining compliance with GAAP. At the low end, three tasks appeared to be clustered: going concern evaluation, determining the adequacy of an allowance, and evaluating the adequacy of EDP controls. The remaining five tasks were clustered in the middle.

We next sought to determine whether the three clusters were distinguishable in terms of the four factors. The objective here was to investigate which factor or factors caused the clusters of tasks to be different from one another. The null hypothesis is as follows:

H0: The three clusters of tasks cannot be distinguished using the four factors of task complexity, expertise requirements, task manageability, and task objectives taken together.
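A present-day analogue of this cluster analysis can be sketched with hierarchical (Ward) clustering, assuming the hypothetical z-score profile DataFrame from the earlier sketch (one row per task, one column per factor); the dendrogram plays the role of the vertical icicle plot. This is illustrative only; the paper does not report which clustering algorithm was used.

    from scipy.cluster.hierarchy import dendrogram, fcluster, linkage

    def cluster_tasks(profiles, n_clusters=3):
        """Agglomerative clustering of task profiles into n_clusters groups."""
        links = linkage(profiles.values, method="ward")
        labels = fcluster(links, t=n_clusters, criterion="maxclust")
        return links, dict(zip(profiles.index, labels))

    # links, groups = cluster_tasks(profiles)
    # dendrogram(links, labels=list(profiles.index))   # plotting requires matplotlib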

FIGURE 2. Clusters of similar tasks: vertical icicle plot. Tasks, from highest to lowest total score: determining compliance with GAAP, audit work program development, control risk assessment, inherent risk assessment, establishing materiality levels, interpreting the results of analytical review procedures, evaluating the adequacy of EDP controls, going concern evaluation, determining the adequacy of an allowance.


TABLE 7
Comparison of the Four Scales Among the Three Clusters of Tasks: Within-Subjects MANOVA Using Contrasts for Pairs of Clusters

Scale                    Cluster 1 vs. cluster 2   Cluster 1 vs. cluster 3   Cluster 2 vs. cluster 3
Task complexity          P < 0.01                  P < 0.01                  P < 0.01
Expertise requirements   P < 0.01                  N.S.                      P < 0.01
Task manageability       P < 0.10                  P < 0.01                  P < 0.01
Task objectives          N.S.                      P < 0.01                  P < 0.01

Tasks in cluster 1: Determining the adequacy of an allowance, going concern evaluation, evaluating the adequacy of EDP controls.
Tasks in cluster 2: Interpreting the results of analytical review procedures, establishing materiality levels, inherent risk assessment, control risk assessment, audit work program development.
Task in cluster 3: Determining compliance with GAAP.

A within-subjects MANOVA with contrasts was performed to determine which clusters differed significantly in terms of each of the four factors. The results are shown in Table 7. As shown in Table 7, the second and third clusters were significantly different on each of the four factors. In comparing the first and second clusters, task complexity and expertise requirements were clearly different. The task manageability factor was also different, albeit at a lower significance level (P < 0.10). Based on the task objectives factor alone, the first and second clusters were not distinguishable. The first and the third clusters were significantly different on all but the "expertise requirements" factor. Recall that cluster 3 comprised only the determining compliance with GAAP task, and that this task had a negative rating on the expertise requirements dimension. Thus, it seems logical that the first and third clusters are not distinguishable in terms of expertise requirements.

4.5. Task Suitability Revealed by Survey vs. Existing Big 6 ESs

Our results indicate that determining the adequacy of an allowance and going concern evaluation were the tasks least suited for ES development, while the task of determining compliance with GAAP was the most suited. We then compared existing Big 6 ESs relative to the tasks we rank-ordered in terms of suitability for ES development. Table 8 reports the number of ESs in use by each of the Big 6 firms for each of the nine tasks. The data reported in Table 8 are extracted from the report by Brown (1991) and updates provided at the 1991 University of Southern California Expert Systems Symposium.

TABLE 8
Expert Systems Developed by the Big 6 Firms Categorized by Tasks

(Rows: going concern evaluation; determining the adequacy of an allowance; inherent risk assessment; interpreting the results of analytical procedures; evaluating the adequacy of EDP controls; control risk assessment; establishing materiality levels; audit work program development; determining compliance with GAAP; tax compliance systems; firm totals. Columns: Arthur Andersen, Coopers & Lybrand, Deloitte & Touche, Ernst & Young, KPMG Peat Marwick, Price Waterhouse, Total. As discussed in the text, no systems existed for going concern evaluation, two for determining the adequacy of an allowance, five for audit work program development, six for determining compliance with GAAP, and six tax compliance systems.)

*The same system is used for multiple purposes (e.g., assessing inherent and control risk).
Source: Brown (1991) and updates provided at the University of Southern California Expert Systems Symposium, October 1991.

It is important to note that the data in Table 8 reflect the state of Big 6 ES development efforts in 1991. Further, the Big 6 firms vary considerably in their policy on revealing information about their ES development efforts.⁵ A clear pattern emerging from Table 8 is that the tasks least suited for ES development (based on our analysis) were also the tasks for which the fewest ESs had been developed by the Big 6 firms as of 1991. The going concern evaluation task did not have a single ES developed, and the task of determining the adequacy of an allowance had only two developed ESs. In contrast, at the other end of the scale, five systems existed in 1991 for audit work program development and six systems were in use for the task of determining compliance with GAAP. If tax compliance systems are included in the analysis,⁶ then a total of 12 systems were developed as of 1991 for the task most appropriate for ES development as indicated by our study. Thus, the data in Table 8 corroborate our results and are especially revealing given that the respondents in our study did not directly rate the appropriateness of each task for ES application. Rather, the questionnaire elicited respondents' ratings of task characteristics, from which the suitability of each task for ES application was inferred. It is important to note that, unlike in the studies by Hansen and Messier (1986) and Abdolmohammadi and Bazaz (1991), the respondents did not directly rate the tasks in terms of their suitability for ES development.

5. DISCUSSION

The issue addressed in this research is whether judgmental auditing tasks that are potential candidates for ES development can be differentiated in terms of their relative suitability. The analysis of the auditors' ratings of the nine tasks presented in the previous section indicated the value of the research instrument in discriminating among judgmental auditing tasks in terms of their characteristics of relevance from an ES development perspective. Although it may appear that the average total scores were not very wide ranging, it is important to recognize that the selected decision-making tasks were similar in the sense that each of them involves considerable judgment. Given that prior research has established that judgmental auditing tasks are well suited for the application of ES technology (Abdolmohammadi, 1987), our results indicate that the characteristics of some tasks make them relatively better suited for ES development.

⁵For example, Arthur Andersen has a policy of not releasing information on ESs they use, Coopers & Lybrand only releases information on completed systems, and both Price Waterhouse and KPMG Peat Marwick release information on planned systems as well as those they have actually implemented.
⁶Like ESs for determining compliance with GAAP, tax compliance systems also determine the degree to which the client has complied with certain laws, rules, or regulations.

5.1. Auditing Tasks for Future ES Development

Some audit judgment tasks that would appear to be good candidates for ES development have been largely ignored by the Big 6 firms. From the results reported in Table 5 and the classification of existing ESs shown in Table 8, it would appear that the task of establishing materiality levels is a fruitful area in which Big 6 firms could focus their ES development efforts. It is interesting to note that none of the Big 6 firms have developed an ES for that task.⁷ Another judgmental task for which a working ES has not been developed is the interpretation of the results of applying analytical review procedures. Since the development and refinement of ESs is a time-consuming and expensive proposition, it is conceivable that the Big 6 firms are focusing their efforts on tasks which appear to have the highest payoff or where the perceived need for additional decision support is the greatest (e.g., for assisting inherent and control risk assessment). Furthermore, in addition to the cost-benefit issue, public accounting firms would probably consider a host of environmental factors, such as the legal and business climate, competitors' actions, and changes in the firm's audit strategy, in their decision regarding which auditing task(s), if any, should be selected for ES development.

The two firms that have been the most cautious in applying ES technology (Arthur Andersen and Deloitte & Touche) could target their ES development efforts on building systems for determining compliance with GAAP--a task that clearly emerged as being the most suited for ES development. Perhaps one reason why these two firms have not moved as quickly as the other firms in embracing ES technology is the lack of evidence regarding which task(s) are relatively better suited for this technology. One approach might be to wait for other firms to succeed or fail in applying the technology to various audit judgment tasks. However, given that success or failure in using ES technology can be attributed to a variety of reasons other than task appropriateness, merely observing other firms' experiences may be an insufficient basis for determining whether and which tasks should be selected for ES development. Based on an analysis of task characteristics, our results provide evidence regarding the relative suitability of various audit judgment tasks for the application of ES technology. Thus, audit firms contemplating the development of ESs for tasks in which the technology has not as yet been applied could benefit from the results of this research.

⁷However, an academic prototype system has been developed for the task of establishing planning stage materiality levels (Steinbart, 1987).


Our results provide details about the relative ratings for the nine tasks along the dimensions of: (1) task complexity, (2) expertise requirements, (3) task manageability, and (4) task objectives. These data are useful for ES developers in public accounting firms since they indicate potential problem areas during knowledge acquisition or system validation. For example, the "evaluating EDP controls" task was rated very low in terms of the "expertise requirements" factor, suggesting that ES developers would have considerable difficulty in the knowledge acquisition and refinement stage of system development. The going concern evaluation task, which was rated overall as being among the least well suited for ES development relative to the other tasks, was rated very low in terms of the "task complexity," "expertise requirements," and "task manageability" factors. The only area which is not problematic for the going concern evaluation task was the "task objectives" factor. For the task of interpreting the results of analytical review procedures, which was relatively less well suited for ES development, the results indicated that expertise requirements were not a problem for this task, i.e., identifiable experts exist for this task. However, the task was rated poorly in terms of task objectives (possible task solutions are unclear, the objective is to construct new and unique solutions, etc.).

5.2. Limitations

The study is subject to the usual limitations of questionnaire-based survey research. For example, no control was exercised over exactly when and where the respondents filled out the questionnaire. Although considerable care was taken to pilot test the research instrument, resulting in numerous wording changes, there was a possibility of respondents misunderstanding or misinterpreting some of the statements. The reliability analysis revealed relatively low Cronbach's α scores for two of the four factors (task manageability and task objectives). This indicates the need for testing using additional data sources. Thus, it would be fair to prescribe caution in using this instrument to evaluate other auditing tasks or tasks in other domains, especially with respect to the task manageability and task objectives dimensions. Furthermore, the predictive power of the instrument has not been tested. With the cooperation of the Big 6 firms, the instrument could be used to sample both successful and unsuccessful ES development efforts to determine its predictive ability. It is also important to note that the tasks included in the questionnaire were narrow in scope; composite tasks such as audit planning with multiple interrelated judgments were not considered. Finally, since the wording of some of the questions specifically referenced the auditing domain, changes would need to be made to the instrument before it could be used in other domains such as tax or managerial accounting.


6. SUMMARY AND CONCLUSION

The utility of this study is in the demonstration of an approach for selecting among candidate auditing tasks for ES implementation. Accounting firms are often faced with such a decision, given limited resources and the increasing acceptance of ES technology for solving business problems. Based on prior research in the AI literature, a questionnaire was developed for obtaining task-related information from practicing auditors as a basis for distinguishing between candidate auditing tasks in terms of their suitability for ES development. After the questionnaire was pilot-tested, it was mailed out and data were obtained from 59 auditors regarding the characteristics of nine judgmental auditing tasks. The tasks of determining compliance with GAAP and audit work program development were relatively better suited for ES application, while the tasks of determining the adequacy of an allowance and going concern evaluation were the least suited for ES application. The results of this research should facilitate the allocation of resources in public accounting firms into areas that would have the greatest potential for success.

The questionnaire used in this study could be applied for selecting among candidate decision-making tasks in domains other than auditing. The audit-related terminology could be replaced by terminology specific to a particular domain such as tax. For example, some questions in the instrument specifically addressed sources of auditing knowledge such as AICPA guidelines, firm policies, etc. [see questions 1, 2, 5, and 7 in Part (A) of the Appendix]. These questions could be reworded to address the specific knowledge sources in the domain of interest. The remaining questions are domain independent and can be used without alteration. Thus, our instrument, appropriately modified, could potentially be used in other domains to select among candidate tasks for ES application.

Research could also examine other auditing tasks as a basis for differentiating among them in terms of their suitability for ES development. The questionnaire could be used to obtain input regarding a single task to determine what the potential "trouble spots" might be during ES development. Another avenue for future research is the development of a taxonomy of auditing tasks along a "structured-unstructured" continuum. Such a taxonomy would assist in the determination of the most appropriate form of computer-assisted decision support for each of several auditing tasks (decision support system, expert system, or no computer-assisted support).

Acknowledgements--The comments of James Courtney, George Fowler, David Kerr, David Paradice, Casper Wiggins, and Chris Wolfe are greatly appreciated. This paper was presented at the American Accounting Association Annual Meeting in San Francisco, CA, August 1993, and the Fifth Annual Symposium on Intelligent Systems in Accounting, Finance, and Management, Stanford, CA, November 1993. Authors contributed equally to the paper.

REFERENCES

Abdolmohammadi, M. J. (1987). Decision support and expert systems in auditing: A review and research directions. Accounting and Business Research, 17(2), 173-185.
Abdolmohammadi, M. J., & Bazaz, M. S. (1991). Identification of tasks for expert systems development in auditing. Expert Systems with Applications, 3(1), 99-107.
Brachman, R. J., Amarel, S., Engelman, C., Engelmore, R. S., Feigenbaum, E. A., & Wilkins, D. E. (1983). What are expert systems? In F. Hayes-Roth, D. A. Waterman, & D. B. Lenat (Eds.), Building expert systems. Reading, MA: Addison-Wesley.
Brown, C. E. (1991). Expert systems in public accounting: Current practice and future directions. Expert Systems with Applications, 3(1), 3-18.
Brown, C. E., & Murphy, D. S. (1990). The use of auditing expert systems in public accounting. The Journal of Information Systems, 4(3), 63-72.
Connell, N. A. D. (1987). Expert systems in accountancy: A review of some recent applications. Accounting and Business Research, 17(3), 221-233.
Hansen, J. V., & Messier, W. F., Jr. (1986). A preliminary investigation of EDP-XPERT. Auditing: A Journal of Practice and Theory, 6(1), 109-123.
Havelka, D., & Sutton, S. G. (1990). A critical success factors model for evaluating audit task domains for knowledge representation. Paper presented at the American Accounting Association National Meeting, Toronto, Canada, August.
Hayes-Roth, F., Waterman, D. A., & Lenat, D. B. (1983). An overview of expert systems. In F. Hayes-Roth, D. A. Waterman, & D. B. Lenat (Eds.), Building expert systems. Reading, MA: Addison-Wesley.
Kerlinger, F. N. (1986). Foundations of behavioral research (3rd edn). New York, NY: Holt, Rinehart and Winston.
Krcmar, H. (1988). Caution on criteria: On the context dependency of selection criteria for expert systems projects. DATA BASE, 19(3), 39-42.
Laufmann, S. C., DeVaney, D. M., & Whiting, M. A. (1990). A methodology for evaluating potential KBS applications. IEEE Expert, (December), 43-61.
Nunnally, J. (1967). Psychometric theory. New York: McGraw-Hill.
O'Leary, D. E. (1987). Validation of expert systems with applications to auditing and accounting expert systems. Decision Sciences, 18(3), 468-486.
O'Leary, D. E., & Watkins, P. R. (1989). Review of expert systems in auditing. Expert Systems Review, (Spring-Summer), 3-22.
Prerau, D. S. (1985). Selection of an appropriate domain for an expert system. AI Magazine, 6(2), 26-30.
Selfridge, M., & Biggs, S. F. (1989). A computational model of the auditor's going-concern judgment. In Proceedings of the Audit Judgment Symposium, University of Southern California, Los Angeles, CA, February 1989.
Slagle, J. R., & Wick, M. R. (1988). A method for evaluating candidate expert system applications. AI Magazine, 9(4), 44-53.
Srinivasan, A. (1985). Alternative measures of system effectiveness: Associations and implications. MIS Quarterly, 9(3), 243-253.
Steinbart, P. (1987). The construction of a rule-based expert system as a method for studying materiality judgments. The Accounting Review, 67(1), 97-116.
Treacy, M. (1985). An empirical evaluation of a causal model of user information satisfaction. In Proceedings of the Sixth International Conference on Information Systems, Indianapolis, IN, December 1985.
Waterman, D. A. (1986). A guide to expert systems. Reading, MA: Addison-Wesley.

APPENDIX

Analysis of Auditing Tasks

Thank you for your participation in this survey. You are requested to indicate your extent of agreement with a number of statements about each of nine auditing tasks. First, please provide some background information about yourself. Your responses will be confidential and will be reported only in a summary form along with those of other respondents.

1. Firm name:
2. Position within the firm (in-charge accountant, supervisor, manager, partner, etc.):
3. Years of auditing experience:
4. Professional certifications and degrees:
5. Do you have any experience using an expert system on the job? If so, please provide the name of the system or a brief description of the task performed using the system.

The statements listed on the next three pages relate to the following auditing tasks. For each task, indicate the extent of your agreement with the statement by entering a number from "1" for "strongly disagree" to "9" for "strongly agree." Abbreviations used for each task are also indicated below.
a) Determining whether the client can continue as a going concern (GC).
b) Determining planning stage and evaluation materiality levels (MAT).
c) Assessing inherent risk for an account, class of transactions, or financial statement assertion (IR).
d) Assessing control risk for an account, class of transactions, or financial statement assertion (CR).
e) Developing an audit work program specifying the nature, timing, and extent of substantive audit procedures to be performed (WP).
f) Determining the adequacy of a client allowance/reserve/accrual amount (e.g., allowance for doubtful accounts, loan loss reserves, etc.) (ALL).
g) Determining whether the client is making all the required financial statement disclosures in compliance with generally accepted accounting principles (GAAP).
h) Interpreting the results of analytical review procedures (ARP).
i) Evaluating the existence and adequacy of EDP controls (EDP).

Tasks: GC: going concern judgment; MAT: setting materiality level; IR: inherent risk assessment; CR: control risk assessment; WP: audit work program development; ALL: adequacy of allowance; GAAP: disclosure compliance; ARP: analytical review procedures; EDP: assessing EDP controls.

Required: For each task, indicate the extent of your agreement with the statement by entering a number from "1" for "strongly disagree" to "9" for "strongly agree."

(A) Knowledge source questions
1. The knowledge required for the task is well defined by AICPA guidelines, GAAS, textbooks, firm policies, etc.
2. The knowledge required for the task resides primarily in human experts rather than in AICPA guidelines, GAAS, textbooks, firm policies, etc. (R)
3. There exists at least one identifiable expert in the firm for the task.
4. Multiple experts, possibly with varying specialties, are required to perform the task. (R)
5. Different knowledge sources (e.g., AICPA guidelines, human experts, firm policies) take significantly different approaches to the task. (R)
6. There is a high degree of agreement among multiple experts about the knowledge required to perform the task.
7. Different knowledge sources (e.g., AICPA guidelines, human experts, firm policies) each contain only a part of the total knowledge required for the task. (R)
8. Experts are significantly better than beginners at the task.
9. Many years of experience are required to acquire the knowledge to perform the task.

(B) Nature of knowledge (i.e., essential characteristics or distinguishing qualities)
1. The knowledge requirements for performing the task are extensive. (R)
2. The knowledge available for performing the task is incomplete. (R)
3. The task-related knowledge is rapidly changing. (R)
4. The knowledge available for performing the task is unclear. (R)
5. The knowledge available for performing the task is unreliable. (R)

[Table: mean (standard deviation) response to each statement in sections (A) and (B) for each of the nine tasks: GC, MAT, IR, CR, WP, ALL, GAAP, ARP, EDP.]

(C) Data characteristics
1. Large amounts of data are required to perform the task.
2. All data required for the task are usually available.
3. The data available is usually reliable.
4. Data provided by different sources are often conflicting or contradictory. (R)

[Table: mean (standard deviation) response to each statement in section (C) for each of the nine tasks: GC, MAT, IR, CR, WP, ALL, GAAP, ARP, EDP.]

(D) Task characteristics
1. The task can easily be divided into sub-tasks.
2. The task results in a numeric assessment (e.g., a probability of .25, a score of 9, etc.).
3. The task is concerned primarily with symbolic reasoning (e.g., controls are "weak").
4. The task requires the use of heuristics ("rules of thumb") for handling large numbers of possibilities or for handling incomplete or uncertain information.
5. The task often presents unique situations that experts have never encountered before. (R)
6. The possible task solutions are predefined and the objective is to select from among this set of solutions.
7. The possible task solutions are unclear and the objective is to construct unique solutions. (R)
8. The task involves finding either one solution, a best solution, or all plausible solutions. (Enter "1" for one solution, "5" for best solution, and "9" for all plausible solutions.) (R)
9. Decisions for this task involve selecting between close alternatives.
10. Task decisions or solutions are generally evaluated as either right or wrong.
11. Auditors are likely to disagree with each other's decisions or solutions for this task. (R)

[Table: mean (standard deviation) response to each statement in section (D) for each of the nine tasks: GC, MAT, IR, CR, WP, ALL, GAAP, ARP, EDP.]

Note: For each statement and task, the tabled value is the mean response (after recoding of the reverse-scored items marked (R)); the value in parentheses is the standard deviation.
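For readers replicating the scoring, the following minimal Python sketch illustrates how responses on the 1-to-9 agreement scale could be recoded and summarized into mean (standard deviation) values of the kind reported above. The recoding rule (10 minus the response for reverse-scored items) and the example responses are assumptions made for illustration, not figures taken from the study.

```python
from statistics import mean, stdev

SCALE_MAX = 9  # 9-point agreement scale: 1 = strongly disagree, 9 = strongly agree


def recode(response: int, reverse_scored: bool) -> int:
    """Recode one response; reverse-scored (R) items are assumed to be flipped on the 1-9 scale."""
    return (SCALE_MAX + 1) - response if reverse_scored else response


def summarize(responses: list[int], reverse_scored: bool) -> tuple[float, float]:
    """Return the mean and standard deviation of recoded responses for one statement/task cell."""
    recoded = [recode(r, reverse_scored) for r in responses]
    return mean(recoded), stdev(recoded)


# Hypothetical responses from a few auditors to a reverse-scored statement
# (e.g., (B)2 "knowledge available is incomplete") for the GAAP task.
example_responses = [2, 3, 4, 2, 5]
m, sd = summarize(example_responses, reverse_scored=True)
print(f"GAAP, (B)2: {m:.2f} ({sd:.2f})")  # a higher recoded mean indicates more complete knowledge
```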