Advances in Accounting, incorporating Advances in International Accounting 25 (2009) 174–182
Monetary unit sampling: Improving estimation of the total audit error

Huong N. Higgins a, Balgobin Nandram b

a Worcester Polytechnic Institute, Department of Management, 100 Institute Road, Worcester, MA 01609, United States
b Worcester Polytechnic Institute, Department of Mathematical Sciences, 100 Institute Road, Worcester, MA 01609, United States

Abstract

In the practice of auditing, for cost reasons, auditors verify only a sample of accounts to estimate the error in the total population of accounts. The most common statistical method for selecting an audit sample is monetary unit sampling (MUS). However, common MUS estimation practice does not explicitly recognize the multiple distributions within the population of account errors. This often leads to excessive conservatism in auditors' judgment of population error. In this paper, we review common MUS estimation practice and introduce our own method, which uses the Zero-Inflation Poisson (ZIP) distribution to consider zero versus non-zero errors explicitly. We argue that our method is better suited to handle real populations of account errors, and show that our ZIP upper bound is both reliable and efficient for MUS estimation of accounting data.

© 2009 Elsevier Ltd. All rights reserved.
1. Introduction

In auditing a company's accounts receivable, the auditor's goal is to verify that the account values reported by the company are not materially misstated. Determining the total error in the reported amount exactly would require auditing every account in the population, which is costly and time-consuming. In practice, auditors audit only a sample to estimate the total error in the population of accounts. A common statistical sampling method is monetary unit sampling (MUS), where each monetary unit (e.g., dollar) is equally likely to be included in the sample. Under this method, an account with a book value of $10,000 is ten times as likely to be sampled as an account with a book value of $1000. This paper addresses the estimation process for MUS sampling.

Common MUS estimation has a shortcoming because it does not explicitly recognize that a total population of account errors typically consists of distinct distributions, namely one large mass with zero error, a second distribution of small errors, and a third distribution of 100% errors. These distribution characteristics of accounting error populations have been discussed in prior research (e.g. Kaplan, 1973; Neter & Loebbecke, 1975; Chan, 1988). Due to this shortcoming, in practice sample accounts are incorrectly assumed to have similar tainting (ratio of Error-Per-Dollar) to non-sample accounts.
This assumption, combined with the MUS sampling bias towards selecting larger accounts, often leads to very large estimates of total error in the population and overly conservative auditor decisions. We acknowledge that the major risk of accounts receivable in financial reporting is overstatement, so generally auditors should prefer methods that result in larger error estimations. However, we argue that estimations in practice are too conservative, and excessive conservatism has its own practical problems. For example, discussions with auditors in three large accounting firms reveal that clients rarely approve large adjustments for error estimations (Elder & Allen, 1998). This, along with the cost of statistical expertise, may cause auditors not to project sample errors to the population at all. Indeed, surveys of auditors reveal that they often fail to project errors to populations, notwithstanding audit standards and professional guidelines (Burgstahler & Jiambalvo, 1986; Akresh & Tatum, 1988; Hitzig, 1995; Elder & Allen, 1998). Overall, excessive conservatism in the estimation method may lead to projection problems, which impair the auditor's overall decision process.

We introduce our own approach, which is better suited to handle the real distributions of account errors. We use a regression with selection probability as a covariate to address the MUS sampling bias. Our technical innovation lies in the development of the Zero-Inflation Poisson (ZIP) model, using the Poisson distribution to treat errors as rare count data, and increasing the probabilities of zero errors to inflate the weights of these observations. By handling accounts with zero versus non-zero errors explicitly, our technique does not rely on the assumption of similar tainting in the population. We provide mathematical expressions and, upon request, will furnish the software code for our ZIP estimation and confidence intervals. Through simulations, we show that our ZIP bound is reliable and efficient for
MUS estimation of accounting data. Overall, our paper serves professionals and academicians by pointing out the potential to improve MUS estimation.

The remainder of the paper proceeds as follows. Section 2 reviews the monetary unit sampling method in the accounting literature and the practice of auditing. Section 3 describes our data and demonstrates a common method in current practice to estimate the total error in a population of accounts from which a dollar unit sample is taken. Section 4 introduces our Zero-Inflated Poisson (ZIP) estimation method. Section 5 reports simulations to validate our ZIP upper bound. Section 6 concludes the paper with a brief summary.

2. Monetary unit sampling

In performing substantive tests of accounts, an auditor often takes a sample to estimate the true total balance of all accounts. Ultimately, the auditor's goal is to estimate the total amount of error, which is the difference between the true total balance of all accounts and the reported balance amount. The standards for audit sampling are set by the Auditing Standards Board's Statement on Auditing Standards (SAS) No. 39 (AICPA, 1981), amended by SAS No. 111 (AICPA, 2006). Both statistical and non-statistical sampling methods are allowed for substantive tests in auditing (AICPA, 1981, 2008). Both are used by a wide range of public accounting firms (Nelson, 1995) and government agencies (Annulli, Mulrow & Anziano, 2000). Some surveys show that non-statistical sampling is used more than half the time (Annulli et al., 2000; Hall, Hunton & Pierce, 2000, 2002). Of course, non-statistical sampling procedures do not afford the full control provided by statistical theory, and so raise questions concerning auditors' evaluation of sample results (Messier, Kachelmeier & Jensen, 2001).

Among statistical sampling techniques, monetary unit sampling (MUS) is the most commonly used for substantive tests (Tsui, Matsumura & Tsui, 1985; Smieliauskas, 1986b; Grimlund & Felix, 1987; Hansen, 1993; Annulli et al., 2000; AICPA, 2008). The authoritative guide for the auditing profession, the AICPA's Audit Guide — Audit Sampling (AICPA, 2008), provides detailed MUS instructions. MUS is a statistical sampling method where the probability of an item's selection for the sample is proportional to its recorded amount (probability proportional to size). MUS can be thought of as employing the ultimate in stratification by book amount: no further stratification by book amount is possible with dollar units, because all sampling units are of exactly the same size in terms of book value. Consequently, MUS achieves efficiency advantages similar to those of stratification by book value without requiring stratification. Despite its widespread use, the relative performance of MUS compared to traditional normal-distribution variables sampling is often not clear (Smieliauskas, 1986b).

Prior research has developed a number of methods for evaluating MUS samples (Tsui et al., 1985; Smieliauskas, 1986b; Grimlund & Felix, 1987). Evaluation criteria include sample size (Kaplan, 1975), sampling risks (Smieliauskas, 1986b), sample size implications of controlling for the same level of sampling risks (Smieliauskas, 1986b), and bounds (Tsui et al., 1985; Dworin & Grimlund, 1986). Obtaining reliable bounds on the total error in the population is desirable for making decisions and probabilistic statements at different confidence levels. There are many methods to compute bounds for MUS sampling (Felix, Leslie & Neter, 1981; Swinamer, Lesperance & Will, 2004).
The Stringer bound, introduced by Stringer (1963), is used extensively by auditors (Bickel, 1992; AICPA, 2008). A feature of the Stringer bound that is particularly attractive to auditors is that it provides a non-zero upper bound even when no errors are observed in the sample. Simulations show that the Stringer bound reliably exceeds the true audit error (Swinamer et al., 2004). This reliability is favorable to auditors who are concerned strictly with overstatements (or strictly with understatements) in financial statements.
The Stringer bound is also simple to compute, and the required statistical tables are readily available.

3. Data and common estimation

3.1. Data

To illustrate the common estimation method, we use data of a company used by Lohr (1999) in her demonstration of MUS. The company has a population of N = 87 accounts receivable, with a total book balance of $612,824. We know the book values b1, b2, …, bN for each account in the population. Let B denote the total book value of all accounts receivable in the population,

B = b1 + b2 + … + bN = 612,824.    (1)

If each account in the population were audited, we would obtain the set of audit values a1, a2, …, aN. Let A denote the unknown total audit value in our data,

A = a1 + a2 + … + aN.    (2)

We define the error on any account i (i = 1, 2, …, N) by di = bi − ai, with di ≥ 0. After auditing a sample of size n, which we suppose is predetermined, we observe a1, a2, …, an (see Kaplan, 1975; Menzefricke, 1984 for discussions of issues in determining the sample size). We wish to predict a_{n+1}, …, a_N in order to estimate the total error over all accounts. Let D denote the total error in the population of all accounts receivable,

D = d1 + d2 + … + dN.    (3)

The total error D is also the difference between the total book value and the total audit value,

D = B − A.    (4)
Our ultimate goal is to predict the total error D. Because D is unknown, it is standard practice to estimate its mean and upper bound.

The book values, audit values, selection probabilities and errors of all accounts are tabulated in Table 1. Selection probability is the probability of an account being selected from among all 87 accounts, equal to the book value of the account divided by the total book balance of $612,824 (e.g., for account 3, the selection probability = 6842/612,824 = 0.011164706). Panel A describes the audit sample, and Panel B the non-sample. The audit values and errors of accounts in the audit sample are known, but these values are unknown for accounts in the non-sample. In this company, only a small proportion of audited accounts (4 of 20, or 20%) are subject to error.

3.2. Common estimation

SAS No. 39 on audit sampling (AICPA, 1981) stipulates that auditors should estimate the total error in the population by projecting sample errors to the population. A common estimation method of the mean population error D is based on tainting, the average error amount per dollar in the audited accounts. Taintings are multiplied by the total dollars in the sampling intervals, or the average tainting is multiplied by the total dollars in the population, to yield an estimate of the total error dollars in the population. This method is prescribed by the AICPA Audit Guide — Audit Sampling (AICPA, 2008). Besides the AICPA guide, many professional publications also prescribe this MUS evaluation method (Gafford & Carmichael, 1984; Guy & Carmichael, 1986; Schwartz, 1997; Yancey, 2002).
Table 1
Description of data.

Panel A: sample

Account number | Book value (b) | Audit value (a) | Selection probability (πi) | Error (d = b − a)
3 | 6842 | 6842 | 0.011164706 | 0
9 | 16,350 | 16,350 | 0.026679765 | 0
13 | 3935 | 3935 | 0.006421093 | 0
24 | 7090 | 7050 | 0.01156939 | 40
29 | 5533 | 5533 | 0.009028693 | 0
34 | 2163 | 2163 | 0.003529562 | 0
36 | 2399 | 2149 | 0.003914664 | 250
43 | 8941 | 8941 | 0.014589833 | 0
44 | 3716 | 3716 | 0.006063731 | 0
45 | 8663 | 8663 | 0.014136196 | 0
46 | 69,540 | 69,000 | 0.113474668 | 540
49 | 6881 | 6881 | 0.011228346 | 0
55 | 70,100 | 70,100 | 0.11438847 | 0
56 | 6467 | 6467 | 0.010552785 | 0
61 | 21,000 | 21,000 | 0.034267587 | 0
70 | 3847 | 3847 | 0.006277496 | 0
74 | 2422 | 2422 | 0.003952195 | 0
75 | 2291 | 2191 | 0.003738431 | 100
79 | 4667 | 4667 | 0.007615563 | 0
81 | 31,257 | 31,257 | 0.051004856 | 0
Total error in the sample | | | | 940

Panel B: non-sample (audit values and errors are unknown)

Account number | Book value (b) | Selection probability (πi) | Account number | Book value (b) | Selection probability (πi)
1 | 2459 | 0.004013 | 42 | 9074 | 0.014807
2 | 2343 | 0.003823 | 47 | 8746 | 0.014272
4 | 4179 | 0.006819 | 48 | 7141 | 0.011653
5 | 750 | 0.001224 | 50 | 2278 | 0.003717
6 | 2708 | 0.004419 | 51 | 3916 | 0.00639
7 | 3073 | 0.005014 | 52 | 2192 | 0.003577
8 | 4742 | 0.007738 | 53 | 5999 | 0.009789
10 | 5424 | 0.008851 | 54 | 5856 | 0.009556
11 | 9539 | 0.015566 | 57 | 7642 | 0.01247
12 | 3108 | 0.005072 | 58 | 8846 | 0.014435
14 | 900 | 0.001469 | 59 | 2486 | 0.004057
15 | 7835 | 0.012785 | 60 | 2074 | 0.003384
16 | 1091 | 0.00178 | 62 | 3081 | 0.005028
17 | 2798 | 0.004566 | 63 | 7123 | 0.011623
18 | 5432 | 0.008864 | 64 | 5496 | 0.008968
19 | 2325 | 0.003794 | 65 | 7461 | 0.012175
20 | 1298 | 0.002118 | 66 | 6333 | 0.010334
21 | 5594 | 0.009128 | 67 | 13,597 | 0.022187
22 | 2351 | 0.003836 | 68 | 1317 | 0.002149
23 | 7304 | 0.011919 | 69 | 5437 | 0.008872
25 | 4711 | 0.007687 | 71 | 4030 | 0.006576
26 | 4031 | 0.006578 | 72 | 2620 | 0.004275
27 | 1907 | 0.003112 | 73 | 2416 | 0.003942
28 | 3341 | 0.005452 | 76 | 5882 | 0.009598
30 | 8251 | 0.013464 | 77 | 6596 | 0.010763
31 | 4389 | 0.007162 | 78 | 2626 | 0.004285
32 | 5697 | 0.009296 | 80 | 7571 | 0.012354
33 | 7554 | 0.012327 | 82 | 1331 | 0.002172
35 | 8413 | 0.013728 | 83 | 5924 | 0.009667
37 | 4261 | 0.006953 | 84 | 4356 | 0.007108
38 | 7862 | 0.012829 | 85 | 6618 | 0.010799
39 | 3153 | 0.005145 | 86 | 5658 | 0.009233
40 | 4690 | 0.007653 | 87 | 6943 | 0.01133
41 | 6541 | 0.010674 | | |
Book value is the amount reported by the audited company. Audit value is the value audited and verified by the auditors. Selection probability is the probability that an account is selected from all the accounts in the population, equal to the book value of the account divided by the total book balance of $612,824 (e.g., for account 3, the selection probability = 6842/612,824 = 0.011164706). Error is the difference between book value and audit value. Panel A shows the accounts selected by the auditors to be audited (sample), and Panel B the remainder of the population of accounts (non-sample). Accounts in the non-sample do not have audit values.
Table 2 illustrates this estimation process. Error-Per-Dollar (tainting) indicates the difference per dollar for each account, equal to the difference between book and audit values divided by the corresponding book value. Account 24, for example, has a book value of $7090 and an error of $40. The error is prorated to every dollar in the book value, leading to an error (tainting) percentage of $40/$7090 = 0.005642 for each of the 7090 dollars. The average error for the individual dollars in the sample is the sum of all Errors-Per-Dollar divided by 20, the number of accounts in the sample, or 0.008063. Thus the total error D for the population is estimated as 0.008063 ⁎ 612,824 = $4941. Often, audit units greater than the sampling interval are excluded from the population before sampling (AICPA, 2008). These units are examined separately, and their projected misstatements equal their full misstatements, not an extension of tainting.
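To make this arithmetic concrete, the short sketch below (in Python; the variable names are ours, and the figures are taken directly from Tables 1 and 2) reproduces the common tainting projection for this sample.

```python
# Sketch of the common MUS tainting projection illustrated in Table 2; names are ours.
book_total = 612_824                      # total book balance B
n_sample = 20                             # number of sampled accounts
interval = book_total / n_sample          # sampling interval = $30,641.20

# (book value, audit value) of the four misstated sample accounts: 24, 36, 46, 75.
# The other 16 sampled accounts have zero error and contribute zero tainting.
misstated = [(7090, 7050), (2399, 2149), (69_540, 69_000), (2291, 2191)]

taintings = [(b - a) / b for b, a in misstated]      # Error-Per-Dollar
avg_tainting = sum(taintings) / n_sample             # zero taintings still count in the denominator
projected_error = avg_tainting * book_total          # approximately $4941

by_account = [t * interval for t in taintings]       # approx. 172.87, 3193.12, 237.94, 1337.46
```

Equivalently, multiplying each tainting by the sampling interval and summing the per-account projections yields the same projected total.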
Table 2
Common error estimation for dollar unit sampling.

Account number | Book value (b) | Audit value (a) | Selection probability (πi) | Error (d = b − a) | Error-Per-Dollar (tainting) | Projected misstatement (tainting ⁎ sampling interval)
3 | 6842 | 6842 | 0.011164706 | 0 | 0 |
9 | 16,350 | 16,350 | 0.026679765 | 0 | 0 |
13 | 3935 | 3935 | 0.006421093 | 0 | 0 |
24 | 7090 | 7050 | 0.01156939 | 40 | 0.005642 | 172.87
29 | 5533 | 5533 | 0.009028693 | 0 | 0 |
34 | 2163 | 2163 | 0.003529562 | 0 | 0 |
36 | 2399 | 2149 | 0.003914664 | 250 | 0.10421 | 3193.12
43 | 8941 | 8941 | 0.014589833 | 0 | 0 |
44 | 3716 | 3716 | 0.006063731 | 0 | 0 |
45 | 8663 | 8663 | 0.014136196 | 0 | 0 |
46 | 69,540 b | 69,000 | 0.113474668 | 540 b | 0.007765 | 237.94 b
49 | 6881 | 6881 | 0.011228346 | 0 | 0 |
55 | 70,100 | 70,100 | 0.11438847 | 0 | 0 |
56 | 6467 | 6467 | 0.010552785 | 0 | 0 |
61 | 21,000 | 21,000 | 0.034267587 | 0 | 0 |
70 | 3847 | 3847 | 0.006277496 | 0 | 0 |
74 | 2422 | 2422 | 0.003952195 | 0 | 0 |
75 | 2291 | 2191 | 0.003738431 | 100 | 0.043649 | 1337.46
79 | 4667 | 4667 | 0.007615563 | 0 | 0 |
81 | 31,257 | 31,257 | 0.051004856 | 0 | 0 |
Total error in the sample | | | | 940 | |
Average Error-Per-Dollar | | | | | 0.008063 a |
Total projected error in the population | | | | | | 4941

Book value is the amount reported by the audited company. Audit value is the value audited and verified by the auditors. Selection probability is the probability that an account is selected from all the accounts in the population, equal to the book value of the account divided by the total book balance. Error is the difference between book value and audit value. Error-Per-Dollar is the ratio between error and book value. The sampling interval is $612,824/20 = $30,641.20. Total error D for the population is estimated as Average Error-Per-Dollar ⁎ total book value, or 0.008063 ⁎ 612,824 = $4941.
a The Average Error-Per-Dollar is the sum of all Errors-Per-Dollar divided by the number of accounts in the sample (20).
b This audit unit is greater than the sampling interval and may be considered separately from the sample evaluation.
For example, the projection from account 46 would be its full misstatement of $540 instead of $237.94. In this case, the total projected error in the population would be $5243 instead of $4941.

A common method to estimate the upper bound of D is the Stringer bound. Let Ti be the tainting of the i-th selected item, Ti ≡ di/bi. If M is the number of non-zero Ti, let 0 < zM ≤ … ≤ z1 be the ordered non-zero Ti. Let p(j; 1 − α) be the 1 − α exact upper confidence bound for p when X has a binomial (n, p) distribution and X = j. Thus p(j; 1 − α) is the unique solution of

\sum_{k=j+1}^{n} \binom{n}{k} p^{k} (1-p)^{n-k} = 1 - \alpha,    (5)

if j < n, and p(n; 1 − α) = 1. The Stringer bound for the mean tainting μ is

\mu_{ST} \equiv p(0; 1-\alpha) + \sum_{j=1}^{M} \left[ p(j; 1-\alpha) - p(j-1; 1-\alpha) \right] z_{j}.    (6)

The upper bound of the total error in the population is μST ⁎ B. There is a similar bound for the case in which X has a Poisson (n ⁎ p) distribution. The AICPA (2008) publishes tables as aids in calculating the Stringer upper bound (see Table C.3, "Monetary Unit Sampling — Confidence Factors for Sample Evaluation," in Appendix C of the Audit Guide). Our computation of the Stringer 95% upper bound, based on the binomial distribution for calculating p(j; 1 − α), is $129,380. A replication of the Stringer computation using the Poisson instead of the binomial distribution for p(j; 1 − α) yields a similar value, $133,518.

The above results, which are based on the common estimation approach, can be summarized as follows. The sample contains known errors of $940. The total error projected in the population is $4941. The Stringer 95% upper confidence bound of the total error in the population is $133,518. Thus, there is a 5% risk that the recorded amount of $612,824 is overstated by more than $133,518.
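A minimal sketch of Eqs. (5) and (6) follows (Python, assuming SciPy is available; the function and variable names are ours). It uses the exact Clopper-Pearson binomial upper bound, which solves Eq. (5). Because implementation details such as the treatment of the zero-error term and the confidence-level convention vary across audit software, this sketch is illustrative and need not reproduce the dollar figures quoted above.

```python
from scipy.stats import beta

def binomial_upper_bound(j, n, conf=0.95):
    """Exact upper confidence bound for p when X ~ Binomial(n, p) and X = j (Eq. 5)."""
    if j >= n:
        return 1.0
    return beta.ppf(conf, j + 1, n - j)   # Clopper-Pearson upper limit

def stringer_bound(taintings, n, book_total, conf=0.95):
    """Stringer upper bound on the total error, Eq. (6) times the book balance B."""
    z = sorted([t for t in taintings if t > 0], reverse=True)   # z_1 >= ... >= z_M > 0
    mu = binomial_upper_bound(0, n, conf)                       # p(0; 1-alpha)
    for j, zj in enumerate(z, start=1):
        mu += (binomial_upper_bound(j, n, conf)
               - binomial_upper_bound(j - 1, n, conf)) * zj
    return mu * book_total

# Nonzero taintings from Table 2, n = 20 sampled accounts, B = $612,824.
taintings = [40/7090, 250/2399, 540/69_540, 100/2291]
print(stringer_bound(taintings, n=20, book_total=612_824))
```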
We argue that the approach described above is too conservative. Compared to other bounds, for example the Multinomial-Dirichlet bound computed at $11,125 for the same sample, the Stringer bound is too large. References for multinomial bounds can be found in Swinamer et al. (2004) and Neter, Leitch and Fienberg (1978). As Stringer (1963) stresses, if the population is free of errors the bound gives the same answer (an overestimate) no matter what sample is drawn from the population. In fact, simulation studies (Reneau, 1978; Leitch, Neter, Plante & Sinha, 1982) find that the Stringer bound is always too conservative. The trade-off for high reliability is loss of efficiency: the Stringer bound is always much larger than the population total error (Bickel, 1992; Swinamer et al., 2004). It is probable that many auditors do not project sample results to populations in order to avoid unrealistic conservatism and client resistance. However, failing to project leads to judgment error regarding the financial statements overall. We seek to introduce an alternative method that is reliable but more efficient, to help auditors make realistic projections and probabilistic statements about the population error.

4. Zero-Inflated Poisson regression

To improve on the common estimation method described in Section 3, our aim is to estimate the total error D and its 95% upper confidence bound. In the first step, we develop the Poisson regression to view each dollar in an account as either in error or not (as count data). In the second step, we further develop the Poisson regression for count data with a large number of zero values, to suit populations of account errors.

4.1. Ordinary Poisson regression

We use the generalized regression model to allow for count data so that we can capture the distribution of our data more accurately (see Nandram, Sedransk & Pickle, 2000 for a detailed discussion of the generalized regression model and specifically the Poisson regression).
We use random coefficients, instead of fixed coefficients as in a standard linear regression, to treat each account as having its own set of regression coefficients. By allowing random coefficients, we can model the data more flexibly and therefore achieve more accuracy. This is an important feature of our method to incorporate random effects. Using random coefficients also allows us to build confidence intervals and upper confidence bounds. We approximate the distribution of the random coefficients using a standard mode–Hessian normal approximation (Gelman, Carlin, Stern & Rubin, 2004), which allows us to draw samples of the regression coefficients. This permits us to make copies of the total error, and thus obtain the distribution of the total error. In a sample of 10,000 copies we take the 250th and 9750th values to form the lower and upper ends of the 95% confidence interval.

To treat each dollar in an account as either in error or not, our model is

d_i \,|\, \lambda_i \stackrel{ind}{\sim} \text{Poisson}(\lambda_i b_i), \qquad \ln \lambda_i = (\beta_0 + r_{0i}) + (\beta_1 + r_{1i}) \pi_i,    (7)

where i = 1, 2, …, N, r_{0i} and r_{1i} are random perturbations, independent and identically distributed with mean 0 and finite variances, and λ_i is the error rate for a dollar, or equivalently, the probability that a dollar is in error. We now have two sets of random coefficients, β_0 + r_{0i} and β_1 + r_{1i}. Note that the model with r_{0i} = r_{1i} = 0, i = 1, 2, …, N, is

d_i \,|\, \lambda_i \stackrel{ind}{\sim} \text{Poisson}(\lambda_i b_i), \qquad \ln \lambda_i = \beta_0 + \beta_1 \pi_i.    (8)

This is another important feature of our method: observations are treated according to their selection probabilities π_i, so sample and non-sample observations are not assumed to be the same. The first model is centered on the second model. Centering is achieved by taking r_{0i} = r_{1i} = 0, so that

d_i \,|\, \lambda_i \stackrel{ind}{\sim} \text{Poisson}\left(b_i e^{\beta_0 + \beta_1 \pi_i}\right).    (9)

Centering allows us to use this simpler model to study the original, more complex model. Because the d_i are independent, the log-likelihood function is

f(\beta_0, \beta_1) = \beta_0 \sum_{i=1}^{n} d_i + \beta_1 \sum_{i=1}^{n} \pi_i d_i - \sum_{i=1}^{n} b_i e^{\beta_0 + \beta_1 \pi_i}.    (10)

We use maximum likelihood estimation in the centered model to obtain MLEs. Preliminary estimates are possible at this step; however, they do not benefit from the zero-inflation process, so they are less well adjusted for populations with a large number of zero errors. We show in Appendix A how to obtain 10,000 copies of the overstatement error. We use two estimators, the predicted estimator and the projected estimator, defined as

\text{Predicted: } \sum_{i=1}^{n} d_i + \sum_{i=n+1}^{N} \tilde{d}_i,    (11)

\text{Projected: } \sum_{i=1}^{N} \tilde{d}_i,    (12)

where \tilde{d}_i is the predicted value of d_i. Let E^{(h)} = \sum_{i=1}^{N} d_i be the h-th of the 10,000 copies obtained from Appendix A, h = 1, 2, …, M, with M = 10,000. An estimate of the overstatement error is

\bar{E} = \sum_{h=1}^{M} E^{(h)} / M,    (13)

and the standard error is

S = \sqrt{ \sum_{h=1}^{M} \left( E^{(h)} - \bar{E} \right)^{2} / (M-1) }.    (14)

To obtain a 95% confidence interval for the total overstatement error, we order the E^{(h)} from smallest to largest and pick the 250th and 9750th values as the lower and upper ends of the confidence interval. The 95% upper confidence bound is the 9500th value.

4.2. Zero-Inflation Poisson regression

We now extend the Poisson regression model to count data with many zero values. This is our major innovation to address MUS estimation errors. We assume that with probability θ the only possible observation is 0 (i.e., a pure zero), and with probability 1 − θ a Poisson (λ) random variable is observed. In other words, we treat errors as nearly impossible, but allow errors to occur according to a Poisson (λ) distribution. This treatment is termed Zero-Inflation Poisson (ZIP). For a pioneering description of ZIP regression, see Lambert (1992). Both the probability θ of the perfect zero-error state and the mean number of errors λ in the imperfect state may depend on covariates. Pertinent covariates, such as trade relationships, credit terms, and product prices, can lead to improved precision in statistical inference. For a discussion of how to introduce covariates, see, for example, Nandram et al. (2000). Sometimes θ and λ are unrelated; other times θ is a simple function of λ. In either case, ZIP regression models are easy to fit, and the maximum likelihood estimates are approximately normal in large samples of at least 25 (Lambert, 1992).

As in the ordinary Poisson model in Section 4.1, our model is Eq. (7). To increase the probabilities of zero errors, we use the following distribution for d_i:

p(d_i = 0) = \theta + (1-\theta) e^{-b_i e^{\beta_0 + \beta_1 \pi_i}}, \qquad p(d_i \neq 0) = (1-\theta) \frac{\left(b_i e^{\beta_0 + \beta_1 \pi_i}\right)^{d_i} e^{-b_i e^{\beta_0 + \beta_1 \pi_i}}}{d_i!},    (15)

where d_i = 1, 2, 3, …. The random effect corresponding to θ is θ + r_{2i}. We do not treat the d_i equally: because most of the errors in the sample are zero, different treatments of zero-valued and non-zero d_i are desired to give more appropriate weight to the zero values. We center the ZIP model by taking r_{0i} = r_{1i} = r_{2i} = 0, i = 1, 2, …, N, similar to Eqs. (8) and (9). Since the d_i are independent, we consider their joint probability density function:

L(\theta, \beta_0, \beta_1) = \prod_{\{i:\, d_i = 0\}} \left\{ \theta + (1-\theta) e^{-b_i e^{\beta_0 + \beta_1 \pi_i}} \right\} \prod_{\{i:\, d_i > 0\}} \left\{ (1-\theta) \frac{\left(b_i e^{\beta_0 + \beta_1 \pi_i}\right)^{d_i} e^{-b_i e^{\beta_0 + \beta_1 \pi_i}}}{d_i!} \right\}.    (16)
Thus the log-likelihood function is

\Delta(\theta, \beta_0, \beta_1) = \sum_{\{i:\, d_i = 0\}} \ln\left[\theta + (1-\theta) e^{-b_i e^{\beta_0 + \beta_1 \pi_i}}\right] + \sum_{\{i:\, d_i > 0\}} \left[\ln(1-\theta) + d_i(\beta_0 + \beta_1 \pi_i) - b_i e^{\beta_0 + \beta_1 \pi_i} - \ln(d_i!)\right].    (17)
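For readers who prefer code, the following is a small sketch of evaluating the log-likelihood in Eq. (17) (Python with NumPy/SciPy; the function and variable names are ours and do not come from the authors' software):

```python
import numpy as np
from scipy.special import gammaln

def zip_loglik(theta, beta0, beta1, d, b, pi):
    """Centered ZIP log-likelihood of Eq. (17) for errors d, book values b,
    and selection probabilities pi (arrays over the sampled accounts)."""
    d, b, pi = map(np.asarray, (d, b, pi))
    lam_b = b * np.exp(beta0 + beta1 * pi)        # Poisson mean b_i * exp(beta0 + beta1*pi_i)
    zero = d == 0
    term_zero = np.log(theta + (1 - theta) * np.exp(-lam_b[zero]))    # accounts with d_i = 0
    term_pos = (np.log(1 - theta) + d[~zero] * np.log(lam_b[~zero])
                - lam_b[~zero] - gammaln(d[~zero] + 1))               # accounts with d_i > 0
    return term_zero.sum() + term_pos.sum()
```

This is the function that the EM algorithm (Appendix B) maximizes over (θ, β0, β1).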
We use the Expectation Maximization (EM) algorithm to maximize the log-likelihood in Eq. (17) over θ, β0 and β1 (see Appendix B). The values we obtain from the above procedures are θ̃ = 0.80, β̃0 = −3.25, and β̃1 = −14.36.

Our estimations are based on the two estimators defined in Eqs. (11) and (12), the predicted estimator and the projected estimator. We estimate the mean, standard error, confidence interval, and upper bound in the manner described in Eqs. (13) and (14); we show in Appendix B how to obtain 10,000 copies of the overstatement error. The predicted estimator has a mean of $3272, a standard error of $1037, a 95% confidence interval of [$1487, $5336], and a 95% upper confidence bound of $4848. The projected estimator has a mean of $3435, a standard error of $1463, a 95% confidence interval of [$931, $6107], and a 95% upper confidence bound of $5444.

By design, we argue that our ZIP method has an advantage because it addresses the selection bias of MUS towards larger accounts, it handles populations with a large number of zero values, and it handles zero versus non-zero values explicitly and separately. As a result of this design, ZIP should be better suited than common estimation practice for accounting populations.

5. Simulations

We perform three simulations to assess the reliability and efficiency of our ZIP upper bound, using the following definitions by Swinamer et al. (2004): a 95% upper confidence bound is reliable if, when used repeatedly, the bound exceeds the true error 95% of the time; efficiency measures the size of the bound, and the smaller the bound, the more efficient it is said to be. We compare the reliability and efficiency of our ZIP method with the Stringer bound, because this bound is the most extensively used by accounting auditors, as discussed in previous sections. We also compare our ZIP method with the Multinomial-Dirichlet bound, because this bound demonstrates the best reliability for a variety of populations (Tsui et al., 1985). More MUS bounds exist, but they have relatively more reliability problems, and each has its own pros and cons (Grimlund & Felix, 1987; Swinamer et al., 2004).
Simulation 1 consists of three steps. First, we fill in the missing audit values for the 87 − 20 = 67 non-sampled accounts in our data set. We use the bivariate Parzen–Rosenblatt kernel density estimator, an independent non-parametric model (Silverman, 1986), to fill in the audit values. To model errors, we generate the missing audit values in the range 0.90 × book value < audit value < book value, and we create the scenario where 80% of the book values and audit values are in perfect agreement. This population is kept fixed throughout. Because we generate all the necessary audit values, we have the entire population and the true total population error. Second, we draw 1000 PPS samples (probabilities of selection proportional to the book values) without replacement of size 20 from the single population we construct in the first step. Third, we fit the ZIP regression model in exactly the same manner discussed in Section 4 for the observed data. We compute a 90% confidence interval for each sample drawn from the fixed population, whose upper end provides a 95% upper confidence bound. For comparison, we compute the same for the Stringer bound and the Multinomial-Dirichlet bound. The results are shown in Table 3.

Table 3, Columns 2–6 show the reliability results, specifically the frequency with which the bounds exceed the true population error. In Simulation 1, the ZIP bound exceeds the true population error 89.5% of the time, whereas it is supposed to exceed it 95% of the time. In contrast, the Stringer bound exceeds the true error 100% of the time, while this frequency is 96% for the Multinomial-Dirichlet bound. These results demonstrate ZIP's slightly lower reliability than the theoretical level and the extra conservatism of the Stringer bound. To report on efficiency, Columns 5 and 6 show the frequency with which the ZIP bound is larger (more inefficient) than the Stringer bound and the Multinomial-Dirichlet bound. In Simulation 1, these frequencies are low (0.7% and 9.6%, respectively), indicating that the ZIP bound is almost always smaller, or more efficient. To further shed light on the size of the other bounds relative to the ZIP bound, Columns 7 and 8 show the ratio of the size differential over the ZIP bound. On average, this ratio is about 13 for the Stringer bound and about 0.26 (25.8%) for the Multinomial-Dirichlet bound. These results demonstrate the far greater efficiency of the ZIP bound compared to the other two.

Simulation 2 is similar to the first, except that the audit values are generated by the ZIP model. The generated data have characteristics similar to accounting data in the sense that they come from a distribution that presumes a very large number of zero errors. From Column 2, the ZIP bound is reliable, as its 95% upper bound exceeds the true error value 96.8% of the time. From Columns 5 and 6, the frequency with which the ZIP bound is larger than the other two bounds is very small (0.8% and 3%, respectively), denoting ZIP's greater efficiency.
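The simulation design can be summarized in a few lines of code. The sketch below (Python; the population arrays and the `upper_bound` routine are placeholders of ours, and successive sampling via `numpy.random.Generator.choice` stands in for strict PPS sampling without replacement) records how often a given bound covers the true error and how large it is on average:

```python
import numpy as np

def simulate(book, error, upper_bound, n=20, reps=1000, seed=0):
    """Draw `reps` PPS samples of size n and assess one bound's reliability and efficiency."""
    rng = np.random.default_rng(seed)
    true_error = error.sum()                 # known because the population is constructed
    p = book / book.sum()                    # selection probability proportional to book value
    covered, bounds = 0, []
    for _ in range(reps):
        idx = rng.choice(len(book), size=n, replace=False, p=p)
        ub = upper_bound(book[idx], error[idx], book.sum())   # e.g., a ZIP or Stringer 95% bound
        bounds.append(ub)
        covered += ub >= true_error
    return covered / reps, float(np.mean(bounds))   # reliability, average bound size
```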
Table 3
Simulation results.

Simulation | Z > T | S > T | M > T | Z > S | Z > M | (S − Z)/Z | (M − Z)/Z
Simulation 1: data from Parzen–Rosenblatt model | 0.895 | 1.000 | 0.960 | 0.007 | 0.096 | 12.989 | 0.258
Simulation 2: data from ZIP model | 0.968 | 1.000 | 1.000 | 0.008 | 0.030 | 0.342 | 0.188
Simulation 3: data from Neter–Loebbecke population 4 | 0.984 | 1.000 | 1.000 | 0.009 | 0.044 | 0.343 | 0.182

Z: ZIP bound, S: Stringer bound, M: Multinomial-Dirichlet bound, T: true value. The rows show the results from three separate simulations. Columns 2–6 (Z > T through Z > M, covering reliability and efficiency) show the proportion of 1000 simulations in which one bound is larger than the true value or another bound. Columns 7 and 8 show the relative size of the other bounds compared to the ZIP bound, averaged over 1000 simulations.
Simulation 3 is similar to the first two, except that the audit values are generated to have error distributions identical to Population 4 in Neter and Loebbecke (1975). This population consists of 4033 accounts receivable of a large manufacturer. Neter and Loebbecke (1975) have made major contributions in analyzing the error characteristics of accounting populations, and their populations are often used to represent accounting populations (Chan, 1988; Smieliauskas, 1986a; Frost & Tamura, 1982). From Column 2, the ZIP bound is reliable, as its 95% upper bound exceeds the true error value 98.4% of the time. From Columns 5 and 6, the frequency with which the ZIP bound is larger than the other two bounds is very small (0.9% and 4.4%, respectively), denoting ZIP's greater efficiency. Columns 7 and 8 show that the Stringer bound is about 34% larger and the Multinomial-Dirichlet bound about 18% larger than the ZIP bound, also denoting ZIP's greater efficiency.

Summing up the simulation studies, results based on a general population without distinct distributions show our ZIP method to be very efficient, although with slightly lower reliability than the theoretical level. However, based on populations similar to or taken from accounting error populations, the ZIP bound is reliable and more efficient than the Stringer and Multinomial-Dirichlet bounds. The overall performance of ZIP appears very good in light of Swinamer et al. (2004), who use extensive simulations on both real and simulated data to compare 14 bounds and find no single method superior in terms of both reliability and efficiency. Given the trade-off between reliability and efficiency in the state of the art, our ZIP method is very promising as an MUS estimation technique.
6. Conclusion

We propose a method that improves upon the common MUS estimation approach. MUS estimation as commonly practiced by accounting auditors does not explicitly recognize that a total population of account errors typically consists of multiple distinct distributions. As a result, this common approach often leads to very large error estimations and very conservative auditor decisions on the fairness of client financial statements. For conservatism, auditors should want large estimations of errors and upper bounds; however, estimations under common practice are too conservative, and excessive conservatism has its own problems. Our method, based on the Zero-Inflation Poisson distribution, addresses this shortcoming. We discuss our method and show that for data similar to accounting populations, our bound is reliable and more efficient than common MUS practice, so we recommend our bound to accounting auditors. For other populations, our method may be slightly less reliable than theoretically desired, and future research should seek to improve the method for these populations.

Acknowledgement

The authors acknowledge the helpful comments from two anonymous reviewers and the Journal's associate editor. Any errors are the authors' responsibility.

Appendix A

For the centered model, the log-likelihood function is

f(\beta_0, \beta_1) = \beta_0 \sum_{i=1}^{n} d_i + \beta_1 \sum_{i=1}^{n} \pi_i d_i - \sum_{i=1}^{n} b_i e^{\beta_0 + \beta_1 \pi_i}.

We use a two-dimensional Newton's method to maximize it over β0 and β1. Details are omitted.

An estimate of the covariance matrix of β̂0 and β̂1 is the negative inverse Hessian matrix. Writing

S_0 = \sum_{i=1}^{n} b_i e^{\beta_0 + \beta_1 \pi_i}, \qquad S_1 = \sum_{i=1}^{n} b_i \pi_i e^{\beta_0 + \beta_1 \pi_i}, \qquad S_2 = \sum_{i=1}^{n} b_i \pi_i^2 e^{\beta_0 + \beta_1 \pi_i},

the Hessian matrix is

H = -\begin{pmatrix} S_0 & S_1 \\ S_1 & S_2 \end{pmatrix},

whose entries, apart from the overall minus sign, are all positive. The negative inverse Hessian matrix is

-H^{-1} = \begin{pmatrix} a_1 & c \\ c & a_2 \end{pmatrix}, \qquad a_1 = \frac{S_2}{S_0 S_2 - S_1^2}, \qquad a_2 = \frac{S_0}{S_0 S_2 - S_1^2}, \qquad c = \frac{-S_1}{S_0 S_2 - S_1^2}.

First, we take

\begin{pmatrix} \beta_0 + r_{0i} \\ \beta_1 + r_{1i} \end{pmatrix} \stackrel{ind}{\sim} \text{Normal}\left( \begin{pmatrix} \tilde{\beta}_0 \\ \tilde{\beta}_1 \end{pmatrix}, \begin{pmatrix} a_1 & c \\ c & a_2 \end{pmatrix} \right),

where a_1, a_2, and c are the elements of −H^{-1} above. This is a standard approximation and is typically used in generalized regression models. Next, we show how to construct the confidence interval. We can see that ln λ_i = (β_0 + r_{0i}) + (β_1 + r_{1i}) π_i also follows a normal distribution, whose mean and variance are obtained easily,

\ln \lambda_i \sim \text{Normal}\left(\tilde{\beta}_0 + \pi_i \tilde{\beta}_1,\; a_1 + \pi_i^2 a_2 + 2 \pi_i c\right), \quad i = 1, 2, \ldots, N, \text{ independently}.

As we discussed earlier, d_i follows the Poisson distribution d_i | λ_i ~ Poisson(b_i e^{β_0 + β_1 π_i}), which is equivalent to d_i | λ_i ~ Poisson(e^{\ln λ_i} b_i). We can then draw Z_1, Z_2, …, Z_N, N independent standard normal random variables, and compute

\ln(\tilde{\lambda}_i) = (\tilde{\beta}_0 + \pi_i \tilde{\beta}_1) + Z_i \sqrt{a_1 + \pi_i^2 a_2 + 2 \pi_i c},

where i = 1, 2, …, N. Thus, we can now draw d_i as a Poisson random variable with parameter b_i λ̃_i, i = 1, 2, …, N, where the random draws λ̃_1, λ̃_2, …, λ̃_N are given by

\tilde{\lambda}_i = \exp\left\{(\tilde{\beta}_0 + \pi_i \tilde{\beta}_1) + Z_i \sqrt{a_1 + \pi_i^2 a_2 + 2 \pi_i c}\right\},

where i = 1, 2, …, N. We repeat this process 10,000 times and obtain 10,000 values of the overstatement error \sum_{i=1}^{N} d_i.
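A compact sketch of this procedure follows (Python with NumPy; the names are ours, and it assumes the MLEs b0 and b1 have already been obtained, e.g., by Newton's method as above):

```python
import numpy as np

def overstatement_copies(b, pi, b0, b1, M=10_000, seed=0):
    """Draw M copies of the total overstatement error under the centered Poisson model."""
    rng = np.random.default_rng(seed)
    b, pi = np.asarray(b, float), np.asarray(pi, float)

    w = b * np.exp(b0 + b1 * pi)                      # b_i * exp(beta0 + beta1*pi_i)
    S0, S1, S2 = w.sum(), (w * pi).sum(), (w * pi**2).sum()
    det = S0 * S2 - S1**2
    a1, a2, c = S2 / det, S0 / det, -S1 / det         # entries of the negative inverse Hessian

    var_lnlam = a1 + pi**2 * a2 + 2 * pi * c          # Var[(beta0 + r0i) + (beta1 + r1i)*pi_i]
    copies = np.empty(M)
    for h in range(M):
        lnlam = b0 + b1 * pi + rng.standard_normal(len(b)) * np.sqrt(var_lnlam)
        copies[h] = rng.poisson(b * np.exp(lnlam)).sum()   # one copy of sum_i d_i
    return np.sort(copies)   # 250th/9750th values bracket the 95% CI; the 9500th is the 95% bound
```

For the ZIP model of Appendix B, the same idea applies with the additional zero-inflation draw Z_i ~ Bernoulli(θ + r_{2i}).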
Appendix B. A Zero-Inflated Poisson (ZIP) regression model

Now the joint probability density function becomes

L(\theta, \beta_0, \beta_1) = \prod_{i \in D} \left\{ \theta + (1-\theta) e^{-b_i e^{\beta_0 + \beta_1 \pi_i}} \right\} \prod_{i \in \bar{D}} \left\{ (1-\theta) \frac{\left(b_i e^{\beta_0 + \beta_1 \pi_i}\right)^{d_i} e^{-b_i e^{\beta_0 + \beta_1 \pi_i}}}{d_i!} \right\},

where D is the set of accounts with d_i = 0 and D̄ is the set of accounts with d_i ≠ 0.

To apply the Expectation Maximization (EM) algorithm, we introduce a new variable Z_i, defined as Z_i = 1 if d_i is a pure zero and Z_i = 0 otherwise. Thus we have p(d_i = 0 | Z_i = 1) = 1 and p(d_i ≠ 0 | Z_i = 1) = 0.

We could obtain the Hessian matrix of θ̂, β̂_0, β̂_1 to maximize the likelihood function directly, but it is more convenient to use the EM algorithm. By the EM algorithm, optimizing the joint probability density function above is equivalent to optimizing the quantity g(θ, Z) h(β_0, β_1, Z), where

g(\theta, Z) = \theta^{\sum_{D} Z_i} (1-\theta)^{\,n - \sum_{D} Z_i},

h(\beta_0, \beta_1, Z) = \exp\left\{ \sum_{\bar{D}} \left[ d_i(\beta_0 + \beta_1 \pi_i) - b_i e^{\beta_0 + \beta_1 \pi_i} \right] - \sum_{D} (1 - Z_i)\, b_i e^{\beta_0 + \beta_1 \pi_i} \right\},

and the MLEs of β_0 and β_1 can be obtained by the Newton's method of Appendix A. Since θ appears only in g(θ, Z), we differentiate its logarithm,

\ln g(\theta, Z) = \sum_{D} Z_i \ln\theta + \left(n - \sum_{D} Z_i\right) \ln(1-\theta), \qquad \frac{\partial \ln g(\theta, Z)}{\partial \theta} = \frac{\sum_{D} Z_i}{\theta} - \frac{n - \sum_{D} Z_i}{1-\theta} = 0,

and solve for θ:

\theta = \sum_{D} Z_i / n.    (B.1)

The conditional distribution of Z_i is

Z_i \,|\, (\theta, \beta_0, \beta_1) \stackrel{ind}{\sim} \text{Bernoulli}\left( \frac{\theta}{\theta + (1-\theta) e^{-b_i e^{\beta_0 + \beta_1 \pi_i}}} \right), \quad i = 1, 2, \ldots, n,

so the conditional expectation of Z_i is

E\{Z_i \,|\, \theta, \beta_0, \beta_1\} = \frac{\theta}{\theta + (1-\theta) e^{-b_i e^{\beta_0 + \beta_1 \pi_i}}}.    (B.2)

In the first iteration, we use θ = 0.5 as the starting value; for β_0 and β_1 we use the MLEs of the regression coefficients as starting values, i.e., β̃_0 = −5.71 and β̃_1 = −0.22. After we get all the E{Z_i | θ, β_0, β_1} for the sample using Eq. (B.2), we optimize h(β_0, β_1, Z) and calculate a new θ following Eq. (B.1). Then we start the second iteration: the MLEs of β_0, β_1 and the new θ are used to calculate new Z_i, after which we optimize h(β_0, β_1, Z) and calculate a new θ again, and these values are used in the third iteration. We repeat this process until θ, β_0 and β_1 all converge.

Letting H be the Hessian matrix, which is too cumbersome to present, we take

\begin{pmatrix} \theta + r_{2i} \\ \beta_0 + r_{0i} \\ \beta_1 + r_{1i} \end{pmatrix} \stackrel{ind}{\sim} \text{Normal}\left( \begin{pmatrix} \tilde{\theta} \\ \tilde{\beta}_0 \\ \tilde{\beta}_1 \end{pmatrix}, \; -H^{-1} \right).

We draw a random sample of (β_0 + r_{0i}, β_1 + r_{1i}, θ + r_{2i}) from this normal density, i = 1, 2, ⋯, N. Then we construct λ_i as λ_i = e^{(β_0 + r_{0i}) + (β_1 + r_{1i}) π_i} and draw Z_i ~ Bernoulli(θ + r_{2i}). If Z_i = 1, then d_i = 0; if Z_i = 0, we draw d_i from Poisson(b_i λ_i). We then calculate \sum_{i=1}^{N} d_i, or \sum_{i=1}^{n} d_i + \sum_{i=n+1}^{N} d_i, to obtain D for the projected or the predicted estimator, respectively. We repeat this process 10,000 times and obtain 10,000 values of the overstatement error.
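As an illustration, a hedged sketch of this EM iteration is given below (Python with NumPy/SciPy; the names are ours, and a general-purpose optimizer replaces the authors' Newton step in the M-step for β0 and β1):

```python
import numpy as np
from scipy.optimize import minimize

def zip_em(d, b, pi, theta=0.5, beta=(-5.71, -0.22), iters=200, tol=1e-8):
    """EM iteration for the centered ZIP model of Appendix B; beta defaults to the
    starting values quoted in the text."""
    d, b, pi = map(lambda x: np.asarray(x, float), (d, b, pi))
    zero = d == 0
    beta = np.asarray(beta, float)
    for _ in range(iters):
        lam_b = b * np.exp(beta[0] + beta[1] * pi)
        # E-step (Eq. B.2): expected pure-zero indicator for accounts with d_i = 0
        z = np.zeros(len(d))
        z[zero] = theta / (theta + (1 - theta) * np.exp(-lam_b[zero]))
        # M-step for theta (Eq. B.1)
        new_theta = z.sum() / len(d)
        # M-step for (beta0, beta1): maximize log h(beta0, beta1, Z)
        def neg_h(bb):
            lb = b * np.exp(bb[0] + bb[1] * pi)
            return -(np.sum(d[~zero] * (bb[0] + bb[1] * pi[~zero]) - lb[~zero])
                     - np.sum((1 - z[zero]) * lb[zero]))
        new_beta = minimize(neg_h, beta, method="Nelder-Mead").x
        converged = (abs(new_theta - theta) < tol
                     and np.max(np.abs(new_beta - beta)) < tol)
        theta, beta = new_theta, new_beta
        if converged:
            break
    return theta, beta
```

The fitted (θ̃, β̃0, β̃1) can then be fed into a mode–Hessian normal approximation, as in Appendix A, to generate the 10,000 copies of the overstatement error.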
References

Akresh, A. D., & Tatum, K. W. (1988). Audit sampling — Dealing with the problems. The saga of SAS No. 39 and how firms are handling its requirements. Journal of Accountancy, 166(6), 58−64.
American Institute of Certified Public Accountants (AICPA). (1981). Statement on Auditing Standards No. 39, Audit Sampling. New York: AICPA.
American Institute of Certified Public Accountants (AICPA). (2006). Statement on Auditing Standards No. 111, Amendment to Statement on Auditing Standards No. 39, Audit Sampling. New York: AICPA.
American Institute of Certified Public Accountants (AICPA). (2008). Audit Guide: Audit Sampling. New York: AICPA.
Annulli, T. J., Mulrow, J., & Anziano, C. R. (2000). The Wisconsin audit sampling study. Corporate Business Taxation Monthly (pp. 19−30).
Bickel, P. J. (1992). Inference and auditing: The Stringer bound. International Statistical Review, 60(2), 197−209.
Burgstahler, D., & Jiambalvo, J. (1986). Sample error characteristics and projection of error to audit populations. Accounting Review, 61(2), 233−248.
Chan, K. H. (1988). Estimating accounting errors in audit sampling: Extensions and empirical tests of a decomposition approach. Journal of Accounting, Auditing & Finance, 11(2), 153−161.
Dworin, L., & Grimlund, R. A. (1986). Dollar-unit sampling: A comparison of the quasi-Bayesian and moment bounds. Accounting Review, 61(1), 36−58.
Elder, R. J., & Allen, R. D. (1998). An empirical investigation of the auditor's decision to project errors. Auditing: A Journal of Practice & Theory, 17(2), 71−87.
Felix, W. L., Jr., Leslie, D. A., & Neter, J. (1981). University of Georgia Center for Audit Research Monetary-Unit Sampling Conference, March 24, 1981. Auditing: A Journal of Practice & Theory, 1(2), 92−103.
Frost, P., & Tamura, H. (1982). Jackknifed ratio estimation in statistical auditing. Journal of Accounting Research, 20(1), 103−120.
Gafford, W. W., & Carmichael, D. R. (1984). Materiality, audit risk and sampling: A nuts-and-bolts approach (part two). Journal of Accountancy, 158(4–5), 109−118.
Gelman, A., Carlin, J. B., Stern, H. S., & Rubin, D. B. (2004). Bayesian data analysis, 2nd edition. New York: Chapman & Hall.
Grimlund, R. A., & Felix, D. W., Jr. (1987). Simulation evidence and analysis of alternative methods of evaluating dollar-unit samples. Accounting Review, 62(3), 455−480.
Guy, D. M., & Carmichael, D. R. (1986). Audit sampling: An introduction to statistical sampling in auditing, 2nd edition. New York: John Wiley & Sons Inc.
Hall, T. W., Hunton, J. E., & Pierce, B. J. (2000). The use of and selection biases associated with nonstatistical sampling in auditing. Behavioral Research in Accounting, 12, 231−255.
Hall, T. W., Hunton, J. E., & Pierce, B. J. (2002). Sampling practices of auditors in public accounting, industry, and government. Accounting Horizons, 16(2), 125−136.
Hansen, S. C. (1993). Strategic sampling, physical units sampling, and dollar units sampling. Accounting Review, 68(2), 323−345.
Hitzig, N. (1995). Audit sampling: A survey of current practice. CPA Journal, 65(7), 54−58.
Kaplan, R. S. (1973). Statistical sampling in auditing with auxiliary information estimators. Journal of Accounting Research, 11(2), 238−258.
Kaplan, R. S. (1975). Sample size computations for dollar-unit sampling. Journal of Accounting Research, 13(3), 126−133.
Lambert, D. (1992). Zero-inflated Poisson regression, with an application to defects in manufacturing. Technometrics, 34(1), 1−14.
Leitch, R. A., Neter, J., Plante, R., & Sinha, P. (1982). Modified multinomial bounds for larger number of errors in audits. Accounting Review, 57(2), 384−400.
Lohr, S. L. (1999). Sampling: Design and analysis. New York: Duxbury Press.
Menzefricke, U. (1984). Using decision theory for planning audit sample size with dollar unit sampling. Journal of Accounting Research, 22(2), 570−587.
Messier, W. S., Jr., Kachelmeier, S. J., & Jensen, K. L. (2001). An experimental assessment of recent professional development in nonstatistical audit sampling guidance. Auditing: A Journal of Practice & Theory, 20(1), 81−96.
Nandram, B., Sedransk, J., & Pickle, L. (2000). Bayesian analysis and mapping of mortality rates for chronic obstructive pulmonary disease. Journal of the American Statistical Association, 95(452), 1110−1118.
Nelson, M. K. (1995). Strategies of auditors: Evaluation of sample results. Auditing: A Journal of Practice & Theory, 14(1), 34−49.
Neter, J., & Loebbecke, J. K. (1975). Behavior of major statistical estimators in sampling accounting populations — An empirical study. New York: AICPA.
Neter, J., Leitch, R. A., & Fienberg, S. E. (1978). Dollar unit sampling: Multinomial bounds for total overstatement and understatement errors. Accounting Review, 53(1), 77−94.
Reneau, J. H. (1978). CAV bounds in dollar unit sampling: Some simulation results. Accounting Review, 53(3), 669−680.
Schwartz, D. A. (1997). Audit sampling — A practical approach. CPA Journal, 67(2), 56−60.
Silverman, B. W. (1986). Density estimation for statistics and data analysis. New York: Chapman and Hall.
Smieliauskas, W. (1986a). A simulation analysis of the power characteristics of some popular estimators under different risk and materiality levels. Journal of Accounting Research, 24(1), 217−230.
Smieliauskas, W. (1986b). Control of sampling risks in auditing. Contemporary Accounting Research, 3(1), 102−124.
Stringer, K. W. (1963). Practical aspects of statistical sampling in auditing. Proceedings of the Business and Economic Statistics Section, American Statistical Association (pp. 404−411).
Swinamer, K., Lesperance, M. L., & Will, H. (2004). Optimal bounds used in dollar unit sampling: A comparison of reliability and efficiency. Communications in Statistics — Simulation and Computation, 33(1), 109−143.
Tsui, K. W., Matsumura, E. M., & Tsui, K. L. (1985). Multinomial-Dirichlet bounds for dollar-unit sampling in auditing. Accounting Review, 60(1), 76−97.
Yancey, W. (2002). Statistical sampling in sales and use tax audits. Chicago: CCH Incorporated.