Chapter 9
Hypotheses Tests

"We must conduct research and then accept the results. If they don't stand up to experimentation, Buddha's own words must be rejected."
—Tenzin Gyatso, 14th Dalai Lama
9.1 INTRODUCTION
As discussed previously, one of the problems to be solved by statistical inference is hypothesis testing. A statistical hypothesis is an assumption about a certain population parameter, such as the mean, the standard deviation, or the correlation coefficient. A hypothesis test is a procedure used to decide whether a certain hypothesis is true or false. For a statistical hypothesis to be validated or rejected with complete accuracy, it would be necessary to examine the entire population, which in practice is not viable. As an alternative, we draw a random sample from the population of interest. Since the decision is made based on the sample, errors may occur (rejecting a hypothesis when it is true, or not rejecting a hypothesis when it is false), as we will study later on. This chapter presents the procedures and concepts necessary to construct a hypothesis test.

Let X be a variable associated with a population and θ a certain parameter of this population. We must define the hypothesis to be tested about parameter θ, which is called the null hypothesis:

H0: θ = θ0   (9.1)
Let's also define the alternative hypothesis (H1), in case H0 is rejected, which can be characterized as follows:

H1: θ ≠ θ0   (9.2)
in which case the test is called a bilateral test (or two-tailed test). The significance level of a test (α) represents the probability of rejecting the null hypothesis when it is true (one of the two types of error that may occur, as we will see later). The critical region (CR) or rejection region (RR) of a bilateral test is represented by two tails of the same size, at the left and right extremities of the distribution curve, each corresponding to half of the significance level α, as shown in Fig. 9.1. Another way to define the alternative hypothesis (H1) would be:

H1: θ < θ0   (9.3)
in which case the test is called a unilateral test to the left (or left-tailed test). Here, the critical region is in the left tail of the distribution and corresponds to the significance level α, as shown in Fig. 9.2. Or the alternative hypothesis could be:

FIG. 9.1 Critical region (CR) of a bilateral test, also emphasizing the nonrejection region (NR) of the null hypothesis.
Data Science for Business and Decision Making. https://doi.org/10.1016/B978-0-12-811216-8.00009-4 © 2019 Elsevier Inc. All rights reserved.
PART IV Statistical Inference
H1: θ > θ0   (9.4)
in which case the test is called a unilateral test to the right (or right-tailed test). Here, the critical region is in the right tail of the distribution and corresponds to the significance level α, as shown in Fig. 9.3. Thus, if the main objective is to check whether a parameter is significantly higher or lower than a certain value, we must use a unilateral test. On the other hand, if the objective is to check whether a parameter is different from a certain value, we must use a bilateral test.

After defining the null hypothesis to be tested, we either reject it or not, based on a random sample collected from the population. Since the decision is made based on the sample, two types of errors may happen:

Type I error: rejecting the null hypothesis when it is true. The probability of this type of error is represented by α:

P(type I error) = P(rejecting H0 | H0 is true) = α   (9.5)
Type II error: not rejecting the null hypothesis when it is false. The probability of this type of error is represented by β:

P(type II error) = P(not rejecting H0 | H0 is false) = β   (9.6)
Table 9.1 shows the types of errors that may happen in a hypothesis test. The procedure for defining hypotheses tests includes the following steps:

Step 1: Choosing the most suitable statistical test, depending on the researcher's intention.
Step 2: Presenting the test's null hypothesis H0 and its alternative hypothesis H1.
Step 3: Setting the significance level α.
Step 4: Calculating the observed value of the statistic based on the sample obtained from the population.
Step 5: Determining the test's critical region based on the value of α set in Step 3.
Step 6: Decision: if the value of the statistic lies in the critical region, reject H0. Otherwise, do not reject H0.
According to Fávero et al. (2009), most statistical software packages, among them SPSS and Stata, calculate the P-value, which corresponds to the probability associated with the value of the statistic calculated from the sample. The P-value indicates the lowest significance level observed that would lead to the rejection of the null hypothesis. Thus, we reject H0 if P ≤ α.
FIG. 9.2 Critical region (CR) of a left-tailed test, also emphasizing the nonrejection region of the null hypothesis (NR).
FIG. 9.3 Critical region (CR) of a right-tailed test.
TABLE 9.1 Types of Errors

Decision            H0 Is True                H0 Is False
Not rejecting H0    Correct decision (1 − α)  Type II error (β)
Rejecting H0        Type I error (α)          Correct decision (1 − β)
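The meaning of α in Table 9.1 can be checked empirically with a short Monte Carlo simulation. The Python sketch below is illustrative (not from the original text): it repeatedly draws samples from a population for which H0 is true and applies a z-test for the mean (known σ) at α = 5%; a correctly calibrated test should reject in roughly 5% of the samples.

```python
import math
import random

random.seed(42)  # fixed seed so the run is reproducible

def z_test_rejects(n=30, z_crit=1.96):
    """Draw one sample of size n from N(0, 1) and test H0: mu = 0
    with known sigma = 1 at the 5% significance level (bilateral)."""
    sample = [random.gauss(0.0, 1.0) for _ in range(n)]
    z = (sum(sample) / n) / (1.0 / math.sqrt(n))
    return abs(z) > z_crit

trials = 20000
rate = sum(z_test_rejects() for _ in range(trials)) / trials
print(rate)  # close to 0.05: the empirical type I error rate matches alpha
```

The same setup, with data generated under a false H0, would estimate β (and the power 1 − β) of the test.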
If we use the P-value instead of the statistic's critical value, Steps 5 and 6 of the construction of the hypotheses tests become:

Step 5: Determine the P-value that corresponds to the probability associated with the value of the statistic calculated in Step 4.
Step 6: Decision: if the P-value is less than the significance level α established in Step 3, reject H0. Otherwise, do not reject H0.
9.2 PARAMETRIC TESTS
Hypotheses tests are divided into parametric and nonparametric tests. In this chapter, we will study parametric tests; nonparametric tests will be studied in the next chapter.

Parametric tests involve population parameters. A parameter is any numerical measure or quantitative characteristic that describes a population. Parameters are fixed values, usually unknown, and are represented by Greek characters, such as the population mean (μ), the population standard deviation (σ), and the population variance (σ²), among others. When hypotheses are formulated about population parameters, the hypothesis test is called parametric. In nonparametric tests, hypotheses are formulated about qualitative characteristics of the population. Parametric methods are therefore applied to quantitative data and require strong assumptions in order to be valid, including:

(i) The observations must be independent;
(ii) The sample must be drawn from a population with a certain distribution, usually normal;
(iii) The populations must have equal variances for tests comparing two paired population means or k population means (k ≥ 3);
(iv) The variables being studied must be measured on an interval or ratio scale, so that arithmetic operations can be applied to their values.

We will study the main parametric tests, including tests for normality, tests for the homogeneity of variances, Student's t-test and its applications, and the analysis of variance (ANOVA) and its extensions. All of them will be solved analytically and also through the statistical software packages SPSS and Stata. To verify the univariate normality of the data, the most commonly used tests are the Kolmogorov-Smirnov and Shapiro-Wilk tests. To compare the homogeneity of variances between populations, we have Bartlett's χ² (1937), Cochran's C (1947a,b), Hartley's Fmax (1950), and Levene's F (1960) tests.
We will describe Student's t-test for three situations: testing hypotheses about a single population mean, comparing two independent means, and comparing two paired means. ANOVA is an extension of Student's t-test and is used to compare the means of more than two populations. In this chapter, one-way ANOVA, two-way ANOVA, and its extension to more than two factors will be described.
9.3 UNIVARIATE TESTS FOR NORMALITY
Among all univariate tests for normality, the most common are Kolmogorov-Smirnov, Shapiro-Wilk, and Shapiro-Francia.
9.3.1 Kolmogorov-Smirnov Test
The Kolmogorov-Smirnov test (K-S) is an adherence (goodness-of-fit) test, that is, it compares the cumulative frequency distribution of a set of sample values (observed values) to a theoretical distribution. The main goal is to test whether the sample values come from a population with a supposed theoretical or expected distribution, in this case, the normal distribution. The statistic is given by the point with the largest difference (in absolute value) between the two distributions. To use the K-S test, the population mean and standard deviation must be known. The test loses power for small samples, so it should be used with large samples (n ≥ 30). The K-S test assumes the following hypotheses:

H0: the sample comes from a population with distribution N(μ, σ)
H1: the sample does not come from a population with distribution N(μ, σ)
As specified in Fávero et al. (2009), let Fexp(X) be the expected (normal) distribution function of cumulative relative frequencies of variable X, where Fexp(X) ~ N(μ, σ), and Fobs(X) the observed cumulative relative frequency distribution of variable X. The objective is to test whether Fobs(X) = Fexp(X), in contrast with the alternative that Fobs(X) ≠ Fexp(X). The statistic can be calculated through the following expression:

Dcal = max{|Fexp(Xi) − Fobs(Xi)|, |Fexp(Xi) − Fobs(Xi−1)|}, for i = 1, …, n   (9.7)

where:
Fexp(Xi): expected cumulative relative frequency in category i;
Fobs(Xi): observed cumulative relative frequency in category i;
Fobs(Xi−1): observed cumulative relative frequency in category i − 1.

The critical values of the Kolmogorov-Smirnov statistic (Dc) are shown in Table G in the Appendix. This table provides the critical values of Dc considering that P(Dcal > Dc) = α (for a right-tailed test). In order for the null hypothesis H0 to be rejected, the value of the Dcal statistic must be in the critical region, that is, Dcal > Dc. Otherwise, we do not reject H0.

The P-value (the probability associated with the value of the Dcal statistic calculated from the sample) can also be obtained from Table G. In this case, we reject H0 if P ≤ α.

Example 9.1: Using the Kolmogorov-Smirnov Test
Table 9.E.1 shows the data on a company's monthly production of farming equipment in the last 36 months. Check whether the data in Table 9.E.1 come from a population that follows a normal distribution, considering α = 5%.
TABLE 9.E.1 Production of Farming Equipment in the Last 36 Months

52  50  44  50  42  30
36  34  48  40  55  40
30  36  40  42  55  44
38  42  40  38  52  44
52  34  38  44  48  36
36  55  50  34  44  42
Solution
Step 1: Since the objective is to verify whether the data in Table 9.E.1 come from a population with a normal distribution, the most suitable test is the Kolmogorov-Smirnov (K-S) test.
Step 2: The K-S test hypotheses for this example are:
H0: the production of farming equipment in the population follows distribution N(μ, σ)
H1: the production of farming equipment in the population does not follow distribution N(μ, σ)
Step 3: The significance level to be considered is 5%.
Step 4: All the steps necessary to calculate Dcal from Expression (9.7) are specified in Table 9.E.2.
TABLE 9.E.2 Calculating the Kolmogorov-Smirnov Statistic

Xi   Fabs(a)  Fac(b)  Fracobs(c)  Zi(d)     Fracexp(e)  |Fexp(Xi) − Fobs(Xi)|  |Fexp(Xi) − Fobs(Xi−1)|
30   2        2       0.056       −1.7801   0.0375      0.018                  0.036
34   3        5       0.139       −1.2168   0.1118      0.027                  0.056
36   4        9       0.250       −0.9351   0.1743      0.076                  0.035
38   3        12      0.333       −0.6534   0.2567      0.077                  0.007
40   4        16      0.444       −0.3717   0.3551      0.089                  0.022
42   4        20      0.556       −0.0900   0.4641      0.092                  0.020
44   5        25      0.694       0.1917    0.5760      0.118                  0.020
48   2        27      0.750       0.7551    0.7749      0.025                  0.081
50   3        30      0.833       1.0368    0.8501      0.017                  0.100
52   3        33      0.917       1.3185    0.9064      0.010                  0.073
55   3        36      1.000       1.7410    0.9592      0.041                  0.043

(a) Absolute frequency.
(b) Cumulative (absolute) frequency.
(c) Observed cumulative relative frequency of Xi.
(d) Standardized Xi values according to the expression Zi = (Xi − X̄)/S_X.
(e) Expected cumulative relative frequency of Xi; it corresponds to the probability obtained in Table E in the Appendix (standard normal distribution table) from the value of Zi.
Therefore, the value of the K-S statistic based on the sample is Dcal = 0.118.
Step 5: According to Table G in the Appendix, for n = 36 and α = 5%, the critical value of the Kolmogorov-Smirnov statistic is Dc = 0.23.
Step 6: Decision: since the calculated value is not in the critical region (Dcal < Dc), the null hypothesis is not rejected, which allows us to conclude, with a 95% confidence level, that the sample is drawn from a population that follows a normal distribution.
If we use the P-value instead of the statistic's critical value, Steps 5 and 6 become:
Step 5: According to Table G in the Appendix, for a sample size n = 36, the probability associated with Dcal = 0.118 has as its lowest limit P = 0.20.
Step 6: Decision: since P > 0.05, we do not reject H0.
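Example 9.1 can also be reproduced programmatically. The Python sketch below (illustrative code, not part of the original text) implements Expression (9.7) directly, obtaining the expected cumulative frequencies from the standard normal CDF via the error function rather than from Table E:

```python
import math
from collections import Counter

def phi(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def ks_normal(sample):
    """K-S statistic against N(mean, sd) estimated from the sample,
    following Expression (9.7): at each distinct value, compare the
    expected cumulative frequency with the observed one and with the
    observed frequency of the previous category."""
    n = len(sample)
    mean = sum(sample) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
    freq = Counter(sample)
    d, f_prev, cum = 0.0, 0.0, 0
    for x in sorted(freq):
        cum += freq[x]
        f_obs = cum / n
        f_exp = phi((x - mean) / sd)
        d = max(d, abs(f_exp - f_obs), abs(f_exp - f_prev))
        f_prev = f_obs
    return d

production = [52, 50, 44, 50, 42, 30, 36, 34, 48, 40, 55, 40,
              30, 36, 40, 42, 55, 44, 38, 42, 40, 38, 52, 44,
              52, 34, 38, 44, 48, 36, 36, 55, 50, 34, 44, 42]
print(round(ks_normal(production), 3))  # ~0.118, below Dc = 0.23 at alpha = 5%
```

The largest difference occurs at Xi = 44, matching the 0.118 found by hand in Table 9.E.2.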
9.3.2 Shapiro-Wilk Test
The Shapiro-Wilk test (S-W), based on Shapiro and Wilk (1965), can be applied to samples with 4 ≤ n ≤ 2000 observations, and is an alternative to the Kolmogorov-Smirnov test for normality (K-S) in the case of small samples (n < 30). Analogous to the K-S test, the S-W test for normality assumes the following hypotheses:

H0: the sample comes from a population with distribution N(μ, σ)
H1: the sample does not come from a population with distribution N(μ, σ)

The Shapiro-Wilk statistic (Wcal) is calculated as:

Wcal = b² / Σ_{i=1}^{n} (Xi − X̄)², for i = 1, …, n   (9.8)

b = Σ_{i=1}^{n/2} a_{i,n} · (X(n−i+1) − X(i))   (9.9)

where:
X(i) are the sample statistics of order i, that is, the i-th ordered observation, so X(1) ≤ X(2) ≤ … ≤ X(n);
X̄ is the mean of X;
a_{i,n} are constants generated from the means, variances, and covariances of the statistics of order i of a random sample of size n from a normal distribution. Their values can be seen in Table H2 in the Appendix.

Small values of Wcal indicate that the distribution of the variable being studied is not normal. The critical values of the Shapiro-Wilk statistic (Wc) are shown in Table H1 in the Appendix. Unlike most tables, this table provides the critical values of Wc considering that P(Wcal < Wc) = α (for a left-tailed test). In order for the null hypothesis H0 to be rejected, the value of the Wcal statistic must be in the critical region, that is, Wcal < Wc. Otherwise, we do not reject H0.

The P-value (the probability associated with the value of the Wcal statistic calculated from the sample) can also be obtained from Table H1. In this case, we reject H0 if P ≤ α.
Example 9.2: Using the Shapiro-Wilk Test
Table 9.E.3 shows the data on an aerospace company's monthly production of aircraft in the last 24 months. Check whether the data in Table 9.E.3 come from a population with a normal distribution, considering α = 1%.
TABLE 9.E.3 Production of Aircraft in the Last 24 Months

28  32  46  24  22  18  20  34
30  24  31  29  15  19  23  25
28  30  32  36  39  16  23  36
Solution
Step 1: For a normality test in which n < 30, the most recommended test is the Shapiro-Wilk (S-W) test.
Step 2: The S-W test hypotheses for this example are:
H0: the production of aircraft in the population follows normal distribution N(μ, σ)
H1: the production of aircraft in the population does not follow normal distribution N(μ, σ)
Step 3: The significance level to be considered is 1%.
Step 4: The calculation of the S-W statistic for the data in Table 9.E.3, according to Expressions (9.8) and (9.9), is shown next. First, to calculate b, we must sort the data in Table 9.E.3 in ascending order, as shown in Table 9.E.4. All the steps necessary to calculate b from Expression (9.9) are specified in Table 9.E.5. The values of a_{i,n} were obtained from Table H2 in the Appendix.
TABLE 9.E.4 Values From Table 9.E.3 Sorted in Ascending Order

15  16  18  19  20  22  23  23
24  24  25  28  28  29  30  30
31  32  32  34  36  36  39  46
TABLE 9.E.5 Procedure to Calculate b

i    n − i + 1  a_{i,n}  X(n−i+1)  X(i)  a_{i,n} · (X(n−i+1) − X(i))
1    24         0.4493   46        15    13.9283
2    23         0.3098   39        16    7.1254
3    22         0.2554   36        18    4.5972
4    21         0.2145   36        19    3.6465
5    20         0.1807   34        20    2.5298
6    19         0.1512   32        22    1.5120
7    18         0.1245   32        23    1.1205
8    17         0.0997   31        23    0.7976
9    16         0.0764   30        24    0.4584
10   15         0.0539   30        24    0.3234
11   14         0.0321   29        25    0.1284
12   13         0.0107   28        28    0.0000
                                         b = 36.1675
We have

Σ_{i=1}^{n} (Xi − X̄)² = (28 − 27.5)² + ⋯ + (36 − 27.5)² = 1338

Therefore,

Wcal = b² / Σ_{i=1}^{n} (Xi − X̄)² = (36.1675)² / 1338 = 0.978
Step 5: According to Table H1 in the Appendix, for n = 24 and α = 1%, the critical value of the Shapiro-Wilk statistic is Wc = 0.884.
Step 6: Decision: the null hypothesis is not rejected, since Wcal > Wc (Table H1 provides the critical values of Wc considering that P(Wcal < Wc) = α), which allows us to conclude, with a 99% confidence level, that the sample is drawn from a population with a normal distribution.
If we use the P-value instead of the statistic's critical value, Steps 5 and 6 become:
Step 5: According to Table H1 in the Appendix, for a sample size n = 24, the probability associated with Wcal = 0.978 is between 0.50 and 0.90 (a probability of 0.90 is associated with Wcal = 0.981).
Step 6: Decision: since P > 0.01, we do not reject H0.
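The hand calculation of Expressions (9.8) and (9.9) for Example 9.2 takes only a few lines of code. The Python sketch below is illustrative (not part of the original text) and assumes the a_{i,n} coefficients for n = 24 transcribed from Table 9.E.5:

```python
# a_{i,n} coefficients for n = 24 (from Table H2, as listed in Table 9.E.5)
a = [0.4493, 0.3098, 0.2554, 0.2145, 0.1807, 0.1512,
     0.1245, 0.0997, 0.0764, 0.0539, 0.0321, 0.0107]
aircraft = [28, 32, 46, 24, 22, 18, 20, 34, 30, 24, 31, 29,
            15, 19, 23, 25, 28, 30, 32, 36, 39, 16, 23, 36]

x = sorted(aircraft)
n = len(x)
# Expression (9.9): b = sum of a_i * (X_(n-i+1) - X_(i)) over i = 1..n/2
b = sum(a[i] * (x[n - 1 - i] - x[i]) for i in range(n // 2))
mean = sum(x) / n
ss = sum((v - mean) ** 2 for v in x)  # sum of squared deviations = 1338
w = b ** 2 / ss                       # Expression (9.8)
print(round(b, 4), round(w, 3))       # 36.1675  0.978
```

Since 0.978 is above the critical value Wc = 0.884, the code reaches the same decision as the analytical solution.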
9.3.3 Shapiro-Francia Test
This test is based on Shapiro and Francia (1972). According to Sarkadi (1975), the Shapiro-Wilk (S-W) and Shapiro-Francia (S-F) tests have the same format, differing only in how the coefficients are defined. Moreover, calculating the S-F statistic is much simpler, and the test can be considered a simplified version of the S-W test. Despite its simplicity, it is as robust as the Shapiro-Wilk test, making it a substitute for the S-W. The Shapiro-Francia test can be applied to samples with 5 ≤ n ≤ 5000 observations, and it is similar to the Shapiro-Wilk test for large samples. Analogous to the S-W test, the S-F test assumes the following hypotheses:

H0: the sample comes from a population with distribution N(μ, σ)
H1: the sample does not come from a population with distribution N(μ, σ)

The Shapiro-Francia statistic (W′cal) is calculated as:

W′cal = [Σ_{i=1}^{n} mi · X(i)]² / ([Σ_{i=1}^{n} mi²] · [Σ_{i=1}^{n} (Xi − X̄)²]), for i = 1, …, n   (9.10)

where:
X(i) are the sample statistics of order i, that is, the i-th ordered observation, so X(1) ≤ X(2) ≤ … ≤ X(n);
mi is the approximate expected value of the i-th order statistic (Z-score). The values of mi are estimated by:

mi = Φ⁻¹(i / (n + 1))   (9.11)

where Φ⁻¹ corresponds to the inverse of the standard normal distribution function, with mean zero and standard deviation 1. These values can be obtained from Table E in the Appendix.

Small values of W′cal indicate that the distribution of the variable being studied is not normal. The critical values of the Shapiro-Francia statistic (W′c) are shown in Table H1 in the Appendix. Unlike most tables, this table provides the critical values of W′c considering that P(W′cal < W′c) = α (for a left-tailed test). In order for the null hypothesis H0 to be rejected, the value of the W′cal statistic must be in the critical region, that is, W′cal < W′c. Otherwise, we do not reject H0.

The P-value (the probability associated with the W′cal statistic calculated from the sample) can also be obtained from Table H1. In this case, we reject H0 if P ≤ α.

Example 9.3: Using the Shapiro-Francia Test
Table 9.E.6 shows the data regarding a company's daily production of bicycles in the last 60 months. Check whether the data come from a population with a normal distribution, considering α = 5%.

Solution
Step 1: The normality of the data can be verified through the Shapiro-Francia test.
Step 2: The S-F test hypotheses for this example are:
H0: the production of bicycles in the population follows normal distribution N(μ, σ)
H1: the production of bicycles in the population does not follow normal distribution N(μ, σ)
Step 3: The significance level to be considered is 5%.
TABLE 9.E.6 Production of Bicycles in the Last 60 Months

85  70  74  49  67  88
80  91  57  63  66  60
72  81  73  80  55  54
93  77  80  64  60  63
67  54  59  78  73  84
91  57  59  64  68  67
70  76  78  75  80  81
70  77  65  63  59  60
61  74  76  81  79  78
60  68  76  71  72  84
Step 4: The procedure to calculate the S-F statistic for the data in Table 9.E.6 is shown in Table 9.E.7.
Therefore, W′cal = (574.6704)² / (53.1904 × 6278.8500) = 0.989
TABLE 9.E.7 Procedure to Calculate the Shapiro-Francia Statistic

i     X(i)  i/(n + 1)  mi        mi · X(i)   mi²      (Xi − X̄)²
1     49    0.0164     −2.1347   −104.5995   4.5569   481.8025
2     54    0.0328     −1.8413   −99.4316    3.3905   287.3025
3     54    0.0492     −1.6529   −89.2541    2.7319   287.3025
4     55    0.0656     −1.5096   −83.0276    2.2789   254.4025
5     57    0.0820     −1.3920   −79.3417    1.9376   194.6025
6     57    0.0984     −1.2909   −73.5841    1.6665   194.6025
7     59    0.1148     −1.2016   −70.8960    1.4439   142.8025
8     59    0.1311     −1.1210   −66.1380    1.2566   142.8025
…
60    93    0.9836     2.1347    198.5256    4.5569   486.2025
Sum                              574.6704    53.1904  6278.8500
Step 5: According to Table H1 in the Appendix, for n = 60 and α = 5%, the critical value of the Shapiro-Francia statistic is W′c = 0.9625.
Step 6: Decision: the null hypothesis is not rejected because W′cal > W′c (Table H1 provides the critical values of W′c considering that P(W′cal < W′c) = α), which allows us to conclude, with a 95% confidence level, that the sample is drawn from a population that follows a normal distribution.
If we used the P-value instead of the statistic's critical value, Steps 5 and 6 would be:
Step 5: According to Table H1 in the Appendix, for a sample size n = 60, the probability associated with W′cal = 0.989 is greater than 0.10 (P-value).
Step 6: Decision: since P > 0.05, we do not reject H0.
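As a cross-check of Example 9.3, Expression (9.10) can be computed directly. The Python sketch below (illustrative, not from the original text) obtains mi = Φ⁻¹(i/(n + 1)) numerically by bisection instead of reading Table E:

```python
import math

def phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def phi_inv(p, lo=-10.0, hi=10.0):
    """Inverse standard normal CDF by bisection (ample precision here)."""
    for _ in range(80):
        mid = (lo + hi) / 2.0
        if phi(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def shapiro_francia(sample):
    """Expression (9.10): W' = (sum mi X(i))^2 / (sum mi^2 * sum (Xi - mean)^2)."""
    x = sorted(sample)
    n = len(x)
    m = [phi_inv(i / (n + 1)) for i in range(1, n + 1)]
    num = sum(mi * xi for mi, xi in zip(m, x)) ** 2
    mean = sum(x) / n
    den = sum(mi ** 2 for mi in m) * sum((v - mean) ** 2 for v in x)
    return num / den

bicycles = [85, 70, 74, 49, 67, 88, 80, 91, 57, 63, 66, 60,
            72, 81, 73, 80, 55, 54, 93, 77, 80, 64, 60, 63,
            67, 54, 59, 78, 73, 84, 91, 57, 59, 64, 68, 67,
            70, 76, 78, 75, 80, 81, 70, 77, 65, 63, 59, 60,
            61, 74, 76, 81, 79, 78, 60, 68, 76, 71, 72, 84]
w_sf = shapiro_francia(bicycles)
print(round(w_sf, 3))  # ~0.989, above W'c = 0.9625, so H0 is not rejected
```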
9.3.4 Solving Tests for Normality by Using SPSS Software
The Kolmogorov-Smirnov and Shapiro-Wilk tests for normality can be solved by using IBM SPSS Statistics Software. The Shapiro-Francia test, on the other hand, will be elaborated through the Stata software, as we will see in the next section. Based on the procedure described here, SPSS shows the results of the K-S and S-W tests for the selected sample. The use of the images in this section has been authorized by the International Business Machines Corporation©.

Let's consider the data presented in Example 9.1, which are available in the file Production_FarmingEquipment.sav. Let's open the file and select Analyze → Descriptive Statistics → Explore …, as shown in Fig. 9.4. In the Explore dialog box, we must select the variable of interest in the Dependent List, as shown in Fig. 9.5. Let's click on Plots … (the Explore: Plots dialog box will open) and select the option Normality plots with tests (Fig. 9.6). Finally, let's click on Continue and on OK.
Hypotheses Tests Chapter
9
207
FIG. 9.4 Procedure for elaborating a univariate normality test on SPSS for Example 9.1.
FIG. 9.5 Selecting the variable of interest.
The results of the Kolmogorov-Smirnov and Shapiro-Wilk tests for normality for the data in Example 9.1 are shown in Fig. 9.7. According to Fig. 9.7, the result of the K-S statistic was 0.118, similar to the value calculated in Example 9.1. Since the sample has more than 30 elements, we should only use the K-S test to verify the normality of the data (the S-W test was applied to Example 9.2). Nevertheless, SPSS also makes the result of the S-W statistic available for the sample selected.
208
PART
IV Statistical Inference
FIG. 9.6 Selecting the normality test on SPSS.
FIG. 9.7 Results of the tests for normality for Example 9.1 on SPSS.
FIG. 9.8 Results of the tests for normality for Example 9.2 on SPSS.
As presented in the introduction of this chapter, SPSS calculates the P-value, which corresponds to the lowest significance level observed that would lead to the rejection of the null hypothesis. For the K-S and S-W tests, the P-value corresponds to the lowest value of P from which Dcal > Dc and Wcal < Wc. As shown in Fig. 9.7, the value of P for the K-S test was 0.200 (this probability can also be obtained from Table G in the Appendix, as shown in Example 9.1). Since P > 0.05, we do not reject the null hypothesis, which allows us to conclude, with a 95% confidence level, that the data distribution is normal. The S-W test also allows us to conclude that the data follow a normal distribution.

Applying the same procedure to verify the normality of the data in Example 9.2 (the data are available in the file Production_Aircraft.sav), we get the results shown in Fig. 9.8. Analogous to Example 9.2, the result of the S-W test was 0.978. The K-S test was not applied to this example due to the sample size (n < 30). The P-value of the S-W test is 0.857 (in Example 9.2, we saw that this probability would be between 0.50 and 0.90
and closer to 0.90) and, since P > 0.01, the null hypothesis is not rejected, which allows us to conclude that the data distribution in the population follows a normal distribution. We will use this test when estimating regression models in Chapter 13. For this example, we can also conclude from the K-S test that the data distribution follows a normal distribution.
9.3.5 Solving Tests for Normality by Using Stata
The Kolmogorov-Smirnov, Shapiro-Wilk, and Shapiro-Francia tests for normality can be solved by using Stata Statistical Software. The Kolmogorov-Smirnov test will be applied to Example 9.1, the Shapiro-Wilk test to Example 9.2, and the Shapiro-Francia test to Example 9.3. The use of the images in this section has been authorized by StataCorp LP©.
9.3.5.1 Kolmogorov-Smirnov Test on the Stata Software

The data presented in Example 9.1 are available in the file Production_FarmingEquipment.dta. Let's open this file and verify that the name of the variable being studied is production. To elaborate the Kolmogorov-Smirnov test on Stata, we must specify the mean and the standard deviation of the variable of interest in the test syntax, so the command summarize, or simply sum, must be typed first, followed by the respective variable:

sum production
and we get Fig. 9.9. Therefore, we can see that the mean is 42.63889 and the standard deviation is 7.099911. The Kolmogorov-Smirnov test is given by the following command: ksmirnov production = normal((production-42.63889)/7.099911)
The result of the test can be seen in Fig. 9.10. We can see that the value of the statistic is similar to the one calculated in Example 9.1 and by SPSS software. Since P > 0.05, we conclude that the data distribution is normal.
9.3.5.2 Shapiro-Wilk Test on the Stata Software

The data presented in Example 9.2 are available in the file Production_Aircraft.dta. To elaborate the Shapiro-Wilk test on Stata, the syntax of the command is:

swilk variables*
where the term variables* should be substituted for the list of variables being considered. For the data in Example 9.2, we have a single variable called production, so, the command to be typed is: swilk production FIG. 9.9 Descriptive statistics of the variable production.
FIG. 9.10 Results of the Kolmogorov-Smirnov test on Stata.
210
PART
IV Statistical Inference
FIG. 9.11 Results of the Shapiro-Wilk test for Example 9.2 on Stata.
FIG. 9.12 Results of the Shapiro-Francia test for Example 9.3 on Stata.
The result of the Shapiro-Wilk test can be seen in Fig. 9.11. Since P > 0.05, we can conclude that the sample comes from a population with a normal distribution.
9.3.5.3 Shapiro-Francia Test on the Stata Software

The data presented in Example 9.3 are available in the file Production_Bicycles.dta. To elaborate the Shapiro-Francia test on Stata, the syntax of the command is:

sfrancia variables*
where the term variables* should be substituted for the list of variables being considered. For the data in Example 9.3, we have a single variable called production, so, the command to be typed is: sfrancia production
The result of the Shapiro-Francia test can be seen in Fig. 9.12. We can see that the value is similar to the one calculated in Example 9.3 (W′ = 0.989). Since P > 0.05, we conclude that the sample comes from a population with a normal distribution. We will use this test when estimating regression models in Chapter 13.
9.4 TESTS FOR THE HOMOGENEITY OF VARIANCES
One of the conditions for applying a parametric test to compare k population means is that the population variances, estimated from k representative samples, be homogeneous (equal). The most common tests to verify the homogeneity of variances are Bartlett's χ² (1937), Cochran's C (1947a,b), Hartley's Fmax (1950), and Levene's F (1960) tests.

The null hypothesis of a variance homogeneity test states that the variances of the k populations are homogeneous. The alternative hypothesis states that at least one population variance is different from the others. That is:

H0: σ1² = σ2² = … = σk²
H1: ∃ i, j: σi² ≠ σj² (i, j = 1, …, k)   (9.12)

9.4.1 Bartlett's χ² Test
The original test proposed to verify the homogeneity of variances among groups is Bartlett's χ² test (1937). This test is very sensitive to deviations from normality; Levene's test is an alternative in that case. Bartlett's statistic is calculated from q:

q = (N − k) · ln(Sp²) − Σ_{i=1}^{k} (ni − 1) · ln(Si²)   (9.13)

where:
ni, i = 1, …, k, is the size of each sample i, and Σ_{i=1}^{k} ni = N;
Si², i = 1, …, k, is the variance of each sample i;

and

Sp² = Σ_{i=1}^{k} (ni − 1) · Si² / (N − k)   (9.14)

A correction factor c is applied to the q statistic, given by the following expression:

c = 1 + [1 / (3(k − 1))] · (Σ_{i=1}^{k} 1/(ni − 1) − 1/(N − k))   (9.15)

Bartlett's statistic (Bcal) approximately follows a chi-square distribution with k − 1 degrees of freedom:

Bcal = q / c ~ χ²_{k−1}   (9.16)

From the previous expressions, we can see that the larger the difference between the variances, the larger the value of Bcal. On the other hand, if all the sample variances are equal, its value will be zero. To decide whether the null hypothesis of variance homogeneity is rejected, the calculated value must be compared to the statistic's critical value (χ²c), which is available in Table D in the Appendix. This table provides the critical values of χ²c considering that P(χ²cal > χ²c) = α (for a right-tailed test). Therefore, we reject the null hypothesis if Bcal > χ²c. On the other hand, if Bcal ≤ χ²c, we do not reject H0.

The P-value (the probability associated with the χ²cal statistic) can also be obtained from Table D. In this case, we reject H0 if P ≤ α.

Example 9.4: Applying Bartlett's χ² Test
A chain of supermarkets wishes to study the number of customers it serves every day in order to make strategic operational decisions. Table 9.E.8 shows the data of three stores throughout two weeks. Check whether the variances between the groups are homogeneous. Consider α = 5%.
TABLE 9.E.8 Number of Customers Served Per Day and Per Store

                    Store 1    Store 2    Store 3
Day 1               620        710        924
Day 2               630        780        695
Day 3               610        810        854
Day 4               650        755        802
Day 5               585        699        931
Day 6               590        680        924
Day 7               630        710        847
Day 8               644        850        800
Day 9               595        844        769
Day 10              603        730        863
Day 11              570        645        901
Day 12              605        688        888
Day 13              622        718        757
Day 14              578        702        712
Standard deviation  24.4059    62.2466    78.9144
Variance            595.6484   3874.6429  6227.4780
Solution
If we apply the Kolmogorov-Smirnov or the Shapiro-Wilk test for normality to the data in Table 9.E.8, we will verify that their distribution shows adherence to normality, with a 5% significance level, so Bartlett's χ² test can be applied to compare the homogeneity of the variances between the groups.
Step 1: Since the main goal is to compare the equality of the variances between the groups, we can use Bartlett's χ² test.
Step 2: Bartlett's χ² test hypotheses for this example are:
H0: the population variances of all three groups are homogeneous
H1: the population variance of at least one group is different from the others
Step 3: The significance level to be considered is 5%.
Step 4: The complete calculation of Bartlett's χ² statistic is shown. First, we calculate the value of S²p, according to Expression (9.14):

S²p = 13 · (595.65 + 3874.64 + 6227.48) / (42 − 3) = 3565.92

Thus, we can calculate q through Expression (9.13):

q = 39 · ln(3565.92) − 13 · [ln(595.65) + ln(3874.64) + ln(6227.48)] = 14.94

The correction factor c for the q statistic, calculated from Expression (9.15) with k = 3 groups of 14 observations each, is c = 1.0256. Finally, we calculate Bcal:

Bcal = q / c = 14.94 / 1.0256 = 14.567

Step 5: According to Table D in the Appendix, for ν = 3 − 1 = 2 degrees of freedom and α = 5%, the critical value of Bartlett's χ² test is χ²c = 5.991.
Step 6: Decision: since the calculated value lies in the critical region (Bcal > χ²c), the null hypothesis is rejected, which allows us to conclude, with a 95% confidence level, that the population variance of at least one group is different from the others.
If we use the P-value instead of the statistic's critical value, Steps 5 and 6 will be:
Step 5: According to Table D in the Appendix, for ν = 2 degrees of freedom, the probability associated to χ²cal = 14.567 is less than 0.005 (a probability of 0.005 is associated to χ²cal = 10.597).
Step 6: Decision: since P < 0.05, we reject H0.
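For readers who prefer to check this result computationally, scipy implements Bartlett's test directly. A minimal sketch (the variable names are illustrative; note that scipy uses the standard textbook form of the correction factor c, so its statistic may differ slightly in the final decimals from the hand calculation above):

```python
from scipy import stats

# Daily customers served per store (Table 9.E.8)
store1 = [620, 630, 610, 650, 585, 590, 630, 644, 595, 603, 570, 605, 622, 578]
store2 = [710, 780, 810, 755, 699, 680, 710, 850, 844, 730, 645, 688, 718, 702]
store3 = [924, 695, 854, 802, 931, 924, 847, 800, 769, 863, 901, 888, 757, 712]

# Bartlett's test: H0 states that the population variances are homogeneous
b_stat, p_value = stats.bartlett(store1, store2, store3)
print(f"Bcal = {b_stat:.3f}, P-value = {p_value:.4f}")

# Critical value for chi-square with k - 1 = 2 degrees of freedom at alpha = 5%
chi2_crit = stats.chi2.ppf(0.95, df=2)  # about 5.991
print("Reject H0:", b_stat > chi2_crit)
```

Either way of deciding, by critical value or by P-value, leads to the same rejection of H0 for these data.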
9.4.2
Cochran’s C Test
Cochran's C test (1947a,b) compares the group with the highest variance in relation to the others. The test demands that the data have a normal distribution. Cochran's C statistic is given by:

Ccal = S²max / (S²1 + S²2 + ⋯ + S²k)    (9.17)

where:
S²max is the highest variance in the sample;
S²i is the variance in sample i, i = 1, …, k.
According to Expression (9.17), if all the variances are equal, the value of the Ccal statistic is 1/k. The greater the difference of S²max in relation to the other variances, the closer the value of Ccal gets to 1. To confirm whether the null hypothesis will be rejected or not, the calculated value must be compared to Cochran's statistic's critical value (Cc), which is available in Table M in the Appendix.
The values of Cc vary depending on the number of groups (k), the number of degrees of freedom ν = max(ni − 1), and the value of α. Table M provides the critical values of Cc considering that P(Ccal > Cc) = α (for a right-tailed test). Thus, we reject H0 if Ccal > Cc. Otherwise, we do not reject H0.
Example 9.5: Applying Cochran's C Test
Use Cochran's C test for the data in Example 9.4. The main objective here is to compare the group with the highest variability in relation to the others.
Solution
Step 1: Since the objective is to compare the group with the highest variance (group 3; see Table 9.E.8) in relation to the others, Cochran's C test is the most recommended.
Step 2: Cochran's C test hypotheses for this example are:
H0: the population variance of group 3 is equal to the others
H1: the population variance of group 3 is different from the others
Step 3: The significance level to be considered is 5%.
Step 4: From Table 9.E.8, we can see that S²max = 6227.48. Therefore, the calculation of Cochran's C statistic is given by:

Ccal = S²max / Σ S²i = 6227.48 / (595.65 + 3874.64 + 6227.48) = 0.582

Step 5: According to Table M in the Appendix, for k = 3, ν = 13, and α = 5%, the critical value of Cochran's C statistic is Cc = 0.575.
Step 6: Decision: since the calculated value lies in the critical region (Ccal > Cc), the null hypothesis is rejected, which allows us to conclude, with a 95% confidence level, that the population variance of group 3 is different from the others.
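Cochran's C statistic is not available in scipy, but Expression (9.17) is simple enough to compute directly. A sketch of the calculation (variable names are illustrative; the critical value must still be looked up in a table such as Table M):

```python
import statistics

# Daily customers served per store (Table 9.E.8)
samples = [
    [620, 630, 610, 650, 585, 590, 630, 644, 595, 603, 570, 605, 622, 578],
    [710, 780, 810, 755, 699, 680, 710, 850, 844, 730, 645, 688, 718, 702],
    [924, 695, 854, 802, 931, 924, 847, 800, 769, 863, 901, 888, 757, 712],
]

# Sample variances (statistics.variance uses the n - 1 denominator)
variances = [statistics.variance(s) for s in samples]

# Cochran's C: largest variance divided by the sum of all variances
c_cal = max(variances) / sum(variances)
print(f"Ccal = {c_cal:.3f}")

# Critical value Cc = 0.575 (Table M, for k = 3, nu = 13, alpha = 5%)
print("Reject H0:", c_cal > 0.575)
```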
9.4.3
Hartley’s Fmax Test
Hartley's Fmax test (1950) has a statistic that represents the ratio between the group with the highest variance (S²max) and the group with the lowest variance (S²min):

Fmax,cal = S²max / S²min    (9.18)

The test assumes that the number of observations per group is equal (n1 = n2 = … = nk = n). If all the variances are equal, the value of Fmax will be 1. The greater the difference between S²max and S²min, the higher the value of Fmax. To confirm whether the null hypothesis of variance homogeneity will be rejected or not, the calculated value must be compared to the statistic's critical value (Fmax,c), which is available in Table N in the Appendix. The critical values vary depending on the number of groups (k), the number of degrees of freedom ν = n − 1, and the value of α, and this table provides the critical values of Fmax,c considering that P(Fmax,cal > Fmax,c) = α (for a right-tailed test). Therefore, we reject the null hypothesis H0 of variance homogeneity if Fmax,cal > Fmax,c. Otherwise, we do not reject H0. The P-value (the probability associated to the Fmax,cal statistic) can also be obtained from Table N in the Appendix. In this case, we reject H0 if P ≤ α.
Example 9.6: Applying Hartley's Fmax Test
Use Hartley's Fmax test for the data in Example 9.4. The goal here is to compare the group with the highest variability to the group with the lowest variability.
Solution
Step 1: Since the main objective is to compare the group with the highest variance (group 3; see Table 9.E.8) to the group with the lowest variance (group 1), Hartley's Fmax test is the most recommended.
Step 2: Hartley's Fmax test hypotheses for this example are:
H0: the population variance of group 3 is the same as group 1
H1: the population variance of group 3 is different from group 1
Step 3: The significance level to be considered is 5%.
Step 4: From Table 9.E.8, we can see that S²min = 595.65 and S²max = 6227.48. Therefore, the calculation of Hartley's Fmax statistic is given by:

Fmax,cal = S²max / S²min = 6227.48 / 595.65 = 10.45

Step 5: According to Table N in the Appendix, for k = 3, ν = 13, and α = 5%, the critical value of the test is Fmax,c = 3.953.
Step 6: Decision: since the calculated value lies in the critical region (Fmax,cal > Fmax,c), the null hypothesis is rejected, which allows us to conclude, with a 95% confidence level, that the population variance of group 3 is different from the population variance of group 1.
If we use the P-value instead of the statistic's critical value, Steps 5 and 6 will be:
Step 5: According to Table N in the Appendix, the probability associated to Fmax,cal = 10.45, for k = 3 and ν = 13, is less than 0.01.
Step 6: Decision: since P < 0.05, we reject H0.
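Expression (9.18) can likewise be checked with a few lines of Python. A sketch (variable names are illustrative; the critical value still comes from Table N):

```python
import statistics

# Daily customers served per store (Table 9.E.8)
samples = [
    [620, 630, 610, 650, 585, 590, 630, 644, 595, 603, 570, 605, 622, 578],
    [710, 780, 810, 755, 699, 680, 710, 850, 844, 730, 645, 688, 718, 702],
    [924, 695, 854, 802, 931, 924, 847, 800, 769, 863, 901, 888, 757, 712],
]

variances = [statistics.variance(s) for s in samples]

# Hartley's Fmax: ratio of the largest to the smallest sample variance
f_max = max(variances) / min(variances)
print(f"Fmax,cal = {f_max:.2f}")

# Critical value Fmax,c = 3.953 (Table N, for k = 3, nu = 13, alpha = 5%)
print("Reject H0:", f_max > 3.953)
```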
9.4.4
Levene’s F-Test
The advantage of Levene's F-test in relation to other homogeneity of variance tests is that it is less sensitive to deviations from normality, in addition to being considered a more robust test. Levene's statistic is given by Expression (9.19) and it approximately follows an F-distribution with ν1 = k − 1 and ν2 = N − k degrees of freedom, for a significance level α:

Fcal = [(N − k) / (k − 1)] · [Σi ni · (Z̄i − Z̄)²] / [Σi Σj (Zij − Z̄i)²] ≈ F(k−1, N−k), under H0    (9.19)

where:
ni is the size of each one of the k samples (i = 1, …, k);
N is the size of the global sample (N = n1 + n2 + ⋯ + nk);
Zij = |Xij − X̄i|, i = 1, …, k and j = 1, …, ni;
Xij is observation j in sample i;
X̄i is the mean of sample i;
Z̄i is the mean of Zij in sample i;
Z̄ is the mean of the Z̄i in the global sample.
An expansion of Levene's test can be found in Brown and Forsythe (1974). From the F-distribution table (Table A in the Appendix), we can determine the critical values of Levene's statistic (Fc = F(k−1, N−k, α)). Table A provides the critical values of Fc considering that P(Fcal > Fc) = α (right-tailed table). In order for the null hypothesis H0 to be rejected, the value of the statistic must be in the critical region, that is, Fcal > Fc. If Fcal ≤ Fc, we do not reject H0. The P-value (the probability associated to the Fcal statistic) can also be obtained from Table A. In this case, we reject H0 if P ≤ α.
Example 9.7: Applying Levene's Test
Elaborate Levene's test for the data in Example 9.4.
Solution
Step 1: Levene's test can be applied to check variance homogeneity between the groups, and it is more robust than the other tests.
Step 2: Levene’s test hypotheses for this example are: H0: the population variances of all three groups are homogeneous H1: the population variance of at least one group is different from the others Step 3: The significance level to be considered is 5%. Step 4: The calculation of the Fcal statistic, according to Expression (9.19), is shown.
TABLE 9.E.9 Calculating the Fcal Statistic

Group 1 (X̄1 = 609.429, Z̄1 = 20):

i    X1j    Z1j = |X1j − X̄1|    Z1j − Z̄1    (Z1j − Z̄1)²
1    620        10.571            −9.429        88.898
1    630        20.571             0.571         0.327
1    610         0.571           −19.429       377.469
1    650        40.571            20.571       423.184
1    585        24.429             4.429        19.612
1    590        19.429            −0.571         0.327
1    630        20.571             0.571         0.327
1    644        34.571            14.571       212.327
1    595        14.429            −5.571        31.041
1    603         6.429           −13.571       184.184
1    570        39.429            19.429       377.469
1    605         4.429           −15.571       242.469
1    622        12.571            −7.429        55.184
1    578        31.429            11.429       130.612
                                        Sum = 2143.429

Group 2 (X̄2 = 737.214, Z̄2 = 50.418):

i    X2j    Z2j = |X2j − X̄2|    Z2j − Z̄2    (Z2j − Z̄2)²
2    710        27.214           −23.204       538.429
2    780        42.786            −7.633        58.257
2    810        72.786            22.367       500.298
2    755        17.786           −32.633      1064.890
2    699        38.214           −12.204       148.940
2    680        57.214             6.796        46.185
2    710        27.214           −23.204       538.429
2    850       112.786            62.367      3889.686
2    844       106.786            56.367      3177.278
2    730         7.214           −43.204      1866.593
2    645        92.214            41.796      1746.899
2    688        49.214            −1.204         1.450
2    718        19.214           −31.204       973.695
2    702        35.214           −15.204       231.164
                                        Sum = 14,782.192

Group 3 (X̄3 = 833.36, Z̄3 = 66.449):

i    X3j    Z3j = |X3j − X̄3|    Z3j − Z̄3    (Z3j − Z̄3)²
3    924        90.643            24.194       585.344
3    695       138.357            71.908      5170.784
3    854        20.643           −45.806      2098.201
3    802        31.357           −35.092      1231.437
3    931        97.643            31.194       973.058
3    924        90.643            24.194       585.344
3    847        13.643           −52.806      2788.487
3    800        33.357           −33.092      1095.070
3    769        64.357            −2.092         4.376
3    863        29.643           −36.806      1354.691
3    901        67.643             1.194         1.425
3    888        54.643           −11.806       139.385
3    757        76.357             9.908        98.172
3    712       121.357            54.908      3014.906
                                        Sum = 19,140.678
Therefore, the calculation of Fcal is carried out as follows:

Fcal = [(42 − 3) / (3 − 1)] · [14 · (20 − 45.62)² + 14 · (50.418 − 45.62)² + 14 · (66.449 − 45.62)²] / (2143.429 + 14,782.192 + 19,140.678)

Fcal = 8.427

Step 5: According to Table A in the Appendix, for ν1 = 2, ν2 = 39, and α = 5%, the critical value of the test is Fc = 3.24.
Step 6: Decision: since the calculated value lies in the critical region (Fcal > Fc), the null hypothesis is rejected, which allows us to conclude, with a 95% confidence level, that the population variance of at least one group is different from the others.
If we use the P-value instead of the statistic's critical value, Steps 5 and 6 will be:
Step 5: According to Table A in the Appendix, for ν1 = 2 and ν2 = 39, the probability associated to Fcal = 8.427 is less than 0.01 (P-value).
Step 6: Decision: since P < 0.05, we reject H0.
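The same result can be reproduced with scipy, whose levene function with center='mean' implements the mean-based form of Expression (9.19). A sketch (variable names are illustrative):

```python
from scipy import stats

# Daily customers served per store (Table 9.E.8)
store1 = [620, 630, 610, 650, 585, 590, 630, 644, 595, 603, 570, 605, 622, 578]
store2 = [710, 780, 810, 755, 699, 680, 710, 850, 844, 730, 645, 688, 718, 702]
store3 = [924, 695, 854, 802, 931, 924, 847, 800, 769, 863, 901, 888, 757, 712]

# center='mean' gives the original Levene statistic;
# center='median' would give the Brown-Forsythe variant
f_stat, p_value = stats.levene(store1, store2, store3, center="mean")
print(f"Fcal = {f_stat:.3f}, P-value = {p_value:.4f}")

# Reject H0 at alpha = 5% if the P-value is below 0.05
print("Reject H0:", p_value < 0.05)
```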
9.4.5
Solving Levene’s Test by Using SPSS Software
The use of the images in this section has been authorized by the International Business Machines Corporation©. To test the variance homogeneity between the groups, SPSS uses Levene's test. The data presented in Example 9.4 are available in the file CustomerServices_Store.sav. In order to elaborate the test, we must click on Analyze → Descriptive Statistics → Explore …, as shown in Fig. 9.13. Let's include the variable Customer_services in the list of dependent variables (Dependent List) and the variable Store in the factor list (Factor List), as shown in Fig. 9.14. Next, we must click on Plots … and select the option Untransformed in Spread vs Level with Levene Test, as shown in Fig. 9.15. Finally, let's click on Continue and on OK. The result of Levene's test can also be obtained through the ANOVA test, by clicking on Analyze → Compare Means → One-Way ANOVA …. In Options …, we must select the option Homogeneity of variance test (Fig. 9.16).
FIG. 9.13 Procedure for elaborating Levene’s test on SPSS.
FIG. 9.14 Selecting the variables to elaborate Levene’s test on SPSS.
FIG. 9.15 Continuation of the procedure to elaborate Levene’s test on SPSS.
FIG. 9.16 Results of Levene’s test for Example 9.4 on SPSS.
The value of Levene’s statistic is 8.427, exactly the same as the one calculated previously. Since the significance level observed is 0.001, a value lower than 0.05, the test shows the rejection of the null hypothesis, which allows us to conclude, with a 95% confidence level, that the population variances are not homogeneous.
9.4.6
Solving Levene’s Test by Using the Stata Software
The use of the images in this section has been authorized by StataCorp LP©. Levene’s statistical test for equality of variances is calculated on Stata by using the command robvar (robust-test for equality of variances), which has the following syntax: robvar variable*, by(groups*)
in which the term variable* should be substituted for the quantitative variable studied and the term groups* by the categorical variable that represents them. Let’s open the file CustomerServices_Store.dta that contains the data of Example 9.7. The three groups are represented by the variable store and the number of customers served by the variable services. Therefore, the command to be typed is: robvar services, by(store)
The result of the test can be seen in Fig. 9.17. We can verify that the value of the statistic (8.427) is similar to the one calculated in Example 9.7 and to the one generated on SPSS, as well as the calculation of the probability associated to
FIG. 9.17 Results of Levene’s test for Example 9.7 on Stata.
the statistic (0.001). Since P < 0.05, the null hypothesis is rejected, which allows us to conclude, with a 95% confidence level, that the variances are not homogeneous.
9.5 HYPOTHESES TESTS REGARDING A POPULATION MEAN (μ) FROM ONE RANDOM SAMPLE
The main goal is to test whether a population mean assumes a certain value or not.
9.5.1 z Test When the Population Standard Deviation (σ) Is Known and the Distribution Is Normal
This test is applied when a random sample of size n is obtained from a population with a normal distribution, whose mean (μ) is unknown and whose standard deviation (σ) is known. If the distribution of the population is not known, it is necessary to work with large samples (n > 30), because the central limit theorem guarantees that, as the sample size grows, the sampling distribution of the mean gets closer and closer to a normal distribution. For a bilateral test, the hypotheses are:
H0: the sample comes from a population with a certain mean (μ = μ0)
H1: it challenges the null hypothesis (μ ≠ μ0)
The test statistic used here refers to the sample mean (X̄). In order for the sample mean to be compared to the value in the table, it must be standardized, so:

Zcal = (X̄ − μ0) / σX̄ ~ N(0, 1), where σX̄ = σ / √n    (9.20)
The critical values of the zc statistic are shown in Table E in the Appendix. This table provides the critical values of zc considering that P(Zcal > zc) = α (for a right-tailed test). For a bilateral test, we must consider P(Zcal > zc) = α/2, since P(Zcal < −zc) + P(Zcal > zc) = α. The null hypothesis H0 of a bilateral test is rejected if the value of the Zcal statistic lies in the critical region, that is, if Zcal < −zc or Zcal > zc. Otherwise, we do not reject H0. The unilateral probabilities associated to the Zcal statistic (P1) can also be obtained from Table E. For a unilateral test, we consider that P = P1. For a bilateral test, this probability must be doubled (P = 2P1). Therefore, for both tests, we reject H0 if P ≤ α.
Example 9.8: Applying the z Test to One Sample
A cereal manufacturer states that the average quantity of food fiber in each portion of its product is at least 4.2 g, with a standard deviation of 1 g. A health care agency wishes to verify whether this statement is true, and collects a random sample of 42 portions, in which the average quantity of food fiber is 3.9 g. With a significance level of 5%, is there evidence to reject the manufacturer's statement?
Solution
Step 1: The suitable test for a population mean with a known σ, considering a single sample of size n > 30 (normal distribution), is the z test.
Step 2: For this example, the z test hypotheses are:
H0: μ ≥ 4.2 g (information provided by the supplier)
H1: μ < 4.2 g
which corresponds to a left-tailed test.
Step 3: The significance level to be considered is 5%.
Step 4: The calculation of the Zcal statistic, according to Expression (9.20), is:

Zcal = (X̄ − μ0) / (σ/√n) = (3.9 − 4.2) / (1/√42) = −1.94

Step 5: According to Table E in the Appendix, for a left-tailed test with α = 5%, the critical value of the test is −zc = −1.645.
Step 6: Decision: since the calculated value lies in the critical region (Zcal < −1.645), the null hypothesis is rejected, which allows us to conclude, with a 95% confidence level, that the manufacturer's average quantity of food fiber is less than 4.2 g.
If, instead of comparing the calculated value to the critical value of the standard normal distribution, we use the calculation of the P-value, Steps 5 and 6 will be:
Step 5: According to Table E in the Appendix, for a left-tailed test, the probability associated to Zcal = −1.94 is 0.0262 (P-value).
Step 6: Decision: since P < 0.05, the null hypothesis is rejected, which allows us to conclude, with a 95% confidence level, that the manufacturer's average quantity of food fiber is less than 4.2 g.
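Since the z statistic depends only on the sample mean, σ, and n, the example can be checked with a short Python sketch (scipy's norm distribution supplies the normal quantiles and tail probabilities; variable names are illustrative):

```python
from math import sqrt
from scipy import stats

x_bar, mu0, sigma, n = 3.9, 4.2, 1.0, 42

# Standardized test statistic (Expression 9.20)
z_cal = (x_bar - mu0) / (sigma / sqrt(n))

# Left-tailed test: critical value at alpha = 5% and associated P-value
z_crit = stats.norm.ppf(0.05)      # about -1.645
p_value = stats.norm.cdf(z_cal)    # P(Z < z_cal)

print(f"z_cal = {z_cal:.2f}, z_crit = {z_crit:.3f}, P-value = {p_value:.4f}")
print("Reject H0:", z_cal < z_crit)
```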
9.5.2
Student’s t-Test When the Population Standard Deviation (s) Is Not Known
Student's t-test for one sample is applied when we do not know the population standard deviation (σ), so its value is estimated from the sample standard deviation (S). However, by substituting σ for S in Expression (9.20), the distribution of the variable will no longer be normal; it becomes a Student's t-distribution with n − 1 degrees of freedom. Analogous to the z test, Student's t-test for one sample assumes the following hypotheses for a bilateral test:
H0: μ = μ0
H1: μ ≠ μ0
And the calculation of the statistic becomes:

Tcal = (X̄ − μ0) / (S/√n) ~ t(n−1)    (9.21)
The calculated value must be compared to the value in Student's t-distribution table (Table B in the Appendix). This table provides the critical values of tc considering that P(Tcal > tc) = α (for a right-tailed test). For a bilateral test, we have P(Tcal < −tc) = α/2 = P(Tcal > tc), as shown in Fig. 9.18. Therefore, for a bilateral test, the null hypothesis is rejected if Tcal < −tc or Tcal > tc. If −tc ≤ Tcal ≤ tc, we do not reject H0.
FIG. 9.18 Nonrejection region (NR) and critical region (CR) of Student’s t-distribution for a bilateral test.
The unilateral probabilities associated to the Tcal statistic (P1) can also be obtained from Table B. For a unilateral test, we have P = P1. For a bilateral test, this probability must be doubled (P = 2P1). Therefore, for both tests, we reject H0 if P ≤ α.
Example 9.9: Applying Student's t-Test to One Sample
The average processing time of a task using a certain machine has been 18 min. New concepts have been implemented in order to reduce the average processing time. Hence, after a certain period of time, a sample with 25 elements was collected, and an average time of 16.808 min was measured, with a standard deviation of 2.733 min. Check whether this result represents an improvement in the average processing time. Consider α = 1%.
Solution
Step 1: The suitable test for a population mean with an unknown σ is Student's t-test.
Step 2: For this example, Student's t-test hypotheses are:
H0: μ = 18
H1: μ < 18
which corresponds to a left-tailed test.
Step 3: The significance level to be considered is 1%.
Step 4: The calculation of the Tcal statistic, according to Expression (9.21), is:

Tcal = (X̄ − μ0) / (S/√n) = (16.808 − 18) / (2.733/√25) = −2.18

Step 5: According to Table B in the Appendix, for a left-tailed test with 24 degrees of freedom and α = 1%, the critical value of the test is −tc = −2.492.
Step 6: Decision: since the calculated value is not in the critical region (Tcal > −2.492), the null hypothesis is not rejected, which allows us to conclude, with a 99% confidence level, that there was no improvement in the average processing time.
If, instead of comparing the calculated value to the critical value of Student's t-distribution, we use the calculation of the P-value, Steps 5 and 6 will be:
Step 5: According to Table B in the Appendix, for a left-tailed test with 24 degrees of freedom, the probability associated to Tcal = −2.18 is between 0.01 and 0.025 (P-value).
Step 6: Decision: since P > 0.01, we do not reject the null hypothesis.
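With only the summary statistics available, the same t statistic and its left-tail probability can be sketched with scipy's t distribution (variable names are illustrative):

```python
from math import sqrt
from scipy import stats

x_bar, mu0, s, n = 16.808, 18.0, 2.733, 25

# Test statistic (Expression 9.21), with n - 1 = 24 degrees of freedom
t_cal = (x_bar - mu0) / (s / sqrt(n))
p_value = stats.t.cdf(t_cal, df=n - 1)  # left-tailed P-value

print(f"t_cal = {t_cal:.3f}, P-value = {p_value:.4f}")
# At alpha = 1% we do not reject H0, because the P-value exceeds 0.01
print("Reject H0:", p_value < 0.01)
```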
9.5.3
Solving Student’s t-Test for a Single Sample by Using SPSS Software
The use of the images in this section has been authorized by the International Business Machines Corporation©. If we wish to compare means from a single sample, SPSS makes Student's t-test available. The data in Example 9.9 are available in the file T_test_One_Sample.sav. The procedure to apply the test from Example 9.9 will be described. Initially, let's select Analyze → Compare Means → One-Sample T Test …, as shown in Fig. 9.19. We must select the variable Time and specify the value 18 that will be tested in Test Value, as shown in Fig. 9.20. Now, we must click on Options … to define the desired confidence level (Fig. 9.21). Finally, let's click on Continue and on OK. The results of the test are shown in Fig. 9.22. This figure shows the result of the t-test (similar to the value calculated in Example 9.9) and the associated probability (P-value) for a bilateral test. For a unilateral test, the associated probability is 0.0195 (we saw in Example 9.9 that this probability would be between 0.01 and 0.025). Since 0.0195 > 0.01, we do not reject the null hypothesis, which allows us to conclude, with a 99% confidence level, that there was no improvement in the average processing time.
9.5.4
Solving Student’s t-Test for a Single Sample by Using Stata Software
The use of the images in this section has been authorized by StataCorp LP©. Student’s t-test is elaborated on Stata by using the command ttest. For one population mean, the test syntax is: ttest variable* == #
FIG. 9.19 Procedure for elaborating the t-test from one sample on SPSS.
FIG. 9.20 Selecting the variable and specifying the value to be tested.
where the term variable* should be substituted for the name of the variable considered in the analysis and # for the value of the population mean to be tested. The data in Example 9.9 are available in the file T_test_One_Sample.dta. In this case, the variable being analyzed is called time and the goal is to verify if the average processing time is still 18 min, so, the command to be typed is: ttest time == 18
The result of the test can be seen in Fig. 9.23. We can see that the calculated value of the statistic (−2.180) is similar to the one calculated in Example 9.9 and also generated on SPSS, as well as the associated probability for a left-tailed test (0.0196). Since P > 0.01, we do not reject the null hypothesis, which allows us to conclude, with a 99% confidence level, that there was no improvement in the processing time.
FIG. 9.21 Options—defining the confidence level.
FIG. 9.22 Results of the t-test for one sample for Example 9.9 on SPSS.
FIG. 9.23 Results of the t-test for one sample for Example 9.9 on Stata.
9.6 STUDENT’S T-TEST TO COMPARE TWO POPULATION MEANS FROM TWO INDEPENDENT RANDOM SAMPLES
The t-test for two independent samples is applied to compare the means of two independent random samples (X1i, i = 1, …, n1; X2j, j = 1, …, n2). In this test, the population variances are unknown. For a bilateral test, the null hypothesis of the test states that the population means are the same. If the population means are different, the null hypothesis is rejected, so:
H0: μ1 = μ2
H1: μ1 ≠ μ2
The calculation of the T statistic depends on the comparison of the population variances between the groups.
Case 1: σ1² ≠ σ2²
Considering that the population variances are different, the calculation of the T statistic is given by:
FIG. 9.24 Nonrejection region (NR) and critical region (CR) of Student’s t-distribution for a bilateral test.
Tcal = (X̄1 − X̄2) / √(S1²/n1 + S2²/n2)    (9.22)

with the following degrees of freedom:

ν = (S1²/n1 + S2²/n2)² / [ (S1²/n1)²/(n1 − 1) + (S2²/n2)²/(n2 − 1) ]    (9.23)

Case 2: σ1² = σ2²
When the population variances are homogeneous, to calculate the T statistic, the researcher has to use:

Tcal = (X̄1 − X̄2) / (Sp · √(1/n1 + 1/n2))    (9.24)

where:

Sp = √[ ((n1 − 1) · S1² + (n2 − 1) · S2²) / (n1 + n2 − 2) ]    (9.25)
and Tcal follows Student's t-distribution with ν = n1 + n2 − 2 degrees of freedom. The calculated value must be compared to the value in Student's t-distribution table (Table B in the Appendix). This table provides the critical values of tc considering that P(Tcal > tc) = α (for a right-tailed test). For a bilateral test, we have P(Tcal < −tc) = α/2 = P(Tcal > tc), as shown in Fig. 9.24. Therefore, for a bilateral test, if the value of the statistic lies in the critical region, that is, if Tcal < −tc or Tcal > tc, the test allows us to reject the null hypothesis. On the other hand, if −tc ≤ Tcal ≤ tc, we do not reject H0. The unilateral probabilities associated to the Tcal statistic (P1) can also be obtained from Table B. For a unilateral test, we have P = P1. For a bilateral test, this probability must be doubled (P = 2P1). Therefore, for both tests, we reject H0 if P ≤ α.
Example 9.10: Applying Student's t-Test to Two Independent Samples
A quality engineer believes that the average time to manufacture a certain plastic product may depend on the raw materials used, which come from two different suppliers. A sample with 30 observations from each supplier is collected for a test and the results are shown in Tables 9.E.10 and 9.E.11. For a significance level α = 5%, check if there is any difference between the means.
Solution
Step 1: The suitable test to compare two population means with unknown σ is Student's t-test for two independent samples.
Step 2: For this example, Student's t-test hypotheses are:
H0: μ1 = μ2
H1: μ1 ≠ μ2
Step 3: The significance level to be considered is 5%.
TABLE 9.E.10 Manufacturing Time Using Raw Materials From Supplier 1

22.8  23.4  26.2  24.3  22.0  24.8  26.7  25.1  23.1  22.8
25.6  25.1  24.3  24.2  22.8  23.2  24.7  26.5  24.5  23.6
23.9  22.8  25.4  26.7  22.9  23.5  23.8  24.6  26.3  22.7
TABLE 9.E.11 Manufacturing Time Using Raw Materials From Supplier 2

26.8  29.3  28.4  25.6  29.4  27.2  27.6  26.8  25.4  28.6
29.7  27.2  27.9  28.4  26.0  26.8  27.5  28.5  27.3  29.1
29.2  25.7  28.4  28.6  27.9  27.4  26.7  26.8  25.6  26.1
Step 4: For the data in Tables 9.E.10 and 9.E.11, we calculate X̄1 = 24.277, X̄2 = 27.530, S1² = 1.810, and S2² = 1.559. Considering that the population variances are homogeneous, according to the solution generated on SPSS, let's use Expressions (9.24) and (9.25) to calculate the Tcal statistic, as follows:

Sp = √[ (29 · 1.810 + 29 · 1.559) / (30 + 30 − 2) ] = 1.298

Tcal = (24.277 − 27.530) / (1.298 · √(1/30 + 1/30)) = −9.708

with ν = 30 + 30 − 2 = 58 degrees of freedom.
Step 5: The critical region of the bilateral test, considering ν = 58 degrees of freedom and α = 5%, can be defined from Student's t-distribution table (Table B in the Appendix), as shown in Fig. 9.25. For a bilateral test, each one of the tails corresponds to half of significance level α.
FIG. 9.25 Critical region of Example 9.10.
Step 6: Decision: since the calculated value lies in the critical region, that is, Tcal < −2.002, we must reject the null hypothesis, which allows us to conclude, with a 95% confidence level, that the population means are different.
If, instead of comparing the calculated value to the critical value of Student's t-distribution, we use the calculation of the P-value, Steps 5 and 6 will be:
Step 5: According to Table B in the Appendix, for a right-tailed test with ν = 58 degrees of freedom, the probability P1 associated to |Tcal| = 9.708 is less than 0.0005. For a bilateral test, this probability must be doubled (P = 2P1).
Step 6: Decision: since P < 0.05, the null hypothesis is rejected.
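Because the group sizes, means, and variances are already summarized, scipy's ttest_ind_from_stats offers a convenient sketch of the pooled-variance test (it expects standard deviations, hence the square roots; variable names are illustrative):

```python
from math import sqrt
from scipy import stats

# Summary statistics from Tables 9.E.10 and 9.E.11
t_stat, p_value = stats.ttest_ind_from_stats(
    mean1=24.277, std1=sqrt(1.810), nobs1=30,
    mean2=27.530, std2=sqrt(1.559), nobs2=30,
    equal_var=True,  # pooled variance, as in Expressions (9.24) and (9.25)
)

print(f"t_cal = {t_stat:.3f}, bilateral P-value = {p_value:.4f}")
print("Reject H0:", p_value < 0.05)
```

Setting equal_var=False instead would apply the separate-variance statistic of Expressions (9.22) and (9.23).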
9.6.1
Solving Student’s t-Test From Two Independent Samples by Using SPSS Software
The data in Example 9.10 are available in the file T_test_Two_Independent_Samples.sav. The procedure for solving Student's t-test to compare two population means from two independent random samples on SPSS is described. The use of the images in this section has been authorized by the International Business Machines Corporation©. We must click on Analyze → Compare Means → Independent-Samples T Test …, as shown in Fig. 9.26.
Let’s include the variable Time in Test Variable(s) and the variable Supplier in Grouping Variable. Next, let’s click on Define Groups … to define the groups (categories) of the variable Supplier, as shown in Fig. 9.27. If the confidence level desired by the researcher is different from 95%, the button Options … must be selected to change it. Finally, let’s click on OK. The results of the test are shown in Fig. 9.28. The value of the t statistic for the test is −9.708 and the associated bilateral probability is 0.000 (P < 0.05), which leads us to reject the null hypothesis and allows us to conclude, with a 95% confidence level, that the population means are different. We can notice that Fig. 9.28 also shows the result of Levene’s test. Since the significance level observed is 0.694, a value greater than 0.05, we can also conclude, with a 95% confidence level, that the variances are homogeneous.
FIG. 9.26 Procedure for elaborating the t-test from two independent samples on SPSS.
FIG. 9.27 Selecting the variables and defining the groups.
FIG. 9.28 Results of the t-test for two independent samples for Example 9.10 on SPSS.
FIG. 9.29 Results of the t-test for two independent samples for Example 9.10 on Stata.
9.6.2
Solving Student’s t-Test From Two Independent Samples by Using Stata Software
The use of the images in this section has been authorized by StataCorp LP©. The t-test to compare the means of two independent groups on Stata is elaborated by using the following syntax: ttest variable*, by(groups*)
where the term variable* must be substituted for the quantitative variable being analyzed, and the term groups* for the categorical variable that represents them. The data in Example 9.10 are available in the file T_test_Two_Independent_Samples.dta. The variable supplier shows the groups of suppliers. The values for each group of suppliers are specified in the variable time. Thus, we must type the following command: ttest time, by(supplier)
The result of the test can be seen in Fig. 9.29. We can see that the calculated value of the statistic (−9.708) is similar to the one calculated in Example 9.10 and also generated on SPSS, as well as the associated probability for a bilateral test (0.000). Since P < 0.05, the null hypothesis is rejected, which allows us to conclude, with a 95% confidence level, that the population means are different.
9.7 STUDENT’S T-TEST TO COMPARE TWO POPULATION MEANS FROM TWO PAIRED RANDOM SAMPLES
This test is applied to check whether the means of two paired or related samples, obtained from the same population (before and after), with a normal distribution, are significantly different or not. Besides the normality of the data of each sample, the test requires the homogeneity of the variances between the groups. Different from the t-test for two independent samples, first we must calculate the difference between each pair of values in position i (di = Xbefore,i − Xafter,i, i = 1, …, n) and, after that, test the null hypothesis that the mean of the differences in the population is zero.
For a bilateral test, we have:
H0: μd = 0, where μd = μbefore − μafter
H1: μd ≠ 0
The Tcal statistic for the test is given by:

Tcal = (d̄ − μd) / (Sd/√n) ~ t(ν = n − 1)    (9.26)

where:

d̄ = (Σ(i=1..n) di) / n    (9.27)

Sd = √[ Σ(i=1..n) (di − d̄)² / (n − 1) ]    (9.28)
The value calculated must be compared to the critical value in Student's t-distribution table (Table B in the Appendix). This table provides the critical values of tc considering that P(Tcal > tc) = α (for a right-tailed test). For a bilateral test, we have P(Tcal < −tc) = α/2 = P(Tcal > tc), as shown in Fig. 9.30.
FIG. 9.30 Nonrejection region (NR) and critical region (CR) of Student's t-distribution for a bilateral test.
Therefore, for a bilateral test, the null hypothesis is rejected if Tcal < −tc or Tcal > tc. If −tc ≤ Tcal ≤ tc, we do not reject H0. The unilateral probabilities associated with the Tcal statistic (P1) can also be obtained from Table B. For a unilateral test, we have P = P1. For a bilateral test, this probability must be doubled (P = 2P1). Therefore, for both tests, we reject H0 if P ≤ α.

Example 9.11: Applying Student's t-Test to Two Paired Samples
A group of 10 machine operators, responsible for carrying out a certain task, is trained to perform the same task more efficiently. To verify if there is a reduction in the time taken to perform the task, we measured the time spent by each operator, before and after the training course. Test the hypothesis that the population means of both paired samples are similar, that is, that there is no reduction in the time taken to perform the task after the training course. Consider α = 5%.
TABLE 9.E.12 Time Spent Per Operator Before the Training Course
3.2  3.6  3.4  3.8  3.4  3.5  3.7  3.2  3.5  3.9

TABLE 9.E.13 Time Spent Per Operator After the Training Course
3.0  3.3  3.5  3.6  3.4  3.3  3.4  3.0  3.2  3.6
Solution Step 1: In this case, the most suitable test is Student’s t-test for two paired samples. Since the test requires the normality of the data in each sample and the homogeneity of the variances between the groups, K-S or S-W tests, besides Levene’s test, must be applied for such verification. As we will see, in the solution of this example on SPSS, all of these assumptions will be validated.
Step 2: For this example, Student's t-test hypotheses are:
H0: μd = 0
H1: μd ≠ 0
Step 3: The significance level to be considered is 5%.
Step 4: In order to calculate the Tcal statistic, first, we must calculate di:

TABLE 9.E.14 Calculating di
Xbefore,i:  3.2   3.6   3.4   3.8   3.4   3.5   3.7   3.2   3.5   3.9
Xafter,i:   3.0   3.3   3.5   3.6   3.4   3.3   3.4   3.0   3.2   3.6
di:         0.2   0.3  −0.1   0.2   0.0   0.2   0.3   0.2   0.3   0.3

d̄ = (Σi=1..n di) / n = (0.2 + 0.3 + ⋯ + 0.3) / 10 = 0.19

Sd = √[ ((0.2 − 0.19)² + (0.3 − 0.19)² + ⋯ + (0.3 − 0.19)²) / 9 ] = 0.137

Tcal = d̄ / (Sd / √n) = 0.19 / (0.137 / √10) = 4.385
Step 5: The critical region of the bilateral test can be defined from Student's t-distribution table (Table B in the Appendix), considering n = 9 degrees of freedom and α = 5%, which gives tc = 2.262, as shown in Fig. 9.31. For a bilateral test, each tail corresponds to half of the significance level α.
FIG. 9.31 Critical region of Example 9.11.
Step 6: Decision: since the value calculated lies in the critical region (Tcal = 4.385 > tc = 2.262), the null hypothesis is rejected, which allows us to conclude, with a 95% confidence level, that there is a significant difference between the times spent by the operators before and after the training course.
If, instead of comparing the value calculated to the critical value of Student's t-distribution, we use the calculation of the P-value, Steps 5 and 6 become:
Step 5: According to Table B in the Appendix, for a right-tailed test with n = 9 degrees of freedom, the probability P1 associated with Tcal = 4.385 lies between 0.0005 and 0.001. For a bilateral test, this probability must be doubled (P = 2P1), so 0.001 < P < 0.002.
Step 6: Decision: since P < 0.05, the null hypothesis is rejected.
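The hand calculation above maps directly onto Expressions (9.26)–(9.28). A minimal plain-Python sketch, using the Example 9.11 data, reproduces Tcal:

```python
import math

before = [3.2, 3.6, 3.4, 3.8, 3.4, 3.5, 3.7, 3.2, 3.5, 3.9]
after = [3.0, 3.3, 3.5, 3.6, 3.4, 3.3, 3.4, 3.0, 3.2, 3.6]

d = [b - a for b, a in zip(before, after)]  # differences d_i
n = len(d)
d_bar = sum(d) / n  # Expression (9.27)
s_d = math.sqrt(sum((x - d_bar) ** 2 for x in d) / (n - 1))  # Expression (9.28)
t_cal = d_bar / (s_d / math.sqrt(n))  # Expression (9.26) with mu_d = 0
print(round(t_cal, 3))  # 4.385
```

Comparing |t_cal| with tc = 2.262 (9 degrees of freedom, α = 5%) reproduces the decision reached in Step 6.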
9.7.1 Solving Student's t-Test From Two Paired Samples by Using SPSS Software
First, we must test the normality of the data in each sample, as well as the variance homogeneity between the groups. Using the same procedures described in Sections 9.3.3 and 9.4.5 (the data must be placed in a table the same way as in Section 9.4.5), we obtain Figs. 9.32 and 9.33. Based on Fig. 9.32, we conclude that there is normality of the data for each sample. From Fig. 9.33, we can conclude that the variances between the samples are homogeneous.
The use of the images in this section has been authorized by the International Business Machines Corporation©. To solve Student's t-test for two paired samples on SPSS, we must open the file T_test_Two_Paired_Samples.sav. Then, we have to click on Analyze → Compare Means → Paired-Samples T Test …, as shown in Fig. 9.34. We must select the variable Before and move it to Variable1 and the variable After to Variable2, as shown in Fig. 9.35. If the desired confidence level is different from 95%, we must click on Options … to change it. Finally, let's click on OK. The results of the test are shown in Fig. 9.36. The value of the t-test is 4.385 and the significance level observed for a bilateral test is 0.002, a value less than 0.05, which leads us to reject the null hypothesis and allows us to conclude, with a 95% confidence level, that there is a significant difference between the times spent by the operators before and after the training course.
FIG. 9.32 Results of the normality tests on SPSS.
FIG. 9.33 Results of Levene’s test on SPSS.
FIG. 9.34 Procedure for elaborating the t-test from two paired samples on SPSS.
FIG. 9.35 Selecting the variables that will be paired.
FIG. 9.36 Results of the t-test for two paired samples.
FIG. 9.37 Results of Student’s t-test for two paired samples for Example 9.11 on Stata.
9.7.2 Solving Student's t-Test From Two Paired Samples by Using Stata Software
The t-test to compare the means of two paired groups will be solved on Stata for the data in Example 9.11. The use of the images in this section has been authorized by StataCorp LP©. Therefore, let’s open the file T_test_Two_Paired_Samples.dta. The paired variables are called before and after. In this case, we must type the following command: ttest before == after
The result of the test can be seen in Fig. 9.37. We can see that the calculated value of the statistic (4.385) is similar to the one calculated in Example 9.11 and on SPSS, as well as the probability associated to the statistic for a bilateral test (0.0018). Since P < 0.05, we reject the null hypothesis that the times spent by the operators before and after the training course are the same, with a 95% confidence level.
9.8 ANOVA TO COMPARE THE MEANS OF MORE THAN TWO POPULATIONS
ANOVA is a test used to compare the means of three or more populations, through the analysis of sample variances. The test is based on a sample obtained from each population, aiming at determining if the differences between the sample means suggest significant differences between the population means, or if such differences are only a result of the implicit variability of the sample. ANOVA’s assumptions are: (i) The samples must be independent from each other; (ii) The data in the populations must have a normal distribution; (iii) The population variances must be homogeneous.
9.8.1 One-Way ANOVA
One-way ANOVA is an extension of Student's t-test for two population means, allowing the researcher to compare three or more population means. The null hypothesis of the test states that the population means are the same. If there is at least one group with a mean that is different from the others, the null hypothesis is rejected. As stated in Fávero et al. (2009), the one-way ANOVA allows the researcher to verify the effect of a qualitative explanatory variable (factor) on a quantitative dependent variable. Each group includes the observations of the dependent variable in one category of the factor.
Assuming that independent samples of size nj are obtained from k populations (k ≥ 3) and that the means of these populations can be represented by μ1, μ2, …, μk, the analysis of variance tests the following hypotheses:

H0: μ1 = μ2 = … = μk
H1: ∃(i, j): μi ≠ μj, i ≠ j   (9.29)

According to Maroco (2014), in general, the observations for this type of problem can be represented according to Table 9.2, where Yij represents observation i of sample or group j (i = 1, …, nj; j = 1, …, k) and nj is the dimension of sample or group j. The dimension of the global sample is N = Σj=1..k nj. Pestana and Gageiro (2008) present the following model:

Yij = μi + εij   (9.30)
Yij = μ + (μi − μ) + εij   (9.31)
Yij = μ + αi + εij   (9.32)

where: μ is the global mean of the population; μi is the mean of sample or group i; αi is the effect of sample or group i; εij is the random error.
TABLE 9.2 Observations of the One-Way ANOVA

Samples or Groups
   1       2      …      k
  Y11     Y12     …     Y1k
  Y21     Y22     …     Y2k
   ⋮       ⋮             ⋮
 Yn1,1   Yn2,2    …    Ynk,k
Therefore, ANOVA assumes that each group comes from a population with a normal distribution, mean μi, and a homogeneous variance, that is, Yij ~ N(μi, σ), resulting in the hypothesis that the errors (residuals) have a normal distribution with a mean equal to zero and a constant variance, that is, εij ~ N(0, σ), besides being independent (Fávero et al., 2009).
The technique's hypotheses are tested from the calculation of the group variances, and that is where the name ANOVA comes from. The technique involves the calculation of the variations between the groups (Ȳi − Ȳ) and within each group (Yij − Ȳi). The residual sum of squares within groups (RSS) is calculated by:

RSS = Σi=1..k Σj=1..ni (Yij − Ȳi)²   (9.33)

The residual sum of squares between groups, or the sum of squares of the factor (SSF), is given by:

SSF = Σi=1..k ni (Ȳi − Ȳ)²   (9.34)

Therefore, the total sum is:

TSS = RSS + SSF = Σi=1..k Σj=1..ni (Yij − Ȳ)²   (9.35)

According to Fávero et al. (2009) and Maroco (2014), the ANOVA statistic is given by the division between the variance of the factor (SSF divided by k − 1 degrees of freedom) and the variance of the residuals (RSS divided by N − k degrees of freedom):

Fcal = MSF / MSR = [SSF / (k − 1)] / [RSS / (N − k)]   (9.36)

where: MSF represents the mean square between groups (estimate of the variance of the factor); MSR represents the mean square within groups (estimate of the variance of the residuals). Table 9.3 summarizes the calculations of the one-way ANOVA.
The value of F can be null or positive, but never negative; accordingly, ANOVA uses the F-distribution, which is asymmetrical with a tail to the right. The calculated value (Fcal) must be compared to the value in the F-distribution table (Table A in the Appendix). This table provides the critical values of Fc = F(k−1, N−k, α), where P(Fcal > Fc) = α (right-tailed test). Therefore, the one-way ANOVA's null hypothesis is rejected if Fcal > Fc. Otherwise, if Fcal ≤ Fc, we do not reject H0. We will use these concepts when we study the estimation of regression models in Chapter 13.
TABLE 9.3 Calculating the One-Way ANOVA

Source of Variation | Sum of Squares              | Degrees of Freedom | Mean Squares      | F
Between the groups  | SSF = Σi ni (Ȳi − Ȳ)²       | k − 1              | MSF = SSF/(k − 1) | F = MSF/MSR
Within the groups   | RSS = Σi Σj (Yij − Ȳi)²     | N − k              | MSR = RSS/(N − k) |
Total               | TSS = Σi Σj (Yij − Ȳ)²      | N − 1              |                   |

Source: Fávero, L.P., Belfiore, P., Silva, F.L., Chan, B.L., 2009. Análise de dados: modelagem multivariada para tomada de decisões. Campus Elsevier, Rio de Janeiro; Maroco, J., 2014. Análise estatística com o SPSS Statistics, sixth ed. Edições Sílabo, Lisboa.
Example 9.12: Applying the One-Way ANOVA Test
A sample with 32 products is collected to analyze the quality of the honey supplied by three different suppliers. One of the ways to test the quality of the honey is finding out how much sucrose it contains, which usually varies between 0.25% and 6.5%. Table 9.E.15 shows the percentage of sucrose in the sample collected from each supplier. Check if there are differences in this quality indicator among the three suppliers, considering a 5% significance level.

TABLE 9.E.15 Percentage of Sucrose for the Three Suppliers

Supplier 1 (n1 = 12): 0.33  0.79  1.24  1.75  0.94  2.42  1.97  0.87  0.33  0.79  1.24  3.12
Supplier 2 (n2 = 10): 1.54  1.11  0.97  2.57  2.94  3.44  3.02  3.55  2.04  1.67
Supplier 3 (n3 = 10): 1.47  1.69  1.55  2.04  2.67  3.07  3.33  4.01  1.52  2.03

Ȳ1 = 1.316, S1 = 0.850;  Ȳ2 = 2.285, S2 = 0.948;  Ȳ3 = 2.338, S3 = 0.886

Solution
Step 1: In this case, the most suitable test is the one-way ANOVA. First, we must verify the assumptions of normality for each group and of variance homogeneity between the groups through the Kolmogorov-Smirnov, Shapiro-Wilk, and Levene tests. Figs. 9.38 and 9.39 show the results obtained by using SPSS software.
FIG. 9.38 Results of the tests for normality on SPSS.
FIG. 9.39 Results of Levene’s test on SPSS.
Since the significance level observed in the tests for normality for each group and in the variance homogeneity test between the groups is greater than 5%, we can conclude that each one of the groups shows data with a normal distribution and that the variances between the groups are homogeneous, with a 95% confidence level. Since the assumptions of the one-way ANOVA were met, the technique can be applied.
Step 2: For this example, ANOVA's null hypothesis states that there are no differences in the amount of sucrose coming from the three suppliers. If there is at least one supplier with a population mean that is different from the others, the null hypothesis will be rejected. Thus, we have:
H0: μ1 = μ2 = μ3
H1: ∃(i, j): μi ≠ μj, i ≠ j
Step 3: The significance level to be considered is 5%.
Step 4: The calculation of the Fcal statistic is specified here. For this example, we know that k = 3 groups and the global sample size is N = 32. The global sample mean is Ȳ = 1.938. The sum of squares between groups (SSF) is:

SSF = 12·(1.316 − 1.938)² + 10·(2.285 − 1.938)² + 10·(2.338 − 1.938)² = 7.449

Therefore, the mean square between groups (MSF) is:

MSF = SSF / (k − 1) = 7.449 / 2 = 3.725

The calculation of the residual sum of squares within groups (RSS) is shown in Table 9.E.16.
TABLE 9.E.16 Calculation of the Residual Sum of Squares Within Groups (RSS)

Supplier  Sucrose  Yij − Ȳi  (Yij − Ȳi)²
1         0.33     −0.986    0.972
1         0.79     −0.526    0.277
1         1.24     −0.076    0.006
1         1.75      0.434    0.189
1         0.94     −0.376    0.141
1         2.42      1.104    1.219
1         1.97      0.654    0.428
1         0.87     −0.446    0.199
1         0.33     −0.986    0.972
1         0.79     −0.526    0.277
1         1.24     −0.076    0.006
1         3.12      1.804    3.255
2         1.54     −0.745    0.555
2         1.11     −1.175    1.381
2         0.97     −1.315    1.729
2         2.57      0.285    0.081
2         2.94      0.655    0.429
2         3.44      1.155    1.334
2         3.02      0.735    0.540
2         3.55      1.265    1.600
2         2.04     −0.245    0.060
2         1.67     −0.615    0.378
3         1.47     −0.868    0.753
3         1.69     −0.648    0.420
3         1.55     −0.788    0.621
3         2.04     −0.298    0.089
3         2.67      0.332    0.110
3         3.07      0.732    0.536
3         3.33      0.992    0.984
3         4.01      1.672    2.796
3         1.52     −0.818    0.669
3         2.03     −0.308    0.095
RSS                          23.100
Therefore, the mean square within groups is:

MSR = RSS / (N − k) = 23.100 / 29 = 0.797

Thus, the value of the Fcal statistic is:

Fcal = MSF / MSR = 3.725 / 0.797 = 4.676

Step 5: According to Table A in the Appendix, the critical value of the statistic is Fc = F(2, 29, 5%) = 3.33.
Step 6: Decision: since the value calculated lies in the critical region (Fcal > Fc), we reject the null hypothesis, which allows us to conclude, with a 95% confidence level, that there is at least one supplier with a population mean that is different from the others.
If, instead of comparing the value calculated to the critical value of Snedecor's F-distribution, we use the calculation of the P-value, Steps 5 and 6 become:
Step 5: According to Table A in the Appendix, for ν1 = 2 degrees of freedom in the numerator and ν2 = 29 degrees of freedom in the denominator, the probability associated with Fcal = 4.676 is between 0.01 and 0.025 (P-value).
Step 6: Decision: since P < 0.05, the null hypothesis is rejected.
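The decomposition above can also be checked programmatically. The sketch below is plain Python (no statistical library); it simply implements Expressions (9.33), (9.34), and (9.36) on the Table 9.E.15 data:

```python
supplier1 = [0.33, 0.79, 1.24, 1.75, 0.94, 2.42, 1.97, 0.87, 0.33, 0.79, 1.24, 3.12]
supplier2 = [1.54, 1.11, 0.97, 2.57, 2.94, 3.44, 3.02, 3.55, 2.04, 1.67]
supplier3 = [1.47, 1.69, 1.55, 2.04, 2.67, 3.07, 3.33, 4.01, 1.52, 2.03]
groups = [supplier1, supplier2, supplier3]

k = len(groups)
N = sum(len(g) for g in groups)
grand_mean = sum(sum(g) for g in groups) / N

# between-groups sum of squares, Expression (9.34)
ssf = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
# within-groups (residual) sum of squares, Expression (9.33)
rss = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

f_cal = (ssf / (k - 1)) / (rss / (N - k))  # Expression (9.36)
print(round(ssf, 3), round(f_cal, 2))  # 7.449 4.68
```

Since 4.68 > Fc = 3.33, the code reproduces the rejection of H0 reached in Step 6.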
9.8.1.1 Solving the One-Way ANOVA Test by Using SPSS Software
The use of the images in this section has been authorized by the International Business Machines Corporation©. The data in Example 9.12 are available in the file One_Way_ANOVA.sav. First of all, let's click on Analyze → Compare Means → One-Way ANOVA …, as shown in Fig. 9.40. Let's include the variable Sucrose in the list of dependent variables (Dependent List) and the variable Supplier in the box Factor, according to Fig. 9.41. After that, we must click on Options … and select the option Homogeneity of variance test (Levene's test for variance homogeneity). Finally, let's click on Continue and on OK to obtain the result of Levene's test, besides the ANOVA table. Since the One-Way ANOVA dialog does not make the normality test available, it must be obtained by applying the same procedure described in Section 9.3.3. According to Fig. 9.42, we can verify that each one of the groups has data that follow a normal distribution. Moreover, through Fig. 9.43, we can conclude that the variances between the groups are homogeneous.
FIG. 9.40 Procedure for the one-way ANOVA.
FIG. 9.41 Selecting the variables.
From the ANOVA table (Fig. 9.44), we can see that the value of the F-test is 4.676 and the respective P-value is 0.017 (we saw in Example 9.12 that this value would be between 0.01 and 0.025), a value less than 0.05. This leads us to reject the null hypothesis and allows us to conclude, with a 95% confidence level, that at least one of the population means is different from the others (there are differences in the percentage of sucrose in the honey of the three suppliers).
9.8.1.2 Solving the One-Way ANOVA Test by Using Stata Software The use of the images in this section has been authorized by StataCorp LP©. The one-way ANOVA on Stata is generated from the following syntax: anova variabley* factor*
FIG. 9.42 Results of the tests for normality for Example 9.12 on SPSS.
FIG. 9.43 Results of Levene’s test for Example 9.12 on SPSS.
FIG. 9.44 Results of the one-way ANOVA for Example 9.12 on SPSS.
FIG. 9.45 Results of the one-way ANOVA on Stata.
in which the term variabley* should be substituted for the quantitative dependent variable and the term factor* for the qualitative explanatory variable. The data in Example 9.12 are available in the file One_Way_Anova.dta. The quantitative dependent variable is called sucrose and the factor is represented by the variable supplier. Thus, we must type the following command: anova sucrose supplier
The result of the test can be seen in Fig. 9.45. We can see that the calculated value of the statistic (4.68) is similar to the one calculated in Example 9.12 and also generated on SPSS, as well as the probability associated to the value of the statistic (0.017). Since P < 0.05, the null hypothesis is rejected, which allows us to conclude, with a 95% confidence level, that at least one of the population means is different from the others.
9.8.2 Factorial ANOVA
Factorial ANOVA is an extension of the one-way ANOVA, with the same assumptions, but considering two or more factors. Factorial ANOVA presumes that the quantitative dependent variable is influenced by more than one qualitative explanatory variable (factor). It also tests the possible interactions between the factors, through the resulting effect of the combination of factor A's level i and factor B's level j, as discussed by Pestana and Gageiro (2008), Fávero et al. (2009), and Maroco (2014). For Pestana and Gageiro (2008) and Fávero et al. (2009), the main objective of the factorial ANOVA is to determine whether the means for each factor level are the same (the isolated effect of each factor on the dependent variable) and to verify the interaction between the factors (the joint effect of the factors on the dependent variable). For educational purposes, the factorial ANOVA will be described for the two-way model.
9.8.2.1 Two-Way ANOVA
According to Fávero et al. (2009) and Maroco (2014), the observations of the two-way ANOVA can be represented, in general, as shown in Table 9.4. Each cell shows the values of the dependent variable for the combination of the levels of factors A and B being studied, where Yijk represents observation k (k = 1, …, n) of factor A's level i (i = 1, …, a) and of factor B's level j (j = 1, …, b). First, in order to check the isolated effects of factors A and B, we must test the following hypotheses (Fávero et al., 2009; Maroco, 2014):

H0,A: μ1 = μ2 = … = μa
H1,A: ∃(i, j): μi ≠ μj, i ≠ j (i, j = 1, …, a)   (9.37)

and

H0,B: μ1 = μ2 = … = μb
H1,B: ∃(i, j): μi ≠ μj, i ≠ j (i, j = 1, …, b)   (9.38)
TABLE 9.4 Observations of the Two-Way ANOVA

                              Factor B
Factor A        1                 2            …        b
1         Y111, Y112, …,   Y121, Y122, …,     …   Y1b1, Y1b2, …,
          Y11n             Y12n                   Y1bn
2         Y211, Y212, …,   Y221, Y222, …,     …   Y2b1, Y2b2, …,
          Y21n             Y22n                   Y2bn
⋮               ⋮                 ⋮            ⋮        ⋮
a         Ya11, Ya12, …,   Ya21, Ya22, …,     …   Yab1, Yab2, …,
          Ya1n             Ya2n                   Yabn

Source: Fávero, L.P., Belfiore, P., Silva, F.L., Chan, B.L., 2009. Análise de dados: modelagem multivariada para tomada de decisões. Campus Elsevier, Rio de Janeiro; Maroco, J., 2014. Análise estatística com o SPSS Statistics, sixth ed. Edições Sílabo, Lisboa.
Now, in order to verify the joint effect of the factors on the dependent variable, we must test the following hypotheses (Fávero et al., 2009; Maroco, 2014):

H0: γij = 0 for all (i, j) (there is no interaction between factors A and B)
H1: γij ≠ 0 for some (i, j) (there is interaction between factors A and B)   (9.39)

The model presented by Pestana and Gageiro (2008) can be described as:

Yijk = μ + αi + βj + γij + εijk   (9.40)

where:
μ is the population's global mean;
αi is the effect of factor A's level i, given by μi − μ;
βj is the effect of factor B's level j, given by μj − μ;
γij is the interaction between the factors;
εijk is the random error, which follows a normal distribution with a mean equal to zero and a constant variance.

To standardize the effects of the levels chosen for both factors, we must assume that:

Σi=1..a αi = Σj=1..b βj = Σi=1..a γij = Σj=1..b γij = 0   (9.41)

Let Ȳ, Ȳij, Ȳi, and Ȳj be the general mean of the global sample, the mean per cell, the mean of factor A's level i, and the mean of factor B's level j, respectively. We can describe the residual sum of squares (RSS) as:

RSS = Σi=1..a Σj=1..b Σk=1..n (Yijk − Ȳij)²   (9.42)

On the other hand, the sum of squares of factor A (SSFA), the sum of squares of factor B (SSFB), and the sum of squares of the interaction (SSFAB) are given by Expressions (9.43)–(9.45), respectively:

SSFA = b·n·Σi=1..a (Ȳi − Ȳ)²   (9.43)

SSFB = a·n·Σj=1..b (Ȳj − Ȳ)²   (9.44)

SSFAB = n·Σi=1..a Σj=1..b (Ȳij − Ȳi − Ȳj + Ȳ)²   (9.45)

Therefore, the total sum of squares can be written as follows:

TSS = RSS + SSFA + SSFB + SSFAB = Σi=1..a Σj=1..b Σk=1..n (Yijk − Ȳ)²   (9.46)
Thus, the ANOVA statistic for factor A is given by:

FA = MSFA / MSR = [SSFA / (a − 1)] / [RSS / ((n − 1)·a·b)]   (9.47)

where: MSFA is the mean square of factor A; MSR is the mean square of the errors.
TABLE 9.5 Calculations of the Two-Way ANOVA

Source of Variation | Sum of Squares                          | Degrees of Freedom | Mean Squares                    | F
Factor A            | SSFA = b·n·Σi (Ȳi − Ȳ)²                 | a − 1              | MSFA = SSFA/(a − 1)             | FA = MSFA/MSR
Factor B            | SSFB = a·n·Σj (Ȳj − Ȳ)²                 | b − 1              | MSFB = SSFB/(b − 1)             | FB = MSFB/MSR
Interaction         | SSFAB = n·Σi Σj (Ȳij − Ȳi − Ȳj + Ȳ)²    | (a − 1)·(b − 1)    | MSFAB = SSFAB/((a − 1)·(b − 1)) | FAB = MSFAB/MSR
Error               | RSS = Σi Σj Σk (Yijk − Ȳij)²            | (n − 1)·a·b        | MSR = RSS/((n − 1)·a·b)         |
Total               | TSS = Σi Σj Σk (Yijk − Ȳ)²              | N − 1              |                                 |

Source: Fávero, L.P., Belfiore, P., Silva, F.L., Chan, B.L., 2009. Análise de dados: modelagem multivariada para tomada de decisões. Campus Elsevier, Rio de Janeiro; Maroco, J., 2014. Análise estatística com o SPSS Statistics, sixth ed. Edições Sílabo, Lisboa.
On the other hand, the ANOVA statistic for factor B is given by:

FB = MSFB / MSR = [SSFB / (b − 1)] / [RSS / ((n − 1)·a·b)]   (9.48)

where: MSFB is the mean square of factor B. And the ANOVA statistic for the interaction is represented by:

FAB = MSFAB / MSR = [SSFAB / ((a − 1)·(b − 1))] / [RSS / ((n − 1)·a·b)]   (9.49)

where: MSFAB is the mean square of the interaction. The calculations of the two-way ANOVA are summarized in Table 9.5.
The calculated values of the statistics (FAcal, FBcal, and FABcal) must be compared to the critical values obtained from the F-distribution table (Table A in the Appendix): FAc = F(a−1, (n−1)ab, α), FBc = F(b−1, (n−1)ab, α), and FABc = F((a−1)(b−1), (n−1)ab, α). For each statistic, if the calculated value lies in the critical region (FAcal > FAc, FBcal > FBc, or FABcal > FABc), we must reject the corresponding null hypothesis. Otherwise, we do not reject H0.

Example 9.13: Using the Two-Way ANOVA
A sample with 24 passengers who travel from São Paulo to Campinas in a certain week is collected. The following variables are analyzed: (1) travel time in minutes, (2) the bus company chosen, and (3) the day of the week. The main objective is to verify whether there is a relationship between the travel time and the bus company, between the travel time and the day of the week, and between the bus company and the day of the week. The levels considered in the variable bus company are Company A (1), Company B (2), and Company C (3). On the other hand, the levels regarding the day of the week are Monday (1), Tuesday (2), Wednesday (3), Thursday (4), Friday (5), Saturday (6), and Sunday (7). The results of the sample are shown in Table 9.E.17 and are also available in the file Two_Way_ANOVA.sav. Test these hypotheses, considering a 5% significance level.
TABLE 9.E.17 Data From Example 9.13 (Using the Two-Way ANOVA)

Time (Min)  Company  Day of the Week
90          2        4
100         1        5
72          1        6
76          3        1
85          2        2
95          1        5
79          3        1
100         2        4
70          1        7
80          3        1
85          2        3
90          1        5
77          2        7
80          1        2
85          3        4
74          2        7
72          3        6
92          1        5
84          2        4
80          1        3
79          2        1
70          3        6
88          3        5
84          2        4

9.8.2.1.1 Solving the Two-Way ANOVA Test by Using SPSS Software
The use of the images in this section has been authorized by the International Business Machines Corporation©.
Step 1: In this case, the most suitable test is the two-way ANOVA. First, we must verify the normality of the (metric) variable Time in the model, as shown in Fig. 9.46. According to this figure, we can conclude that the variable Time follows a normal distribution, with a 95% confidence level. The hypothesis of variance homogeneity will be verified in Step 4.
Step 2: The null hypothesis H0 of the two-way ANOVA for this example assumes that the population means of each level of the factor Company and of each level of the factor Day_of_the_week are equal, that is, H0,A: μ1 = μ2 = μ3 and H0,B: μ1 = μ2 = … = μ7. The null hypothesis also states that there is no interaction between the factor Company and the factor Day_of_the_week, that is, H0: γij = 0 for all (i, j).
Step 3: The significance level to be considered is 5%.
FIG. 9.46 Results of the normality tests on SPSS.
FIG. 9.47 Procedure for elaborating the two-way ANOVA on SPSS.
Step 4: The F statistics in ANOVA for the factor Company, for the factor Day_of_the_week, and for the interaction Company * Day_of_the_week will be obtained through the SPSS software, according to the procedure specified below. In order to do that, let's click on Analyze → General Linear Model → Univariate …, as shown in Fig. 9.47. After that, let's include the variable Time in the box of dependent variables (Dependent Variable) and the variables Company and Day_of_the_week in the box Fixed Factor(s), as shown in Fig. 9.48. This example is based on the two-way ANOVA, in which the factors are fixed. If one of the factors were chosen randomly, it would be inserted into the box Random Factor(s), resulting in a mixed-model ANOVA. The button Model … defines the variance analysis model to be tested. Through the button Contrasts …, we can assess if the category of one of the factors is significantly different from the other categories of the same factor. Charts can be constructed through the button Plots …, thus allowing the visualization of the existence or nonexistence of interactions between the factors. The button Post Hoc …, on the other hand, allows us to compare multiple means. Finally, from the button Options …, we can obtain descriptive statistics and the result of Levene's variance homogeneity test, as well as select the appropriate significance level (Fávero et al., 2009; Maroco, 2014).
FIG. 9.48 Selection of the variables to elaborate the two-way ANOVA.
Therefore, since we want to test variance homogeneity, we must select, in Options …, the option Homogeneity tests, as shown in Fig. 9.49. Finally, let's click on Continue and on OK to obtain Levene's variance homogeneity test and the two-way ANOVA table. In Fig. 9.50, we can see that the variances between groups are homogeneous (P = 0.451 > 0.05). Based on Fig. 9.51, we can conclude that there are no significant differences between the travel times of the companies analyzed, that is, the factor Company does not have a significant impact on the variable Time (P = 0.330 > 0.05). On the other hand, we conclude that there are significant differences between the days of the week, that is, the factor Day_of_the_week has a significant effect on the variable Time (P = 0.003 < 0.05). We finally conclude that there is no significant interaction, with a 95% confidence level, between the two factors Company and Day_of_the_week, since P = 0.898 > 0.05.

9.8.2.1.2 Solving the Two-Way ANOVA Test by Using Stata Software
The use of the images in this section has been authorized by StataCorp LP©. The command anova on Stata specifies the dependent variable being analyzed, as well as the respective factors. The interactions are specified using the character # between the factors. Thus, the two-way ANOVA is generated through the following syntax:

anova variabley* factorA* factorB* factorA#factorB

or simply:

anova variabley* factorA*##factorB*

in which the term variabley* should be substituted for the quantitative dependent variable and the terms factorA* and factorB* for the respective factors. If we type the syntax anova variabley* factorA* factorB*, only the main effects of each factor will be estimated, without the interaction between the factors.
FIG. 9.49 Test of variance homogeneity.
FIG. 9.50 Results of Levene’s test on SPSS.
The data presented in Example 9.13 are available in the file Two_Way_ANOVA.dta. The quantitative dependent variable is called time and the factors correspond to the variables company and day_of_the_week. Thus, we must type the following command: anova time company##day_of_the_week
The results can be seen in Fig. 9.52 and are similar to those presented on SPSS, which allows us to conclude, with a 95% confidence level, that only the factor day_of_the_week has a significant effect on the variable time (P ¼ 0.003 < 0.05), and that there is no significant interaction between the two factors analyzed (P ¼ 0.898 > 0.05).
FIG. 9.51 Results of the two-way ANOVA for Example 9.13 on SPSS.
FIG. 9.52 Results of the two-way ANOVA for Example 9.13 on Stata.
9.8.2.2 ANOVA With More Than Two Factors
The two-way ANOVA can be generalized to three or more factors. According to Maroco (2014), the model becomes very complex, since the effect of multiple interactions can make the effect of the factors a bit confusing. The generic model with three factors presented by the author is:

Yijkl = μ + αi + βj + γk + (αβ)ij + (αγ)ik + (βγ)jk + (αβγ)ijk + εijkl   (9.50)

9.9 FINAL REMARKS
This chapter presented the concepts and objectives of parametric hypotheses tests and the general procedures for constructing each one of them. We studied the main types of tests and the situations in which each one of them must be used. Moreover, the advantages and disadvantages of each test were established, as well as their assumptions. We studied the tests for normality (Kolmogorov-Smirnov, Shapiro-Wilk, and Shapiro-Francia), the variance homogeneity tests (Bartlett's χ², Cochran's C, Hartley's Fmax, and Levene's F), Student's t-test for one population mean, for two independent means, and for two paired means, as well as ANOVA and its extensions.
Regardless of the application's main goal, parametric tests can provide good and interesting research results that will be useful in the decision-making process. Whatever modeling software is chosen, the correct use of each test must always be based on the underlying theory, without ever ignoring the researcher's experience and intuition.
9.10 EXERCISES

(1) In what situations should parametric tests be applied, and what are the assumptions of these tests?
(2) What are the advantages and disadvantages of parametric tests?
(3) What are the main parametric tests to verify the normality of the data? In what situations must we use each one of them?
(4) What are the main parametric tests to verify the variance homogeneity between groups? In what situations must we use each one of them?
(5) To test a single population mean, we can use the z-test or Student's t-test. In what cases must each one of them be applied?
(6) What are the main mean comparison tests? What are the assumptions of each test?
(7) The monthly aircraft sales data throughout last year can be seen in the table below. Check and see if there is normality in the data. Consider α = 5%.

Jan.  Feb.  Mar.  Apr.  May  Jun.  Jul.  Aug.  Sept.  Oct.  Nov.  Dec.
 48    52    50    49    47   50    51    54     39    56    52    55
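One way to check the normality of a small sample such as the one in exercise (7) is the Kolmogorov-Smirnov distance between the empirical distribution and a normal curve. The sketch below uses only the Python standard library; since the mean and standard deviation are estimated from the sample itself, this is the Lilliefors variant of the test, and the tabulated critical values for that variant apply.

```python
# Sketch: Kolmogorov-Smirnov (Lilliefors variant) statistic for normality,
# computed against a normal distribution with the sample mean and the
# sample standard deviation. The decision is then made by comparing D
# with the tabulated critical value for the chosen significance level.
import math
import statistics

# Monthly aircraft sales from exercise (7)
data = sorted([48, 52, 50, 49, 47, 50, 51, 54, 39, 56, 52, 55])
n = len(data)
mean = statistics.mean(data)
sd = statistics.stdev(data)  # sample standard deviation

def normal_cdf(x, mu, sigma):
    # Standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

# D = largest distance between the empirical CDF (checked just before and
# just after each jump) and the theoretical normal CDF
D = max(
    max(abs((i + 1) / n - normal_cdf(x, mean, sd)),
        abs(i / n - normal_cdf(x, mean, sd)))
    for i, x in enumerate(data)
)
print(round(D, 4))  # compare with the critical value for n = 12
```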
(8) Test the normality of the temperature data listed (α = 5%):

12.5  14.2  13.4  14.6  12.7  10.9  16.5  14.7
11.2  10.9  12.1  12.8  13.8  13.5  13.2  14.1
15.5  16.2  10.8  14.3  12.8  12.4  11.4  16.2
14.3  14.8  14.6  13.7  13.5  10.8  10.4  11.5
11.9  11.3  14.2  11.2  13.4  16.1  13.5  17.5
16.2  15.0  14.2  13.2  12.4  13.4  12.7  11.2
(9) The table shows the final grades of two students in nine subjects. Check and see if there is variance homogeneity between the students (α = 5%).

Student 1:  6.4  5.8  6.9  5.4  7.3  8.2  6.1  5.5  6.0
Student 2:  6.5  7.0  7.5  6.5  8.1  9.0  7.5  6.5  6.8
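For two groups, as in exercise (9), Hartley's Fmax statistic reduces to the ratio of the larger to the smaller sample variance. The sketch below computes it with the Python standard library; the decision is then made by comparing the statistic with the tabulated critical value for the given group sizes and significance level.

```python
# Sketch: variance homogeneity between two groups via Hartley's Fmax,
# which for two groups is simply (larger variance) / (smaller variance).
import statistics

# Final grades of the two students from exercise (9)
student_1 = [6.4, 5.8, 6.9, 5.4, 7.3, 8.2, 6.1, 5.5, 6.0]
student_2 = [6.5, 7.0, 7.5, 6.5, 8.1, 9.0, 7.5, 6.5, 6.8]

var_1 = statistics.variance(student_1)  # sample variances (n - 1 divisor)
var_2 = statistics.variance(student_2)
F_max = max(var_1, var_2) / min(var_1, var_2)
print(round(F_max, 4))  # compare with the tabulated Fmax critical value
```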
(10) A fat-free yogurt manufacturer states that the number of calories in each cup is 60 cal. In order to check whether this information is true, a random sample of 36 cups is collected, and the average number of calories observed is 65 cal, with a standard deviation of 3.5. Apply the appropriate test and check if the manufacturer's statement is true, considering a significance level of 5%.
(11) We would like to compare the average waiting time before being seen by a doctor (in minutes) in two hospitals. In order to do that, we collected a sample of 20 patients from each hospital. The data are available in the tables. Check and see if there are differences between the average waiting times in the two hospitals. Consider α = 1%.
Hospital 1:  72  58  91  88  70  76  98  101  65  73
             79  82  80  91  93  88  97   83  71  74

Hospital 2:  66  40  55  70  76  61  53   50  47  61
             52  48  60  72  57  70  66   55  46  51
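A test such as the one required by exercise (11) can be carried out in SPSS or Stata; as an illustration only, the sketch below computes the Student's t statistic for two independent samples "by hand" in Python, using the pooled variance estimate (which assumes the two population variances are equal, an assumption that should itself be checked with a homogeneity test).

```python
# Sketch: Student's t statistic for two independent samples with a
# pooled variance estimate (equal population variances assumed).
import math
import statistics

# Waiting times (minutes) from exercise (11)
hospital_1 = [72, 58, 91, 88, 70, 76, 98, 101, 65, 73,
              79, 82, 80, 91, 93, 88, 97, 83, 71, 74]
hospital_2 = [66, 40, 55, 70, 76, 61, 53, 50, 47, 61,
              52, 48, 60, 72, 57, 70, 66, 55, 46, 51]

n1, n2 = len(hospital_1), len(hospital_2)
s1, s2 = statistics.variance(hospital_1), statistics.variance(hospital_2)
sp2 = ((n1 - 1) * s1 + (n2 - 1) * s2) / (n1 + n2 - 2)  # pooled variance

t = (statistics.mean(hospital_1) - statistics.mean(hospital_2)) / math.sqrt(
    sp2 * (1 / n1 + 1 / n2)
)
print(round(t, 3))  # compare with t(n1 + n2 - 2) at the chosen level
```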
(12) Thirty teenagers whose total cholesterol level is higher than what is advisable underwent treatment that consisted of a diet and physical activities. The tables show the levels of LDL cholesterol (mg/dL) before and after the treatment. Check if the treatment was effective (α = 5%).

Before the treatment:
220  212  227  234  204  209  211  245  237  250
208  224  220  218  208  205  227  207  222  213
210  234  240  227  229  224  204  210  215  228

After the treatment:
195  180  200  204  180  195  200  210  205  211
175  198  195  200  190  200  222  198  201  194
190  204  230  222  209  198  195  190  201  210
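Because the same teenagers are measured before and after the treatment, exercise (12) calls for the paired t-test. As an illustrative sketch in Python (not the book's SPSS/Stata workflow), the statistic is computed on the "before minus after" differences:

```python
# Sketch: Student's t statistic for two paired samples, computed on the
# individual "before minus after" differences.
import math
import statistics

# LDL cholesterol levels (mg/dL) from exercise (12)
before = [220, 212, 227, 234, 204, 209, 211, 245, 237, 250,
          208, 224, 220, 218, 208, 205, 227, 207, 222, 213,
          210, 234, 240, 227, 229, 224, 204, 210, 215, 228]
after = [195, 180, 200, 204, 180, 195, 200, 210, 205, 211,
         175, 198, 195, 200, 190, 200, 222, 198, 201, 194,
         190, 204, 230, 222, 209, 198, 195, 190, 201, 210]

diffs = [b - a for b, a in zip(before, after)]
n = len(diffs)
t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
print(round(t, 3))  # compare with t(n - 1) for a one-sided test
```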
(13) An aerospace company produces civilian and military helicopters at its three factories. The tables show the monthly production of helicopters in the last 12 months at each factory. Check if there is a difference between the population means. Consider α = 5%.

Factory 1:  24  26  28  22  31  25  27  28  30  21  20  24
Factory 2:  28  26  24  30  24  27  25  29  30  27  26  25
Factory 3:  29  25  24  26  20  22  22  27  20  26  24  25
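Comparing the means of three groups, as exercise (13) requires, is the job of the one-way ANOVA studied earlier in this chapter. The sketch below computes the one-way F statistic "by hand" in Python on three small hypothetical groups (illustration only, not the factory data):

```python
# Sketch: one-way ANOVA F statistic computed by hand for three groups.
# The three small groups below are hypothetical, for illustration only.
import statistics

groups = [
    [24.0, 26.0, 28.0, 22.0],
    [30.0, 29.0, 27.0, 31.0],
    [25.0, 24.0, 26.0, 23.0],
]
k = len(groups)                       # number of groups
N = sum(len(g) for g in groups)       # total number of observations
grand = sum(sum(g) for g in groups) / N

# Between-groups and within-groups sums of squares
SSB = sum(len(g) * (statistics.mean(g) - grand) ** 2 for g in groups)
SSW = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)

F = (SSB / (k - 1)) / (SSW / (N - k))
print(round(F, 3))  # compare with F(k - 1, N - k) at the chosen level
```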