Chapter 9

Hypotheses Tests

We must conduct research and then accept the results. If they don't stand up to experimentation, Buddha's own words must be rejected.
Tenzin Gyatso, 14th Dalai Lama

9.1 INTRODUCTION

As discussed previously, one of the problems addressed by statistical inference is hypothesis testing. A statistical hypothesis is an assumption about a certain population parameter, such as the mean, the standard deviation, or the correlation coefficient. A hypothesis test is a procedure for deciding whether such a hypothesis should be accepted or rejected. In order for a statistical hypothesis to be validated or rejected with certainty, it would be necessary to examine the entire population, which in practice is not viable. As an alternative, we draw a random sample from the population we are interested in. Since the decision is made based on the sample, errors may occur (rejecting a hypothesis when it is true, or not rejecting a hypothesis when it is false), as we will study later on. The procedures and concepts necessary to construct a hypothesis test are presented next.

Let X be a variable associated to a population and θ a certain parameter of this population. We must define the hypothesis to be tested about parameter θ of this population, which is called the null hypothesis:

H0: θ = θ0   (9.1)

Let's also define the alternative hypothesis (H1), accepted in case H0 is rejected, which can be characterized as follows:

H1: θ ≠ θ0   (9.2)

and the test is called a bilateral test (or two-tailed test). The significance level of a test (α) represents the probability of rejecting the null hypothesis when it is true (one of the two errors that may occur, as we will see later). The critical region (CR) or rejection region (RR) of a bilateral test is represented by two tails of the same size, in the left and right extremities of the distribution curve, and each of them corresponds to half of the significance level α, as shown in Fig. 9.1. Another way to define the alternative hypothesis (H1) would be:

H1: θ < θ0   (9.3)

and the test is called a unilateral test to the left (or left-tailed test). In this case, the critical region is in the left tail of the distribution and corresponds to significance level α, as shown in Fig. 9.2. Or the alternative hypothesis could be:

FIG. 9.1 Critical region (CR) of a bilateral test, also emphasizing the nonrejection region (NR) of the null hypothesis.

Data Science for Business and Decision Making. https://doi.org/10.1016/B978-0-12-811216-8.00009-4 © 2019 Elsevier Inc. All rights reserved.


PART IV Statistical Inference

H1: θ > θ0   (9.4)

and the test is called a unilateral test to the right (or right-tailed test). In this case, the critical region is in the right tail of the distribution and corresponds to significance level α, as shown in Fig. 9.3. Thus, if the main objective is to check whether a parameter is significantly higher or lower than a certain value, we use a unilateral test. On the other hand, if the objective is to check whether a parameter is simply different from a certain value, we use a bilateral test.

After defining the null hypothesis to be tested, we decide, based on a random sample collected from the population, whether or not to reject it. Since the decision is made based on the sample, two types of errors may happen:

Type I error: rejecting the null hypothesis when it is true. The probability of this type of error is represented by α:

P(type I error) = P(rejecting H0 | H0 is true) = α   (9.5)

Type II error: not rejecting the null hypothesis when it is false. The probability of this type of error is represented by β:

P(type II error) = P(not rejecting H0 | H0 is false) = β   (9.6)

Table 9.1 shows the types of errors that may happen in a hypothesis test. The procedure for defining a hypothesis test includes the following phases:

Step 1: Choosing the most suitable statistical test, depending on the researcher's intention.
Step 2: Presenting the test's null hypothesis H0 and its alternative hypothesis H1.
Step 3: Setting the significance level α.
Step 4: Calculating the observed value of the statistic based on the sample obtained from the population.
Step 5: Determining the test's critical region based on the value of α set in Step 3.
Step 6: Decision: if the value of the statistic lies in the critical region, reject H0. Otherwise, do not reject H0.

According to Fávero et al. (2009), most statistical software packages, among them SPSS and Stata, calculate the P-value, which corresponds to the probability associated to the value of the statistic calculated from the sample. The P-value indicates the lowest significance level observed that would lead to the rejection of the null hypothesis. Thus, we reject H0 if P ≤ α.

FIG. 9.2 Critical region (CR) of a left-tailed test, also emphasizing the nonrejection region of the null hypothesis (NR).

FIG. 9.3 Critical region (CR) of a right-tailed test.

TABLE 9.1 Types of Errors

Decision            H0 Is True                H0 Is False
Not rejecting H0    Correct decision (1 − α)  Type II error (β)
Rejecting H0        Type I error (α)          Correct decision (1 − β)
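The meaning of α in Table 9.1 can be checked empirically: if we repeatedly draw samples from a population in which H0 is in fact true and apply a bilateral test at α = 5%, we should wrongly reject H0 in roughly 5% of the samples. Below is a minimal simulation sketch in Python (not part of the original text; the population values μ0 = 50 and σ = 10 are arbitrary choices for illustration):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(42)
alpha, n, n_sim = 0.05, 30, 20000
mu0, sigma = 50.0, 10.0  # hypothetical population; H0: mu = mu0 is true here

rejections = 0
for _ in range(n_sim):
    sample = rng.normal(mu0, sigma, n)            # H0 is true by construction
    z = (sample.mean() - mu0) / (sigma / np.sqrt(n))
    p = 2 * (1 - norm.cdf(abs(z)))                # bilateral (two-tailed) test
    if p <= alpha:
        rejections += 1                           # a Type I error

print(rejections / n_sim)  # close to alpha = 0.05
```

The observed rejection rate converges to α as the number of simulated samples grows, which is exactly the definition in Expression (9.5).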


If we use the P-value instead of the statistic's critical value, Steps 5 and 6 of the construction of the hypothesis test become:

Step 5: Determine the P-value that corresponds to the probability associated to the value of the statistic calculated in Step 4.
Step 6: Decision: if the P-value is less than the significance level α established in Step 3, reject H0. Otherwise, do not reject H0.
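The two decision routes (critical value vs. P-value) always agree. A short sketch in Python illustrates this for a bilateral z-test; the observed statistic z_cal = 2.10 is a made-up number used only for illustration:

```python
from scipy.stats import norm

alpha = 0.05
z_cal = 2.10  # hypothetical observed statistic

# Route A (Steps 5-6 with a critical value): bilateral CR is |z| > z_{alpha/2}
z_crit = norm.ppf(1 - alpha / 2)          # approx. 1.96
reject_by_cr = abs(z_cal) > z_crit

# Route B (Steps 5-6 with a P-value): P = P(|Z| >= |z_cal|) under H0
p_value = 2 * (1 - norm.cdf(abs(z_cal)))  # approx. 0.036
reject_by_p = p_value <= alpha

print(reject_by_cr, reject_by_p)  # both routes reach the same decision
```

Whichever route is used, the statistic falls in the critical region exactly when its P-value falls below α.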

9.2 PARAMETRIC TESTS

Hypothesis tests are divided into parametric and nonparametric tests. In this chapter, we will study parametric tests; nonparametric tests will be studied in the next chapter.

Parametric tests involve population parameters. A parameter is any numerical measure or quantitative characteristic that describes a population. Parameters are fixed values, usually unknown, and represented by Greek characters, such as the population mean (μ), the population standard deviation (σ), and the population variance (σ²), among others. When hypotheses are formulated about population parameters, the hypothesis test is called parametric. In nonparametric tests, hypotheses are formulated about qualitative characteristics of the population. Therefore, parametric methods are applied to quantitative data and require strong assumptions in order to be valid, including:

(i) The observations must be independent;
(ii) The sample must be drawn from a population with a certain distribution, usually normal;
(iii) The populations must have equal variances for the tests comparing two paired population means or k population means (k ≥ 3);
(iv) The variables being studied must be measured on an interval or ratio scale, so that arithmetic operations can be applied to their values.

We will study the main parametric tests, including tests for normality, tests for the homogeneity of variances, Student's t-test and its applications, in addition to the analysis of variance (ANOVA) and its extensions. All of them will be solved in an analytical way and also through the statistical software packages SPSS and Stata. To verify the univariate normality of the data, the most commonly used tests are the Kolmogorov-Smirnov and Shapiro-Wilk tests. To compare the homogeneity of variances between populations, we have Bartlett's χ² (1937), Cochran's C (1947a,b), Hartley's Fmax (1950), and Levene's F (1960) tests.
We will describe Student's t-test for three situations: to test hypotheses about the population mean, to compare two independent means, and to compare two paired means. ANOVA is an extension of Student's t-test and is used to compare the means of more than two populations. In this chapter, one-way ANOVA, two-way ANOVA, and its extension to more than two factors will be described.

9.3 UNIVARIATE TESTS FOR NORMALITY

Among all univariate tests for normality, the most common are Kolmogorov-Smirnov, Shapiro-Wilk, and Shapiro-Francia.

9.3.1 Kolmogorov-Smirnov Test

The Kolmogorov-Smirnov test (K-S) is an adherence (goodness-of-fit) test, that is, it compares the cumulative frequency distribution of a set of sample values (observed values) to a theoretical distribution. The main goal is to test whether the sample values come from a population with a supposed theoretical or expected distribution, in this case, the normal distribution. The statistic is given by the point with the biggest difference (in absolute value) between the two distributions. To use the K-S test, the population mean and standard deviation must be known. For small samples the test loses power, so it should be used with large samples (n ≥ 30). The K-S test assumes the following hypotheses:

H0: the sample comes from a population with distribution N(μ, σ)
H1: the sample does not come from a population with distribution N(μ, σ)


As specified in Fávero et al. (2009), let Fexp(X) be the expected (normal) cumulative relative frequency distribution of variable X, where Fexp(X) ~ N(μ, σ), and Fobs(X) the observed cumulative relative frequency distribution of variable X. The objective is to test whether Fobs(X) = Fexp(X), in contrast with the alternative that Fobs(X) ≠ Fexp(X). The statistic can be calculated through the following expression:

Dcal = max{|Fexp(Xi) − Fobs(Xi)|, |Fexp(Xi) − Fobs(Xi−1)|}, for i = 1, …, n   (9.7)

where:
Fexp(Xi): expected cumulative relative frequency in category i;
Fobs(Xi): observed cumulative relative frequency in category i;
Fobs(Xi−1): observed cumulative relative frequency in category i − 1.

The critical values of the Kolmogorov-Smirnov statistic (Dc) are shown in Table G in the Appendix. This table provides the critical values of Dc considering that P(Dcal > Dc) = α (for a right-tailed test). In order for the null hypothesis H0 to be rejected, the value of the Dcal statistic must be in the critical region, that is, Dcal > Dc. Otherwise, we do not reject H0. The P-value (the probability associated to the value of the Dcal statistic calculated from the sample) can also be obtained from Table G. In this case, we reject H0 if P ≤ α.

Example 9.1: Using the Kolmogorov-Smirnov Test
Table 9.E.1 shows the data on a company's monthly production of farming equipment in the last 36 months. Check whether the data in Table 9.E.1 come from a population that follows a normal distribution, considering that α = 5%.

TABLE 9.E.1 Production of Farming Equipment in the Last 36 Months

52  50  44  50  42  30  36  34  48  40  55  40
30  36  40  42  55  44  38  42  40  38  52  44
52  34  38  44  48  36  36  55  50  34  44  42

Solution
Step 1: Since the objective is to verify whether the data in Table 9.E.1 come from a population with a normal distribution, the most suitable test is the Kolmogorov-Smirnov (K-S).
Step 2: The K-S test hypotheses for this example are:
H0: the production of farming equipment in the population follows distribution N(μ, σ)
H1: the production of farming equipment in the population does not follow distribution N(μ, σ)
Step 3: The significance level to be considered is 5%.
Step 4: All the steps necessary to calculate Dcal from Expression (9.7) are specified in Table 9.E.2.

TABLE 9.E.2 Calculating the Kolmogorov-Smirnov Statistic

Xi   Fabs(a)  Fac(b)  Fracobs(c)  Zi(d)     Fracexp(e)  |Fexp(Xi) − Fobs(Xi)|  |Fexp(Xi) − Fobs(Xi−1)|
30   2        2       0.056       −1.7801   0.0375      0.018                  0.036
34   3        5       0.139       −1.2168   0.1118      0.027                  0.056
36   4        9       0.250       −0.9351   0.1743      0.076                  0.035
38   3        12      0.333       −0.6534   0.2567      0.077                  0.007
40   4        16      0.444       −0.3717   0.3551      0.089                  0.022
42   4        20      0.556       −0.0900   0.4641      0.092                  0.020
44   5        25      0.694       0.1917    0.5760      0.118                  0.020
48   2        27      0.750       0.7551    0.7749      0.025                  0.081
50   3        30      0.833       1.0368    0.8501      0.017                  0.100
52   3        33      0.917       1.3185    0.9064      0.010                  0.073
55   3        36      1           1.7410    0.9592      0.041                  0.043

(a) Absolute frequency.
(b) Cumulative (absolute) frequency.
(c) Observed cumulative relative frequency of Xi.
(d) Standardized Xi values according to the expression Zi = (Xi − X̄)/S.
(e) Expected cumulative relative frequency of Xi; it corresponds to the probability obtained in Table E in the Appendix (standard normal distribution table) from the value of Zi.

Therefore, the value of the K-S statistic based on the sample is Dcal = 0.118.
Step 5: According to Table G in the Appendix, for n = 36 and α = 5%, the critical value of the Kolmogorov-Smirnov statistic is Dc = 0.23.
Step 6: Decision: since the value calculated is not in the critical region (Dcal < Dc), the null hypothesis is not rejected, which allows us to conclude, with a 95% confidence level, that the sample is drawn from a population that follows a normal distribution.
If we use the P-value instead of the statistic's critical value, Steps 5 and 6 will be:
Step 5: According to Table G in the Appendix, for a sample size n = 36, the probability associated to Dcal = 0.118 has as its lowest limit P = 0.20.
Step 6: Decision: since P > 0.05, we do not reject H0.
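Outside SPSS and Stata, the same check can be reproduced with, for instance, Python's scipy library (this sketch is not part of the original solution; scipy.stats.kstest computes the same supremum distance of Expression (9.7) when we pass the sample estimates of μ and σ):

```python
import numpy as np
from scipy.stats import kstest

# Data from Table 9.E.1 (36 months of production)
production = [52, 50, 44, 50, 42, 30, 36, 34, 48, 40, 55, 40,
              30, 36, 40, 42, 55, 44, 38, 42, 40, 38, 52, 44,
              52, 34, 38, 44, 48, 36, 36, 55, 50, 34, 44, 42]

mu = np.mean(production)              # approx. 42.64
sigma = np.std(production, ddof=1)    # approx. 7.10

# K-S test against N(mu, sigma)
stat, p = kstest(production, 'norm', args=(mu, sigma))
print(round(stat, 3), p)  # Dcal close to 0.118; P > 0.05, so H0 is not rejected
```

Note that, as in the manual solution, μ and σ are estimated from the sample itself; a stricter treatment of this case would use the Lilliefors correction of the K-S critical values.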

9.3.2 Shapiro-Wilk Test

The Shapiro-Wilk test (S-W) is based on Shapiro and Wilk (1965) and can be applied to samples with 4 ≤ n ≤ 2000 observations; it is an alternative to the Kolmogorov-Smirnov test for normality (K-S) in the case of small samples (n < 30). Analogous to the K-S test, the S-W test for normality assumes the following hypotheses:

H0: the sample comes from a population with distribution N(μ, σ)
H1: the sample does not come from a population with distribution N(μ, σ)

The calculation of the Shapiro-Wilk statistic (Wcal) is given by:

Wcal = b² / Σ(Xi − X̄)², with the sum over i = 1, …, n   (9.8)

b = Σ ai,n · (X(n−i+1) − X(i)), with the sum over i = 1, …, n/2   (9.9)

where:
X(i) are the sample statistics of order i, that is, the i-th ordered observation, so X(1) ≤ X(2) ≤ … ≤ X(n);
X̄ is the mean of X;
ai,n are constants generated from the means, variances, and covariances of the statistics of order i of a random sample of size n from a normal distribution. Their values can be seen in Table H2 in the Appendix.

Small values of Wcal indicate that the distribution of the variable being studied is not normal. The critical values of the Shapiro-Wilk statistic (Wc) are shown in Table H1 in the Appendix. Different from most tables, this table provides the critical values of Wc considering that P(Wcal < Wc) = α (for a left-tailed test). In order for the null hypothesis H0 to be rejected, the value of the Wcal statistic must be in the critical region, that is, Wcal < Wc. Otherwise, we do not reject H0. The P-value (the probability associated to the value of the Wcal statistic calculated from the sample) can also be obtained from Table H1. In this case, we reject H0 if P ≤ α.


Example 9.2: Using the Shapiro-Wilk Test
Table 9.E.3 shows the data on an aerospace company's monthly production of aircraft in the last 24 months. Check whether the data in Table 9.E.3 come from a population with a normal distribution, considering that α = 1%.

TABLE 9.E.3 Production of Aircraft in the Last 24 Months

28  32  46  24  22  18  20  34  30  24  31  29
15  19  23  25  28  30  32  36  39  16  23  36

Solution
Step 1: For a normality test in which n < 30, the most recommended test is the Shapiro-Wilk (S-W).
Step 2: The S-W test hypotheses for this example are:
H0: the production of aircraft in the population follows normal distribution N(μ, σ)
H1: the production of aircraft in the population does not follow normal distribution N(μ, σ)
Step 3: The significance level to be considered is 1%.
Step 4: The calculation of the S-W statistic for the data in Table 9.E.3, according to Expressions (9.8) and (9.9), is shown next. First of all, to calculate b, we must sort the data in Table 9.E.3 in ascending order, as shown in Table 9.E.4. All the steps necessary to calculate b, from Expression (9.9), are specified in Table 9.E.5. The values of ai,n were obtained from Table H2 in the Appendix.

TABLE 9.E.4 Values From Table 9.E.3 Sorted in Ascending Order

15  16  18  19  20  22  23  23  24  24  25  28
28  29  30  30  31  32  32  34  36  36  39  46

TABLE 9.E.5 Procedure to Calculate b

i    n − i + 1  ai,n     X(n−i+1)  X(i)  ai,n · (X(n−i+1) − X(i))
1    24         0.4493   46        15    13.9283
2    23         0.3098   39        16    7.1254
3    22         0.2554   36        18    4.5972
4    21         0.2145   36        19    3.6465
5    20         0.1807   34        20    2.5298
6    19         0.1512   32        22    1.5120
7    18         0.1245   32        23    1.1205
8    17         0.0997   31        23    0.7976
9    16         0.0764   30        24    0.4584
10   15         0.0539   30        24    0.3234
11   14         0.0321   29        25    0.1284
12   13         0.0107   28        28    0.0000

b = 36.1675

We have Σ(Xi − X̄)² = (28 − 27.5)² + ⋯ + (36 − 27.5)² = 1338

Therefore, Wcal = b² / Σ(Xi − X̄)² = (36.1675)² / 1338 = 0.978

Step 5: According to Table H1 in the Appendix, for n = 24 and α = 1%, the critical value of the Shapiro-Wilk statistic is Wc = 0.884.


Step 6: Decision: the null hypothesis is not rejected, since Wcal > Wc (Table H1 provides the critical values of Wc considering that P(Wcal < Wc) = α), which allows us to conclude, with a 99% confidence level, that the sample is drawn from a population with a normal distribution.
If we use the P-value instead of the statistic's critical value, Steps 5 and 6 will be:
Step 5: According to Table H1 in the Appendix, for a sample size n = 24, the probability associated to Wcal = 0.978 is between 0.50 and 0.90 (a probability of 0.90 is associated to Wcal = 0.981).
Step 6: Decision: since P > 0.01, we do not reject H0.
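As a cross-check outside SPSS and Stata, the Shapiro-Wilk statistic for this sample can also be obtained with Python's scipy library (not part of the original solution; scipy uses Royston's approximation of the ai,n coefficients, so its result may differ slightly from the calculation with the tabulated values):

```python
from scipy.stats import shapiro

# Data from Table 9.E.3 (24 months of aircraft production)
aircraft = [28, 32, 46, 24, 22, 18, 20, 34, 30, 24, 31, 29,
            15, 19, 23, 25, 28, 30, 32, 36, 39, 16, 23, 36]

w, p = shapiro(aircraft)
# W close to the 0.978 computed above; P well above 0.01, so H0 is not rejected
print(round(w, 3), round(p, 3))
```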

9.3.3 Shapiro-Francia Test

This test is based on Shapiro and Francia (1972). According to Sarkadi (1975), the Shapiro-Wilk (S-W) and Shapiro-Francia (S-F) tests have the same format, differing only in the definition of the coefficients. Moreover, calculating the S-F statistic is much simpler, and the test can be considered a simplified version of the S-W test. Despite its simplicity, it is as robust as the Shapiro-Wilk test, making it a substitute for the S-W. The Shapiro-Francia test can be applied to samples with 5 ≤ n ≤ 5000 observations, and it is similar to the Shapiro-Wilk test for large samples. Analogous to the S-W test, the S-F test assumes the following hypotheses:

H0: the sample comes from a population with distribution N(μ, σ)
H1: the sample does not come from a population with distribution N(μ, σ)

The calculation of the Shapiro-Francia statistic (W′cal) is given by:

W′cal = [Σ mi · X(i)]² / [Σ mi² · Σ(Xi − X̄)²], with the sums over i = 1, …, n   (9.10)

where:
X(i) are the sample statistics of order i, that is, the ith ordered observation, so X(1) ≤ X(2) ≤ … ≤ X(n);
mi is the approximate expected value of the ith order statistic (Z-score). The values of mi are estimated by:

mi = Φ⁻¹(i / (n + 1))   (9.11)

where Φ⁻¹ corresponds to the inverse of the standard normal distribution function, with mean zero and standard deviation 1. These values can be obtained from Table E in the Appendix.

Small values of W′cal indicate that the distribution of the variable being studied is not normal. The critical values of the Shapiro-Francia statistic (W′c) are shown in Table H1 in the Appendix. Different from most tables, this table provides the critical values of W′c considering that P(W′cal < W′c) = α (for a left-tailed test). In order for the null hypothesis H0 to be rejected, the value of the W′cal statistic must be in the critical region, that is, W′cal < W′c. Otherwise, we do not reject H0. The P-value (the probability associated to the W′cal statistic calculated from the sample) can also be obtained from Table H1. In this case, we reject H0 if P ≤ α.

Example 9.3: Using the Shapiro-Francia Test
Table 9.E.6 shows the data regarding a company's daily production of bicycles in the last 60 months. Check whether the data come from a population with a normal distribution, considering α = 5%.
Solution
Step 1: The normality of the data can be verified through the Shapiro-Francia test.
Step 2: The S-F test hypotheses for this example are:
H0: the production of bicycles in the population follows normal distribution N(μ, σ)
H1: the production of bicycles in the population does not follow normal distribution N(μ, σ)
Step 3: The significance level to be considered is 5%.


TABLE 9.E.6 Production of Bicycles in the Last 60 Months

85  70  74  49  67  88  80  91  57  63  66  60  72  81  73
80  55  54  93  77  80  64  60  63  67  54  59  78  73  84
91  57  59  64  68  67  70  76  78  75  80  81  70  77  65
63  59  60  61  74  76  81  79  78  60  68  76  71  72  84

Step 4: The procedure to calculate the S-F statistic for the data in Table 9.E.6 is shown in Table 9.E.7.
Therefore, W′cal = (574.6704)² / (53.1904 × 6278.8500) = 0.989

TABLE 9.E.7 Procedure to Calculate the Shapiro-Francia Statistic

i    X(i)  i/(n + 1)  mi        mi · X(i)  mi²      (Xi − X̄)²
1    49    0.0164     −2.1347   −104.5995  4.5569   481.8025
2    54    0.0328     −1.8413   −99.4316   3.3905   287.3025
3    54    0.0492     −1.6529   −89.2541   2.7319   287.3025
4    55    0.0656     −1.5096   −83.0276   2.2789   254.4025
5    57    0.0820     −1.3920   −79.3417   1.9376   194.6025
6    57    0.0984     −1.2909   −73.5841   1.6665   194.6025
7    59    0.1148     −1.2016   −70.8960   1.4439   142.8025
8    59    0.1311     −1.1210   −66.1380   1.2566   142.8025
…
60   93    0.9836     2.1347    198.5256   4.5569   486.2025
Sum                             574.6704   53.1904  6278.8500

Step 5: According to Table H1 in the Appendix, for n = 60 and α = 5%, the critical value of the Shapiro-Francia statistic is W′c = 0.9625.
Step 6: Decision: the null hypothesis is not rejected because W′cal > W′c (Table H1 provides the critical values of W′c considering that P(W′cal < W′c) = α), which allows us to conclude, with a 95% confidence level, that the sample is drawn from a population that follows a normal distribution.
If we used the P-value instead of the statistic's critical value, Steps 5 and 6 would be:
Step 5: According to Table H1 in the Appendix, for a sample size n = 60, the probability associated to W′cal = 0.989 is greater than 0.10 (P-value).
Step 6: Decision: since P > 0.05, we do not reject H0.
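Expressions (9.10) and (9.11) are simple enough to be reproduced directly. The following Python sketch (not part of the original text) recomputes W′cal for the data in Table 9.E.6, with Φ⁻¹ supplied by scipy's norm.ppf:

```python
import numpy as np
from scipy.stats import norm

# Data from Table 9.E.6 (60 months of bicycle production)
bicycles = np.array([
    85, 70, 74, 49, 67, 88, 80, 91, 57, 63, 66, 60, 72, 81, 73,
    80, 55, 54, 93, 77, 80, 64, 60, 63, 67, 54, 59, 78, 73, 84,
    91, 57, 59, 64, 68, 67, 70, 76, 78, 75, 80, 81, 70, 77, 65,
    63, 59, 60, 61, 74, 76, 81, 79, 78, 60, 68, 76, 71, 72, 84])

x = np.sort(bicycles)                          # order statistics X(i)
n = len(x)
m = norm.ppf(np.arange(1, n + 1) / (n + 1))    # Expression (9.11): mi

# Expression (9.10)
w_prime = (m @ x) ** 2 / (np.sum(m ** 2) * np.sum((x - x.mean()) ** 2))
print(round(w_prime, 3))  # close to 0.989, as in Table 9.E.7
```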

9.3.4 Solving Tests for Normality by Using SPSS Software

The Kolmogorov-Smirnov and Shapiro-Wilk tests for normality can be solved by using IBM SPSS Statistics Software. The Shapiro-Francia test, on the other hand, will be elaborated through the Stata software, as we will see in the next section. Based on the procedure that will be described, SPSS shows the results of the K-S and the S-W tests for the sample selected. The use of the images in this section has been authorized by the International Business Machines Corporation©.
Let's consider the data presented in Example 9.1, which are available in the file Production_FarmingEquipment.sav. Let's open the file and select Analyze → Descriptive Statistics → Explore …, as shown in Fig. 9.4. In the Explore dialog box, we must select the variable we are interested in on the Dependent List, as shown in Fig. 9.5. Let's click on Plots … (the Explore: Plots dialog box will open) and select the option Normality plots with tests (Fig. 9.6). Finally, let's click on Continue and on OK.


FIG. 9.4 Procedure for elaborating a univariate normality test on SPSS for Example 9.1.

FIG. 9.5 Selecting the variable of interest.

The results of the Kolmogorov-Smirnov and Shapiro-Wilk tests for normality for the data in Example 9.1 are shown in Fig. 9.7. According to Fig. 9.7, the result of the K-S statistic was 0.118, similar to the value calculated in Example 9.1. Since the sample has more than 30 elements, we should only use the K-S test to verify the normality of the data (the S-W test was applied to Example 9.2). Nevertheless, SPSS also makes the result of the S-W statistic available for the sample selected.


FIG. 9.6 Selecting the normality test on SPSS.

FIG. 9.7 Results of the tests for normality for Example 9.1 on SPSS.

FIG. 9.8 Results of the tests for normality for Example 9.2 on SPSS.

As presented in the introduction of this chapter, SPSS calculates the P-value, which corresponds to the lowest significance level observed that would lead to the rejection of the null hypothesis. For the K-S and S-W tests, the P-value corresponds to the lowest value of P from which Dcal > Dc and Wcal < Wc, respectively. As shown in Fig. 9.7, the value of P for the K-S test was 0.200 (this probability can also be obtained from Table G in the Appendix, as shown in Example 9.1). Since P > 0.05, we do not reject the null hypothesis, which allows us to conclude, with a 95% confidence level, that the data distribution is normal. The S-W test also allows us to conclude that the data distribution is normal.
Applying the same procedure to verify the normality of the data in Example 9.2 (the data are available in the file Production_Aircraft.sav), we get the results shown in Fig. 9.8. Analogous to Example 9.2, the result of the S-W test was 0.978. The K-S test was not applied to this example due to the sample size (n < 30). The P-value of the S-W test is 0.857 (in Example 9.2, we saw that this probability would be between 0.50 and 0.90, closer to 0.90) and, since P > 0.01, the null hypothesis is not rejected, which allows us to conclude that the data distribution in the population is normal. We will use this test when estimating regression models in Chapter 13. For this example, we can also conclude from the K-S test that the data distribution follows a normal distribution.

9.3.5 Solving Tests for Normality by Using Stata

The Kolmogorov-Smirnov, Shapiro-Wilk, and Shapiro-Francia tests for normality can be solved by using Stata Statistical Software. The Kolmogorov-Smirnov test will be applied to Example 9.1, the Shapiro-Wilk test to Example 9.2, and the Shapiro-Francia test to Example 9.3. The use of the images in this section has been authorized by StataCorp LP©.

9.3.5.1 Kolmogorov-Smirnov Test on the Stata Software
The data presented in Example 9.1 are available in the file Production_FarmingEquipment.dta. Let's open this file and verify that the name of the variable being studied is production. To elaborate the Kolmogorov-Smirnov test on Stata, we must specify the mean and the standard deviation of the variable of interest in the test syntax, so the command summarize, or simply sum, must be typed first, followed by the respective variable:

sum production

and we get Fig. 9.9. We can see that the mean is 42.63889 and the standard deviation is 7.099911. The Kolmogorov-Smirnov test is given by the following command:

ksmirnov production = normal((production-42.63889)/7.099911)

The result of the test can be seen in Fig. 9.10. We can see that the value of the statistic is the same as the one calculated in Example 9.1 and by the SPSS software. Since P > 0.05, we conclude that the data distribution is normal.

9.3.5.2 Shapiro-Wilk Test on the Stata Software
The data presented in Example 9.2 are available in the file Production_Aircraft.dta. To elaborate the Shapiro-Wilk test on Stata, the syntax of the command is:

swilk variables*

where the term variables* should be substituted for the list of variables being considered. For the data in Example 9.2, we have a single variable called production, so the command to be typed is:

swilk production

FIG. 9.9 Descriptive statistics of the variable production.

FIG. 9.10 Results of the Kolmogorov-Smirnov test on Stata.


FIG. 9.11 Results of the Shapiro-Wilk test for Example 9.2 on Stata.

FIG. 9.12 Results of the Shapiro-Francia test for Example 9.3 on Stata.

The result of the Shapiro-Wilk test can be seen in Fig. 9.11. Since P > 0.05, we can conclude that the sample comes from a population with a normal distribution.

9.3.5.3 Shapiro-Francia Test on the Stata Software
The data presented in Example 9.3 are available in the file Production_Bicycles.dta. To elaborate the Shapiro-Francia test on Stata, the syntax of the command is:

sfrancia variables*

where the term variables* should be substituted for the list of variables being considered. For the data in Example 9.3, we have a single variable called production, so the command to be typed is:

sfrancia production

The result of the Shapiro-Francia test can be seen in Fig. 9.12. We can see that the value is the same as the one calculated in Example 9.3 (W′ = 0.989). Since P > 0.05, we conclude that the sample comes from a population with a normal distribution. We will use this test when estimating regression models in Chapter 13.

9.4 TESTS FOR THE HOMOGENEITY OF VARIANCES

One of the conditions to apply a parametric test to compare k population means is that the population variances, estimated from k representative samples, be homogeneous (equal). The most common tests to verify the homogeneity of variances are Bartlett's χ² (1937), Cochran's C (1947a,b), Hartley's Fmax (1950), and Levene's F (1960) tests.
In the null hypothesis of variance homogeneity tests, the variances of the k populations are homogeneous. In the alternative hypothesis, at least one population variance is different from the others. That is:

H0: σ1² = σ2² = … = σk²
H1: ∃ i, j: σi² ≠ σj² (i, j = 1, …, k)   (9.12)

9.4.1 Bartlett's χ² Test

The original test proposed to verify variance homogeneity among groups is Bartlett's χ² test (1937). This test is very sensitive to deviations from normality, and Levene's test is an alternative in this case. Bartlett's statistic is calculated from q:

q = (N − k) · ln(Sp²) − Σ (ni − 1) · ln(Si²), with the sum over i = 1, …, k   (9.13)


where:
ni, i = 1, …, k, is the size of each sample i, and Σ ni = N;
Si², i = 1, …, k, is the variance of each sample i;

and

Sp² = [Σ (ni − 1) · Si²] / (N − k), with the sum over i = 1, …, k   (9.14)

A correction factor c is applied to the q statistic, with the following expression:

c = 1 + [1 / (3 · (k − 1))] · [Σ 1/(ni − 1) − 1/(N − k)], with the sum over i = 1, …, k   (9.15)

and Bartlett's statistic (Bcal) approximately follows a chi-square distribution with k − 1 degrees of freedom:

Bcal = q / c ~ χ²(k−1)   (9.16)

From the previous expressions, we can see that the higher the difference between the variances, the higher the value of B. On the other hand, if all the sample variances are equal, its value will be zero. To confirm whether the null hypothesis of variance homogeneity will be rejected or not, the value calculated must be compared to the statistic's critical value (χ²c), which is available in Table D in the Appendix. This table provides the critical values of χ²c considering that P(χ²cal > χ²c) = α (for a right-tailed test). Therefore, we reject the null hypothesis if Bcal > χ²c. On the other hand, if Bcal ≤ χ²c, we do not reject H0. The P-value (the probability associated to the χ²cal statistic) can also be obtained from Table D. In this case, we reject H0 if P ≤ α.

Example 9.4: Applying Bartlett's χ² Test
A chain of supermarkets wishes to study the number of customers they serve every day in order to make strategic operational decisions. Table 9.E.8 shows the data of three stores throughout two weeks. Check whether the variances between the groups are homogeneous. Consider α = 5%.

TABLE 9.E.8 Number of Customers Served Per Day and Per Store

                    Store 1    Store 2     Store 3
Day 1               620        710         924
Day 2               630        780         695
Day 3               610        810         854
Day 4               650        755         802
Day 5               585        699         931
Day 6               590        680         924
Day 7               630        710         847
Day 8               644        850         800
Day 9               595        844         769
Day 10              603        730         863
Day 11              570        645         901
Day 12              605        688         888
Day 13              622        718         757
Day 14              578        702         712
Standard deviation  24.4059    62.2466     78.9144
Variance            595.6484   3874.6429   6227.4780

PART IV Statistical Inference

Solution
If we apply the Kolmogorov-Smirnov or the Shapiro-Wilk test for normality to the data in Table 9.E.8, we will verify that their distribution shows adherence to normality, with a 5% significance level, so Bartlett's χ² test can be applied to compare the homogeneity of the variances between the groups.
Step 1: Since the main goal is to compare the equality of the variances between the groups, we can use Bartlett's χ² test.
Step 2: Bartlett's χ² test hypotheses for this example are:
H0: the population variances of all three groups are homogeneous
H1: the population variance of at least one group is different from the others
Step 3: The significance level to be considered is 5%.
Step 4: The complete calculation of Bartlett's χ² statistic is shown. First, we calculate the value of Sp², according to Expression (9.14):

Sp² = 13 × (595.65 + 3874.64 + 6227.48) / (42 − 3) = 3565.92

Thus, we can calculate q through Expression (9.13):

q = 39 × ln(3565.92) − 13 × [ln(595.65) + ln(3874.64) + ln(6227.48)] = 14.94

The correction factor c for the q statistic is calculated from Expression (9.15):

c = 1 + [1 / (3 × (3 − 1))] × [3 × 1/13 − 1/(42 − 3)] = 1.0256

Finally, we calculate Bcal:

Bcal = q / c = 14.94 / 1.0256 = 14.567

Step 5: According to Table D in the Appendix, for ν = 3 − 1 degrees of freedom and α = 5%, the critical value of Bartlett's χ² test is χ²c = 5.991.
Step 6: Decision: since the calculated value lies in the critical region (Bcal > χ²c), the null hypothesis is rejected, which allows us to conclude, with a 95% confidence level, that the population variance of at least one group is different from the others.
If we use the P-value instead of the statistic's critical value, Steps 5 and 6 will be:
Step 5: According to Table D in the Appendix, for ν = 2 degrees of freedom, the probability associated to χ²cal = 14.567 is less than 0.005 (a probability of 0.005 is associated to χ²cal = 10.597).
Step 6: Decision: since P < 0.05, we reject H0.
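As a cross-check, Bartlett's test can be reproduced programmatically. The sketch below assumes the per-store counts transcribed from Table 9.E.8 and uses scipy.stats.bartlett, which implements Expressions (9.13) through (9.16); because scipy carries the correction factor to full precision, the statistic may differ slightly from the rounded hand calculation, while the decision is the same:

```python
# Bartlett's test for Example 9.4, using the data from Table 9.E.8.
from scipy import stats

store1 = [620, 630, 610, 650, 585, 590, 630, 644, 595, 603, 570, 605, 622, 578]
store2 = [710, 780, 810, 755, 699, 680, 710, 850, 844, 730, 645, 688, 718, 702]
store3 = [924, 695, 854, 802, 931, 924, 847, 800, 769, 863, 901, 888, 757, 712]

b_stat, p_value = stats.bartlett(store1, store2, store3)
print(b_stat, p_value)
# b_stat lands near the hand-computed value of roughly 14.5 and exceeds the
# critical value chi2(5%, 2 df) = 5.991, so H0 is rejected.
```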

9.4.2

Cochran’s C Test

Cochran's C test (1947a,b) compares the group with the highest variance in relation to the others. The test demands that the data have a normal distribution. Cochran's C statistic is given by:

Ccal = S²max / Σi=1..k Si²     (9.17)

where:
S²max is the highest sample variance;
Si² is the variance in sample i, i = 1, …, k.

According to Expression (9.17), if all the variances are equal, the value of the Ccal statistic is 1/k. The greater the difference of S²max in relation to the other variances, the closer the value of Ccal gets to 1. To confirm whether the null hypothesis will be rejected or not, the calculated value must be compared to Cochran's critical value (Cc), which is available in Table M in the Appendix.


The values of Cc vary depending on the number of groups (k), the number of degrees of freedom ν = max(ni − 1), and the value of α. Table M provides the critical values of Cc considering that P(Ccal > Cc) = α (for a right-tailed test). Thus, we reject H0 if Ccal > Cc. Otherwise, we do not reject H0.
Example 9.5: Applying Cochran's C Test
Use Cochran's C test for the data in Example 9.4. The main objective here is to compare the group with the highest variability in relation to the others.
Solution
Step 1: Since the objective is to compare the group with the highest variance (group 3; see Table 9.E.8) in relation to the others, Cochran's C test is the most recommended.
Step 2: Cochran's C test hypotheses for this example are:
H0: the population variance of group 3 is equal to the others
H1: the population variance of group 3 is different from the others
Step 3: The significance level to be considered is 5%.
Step 4: From Table 9.E.8, we can see that S²max = 6227.48. Therefore, the calculation of Cochran's C statistic is given by:

Ccal = 6227.48 / (595.65 + 3874.64 + 6227.48) = 0.582

Step 5: According to Table M in the Appendix, for k = 3, ν = 13, and α = 5%, the critical value of Cochran's C statistic is Cc = 0.575.
Step 6: Decision: since the calculated value lies in the critical region (Ccal > Cc), the null hypothesis is rejected, which allows us to conclude, with a 95% confidence level, that the population variance of group 3 is different from the others.
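Cochran's C statistic is simple enough to compute directly. The sketch below assumes the store data from Table 9.E.8 and defines a small helper (the function name `cochran_c` is illustrative, not a library routine); only the critical value still has to come from Table M:

```python
# Cochran's C statistic (Expression 9.17) for Example 9.5: ratio of the
# largest sample variance to the sum of all sample variances.
import statistics

def cochran_c(*samples):
    variances = [statistics.variance(s) for s in samples]  # sample variances (ddof = 1)
    return max(variances) / sum(variances)

store1 = [620, 630, 610, 650, 585, 590, 630, 644, 595, 603, 570, 605, 622, 578]
store2 = [710, 780, 810, 755, 699, 680, 710, 850, 844, 730, 645, 688, 718, 702]
store3 = [924, 695, 854, 802, 931, 924, 847, 800, 769, 863, 901, 888, 757, 712]

c_cal = cochran_c(store1, store2, store3)
print(round(c_cal, 3))  # 0.582, above the tabulated critical value Cc = 0.575
```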

9.4.3

Hartley’s Fmax Test

Hartley's Fmax test (1950) uses a statistic that represents the ratio between the highest sample variance (S²max) and the lowest sample variance (S²min):

Fmax,cal = S²max / S²min     (9.18)

The test assumes that the number of observations per group is equal (n1 = n2 = … = nk = n). If all the variances are equal, the value of Fmax will be 1. The greater the difference between S²max and S²min, the higher the value of Fmax. To confirm whether the null hypothesis of variance homogeneity will be rejected or not, the calculated value must be compared to the critical value (Fmax,c), which is available in Table N in the Appendix. The critical values vary depending on the number of groups (k), the number of degrees of freedom ν = n − 1, and the value of α, and this table provides the critical values of Fmax,c considering that P(Fmax,cal > Fmax,c) = α (for a right-tailed test). Therefore, we reject the null hypothesis H0 of variance homogeneity if Fmax,cal > Fmax,c. Otherwise, we do not reject H0. The P-value (the probability associated to the Fmax,cal statistic) can also be obtained from Table N in the Appendix. In this case, we reject H0 if P ≤ α.
Example 9.6: Applying Hartley's Fmax Test
Use Hartley's Fmax test for the data in Example 9.4. The goal here is to compare the group with the highest variability to the group with the lowest variability.
Solution
Step 1: Since the main objective is to compare the group with the highest variance (group 3; see Table 9.E.8) to the group with the lowest variance (group 1), Hartley's Fmax test is the most recommended.
Step 2: Hartley's Fmax test hypotheses for this example are:
H0: the population variance of group 3 is the same as group 1


H1: the population variance of group 3 is different from group 1
Step 3: The significance level to be considered is 5%.
Step 4: From Table 9.E.8, we can see that S²min = 595.65 and S²max = 6227.48. Therefore, the calculation of Hartley's Fmax statistic is given by:

Fmax,cal = S²max / S²min = 6227.48 / 595.65 = 10.45

Step 5: According to Table N in the Appendix, for k = 3, ν = 13, and α = 5%, the critical value of the test is Fmax,c = 3.953.
Step 6: Decision: since the calculated value lies in the critical region (Fmax,cal > Fmax,c), the null hypothesis is rejected, which allows us to conclude, with a 95% confidence level, that the population variance of group 3 is different from the population variance of group 1.
If we use the P-value instead of the statistic's critical value, Steps 5 and 6 will be:
Step 5: According to Table N in the Appendix, the probability associated to Fmax,cal = 10.45, for k = 3 and ν = 13, is less than 0.01.
Step 6: Decision: since P < 0.05, we reject H0.
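Expression (9.18) is a one-line computation; the sketch below assumes the store data from Table 9.E.8 and recomputes Hartley's statistic (the critical value Fmax,c still comes from Table N):

```python
# Hartley's Fmax statistic (Expression 9.18) for Example 9.6: ratio of the
# highest to the lowest sample variance (three groups of equal size n = 14).
import statistics

store1 = [620, 630, 610, 650, 585, 590, 630, 644, 595, 603, 570, 605, 622, 578]
store2 = [710, 780, 810, 755, 699, 680, 710, 850, 844, 730, 645, 688, 718, 702]
store3 = [924, 695, 854, 802, 931, 924, 847, 800, 769, 863, 901, 888, 757, 712]

variances = [statistics.variance(s) for s in (store1, store2, store3)]
f_max = max(variances) / min(variances)
print(f_max)  # about 10.45, above the tabulated critical value Fmax,c = 3.953
```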

9.4.4

Levene’s F-Test

The advantage of Levene's F-test in relation to other homogeneity of variance tests is that it is less sensitive to deviations from normality, in addition to being considered a more robust test. Levene's statistic is given by Expression (9.19) and it approximately follows an F-distribution with ν1 = k − 1 and ν2 = N − k degrees of freedom, for a significance level α:

Fcal = [(N − k) / (k − 1)] × [Σi=1..k ni × (Z̄i − Z̄)²] / [Σi=1..k Σj=1..ni (Zij − Z̄i)²] ~ F(k − 1, N − k, α)     (9.19)

where:
ni is the dimension of each one of the k samples (i = 1, …, k);
N is the dimension of the global sample (N = n1 + n2 + ⋯ + nk);
Zij = |Xij − X̄i|, i = 1, …, k and j = 1, …, ni;
Xij is observation j in sample i;
X̄i is the mean of sample i;
Z̄i is the mean of Zij in sample i;
Z̄ is the mean of the Z̄i in the global sample.

An expansion of Levene's test can be found in Brown and Forsythe (1974). From the F-distribution table (Table A in the Appendix), we can determine the critical values of Levene's statistic (Fc = F(k − 1, N − k, α)). Table A provides the critical values of Fc considering that P(Fcal > Fc) = α (right-tailed table). In order for the null hypothesis H0 to be rejected, the value of the statistic must lie in the critical region, that is, Fcal > Fc. If Fcal ≤ Fc, we do not reject H0. The P-value (the probability associated to the Fcal statistic) can also be obtained from Table A. In this case, we reject H0 if P ≤ α.
Example 9.7: Applying Levene's Test
Elaborate Levene's test for the data in Example 9.4.
Solution
Step 1: Levene's test can be applied to check variance homogeneity between the groups, and it is more robust than the other tests.


Step 2: Levene's test hypotheses for this example are:
H0: the population variances of all three groups are homogeneous
H1: the population variance of at least one group is different from the others
Step 3: The significance level to be considered is 5%.
Step 4: The calculation of the Fcal statistic, according to Expression (9.19), is shown in Table 9.E.9.

TABLE 9.E.9 Calculating the Fcal Statistic

i    X1j    Z1j = |X1j − X̄1|    Z1j − Z̄1    (Z1j − Z̄1)²
1    620    10.571              −9.429      88.898
1    630    20.571              0.571       0.327
1    610    0.571               −19.429     377.469
1    650    40.571              20.571      423.184
1    585    24.429              4.429       19.612
1    590    19.429              −0.571      0.327
1    630    20.571              0.571       0.327
1    644    34.571              14.571      212.327
1    595    14.429              −5.571      31.041
1    603    6.429               −13.571     184.184
1    570    39.429              19.429      377.469
1    605    4.429               −15.571     242.469
1    622    12.571              −7.429      55.184
1    578    31.429              11.429      130.612
     X̄1 = 609.429    Z̄1 = 20               Sum = 2143.429

i    X2j    Z2j = |X2j − X̄2|    Z2j − Z̄2    (Z2j − Z̄2)²
2    710    27.214              −23.204     538.429
2    780    42.786              −7.633      58.257
2    810    72.786              22.367      500.298
2    755    17.786              −32.633     1064.890
2    699    38.214              −12.204     148.940
2    680    57.214              6.796       46.185
2    710    27.214              −23.204     538.429
2    850    112.786             62.367      3889.686
2    844    106.786             56.367      3177.278
2    730    7.214               −43.204     1866.593
2    645    92.214              41.796      1746.899
2    688    49.214              −1.204      1.450
2    718    19.214              −31.204     973.695
2    702    35.214              −15.204     231.164
     X̄2 = 737.214    Z̄2 = 50.418           Sum = 14,782.192

i    X3j    Z3j = |X3j − X̄3|    Z3j − Z̄3    (Z3j − Z̄3)²
3    924    90.643              24.194      585.344
3    695    138.357             71.908      5170.784
3    854    20.643              −45.806     2098.201
3    802    31.357              −35.092     1231.437
3    931    97.643              31.194      973.058
3    924    90.643              24.194      585.344
3    847    13.643              −52.806     2788.487
3    800    33.357              −33.092     1095.070
3    769    64.357              −2.092      4.376
3    863    29.643              −36.806     1354.691
3    901    67.643              1.194       1.425
3    888    54.643              −11.806     139.385
3    757    76.357              9.908       98.172
3    712    121.357             54.908      3014.906
     X̄3 = 833.357    Z̄3 = 66.449           Sum = 19,140.678

Therefore, the calculation of Fcal is carried out as follows:

Fcal = [(42 − 3) / (3 − 1)] × [14 × (20 − 45.62)² + 14 × (50.418 − 45.62)² + 14 × (66.449 − 45.62)²] / (2143.429 + 14,782.192 + 19,140.678)

Fcal = 8.427

Step 5: According to Table A in the Appendix, for ν1 = 2, ν2 = 39, and α = 5%, the critical value of the test is Fc = 3.24.
Step 6: Decision: since the calculated value lies in the critical region (Fcal > Fc), the null hypothesis is rejected, which allows us to conclude, with a 95% confidence level, that the population variance of at least one group is different from the others.
If we use the P-value instead of the statistic's critical value, Steps 5 and 6 will be:
Step 5: According to Table A in the Appendix, for ν1 = 2 and ν2 = 39, the probability associated to Fcal = 8.427 is less than 0.01 (P-value).
Step 6: Decision: since P < 0.05, we reject H0.

9.4.5

Solving Levene’s Test by Using SPSS Software

The use of the images in this section has been authorized by the International Business Machines Corporation©. To test the variance homogeneity between the groups, SPSS uses Levene's test. The data presented in Example 9.4 are available in the file CustomerServices_Store.sav. In order to elaborate the test, we must click on Analyze → Descriptive Statistics → Explore …, as shown in Fig. 9.13. Let's include the variable Customer_services in the list of dependent variables (Dependent List) and the variable Store in the factor list (Factor List), as shown in Fig. 9.14. Next, we must click on Plots … and select the option Untransformed in Spread vs Level with Levene Test, as shown in Fig. 9.15. Finally, let's click on Continue and on OK. The result of Levene's test can also be obtained through the ANOVA test, by clicking on Analyze → Compare Means → One-Way ANOVA …. In Options …, we must select the option Homogeneity of variance test (Fig. 9.16).


FIG. 9.13 Procedure for elaborating Levene’s test on SPSS.

FIG. 9.14 Selecting the variables to elaborate Levene’s test on SPSS.



FIG. 9.15 Continuation of the procedure to elaborate Levene’s test on SPSS.

FIG. 9.16 Results of Levene’s test for Example 9.4 on SPSS.

The value of Levene’s statistic is 8.427, exactly the same as the one calculated previously. Since the significance level observed is 0.001, a value lower than 0.05, the test shows the rejection of the null hypothesis, which allows us to conclude, with a 95% confidence level, that the population variances are not homogeneous.

9.4.6

Solving Levene’s Test by Using the Stata Software

The use of the images in this section has been authorized by StataCorp LP©. Levene’s statistical test for equality of variances is calculated on Stata by using the command robvar (robust-test for equality of variances), which has the following syntax: robvar variable*, by(groups*)

in which the term variable* should be substituted for the quantitative variable studied and the term groups* by the categorical variable that represents them. Let’s open the file CustomerServices_Store.dta that contains the data of Example 9.7. The three groups are represented by the variable store and the number of customers served by the variable services. Therefore, the command to be typed is: robvar services, by(store)

The result of the test can be seen in Fig. 9.17. We can verify that the value of the statistic (8.427) matches the one calculated in Example 9.7 and the one generated on SPSS, as does the probability associated to the statistic (0.001).
FIG. 9.17 Results of Levene's test for Example 9.7 on Stata.
Since P < 0.05, the null hypothesis is rejected, which allows us to conclude, with a 95% confidence level, that the variances are not homogeneous.

9.5 HYPOTHESES TESTS REGARDING A POPULATION MEAN (μ) FROM ONE RANDOM SAMPLE
The main goal is to test whether or not a population mean assumes a certain value.

9.5.1 z Test When the Population Standard Deviation (σ) Is Known and the Distribution Is Normal
This test is applied when a random sample of size n is obtained from a population with a normal distribution, whose mean (μ) is unknown and whose standard deviation (σ) is known. If the distribution of the population is not known, it is necessary to work with large samples (n > 30), because the central limit theorem guarantees that, as the sample size grows, the sampling distribution of the mean gets closer and closer to a normal distribution. For a bilateral test, the hypotheses are:
H0: the sample comes from a population with a certain mean (μ = μ0)
H1: it challenges the null hypothesis (μ ≠ μ0)
The test statistic used here refers to the sample mean (X̄). In order for the sample mean to be compared to the value in the table, it must be standardized, so:

Zcal = (X̄ − μ0) / σX̄ ~ N(0, 1), where σX̄ = σ/√n     (9.20)

The critical values of the zc statistic are shown in Table E in the Appendix. This table provides the critical values of zc considering that P(Zcal > zc) = α (for a right-tailed test). For a bilateral test, we must consider P(Zcal > zc) = α/2, since P(Zcal < −zc) + P(Zcal > zc) = α. The null hypothesis H0 of a bilateral test is rejected if the value of the Zcal statistic lies in the critical region, that is, if Zcal < −zc or Zcal > zc. Otherwise, we do not reject H0. The unilateral probabilities associated to the Zcal statistic (P1) can also be obtained from Table E. For a unilateral test, we consider P = P1. For a bilateral test, this probability must be doubled (P = 2P1). Therefore, for both tests, we reject H0 if P ≤ α.
Example 9.8: Applying the z Test to One Sample
A cereal manufacturer states that the average quantity of food fiber in each portion of its product is, at least, 4.2 g, with a standard deviation of 1 g. A health care agency wishes to verify whether this statement is true, and collects a random sample of 42 portions, in which the average quantity of food fiber is 3.9 g. With a significance level equal to 5%, is there evidence to reject the manufacturer's statement?


Solution
Step 1: The suitable test for a population mean with a known σ, considering a single sample of size n > 30 (normal distribution), is the z test.
Step 2: For this example, the z test hypotheses are:
H0: μ ≥ 4.2 g (information provided by the supplier)
H1: μ < 4.2 g
which corresponds to a left-tailed test.
Step 3: The significance level to be considered is 5%.
Step 4: The calculation of the Zcal statistic, according to Expression (9.20), is:

Zcal = (X̄ − μ0) / (σ/√n) = (3.9 − 4.2) / (1/√42) = −1.94

Step 5: According to Table E in the Appendix, for a left-tailed test with α = 5%, the critical value of the test is zc = −1.645.
Step 6: Decision: since the calculated value lies in the critical region (Zcal < −1.645), the null hypothesis is rejected, which allows us to conclude, with a 95% confidence level, that the manufacturer's average quantity of food fiber is less than 4.2 g.
If, instead of comparing the calculated value to the critical value of the standard normal distribution, we use the calculation of the P-value, Steps 5 and 6 will be:
Step 5: According to Table E in the Appendix, for a left-tailed test, the probability associated to Zcal = −1.94 is 0.0262 (P-value).
Step 6: Decision: since P < 0.05, the null hypothesis is rejected, which allows us to conclude, with a 95% confidence level, that the manufacturer's average quantity of food fiber is less than 4.2 g.

9.5.2

Student’s t-Test When the Population Standard Deviation (s) Is Not Known

Student's t-test for one sample is applied when we do not know the population standard deviation (σ), so its value is estimated from the sample standard deviation (S). However, when we substitute σ for S in Expression (9.20), the distribution of the variable is no longer normal; it becomes a Student's t-distribution with n − 1 degrees of freedom. Analogous to the z test, Student's t-test for one sample assumes the following hypotheses for a bilateral test:
H0: μ = μ0
H1: μ ≠ μ0
And the calculation of the statistic becomes:

Tcal = (X̄ − μ0) / (S/√n) ~ t(n − 1)     (9.21)

The calculated value must be compared to the value in Student's t-distribution table (Table B in the Appendix). This table provides the critical values of tc considering that P(Tcal > tc) = α (for a right-tailed test). For a bilateral test, we have P(Tcal < −tc) = α/2 = P(Tcal > tc), as shown in Fig. 9.18. Therefore, for a bilateral test, the null hypothesis is rejected if Tcal < −tc or Tcal > tc. If −tc ≤ Tcal ≤ tc, we do not reject H0.
FIG. 9.18 Nonrejection region (NR) and critical region (CR) of Student's t-distribution for a bilateral test.


The unilateral probabilities associated to the Tcal statistic (P1) can also be obtained from Table B. For a unilateral test, we have P = P1. For a bilateral test, this probability must be doubled (P = 2P1). Therefore, for both tests, we reject H0 if P ≤ α.
Example 9.9: Applying Student's t-Test to One Sample
The average processing time of a task using a certain machine has been 18 min. New concepts have been implemented in order to reduce the average processing time. Hence, after a certain period of time, a sample with 25 elements was collected, and an average time of 16.808 min was measured, with a standard deviation of 2.733 min. Check whether this result represents an improvement in the average processing time. Consider α = 1%.
Solution
Step 1: The suitable test for a population mean with an unknown σ is Student's t-test.
Step 2: For this example, Student's t-test hypotheses are:
H0: μ = 18
H1: μ < 18
which corresponds to a left-tailed test.
Step 3: The significance level to be considered is 1%.
Step 4: The calculation of the Tcal statistic, according to Expression (9.21), is:

Tcal = (X̄ − μ0) / (S/√n) = (16.808 − 18) / (2.733/√25) = −2.18

Step 5: According to Table B in the Appendix, for a left-tailed test with 24 degrees of freedom and α = 1%, the critical value of the test is tc = −2.492.
Step 6: Decision: since the calculated value does not lie in the critical region (Tcal > −2.492), the null hypothesis is not rejected, which allows us to conclude, with a 99% confidence level, that there was no improvement in the average processing time.
If, instead of comparing the calculated value to the critical value of Student's t-distribution, we use the calculation of the P-value, Steps 5 and 6 will be:
Step 5: According to Table B in the Appendix, for a left-tailed test with 24 degrees of freedom, the probability associated to Tcal = −2.18 is between 0.01 and 0.025 (P-value).
Step 6: Decision: since P > 0.01, we do not reject the null hypothesis.

9.5.3

Solving Student’s t-Test for a Single Sample by Using SPSS Software

The use of the images in this section has been authorized by the International Business Machines Corporation©. If we wish to compare means from a single sample, SPSS makes Student's t-test available. The data in Example 9.9 are available in the file T_test_One_Sample.sav. The procedure to apply the test from Example 9.9 will be described. Initially, let's select Analyze → Compare Means → One-Sample T Test …, as shown in Fig. 9.19. We must select the variable Time and specify the value 18 to be tested in Test Value, as shown in Fig. 9.20. Now, we must click on Options … to define the desired confidence level (Fig. 9.21). Finally, let's click on Continue and on OK. The results of the test are shown in Fig. 9.22. This figure shows the result of the t-test (identical to the value calculated in Example 9.9) and the associated probability (P-value) for a bilateral test. For a unilateral test, the associated probability is 0.0195 (we saw in Example 9.9 that this probability would be between 0.01 and 0.025). Since 0.0195 > 0.01, we do not reject the null hypothesis, which allows us to conclude, with a 99% confidence level, that there was no improvement in the average processing time.

9.5.4

Solving Student’s t-Test for a Single Sample by Using Stata Software

The use of the images in this section has been authorized by StataCorp LP©. Student’s t-test is elaborated on Stata by using the command ttest. For one population mean, the test syntax is: ttest variable* == #


FIG. 9.19 Procedure for elaborating the t-test from one sample on SPSS.

FIG. 9.20 Selecting the variable and specifying the value to be tested.

where the term variable* should be substituted for the name of the variable considered in the analysis and # for the value of the population mean to be tested. The data in Example 9.9 are available in the file T_test_One_Sample.dta. In this case, the variable being analyzed is called time and the goal is to verify if the average processing time is still 18 min, so, the command to be typed is: ttest time == 18

The result of the test can be seen in Fig. 9.23. We can see that the calculated value of the statistic (−2.180) matches the one calculated in Example 9.9 and the one generated on SPSS, as does the associated probability for a left-tailed test (0.0196). Since P > 0.01, we do not reject the null hypothesis, which allows us to conclude, with a 99% confidence level, that there was no improvement in the processing time.


FIG. 9.21 Options—defining the confidence level.

FIG. 9.22 Results of the t-test for one sample for Example 9.9 on SPSS.

FIG. 9.23 Results of the t-test for one sample for Example 9.9 on Stata.

9.6 STUDENT'S T-TEST TO COMPARE TWO POPULATION MEANS FROM TWO INDEPENDENT RANDOM SAMPLES
The t-test for two independent samples is applied to compare the means of two random samples (X1i, i = 1, …, n1; X2j, j = 1, …, n2) obtained from two populations. In this test, the population variances are unknown. For a bilateral test, the null hypothesis of the test states that the population means are the same. If the population means are different, the null hypothesis is rejected, so:
H0: μ1 = μ2
H1: μ1 ≠ μ2
The calculation of the T statistic depends on the comparison of the population variances between the groups.

Case 1: σ1² ≠ σ2²
Considering that the population variances are different, the calculation of the T statistic is given by:


FIG. 9.24 Nonrejection region (NR) and critical region (CR) of Student’s t-distribution for a bilateral test.
Tcal = (X̄1 − X̄2) / √(S1²/n1 + S2²/n2)     (9.22)

with the following degrees of freedom:

ν = (S1²/n1 + S2²/n2)² / [ (S1²/n1)² / (n1 − 1) + (S2²/n2)² / (n2 − 1) ]     (9.23)

Case 2: σ1² = σ2²
When the population variances are homogeneous, to calculate the T statistic, the researcher has to use:

Tcal = (X̄1 − X̄2) / [Sp × √(1/n1 + 1/n2)]     (9.24)

where:

Sp = √{ [(n1 − 1) × S1² + (n2 − 1) × S2²] / (n1 + n2 − 2) }     (9.25)

and Tcal follows Student's t-distribution with ν = n1 + n2 − 2 degrees of freedom.
The calculated value must be compared to the value in Student's t-distribution table (Table B in the Appendix). This table provides the critical values of tc considering that P(Tcal > tc) = α (for a right-tailed test). For a bilateral test, we have P(Tcal < −tc) = α/2 = P(Tcal > tc), as shown in Fig. 9.24. Therefore, for a bilateral test, if the value of the statistic lies in the critical region, that is, if Tcal < −tc or Tcal > tc, the test allows us to reject the null hypothesis. On the other hand, if −tc ≤ Tcal ≤ tc, we do not reject H0. The unilateral probabilities associated to the Tcal statistic (P1) can also be obtained from Table B. For a unilateral test, we have P = P1. For a bilateral test, this probability must be doubled (P = 2P1). Therefore, for both tests, we reject H0 if P ≤ α.
Example 9.10: Applying Student's t-Test to Two Independent Samples
A quality engineer believes that the average time to manufacture a certain plastic product may depend on the raw materials used, which come from two different suppliers. A sample with 30 observations from each supplier is collected for a test, and the results are shown in Tables 9.E.10 and 9.E.11. For a significance level α = 5%, check if there is any difference between the means.
Solution
Step 1: The suitable test to compare two population means with unknown σ is Student's t-test for two independent samples.
Step 2: For this example, Student's t-test hypotheses are:
H0: μ1 = μ2
H1: μ1 ≠ μ2
Step 3: The significance level to be considered is 5%.


TABLE 9.E.10 Manufacturing Time Using Raw Materials From Supplier 1

22.8  23.4  26.2  24.3  22.0  24.8  26.7  25.1  23.1  22.8
25.6  25.1  24.3  24.2  22.8  23.2  24.7  26.5  24.5  23.6
23.9  22.8  25.4  26.7  22.9  23.5  23.8  24.6  26.3  22.7

TABLE 9.E.11 Manufacturing Time Using Raw Materials From Supplier 2

26.8  29.3  28.4  25.6  29.4  27.2  27.6  26.8  25.4  28.6
29.7  27.2  27.9  28.4  26.0  26.8  27.5  28.5  27.3  29.1
29.2  25.7  28.4  28.6  27.9  27.4  26.7  26.8  25.6  26.1

Step 4: For the data in Tables 9.E.10 and 9.E.11, we calculate X̄1 = 24.277, X̄2 = 27.530, S1² = 1.810, and S2² = 1.559. Considering that the population variances are homogeneous, according to the solution generated on SPSS, let's use Expressions (9.24) and (9.25) to calculate the Tcal statistic, as follows:

Sp = √[(29 × 1.810 + 29 × 1.559) / (30 + 30 − 2)] = 1.298

Tcal = (24.277 − 27.530) / [1.298 × √(1/30 + 1/30)] = −9.708

with ν = 30 + 30 − 2 = 58 degrees of freedom.
Step 5: The critical region of the bilateral test, considering ν = 58 degrees of freedom and α = 5%, can be defined from Student's t-distribution table (Table B in the Appendix), as shown in Fig. 9.25. For a bilateral test, each one of the tails corresponds to half of the significance level α.
FIG. 9.25 Critical region of Example 9.10.
Step 6: Decision: since the calculated value lies in the critical region, that is, Tcal < −2.002, we must reject the null hypothesis, which allows us to conclude, with a 95% confidence level, that the population means are different.
If, instead of comparing the calculated value to the critical value of Student's t-distribution, we use the calculation of the P-value, Steps 5 and 6 will be:
Step 5: According to Table B in the Appendix, for a right-tailed test with ν = 58 degrees of freedom, the probability P1 associated to |Tcal| = 9.708 is less than 0.0005. For a bilateral test, this probability must be doubled (P = 2P1).
Step 6: Decision: since P < 0.05, the null hypothesis is rejected.
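The pooled-variance computation of Expressions (9.24) and (9.25) can be reproduced from the summary statistics alone with scipy.stats.ttest_ind_from_stats. The sketch below assumes the rounded means and variances stated in Step 4 (the function takes standard deviations, hence the square roots), so the statistic may differ from the hand value in the third decimal:

```python
# Pooled two-sample t-test for Example 9.10 from the summary statistics.
import math
from scipy.stats import ttest_ind_from_stats

t_stat, p_value = ttest_ind_from_stats(
    mean1=24.277, std1=math.sqrt(1.810), nobs1=30,
    mean2=27.530, std2=math.sqrt(1.559), nobs2=30,
    equal_var=True,  # Case 2: homogeneous variances, Expressions (9.24)-(9.25)
)
print(t_stat, p_value)  # t about -9.71; the bilateral P-value is far below 0.05
```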

9.6.1

Solving Student’s t-Test From Two Independent Samples by Using SPSS Software

The data in Example 9.10 are available in the file T_test_Two_Independent_Samples.sav. The procedure for solving Student's t-test to compare two population means from two independent random samples on SPSS is described. The use of the images in this section has been authorized by the International Business Machines Corporation©. We must click on Analyze → Compare Means → Independent-Samples T Test …, as shown in Fig. 9.26.


Let's include the variable Time in Test Variable(s) and the variable Supplier in Grouping Variable. Next, let's click on Define Groups … to define the groups (categories) of the variable Supplier, as shown in Fig. 9.27. If the confidence level desired by the researcher is different from 95%, the button Options … must be selected to change it. Finally, let's click on OK. The results of the test are shown in Fig. 9.28. The value of the t statistic for the test is −9.708 and the associated bilateral probability is 0.000 (P < 0.05), which leads us to reject the null hypothesis and allows us to conclude, with a 95% confidence level, that the population means are different. We can notice that Fig. 9.28 also shows the result of Levene's test. Since the significance level observed is 0.694, a value greater than 0.05, we can also conclude, with a 95% confidence level, that the variances are homogeneous.

FIG. 9.26 Procedure for elaborating the t-test from two independent samples on SPSS.

FIG. 9.27 Selecting the variables and defining the groups.


FIG. 9.28 Results of the t-test for two independent samples for Example 9.10 on SPSS.

FIG. 9.29 Results of the t-test for two independent samples for Example 9.10 on Stata.

9.6.2

Solving Student’s t-Test From Two Independent Samples by Using Stata Software

The use of the images in this section has been authorized by StataCorp LP©. The t-test to compare the means of two independent groups on Stata is elaborated by using the following syntax: ttest variable*, by(groups*)

where the term variable* must be substituted for the quantitative variable being analyzed, and the term groups* for the categorical variable that represents them. The data in Example 9.10 are available in the file T_test_Two_Independent_Samples.dta. The variable supplier shows the groups of suppliers. The values for each group of suppliers are specified in the variable time. Thus, we must type the following command: ttest time, by(supplier)

The result of the test can be seen in Fig. 9.29. We can see that the calculated value of the statistic (9.708) is similar to the one calculated in Example 9.10 and also generated on SPSS, as well as the associated probability for a bilateral test (0.000). Since P < 0.05, the null hypothesis is rejected, which allows us to conclude, with a 95% confidence level, that the population means are different.
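The pooled-variance t statistic that SPSS and Stata report for two independent samples can also be reproduced in a few lines of pure Python. The sketch below is a minimal illustration with two small invented samples (it does not use the data of Example 9.10, which are only available in the companion files):

```python
import math

def pooled_t(x, y):
    """Student's t statistic for two independent samples, assuming equal variances."""
    n1, n2 = len(x), len(y)
    m1, m2 = sum(x) / n1, sum(y) / n2
    # Sample variances (denominator n - 1)
    s1 = sum((v - m1) ** 2 for v in x) / (n1 - 1)
    s2 = sum((v - m2) ** 2 for v in y) / (n2 - 1)
    # Pooled variance: both groups combined, weighted by their degrees of freedom
    sp2 = ((n1 - 1) * s1 + (n2 - 1) * s2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(sp2 * (1 / n1 + 1 / n2))

# Hypothetical samples (NOT the Example 9.10 data)
group1 = [10, 12, 14]
group2 = [11, 13, 15]
t = pooled_t(group1, group2)
print(round(t, 4))  # -0.6124
```

The statistic is antisymmetric in the two groups, so reversing their order only flips its sign, mirroring what happens when the group coding in `by(supplier)` is inverted.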

9.7 STUDENT’S T-TEST TO COMPARE TWO POPULATION MEANS FROM TWO PAIRED RANDOM SAMPLES

This test is applied to check whether the means of two paired or related samples, obtained from the same population (before and after) with a normal distribution, are significantly different or not. Besides the normality of the data of each sample, the test requires the homogeneity of the variances between the groups. Different from the t-test for two independent samples, first, we must calculate the difference between each pair of values in position i (di = Xbefore,i − Xafter,i, i = 1, …, n) and, after that, test the null hypothesis that the mean of the differences in the population is zero.

PART IV Statistical Inference

For a bilateral test, we have:

H0: μd = 0, where μd = μbefore − μafter
H1: μd ≠ 0

The Tcal statistic for the test is given by:

$$T_{cal} = \frac{\bar{d} - \mu_d}{S_d / \sqrt{n}} \sim t_{\nu}, \quad \nu = n - 1 \qquad (9.26)$$

where:

$$\bar{d} = \frac{\sum_{i=1}^{n} d_i}{n} \qquad (9.27)$$

and

$$S_d = \sqrt{\frac{\sum_{i=1}^{n} \left(d_i - \bar{d}\right)^2}{n - 1}} \qquad (9.28)$$

The value calculated must be compared to the value in Student’s t-distribution table (Table B in the Appendix). This table provides the critical values of tc considering that P(Tcal > tc) = α (for a right-tailed test). For a bilateral test, we have P(Tcal < −tc) = α/2 = P(Tcal > tc), as shown in Fig. 9.30.

FIG. 9.30 Nonrejection region (NR) and critical region (CR) of Student’s t-distribution for a bilateral test.

Therefore, for a bilateral test, the null hypothesis is rejected if Tcal < −tc or Tcal > tc. If −tc ≤ Tcal ≤ tc, we do not reject H0. The unilateral probabilities associated with the Tcal statistic (P1) can also be obtained from Table B. For a unilateral test, we have P = P1. For a bilateral test, this probability must be doubled (P = 2P1). Therefore, for both tests, we reject H0 if P ≤ α.

Example 9.11: Applying Student’s t-Test to Two Paired Samples
A group of 10 machine operators, responsible for carrying out a certain task, is trained to perform the same task more efficiently. To verify if there is a reduction in the time taken to perform the task, we measured the time spent by each operator, before and after the training course. Test the hypothesis that the population means of both paired samples are similar, that is, that there is no reduction in time taken to perform the task after the training course. Consider α = 5%.

TABLE 9.E.12 Time Spent Per Operator Before the Training Course
3.2  3.6  3.4  3.8  3.4  3.5  3.7  3.2  3.5  3.9

TABLE 9.E.13 Time Spent Per Operator After the Training Course
3.0  3.3  3.5  3.6  3.4  3.3  3.4  3.0  3.2  3.6

Solution
Step 1: In this case, the most suitable test is Student’s t-test for two paired samples. Since the test requires the normality of the data in each sample and the homogeneity of the variances between the groups, the K-S or S-W tests, besides Levene’s test, must be applied for this verification. As we will see in the solution of this example on SPSS, all of these assumptions will be validated.


Step 2: For this example, Student’s t-test hypotheses are:
H0: μd = 0
H1: μd ≠ 0
Step 3: The significance level to be considered is 5%.
Step 4: In order to calculate the Tcal statistic, first, we must calculate di:

TABLE 9.E.14 Calculating di

Xbefore,i:  3.2   3.6   3.4   3.8   3.4   3.5   3.7   3.2   3.5   3.9
Xafter,i:   3.0   3.3   3.5   3.6   3.4   3.3   3.4   3.0   3.2   3.6
di:         0.2   0.3  −0.1   0.2   0.0   0.2   0.3   0.2   0.3   0.3

$$\bar{d} = \frac{\sum_{i=1}^{n} d_i}{n} = \frac{0.2 + 0.3 + \cdots + 0.3}{10} = 0.19$$

$$S_d = \sqrt{\frac{(0.2 - 0.19)^2 + (0.3 - 0.19)^2 + \cdots + (0.3 - 0.19)^2}{9}} = 0.137$$

$$T_{cal} = \frac{\bar{d}}{S_d / \sqrt{n}} = \frac{0.19}{0.137 / \sqrt{10}} = 4.385$$

Step 5: The critical region of the bilateral test can be defined from Student’s t-distribution table (Table B in the Appendix), considering ν = 9 degrees of freedom and α = 5%, as shown in Fig. 9.31. For a bilateral test, each tail corresponds to half of the significance level α.

FIG. 9.31 Critical region of Example 9.11.

Step 6: Decision: since the value calculated lies in the critical region (Tcal > 2.262), the null hypothesis is rejected, which allows us to conclude, with a 95% confidence level, that there is a significant difference between the times spent by the operators before and after the training course.
If, instead of comparing the value calculated to the critical value of Student’s t-distribution, we use the P-value, Steps 5 and 6 become:
Step 5: According to Table B in the Appendix, for a right-tailed test with ν = 9 degrees of freedom, the probability P1 associated with Tcal = 4.385 is between 0.0005 and 0.001. For a bilateral test, this probability must be doubled (P = 2P1), so 0.001 < P < 0.002.
Step 6: Decision: since P < 0.05, the null hypothesis is rejected.
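The hand computation above can be checked in pure Python (standard library only). The data are the before/after times from Tables 9.E.12 and 9.E.13; the bilateral P-value is approximated by numerically integrating the Student’s t density with ν = 9, so it should fall in the 0.001–0.002 range read from Table B:

```python
import math

before = [3.2, 3.6, 3.4, 3.8, 3.4, 3.5, 3.7, 3.2, 3.5, 3.9]
after  = [3.0, 3.3, 3.5, 3.6, 3.4, 3.3, 3.4, 3.0, 3.2, 3.6]

d = [b - a for b, a in zip(before, after)]
n = len(d)
d_bar = sum(d) / n
s_d = math.sqrt(sum((di - d_bar) ** 2 for di in d) / (n - 1))
t_cal = d_bar / (s_d / math.sqrt(n))  # 4.385, as in Step 4

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_sf(x, df, upper=100.0, steps=20000):
    """P(T > x), via Simpson's rule; the tail beyond `upper` is negligible here."""
    h = (upper - x) / steps
    s = t_pdf(x, df) + t_pdf(upper, df)
    for i in range(1, steps):
        s += t_pdf(x + i * h, df) * (4 if i % 2 else 2)
    return s * h / 3

p_bilateral = 2 * t_sf(t_cal, n - 1)
print(round(t_cal, 3), round(p_bilateral, 4))  # 4.385 0.0018
```

The 0.0018 matches the significance level that SPSS reports as 0.002 in Fig. 9.36 and Stata as 0.0018 in Fig. 9.37.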

9.7.1 Solving Student’s t-Test From Two Paired Samples by Using SPSS Software

First, we must test the normality of the data in each sample, as well as the variance homogeneity between the groups. Using the same procedures described in Sections 9.3.3 and 9.4.5 (the data must be placed in a table the same way as in Section 9.4.5), we obtain Figs. 9.32 and 9.33. Based on Fig. 9.32, we conclude that there is normality of the data for each sample. From Fig. 9.33, we can conclude that the variances between the samples are homogeneous.


The use of the images in this section has been authorized by the International Business Machines Corporation©. To solve Student’s t-test for two paired samples on SPSS, we must open the file T_test_Two_Paired_Samples.sav. Then, we have to click on Analyze → Compare Means → Paired-Samples T Test …, as shown in Fig. 9.34. We must select the variable Before and move it to Variable1, and the variable After to Variable2, as shown in Fig. 9.35. If the desired confidence level is different from 95%, we must click on Options … to change it. Finally, let’s click on OK. The results of the test are shown in Fig. 9.36. The value of the t statistic is 4.385 and the significance level observed for a bilateral test is 0.002, a value less than 0.05, which leads us to reject the null hypothesis and allows us to conclude, with a 95% confidence level, that there is a significant difference between the times spent by the operators before and after the training course.

FIG. 9.32 Results of the normality tests on SPSS.

FIG. 9.33 Results of Levene’s test on SPSS.

FIG. 9.34 Procedure for elaborating the t-test from two paired samples on SPSS.


FIG. 9.35 Selecting the variables that will be paired.

FIG. 9.36 Results of the t-test for two paired samples.

FIG. 9.37 Results of Student’s t-test for two paired samples for Example 9.11 on Stata.

9.7.2 Solving Student’s t-Test From Two Paired Samples by Using Stata Software

The t-test to compare the means of two paired groups will be solved on Stata for the data in Example 9.11. The use of the images in this section has been authorized by StataCorp LP©. Therefore, let’s open the file T_test_Two_Paired_Samples.dta. The paired variables are called before and after. In this case, we must type the following command:

ttest before == after

The result of the test can be seen in Fig. 9.37. We can see that the calculated value of the statistic (4.385) is similar to the one calculated in Example 9.11 and on SPSS, as well as the probability associated to the statistic for a bilateral test (0.0018). Since P < 0.05, we reject the null hypothesis that the times spent by the operators before and after the training course are the same, with a 95% confidence level.

9.8 ANOVA TO COMPARE THE MEANS OF MORE THAN TWO POPULATIONS

ANOVA is a test used to compare the means of three or more populations, through the analysis of sample variances. The test is based on a sample obtained from each population, with the aim of determining whether the differences between the sample means suggest significant differences between the population means, or whether such differences are only a result of sampling variability. ANOVA’s assumptions are: (i) the samples must be independent from each other; (ii) the data in the populations must have a normal distribution; (iii) the population variances must be homogeneous.
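Assumption (iii) is usually checked with Levene’s test, whose statistic is a one-way ANOVA F computed on the absolute deviations Zij = |Yij − Ȳi|. A minimal pure-Python sketch follows, using small invented groups whose spreads are identical, so the statistic comes out exactly zero (perfect homogeneity):

```python
def levene_statistic(groups):
    """Levene's W: one-way ANOVA F computed on absolute deviations from group means."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Z_ij = |Y_ij - group mean|
    z = [[abs(v - sum(g) / len(g)) for v in g] for g in groups]
    z_means = [sum(zi) / len(zi) for zi in z]
    z_grand = sum(v for zi in z for v in zi) / n_total
    between = sum(len(zi) * (zm - z_grand) ** 2 for zi, zm in zip(z, z_means))
    within = sum((v - zm) ** 2 for zi, zm in zip(z, z_means) for v in zi)
    return ((n_total - k) / (k - 1)) * between / within

# Hypothetical groups with identical spread -> W = 0
w = levene_statistic([[1, 3, 5], [2, 4, 6], [0, 2, 4]])
print(w)  # 0.0
```

With groups of very different spreads the statistic grows large, which is what drives rejection of the homogeneity hypothesis in the SPSS outputs shown later in this chapter.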

9.8.1 One-Way ANOVA

One-way ANOVA is an extension of Student’s t-test for two population means, allowing the researcher to compare three or more population means. The null hypothesis of the test states that the population means are the same. If there is at least one group with a mean that is different from the others, the null hypothesis is rejected. As stated in Fávero et al. (2009), the one-way ANOVA allows the researcher to verify the effect of a qualitative explanatory variable (factor) on a quantitative dependent variable. Each group includes the observations of the dependent variable in one category of the factor. Assuming that independent samples are obtained from k populations (k ≥ 3) and that the means of these populations can be represented by μ1, μ2, …, μk, the analysis of variance tests the following hypotheses:

$$H_0: \mu_1 = \mu_2 = \cdots = \mu_k$$
$$H_1: \exists (i,j)\; \mu_i \neq \mu_j,\; i \neq j \qquad (9.29)$$

According to Maroco (2014), in general, the observations for this type of problem can be represented according to Table 9.2, where Yij represents observation i of sample or group j (i = 1, …, nj; j = 1, …, k) and nj is the dimension of sample or group j. The dimension of the global sample is N = Σ_{i=1}^{k} n_i. Pestana and Gageiro (2008) present the following model:

$$Y_{ij} = \mu_i + \varepsilon_{ij} \qquad (9.30)$$

$$Y_{ij} = \mu + (\mu_i - \mu) + \varepsilon_{ij} \qquad (9.31)$$

$$Y_{ij} = \mu + \alpha_i + \varepsilon_{ij} \qquad (9.32)$$

where: μ is the global mean of the population; μi is the mean of sample or group i; αi is the effect of sample or group i; εij is the random error.

TABLE 9.2 Observations of the One-Way ANOVA

Samples or Groups:   1      2      …    k
                     Y11    Y12    …    Y1k
                     Y21    Y22    …    Y2k
                     ⋮      ⋮           ⋮
                     Yn11   Yn22   …    Ynkk


Therefore, ANOVA assumes that each group comes from a population with a normal distribution, mean μi, and a homogeneous variance, that is, Yij ~ N(μi, σ), resulting in the hypothesis that the errors (residuals) have a normal distribution with a mean equal to zero and a constant variance, that is, εij ~ N(0, σ), besides being independent (Fávero et al., 2009). The technique’s hypotheses are tested from the calculation of the group variances, and that is where the name ANOVA comes from. The technique involves the calculation of the variations between the groups (Ȳi − Ȳ) and within each group (Yij − Ȳi). The residual sum of squares within groups (RSS) is calculated by:

$$RSS = \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left(Y_{ij} - \bar{Y}_i\right)^2 \qquad (9.33)$$

The sum of squares between groups, or the sum of squares of the factor (SSF), is given by:

$$SSF = \sum_{i=1}^{k} n_i \left(\bar{Y}_i - \bar{Y}\right)^2 \qquad (9.34)$$

Therefore, the total sum of squares is:

$$TSS = RSS + SSF = \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left(Y_{ij} - \bar{Y}\right)^2 \qquad (9.35)$$

According to Fávero et al. (2009) and Maroco (2014), the ANOVA statistic is given by the division between the variance of the factor (SSF divided by k − 1 degrees of freedom) and the variance of the residuals (RSS divided by N − k degrees of freedom):

$$F_{cal} = \frac{MSF}{MSR} = \frac{SSF/(k-1)}{RSS/(N-k)} \qquad (9.36)$$

where: MSF represents the mean square between groups (estimate of the variance of the factor); MSR represents the mean square within groups (estimate of the variance of the residuals). Table 9.3 summarizes the calculations of the one-way ANOVA.

The value of F can be null or positive, but never negative; the test therefore uses the F-distribution, which is asymmetrical to the right. The calculated value (Fcal) must be compared to the value in the F-distribution table (Table A in the Appendix). This table provides the critical values of Fc = F(k−1, N−k, α) where P(Fcal > Fc) = α (right-tailed test). Therefore, the one-way ANOVA’s null hypothesis is rejected if Fcal > Fc. Otherwise, if Fcal ≤ Fc, we do not reject H0. We will use these concepts when we study the estimation of regression models in Chapter 13.

TABLE 9.3 Calculating the One-Way ANOVA

Source of Variation  | Sum of Squares                                  | Degrees of Freedom | Mean Squares        | F
Between the groups   | SSF = Σ_{i=1}^{k} n_i (Ȳi − Ȳ)²                 | k − 1              | MSF = SSF/(k − 1)   | F = MSF/MSR
Within the groups    | RSS = Σ_{i=1}^{k} Σ_{j=1}^{n_i} (Yij − Ȳi)²     | N − k              | MSR = RSS/(N − k)   |
Total                | TSS = Σ_{i=1}^{k} Σ_{j=1}^{n_i} (Yij − Ȳ)²      | N − 1              |                     |

Source: Fávero, L.P., Belfiore, P., Silva, F.L., Chan, B.L., 2009. Análise de dados: modelagem multivariada para tomada de decisões. Campus Elsevier, Rio de Janeiro; Maroco, J., 2014. Análise estatística com o SPSS Statistics, sixth ed. Edições Sílabo, Lisboa.


TABLE 9.E.15 Percentage of Sucrose for the Three Suppliers

Supplier 1 (n1 = 12): 0.33, 0.79, 1.24, 1.75, 0.94, 2.42, 1.97, 0.87, 0.33, 0.79, 1.24, 3.12
Supplier 2 (n2 = 10): 1.54, 1.11, 0.97, 2.57, 2.94, 3.44, 3.02, 3.55, 2.04, 1.67
Supplier 3 (n3 = 10): 1.47, 1.69, 1.55, 2.04, 2.67, 3.07, 3.33, 4.01, 1.52, 2.03

Ȳ1 = 1.316 (S1 = 0.850);  Ȳ2 = 2.285 (S2 = 0.948);  Ȳ3 = 2.338 (S3 = 0.886)

Example 9.12: Applying the One-Way ANOVA Test
A sample with 32 products is collected to analyze the quality of the honey supplied by three different suppliers. One of the ways to test the quality of the honey is finding out how much sucrose it contains, which usually varies between 0.25% and 6.5%. Table 9.E.15 shows the percentage of sucrose in the sample collected from each supplier. Check if there are differences in this quality indicator among the three suppliers, considering a 5% significance level.
Solution
Step 1: In this case, the most suitable test is the one-way ANOVA. First, we must verify the assumptions of normality for each group and of variance homogeneity between the groups through the Kolmogorov-Smirnov, Shapiro-Wilk, and Levene tests. Figs. 9.38 and 9.39 show the results obtained by using SPSS software.

FIG. 9.38 Results of the tests for normality on SPSS.

FIG. 9.39 Results of Levene’s test on SPSS.


Since the significance level observed in the tests for normality for each group and in the variance homogeneity test between the groups is greater than 5%, we can conclude that each one of the groups shows data with a normal distribution and that the variances between the groups are homogeneous, with a 95% confidence level. Since the assumptions of the one-way ANOVA were met, the technique can be applied.
Step 2: For this example, ANOVA’s null hypothesis states that there are no differences in the amount of sucrose coming from the three suppliers. If there is at least one supplier with a population mean that is different from the others, the null hypothesis will be rejected. Thus, we have:
H0: μ1 = μ2 = μ3
H1: ∃(i, j) μi ≠ μj, i ≠ j
Step 3: The significance level to be considered is 5%.
Step 4: The calculation of the Fcal statistic is specified here. For this example, we know that there are k = 3 groups and the global sample size is N = 32. The global sample mean is Ȳ = 1.938. The sum of squares between groups (SSF) is:

SSF = 12 × (1.316 − 1.938)² + 10 × (2.285 − 1.938)² + 10 × (2.338 − 1.938)² = 7.449

Therefore, the mean square between groups (MSF) is:

MSF = SSF/(k − 1) = 7.449/2 = 3.725

The calculation of the sum of squares within groups (RSS) is shown in Table 9.E.16.

TABLE 9.E.16 Calculation of the Sum of Squares Within Groups (RSS)

Supplier  Sucrose   Yij − Ȳi   (Yij − Ȳi)²
1         0.33      −0.986     0.972
1         0.79      −0.526     0.277
1         1.24      −0.076     0.006
1         1.75       0.434     0.189
1         0.94      −0.376     0.141
1         2.42       1.104     1.219
1         1.97       0.654     0.428
1         0.87      −0.446     0.199
1         0.33      −0.986     0.972
1         0.79      −0.526     0.277
1         1.24      −0.076     0.006
1         3.12       1.804     3.255
2         1.54      −0.745     0.555
2         1.11      −1.175     1.381
2         0.97      −1.315     1.729
2         2.57       0.285     0.081
2         2.94       0.655     0.429
2         3.44       1.155     1.334
2         3.02       0.735     0.540
2         3.55       1.265     1.600
2         2.04      −0.245     0.060
2         1.67      −0.615     0.378
3         1.47      −0.868     0.753
3         1.69      −0.648     0.420
3         1.55      −0.788     0.621
3         2.04      −0.298     0.089
3         2.67       0.332     0.110
3         3.07       0.732     0.536
3         3.33       0.992     0.984
3         4.01       1.672     2.796
3         1.52      −0.818     0.669
3         2.03      −0.308     0.095
RSS                             23.100

Therefore, the mean square within groups is:

MSR = RSS/(N − k) = 23.100/29 = 0.797

Thus, the value of the Fcal statistic is:

Fcal = MSF/MSR = 3.725/0.797 = 4.676

Step 5: According to Table A in the Appendix, the critical value of the statistic is Fc = F(2, 29, 5%) = 3.33.
Step 6: Decision: since the value calculated lies in the critical region (Fcal > Fc), we reject the null hypothesis, which allows us to conclude, with a 95% confidence level, that there is at least one supplier with a population mean that is different from the others.
If, instead of comparing the value calculated to the critical value of Snedecor’s F-distribution, we use the P-value, Steps 5 and 6 become:
Step 5: According to Table A in the Appendix, for ν1 = 2 degrees of freedom in the numerator and ν2 = 29 degrees of freedom in the denominator, the probability associated with Fcal = 4.676 is between 0.01 and 0.025 (P-value).
Step 6: Decision: since P < 0.05, the null hypothesis is rejected.
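The arithmetic of Steps 4–6 can be verified in pure Python with the sucrose data from Table 9.E.15. The P-value is approximated by numerically integrating the F density with (2, 29) degrees of freedom, so it should land inside the 0.01–0.025 interval read from Table A:

```python
import math

groups = [
    [0.33, 0.79, 1.24, 1.75, 0.94, 2.42, 1.97, 0.87, 0.33, 0.79, 1.24, 3.12],  # supplier 1
    [1.54, 1.11, 0.97, 2.57, 2.94, 3.44, 3.02, 3.55, 2.04, 1.67],              # supplier 2
    [1.47, 1.69, 1.55, 2.04, 2.67, 3.07, 3.33, 4.01, 1.52, 2.03],              # supplier 3
]

k = len(groups)
N = sum(len(g) for g in groups)
grand = sum(v for g in groups for v in g) / N
means = [sum(g) / len(g) for g in groups]

ssf = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))  # between groups
rss = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)    # within groups
f_cal = (ssf / (k - 1)) / (rss / (N - k))

def f_pdf(x, d1, d2):
    """Density of the F-distribution with (d1, d2) degrees of freedom."""
    c = math.gamma((d1 + d2) / 2) / (math.gamma(d1 / 2) * math.gamma(d2 / 2))
    return c * (d1 / d2) ** (d1 / 2) * x ** (d1 / 2 - 1) * (1 + d1 * x / d2) ** (-(d1 + d2) / 2)

def f_sf(x, d1, d2, upper=1000.0, steps=40000):
    """P(F > x) via Simpson's rule; the tail beyond `upper` is negligible here."""
    h = (upper - x) / steps
    s = f_pdf(x, d1, d2) + f_pdf(upper, d1, d2)
    for i in range(1, steps):
        s += f_pdf(x + i * h, d1, d2) * (4 if i % 2 else 2)
    return s * h / 3

p = f_sf(f_cal, k - 1, N - k)
print(round(f_cal, 3), round(p, 3))  # 4.676 0.017
```

The 0.017 agrees with the P-value reported in the SPSS ANOVA table of Fig. 9.44.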

9.8.1.1 Solving the One-Way ANOVA Test by Using SPSS Software
The use of the images in this section has been authorized by the International Business Machines Corporation©. The data in Example 9.12 are available in the file One_Way_ANOVA.sav. First of all, let’s click on Analyze → Compare Means → One-Way ANOVA …, as shown in Fig. 9.40. Let’s include the variable Sucrose in the list of dependent variables (Dependent List) and the variable Supplier in the box Factor, according to Fig. 9.41. After that, we must click on Options … and select the option Homogeneity of variance test (Levene’s test for variance homogeneity). Finally, let’s click on Continue and on OK to obtain the result of Levene’s test, besides the ANOVA table. Since the One-Way ANOVA dialog does not make the normality tests available, they must be obtained by applying the same procedure described in Section 9.3.3. According to Fig. 9.42, we can verify that each one of the groups has data that follow a normal distribution. Moreover, through Fig. 9.43, we can conclude that the variances between the groups are homogeneous.


FIG. 9.40 Procedure for the one-way ANOVA.

FIG. 9.41 Selecting the variables.

From the ANOVA table (Fig. 9.44), we can see that the value of the F-test is 4.676 and the respective P-value is 0.017 (we saw in Example 9.12 that this value would be between 0.01 and 0.025), a value less than 0.05. This leads us to reject the null hypothesis and allows us to conclude, with a 95% confidence level, that at least one of the population means is different from the others (there are differences in the percentage of sucrose in the honey of the three suppliers).

9.8.1.2 Solving the One-Way ANOVA Test by Using Stata Software
The use of the images in this section has been authorized by StataCorp LP©. The one-way ANOVA on Stata is generated from the following syntax:

anova variabley* factor*


FIG. 9.42 Results of the tests for normality for Example 9.12 on SPSS.

FIG. 9.43 Results of Levene’s test for Example 9.12 on SPSS.

FIG. 9.44 Results of the one-way ANOVA for Example 9.12 on SPSS.

FIG. 9.45 Results of the one-way ANOVA on Stata.

in which the term variabley* should be substituted for the quantitative dependent variable and the term factor* for the qualitative explanatory variable. The data in Example 9.12 are available in the file One_Way_Anova.dta. The quantitative dependent variable is called sucrose and the factor is represented by the variable supplier. Thus, we must type the following command:

anova sucrose supplier

The result of the test can be seen in Fig. 9.45. We can see that the calculated value of the statistic (4.68) is similar to the one calculated in Example 9.12 and also generated on SPSS, as well as the probability associated to the value of the statistic (0.017). Since P < 0.05, the null hypothesis is rejected, which allows us to conclude, with a 95% confidence level, that at least one of the population means is different from the others.

9.8.2 Factorial ANOVA

Factorial ANOVA is an extension of the one-way ANOVA, assuming the same assumptions, but considering two or more factors. Factorial ANOVA presumes that the quantitative dependent variable is influenced by more than one qualitative explanatory variable (factor). It also tests the possible interactions between the factors, through the resulting effect of the combination of factor A’s level i and factor B’s level j, as discussed by Pestana and Gageiro (2008), Fávero et al. (2009), and Maroco (2014). For Pestana and Gageiro (2008) and Fávero et al. (2009), the main objective of the factorial ANOVA is to determine whether the means for each factor level are the same (the isolated effect of each factor on the dependent variable), and to verify the interaction between the factors (the joint effect of the factors on the dependent variable). For educational purposes, the factorial ANOVA will be described for the two-way model.

9.8.2.1 Two-Way ANOVA
According to Fávero et al. (2009) and Maroco (2014), the observations of the two-way ANOVA can be represented, in general, as shown in Table 9.4. Each cell contains the values of the dependent variable for the corresponding levels of the factors A and B, where Yijk represents observation k (k = 1, …, n) of factor A’s level i (i = 1, …, a) and of factor B’s level j (j = 1, …, b). First, in order to check the isolated effects of factors A and B, we must test the following hypotheses (Fávero et al., 2009; Maroco, 2014):

$$H_0^A: \mu_1 = \mu_2 = \cdots = \mu_a$$
$$H_1^A: \exists (i,j)\; \mu_i \neq \mu_j,\; i \neq j\; (i, j = 1, \ldots, a) \qquad (9.37)$$

and

$$H_0^B: \mu_1 = \mu_2 = \cdots = \mu_b$$
$$H_1^B: \exists (i,j)\; \mu_i \neq \mu_j,\; i \neq j\; (i, j = 1, \ldots, b) \qquad (9.38)$$

TABLE 9.4 Observations of the Two-Way ANOVA

Factor A \ Factor B:   1                2                …    b
1                      Y111, …, Y11n    Y121, …, Y12n    …    Y1b1, …, Y1bn
2                      Y211, …, Y21n    Y221, …, Y22n    …    Y2b1, …, Y2bn
⋮                      ⋮                ⋮                     ⋮
a                      Ya11, …, Ya1n    Ya21, …, Ya2n    …    Yab1, …, Yabn

Source: Fávero, L.P., Belfiore, P., Silva, F.L., Chan, B.L., 2009. Análise de dados: modelagem multivariada para tomada de decisões. Campus Elsevier, Rio de Janeiro; Maroco, J., 2014. Análise estatística com o SPSS Statistics, sixth ed. Edições Sílabo, Lisboa.


Now, in order to verify the joint effect of the factors on the dependent variable, we must test the following hypotheses (Fávero et al., 2009; Maroco, 2014):

H0: γij = 0, for i ≠ j (there is no interaction between the factors A and B)
H1: γij ≠ 0, for i ≠ j (there is interaction between the factors A and B)   (9.39)

The model presented by Pestana and Gageiro (2008) can be described as:

$$Y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \varepsilon_{ijk} \qquad (9.40)$$

where: μ is the population’s global mean; αi is the effect of factor A’s level i, given by μi − μ; βj is the effect of factor B’s level j, given by μj − μ; γij is the interaction between the factors; εijk is the random error, which follows a normal distribution with a mean equal to zero and a constant variance. To standardize the effects of the levels chosen for both factors, we must assume that:

$$\sum_{i=1}^{a} \alpha_i = \sum_{j=1}^{b} \beta_j = \sum_{i=1}^{a} \gamma_{ij} = \sum_{j=1}^{b} \gamma_{ij} = 0 \qquad (9.41)$$

Let’s consider Ȳ, Ȳij, Ȳi, and Ȳj the general mean of the global sample, the mean per cell, the mean of factor A’s level i, and the mean of factor B’s level j, respectively. We can describe the residual sum of squares (RSS) as:

$$RSS = \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n} \left(Y_{ijk} - \bar{Y}_{ij}\right)^2 \qquad (9.42)$$

On the other hand, the sum of squares of factor A (SSFA), the sum of squares of factor B (SSFB), and the sum of squares of the interaction (SSFAB) are represented below in Expressions (9.43)–(9.45), respectively:

$$SSF_A = b \cdot n \cdot \sum_{i=1}^{a} \left(\bar{Y}_i - \bar{Y}\right)^2 \qquad (9.43)$$

$$SSF_B = a \cdot n \cdot \sum_{j=1}^{b} \left(\bar{Y}_j - \bar{Y}\right)^2 \qquad (9.44)$$

$$SSF_{AB} = n \cdot \sum_{i=1}^{a} \sum_{j=1}^{b} \left(\bar{Y}_{ij} - \bar{Y}_i - \bar{Y}_j + \bar{Y}\right)^2 \qquad (9.45)$$

Therefore, the total sum of squares can be written as follows:

$$TSS = RSS + SSF_A + SSF_B + SSF_{AB} = \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n} \left(Y_{ijk} - \bar{Y}\right)^2 \qquad (9.46)$$

Thus, the ANOVA statistic for factor A is given by:

$$F_A = \frac{MSF_A}{MSR} = \frac{SSF_A/(a-1)}{RSS/\left[(n-1) \cdot a \cdot b\right]} \qquad (9.47)$$

where: MSFA is the mean square of factor A; MSR is the mean square of the errors.


TABLE 9.5 Calculations of the Two-Way ANOVA

Source of Variation | Sum of Squares                                     | Degrees of Freedom | Mean Squares                      | F
Factor A            | SSFA = b·n·Σ_{i=1}^{a} (Ȳi − Ȳ)²                   | a − 1              | MSFA = SSFA/(a − 1)               | FA = MSFA/MSR
Factor B            | SSFB = a·n·Σ_{j=1}^{b} (Ȳj − Ȳ)²                   | b − 1              | MSFB = SSFB/(b − 1)               | FB = MSFB/MSR
Interaction         | SSFAB = n·Σ_{i=1}^{a} Σ_{j=1}^{b} (Ȳij − Ȳi − Ȳj + Ȳ)² | (a − 1)·(b − 1)    | MSFAB = SSFAB/[(a − 1)·(b − 1)]   | FAB = MSFAB/MSR
Error               | RSS = Σ_{i} Σ_{j} Σ_{k} (Yijk − Ȳij)²              | (n − 1)·a·b        | MSR = RSS/[(n − 1)·a·b]           |
Total               | TSS = Σ_{i} Σ_{j} Σ_{k} (Yijk − Ȳ)²                | N − 1              |                                   |

Source: Fávero, L.P., Belfiore, P., Silva, F.L., Chan, B.L., 2009. Análise de dados: modelagem multivariada para tomada de decisões. Campus Elsevier, Rio de Janeiro; Maroco, J., 2014. Análise estatística com o SPSS Statistics, sixth ed. Edições Sílabo, Lisboa.

On the other hand, the ANOVA statistic for factor B is given by:

$$F_B = \frac{MSF_B}{MSR} = \frac{SSF_B/(b-1)}{RSS/\left[(n-1) \cdot a \cdot b\right]} \qquad (9.48)$$

where: MSFB is the mean square of factor B. And the ANOVA statistic for the interaction is represented by:

$$F_{AB} = \frac{MSF_{AB}}{MSR} = \frac{SSF_{AB}/\left[(a-1)(b-1)\right]}{RSS/\left[(n-1) \cdot a \cdot b\right]} \qquad (9.49)$$

where: MSFAB is the mean square of the interaction. The calculations of the two-way ANOVA are summarized in Table 9.5.

The calculated values of the statistics (FAcal, FBcal, and FABcal) must be compared to the critical values obtained from the F-distribution table (Table A in the Appendix): FAc = F(a−1, (n−1)ab, α), FBc = F(b−1, (n−1)ab, α), and FABc = F((a−1)(b−1), (n−1)ab, α). For each statistic, if the calculated value lies in the critical region (FAcal > FAc, FBcal > FBc, FABcal > FABc), we must reject the null hypothesis. Otherwise, we do not reject H0.

Example 9.13: Using the Two-Way ANOVA
A sample with 24 passengers who travel from Sao Paulo to Campinas in a certain week is collected. The following variables are analyzed: (1) travel time in minutes, (2) the bus company chosen, and (3) the day of the week. The main objective is to verify whether there is a relationship between the travel time and the bus company, between the travel time and the day of the week, and between the bus company and the day of the week. The levels considered in the variable bus company are Company A (1), Company B (2), and Company C (3). On the other hand, the levels regarding the day of the week are Monday (1), Tuesday (2), Wednesday (3), Thursday (4), Friday (5), Saturday (6), and Sunday (7). The results of the sample are shown in Table 9.E.17 and are available in the file Two_Way_ANOVA.sav as well. Test these hypotheses, considering a 5% significance level.


TABLE 9.E.17 Data From Example 9.13 (Using the Two-Way ANOVA)

Time (Min)  Company  Day of the Week
90          2        4
100         1        5
72          1        6
76          3        1
85          2        2
95          1        5
79          3        1
100         2        4
70          1        7
80          3        1
85          2        3
90          1        5
77          2        7
80          1        2
85          3        4
74          2        7
72          3        6
92          1        5
84          2        4
80          1        3
79          2        1
70          3        6
88          3        5
84          2        4

9.8.2.1.1 Solving the Two-Way ANOVA Test by Using SPSS Software

The use of the images in this section has been authorized by the International Business Machines Corporation©.
Step 1: In this case, the most suitable test is the two-way ANOVA. First, we must verify the normality of the variable Time (metric) in the model (as shown in Fig. 9.46). According to this figure, we can conclude that the variable Time follows a normal distribution, with a 95% confidence level. The hypothesis of variance homogeneity will be verified in Step 4.
Step 2: The null hypothesis H0 of the two-way ANOVA for this example assumes that the population means of each level of the factor Company and of each level of the factor Day_of_the_week are equal, that is, H0A: μ1 = μ2 = μ3 and H0B: μ1 = μ2 = … = μ7. The null hypothesis H0 also states that there is no interaction between the factor Company and the factor Day_of_the_week, that is, H0: γij = 0 for i ≠ j.
Step 3: The significance level to be considered is 5%.


FIG. 9.46 Results of the normality tests on SPSS.

FIG. 9.47 Procedure for elaborating the two-way ANOVA on SPSS.

Step 4: The F statistics in ANOVA for the factor Company, for the factor Day_of_the_week, and for the interaction Company * Day_of_the_week will be obtained through the SPSS software, according to the procedure specified below. In order to do that, let’s click on Analyze → General Linear Model → Univariate …, as shown in Fig. 9.47. After that, let’s include the variable Time in the box of dependent variables (Dependent Variable) and the variables Company and Day_of_the_week in the box Fixed Factor(s), as shown in Fig. 9.48. This example is based on the two-way ANOVA, in which the factors are fixed. If one of the factors were chosen randomly, it would be inserted into the box Random Factor(s). The button Model … defines the variance analysis model to be tested. Through the button Contrasts …, we can assess whether a category of one of the factors is significantly different from the other categories of the same factor. Charts can be constructed through the button Plots …, thus allowing the visualization of the existence or nonexistence of interactions between the factors. The button Post Hoc …, on the other hand, allows us to compare multiple means. Finally, from the button Options …, we can obtain descriptive statistics and the result of Levene’s variance homogeneity test, as well as select the appropriate significance level (Fávero et al., 2009; Maroco, 2014).


FIG. 9.48 Selection of the variables to elaborate the two-way ANOVA.

Therefore, since we want to test variance homogeneity, we must select, in Options …, the option Homogeneity tests, as shown in Fig. 9.49. Finally, let’s click on Continue and on OK to obtain Levene’s variance homogeneity test and the two-way ANOVA table. In Fig. 9.50, we can see that the variances between groups are homogeneous (P = 0.451 > 0.05). Based on Fig. 9.51, we can conclude that there are no significant differences between the travel times of the companies analyzed, that is, the factor Company does not have a significant impact on the variable Time (P = 0.330 > 0.05). On the other hand, we conclude that there are significant differences between the days of the week, that is, the factor Day_of_the_week has a significant effect on the variable Time (P = 0.003 < 0.05). We finally conclude that there is no significant interaction, with a 95% confidence level, between the two factors Company and Day_of_the_week, since P = 0.898 > 0.05.

9.8.2.1.2 Solving the Two-Way ANOVA Test by Using Stata Software

The use of the images in this section has been authorized by StataCorp LP©. The command anova on Stata specifies the dependent variable being analyzed, as well as the respective factors. The interactions are specified using the character # between the factors. Thus, the two-way ANOVA is generated through the following syntax:

anova variabley* factorA* factorB* factorA*#factorB*

or simply:

anova variabley* factorA*##factorB*

in which the term variabley* should be substituted for the quantitative dependent variable and the terms factorA* and factorB* for the respective factors. If we type the syntax anova variabley* factorA* factorB*, only the ANOVA for each factor will be elaborated, without the interaction between the factors.


FIG. 9.49 Test of variance homogeneity.

FIG. 9.50 Results of Levene’s test on SPSS.

The data presented in Example 9.13 are available in the file Two_Way_ANOVA.dta. The quantitative dependent variable is called time, and the factors correspond to the variables company and day_of_the_week. Thus, we must type the following command:

anova time company##day_of_the_week

The results can be seen in Fig. 9.52 and are similar to those presented on SPSS, which allows us to conclude, with a 95% confidence level, that only the factor day_of_the_week has a significant effect on the variable time (P = 0.003 < 0.05), and that there is no significant interaction between the two factors analyzed (P = 0.898 > 0.05).


FIG. 9.51 Results of the two-way ANOVA for Example 9.13 on SPSS.

FIG. 9.52 Results of the two-way ANOVA for Example 9.13 on Stata.

9.8.2.2 ANOVA With More Than Two Factors

The two-way ANOVA can be generalized to three or more factors. According to Maroco (2014), the model becomes very complex, since the effect of multiple interactions can make the effect of the factors a bit confusing. The generic model with three factors presented by the author is:

Y_ijkl = μ + α_i + β_j + γ_k + (αβ)_ij + (αγ)_ik + (βγ)_jk + (αβγ)_ijk + ε_ijkl   (9.50)

9.9 FINAL REMARKS

This chapter presented the concepts and objectives of parametric hypotheses tests and the general procedures for constructing each one of them. We studied the main types of tests and the situations in which each one of them must be used. Moreover, the advantages and disadvantages of each test were established, as well as their assumptions. We studied the tests for normality (Kolmogorov-Smirnov, Shapiro-Wilk, and Shapiro-Francia), variance homogeneity tests (Bartlett's χ², Cochran's C, Hartley's Fmax, and Levene's F), Student's t-test for one population mean, for two independent means, and for two paired means, as well as ANOVA and its extensions.


Regardless of the application's main goal, parametric tests can provide good and interesting research results that will be useful in the decision-making process. Beyond a conscious choice of the modeling software, the correct use of each test must always be grounded in the underlying theory, without ever ignoring the researcher's experience and intuition.

9.10 EXERCISES

(1) In what situations should parametric tests be applied, and what are the assumptions of these tests?
(2) What are the advantages and disadvantages of parametric tests?
(3) What are the main parametric tests to verify the normality of the data? In what situations must we use each one of them?
(4) What are the main parametric tests to verify the variance homogeneity between groups? In what situations must we use each one of them?
(5) To test a single population mean, we can use the z-test and Student's t-test. In what cases must each one of them be applied?
(6) What are the main mean comparison tests? What are the assumptions of each test?
(7) The monthly aircraft sales data throughout last year can be seen in the table below. Check and see if there is normality in the data. Consider α = 5%.

Jan.  Feb.  Mar.  Apr.  May  Jun.  Jul.  Aug.  Sept.  Oct.  Nov.  Dec.
48    52    50    49    47   50    51    54    39     56    52    55
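If one wanted to run such a normality check outside the book's software, a quick sketch with scipy's Shapiro-Wilk test (one of the tests covered in this chapter) on the twelve monthly values could look like this:

```python
# Sketch: Shapiro-Wilk normality test on the monthly sales data (Exercise 7)
from scipy import stats

sales = [48, 52, 50, 49, 47, 50, 51, 54, 39, 56, 52, 55]
stat, p = stats.shapiro(sales)
# At alpha = 0.05, reject the normality hypothesis only if p < 0.05
print(f"W = {stat:.3f}, p = {p:.3f}")
```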

(8) Test the normality of the temperature data listed (α = 5%):

12.5  14.2  13.4  14.6  12.7  10.9  16.5  14.7  11.2  10.9  12.1  12.8
13.8  13.5  13.2  14.1  15.5  16.2  10.8  14.3  12.8  12.4  11.4  16.2
14.3  14.8  14.6  13.7  13.5  10.8  10.4  11.5  11.9  11.3  14.2  11.2
13.4  16.1  13.5  17.5  16.2  15.0  14.2  13.2  12.4  13.4  12.7  11.2

(9) The table shows the final grades of two students in nine subjects. Check and see if there is variance homogeneity between the students (α = 5%).

Student 1:  6.4  5.8  6.9  5.4  7.3  8.2  6.1  5.5  6.0
Student 2:  6.5  7.0  7.5  6.5  8.1  9.0  7.5  6.5  6.8
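A variance homogeneity check of this kind could be sketched with scipy's implementation of Levene's test (one of the homogeneity tests covered in this chapter), using the two students' grades:

```python
# Sketch: Levene's variance homogeneity test on the grades (Exercise 9)
from scipy import stats

student1 = [6.4, 5.8, 6.9, 5.4, 7.3, 8.2, 6.1, 5.5, 6.0]
student2 = [6.5, 7.0, 7.5, 6.5, 8.1, 9.0, 7.5, 6.5, 6.8]
stat, p = stats.levene(student1, student2)
# At alpha = 0.05, variances are considered homogeneous if p >= 0.05
print(f"W = {stat:.3f}, p = {p:.3f}")
```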

(10) A fat-free yogurt manufacturer states that the number of calories in each cup is 60 cal. In order to check whether this information is true, a random sample of 36 cups is collected; the average number of calories observed was 65 cal, with a standard deviation of 3.5. Apply the appropriate test and check if the manufacturer's statement is true, considering a significance level of 5%.
(11) We would like to compare the average waiting time before being seen by a doctor (in minutes) in two hospitals. In order to do that, we collected a sample of 20 patients from each hospital. The data are shown in the tables below. Check and see if there are differences between the average waiting times in the two hospitals. Consider α = 1%.


Hospital 1
72  58  91  88  70  76  98  101  65  73  79  82  80  91  93  88  97  83  71  74

Hospital 2
66  40  55  70  76  61  53  50  47  61  52  48  60  72  57  70  66  55  46  51
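This comparison of two independent means could be sketched with scipy's two-sample t-test (assuming equal variances, the setup of Student's t-test for independent samples covered in this chapter), using the waiting times above:

```python
# Sketch: independent-samples t-test on the waiting times (Exercise 11)
from scipy import stats

hospital1 = [72, 58, 91, 88, 70, 76, 98, 101, 65, 73,
             79, 82, 80, 91, 93, 88, 97, 83, 71, 74]
hospital2 = [66, 40, 55, 70, 76, 61, 53, 50, 47, 61,
             52, 48, 60, 72, 57, 70, 66, 55, 46, 51]
t, p = stats.ttest_ind(hospital1, hospital2)  # equal variances assumed
# At alpha = 0.01, reject the equality of means if p < 0.01
print(f"t = {t:.3f}, p = {p:.4f}")
```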

(12) Thirty teenagers whose total cholesterol level is higher than what is advisable underwent treatment that consisted of a diet and physical activities. The tables show the levels of LDL cholesterol (mg/dL) before and after the treatment. Check if the treatment was effective (α = 5%).

Before the treatment
220  212  227  234  204  209  211  245  237  250  208  224  220  218  208
205  227  207  222  213  210  234  240  227  229  224  204  210  215  228

After the treatment
195  180  200  204  180  195  200  210  205  211  175  198  195  200  190
200  222  198  201  194  190  204  230  222  209  198  195  190  201  210
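Because the same 30 teenagers are measured twice, this is a paired design; a sketch with scipy's paired t-test (Student's t-test for two paired means, covered in this chapter) on the values above:

```python
# Sketch: paired-samples t-test on the cholesterol data (Exercise 12)
from scipy import stats

before = [220, 212, 227, 234, 204, 209, 211, 245, 237, 250,
          208, 224, 220, 218, 208, 205, 227, 207, 222, 213,
          210, 234, 240, 227, 229, 224, 204, 210, 215, 228]
after = [195, 180, 200, 204, 180, 195, 200, 210, 205, 211,
         175, 198, 195, 200, 190, 200, 222, 198, 201, 194,
         190, 204, 230, 222, 209, 198, 195, 190, 201, 210]
# Dependent samples: the same subjects measured before and after treatment
t, p = stats.ttest_rel(before, after)
# At alpha = 0.05, reject the equality of means if p < 0.05
print(f"t = {t:.3f}, p = {p:.4f}")
```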

(13) An aerospace company produces civilian and military helicopters at its three factories. The tables show the monthly production of helicopters in the last 12 months at each factory. Check if there is a difference between the population means. Consider α = 5%.

Factory 1
24  26  28  22  31  25  27  28  30  21  20  24

Factory 2
26  24  30  24  27  25  29  30  27  26  25  25

Factory 3
24  26  20  22  22  27  20  26  24  25  28  29
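Comparing three group means calls for the one-way ANOVA covered earlier in the chapter; a sketch with scipy follows. Note that the grouping of the values into the second and third factories is an assumption about the original table layout (only the Factory 1 row is unambiguous in the source).

```python
# Sketch: one-way ANOVA on the monthly factory output (Exercise 13).
# The split of the last two rows is an ASSUMPTION about the table layout.
from scipy import stats

factory1 = [24, 26, 28, 22, 31, 25, 27, 28, 30, 21, 20, 24]
factory2 = [26, 24, 30, 24, 27, 25, 29, 30, 27, 26, 25, 25]
factory3 = [24, 26, 20, 22, 22, 27, 20, 26, 24, 25, 28, 29]
f, p = stats.f_oneway(factory1, factory2, factory3)
# At alpha = 0.05, reject the equality of the three means if p < 0.05
print(f"F = {f:.3f}, p = {p:.4f}")
```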