Structured additive distributional zero augmented beta regression modeling of mortality in Nigeria


Journal Pre-proof

PII: S2211-6753(20)30009-9
DOI: https://doi.org/10.1016/j.spasta.2020.100415
Reference: SPASTA 100415

To appear in: Spatial Statistics

Received date: 4 November 2019
Revised date: 31 December 2019
Accepted date: 23 January 2020

Please cite this article as: E. Gayawan, O.D. Fasusi and D. Bandyopadhyay, Structured additive distributional zero augmented beta regression modeling of mortality in Nigeria. Spatial Statistics (2020), doi: https://doi.org/10.1016/j.spasta.2020.100415.

This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

© 2020 Elsevier B.V. All rights reserved.

Ezra Gayawan¹, Oluwatoyin Deborah Fasusi¹, Dipankar Bandyopadhyay²

¹ Department of Statistics, Federal University of Technology, Akure, Nigeria
² Department of Biostatistics, Virginia Commonwealth University, Richmond, VA, U.S.A.

Abstract

Child mortality has remained persistently high in most sub-Saharan African countries. Most efforts to analyse its determinants have not considered the duration of exposure to mortality risks. In addition, covariates are usually linked only to the mean of the response variable, thereby neglecting possible associations with higher moments. In this paper, we account for the duration of exposure via the child mortality index, defined as the ratio of observed to expected child deaths, for all women captured in the 2013 Nigeria Demographic and Health Survey. Based on this index, a structured additive distributional beta regression model was adopted to examine covariate effects on the probability of a woman experiencing no child mortality, the conditional expectation of mortality, and the mortality spread, controlling for latent spatial associations. Inference is Bayesian, powered by generic MCMC tools based on iteratively weighted least squares. Results confirm the existence of significant variation in the likelihood of a woman experiencing no child mortality, and in the spread of mortality, across Nigerian states. Findings also show that although mortality is fairly spread among women aged ≥ 30 years, it is concentrated among the younger women.

Keywords: Zero-augmented beta; child mortality; spatial variation; life table; Nigeria

1. Introduction

The survival probability of young children is a major indicator of a country’s socio-economic, health and other developmental indices (Gayawan and Turra, 2015). Though appreciable progress has been made in reducing the burden of child mortality globally, the statistics from developing countries are still terrifying, especially in sub-Saharan Africa, where roughly half of the estimated 5.4 million deaths of children under five years of age in 2017 were concentrated (UNICEF, 2019).

∗ Address for correspondence: Biostatistics and Spatial Statistics Laboratory, Department of Statistics, Federal University of Technology, PMB 704, Akure, Ondo State, Nigeria. E-mail: [email protected]

Preprint submitted to Elsevier, January 25, 2020

Mortality, the extreme outcome of diseases in children, is usually the cumulative consequence of multiple disease processes. Classical approaches for analyzing child mortality in developing countries have included parametric regressions or survival modeling, mainly restricted to children under the age of five years, which consequently do not take cognizance of the duration of exposure to diseases or the cumulative effects of other risks that enhance mortality. However, mortality among older children contributes substantially to overall child mortality. Furthermore, (child) mortality can be spatially clustered, i.e., the mortality experience of women residing within proximally located geographical entities (such as a state and its neighboring states in a country) can be similar, and quite different from that in states that are geographically distant. In addition, among a group of women located within a geographic unit, mortality experience may also vary substantially due to differences in cultural beliefs and practices, which often affect child care and health-seeking behaviour. All these need to be taken into account in describing the experience in a multicultural setting like Nigeria, which is the focus of this study.

The geographical location where a child lives often sets the stage for the operation of other factors that determine the survival probability of the child. Individuals located in the urban areas and other more favoured settings of most developing countries usually have access to social amenities and improved health care facilities, which jointly enhance their well-being and that of their children. However, such access is often a concern for the non-privileged ones. Consequently, there has been considerable interest in estimating spatial differences in mortality among young children, and in the distribution of other health issues, in many developing countries (Adebayo and Fahrmeir, 2005; Kandala et al., 2009; Kazembe and Namangale, 2007; Yadav et al., 2015; Gayawan et al., 2016; Kinyoki et al., 2017). Inequality in the distribution of available wealth could also lead to varied levels of morbidity and mortality among people living in the same geographical environment. However, most of these studies fail to account for the length of exposure, i.e., time since birth (Preston and Haines, 1991), to the risks of illness and, hence, mortality. They have also been limited to linking covariates to the average mortality, neglecting possible associations with higher moments such as the variance of the response variable. As a child lives longer in an unfavourable condition, the effects of that condition accumulate over time and can influence the severity of its impact on the child’s well-being. Given that mortality levels are generally changing over time as living conditions change rapidly with modernization, children born around the same period are likely to be exposed to similar risks of death, which will differ from those of children born in other periods. Model life tables (United Nations, 1982) provide valuable insight into the manner in which the age pattern of mortality varies as the mortality level changes with time (Preston and Haines, 1991; Gayawan and Turra, 2015).

This study was motivated by the need to investigate the spatial distributions of the level and spread of mortality among children in Nigeria. We focus on women who have given birth to at least one child, as recorded in the 2013 Nigeria Demographic and Health Survey (NDHS) database. To account for the duration of exposure to mortality risks, we adopt an indirect technique by computing a mortality index, defined as the ratio of observed to expected child deaths, as described by Preston and Haines (1991) and Trussell and Preston (1982). While the observed deaths count the number of child deaths for each woman, the expected deaths accumulate the risks of dying from birth to the current age of the child at the time of the survey. Being an index, the variable was restricted to the interval [0, 1). To model such data, some researchers have considered a mixture of the multiplicative logistic normal distribution and a degenerate distribution that assigns a probability to the essential zeros (Stewart and Field, 2011). However, such an approach involving data transformations often lacks appeal, primarily because study findings based on transformed data are difficult to interpret in light of the study hypothesis generated from the original data (Feng et al., 2014). In this context, the two-parameter beta regression (henceforth, BR) (Ferrari and Cribari-Neto, 2004), which naturally models a response bounded in (0, 1), is also not applicable here, given that the index exhibits an abundance of ‘essential’ zeros resulting from women who never experienced child mortality. We circumvent this via the zero-augmented beta (ZAB) regression model (Stewart, 2013; Bandyopadhyay et al., 2017), which initiates a mixture setup by augmenting the beta density with the probability of observing zero (the third parameter).

The broad and generic distributional regression (DR) framework (Klein et al., 2015), similar in spirit to the generalized additive models for location, scale and shape (Rigby and Stasinopoulos, 2005), expands exponential family regression by encompassing continuous, discrete and mixed discrete-continuous responses. Within the premise of DR, it is possible to link the three parameters of the ZAB model (the probability of observing zero, and the two from the BR) to regression predictors to estimate the covariate effects on each of them, rather than modeling only the mean mortality. Our major contribution is to extend the DR framework to ZAB responses with structured additive predictors (Fahrmeir et al., 2013; Klein et al., 2015), i.e., where the predictor space for each parameter includes a nonparametric effect of a continuous covariate (age), linear effects of categorical covariates, and a spatial component. Our inferential paradigm is Bayesian, powered by a Markov chain Monte Carlo (MCMC) simulation algorithm based on distribution-specific iteratively weighted least squares approximations to the full conditionals (Klein et al., 2015), with a multivariate Gaussian prior to enforce desired properties, thus yielding numerically stable estimates. With these, we are able to discern the spatial association not only in the levels of child mortality, but also in its spread, and in the probability of a woman not experiencing child death. To the best of our knowledge, this initiative advances previous approaches to estimating child mortality by attempting to disentangle covariate and spatial effects on important mortality summaries.

The rest of the article is structured as follows. In Section 2, we describe our motivating NDHS database and construct the mortality index. After an introduction to the ZAB regression (ZABR) model, Section 3 develops the corresponding structured additive DR modeling and the related Bayesian inference. Section 4 summarizes the posterior estimates derived from fitting the model to the NDHS data. Finally, Section 5 presents a conclusion, with directions for future research.

2. Data

2.1. 2013 NDHS survey data

Our motivating database is the 2013 NDHS, generated by the Demographic and Health Surveys (DHS) Program, which is responsible for collecting and disseminating accurate, nationally representative data on health and population in developing countries. Data from the surveys are freely available from the DHS website (www.dhsprogram.com) upon request, which entails filling in a prescribed form on the website and submitting a short description of the intended use. If approved, the user is notified within a short time, possibly within 2 working days. The sampling frame was based on the list of enumeration areas used during the 2006 Population Census of Nigeria. Nigeria is administratively divided into 36 states and a Federal Capital Territory (FCT), and each state is further divided into local government areas (LGAs). Data were collected through a stratified, three-stage cluster design consisting of 904 clusters: 372 in urban areas and 532 in rural areas. In all, a representative sample of 40,320 households was selected for the survey, with a minimum target of 943 completed interviews per state. All women aged 15 to 49 who were either permanent residents of or visitors in the selected households were eligible for individual interview. A total of 39,902 women were identified as eligible for individual interviews, and 98 percent of them were successfully interviewed. The DHS data sets come in different formats, each containing information on the households, women, men, and children. The Birth Recode data set, which contains the birth history of all the interviewed women and information on all children ever born to them, was used for this study. The date of birth of every child ever born and information on whether the child is dead or alive were extracted together with other important covariates and used in creating the mortality index, described in Section 2.2. Dummy variables were generated from all categorical variables.

Extensive literature exists on the socio-economic and demographic factors that shape child mortality in resource-poor countries. Following a careful review of the analytical framework by Mosley and Chen (1984) and other studies focusing on Nigeria (Adebayo and Fahrmeir, 2005; Gayawan et al., 2016), the following variables were considered in the study: mother’s age, mother’s level of education, type of place of residence, woman’s working status, exposure to mass media (newspaper, radio, and television; whether or not the woman accesses these at least once a week), household source of water, type of household toilet facility, cooking fuel used, and whether or not the household has electricity. Figure 1 presents a labelled map of Nigeria, showing the administrative states and the Federal Capital Territory, Abuja.

Figure 1: Map of Nigeria showing the 36 states and the Federal Capital Territory (FCT)

2.2. Mortality index

The mortality index was created as the ratio of actual to expected child deaths for each woman, as described by Trussell and Preston (1982) and Preston and Haines (1991). The actual deaths count the number of deaths of children to each woman as reported in the survey data set. Expected child deaths were obtained by accumulating the probabilities of dying between the date of birth and the age at the time of the survey for all children borne by each woman. These probabilities were derived from a standard model life table, the general pattern of the United Nations Model Life Table for developing countries (United Nations, 1982). For each cohort of children, the probability of dying from birth to exact age x was obtained as:

$$q(x) = 1 - \exp\left(-\sum_{x} q_x\right) \qquad (1)$$

where $q_x$ is the probability that a child aged exactly x would die before reaching age x + 1. Summing these probabilities q(x) for each woman based on children ever born yields her expected child death (ECD). The mortality index, MI, is then calculated as:

$$MI = \frac{ACD}{ECD} \qquad (2)$$

where ACD is the count of actual deaths for each woman. This index has been found to provide robust and reliable estimates of mortality when used as a response variable in a regression analysis (Gayawan and Turra, 2015; Preston and Haines, 1991).
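As a minimal sketch of equations (1) and (2), the Python snippet below accumulates age-specific probabilities q_x into q(x), sums q(x) over a woman's children to obtain ECD, and divides ACD by ECD. The q_x values, the children's ages and the function names are hypothetical illustrations and are not taken from the United Nations model life table. The sum in (1) is read here as running over the ages below x, which is our assumption. The sketch returns the raw ratio; how the index is confined to [0, 1) is not spelled out in the text, so no rescaling is attempted.

```python
import math

# Hypothetical one-year probabilities of dying q_x for ages 0..4
# (illustrative values only, not taken from any model life table).
QX = [0.08, 0.03, 0.02, 0.01, 0.01]

def prob_dying_by_age(x, qx=QX):
    """Equation (1): q(x) = 1 - exp(-sum of q_a over ages a below x)."""
    return 1.0 - math.exp(-sum(qx[:x]))

def mortality_index(ages_of_children, deaths, qx=QX):
    """Equation (2): MI = ACD / ECD for one woman.

    ages_of_children: completed years since birth of every child ever born
    deaths: actual number of child deaths reported by the woman (ACD)
    """
    # Ages beyond the end of the toy table are capped at its last age.
    ecd = sum(prob_dying_by_age(min(a, len(qx)), qx) for a in ages_of_children)
    return deaths / ecd if ecd > 0 else float("nan")

mi_none = mortality_index([1, 3, 5], deaths=0)  # woman with no child death
mi_one = mortality_index([1, 3, 5], deaths=1)   # same births, one death
```

A woman with no reported death gets an index of exactly zero, which is the source of the 'essential' zeros that the zero-augmented model is designed to handle.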

3. Statistical Modeling

3.1. ZABR model

The ZABR model, a mixed discrete-continuous regression model, is an extension of the BR introduced by Ferrari and Cribari-Neto (2004). Given a response variable $y_i \in (0, 1)$ following a beta density with parameters p and q, the corresponding density function is given by

$$f(y_i \mid \mu, \sigma^2) = \frac{y_i^{p-1}(1 - y_i)^{q-1}}{B(p, q)} \qquad (3)$$

where p, q > 0 and B(p, q) is the beta function, expressed as $B(p, q) = \frac{\Gamma(p)\Gamma(q)}{\Gamma(p+q)} = \frac{(p-1)!\,(q-1)!}{(p+q-1)!}$ (the factorial form applies when p and q are positive integers). The parametrization $\mu = \frac{p}{p+q}$, $\sigma^2 = \frac{1}{p+q+1}$, where $\mu, \sigma^2 \in (0, 1)$, yields $E(y_i) = \mu_i$, with $\sigma_i^2$ directly proportional to the variance, i.e., $Var(y_i) = \sigma_i^2 \mu_i (1 - \mu_i)$. For the ZAB density, a new parameter θ is introduced to account for the probability of observations at zero. The density now becomes

$$f(y \mid \mu, \sigma^2, \theta) = \begin{cases} \theta, & y = 0 \\ (1 - \theta)\,\dfrac{y^{p-1}(1 - y)^{q-1}}{B(p, q)}, & y \in (0, 1) \end{cases} \qquad (4)$$

where 0 ≤ θ ≤ 1, such that the mean and variance of the ZAB model become $E(y_i) = (1 - \theta)\mu_i$ and $Var(y_i) = (1 - \theta)\sigma_i^2 \mu_i (1 - \mu_i)$, respectively. The maximum likelihood estimate for θ is the proportion of zeros in the sample, i.e., $\hat{\theta} = \frac{n_0}{n}$, where $n_0$ is the number of observed zeros among the n observations.

3.2. Structured additive distributional regression

Consider the scalar response variables $y_1, \ldots, y_n$ and covariate information $\upsilon$ collected from the n respondents, in this case, mothers. Under a structured additive distributional regression (Klein et al., 2015) framework, the parameters $\vartheta = (\vartheta_1 = \theta, \vartheta_2 = \mu, \vartheta_3 = \sigma^2)$ of the ZABR model can be linked to the semi-parametric regression predictors $\eta_i^{\vartheta_k}$ through suitable (one-to-one) link functions, such that $\vartheta_{ik} = h_k(\eta_i^{\vartheta_k})$. The link function ensures appropriate restrictions on the parameter space. For example, one may employ the logit, probit or complementary log-log (cloglog) link functions for µ and θ, and a log link on $\sigma^2$ to ensure positivity. The generic form of the structured additive distributional model is given by:

$$\eta_i^{\vartheta_k} = \beta_0^{\vartheta_k} + \sum_{j=1}^{J_k} f_j^{\vartheta_k}(\upsilon) \qquad (5)$$

where $\vartheta_k$ is a generic parameter k and $f_j^{\vartheta_k}(\upsilon)$ represents various functions defined on the complete covariates. The predictors for the different distributional parameters can be linked to entirely different functions, or to different numbers of functions. Simplifying notation, each function $f_j$ of (5) can be represented as a linear combination of basis functions, such that, in matrix form, $f_j = Z_j \beta_j$, where $Z_j$ is a design matrix and $\beta_j$ is the vector of coefficients to be estimated. This leads to the following matrix representation:

$$\eta = \beta_0 \mathbf{1} + Z_1 \beta_1 + \cdots + Z_l \beta_l \qquad (6)$$

A multivariate normal prior is then assumed for each parameter vector $\beta_j$:

$$p(\beta_j \mid \tau_j^2) \propto \left(\frac{1}{\tau_j^2}\right)^{\operatorname{rk}(K_j)/2} \exp\left(-\frac{1}{2\tau_j^2}\,\beta_j' K_j \beta_j\right) \qquad (7)$$

where $K_j$ is a prior precision matrix which corresponds to the penalty matrix in a frequentist formulation. The hyperparameters $\tau_j^2$ are assigned inverse gamma hyperpriors.

For linear effects, $f_j(\upsilon) = x_i' \beta_j$, where $x_i'$ is a subvector of $\upsilon$ for all appropriately coded categorical variables, which include mother’s level of education, working status, and so on. In this case, $Z_j = X$, the data matrix, and a non-informative prior was considered such that $K_j = 0$. To account for possible non-linear effects of the (continuous) mother’s age, we have $f_j(\upsilon) = f_j(w_i)$, where $f_j$ is an appropriate smooth function of the single continuous variable $w_i$. Here, $Z_j$ is constructed through a B-spline basis evaluated at the observation $w_i$ (Eilers and Marx, 1996; Brezger and Lang, 2006), and $K_j = \theta^2 R'R$, where R can be a first- or second-order random walk difference matrix and $\theta^2$ is a hyperparameter (distinct from the zero-probability parameter θ of Section 3.1). This hyperparameter was assigned an inverse gamma prior, i.e., $\theta^2 \sim IG(a, b)$, with a and b chosen such that the prior is non-informative.
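The penalty structures used in this section can be sketched in a few lines of NumPy. The snippet builds the random-walk penalty R'R from a difference matrix, and, for the discrete spatial effect of this section, an adjacency-based penalty together with the 0/1 indicator design matrix. The function names and the three-region toy map are hypothetical, and placing the number of neighbours on the diagonal of the spatial penalty is the usual Markov random field convention, assumed here rather than stated in the text.

```python
import numpy as np

def rw_penalty(n_coef, order=2):
    """Penalty R'R, with R the first- or second-order difference matrix."""
    R = np.diff(np.eye(n_coef), n=order, axis=0)
    return R.T @ R

def mrf_penalty(adjacency):
    """Adjacency-based penalty: -1 for neighbours, 0 otherwise.

    Assumption: the diagonal holds each region's number of neighbours.
    """
    A = np.asarray(adjacency, dtype=float)
    return np.diag(A.sum(axis=1)) - A

def indicator_design(region_index, n_regions):
    """Z[i, p] = 1 if observation i belongs to region p, and 0 otherwise."""
    Z = np.zeros((len(region_index), n_regions))
    Z[np.arange(len(region_index)), region_index] = 1.0
    return Z

# Second-order random-walk penalty for six spline coefficients.
K_rw2 = rw_penalty(6, order=2)

# Toy map of three regions in a row: 0-1 and 1-2 share a border.
A_toy = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
K_mrf = mrf_penalty(A_toy)

# Four hypothetical respondents living in regions 0, 2, 2 and 1.
Z_toy = indicator_design([0, 2, 2, 1], n_regions=3)
```

Both penalties are rank deficient (constant vectors are unpenalised, and linear trends as well for the second-order random walk), which is why the rank of $K_j$ rather than its dimension appears in the exponent of the prior (7).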

For the discrete spatial variable, $f_j(\upsilon) = f_j(s_i)$, where $s_i \in \{1, \ldots, 37\}$ is a discrete spatial unit for the residence of the ith mother observed over the states of Nigeria. Here, the matrix $Z_j$ has n rows and 37 columns (corresponding to the number of states involved), where the entry in the ith row and pth column is 1 if the ith woman is from the pth state, and 0 otherwise. Markov random fields are a common approach for estimating this spatial effect (Rue and Held, 2005). In this case, $K = \frac{1}{\theta^2} K_j$, with $K_j$ being a penalty matrix with −1 for states which share a common boundary and 0 for distant states. Again, $\theta^2$ is assigned an inverse gamma prior.

3.3. Bayesian inference

The complex likelihood structures of the non-standard distributions utilised in distributional regression result, in most cases, in full conditionals for the unknown regression coefficients that are analytically intractable. As a result, fully Bayesian inference is based on the posterior distribution of the model parameters, which is not of known form. Consequently, we resort to Markov chain Monte Carlo (MCMC) sampling techniques to generate samples from the full conditionals for the linear, nonlinear, and spatial effects and the smoothing parameters, to be used for posterior analysis. The MCMC sampler was executed as a Metropolis-Hastings algorithm based on the iteratively weighted least squares (IWLS) proposals developed by Klein et al. (2015), and implemented in BayesX, a software package for Bayesian inference in structured additive regression models.

The MCMC simulation was carried out with a total of 35,000 iterations, a burn-in sample of 5,000, and thinning to every 30th observation for parameter estimation. Convergence was monitored through plots of the sampling paths for all parameters. We considered three models of different specifications in order to ascertain the gain of controlling for covariates in smoothing the structured spatial effects. The first model (Model I) is the ‘only spatial’ model, without accounting for any covariates. The second (Model II) includes all covariates and the spatial term, but estimates age as a linear effect rather than a smooth function. Finally, in Model III, age was included as a nonlinear effect, in addition to all available covariates and the spatial term. Model comparison was based on the Deviance Information Criterion, or DIC (Spiegelhalter et al., 2002). The structured additive distributional model described in Section 3.2 was implemented with the three-parameter ZABR framework presented in Section 3.1 to describe the geographical differences in various aspects of mortality in Nigeria. The full model is described as follows:

$$
\begin{aligned}
\eta^{\theta} &= \beta_0^{\theta} + \text{urban}\,\beta_1^{\theta} + \cdots + \text{work}\,\beta_p^{\theta} + f^{\theta}(\text{age}) + f_{spat}^{\theta}(s) \\
\eta^{\mu} &= \beta_0^{\mu} + \text{urban}\,\beta_1^{\mu} + \cdots + \text{work}\,\beta_p^{\mu} + f^{\mu}(\text{age}) + f_{spat}^{\mu}(s) \\
\eta^{\sigma^2} &= \beta_0^{\sigma^2} + \text{urban}\,\beta_1^{\sigma^2} + \cdots + \text{work}\,\beta_p^{\sigma^2} + f^{\sigma^2}(\text{age}) + f_{spat}^{\sigma^2}(s)
\end{aligned} \qquad (8)
$$

where $\beta_0$ is the intercept term, $\beta_p$ is the linear parameter for working status, f(age) is the nonlinear effect of age (in years), and $f_{spat}(s)$ is the discrete spatial effect for the state where the child resides.
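To make the link between the three predictors in (8) and the ZAB density (4) concrete, the following Python sketch evaluates the log-density of a single observation from given predictor values. It assumes logit links for θ and µ and a log link for σ², which are among the options named in Section 3.2; the links actually used in the fitted model are not stated here. The function names are our own, and the sketch simply rejects values of σ² at or above one, for which the parametrization of Section 3.1 has no valid p and q.

```python
import math

def inv_logit(eta):
    """Inverse logit, mapping a real predictor into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))

def zab_logpdf(y, eta_theta, eta_mu, eta_sigma2):
    """Log-density of the zero-augmented beta at y in [0, 1).

    Assumed links: logit for theta and mu, log for sigma2.
    """
    theta = inv_logit(eta_theta)
    mu = inv_logit(eta_mu)
    sigma2 = math.exp(eta_sigma2)
    if sigma2 >= 1.0:
        raise ValueError("sigma2 must lie in (0, 1)")
    if y == 0:
        return math.log(theta)
    # Invert mu = p / (p + q) and sigma2 = 1 / (p + q + 1).
    scale = (1.0 - sigma2) / sigma2          # equals p + q
    p, q = mu * scale, (1.0 - mu) * scale
    log_beta_fn = math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q)
    return (math.log1p(-theta)
            + (p - 1.0) * math.log(y)
            + (q - 1.0) * math.log1p(-y)
            - log_beta_fn)

# With mu = 0.5 and sigma2 = 1/3 the beta part is uniform (p = q = 1),
# so the density on (0, 1) reduces to 1 - theta.
log_dens_zero = zab_logpdf(0.0, 0.0, 0.0, math.log(1.0 / 3.0))
log_dens_pos = zab_logpdf(0.3, 0.0, 0.0, math.log(1.0 / 3.0))
```

Summing such terms over all women gives the log-likelihood that the IWLS-based proposals approximate; the sampler itself is not reproduced here.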

4. Results

Table 1 presents the frequency distribution of the women and children used in this study, classified according to various socio-demographic and behavioral variables. Overall, information on 27,451 women who altogether gave birth to 119,386 children was analysed. As highlights, about 44% of the women had no education, and these women gave birth to about half of the children (51%), while the other half were born to women who attained at least primary education. Also, about 64% of the women, who gave birth to 68% of the children, live in rural areas, while the 62% living in households with unimproved toilet facilities gave birth to about 64% of the children. For other details, see Table 1. Figure 2(a-c) maps the proportion of women who have experienced no child death, the mean mortality index given that a death was recorded, and the conditional variance of the index, respectively, while panels (d-f) are the respective (spatially) smoothed versions based on Model I. Comparing the crude and smoothed maps, there seem to be no obvious differences in the spatial distributions. Most states in the southern and north-central regions, and Borno (in the north-east), have a high proportion of women with no child mortality, but the proportion is lower for states in the northern fringe of the country. The conditional mean index of child mortality appears higher for Bauchi, Kano, Zamfara, Kebbi and Katsina states, but lower in other parts of the country, while the variability of mortality given the occurrence of death is highest in Bauchi, Kano, Zamfara, Oyo and Ebonyi states.

Table 1: Frequency distribution of the women and children considered based on the categorical variables

Variables                    No of Women (%)      No of Children (%)
Total                        27,451 (100)         119,386 (100)
Education
  No Education (ref)         11,952 (43.54)       60,778 (50.91)
  Primary                    5,953 (21.69)        27,945 (23.41)
  Secondary                  7,475 (27.23)        24,388 (20.43)
  Higher                     2,071 (7.54)         6,275 (5.26)
Residence
  Urban (ref)                9,778 (35.62)        38,786 (32.46)
  Rural                      17,673 (64.38)       80,600 (67.51)
Water
  Unprotected (ref)          11,046 (40.24)       68,459 (57.34)
  Protected                  16,342 (59.53)       50,632 (42.41)
  Missing                    63 (0.23)            295 (0.25)
Toilet
  Unimproved (ref)           17,010 (61.96)       76,353 (63.95)
  Improved                   10,410 (37.92)       42,899 (35.93)
  Missing                    31 (0.11)            134 (0.11)
Electricity
  No (ref)                   13,913 (50.68)       64,494 (54.02)
  Yes                        13,492 (49.15)       54,703 (45.82)
  Missing                    46 (0.17)            189 (0.16)
Newspaper
  No (ref)                   22,905 (83.44)       103,870 (87.00)
  Yes                        4,387 (15.98)        14,815 (12.41)
  Missing                    159 (0.58)
Radio
  No (ref)                   10,517 (38.31)       48,891 (40.95)
  Yes                        16,851 (61.39)       70,120 (58.73)
  Missing                    83 (0.30)            375 (0.31)
Television
  No (ref)                   13,796 (50.26)       66,678 (55.85)
  Yes                        13,547 (49.35)       52,239 (43.76)
  Missing                    108 (0.39)           469 (0.39)
Cooking fuel
  Any other means (ref)      1,541 (5.62)         6,522 (5.46)
  Wood                       20,300 (73.95)       94,944 (79.53)
  Kerosene                   4,797 (17.47)        15,488 (12.97)
  Electricity/Gas            496 (1.81)           1,435 (1.20)
  Missing                    317 (1.15)           997 (0.84)
Work
  No (ref)                   19,932 (72.61)       90,776 (76.04)
  Yes                        7,410 (26.99)        28,210 (23.63)
  Missing                    109 (0.40)           400 (0.34)

Figure 2: Maps of Nigeria showing (a) unadjusted proportion of women who have not experienced death of a child; (b) unadjusted mean mortality index; (c) unadjusted variance of the index; (d) smoothed proportion (θ); (e) smoothed mean (µ); and (f) smoothed variance (σ²)

Table 2 presents the summary DIC statistics comparing Models I-III. It is clear that Model III, which incorporates the spatial effect, the nonlinear effect of age and the linear effects of other covariates, has the lowest DIC and thus performs better than both Models I and II. As expected, model complexity (as reflected in the estimate of pD, the effective number of parameters) increases substantially from Model I to Model III. Henceforth, the discussion of our findings will be based on Model III, our model of choice.

Table 2: Model fit and complexity criteria

             Deviance       pD         DIC
Model I      -27392.269     64.359     -27263.552
Model II     -28353.589     87.230     -28179.128
Model III    -34813.110     104.133    -34604.845
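The entries of Table 2 appear to satisfy DIC = Deviance + 2 pD, which would be the case if the Deviance column holds the deviance evaluated at the posterior mean estimates. That reading is our assumption, since the table does not say how the column is defined. The short Python check below recomputes the DIC under this assumption and compares it with the reported values.

```python
# (deviance, pD, reported DIC) as listed in Table 2
table2 = {
    "Model I": (-27392.269, 64.359, -27263.552),
    "Model II": (-28353.589, 87.230, -28179.128),
    "Model III": (-34813.110, 104.133, -34604.845),
}

def dic(deviance, p_d):
    """DIC under the assumed relation DIC = deviance + 2 * pD."""
    return deviance + 2.0 * p_d

# Recompute the criterion and record the gap to the reported value.
recomputed = {m: dic(d, p) for m, (d, p, _) in table2.items()}
max_gap = max(abs(recomputed[m] - table2[m][2]) for m in table2)

# The preferred model is the one with the smallest reported DIC.
best_model = min(table2, key=lambda m: table2[m][2])
```

Under this reading the recomputed and reported values agree to within rounding of the third decimal place.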

Table 3 presents the posterior estimates of the linear effects of the covariates on the probability of a woman having no child death, the conditional expectation of mortality (i.e., the expected mortality, conditional on the occurrence of death), and the spread of mortality given the occurrence of death (the variance of the mortality index, given that death has occurred). For each parameter, the posterior mean along with the 95% credible interval (CI) is reported. As expected, the results reveal that as the level of education increases, the likelihood that a woman would have experienced no child death increases significantly. This strong link between the mother’s education and child survival is well established (Caldwell, 1979; Adetunji, 1995). Educated women are mostly aware of the availability and accessibility of appropriate child care for the benefit of their children. Findings also show that a woman in a rural area has a significantly lower chance of not recording the death of a child, compared to her urban counterpart. This is unsurprising, given that urban dwelling in Nigeria is associated with better health care and general living conditions than rural dwelling. Furthermore, compared to their counterparts, the probability that a woman would experience no child death is significantly higher among women from households with electricity, those that depend on electricity/gas as cooking fuel, and working women. Estimates for the other variables are not significant.

We now present the findings from the regression on the posterior mean conditional expectation

227

based on the mortality index as presented in columns 5-7 of Table 3. Note that, by construction of our mortality index (the ratio of observed to expected child deaths), a woman who has given birth to many children accumulates a higher value for the expected deaths than a woman with fewer children. Thus, for a given number of observed deaths, dividing the observed deaths by the accumulated expected deaths (considering all children ever borne by the woman) always yields a lower mortality index for a woman with many children. Results on educational level, for instance, show that, compared to women with no education, the expected mortality index given occurrence of death is significantly lower among women who attained primary education, but significantly higher for those with secondary or higher education. This is in line with studies showing that education carries an opportunity cost for childbearing, so that the number of children ever born falls as the level of education rises (Manda and Meyer, 2005). Therefore, any child death experienced by these women, who are assumed to have better knowledge of mortality preventive measures, would sharply raise their mortality index. Similarly, results show significantly higher estimates for women residing in rural areas, those who read newspapers at least once a week, those who use electricity/gas as cooking fuel, and those who are working. Apart from rural residence, all these characteristics are associated with women who tend to have fewer children than their counterparts. Generally speaking, rural women have more children than those in urban areas, and also experience higher child mortality (Bongaarts, 2017). Also, significantly lower estimates are obtained for women from households having electricity, those who watch television at least once a week, and those who use wood as cooking fuel.
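The arithmetic behind this point can be sketched as follows. This is a minimal illustration, assuming only that the index is the ratio of observed to expected deaths as stated in the text; the function name and the expected-death totals are hypothetical and are not taken from the data.

```python
def mortality_index(observed_deaths: float, expected_deaths: float) -> float:
    """Ratio of observed to expected child deaths for one woman.

    `expected_deaths` stands for the expected deaths accumulated over all
    children ever borne by the woman (hypothetical input here).
    """
    if expected_deaths <= 0:
        raise ValueError("expected_deaths must be positive")
    return observed_deaths / expected_deaths


# Two hypothetical women with the same single observed death.
# The woman with more children has accumulated more expected deaths.
index_few_children = mortality_index(observed_deaths=1, expected_deaths=0.4)
index_many_children = mortality_index(observed_deaths=1, expected_deaths=1.2)

# Same numerator, larger denominator: the index is lower for the
# woman with many children.
print(index_few_children > index_many_children)
```

With the observed deaths held fixed, the only thing that changes is the denominator, which is why a single death weighs more heavily on the index of a woman with few children.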

Results from the regression on the variability of the mortality index given death occurrence, presented in columns 8-10 of Table 3, contribute to our understanding of the divergences in mortality experience among the various socio-demographic and behavioral subgroups (covariates). This variability is a measure of mortality risk, and quantifies uncertainty in predicting expected mortality, provided that death has occurred (Ríos-Pena et al., 2018). For example, a positive posterior estimate corresponding to a covariate indicates higher conditional variability among the women represented, whereas a negative estimate indicates lower variation. The conditional variability in the mortality index was significantly higher among women with higher education, but lower for those with primary education, when compared to their counterparts with no education. This suggests that, among women who attained higher education in Nigeria, child mortality is concentrated among a few women, whereas it is fairly evenly spread among those with primary education. The higher variability among highly educated women can be attributed to the classification of higher education in the DHS data, which places a two-year National Diploma and a terminal PhD degree in the same category. It is likely that women within this category vary in their understanding of child care practices. Similarly, the conditional variation is significantly higher among women who read newspapers at least once a week, those using electricity/gas, and working women, as compared to their counterparts.
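The link between the sign of a spread coefficient and the conditional variability can be illustrated with a small sketch. It assumes, and this is an assumption not stated in this section, that the beta component is parameterized by a mean mu and a scale sigma2, both in (0, 1) and both modeled through logit links, so that the conditional variance is sigma2 * mu * (1 - mu). The predictor values below are hypothetical and are not taken from Table 3.

```python
import math


def expit(x: float) -> float:
    """Inverse logit: maps a linear predictor to the unit interval."""
    return 1.0 / (1.0 + math.exp(-x))


def conditional_variance(eta_mu: float, eta_sigma2: float) -> float:
    """Variance of the beta part, given that a death has occurred.

    Assumed parameterization (not confirmed by this section):
    mu = expit(eta_mu), sigma2 = expit(eta_sigma2),
    Var(Y | Y > 0) = sigma2 * mu * (1 - mu).
    """
    mu = expit(eta_mu)
    sigma2 = expit(eta_sigma2)
    return sigma2 * mu * (1.0 - mu)


# Hypothetical predictors: the mean is held fixed, only the spread moves.
baseline = conditional_variance(eta_mu=-2.0, eta_sigma2=-3.0)
positive_effect = conditional_variance(eta_mu=-2.0, eta_sigma2=-3.0 + 0.5)
negative_effect = conditional_variance(eta_mu=-2.0, eta_sigma2=-3.0 - 0.5)

# A positive spread coefficient raises the conditional variance,
# a negative one lowers it.
print(negative_effect < baseline < positive_effect)
```

Under this assumed parameterization the ordering holds for any fixed mean, because the inverse logit is increasing in its argument.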

Table 3: Linear effects on the likelihood of observing zero death, mean conditional expectation of mortality index and conditional variability of mortality

0.611

0.471

0.746

-2.013

0.053 0.375 0.838

0.221 0.569 1.165

0 -0.215

-0.290

-0.137

0 0.065

0.002

0.130

0 0.010

-0.055

0.077

0 0.020

0.181

0 -0.050

0.172

0 0.062

0 0.076

0.039


0 0.112

-0.026

95% CI -2.083

1.943

Spread Mean -3.151

95% CI -3.307

-2.993

-0.195 -0.046 0.269

-0.022 0.175 0.654

0 -0.067 0.045 0.189

-0.103 0.002 0.103

-0.031 0.089 0.279

0 -0.107 0.066 0.459

0 0.043

0.007

0.077

0 -0.014

-0.094

0.072

0 0.022

-0.004

0.049

0 0.010

-0.055

0.076

0.049

0 0.043

-0.024

0.112

-0.019

0 0.006

-0.066

0.077

0.113

0 0.205

0.083

0.325

-0.014

0.041

0 -0.038

-0.111

0.034


0 0.135 0.475 0.999

-0.009

-0.077

0.011

-0.050

0.094

0 0.013

0 0.031

-0.047

0.109

0 -0.075

-0.109

-0.037

0 -0.077

-0.162

0.008

0 -0.022 0.128 0.406

-0.139 -0.017 0.125

0.100 0.274 0.677

0 -0.066 0.035 0.381

-0.123 -0.032 0.161

-0.015 0.103 0.606

0 -0.070 0.128 0.884

-0.189 -0.277 0.491

0.049 0.050 1.276

0.226

0 0.081

0.110

0 0.161

0.092

0.226


0 0.024

0 0.153


Variables (row labels): Constant; Education: No education (ref), Primary, Secondary, Higher; Residence: Urban (ref), Rural; Water Source: Unprotected (ref), Protected; Toilet: Improved (ref), Unimproved; Electricity: No (ref), Yes; Newspaper: No (ref), Yes; Radio: No (ref), Yes; Television: No (ref), Yes; Cooking Facility: Others (ref), Wood, Kerosene, Electricity/Gas; Working Status: No (ref), Yes

Column groups: Zero death (Mean, 95% CI); Mean index (Mean, 95% CI)

0.081

0.053

Results evaluating the nonlinear effect of the mother's age on the three aforementioned parameters are presented in Figure 3(a-c), where the posterior mean estimates are denoted by black lines, and the 95% credible intervals by blue lines. The plots reveal an overall declining pattern with increasing mother's age. Specifically, as a woman advances in age, the probability that she experiences no child death falls. This is expected because, with increasing age, a woman is likely to have more children, who are in turn exposed to the risk of death. As for the conditional expectation of the index, the greater number of children expected to have been borne by older women results in a lower mortality index, compared to younger women with fewer births. For the measure of conditional variability given occurrence of death, we observe substantial differences among younger women below 30 years, while the variation diminishes for ages ≥ 30.

Figure 3: Nonlinear effects of the mother’s age on (a) probability of zero death, (b) mean conditional expectation of mortality index, and (c) conditional variance of mortality. The ‘black’ lines represent the posterior mean estimates, while the 95% credible intervals are denoted by the ‘blue’ lines.

Posterior estimates of the spatial effects on the above three quantities are presented in Figure 4(a-f). The left panel of the figure presents maps of the posterior means, while the right panel uses the 95% CIs to indicate the significance of the corresponding posterior mean estimates. In the CI maps, black signifies states with significantly high estimates, gray indicates significantly low estimates, and states in white have estimates that are not significant. For the probability component (Figure 4 (a & b)), the chance of experiencing no child death is significantly higher in the southern states of Anambra and Enugu (South-East region); Edo, Delta, Akwa Ibom, Bayelsa, Rivers, and Cross River (all in the South-South); and Ogun, Oyo, and Osun (South-West region), as well as in the northern states of Kogi, Niger, Plateau, and Kwara (North-Central); Kaduna (North-West); Borno (North-East); and the FCT. The probabilities are lower mostly in the northern states of Kebbi, Sokoto, Zamfara, Katsina, Kano, and Jigawa (North-West) and Bauchi, Gombe, Adamawa, and Taraba (North-East), and in Ebonyi (the only one from the southern fringe). These findings confirm the existence of a north-south divide in child mortality in Nigeria (Antai, 2011). Regarding the conditional expectation (mean) component, significantly higher estimates are obtained for Abia and Ebonyi (South-East region), Jigawa and Zamfara (North-West), and Lagos (South-West), but lower estimates for Oyo, Kaduna, Yobe, Gombe and Benue states. For the conditional variability, we observe lower estimates in Oyo and in the neighbouring states of Borno and Yobe in the North-East region, but higher estimates in Lagos and in the neighbouring states of Abia, Anambra, Ebonyi and Rivers. Estimates for the majority of the other states are not significant.
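The colour coding of the CI maps can be sketched as a simple rule. This is an illustration under the assumption that a state's effect is called significant when its 95% credible interval excludes zero; the function name and the example intervals are hypothetical.

```python
def classify_interval(lower: float, upper: float) -> str:
    """Map a 95% credible interval to the colour used in the CI maps.

    Assumed rule (hypothetical helper, not the authors' code):
    interval entirely above zero -> "black" (significantly high),
    interval entirely below zero -> "gray"  (significantly low),
    interval containing zero     -> "white" (not significant).
    """
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    if lower > 0:
        return "black"
    if upper < 0:
        return "gray"
    return "white"


# Hypothetical state-level intervals, for illustration only.
example_intervals = {
    "state_A": (0.12, 0.48),
    "state_B": (-0.55, -0.10),
    "state_C": (-0.20, 0.15),
}
colours = {state: classify_interval(lo, hi)
           for state, (lo, hi) in example_intervals.items()}
print(colours)
```

Reading the maps this way makes clear that white states are not states with a zero effect, only states whose interval does not rule zero out.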

The findings from this analysis, which in particular reveal a divide in the mortality experience of the women, can be attributed to the widespread regional disparity in health-seeking behaviour regarding immunization and the utilization of maternal and child care. Evidence has shown that in some of the northern Nigerian states, where the practice of Purdah is still prevalent and women are undervalued, women's reproductive and health-seeking behaviour, whether for themselves or for their children, is strictly determined by men or by their mothers-in-law. This limits their ability to provide the desired care for their children, often leading to a higher proportion of home deliveries without professional assistance to handle birth-related complications (Antai, 2011; Babalola and Fatusi, 2009). Further studies (Gayawan et al., 2019; Kandala et al., 2007; Uthman, 2008) have also reported a similar north-south divide in childhood nutritional intake, communicable diseases, younger age at marriage and first birth, and widespread socioeconomic inequality, which are established factors that worsen the survival chances of less privileged children.


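Figure 4: Spatial effects on (a, b) probability of zero death, (c, d) mean conditional expectation of mortality index, and (e, f) conditional variance of mortality. Left panels show the posterior means; right panels show the significance maps based on the 95% CIs.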


5. Conclusion

Over the years, studies on the determinants of child mortality and its spatial variation have been based on statistical models that relate covariates to the conditional mean of the response variable, thereby ignoring vital information on covariate effects on higher moments of the response. Such ignored information could be valuable in policy formulation and implementation. Further, little attention has been given to the duration of exposure in studying the risks of mortality. In this paper, under a ZABR framework, we employ a structured additive distributional regression model to explore both linear and non-linear effects of covariates/determinants on various parameters associated with the child mortality index, controlling for spatial association. Our choice of the ZABR framework ensures that the discrete-continuous nature of the index variable is preserved, rather than adopting a transformation-based approach that may result in loss of information. Of particular interest are the findings that indicate a divide in the likelihood of a woman experiencing child mortality across the states; large differences in the mortality experienced by women living in Lagos, Abia, Anambra, Ebonyi and Rivers, but a more even distribution among those in Oyo, Borno and Yobe states; and that variation in child mortality is substantial among younger women, but somewhat uniform for women aged thirty years and above. As an intervention, younger women could be incentivized to attend antenatal and postnatal clinics, where experienced attendants could be designated to tutor these women on child care practices. Furthermore, based on the different parameters considered, we found education and working status of women to be significantly associated with mortality. Consequently, federal and state governments can ensure that all potential mothers receive education beyond the primary level. This would enhance knowledge of child care and thereby improve decision-making as a parent. From a policy standpoint, the government can also consider extending maternity leave for working women beyond the present three months. This would afford mothers ample time for child care after delivery.
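The discrete-continuous structure that the ZABR framework preserves can be sketched as a two-part log-density: a point mass at zero and a beta density on (0, 1). The sketch below is illustrative only. It assumes a mean-scale parameterization of the beta part with shape parameters a = mu(1 - sigma2)/sigma2 and b = (1 - mu)(1 - sigma2)/sigma2, which is an assumption rather than something confirmed in this section, and the function and argument names are hypothetical.

```python
import math


def zab_log_density(y: float, p_zero: float, mu: float, sigma2: float) -> float:
    """Log-density of a zero-augmented beta response (illustrative sketch).

    p_zero : probability of observing exactly zero
    mu     : mean of the beta part, in (0, 1)
    sigma2 : scale of the beta part, in (0, 1); assumed parameterization
             a = mu * (1 - sigma2) / sigma2, b = (1 - mu) * (1 - sigma2) / sigma2
    """
    for name, value in (("p_zero", p_zero), ("mu", mu), ("sigma2", sigma2)):
        if not 0.0 < value < 1.0:
            raise ValueError(f"{name} must lie strictly between 0 and 1")

    # Discrete part: the point mass at zero.
    if y == 0:
        return math.log(p_zero)
    if not 0.0 < y < 1.0:
        raise ValueError("y must be 0 or lie strictly between 0 and 1")

    # Continuous part: beta density, weighted by the probability of a non-zero value.
    a = mu * (1.0 - sigma2) / sigma2
    b = (1.0 - mu) * (1.0 - sigma2) / sigma2
    log_beta_function = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return (math.log(1.0 - p_zero)
            + (a - 1.0) * math.log(y)
            + (b - 1.0) * math.log(1.0 - y)
            - log_beta_function)


# Hypothetical parameter values: a zero response and a positive response.
print(zab_log_density(0.0, p_zero=0.3, mu=0.2, sigma2=0.1))
print(zab_log_density(0.25, p_zero=0.3, mu=0.2, sigma2=0.1))
```

Because zeros enter through their own probability, no transformation of the response is needed to accommodate them, which is the point made in the paragraph above.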

As alternatives to the ZABR model, the beta rectangular and simplex regressions, along with their zero and one augmented versions, have been considered in the literature (Bandyopadhyay et al., 2017) to model responses in the closed interval [0,1]. However, their extensions to include spatial random effects and other non-linear covariate effects via structured additive distributional regression are not available. This remains a promising area for future research, which we hope to undertake.

Acknowledgements

Bandyopadhyay acknowledges support from grants R01DE024984 and P30CA016059 from the United States National Institutes of Health.

References

Adebayo, S. and L. Fahrmeir (2005). Analysing child mortality in Nigeria with geoadditive discrete. Statistics in Medicine 24, 709–728.

Adetunji, J. A. (1995). Infant mortality and mother's education in Ondo state, Nigeria. Social Science and Medicine 40, 253–263.

Antai, D. (2011). Regional inequalities in under-5 mortality in Nigeria: A population-based analysis of individual- and community-level determinants. Population Health Metrics 9, 6.

Babalola, S. and A. Fatusi (2009). Determinants of use of maternal health services in Nigeria looking beyond individual and household factors. BMC Pregnancy and Childbirth 9 (43).

Bandyopadhyay, D., D. M. Galvis, and V. H. Lachos (2017). Augmented mixed models for clustered proportion data. Statistical Methods in Medical Research 26, 880–897.

Bongaarts, J. (2017). Africa's unique fertility transition. Population and Development Review 43, 39–58.

Brezger, A. and S. Lang (2006). Generalized structured additive regression based on Bayesian P-splines. Computational Statistics and Data Analysis 50, 967–991.

Caldwell, J. C. (1979). Education as a factor in mortality decline: an examination of Nigerian data. Population Studies 33, 395–413.

Eilers, P. H. and B. D. Marx (1996). Flexible smoothing using B-splines and penalized likelihood. Statistical Science 11, 236–245.

Fahrmeir, L., T. Kneib, S. Lang, and B. Marx (2013). Regression: Models, Methods and Applications. Springer, Heidelberg.

Feng, C., H. Wang, N. Lu, T. Chen, H. He, Y. Lu, and X. Tu (2014). Log-transformation and its implications for data analysis. Shanghai Archives of Psychiatry 26 (2), 105–109.

Ferrari, S. L. P. and F. Cribari-Neto (2004). Beta regression for modelling rates and proportions. Journal of the Royal Statistical Society, Series C 31, 799–815.

Gayawan, E., M. I. Adarabioyo, D. M. Okewole, S. G. Fashoto, and J. C. Ukaegbu (2016). Geographical variations in infant and child mortality in West Africa: a geo-additive discrete-time survival modelling. Genus 72 (5), 1–20.

Gayawan, E., S. B. Adebayo, and E. Waldmann (2019). Modeling the spatial variability in the spread and correlation of childhood malnutrition in Nigeria. Statistics in Medicine 38 (10), 1869–1890.

Gayawan, E. and M. C. Turra (2015). Mapping the determinants of child mortality in Nigeria: estimates from mortality index. African Geographical Review 34, 269–293.

Kandala, N. B., L. Fahrmeir, S. Klasen, and J. Priebe (2009). Geo-additive models of childhood undernutrition in three Sub-Saharan African countries. Population, Space and Place 15, 461–473.

Kandala, N. B., C. Ji, N. Stallard, S. Stranges, and F. P. Cappuccio (2007). Spatial analysis of risk factors for childhood morbidity in Nigeria. American Journal of Tropical Medicine and Hygiene 77, 770–779.

Kazembe, L. N. and J. J. Namangale (2007). A Bayesian multinomial model to analyse spatial patterns of childhood co-morbidity in Malawi. European Journal of Epidemiology 22, 545–556.

Kinyoki, D. K., S. O. Manda, G. M. Moloney, E. O. Odundo, J. A. Berkley, A. M. Noor, and N.-B. Kandala (2017). Modelling the ecological comorbidity of acute respiratory infection, diarrhoea and stunting among children under the age of 5 years in Somalia. International Statistical Review 85 (1), 164–176.

Klein, N., T. Kneib, S. Klasen, and S. Lang (2015). Bayesian structured additive distributional regression for multivariate responses. Journal of the Royal Statistical Society: Series C (Applied Statistics) 64 (4), 569–591.

Klein, N., T. Kneib, and S. Lang (2015). Bayesian generalized additive models for location, scale, and shape for zero-inflated and overdispersed count data. Journal of the American Statistical Association 110 (509), 405–419.

Klein, N., T. Kneib, S. Lang, A. Sohn, et al. (2015). Bayesian structured additive distributional regression with an application to regional income inequality in Germany. The Annals of Applied Statistics 9 (2), 1024–1052.

Manda, S. and R. Meyer (2005). Age at first marriage in Malawi: a Bayesian multilevel analysis using a discrete time-to-event model. Journal of the Royal Statistical Society, Series A 168, 439–455.

Mosley, H. W. and L. C. Chen (1984). An analytical framework for the study of child survival in developing countries. Population and Development Review 10, 25–45.

Preston, S. H. and M. R. Haines (1991). Fatal years: Child mortality in late nineteenth-century America. Princeton University Press, New Jersey.

Rigby, R. A. and D. M. Stasinopoulos (2005). Generalized additive models for location, scale and shape (with discussion). Journal of the Royal Statistical Society, Series C 54, 507–554.

Ríos-Pena, L., T. Kneib, C. Cadarso-Suárez, N. Klein, and M. Marey-Pérez (2018). Studying the occurrence and burnt area of wildfires using zero-one-inflated structured additive beta regression. Environmental Modelling & Software 110, 107–118.

Rue, H. and L. Held (2005). Gaussian Markov random fields: Theory and Applications. Chapman and Hall/CRC.

Spiegelhalter, D. J., N. J. Best, B. Carlin, and A. Van Der Linde (2002). Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society, Series B 64, 583–639.

Stewart, C. (2013). Zero-inflated beta distribution for modeling the proportions in quantitative fatty acid signature analysis. Journal of Applied Statistics 30, 985–992.

Stewart, C. and C. Field (2011). Managing the essential zeros in quantitative fatty acid signatures analysis. Journal of Agricultural, Biological and Environmental Statistics 16, 45–69.

Trussell, J. and S. Preston (1982). Estimating the covariates of childhood mortality from retrospective reports of mothers. Health Policy and Education 3, 1–36.

UNICEF (2019). Under-five mortality. https://data.unicef.org/topic/child-survival/under-five-mortality. Accessed: 2019-09-23.

United Nations (1982). Model life tables for developing countries.

Uthman, O. A. (2008). Geographical variations and contextual effects on age of initiation of sexual intercourse among women in Nigeria: A multilevel and spatial analysis. International Journal of Health Geographics 7, 27.

Yadav, A., L. Ladusingh, and E. Gayawan (2015). Does a geographical context explain regional variation in child malnutrition in India? Journal of Public Health 23(5), 277–287.