Journal Pre-proof
Structured additive distributional zero augmented beta regression modeling of mortality in Nigeria
Ezra Gayawan, Oluwatoyin Deborah Fasusi, Dipankar Bandyopadhyay
PII: S2211-6753(20)30009-9
DOI: https://doi.org/10.1016/j.spasta.2020.100415
Reference: SPASTA 100415
To appear in: Spatial Statistics
Received date: 4 November 2019
Revised date: 31 December 2019
Accepted date: 23 January 2020
Please cite this article as: E. Gayawan, O.D. Fasusi and D. Bandyopadhyay, Structured additive distributional zero augmented beta regression modeling of mortality in Nigeria. Spatial Statistics (2020), doi: https://doi.org/10.1016/j.spasta.2020.100415.
This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
© 2020 Elsevier B.V. All rights reserved.
Structured additive distributional zero augmented beta regression modeling of mortality in Nigeria
Ezra Gayawan (1), Oluwatoyin Deborah Fasusi (1), Dipankar Bandyopadhyay (2)
(1) Department of Statistics, Federal University of Technology, Akure, Nigeria
(2) Department of Biostatistics, Virginia Commonwealth University, Richmond, VA, U.S.A.
Abstract
Child mortality has remained persistently high in most sub-Saharan African countries. Most efforts to analyse its determinants (covariates) have not considered the duration of exposure to mortality risks. In addition, covariates are usually linked only to the mean of the response variable, thereby neglecting possible associations with higher moments. In this paper, we account for the duration of exposure via the child mortality index, defined as the ratio of observed to expected child deaths, for all women captured in the 2013 Nigeria Demographic and Health Survey. Based on this index, a structured additive distributional beta regression model was adopted to examine covariate effects on the probability of a woman experiencing no child mortality, the conditional expectation of mortality, and the mortality spread, controlling for latent spatial associations. Our inferential framework is Bayesian, powered by generic MCMC tools based on iteratively weighted least squares. Results confirm the existence of significant variation in the likelihood of a woman experiencing no child mortality, and in the spread of mortality, across Nigerian states. Findings also show that although mortality is fairly spread among women aged ≥ 30 years, it is concentrated among the younger women.
Keywords: Zero-augmented beta; child mortality; spatial variation; life table; Nigeria

1. Introduction
The survival probability of young children is a major indicator of a country's socio-economic, health and other developmental indices (Gayawan and Turra, 2015). Though appreciable progress has been made in reducing the burden of child mortality globally, the statistics from developing countries are still alarming, especially in sub-Saharan Africa, where roughly half of the estimated 5.4 million under-five deaths in 2017 occurred (UNICEF, 2019). Mortality, the extreme outcome of diseases in children, is usually the cumulative consequence of multiple disease processes. Classical approaches for analyzing child mortality in developing countries included parametric regressions, or survival modeling, mainly restricted to children under the age of five years, which consequently do not take cognizance of the duration of exposure to diseases, and the cumulative effects of other risks that enhance mortality. However, mortality among older children contributes substantially to the overall child mortality. Furthermore, (child) mortality can be spatially clustered, i.e., the mortality experience of women residing within proximally located geographical entities (such as a state and its neighboring states in a country) can be similar, and quite different from that in states that are geographically distant. In addition, among a group of women located within a geographic unit, mortality experience may also vary substantially, due to differences in cultural beliefs and practices, which often affect child care and health-seeking behaviour. All these need to be taken into account in describing the experience in a multicultural setting like Nigeria, which is the focus of this study.

∗ Address for correspondence: Biostatistics and Spatial Statistics Laboratory, Department of Statistics, Federal University of Technology, PMB 704, Akure, Ondo State, Nigeria. E-mail: [email protected]
Preprint submitted to Elsevier, January 25, 2020
The geographical location where a child lives often sets the stage for the operation of other factors that determine the survival probability of the child. Individuals located in the urban areas and other more favoured settings of most developing countries usually have access to social amenities and improved health care facilities, which jointly enhance their well-being and that of their children. Such access is, however, often lacking for the less privileged. Consequently, there has been considerable interest in estimating spatial differences in mortality among young children, and in the distribution of other health issues in many developing countries (Adebayo and Fahrmeir, 2005; Kandala et al., 2009; Kazembe and Namangale, 2007; Yadav et al., 2015; Gayawan et al., 2016; Kinyoki et al., 2017). Inequality in the distribution of available wealth could also lead to varied levels of morbidity and mortality among people living in the same geographical environment. However, most of the studies fail to account for the length of exposure, i.e., time since birth (Preston and Haines, 1991), to the risks of illness and hence, mortality. They have also been limited to linking covariates to the average mortality, neglecting possible associations with higher moments such as the variance of the response variable. As a child lives longer in an unfavourable condition, the effects of that condition accumulate over time and can influence the severity of its impact on the child's well-being. Given that mortality levels are generally changing over time as living conditions change rapidly with modernization, children born around the same period of time are likely to be exposed to similar risks of death, which will be different from those of children born in other time periods. Model life tables (United Nations, 1982) provide valuable insight into the manner in which the age pattern of mortality varies as the mortality level changes with time (Preston and Haines, 1991; Gayawan and Turra, 2015).
This study was motivated by the need to investigate the spatial distributions of the level and spread of mortality among children in Nigeria. We focus on women who have given birth to at least one child, as recorded in the 2013 Nigeria Demographic and Health Survey (NDHS) database. In order to account for the duration of exposure to the mortality risks, we adopt an indirect technique by computing a mortality index, defined as the ratio of the observed to expected child deaths, as described by Preston and Haines (1991) and Trussell and Preston (1982). While the observed death counts the number of child deaths for each woman, the expected death accumulates the risks of dying from birth to the current age of the child at the time of the survey. Being an index, the variable was restricted to the interval [0, 1). To model this, some researchers have considered a mixture of the multiplicative logistic normal distribution, and a degenerate distribution that assigns a probability to the essential zeros (Stewart and Field, 2011). However, such an approach involving data transformations often lacks appeal, primarily because findings based on transformed data are hard to interpret in light of study hypotheses generated from the original data (Feng et al., 2014). In this context, the two-parameter beta regression (henceforth, BR) (Ferrari and Cribari-Neto, 2004), which naturally models a response bounded in (0, 1), is also not applicable here, given that the index exhibits an abundance of 'essential' zeros resulting from women who never experienced child mortality. We circumvent this via the zero-augmented beta (ZAB) regression (Stewart, 2013; Bandyopadhyay et al., 2017) model, which initiates a mixture setup by augmenting the beta density with the probability of observing zero (the third parameter).
The broad and generic distributional regression (DR) framework (Klein et al., 2015), similar in spirit to the generalized additive models for location, scale and shape (Rigby and Stasinopoulos, 2005), expands the exponential family regression by encompassing continuous, discrete and mixed discrete-continuous responses. Within the premise of DR, it is possible to link the three parameters (the probability of observing zero, and the two from the BR) of the ZAB model to regression predictors to estimate the covariate effects on each of them, rather than modeling only the mean mortality. Our major contribution is to extend the DR framework to ZAB responses with structured additive predictors (Fahrmeir et al., 2013; Klein et al., 2015), i.e., where the predictor space for each parameter includes a nonparametric effect of a continuous covariate (age), linear effects of categorical covariates, and a spatial component. Our inferential paradigm is Bayesian, powered by a Markov chain Monte Carlo (MCMC) simulation algorithm based on distribution-specific iteratively weighted least squares approximations to the full conditionals (Klein et al., 2015), with a multivariate Gaussian prior to enforce desired properties, thus yielding numerically stable estimates. With these, we are able to discern the spatial association not only in the levels of child mortality, but also in its spread, and in the probability of a woman not experiencing child death. To the best of our knowledge, this initiative advances previous approaches to estimating child mortality by attempting to disentangle covariate and spatial effects on important mortality summaries.
The rest of the article is structured as follows. In Section 2, we describe our motivating NDHS survey database, and construct the mortality index. After an introduction to the ZABR model, Section 3 develops the corresponding structured additive DR modeling, and related Bayesian inference. Section 4 summarizes the posterior estimates derived from fitting the model to the NDHS data. Finally, Section 5 presents a conclusion, with directions for future research.

2. Data

2.1. 2013 NDHS survey data
Our motivating database is the 2013 NDHS generated by the Demographic and Health Surveys (DHS) Program, which is responsible for collecting and disseminating accurate, nationally representative data on health and population in developing countries. Data from the surveys are freely available from the DHS website (www.dhsprogram.com) upon request, which entails filling a prescribed form on the website and submitting a short description of the intended use. If approved, the user is notified within a short time, possibly within 2 working days. The sampling frame was based on the list of enumeration areas used during the 2006 Population Census of Nigeria. Nigeria is administratively divided into 36 states and a Federal Capital Territory (FCT), and each state is further divided into local government areas (LGAs). Data were realized through a stratified, three-stage, cluster design consisting of 904 clusters; 372 in urban areas and 532 in rural areas. In all, a representative sample of 40,320 households was selected for the survey, with a minimum target of 943 completed interviews per state. All women aged 15 to 49, who were either permanent residents or visitors in the selected households, were eligible for individual interview. A total of 39,902 women were identified as eligible for individual interviews, and 98 percent of them were successfully interviewed. The DHS data sets come in different formats, each containing information on the households, women, men, and children. The Birth Recode data set, which contains the birth history of all the interviewed women and information on all children ever born to the women, was used for this study. The date of birth of every child ever born and information on whether the child is dead or alive were extracted together with other important covariates and used in creating the mortality index, described in Section 2.2. Dummy variables were generated from all categorical variables.
Extensive literature exists on the socio-economic and demographic factors that shape child mortality in resource-poor countries. Following a careful review of the analytical framework by Mosley and Chen (1984) and other studies focusing on Nigeria (Adebayo and Fahrmeir, 2005; Gayawan et al., 2016), the following variables were considered in the study: mother's age, mother's level of education, type of place of residence, woman's working status, exposure to mass media (newspaper, radio, and television; whether or not the woman accesses these at least once a week), household source of water, type of household toilet facility and cooking fuel being used, and whether or not the household has electricity. Figure 1 presents a labelled map of Nigeria, showing the administrative states and the Federal Capital Territory, Abuja.
Figure 1: Map of Nigeria showing the 36 states and the Federal Capital Territory (FCT)
2.2. Mortality index
The mortality index was created as the ratio of actual to expected child deaths for each woman, as described by Trussell and Preston (1982) and Preston and Haines (1991). The actual death counts the number of deaths of children to each woman as reported in the survey data set. Expected child death was obtained by accumulating the probabilities of dying between date of birth and age at the time of the survey, for all children borne by each woman. These probabilities were derived from a standard model life table, the general pattern of the United Nations Model Life Table for developing countries (United Nations, 1982). For each cohort of children, the probability of dying from birth to exact age x was obtained as:
$$q(x) = 1 - \exp\left(-\sum_{x} q_x\right) \qquad (1)$$

where $q_x$ is the probability that a child aged exactly $x$ would die before reaching age $x + 1$. Summing these probabilities $q(x)$ for each woman based on children ever born yields her expected child death (ECD). The mortality index, $MI$, is then calculated as:

$$MI = \frac{ACD}{ECD} \qquad (2)$$

where ACD is the count of actual deaths for each woman. This index has been found to provide robust and reliable estimates of mortality when used as a response variable in a regression analysis (Gayawan and Turra, 2015; Preston and Haines, 1991).
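The construction of the index in equations (1) and (2) can be sketched as follows. This is a minimal illustration, not the authors' code: the single-year death probabilities in `QX` are made-up numbers rather than values from the United Nations model life table, the function names are hypothetical, and reading the sum in (1) as running over the single years below age x is an assumption. The text does not say how the index was restricted to [0, 1), so the sketch returns the raw ratio.

```python
import math

# Hypothetical single-year death probabilities q_x for ages 0, 1, 2, ...
# (made-up numbers, NOT taken from the United Nations model life table).
QX = [0.080, 0.030, 0.020, 0.010, 0.010, 0.005, 0.005, 0.005]


def cumulative_death_probability(qx, age):
    """Equation (1): q(x) = 1 - exp(-sum of q_a over ages a = 0, ..., x - 1).

    The summation range is an assumption; the text leaves it implicit.
    """
    return 1.0 - math.exp(-sum(qx[:age]))


def mortality_index(years_since_birth, actual_deaths, qx):
    """Equation (2): MI = ACD / ECD for one woman.

    years_since_birth holds one entry per child ever born (dead or alive),
    i.e. the exposure time from birth to the survey date.
    """
    ecd = sum(cumulative_death_probability(qx, a) for a in years_since_birth)
    if ecd <= 0.0:
        raise ValueError("expected child death must be positive")
    return actual_deaths / ecd


# A hypothetical woman: eight children born 1 to 8 years before the survey,
# one of whom has died.
example_index = mortality_index([1, 2, 3, 4, 5, 6, 7, 8], 1, QX)
```

Because the expected deaths accumulate over all children ever born, a woman with no recorded death always receives an index of exactly zero, which is the source of the 'essential' zeros modelled later.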
3. Statistical Modeling
3.1. ZABR model
The ZABR model, a mixed discrete-continuous regression model, is an extension of the BR introduced by Ferrari and Cribari-Neto (2004). Given a response variable $y_i \in (0, 1)$ following a beta density with parameters $p$ and $q$, the corresponding density function is given by

$$f(y_i \mid \mu, \sigma^2) = \frac{y_i^{p-1} (1 - y_i)^{q-1}}{B(p, q)} \qquad (3)$$

where $p, q > 0$, and $B(p, q)$ is the beta function expressed as $B(p, q) = \frac{\Gamma(p)\Gamma(q)}{\Gamma(p+q)} = \frac{(p-1)!(q-1)!}{(p+q-1)!}$, the factorial form holding for integer $p$ and $q$. The parametrization $\mu = \frac{p}{p+q}$, $\sigma^2 = \frac{1}{p+q+1}$, where $\mu, \sigma^2 \in (0, 1)$, yields $E(y_i) = \mu_i$, with $\sigma_i^2$ directly proportional to the variance, i.e., $Var(y_i) = \sigma_i^2 \mu_i (1 - \mu_i)$. For the ZAB density, a new parameter $\theta$ is introduced to account for the probability of observations at zero. The density now becomes

$$f(y_i \mid \mu, \sigma^2, \theta) = \begin{cases} \theta, & y_i = 0 \\ (1 - \theta)\, \dfrac{y_i^{p-1} (1 - y_i)^{q-1}}{B(p, q)}, & y_i \in (0, 1) \end{cases} \qquad (4)$$

where $0 \le \theta \le 1$, such that the mean and variance of the ZAB model become $E(y_i) = (1 - \theta)\mu_i$ and $Var(y_i) = (1 - \theta)\sigma_i^2 \mu_i (1 - \mu_i) + \theta(1 - \theta)\mu_i^2$, respectively; the first variance term is the contribution of the beta component, and the second arises from mixing with the point mass at zero. The maximum likelihood estimate for $\theta$ is the proportion of zeros in the sample, i.e., $\hat{\theta} = \frac{n_0}{n}$, where $n_0$ is the number of observed zeros from the $n$ observations.

3.2. Structured additive distributional regression
Consider the scalar response variable $y_1, \ldots, y_n$ and covariate information $\upsilon$ collected from the $n$ respondents, in this case, mothers. Under a structured additive distributional regression (Klein et al., 2015) framework, the parameters $\vartheta = (\vartheta_1 = \theta, \vartheta_2 = \mu, \vartheta_3 = \sigma^2)$ of the ZABR model can be linked to the semi-parametric regression predictors $\eta_i^{\vartheta_k}$ through a suitable (one-to-one) link function, such that $\vartheta_{ik} = h_k(\eta_i^{\vartheta_k})$, where $h_k$ denotes the inverse of the link. The link function ensures appropriate restrictions on the parameter space. For example, one may employ the logit, probit or complementary log-log (cloglog) link functions for $\mu$ and $\theta$, and a log link on $\sigma^2$ to ensure positivity. The generic form of the structured additive distributional model is given by:

$$\eta_i^{\vartheta_k} = \beta_0^{\vartheta_k} + \sum_{j=1}^{J_k} f_j^{\vartheta_k}(\upsilon) \qquad (5)$$

where $\vartheta_k$ is a generic parameter and $f_j^{\vartheta_k}(\upsilon)$ represents various functions defined on the complete covariate vector. The predictors for the different distributional parameters can be linked to entirely different functions, or to different numbers of functions. Simplifying notation by dropping the parameter index, each function $f_j$ of (5) can be represented as a linear combination of basis functions, such that, in matrix form, $f_j = Z_j \beta_j$, where $Z_j$ is a design matrix and $\beta_j$ is the vector of coefficients to be estimated. This leads to the following matrix representation:

$$\eta = \beta_0 \mathbf{1} + Z_1 \beta_1 + \cdots + Z_J \beta_J \qquad (6)$$

A multivariate normal prior is then assumed for each parameter vector $\beta_j$:

$$p(\beta_j \mid \tau_j^2) \propto \left(\frac{1}{\tau_j^2}\right)^{\mathrm{rk}(K_j)/2} \exp\left(-\frac{1}{2\tau_j^2}\, \beta_j' K_j \beta_j\right) \qquad (7)$$

where $K_j$ is a prior precision matrix which corresponds to the penalty matrix in a frequentist formulation. The hyperparameters $\tau_j^2$ are assigned inverse gamma hyperpriors.

For linear effects, $f_j(\upsilon) = x_i' \beta_j$, where $x_i$ is a subvector of $\upsilon$ for all appropriately coded categorical variables, which include mother's level of education, working status, and so on. In this case, $Z_j = X$, the data matrix, and a non-informative prior was considered such that $K_j = 0$. To account for possible non-linear effects of the (continuous) mother's age, we have $f_j(\upsilon) = f_j(w_i)$, where $f_j$ is an appropriate smooth function of the single continuous variable $w_i$. Here, $Z_j$ is constructed through a B-spline basis evaluated at the observation $w_i$ (Eilers and Marx, 1996; Brezger and Lang, 2006), and $K_j = R'R$, where $R$ can be a first- or second-order random walk difference matrix and $\tau_j^2$ is the corresponding smoothing variance. This variance was further assigned an inverse gamma hyperprior, i.e., $\tau_j^2 \sim IG(a, b)$, with $a$ and $b$ chosen such that the prior is non-informative.
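To make the mapping from predictors to the ZAB likelihood concrete, the following sketch evaluates the log-density of equation (4) after applying inverse links to three predictor values. It is an illustration under stated assumptions, not the BayesX implementation: the function names are hypothetical, a logit link is assumed for θ and µ, and the log link mentioned in the text is used for σ², which guarantees positivity only, so the code checks that σ² < 1 before converting to the shape parameters via p + q = 1/σ² − 1.

```python
import math


def logistic(eta):
    """Inverse of the logit link, mapping a predictor to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))


def zab_parameters(eta_theta, eta_mu, eta_sigma2):
    """Map the three predictors to (theta, mu, sigma2).

    Assumed links: logit for theta and mu, log for sigma2.
    """
    return logistic(eta_theta), logistic(eta_mu), math.exp(eta_sigma2)


def zab_logpdf(y, theta, mu, sigma2):
    """Log of the zero-augmented beta density for a response y in [0, 1)."""
    if not 0.0 <= y < 1.0:
        raise ValueError("y must lie in [0, 1)")
    if not (0.0 < theta < 1.0 and 0.0 < mu < 1.0 and 0.0 < sigma2 < 1.0):
        raise ValueError("theta, mu and sigma2 must lie in (0, 1)")
    if y == 0.0:
        return math.log(theta)  # point mass at zero
    # Invert mu = p / (p + q) and sigma2 = 1 / (p + q + 1).
    total = 1.0 / sigma2 - 1.0  # p + q
    p = mu * total
    q = (1.0 - mu) * total
    log_beta = math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q)
    return (math.log1p(-theta)
            + (p - 1.0) * math.log(y)
            + (q - 1.0) * math.log1p(-y)
            - log_beta)


# Illustrative predictor values only (chosen for the example, not a fitted model).
theta, mu, sigma2 = zab_parameters(0.6, -2.0, -3.0)
loglik_zero = zab_logpdf(0.0, theta, mu, sigma2)
loglik_positive = zab_logpdf(0.15, theta, mu, sigma2)
```

Summing `zab_logpdf` over all women gives the log-likelihood that any estimation routine for this model would work with; a link that keeps σ² inside (0, 1), such as the logit, would remove the need for the explicit range check.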
For the discrete spatial variable, $f_j(\upsilon) = f_j(s_i)$, where $s_i \in \{1, \ldots, 37\}$ is the discrete spatial unit (state) of residence of the $i$th mother. Here, the matrix $Z_j$ has $n$ rows and 37 columns (corresponding to the 36 states and the FCT), where the entry in the $i$th row and $p$th column is 1 if the $i$th woman is from the $p$th state, and 0 otherwise. Markov random fields are a common approach for estimating this spatial effect (Rue and Held, 2005). In this case, the prior precision is $\frac{1}{\tau_j^2} K_j$, with $K_j$ being a penalty matrix with $-1$ for states which share a common boundary and 0 for states that do not. Again, $\tau_j^2$ is assigned an inverse gamma hyperprior.
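The penalty and design matrices described in this section can be sketched in a few lines. The code below is a hedged illustration with hypothetical function names and a toy four-region map (not the actual adjacency of Nigerian states). For the Markov random field penalty, the diagonal is set to the number of neighbours, which is the usual convention and is assumed here; the text only specifies the off-diagonal entries.

```python
def difference_matrix(m, order):
    """Random-walk difference matrix R of the given order for m coefficients."""
    rows = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]
    for _ in range(order):
        rows = [[rows[i + 1][j] - rows[i][j] for j in range(m)]
                for i in range(len(rows) - 1)]
    return rows


def gram(r):
    """Penalty matrix K = R'R."""
    m = len(r[0])
    return [[sum(row[i] * row[j] for row in r) for j in range(m)]
            for i in range(m)]


def mrf_penalty(neighbours):
    """Adjacency-based penalty: -1 for neighbours, 0 otherwise,
    number of neighbours on the diagonal (assumed convention)."""
    regions = sorted(neighbours)
    index = {region: k for k, region in enumerate(regions)}
    k_mat = [[0.0] * len(regions) for _ in regions]
    for region, adjacent in neighbours.items():
        k_mat[index[region]][index[region]] = float(len(adjacent))
        for other in adjacent:
            k_mat[index[region]][index[other]] = -1.0
    return regions, k_mat


def indicator_design(residences, regions):
    """n x (number of regions) matrix Z with a single 1 per row."""
    return [[1.0 if home == region else 0.0 for region in regions]
            for home in residences]


# Toy map: A-B, B-C, C-D share boundaries (hypothetical regions).
TOY_MAP = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
regions, K_spatial = mrf_penalty(TOY_MAP)
Z_spatial = indicator_design(["B", "D", "B", "A"], regions)
K_rw1 = gram(difference_matrix(4, 1))
K_rw2 = gram(difference_matrix(5, 2))
```

Both penalties have rows that sum to zero, so they are rank deficient; this is why the rank of $K_j$, rather than its dimension, appears in the exponent of the prior (7).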
3.3. Bayesian inference
The complex likelihood structures of the non-standard distributions utilised in distributional regression, in most cases, result in full conditionals for the unknown regression coefficients that are analytically intractable. As a result, fully Bayesian inference is based on the posterior distribution of the model parameters, which is not of known form. Consequently, we resort to Markov chain Monte Carlo (MCMC) sampling techniques to generate samples from the full conditionals for the linear, nonlinear and spatial effects, and the smoothing parameters, to be used for posterior analysis. The MCMC sampler was executed as a Metropolis-Hastings algorithm based on iteratively weighted least squares (IWLS) proposals developed by Klein et al. (2015), and implemented in BayesX, a software package used for Bayesian inference in structured additive regression models.
The MCMC simulation was carried out based on a total of 35,000 iterations with a burn-in sample of 5,000 and thinning every 30th observation for parameter estimation. Convergence was monitored through the plots of the sampling paths for all parameters. We considered three models of different specifications in order to ascertain the gain of controlling for covariates in smoothing the structured spatial effects. The first model (Model I) is the 'only spatial' model, without accounting for any covariates. The second (Model II) includes all covariates, and the spatial term, but estimates age as a linear effect rather than a smooth function. Finally, in Model III, age was included as a nonlinear effect, in addition to all available covariates and the spatial term. Model comparison was based on the Deviance Information Criterion, or DIC (Spiegelhalter et al., 2002). The structured additive distributional model described in Section 3.2 was implemented with the three-parameter ZABR framework presented in Section 3.1 to describe the geographical differences in various aspects of mortality in Nigeria. The full model is described as follows:

$$\begin{aligned}
\eta^{\theta} &= \beta_0^{\theta} + \text{urban}\,\beta_1^{\theta} + \cdots + \text{work}\,\beta_p^{\theta} + f^{\theta}(\text{age}) + f_{spat}^{\theta}(s) \\
\eta^{\mu} &= \beta_0^{\mu} + \text{urban}\,\beta_1^{\mu} + \cdots + \text{work}\,\beta_p^{\mu} + f^{\mu}(\text{age}) + f_{spat}^{\mu}(s) \\
\eta^{\sigma^2} &= \beta_0^{\sigma^2} + \text{urban}\,\beta_1^{\sigma^2} + \cdots + \text{work}\,\beta_p^{\sigma^2} + f^{\sigma^2}(\text{age}) + f_{spat}^{\sigma^2}(s)
\end{aligned} \qquad (8)$$

where $\beta_0$ is the intercept term, $\beta_p$ is the linear parameter for working status, $f(\text{age})$ is the nonlinear effect of age (in years) and $f_{spat}(s)$ is the discrete spatial effect for the state of residence.
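The chain settings reported above imply that 1,000 draws per parameter are retained, since (35,000 − 5,000)/30 = 1,000. The sketch below shows one way to discard the burn-in, thin the chain, and summarise the retained draws by a posterior mean and an equal-tailed 95% credible interval. The draws are simulated stand-ins rather than output from the fitted model, the helper names are hypothetical, and the exact indexing convention used by BayesX for thinning is not stated in the text, so the one here is an assumption.

```python
import random
import statistics


def thin_chain(draws, burn_in, thin):
    """Drop the first `burn_in` draws, then keep every `thin`-th draw."""
    return draws[burn_in::thin]


def posterior_summary(draws):
    """Posterior mean and equal-tailed 95% credible interval."""
    cuts = statistics.quantiles(draws, n=40)  # cut points at 2.5%, 5%, ..., 97.5%
    return statistics.mean(draws), cuts[0], cuts[-1]


random.seed(1)
# Stand-in chain for a single coefficient (simulated, not model output).
chain = [random.gauss(0.6, 0.07) for _ in range(35000)]
kept = thin_chain(chain, burn_in=5000, thin=30)
post_mean, ci_lower, ci_upper = posterior_summary(kept)
```

An effect is then described as significant when its 95% credible interval excludes zero, which is how the estimates in Section 4 are read.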
4. Results
Table 1 presents the frequency distribution of the women and children used in this study, classified according to various socio-demographic and behavioral variables. Overall, information on 27,451 women who altogether gave birth to 119,386 children was analysed. As highlights, about 44% of the women had no education, and these women gave birth to about half of the children (51%), while the other half was born to women who attained at least primary education. Also, about 64% of the women, who gave birth to 68% of the children, live in rural areas, while the 62% living in households with unimproved toilet facilities gave birth to about 64% of the children. For other details, see Table 1. Figure 2(a-c) plots the maps of the proportion of women who have experienced no child death, the mean mortality index given that death was recorded, and the conditional variance of the index, respectively, while (d-f) are the respective (spatially) smoothed versions based on Model I. Comparing the crude and smoothed maps, there seem to be no obvious differences in the spatial distributions. Most states in the southern and north-central regions, and Borno (in the north-east), have a high proportion of women with no child mortality, but the proportion is lower for states in the northern fringe of the country. The conditional mean index of child mortality appears higher for Bauchi, Kano, Zamfara, Kebbi and Katsina states, but lower in other parts of the country, while the variability of mortality given the occurrence of death is highest in Bauchi, Kano, Zamfara, Oyo and Ebonyi states.
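The crude (unsmoothed) state-level quantities mapped in Figure 2(a-c) are simple grouped summaries of the mortality index. A minimal sketch, assuming unweighted records of the form (state, index) and using made-up values for two hypothetical regions "A" and "B", could look like this; whether the authors applied survey weights is not stated in the text.

```python
from collections import defaultdict
from statistics import mean, variance


def crude_state_summaries(records):
    """Per-state proportion of zeros, and mean and variance of the positive values."""
    by_state = defaultdict(list)
    for state, index in records:
        by_state[state].append(index)
    summaries = {}
    for state, values in by_state.items():
        positive = [v for v in values if v > 0]
        summaries[state] = {
            "prop_zero": (len(values) - len(positive)) / len(values),
            "cond_mean": mean(positive) if positive else None,
            # The sample variance needs at least two positive values.
            "cond_var": variance(positive) if len(positive) > 1 else None,
        }
    return summaries


# Made-up records for two hypothetical regions (not survey data).
RECORDS = [("A", 0.0), ("A", 0.2), ("A", 0.4),
           ("B", 0.0), ("B", 0.0), ("B", 0.5)]
summaries = crude_state_summaries(RECORDS)
```

These three summaries correspond to the crude counterparts of θ, µ and the conditional spread, and with few women per state they can be unstable, which is one motivation for the spatially smoothed estimates in panels (d-f).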
Table 1: Frequency distribution of the women and children considered based on the categorical variables

Variables                   No of Women (%)     No of Children (%)
Total                       27,451 (100)        119,386 (100)
Education
  No education (ref)        11,952 (43.54)      60,778 (50.91)
  Primary                   5,953 (21.69)       27,945 (23.41)
  Secondary                 7,475 (27.23)       24,388 (20.43)
  Higher                    2,071 (7.54)        6,275 (5.26)
Residence
  Urban (ref)               9,778 (35.62)       38,786 (32.46)
  Rural                     17,673 (64.38)      80,600 (67.51)
Water
  Unprotected (ref)         11,046 (40.24)      68,459 (57.34)
  Protected                 16,342 (59.53)      50,632 (42.41)
  Missing                   63 (0.23)           295 (0.25)
Toilet
  Unimproved (ref)          17,010 (61.96)      76,353 (63.95)
  Improved                  10,410 (37.92)      42,899 (35.93)
  Missing                   31 (0.11)           134 (0.11)
Electricity
  No (ref)                  13,913 (50.68)      64,494 (54.02)
  Yes                       13,492 (49.15)      54,703 (45.82)
  Missing                   46 (0.17)           189 (0.16)
Newspaper
  No (ref)                  22,905 (83.44)      103,870 (87.00)
  Yes                       4,387 (15.98)       14,815 (12.41)
  Missing                   159 (0.58)
Radio
  No (ref)                  10,517 (38.31)      48,891 (40.95)
  Yes                       16,851 (61.39)      70,120 (58.73)
  Missing                   83 (0.30)           375 (0.31)
Television
  No (ref)                  13,796 (50.26)      66,678 (55.85)
  Yes                       13,547 (49.35)      52,239 (43.76)
  Missing                   108 (0.39)          469 (0.39)
Cooking fuel
  Any other means (ref)     1,541 (5.62)        6,522 (5.46)
  Wood                      20,300 (73.95)      94,944 (79.53)
  Kerosene                  4,797 (17.47)       15,488 (12.97)
  Electricity/Gas           496 (1.81)          1,435 (1.20)
  Missing                   317 (1.15)          997 (0.84)
Work
  No (ref)                  19,932 (72.61)      90,776 (76.04)
  Yes                       7,410 (26.99)       28,210 (23.63)
  Missing                   109 (0.40)          400 (0.34)
Figure 2: Maps of Nigeria showing (a) unadjusted proportion of women who have not experienced death of a child; (b) unadjusted mean mortality index; (c) unadjusted variance of the index; (d) smoothed proportion (θ); (e) smoothed mean (µ); and (f) smoothed variance (σ²)
Table 2 presents the summary DIC statistics comparing Models I-III. Model III, which incorporates the spatial effect, the nonlinear effect of age and the linear effects of the other covariates, has the smallest DIC and therefore performs better than both Models I and II. As expected, model complexity (as reflected in the estimate of pD, the effective number of parameters) increases substantially from Model I to Model III. Henceforth, the discussion of our findings will be based on Model III, our model of choice.
Table 2: Model fit and complexity criterion

             Deviance       pD         DIC
Model I      -27392.269     64.359     -27263.552
Model II     -28353.589     87.230     -28179.128
Model III    -34813.110     104.133    -34604.845
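As a quick arithmetic check on Table 2, the reported values agree, up to rounding, with DIC = Deviance + 2 pD. That identity holds when the Deviance column is the deviance evaluated at the posterior mean of the parameters; this reading of the column is an assumption, since the text does not define it. The sketch below recomputes the DIC under that assumption and picks the model with the smallest value.

```python
# (deviance, pD, reported DIC) copied from Table 2.
TABLE2 = {
    "Model I": (-27392.269, 64.359, -27263.552),
    "Model II": (-28353.589, 87.230, -28179.128),
    "Model III": (-34813.110, 104.133, -34604.845),
}


def dic_from_deviance(deviance, p_d):
    """DIC = D(posterior mean) + 2 * pD (assumed meaning of the Deviance column)."""
    return deviance + 2.0 * p_d


recomputed = {name: dic_from_deviance(dev, p_d)
              for name, (dev, p_d, _) in TABLE2.items()}
best_model = min(TABLE2, key=lambda name: TABLE2[name][2])
```

Smaller DIC values indicate a better trade-off between fit and complexity, so the drop of several thousand units from Model II to Model III is what drives the choice of Model III.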
Table 3 presents the posterior estimates of the linear effects of covariates on the probability of a woman having no child death, the conditional expectation of mortality (i.e., the expected mortality index given the occurrence of death), and the spread of mortality given the occurrence of death (the variance in the mortality index, given that death has occurred). For each parameter, the posterior means, along with the 95% credible intervals (CI), are reported. As expected, the results reveal that as the level of education increases, the likelihood that a woman would have experienced no child death increases significantly. This strong link between the mother's education and child survival is well established (Caldwell, 1979; Adetunji, 1995). Educated women are mostly aware of the availability of, and access to, appropriate child care for the benefit of their children. Findings also show that a woman living in a rural area has a significantly lower chance of recording no child death than her urban counterpart. This is unsurprising, given that urban dwelling in Nigeria is associated with better health care and general living conditions than the rural areas. Furthermore, compared to their counterparts, the probability that a woman would experience no child death is significantly higher among women from households with electricity, those that depend on electricity/gas as cooking fuel, and the working women. Estimates for the other variables are not significant.
We now present the findings from the regression on the conditional expectation of the mortality index, as presented in columns 5-7 of Table 3. Note that, following the construction of our mortality index (ratio of observed to expected child deaths), a woman who has given birth to many children would have accumulated a higher value for the expected death, compared to a woman with fewer children. Thus, dividing the observed deaths by the accumulated expected deaths (considering all children ever borne by the woman) tends to give a woman with many children a lower mortality index. Results on educational level, for instance, show that compared to women with no education, the expected mortality index given occurrence of death is significantly lower among women who attained primary level of education, but significantly higher for those with secondary or higher levels. This is in line with studies that have shown education to be an opportunity cost for childbearing, so that the number of children ever born reduces as the level of education increases (Manda and Meyer, 2005). Therefore any child death experienced by these women, who are assumed to have better knowledge of mortality preventive measures, would drastically raise their mortality index. Similarly, results show significantly higher estimates for women residing in rural areas, those who read newspapers at least once a week, those who use electricity/gas as cooking fuel and those who are working. Apart from rural residence, all these characteristics are associated with a tendency to have fewer children. Generally speaking, rural women have a higher number of children than those in urban areas, and also experience higher child mortality (Bongaarts, 2017). Also, significantly lower estimates are obtained for women from households having electricity, those who watch television at least once a week, and those who use wood as cooking fuel.
Results from the regression on the variability of the mortality index given death occurrence, presented in columns 8-10 of Table 3, contribute to our understanding of the divergences in mortality experience among the various socio-demographic and behavioral subgroups (covariates). This is a measure of mortality risk, and quantifies uncertainty in predicting expected mortality, provided that death has occurred (Ríos-Pena et al., 2018). For example, a positive posterior estimate corresponding to a covariate indicates higher conditional variability among the women represented, whereas a negative estimate indicates lower variation. The conditional variability in the mortality index was significantly higher among women with higher education, but lower for those with primary education, when compared to their counterparts with no education. In other words, among women who attained higher education in Nigeria, child mortality is concentrated among a few women, whereas it is fairly evenly spread among those with primary education. This higher variability among highly educated women can be attributed to the classification of higher education in the DHS data, which places a two-year National Diploma and a terminal PhD degree in the same category. It is likely that women within this highly educated category have varying understanding of child care practices. Similarly, the conditional variation is significantly higher among women who read newspapers at least once a week, those using electricity/gas, and working women, as compared to their counterparts.
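To make the three modelled quantities concrete, the following minimal Python sketch maps linear predictors to the probability of zero death, the conditional mean, and the conditional variance of a zero-augmented beta response. It rests on two assumptions that this section does not confirm: logit links for all three parameters, and a mean/scale parametrisation in which Var(y | y > 0) = σ²μ(1 − μ). The function names `inv_logit` and `zab_summaries` are hypothetical, and the inputs are the intercepts as they appear in Table 3.

```python
import math


def inv_logit(eta):
    """Map a linear predictor on the logit scale to the unit interval."""
    return 1.0 / (1.0 + math.exp(-eta))


def zab_summaries(eta_zero, eta_mean, eta_spread):
    """Summaries of a zero-augmented beta response.

    Assumptions (not taken from the paper): logit links for all three
    parameters, and Var(y | y > 0) = sigma2 * mu * (1 - mu).
    """
    p_zero = inv_logit(eta_zero)     # probability of no child death
    mu = inv_logit(eta_mean)         # conditional mean, given a death occurred
    sigma2 = inv_logit(eta_spread)   # scale parameter in (0, 1)
    # Equivalent beta shape parameters under this parametrisation.
    precision = (1.0 - sigma2) / sigma2
    return {
        "p_zero": p_zero,
        "cond_mean": mu,
        "cond_var": sigma2 * mu * (1.0 - mu),
        "overall_mean": (1.0 - p_zero) * mu,
        "shape_a": mu * precision,
        "shape_b": (1.0 - mu) * precision,
    }


# Intercepts as they appear in Table 3 (zero death, mean index, spread).
baseline = zab_summaries(0.611, -2.013, -3.151)
for name, value in baseline.items():
    print(f"{name}: {value:.4f}")
```

Under these assumptions, a positive coefficient added to the spread predictor increases the conditional variance while leaving the conditional mean unchanged, which is the sense in which a positive estimate indicates higher conditional variability.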
Table 3: Linear effects on the likelihood of observing zero death, mean conditional expectation of mortality index and conditional variability of mortality
0.611
0.471
0.746
-2.013
0.053 0.375 0.838
0.221 0.569 1.165
0 -0.215
-0.290
-0.137
0 0.065
0.002
0.130
0 0.010
-0.055
0.077
0 0.020
0.181
0 -0.050
0.172
0 0.062
0 0.076
0.039
0 0.112
-0.026
95% CI -2.083
1.943
Spread Mean -3.151
95% CI -3.307
-2.993
-0.195 -0.046 0.269
-0.022 0.175 0.654
0 -0.067 0.045 0.189
-0.103 0.002 0.103
-0.031 0.089 0.279
0 -0.107 0.066 0.459
0 0.043
0.007
0.077
0 -0.014
-0.094
0.072
0 0.022
-0.004
0.049
0 0.010
-0.055
0.076
0.049
0 0.043
-0.024
0.112
-0.019
0 0.006
-0.066
0.077
0.113
0 0.205
0.083
0.325
-0.014
0.041
0 -0.038
-0.111
0.034
0 0.135 0.475 0.999
-0.009
-0.077
0.011
-0.050
0.094
0 0.013
0 0.031
-0.047
0.109
0 -0.075
-0.109
-0.037
0 -0.077
-0.162
0.008
0 -0.022 0.128 0.406
-0.139 -0.017 0.125
0.100 0.274 0.677
0 -0.066 0.035 0.381
-0.123 -0.032 0.161
-0.015 0.103 0.606
0 -0.070 0.128 0.884
-0.189 -0.277 0.491
0.049 0.050 1.276
0.226
0 0.081
0.110
0 0.161
0.092
0.226
0 0.024
0 0.153
Variables
Constant
Education: No education (ref), Primary, Secondary, Higher
Residence: Urban (ref), Rural
Water Source: Unprotected (ref), Protected
Toilet: Improved (ref), Unimproved
Electricity: No (ref), Yes
Newspaper: No (ref), Yes
Radio: No (ref), Yes
Television: No (ref), Yes
Cooking Facility: Others (ref), Wood, Kerosene, Electricity/Gas
Working Status: No (ref), Yes
Zero death: Mean
Mean index: Mean, 95% CI
0.081
0.053
Results evaluating the nonlinear effect of the mother’s age on the three aforementioned parameters are presented in Figure 3(a-c), where the posterior mean estimates are denoted by black lines, and the 95% credible intervals by blue lines. The plots reveal an overall declining pattern with increasing mother’s age. Specifically, as a woman advances in age, the probability that she would experience no child death reduces. This is expected: with increasing age, a woman tends to have more children, who are in turn exposed to higher risks of death. As for the conditional expectation of the index, the greater number of children expected to have been borne by older women would result in a lower mortality index, compared to younger women with fewer births. For the measure of conditional variability given occurrence of death, we observe substantial differences among younger women below 30 years, while the variation diminishes for ages ≥ 30.
Figure 3: Nonlinear effects of the mother’s age on (a) probability of zero death, (b) mean conditional expectation of mortality index, and (c) conditional variance of mortality. The ‘black’ lines represent the posterior mean estimates, while the 95% credible intervals are denoted by the ‘blue’ lines.
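The argument that the probability of no child death falls as the number of children rises can be illustrated with a deliberately simple calculation. The sketch below is not the paper's model: it assumes that each child dies independently with the same probability `q`, and the value of `q` used here is hypothetical.

```python
def prob_no_child_death(n_children, q):
    """P(no death among n_children), assuming independent deaths with probability q."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie in [0, 1]")
    if n_children < 0:
        raise ValueError("n_children must be non-negative")
    return (1.0 - q) ** n_children


# Hypothetical per-child death probability, chosen only for illustration.
Q_HYPOTHETICAL = 0.10

for n in (1, 3, 5, 8):
    print(n, round(prob_no_child_death(n, Q_HYPOTHETICAL), 3))
```

Under this toy assumption the probability shrinks geometrically with the number of children, which matches the direction of the declining curve for the zero-death probability.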
Posterior estimates of the spatial effects on the above three quantities are presented in Figure 4(a-f). The left panel of the figure presents maps of the posterior means, while the right panel uses the 95% CIs to determine the significance of the corresponding posterior mean estimates. For the CI maps, black signifies states with significantly high estimates, gray indicates significantly low estimates, and white marks states whose estimates are not significant. As for the probability component (Figure 4 (a & b)), the chance of experiencing no child death is significantly higher in the southern states of Anambra and Enugu (South-East region); Edo, Delta, Akwa Ibom, Bayelsa, Rivers, and Cross River (all in the South-South); Ogun, Oyo, and Osun (South-West region), as well as in the northern states of Kogi, Niger, Plateau, and Kwara (North-Central); Kaduna (North-West), Borno (North-East) and the FCT. The probabilities are lower mostly in the northern states of Kebbi, Sokoto, Zamfara, Katsina, Kano, and Jigawa (North-West); Bauchi, Gombe, Adamawa, and Taraba (North-East), and in Ebonyi (the only one from the southern fringe). These findings confirm the existence of a north-south divide in child mortality in Nigeria (Antai, 2011). Regarding the conditional expectation (mean) component, significantly higher estimates are obtained for Abia and Ebonyi (South-East region), Jigawa and Zamfara (North-West), and Lagos (South-West), but lower estimates for Oyo, Kaduna, Yobe, Gombe and Benue states. For the conditional variability, we observe lower estimates in Oyo and in the neighbouring states of Borno and Yobe in the North-East region, but higher estimates in Lagos and in the neighbouring states of Abia, Anambra, Ebonyi and Rivers. Estimates for the majority of the other states are not significant.
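The rule used to read significance from the 95% CIs can be sketched as a small Python function: an effect is labelled significantly high when the whole interval lies above zero, significantly low when it lies below zero, and not significant otherwise. The function name `classify_effect` is hypothetical, and the example intervals are the spread effects of education as read from Table 3 rather than state-level values, which are shown only as maps.

```python
def classify_effect(lower, upper):
    """Label an effect from its 95% credible interval (zero is the reference)."""
    if lower > upper:
        raise ValueError("lower bound exceeds upper bound")
    if lower > 0:
        return "significantly high"
    if upper < 0:
        return "significantly low"
    return "not significant"


# Illustrative intervals: spread effects of education, as read from Table 3.
intervals = {
    "Primary": (-0.195, -0.022),
    "Secondary": (-0.046, 0.175),
    "Higher": (0.269, 0.654),
}

labels = {name: classify_effect(lo, hi) for name, (lo, hi) in intervals.items()}
for name, label in labels.items():
    print(f"{name}: {label}")
```

Applied to a state-level interval, the three labels would correspond to the black, gray and white shading of the CI maps.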
The findings from this analysis, which, in particular, reveal a divide in the mortality experience of the women, can be attributed to the widespread regional disparity in health-seeking behaviour regarding immunization and the utilization of maternal and child care. Evidence has shown that in some of the northern Nigerian states where the practice of Purdah is still prevalent and women are undervalued, women’s reproductive and health-seeking behaviour, whether for themselves or for their children, is strictly determined by men or by their mothers-in-law. This limits their ability to provide the desired care for their children, often leading to a higher proportion of home deliveries without professional assistance to handle birth-related complications (Antai, 2011; Babalola and Fatusi, 2009). Further studies (Gayawan et al., 2019; Kandala et al., 2007; Uthman, 2008) have also reported a similar north-south divide in childhood nutritional intake, communicable diseases, younger age at marriage and first birth, and widespread socioeconomic inequality, which are established factors that worsen the survival chances of less privileged children.
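Figure 4: Spatial effects on (a, b) probability of zero death, (c, d) mean conditional expectation of mortality index, and (e, f) conditional variance of mortality. The left panels show the posterior means, and the right panels show the significance maps based on the 95% CIs.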
5. Conclusion
Over the years, studies on the determinants of child mortality and its spatial variations have been based on statistical models that relate covariates to the conditional mean of the response variable, thereby ignoring vital information on covariate effects on higher moments of the response. Such ignored information could be valuable in policy formulation and implementation. Further, little attention has been given to the duration of exposure in studying the risks of mortality. In this paper, under a ZABR framework, we use a structured additive distributional regression model to explore both linear and non-linear effects of covariates/determinants on various parameters associated with the child mortality index, controlling for spatial association. Our choice of the ZABR framework ensures that the discrete-continuous nature of the index variable is preserved, rather than adopting a transformation-based approach that may result in loss of information. Of particular interest are the findings that indicate a divide in the likelihood of a woman experiencing child mortality across the states; that mortality experience differs hugely among women living in Lagos, Abia, Anambra, Ebonyi and Rivers, but is evenly distributed among those in Oyo, Borno and Yobe states; and that variation in child mortality is significant among younger women, but somewhat uniform for women aged thirty years and above. As an intervention, younger women could be incentivized to attend antenatal and postnatal clinics, where experienced attendants could be designated to tutor these women on child care practices. Furthermore, based on the different parameters considered, we found education and working status of women to be significantly associated with mortality. Consequently, federal and state governments can ensure that all potential mothers receive education beyond the primary level. This would enhance knowledge of child care, and thereby promote better decision-making as a parent. From a policy perspective, the government can also consider an extended maternity leave for working women, as opposed to the present practice of three months’ leave. This would afford mothers ample time for child care after delivery.
As alternatives to the ZABR model, the beta rectangular and simplex regressions, along with their zero- and one-augmented versions, have been considered in the literature (Bandyopadhyay et al., 2017) to model responses in the closed interval [0,1]. However, their extensions to include spatial random effects and other non-linear covariate effects via structured additive distributional regression are not yet available. This remains a viable area for future research, which we hope to undertake.
Acknowledgements
Bandyopadhyay acknowledges support from grants R01DE024984 and P30CA016059 from the United States National Institutes of Health.
References
Adebayo, S. and L. Fahrmeir (2005). Analysing child mortality in Nigeria with geoadditive discrete. Statistics in Medicine 24, 709–728.
Adetunji, J. A. (1995). Infant mortality and mother’s education in Ondo state, Nigeria. Social Science and Medicine 40, 253–263.
Antai, D. (2011). Regional inequalities in under-5 mortality in Nigeria: A population-based analysis of individual- and community-level determinants. Population Health Metrics 9, 6.
Babalola, S. and A. Fatusi (2009). Determinants of use of maternal health services in Nigeria - looking beyond individual and household factors. BMC Pregnancy and Childbirth 9 (43).
Bandyopadhyay, D., D. M. Galvis, and V. H. Lachos (2017). Augmented mixed models for clustered proportion data. Statistical Methods in Medical Research 26, 880–897.
Bongaarts, J. (2017). Africa’s unique fertility transition. Population and Development Review 43, 39–58.
Brezger, A. and S. Lang (2006). Generalized structured additive regression based on Bayesian P-splines. Computational Statistics and Data Analysis 50, 967–991.
Caldwell, J. C. (1979). Education as a factor in mortality decline: an examination of Nigerian data. Population Studies 33, 395–413.
Eilers, P. H. and B. D. Marx (1996). Flexible smoothing using B-splines and penalized likelihood. Statistical Science 11, 236–245.
Fahrmeir, L., T. Kneib, S. Lang, and B. Marx (2013). Regression: Models, Methods and Applications. Springer, Heidelberg.
Feng, C., H. Wang, N. Lu, T. Chen, H. He, Y. Lu, and X. Tu (2014). Log-transformation and its implications for data analysis. Shanghai Archives of Psychiatry 26 (2), 105–109.
Ferrari, S. L. P. and F. Cribari-Neto (2004). Beta regression for modelling rates and proportions. Journal of the Royal Statistical Society, Series C 31, 799–815.
Gayawan, E., M. I. Adarabioyo, D. M. Okewole, S. G. Fashoto, and J. C. Ukaegbu (2016). Geographical variations in infant and child mortality in West Africa: a geo-additive discrete-time survival modelling. Genus 72 (5), 1–20.
Gayawan, E., S. B. Adebayo, and E. Waldmann (2019). Modeling the spatial variability in the spread and correlation of childhood malnutrition in Nigeria. Statistics in Medicine 38 (10), 1869–1890.
Gayawan, E. and M. C. Turra (2015). Mapping the determinants of child mortality in Nigeria: estimates from mortality index. African Geographical Review 34, 269–293.
Kandala, N. B., L. Fahrmeir, S. Klasen, and J. Priebe (2009). Geo-additive models of childhood undernutrition in three Sub-Saharan African countries. Population, Space and Place 15, 461–473.
Kandala, N. B., C. Ji, N. Stallard, S. Stranges, and F. P. Cappuccio (2007). Spatial analysis of risk factors for childhood morbidity in Nigeria. American Journal of Tropical Medicine and Hygiene 77, 770–779.
Kazembe, L. N. and J. J. Namangale (2007). A Bayesian multinomial model to analyse spatial patterns of childhood co-morbidity in Malawi. European Journal of Epidemiology 22, 545–556.
Kinyoki, D. K., S. O. Manda, G. M. Moloney, E. O. Odundo, J. A. Berkley, A. M. Noor, and N.-B. Kandala (2017). Modelling the ecological comorbidity of acute respiratory infection, diarrhoea and stunting among children under the age of 5 years in Somalia. International Statistical Review 85 (1), 164–176.
Klein, N., T. Kneib, S. Klasen, and S. Lang (2015). Bayesian structured additive distributional regression for multivariate responses. Journal of the Royal Statistical Society: Series C (Applied Statistics) 64 (4), 569–591.
Klein, N., T. Kneib, and S. Lang (2015). Bayesian generalized additive models for location, scale, and shape for zero-inflated and overdispersed count data. Journal of the American Statistical Association 110 (509), 405–419.
Klein, N., T. Kneib, S. Lang, A. Sohn, et al. (2015). Bayesian structured additive distributional regression with an application to regional income inequality in Germany. The Annals of Applied Statistics 9 (2), 1024–1052.
Manda, S. and R. Meyer (2005). Age at first marriage in Malawi: a Bayesian multilevel analysis using a discrete time-to-event model. Journal of the Royal Statistical Society, Series A 168, 439–455.
Mosley, H. W. and L. C. Chen (1984). An analytical framework for the study of child survival in developing countries. Population and Development Review 10, 25–45.
Preston, S. H. and M. R. Haines (1991). Fatal years: Child mortality in late nineteenth-century America. Princeton University Press, New Jersey.
Rigby, R. A. and D. M. Stasinopoulos (2005). Generalized additive models for location, scale and shape (with discussion). Journal of the Royal Statistical Society (Series C) 54, 507–554.
Ríos-Pena, L., T. Kneib, C. Cadarso-Suárez, N. Klein, and M. Marey-Pérez (2018). Studying the occurrence and burnt area of wildfires using zero-one-inflated structured additive beta regression. Environmental Modelling & Software 110, 107–118.
Rue, H. and L. Held (2005). Gaussian Markov random fields: Theory and Applications. Chapman and Hall/CRC.
Spiegelhalter, D. J., N. J. Best, B. Carlin, and A. Van Der Linde (2002). Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society, Series B 64, 583–639.
Stewart, C. (2013). Zero-inflated beta distribution for modeling the proportions in quantitative fatty acid signature analysis. Journal of Applied Statistics 30, 985–992.
Stewart, C. and C. Field (2011). Managing the essential zeros in quantitative fatty acid signatures analysis. Journal of Agricultural, Biological and Environmental Statistics 16, 45–69.
Trussell, J. and S. Preston (1982). Estimating the covariates of childhood mortality from retrospective reports of mothers. Health Policy and Education 3, 1–36.
UNICEF (2019). Under-five mortality. https://data.unicef.org/topic/child-survival/under-five-mortality. Accessed: 2019-09-23.
United Nations (1982). Model life tables for developing countries.
Uthman, O. A. (2008). Geographical variations and contextual effects on age of initiation of sexual intercourse among women in Nigeria: A multilevel and spatial analysis. International Journal of Health Geographics 7, 27.
Yadav, A., L. Ladusingh, and E. Gayawan (2015). Does a geographical context explain regional variation in child malnutrition in India? Journal of Public Health 23(5), 277–287.