A model of the probability of survival from birth


Mathl. Comput. Modelling Vol. 26, No. 6, pp. 69-78, 1997
Copyright © 1997 Elsevier Science Ltd
Printed in Great Britain. All rights reserved
0895-7177/97 $17.00 + 0.00
Pergamon
PII: S0895-7177(97)00170-2

C. DENNY
Department of Sociology, Emory University
Atlanta, GA 30322, U.S.A.

Abstract-It has recently been shown [1] that the reciprocal of the survivorship function of a life table is very strongly related to a linear combination of the reciprocals of the same function of two or more standard life tables. It follows, therefore, that the mathematical function which describes the reciprocal of $\ell(x)$ must take the form

$$\frac{1}{\ell(x)} = 1 + a f_1(x, m_1, n_1, \ldots) + b f_2(x, m_2, n_2, \ldots) + c f_3(x, m_3, n_3, \ldots) + \cdots, \tag{1}$$

where $a, b, c, \ldots$ are constants for specific life tables and $m_1, n_1, \ldots$; $m_2, n_2, \ldots$, etc. are constant for all life tables. If the $m$s, $n$s, ... are not constant, then equation (1) will not hold for all values of $x$. Furthermore, there must be at least two $f$ functions or else one standard life table would be sufficient to model the reciprocal of $\ell(x)$ [2]. A three parameter equation which meets the above criteria and models the $\ell(x)$ values of national life tables well is presented along with measures of the goodness of fit. The three parameters are shown to correspond with variations in mortality at young, middle, and old ages.

Keywords-Force of mortality, Survivorship function, Life expectancy, Nonlinear regression.

1. INTRODUCTION

Mathematical models of mortality are necessary for demographers working with incomplete or poor quality data. They are also useful in making demographic projections. The quality of the models is usually judged by at least three criteria: empirical accuracy in explaining a wide variety of observed patterns, economy of parameters, and the ability of the parameters to be meaningfully interpreted [3]. This paper presents a three parameter model of the life table survivorship function which fulfills all of these requirements.

2. MODELS OF LIFE TABLE FUNCTIONS

Perhaps the earliest attempt to model a life table function was that of DeMoivre [4] who in 1725 hypothesized a linear relationship between age, $x$, and the survivorship function of the life table, $\ell(x)$. He applied his model,

$$\ell(x) = \ell(0)\left(1 - \frac{x}{\alpha}\right),$$

where $\alpha$ is an upper age limit, to Halley’s [5] life table. The model covers the ages of 12 and above, and the upper age limit was determined by the value which best fit the life table [6]. Of course the observed values of $\ell(x)$ do not exhibit such a pattern.


Gompertz [7] developed his model of mortality by reasoning from the pattern of mortality in the upper ages of life. He noted that man’s ability to avoid death decreased with age and assumed that this decrease occurred in equal amounts “at the end of equal infinitely small intervals of time” [7, p. 518]. Thus he hypothesized an exponential relationship between age and the force of mortality. His model, which he fit to ages past childhood, can be expressed as

$$\mu(x) = BC^{x}, \tag{2}$$

where $x$ is age,

$$\mu(x) = -\frac{1}{\ell(x)}\,\frac{d\ell(x)}{dx} \tag{3}$$

is the force of mortality at age $x$, and $B$ and $C$ are constants. Makeham [8] improved on Gompertz’ model by adding a constant which he saw as accounting for causes of mortality which did not vary with age. His equation

$$\mu(x) = A + BC^{x}, \tag{4}$$

where $A$ is that constant, was somewhat better but left much to be desired. Much later, Perks [9] made further improvements on the works of Gompertz and Makeham. His five parameter equation

$$\mu(x) = \frac{A + BC^{x}}{KC^{-x} + 1 + DC^{x}} \tag{5}$$

not only produced a better fit than the earlier equations but also covered the entire age interval.

Besides mathematical equations, model life tables have been developed to aid analysis of mortality patterns with incomplete and inaccurate data. The user, based on the information he or she has about a population, chooses a model table to describe the population from a large collection of published model life tables. The model life tables are primarily derived from statistical analysis of a collection of life tables from actual populations. Various sets of model life tables have been published, the first in 1955 by the United Nations [10]. The main weakness of model life tables is a lack of flexibility. For example, the model life tables published in 1955 by the United Nations [10] comprise twenty-four life tables for each sex. These tables vary by the general level of mortality, but variations, for example, in adult versus child mortality for the same life expectancy at birth, are not captured. The widely used Coale and Demeny [11,12] set of model life tables, first published in 1966 and then revised in 1983, has more variability. One chooses a table based on two properties. One is the overall level of mortality, of which there are twenty-five levels in the 1983 edition. The other is region, divided into four categories named “North”, “South”, “East”, and “West”, which vary among themselves by the relative level of infant, childhood, adult, and old age mortality. Still, equations with continuous parameters can achieve much more flexibility than large collections of model life tables, such as that of Coale and Demeny [12].

Probably the most often applied statistical method of modeling life table functions is the logit model of Brass [13]. He found a high correlation between the logits of the survivorship functions of two life tables and accordingly suggested the linear equation

[Equation (6)]

where $\ell_o(x)$ and $\ell_s(x)$ are the survivorship functions of two life tables. Thus, life tables can be generated by the use of a standard life table and appropriate values of the parameters $a$ and $b$. The goodness of fit of the expected values of the survivorship function will of course depend on the choice of the standard life table.

An eight parameter model of the probability of dying at successive ages was developed by Heligman and Pollard [14]. That model is

$$q_x = A^{(x+B)^{C}} + D e^{-E(\ln x - \ln F)^{2}} + \frac{GH^{x}}{1 + GH^{x}}, \tag{7}$$

where $q_x$ is the proportion of persons aged exactly $x$ who are expected to die within one year and the eight parameters to be estimated are represented by the capital letters $A, B, \ldots, H$. According to them, the first term of the equation reflects mortality in early childhood, the second accidental deaths and maternal mortality, and the last the rise in mortality at adult ages. The strength of their model is that it provides a very good fit and all of its parameters have plausible demographic interpretations. Its weakness is that some of the parameters show no discernible trend with changes in level of mortality and therefore cannot be projected [15].

A two parameter model of mortality which has shown promise with regard to goodness of fit, and whose parameters demonstrate a consistent pattern, is that of Mitra and Denny [16]. The double log model, as it was named, expresses $\ln(-\ln(\ell(x)))$ as a function of $\ln(x)$ and $\ln(\alpha - x)$, where $x$ is age and $\alpha$ is the upper limit of life, and is based on an earlier model by Mitra [17]. The latter model was derived from the former by forcing it to pass through $\ell(1)$, a measure of infant mortality, and it is this constraint which reduces the number of parameters in Mitra’s [17] earlier model from three to two. The parameters of the equation

$$\ln(-\ln \ell(x)) - \ln(-\ln \ell(1)) = m \ln x - n\left[\ln(\alpha - x) - \ln(\alpha - 1)\right] \tag{8}$$

are estimated by using weighted least squares regression with the constraint that the intercept is zero. For a given $x$, the corresponding weight is the value of

[Equation (9)]

which is an estimate of the reciprocal of the variance of the dependent variable. In order for the double log model to generate a survivorship function, an estimate of $\ell(1)$ and the parameters $m$ and $n$ are necessary. Given a value for $\ell(1)$, the model can, from a technical point of view, generate an infinite number of life tables by varying the values of $m$ and $n$ within their boundaries $0 < m \le 1$ and $n > m$. The value of $n$ indicates the overall level of mortality while $m$ reflects the pattern of mortality.
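As a rough illustration of how the double log model generates a survivorship function, equation (8) can be solved for $\ell(x)$ once $\ell(1)$, $m$, $n$, and $\alpha$ are given. The sketch below does only that; the function name and the numerical inputs are hypothetical choices made for illustration and are not values reported in the paper. The weighted least squares estimation step is not shown.

```python
import math


def double_log_survivorship(x, l1, m, n, alpha):
    """Survivorship l(x) implied by equation (8) for given l(1), m, n, alpha."""
    if x <= 0.0:
        return 1.0  # radix of the life table
    if x >= alpha:
        return 0.0  # nobody survives past the upper limit of life
    # Equation (8), rearranged for ln(-ln l(x)).
    log_neg_log_l1 = math.log(-math.log(l1))
    rhs = m * math.log(x) - n * (math.log(alpha - x) - math.log(alpha - 1.0))
    return math.exp(-math.exp(log_neg_log_l1 + rhs))


# Hypothetical inputs, chosen only to respect 0 < m <= 1 and n > m.
L1, M, N, ALPHA = 0.95, 0.5, 2.0, 105.0
ages = [0, 1, 5, 20, 40, 60, 80, 100]
curve = [double_log_survivorship(x, L1, M, N, ALPHA) for x in ages]
for age, lx in zip(ages, curve):
    print(f"l({age}) = {lx:.5f}")
```

By construction the curve passes through the supplied $\ell(1)$, and because both $m \ln x$ and $-n \ln(\alpha - x)$ increase with age, the generated values never increase.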

3. DEVELOPMENT OF THE NEW MODEL

Mitra [1], while examining the Brass [13] model, found that the boundary conditions of the survivorship function require the slope coefficient, $b$, to be equal to one, which reduces the number of parameters of equation (6) from two to one. He also showed that when $b = 1$, equation (6) can alternatively be expressed as

[Equation (10)]

He found a good fit with the one parameter equation estimated by weighted least squares with intercept zero, but discovered an even better fit by adding a second parameter and a second standard life table to the model, which became

[Equation (11)]

in which the two standard functions are the survivorship functions of two standard life tables. The quality of fit by the method of least squares regression through the origin turned out to be superior to that of Brass’s [13] logit model. It follows that one would find an even better fit with the addition of more parameters and standard tables.

Mitra [2] reasoned that if equation (11) is true for any set of life tables, then the mathematical function (if a simple one exists) which describes the $\ell(x)$ function or its reciprocal must take a certain form. More exactly, the reciprocal of $\ell(x)$ must be expressible as

$$\frac{1}{\ell(x)} = 1 + a f_1(x, m_1, n_1, \ldots) + b f_2(x, m_2, n_2, \ldots) + c f_3(x, m_3, n_3, \ldots) + \cdots, \tag{12}$$


where $a, b, c, \ldots$ are constants for specific life tables and $m_1, n_1, \ldots$; $m_2, n_2, \ldots$, etc. are constants for all life tables. If the $m$s, $n$s, ... are not constant, then equation (11) will not hold for all values of $x$. Equation (12) will always produce a monotonic nonincreasing survivorship function when the $f$ functions are monotonically increasing functions of $x$ and the parameters $a, b, c, \ldots$ are positive. However, it is not absolutely necessary for $a, b, c, \ldots$ to be all positive as long as they can maintain the monotonic nature of (12). Furthermore, Mitra [2] pointed out that there must be at least two $f$ functions or else one standard life table would be sufficient to model the reciprocal of $\ell(x)$.

Before developing an equation which fulfills the above two prerequisites, two additional restrictions in the form of boundary conditions of $\ell(x)$ are introduced. These are

$$\frac{1}{\ell(0)} = 1 \tag{13}$$

and

$$\frac{1}{\ell(\alpha)} = \infty, \tag{14}$$

because the value of the survivorship function at age zero is always equal to one, the radix of the life table, and the probability of survival equals zero at some age $\alpha$, the end of the human life span. It follows that if (12) can model $\ell(x)$, then each of its components $f_i$ must satisfy (13), and at least one of them must satisfy (14). Simple functions like $(x/(\alpha - x))^{m}$ and $(e^{x/(\alpha - x)} - 1)^{n}$ satisfy both of the conditions exactly for $m, n > 0$, while still simpler functions like $1 - e^{-kx}$ satisfy the first condition exactly for positive integral values of $k$. Denoting these three monotonically increasing functions of $x$ as $f_1$, $f_2$, and $f_3$, respectively, and by experimenting with a number of life tables, it has been found that simple values for the constants like $m = 3$, $n = 1/2$, and $k = 2$ produce excellent fit with varying positive values of $a$, $b$, and $c$ for the different life tables. The last age $\alpha$ has been assigned a value of 105 years. This value was selected by varying the value of $\alpha$ between 80 and 120 and selecting the age which provided the best fit for a wide range of mortality levels.

Analysis below demonstrates that the model equation

$$\frac{1}{\ell(x)} = 1 + a\left(\frac{x}{105 - x}\right)^{3} + b\left(e^{x/(105 - x)} - 1\right)^{1/2} + c\left(1 - e^{-2x}\right) \tag{15}$$

can be regarded as a good approximation of the life table survivorship function. It conforms to the four criteria stated above in that it has the form of equation (12), where the parameters $a$, $b$, and $c$ are the constants for a specific life table and the other constants $m$s, $n$s, ... are constant for all life tables. Furthermore, the equation has more than one monotonically increasing $f$ function, in fact it has three, the value of $\ell(0)$ is always 1, and at age 105, the end of the life span, the probability of survival is zero.
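Equation (15) is simple enough to evaluate directly. The sketch below is one possible implementation, not the author's code. The parameter values are assumptions in the following sense: $c = -.00295$ is the value the text reports for the 1984 Japan male table, while $a = .03674$ and $b = .03742$ are read from the curve labels of Figures 3 and 4 and are assumed here to belong to the same table. With that assumption the function agrees with the expected Japan column of Table 3 to about four decimal places.

```python
import math

ALPHA = 105.0  # last age fixed in the text


def survivorship(x, a, b, c, alpha=ALPHA):
    """l(x) from equation (15): 1 / (1 + a*f1 + b*f2 + c*f3)."""
    if x >= alpha:
        return 0.0  # boundary condition (14)
    r = x / (alpha - x)
    if r > 700.0:
        return 0.0  # exp(r) would overflow; survival is effectively zero
    f1 = r ** 3
    f2 = math.sqrt(math.expm1(r))   # (e^r - 1)^(1/2)
    f3 = 1.0 - math.exp(-2.0 * x)
    return 1.0 / (1.0 + a * f1 + b * f2 + c * f3)


# Assumed to be the 1984 Japan male parameters (see the note above).
A, B, C = 0.03674, 0.03742, -0.00295
for age in (0, 5, 65, 85, 105):
    print(f"l({age}) = {survivorship(age, A, B, C):.5f}")
```

All three component functions vanish at age zero, so $\ell(0) = 1$ regardless of the parameters, and the guard at $x \ge \alpha$ encodes the second boundary condition.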

4. EXPERIMENTS WITH THE NEW MODEL

The new model as presented in equation (15) was tested in a number of ways to examine how well it fit actual national life tables. It was tried on the male and female tables from ten countries, drawn from the 1985 Demographic Yearbook of the United Nations [18]. The countries represent a wide range of mortality levels with life expectancies at birth ranging from 45.11 years for the 1978 Rwanda male life table to 79.75 years for the 1984 Japan female table. A complete list of the countries along with their life expectancies at birth can be found in Tables 1 and 2. The values of the parameters in equation (15) were estimated by using nonlinear regression through application of the CNLR command in SPSS and the NLIN procedure in SAS. The results of those regressions are presented in Tables 1 and 2.

Table 1. Parameters of the regression equation for ten male life tables (by ascending life expectancy at birth).

Table 2. Parameters of the regression equation for ten female life tables (by ascending life expectancy at birth).

[Table bodies not recoverable. Surviving fragments: the country labels Bahrain and Argentina, and one row reading 1984, 79.75, .01562, .02081, .00028, .99986.]

The value of the coefficient of determination, $R^2$, varies between .99780 and .99993 for the twenty life tables. This is a promising result for the first measure of the goodness of fit. Since the observations are not independent, $R^2$ does not have its usual statistical interpretation but is helpful as one of the summary measures of error. It should also be noted that all three parameters $a$, $b$, and $c$ tend to decline with increasing life expectancy. The values of $a$, $b$, and $c$ are expected to be positive, but there were two minor exceptions: the 1984 Japan male life table has a value of $c$ equal to -.00295 and the same parameter for the 1983 Hong Kong female life table equals -.00281. These values of $c$ are not significantly different from 0. One may accordingly use $c = 0$ for the generation of life tables in such cases, but since the monotonic nature of $\ell(x)$ was not disturbed by these values of $c$, they were retained for the derivation of the model values of $\ell(x)$ of those life tables.

Another way to examine the strength of the new model is to see how well equation (15) is able to reproduce the observed values of $\ell(x)$ of the national life tables. Tables 3 and 4 present observed and expected values of the survivorship function for three male and three female national life tables, respectively. Figures 1 and 2 graphically depict the results presented in Tables 3 and 4. Both the tables and the figures demonstrate that the model is able to reproduce the observed values of the survivorship function quite well.
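To make the $R^2$ summary concrete, the following sketch computes one common form of the coefficient of determination, $1 - \mathrm{SSE}/\mathrm{SST}$, from the observed and expected India 1961-70 male values printed in Table 3. This is an assumption about the formula: the paper's own $R^2$ values come from the SPSS and SAS nonlinear regression output and may be defined or computed over a different set of ages, so the number produced here is illustrative only.

```python
def r_squared(observed, expected):
    """Coefficient of determination as 1 - SSE/SST (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - e) ** 2 for o, e in zip(observed, expected))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst


# India 1961-70 male l(x) at ages 0, 1, 5, 10, ..., 85, as printed in Table 3.
india_observed = [1.0, .86987, .81092, .78249, .76251, .74301, .72230,
                  .69973, .67334, .64109, .60183, .55091, .49004, .42084,
                  .33916, .24965, .16825, .10384, .05866]
india_expected = [1.0, .86643, .81611, .78736, .76366, .74154, .71931,
                  .69562, .66908, .63803, .60038, .55371, .49556, .42439,
                  .34133, .25198, .16642, .09560, .04605]

r2 = r_squared(india_observed, india_expected)
print(f"R^2 for the tabulated India male values: {r2:.5f}")
```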

Table 3. Observed and expected values of $\ell(x)$ for selected national male life tables.

| Age | India 1961-70 Observed | India 1961-70 Expected | Bahrain 1976-81 Observed | Bahrain 1976-81 Expected | Japan 1984 Observed | Japan 1984 Expected |
|---|---|---|---|---|---|---|
| 0 | 1 | 1 | 1 | 1 | 1 | 1 |
| 1 | .86987 | .86643 | .94669 | .95018 | .99338 | .99887 |
| 5 | .81092 | .81611 | .92836 | .93296 | .99113 | .99450 |
| 10 | .78249 | .78736 | .92216 | .92374 | .98987 | .99053 |
| 15 | .76251 | .76366 | .91793 | .91569 | .98882 | .98701 |
| 20 | .74301 | .74154 | .91139 | .90771 | .98539 | .98347 |
| 25 | .72230 | .71931 | .90209 | .89910 | .98125 | .97958 |
| 30 | .69973 | .69562 | .89146 | .88915 | .97724 | .97499 |
| 35 | .67334 | .66908 | .87893 | .87690 | .97245 | .96919 |
| 40 | .64109 | .63803 | .86255 | .86097 | .96541 | .96142 |
| 45 | .60183 | .60038 | .84014 | .83929 | .95401 | .95048 |
| 50 | .55091 | .55371 | .80844 | .80879 | .93587 | .93439 |
| 55 | .49004 | .49556 | .76290 | .76505 | .90605 | .90997 |
| 60 | .42084 | .42439 | .69895 | .70228 | .86492 | .87208 |
| 65 | .33916 | .34133 | .61089 | .61445 | .80705 | .81283 |
| 70 | .24965 | .25198 | .49720 | .49896 | .72198 | .72174 |
| 75 | .16825 | .16642 | .36465 | .36285 | .59599 | .58955 |
| 80 | .10384 | .09560 | .23103 | .22658 | .42318 | .41974 |
| 85 | .05866 | .04605 | .11868 | .11591 | .23645 | .24223 |

Figure 1. Observed and expected values of $\ell(x)$ for selected national male life tables. [Plot: survivorship on the vertical axis, age from 0 to 80 on the horizontal axis.]

Table 4. Observed and expected values of $\ell(x)$ for selected national female life tables. [Table body not recoverable.]

Figure 2. Observed and expected values of $\ell(x)$ for selected national female life tables. [Plot: observed and expected curves for Japan 1984, Bahrain 1976-81, and India 1961-70; survivorship on the vertical axis, age from 0 to 80 on the horizontal axis.]

Two other measures of how well the new model fit the observed data were computed. One was to examine the difference between the observed and expected life expectancy at birth for each national life table. The life expectancy at birth was calculated using standard demographic procedures from the observed and expected $\ell(x)$ values. The results of this examination are presented in Table 5 for both the male and female life tables. The absolute value of the difference between the actual and expected life expectancies at birth for the twenty national life tables had a range of 0 to 0.21 years. For 15 of the 20 life tables, the difference was less than 0.1. The second measure was to calculate the index of dissimilarity between the actual and the expected stationary percentage age distributions for the twenty life tables. These values, presented in Table 6, vary from .0813 to .8612. Only four of the twenty tables have an index value greater than 0.5. Both the difference between actual and expected life expectancies at birth and the index of dissimilarity are small enough to show support for the new model.
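The two measures just described can be sketched from the Japan 1984 male columns of Table 3. Several simplifications are assumed here and are not taken from the paper: person-years in each age interval are approximated by the trapezoid rule (including the first year of life), the calculation stops at age 85 because the table does, so the result is a partial life expectancy between birth and 85 rather than the full $e(0)$ of Table 5, and the index of dissimilarity is taken as half the sum of absolute differences between the two percentage age distributions.

```python
AGES = [0, 1, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85]

# Japan 1984 male l(x), observed and expected, as printed in Table 3.
OBSERVED = [1.0, .99338, .99113, .98987, .98882, .98539, .98125, .97724,
            .97245, .96541, .95401, .93587, .90605, .86492, .80705, .72198,
            .59599, .42318, .23645]
EXPECTED = [1.0, .99887, .99450, .99053, .98701, .98347, .97958, .97499,
            .96919, .96142, .95048, .93439, .90997, .87208, .81283, .72174,
            .58955, .41974, .24223]


def person_years(ages, lx):
    """Trapezoid approximation of person-years lived in each age interval."""
    return [0.5 * (lx[i] + lx[i + 1]) * (ages[i + 1] - ages[i])
            for i in range(len(ages) - 1)]


def partial_life_expectancy(ages, lx):
    """Expected years lived between the first and last tabulated ages."""
    return sum(person_years(ages, lx)) / lx[0]


def index_of_dissimilarity(ages, lx_a, lx_b):
    """Half the summed absolute gap between two percentage age distributions."""
    years_a = person_years(ages, lx_a)
    years_b = person_years(ages, lx_b)
    pct_a = [100.0 * v / sum(years_a) for v in years_a]
    pct_b = [100.0 * v / sum(years_b) for v in years_b]
    return 0.5 * sum(abs(p - q) for p, q in zip(pct_a, pct_b))


e_obs = partial_life_expectancy(AGES, OBSERVED)
e_exp = partial_life_expectancy(AGES, EXPECTED)
gap = e_obs - e_exp
dissimilarity = index_of_dissimilarity(AGES, OBSERVED, EXPECTED)
print(f"partial life expectancy: observed {e_obs:.2f}, expected {e_exp:.2f}")
print(f"difference {gap:.3f} years, index of dissimilarity {dissimilarity:.4f}")
```

Because of the truncation at age 85 and the crude treatment of infancy, these figures should not be expected to match Tables 5 and 6 exactly; they only show the mechanics of the comparison.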


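Table 5. Actual and expected life expectancies at birth and their difference for ten male and ten female life tables (by ascending male life expectancy at birth).

| Country | Year | Male actual $e(0)$ | Male expected $e(0)$ | Actual minus expected |
|---|---|---|---|---|
| Hong Kong | 1983 | 72.34 | 72.25 | .09 |
| Japan | 1984 | 74.41 | 74.42 | -.01 |

[The remaining rows and the female columns are only partly recoverable. Countries listed: Rwanda, India, Botswana, Iran, Argentina (1969-71), Bahrain, Costa Rica, Hong Kong, Japan. Surviving actual/expected pairs without a recoverable label: 61.90/61.93; 66.25/66.22; 69.21/69.13; 75.91/75.83; 77.91/77.83; 79.75/79.75. Other surviving values: 69.54, 70.27, 70.29, and differences .01, .04, .03, .02, .08, .06, .08, .09, .08, .00.]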

Table 6. Values of the index of dissimilarity between actual and expected stationary percentage age distributions for ten male and ten female life tables.

| Country | Year | Male | Female |
|---|---|---|---|
| Argentina | 1969-71 | .1802 | .1630 |
| Bahrain | 1976-81 | .1587 | .1566 |
| Botswana | 1980-81 | .5333 | .5401 |
| Canada | 1970-72 | .2461 | .3138 |
| Costa Rica | 1972-74 | .1823 | .0864 |
| Hong Kong | 1983 | .2497 | .3213 |
| India | 1961-70 | .3716 | .8612 |
| Iran | 1976 | .4850 | .5308 |
| Japan | 1984 | .1900 | .0813 |
| Rwanda | 1978 | .4595 | .5957 |

The final experiment conducted with this model was to examine the effect of varying individually the values of the three parameters of equation (15) on the resultant life tables. Specifically, the estimated parameters for the 1969-71 Argentina male life table, a country with intermediate mortality, were substituted in equation (15) and the expected values of the survivorship function were graphed. Then, one by one, the parameters of equation (15) were assigned the values estimated for both the 1978 Rwanda male and 1984 Japan male life tables, countries with, respectively, high and low mortality. The results of this experiment are presented in Figures 3, 4, and 5. These figures demonstrate that parameters a, b, and c are indicators of mortality in the end, middle, and beginning of the life span, respectively.
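One way to see why the three parameters act on different parts of the life span is to tabulate the component functions of equation (15) on their own. The sketch below does this; the helper names `f1`, `f2`, and `f3` simply follow the paper's notation, and the choice of ages is arbitrary. It is an illustration of the functional form, not a reproduction of the experiment behind Figures 3 to 5.

```python
import math

ALPHA = 105.0  # last age used in equation (15)


def f1(x):
    """Component multiplied by a: (x / (105 - x))^3."""
    return (x / (ALPHA - x)) ** 3


def f2(x):
    """Component multiplied by b: (e^(x / (105 - x)) - 1)^(1/2)."""
    return math.sqrt(math.expm1(x / (ALPHA - x)))


def f3(x):
    """Component multiplied by c: 1 - e^(-2x)."""
    return 1.0 - math.exp(-2.0 * x)


for age in (1, 5, 30, 60, 80):
    print(f"age {age:2d}: f1 = {f1(age):8.4f}  f2 = {f2(age):6.4f}  f3 = {f3(age):6.4f}")
```

In this tabulation `f3` is already within 0.0001 of its ceiling by age 5, so `c` can only shift the curve in the first years of life; `f1` stays below 0.1 until after age 30 and then grows very quickly, so `a` matters mainly at old ages; `f2` rises more gradually across the adult ages, which is consistent with `b` reflecting mortality in the middle of the life span.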

5. CONCLUDING REMARKS

The three parameter model presented in equation (15) of this paper has been shown to reproduce well the survivorship functions of ten life tables covering both sexes and a wide range of mortality levels. The three parameters a, b, and c determine the shape of the mortality curve in the end, middle, and beginning of the life span, respectively.

Figure 3. Varying the value of the parameter $a$. [Plot: three survivorship curves labelled a = .03674, a = .12756, and a = .38461; age from 0 to 100 on the horizontal axis.]

Figure 4. Varying the value of the parameter $b$. [Plot: three survivorship curves, two labelled b = .43506 and b = .03742, the third label illegible; age from 0 to 100 on the horizontal axis.]

Figure 5. Varying the value of the parameter $c$. [Plot: age from 0 to 100 on the horizontal axis; curve labels not recoverable.]

A number of measures of the goodness of fit of the new model were presented: visual inspection of the graphs and expected values of the survivorship function, the coefficient of determination from estimating the parameters of the equation, examination of actual and expected life expectancy at birth, and the index of dissimilarity between the actual and expected stationary percentage age distributions. Other measures not shown here but found in the demographic literature are based on regressing the observed on the expected $\ell(x)$ values and examining the deviation of the slope from one and the intercept from zero [16] and the root mean square error [15]. More research could be done comparing this model with others. In fact, comparison of the various models of mortality current in the literature needs to be developed. Yet the results presented thus far are sufficient to demonstrate that this three parameter model warrants the attention of practitioners.

REFERENCES

1. S. Mitra, Boundary condition sets limits on Brass’s model life table parameters, Canadian Studies in Population 22, 67-78, (1995).
2. S. Mitra, The search for model life table functions, Presented at the 1995 Annual Meeting of the Canadian Population Society, Montreal, Demography India, (Forthcoming).
3. United Nations, Manual X: Indirect Techniques for Demographic Estimation, Population Studies No. 81, United Nations, New York, (1983).
4. A. DeMoivre, Annuities on Lives: Or, the Valuation of Annuities Upon Any Number of Lives; as also of Reversions, London, (1725).
5. E. Halley, An estimate of the degrees of mortality of mankind drawn from curious tables of the births and funerals at the city of Breslau, Philosophical Transactions of the Royal Society 17, 596-610, 653-656, (1693).
6. D. Smith and N. Keyfitz, Mathematical Demography: Selected Papers, Springer-Verlag, New York, (1977).
7. B. Gompertz, On the nature of the function expressive of the law of human mortality, and on a new mode of determining the value of life contingencies, Philosophical Transactions of the Royal Society 115, 513-585, (1825).
8. W.M. Makeham, On the law of mortality, Journal of the Institute of Actuaries 13, 325-358, (1860).
9. W. Perks, On some experiments in the graduation of mortality statistics, Journal of the Institute of Actuaries 63, 12-40, (1932).
10. United Nations, Age and Sex Patterns of Mortality: Model Life-Tables for Under-Developed Countries, Population Studies No. 22, United Nations, New York, (1955).
11. A.J. Coale and P. Demeny, Regional Model Life Tables and Stable Populations, Princeton University Press, Princeton, (1966).
12. A.J. Coale and P. Demeny, Regional Model Life Tables and Stable Populations, Academic Press, New York, (1983).
13. W. Brass, The Demography of Tropical Africa, Princeton University Press, Princeton, (1968).
14. L. Heligman and J.H. Pollard, The age pattern of mortality, Journal of the Institute of Actuaries 107, 49-80, (1980).
15. N. Keyfitz, Experiments in the projection of mortality, Canadian Studies in Population 18, 1-17, (1991).
16. S. Mitra and C. Denny, On the application of a model of mortality, Canadian Studies in Population 21, 117-132, (1994).
17. S. Mitra, Modelling life table functions, Demography India 12, 115-121, (1983).
18. United Nations, 1985 Demographic Yearbook, United Nations, New York, (1987).