Interval Between Fetal Measurements in Predicting Growth Restriction

PHILIP OWEN, MD, MRCOG, SURINDRA MAHARAJ, MB, MRCOG, KHALID S. KHAN, MSc, MRCOG, AND P. W. HOWIE, MD, FRCOG

Objective: To determine the influence of the interval between fetal measurements on performance of fetal growth velocity for predicting infants with anthropometric features of fetal growth restriction (FGR).

Methods: Two hundred seventy-four low-risk women had serial fetal biometry at scheduled intervals. Growth velocity of the fetal abdominal area for each was calculated with 2-, 4-, and 6-week scan intervals in which the second measurement was the last scan before delivery. Fetal abdominal area velocity over a 4-week interval in the early third trimester also was included. Fetal growth restriction was defined as skinfold thickness under the tenth percentile, ponderal index under the 25th percentile, or midarm circumference–to–occipitofrontal circumference ratio under −1 standard deviation (SD). Test performance was expressed as likelihood ratios with 95% confidence intervals (CI).

Results: Fetal abdominal area velocity calculated over a 4-week interval predicted FGR with a likelihood ratio of 10.4 (95% CI 3.9, 26) for skinfold thickness; 9.5 (95% CI 4.6, 19) for ponderal index; and 4.7 (95% CI 2.3, 8.4) for midarm circumference–to–occipitofrontal circumference ratio. Intermeasurement intervals of 6 weeks had a likelihood ratio of 8.5 (95% CI 4, 17) for skinfold thickness; 7.5 (95% CI 3.4, 16.1) for ponderal index; and 14 (95% CI 6.7, 28) for midarm circumference–to–occipitofrontal circumference ratio. The likelihood ratios for the 2-week interval and the early third trimester 4-week interval were all less than 5.

Conclusion: Four- and 6-week measurement intervals were useful for predicting infants with FGR and were superior to a 2-week interval. Fetal growth velocity is influenced by proximity of the last fetal measurement to date of delivery, which adversely affects clinical use of growth velocity for predicting FGR. (Obstet Gynecol 2001;97:499–504. © 2001 by The American College of Obstetricians and Gynecologists.)
Reliable antenatal identification of growth-restricted infants at risk of adverse outcomes might be expected to improve allocation of monitoring resources, with the possibility of improving perinatal outcomes. Growth velocity standards quantify fetal growth from two ultrasound measurements1 provided the interval between measurements and gestational age at second measurement are known. Fetal growth velocity has been described as potentially valuable for antenatal prediction of growth-restricted infants and intrapartum heart rate abnormalities.2,3 The distinction between growth-restricted and constitutionally small infants is well recognized, as is the limited importance of birth weight as an indicator of fetal growth achievement and perinatal outcome.4 Abnormalities of neonatal body constitution appear to be more useful indicators of adverse short- and long-term outcomes than birth weight alone.5–7 Yet, only a few studies have used neonatal anthropometric criteria to diagnose fetal growth restriction (FGR).2,8 Using absence of an increment in fetal abdominal circumference measurement as their definition of FGR, Mongelli et al9 found that shortening of intermeasurement interval resulted in increased false-positive diagnosis rates for FGR. However, that analysis was limited because of the lack of an appropriate standard for the confirmation of FGR. Against that background we decided to investigate the influence of between-measurement interval on diagnostic performance of fetal growth velocity for predicting three different neonatal anthropometric criteria of FGR.
From the Department of Obstetrics, Glasgow Royal Maternity Hospital, Glasgow, Scotland; the Department of Obstetrics and Gynaecology, Birmingham Women’s Hospital, West Midlands, England; and the Department of Obstetrics and Gynaecology, Ninewells Hospital and Medical School, Dundee, Scotland. Philip Owen was supported by Wellbeing, the charitable arm of the Royal College of Obstetricians and Gynaecologists, London.

Materials and Methods
VOL. 97, NO. 4, APRIL 2001
Three hundred thirteen women who attended the antenatal clinic at Ninewells Hospital (Dundee, Scotland) were enrolled; details of the enrollment were presented elsewhere.1 Entry criteria were singleton pregnancy, gestational age less than 85 days confirmed by
0029-7844/01/$20.00 PII S0029-7844(00)01155-8
crown-rump length measurement, and absence of recognized risk factors for accelerated or restricted fetal growth, including history of small for gestational age (SGA) infants, existing medical disorders, or heavy smoking (more than 20 cigarettes per day). All subjects were sequentially entered into one of four scanning schedules (n = number continuing in the study):

1. 22, 26, 30, 32, 34, 36, 38, 40 weeks (n = 72)
2. 23, 27, 31, 33, 35, 37, 39, 41 weeks (n = 72)
3. 24, 28, 32, 34, 36, 38, 40 weeks (n = 63)
4. 25, 29, 33, 35, 37, 39, 41 weeks (n = 67)
Ultrasound measurements were made using an Aloka SSD-650 (Aloka Co. Ltd., Mitaka-shi, Tokyo, Japan) real-time ultrasound scanner with a 3.5-MHz probe, by the same observer (PO). Crown-rump length was measured in a standard manner10 and gestational age calculated with reference to that measurement.11 The fetal abdominal area was measured at the level of the umbilical vein by tracing the outline of the trunk on-screen.12 Three measurements were made and the mean recorded. Intraobserver reproducibility was assessed by measuring ten volunteers outside the study in the third trimester. The coefficient of variation was 1.45%. Birth weight was adjusted to account for mothers’ height and midpregnancy weight, and accorded a centile position according to gestational age, sex, and birth order.13 Skinfold thickness was measured on the second or third day of life using Holtain calipers (CMS Weighing Equipment, London, England). Three measurements were made at the infant’s subscapular and triceps areas, the mean measurement recorded, and a centile position measured after adjustment for gestational age and sex.14 The occipitofrontal circumference was measured with a tape measure and the mean of three measurements recorded. Midarm circumference was measured at the point halfway between acromion and olecranon process of the ulna on the right arm, flexed at 90°. The mean of three measurements was recorded, the midarm circumference–to– occipitofrontal circumference ratio calculated, and ratios compared with gestational age–adjusted reference data.15 Neonatal length was measured on a standard neonatal anthropometer on the third day of life. The mean of three measurements was recorded, the ponderal index calculated, and centile position recorded.16 Neonatal measurements were made by one investigator (PO) without knowledge of fetal growth velocity calculations. Ethical approval for the study was obtained from the local Research Ethics Committee. 
Fetal growth velocity was determined for four intervals and gestational ages: at 2-week intervals using the
500 Owen et al
Time Interval Influences Prediction of FGR
last measurement before delivery, at 4-week intervals using the last measurement before delivery, at 6-week intervals using the last measurement before delivery, and at 4-week intervals in the early third trimester (included to examine the influence of proximity to delivery time on test performance). The velocity standard deviation score represents the gestational age–adjusted mean daily increment between the two measurements and was calculated from the following formula:

\[
\text{Velocity standard deviation score} = \frac{\text{daily increment} - \text{reference mean increment}}{\text{standard deviation}}
\]
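As an illustration only, the velocity standard deviation score formula above can be sketched in Python. The function name, argument names, units, and all numeric values below are hypothetical and are not taken from the study or its published reference ranges; the sketch assumes the daily increment is the difference between the two abdominal area measurements divided by the number of days between them.

```python
def velocity_sd_score(first_area, second_area, interval_days,
                      ref_mean_increment, ref_sd):
    """Velocity SD score for two serial measurements.

    ref_mean_increment and ref_sd stand for the gestational age-specific
    reference mean daily increment and its standard deviation at the
    gestational age of the second measurement (hypothetical inputs here).
    """
    if interval_days <= 0:
        raise ValueError("interval_days must be positive")
    if ref_sd <= 0:
        raise ValueError("ref_sd must be positive")
    # Mean daily increment between the two measurements.
    daily_increment = (second_area - first_area) / interval_days
    # Standardize against the reference values.
    return (daily_increment - ref_mean_increment) / ref_sd


# Hypothetical example: two area measurements (cm^2) taken 28 days apart.
score = velocity_sd_score(first_area=60.0, second_area=74.0, interval_days=28,
                          ref_mean_increment=0.8, ref_sd=0.2)
print(round(score, 2))  # -1.5 for these made-up inputs
```

With these made-up inputs the daily increment is 0.5 cm² per day, which lies 1.5 reference standard deviations below the assumed reference mean.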
The reference mean increment and standard deviation refer to the gestational age-specific values determined from published reference ranges for ultrasound growth velocity established from the same population.1 The reference values were those for the gestational age at the second fetal measurement used in calculating the daily increment.

We analyzed the data using receiver operating characteristic (ROC) curves in the first instance because when there are many tests and outcomes, the area under the ROC curve is considered the best discriminator of diagnostic performance. In ROC analysis, the sensitivity is plotted against 1 − specificity, and this plot provides an opportunity to define the cutoff points for classifying positive and negative cases for test results on a continuous scale. We used an iterative process to determine a cutoff point that maximized specificity of the various growth velocity measurements because in conditions of low prevalence such as FGR, a diagnostic test would be more useful if a positive result ruled in disease (ie, a test with high specificity). For each cutoff point we calculated the likelihood ratio, defined as sensitivity/(1 − specificity). The likelihood ratio is a clinically useful measure of test accuracy that quantifies the effect a particular test result has on the probability of a certain outcome. Using a simplified form of Bayes’ theorem: posterior odds = prior odds × likelihood ratio, where odds = probability/(1 − probability) and probability = odds/(odds + 1). For a positive test result, likelihood ratios exceeding 10 generate significant changes in the pretest probability of growth restriction, whereas likelihood ratios of 5–10 generate only moderate changes. For a negative result, likelihood ratios less than 0.1 generate significant changes, whereas likelihood ratios of 0.1–0.2
Obstetrics & Gynecology
Table 1. Performance of Growth Velocity Indices in Predicting FGR

Criterion   Velocity                 Number     Area under   Cutoff velocity   Sensitivity   Specificity
for FGR     parameter                of cases   ROC curve    SD score          (%)           (%)
SKFT        2-wk                     225        0.68         −2.5              14            95
SKFT        4-wk                     221        0.74         −2                30            97
SKFT        6-wk                     219        0.83         −1.16             44            95
SKFT        Early third trimester    196        0.73         −1.5              18            95
PI          2-wk                     241        0.65         −2.5              11            95
PI          4-wk                     233        0.75         −1.55             40            96
PI          6-wk                     233        0.80         −1.17             34            95
PI          Early third trimester    237        0.73         −1.4              19            96
MAC/OFC     2-wk                     213        0.59         −2.5              22            95
MAC/OFC     4-wk                     202        0.69         −1.5              47            90
MAC/OFC     6-wk                     206        0.66         −1.17             67            95
MAC/OFC     Early third trimester    194        0.73         −1.3              28            95

FGR = fetal growth restriction; ROC = receiver operating characteristic; SD = standard deviation; SKFT = skinfold thickness < 10th percentile; PI = ponderal index < 25th percentile; MAC/OFC = midarm circumference–to–occipitofrontal circumference ratio < −1 SD.
generate only moderate changes in the pretest probability.17 The power of our sample was explored according to the method proposed by Simel et al.18 We were interested in the impact of the test result on the likelihood of disease, particularly that of a positive test result in accurately predicting FGR. That emphasis meant that we wanted a high value of likelihood ratio for a positive test result with a minimum clinically important likelihood ratio threshold of 10.17 Based on a previous study2 that involved the same population, we assumed a disease prevalence of 10%, a sensitivity of 55%, and a specificity of 90%, which allowed us to estimate that approximately 203 cases (including 18 with FGR) were required for an appropriately narrow confidence interval (CI) around a clinically meaningful likelihood ratio value. The total number of women who continued in the study was 274, but the number available for analysis varied according to 12 combinations of measurement interval and criterion for diagnosis of FGR.
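The likelihood ratio and the odds form of Bayes’ theorem described in the Methods can be illustrated with a short Python sketch. The function names are our own. The negative likelihood ratio is written with its standard definition, (1 − sensitivity)/specificity, which the text does not spell out. The example inputs are the rounded percentages reported in Tables 1 and 2, so results may differ slightly from the published values, which were presumably derived from unrounded counts.

```python
def positive_likelihood_ratio(sensitivity, specificity):
    """LR(+) = sensitivity / (1 - specificity), as defined in the Methods."""
    return sensitivity / (1.0 - specificity)


def negative_likelihood_ratio(sensitivity, specificity):
    """LR(-) = (1 - sensitivity) / specificity (standard definition)."""
    return (1.0 - sensitivity) / specificity


def posttest_probability(pretest_probability, likelihood_ratio):
    """Posterior odds = prior odds x likelihood ratio, converted to probability."""
    prior_odds = pretest_probability / (1.0 - pretest_probability)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (posterior_odds + 1.0)


# Skinfold thickness, 4-week interval (Table 1: sensitivity 30%, specificity 97%).
print(round(positive_likelihood_ratio(0.30, 0.97), 1))  # 10.0 (Table 2 reports 10.4)

# Skinfold thickness, 2-week interval (Table 2: pretest 13%, LR(+) 2.7, LR(-) 0.9).
print(round(posttest_probability(0.13, 2.7) * 100))  # 29, as in Table 2
print(round(posttest_probability(0.13, 0.9) * 100))  # 12, as in Table 2
```

Working in odds rather than probabilities is what makes the update a single multiplication; the final conversion back to a probability is needed only for reporting.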
Results

Among the 274 women who continued in the study, mean maternal age was 26 years, with 148 (54%) in their first pregnancies. Two hundred sixty infants (95%) were delivered at 37 weeks’ gestation or later. Twenty-two (8%) had adjusted birth weights below the tenth percentile, among whom 11 (4%) also were below the third percentile. Skinfold thickness, ponderal index, and midarm circumference–to–occipitofrontal circumference ratios were available in 238, 257, and 237 cases, respectively. Some cases were missed because of early discharge, or could not be categorized because the reference data for neonatal anthropometry used did not extend to preterm births. Twenty-six infants (10.9%) had one or both skinfold thicknesses under the tenth
centile, 40 (15.6%) had a ponderal index under the 25th centile, and 17 (7%) had midarm circumference–to–occipitofrontal circumference ratios below −1 standard deviation. The median interval and range in days for 2-week, 4-week, 6-week, and early third trimester groups were 14 (11–21), 28 (22–30), 42 (31–57), and 28 (21–37), respectively. For the early third trimester 4-week interval, the median and range of gestational ages for the first and second measurements used to calculate velocity were 32 weeks (range 27–33) and 36 weeks (range 32–37), respectively. The areas under the ROC curves, together with sensitivity and specificity for the different velocity calculations, are presented in Table 1. Velocity intervals have some discriminatory capacity for the three criteria of FGR, but that capacity is highest for the 4- and 6-week intervals. The test performances are further described by the likelihood ratios presented in Table 2. As anticipated from the analysis of the ROC curve areas, the 4- and 6-week intervals (both using the last measurement before delivery) had the highest likelihood ratios for a positive test. With the exception of predicting infants with a midarm circumference–to–occipitofrontal circumference ratio of under −1 SD, the 4-week interval had higher likelihood ratios than the 6-week velocity interval, which suggests that a 4-week between-measurement interval will optimize performance of fetal growth velocity for predicting FGR. The likelihood ratios for the 2-week interval and for the 4-week interval in the early third trimester are low for all three criteria of FGR, suggesting that those intervals and timings of measurements are not useful for predicting FGR.
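The text does not state how the 95% confidence intervals around the likelihood ratios were computed. As a sketch of one widely used approach, the log method, the Python function below derives a positive likelihood ratio and its interval from a 2 × 2 table. The function name and the counts in the example are hypothetical and are not the study’s data.

```python
import math


def lr_positive_with_ci(tp, fn, fp, tn, z=1.96):
    """LR(+) with an approximate confidence interval (log method).

    tp, fn: affected infants with positive and negative tests.
    fp, tn: unaffected infants with positive and negative tests.
    z: normal quantile (1.96 for an approximate 95% interval).
    """
    if tp <= 0 or fp <= 0:
        raise ValueError("log method needs at least one true and one false positive")
    sensitivity = tp / (tp + fn)
    false_positive_rate = fp / (fp + tn)  # equals 1 - specificity
    lr = sensitivity / false_positive_rate
    # Standard error of ln(LR+).
    se_log_lr = math.sqrt(1 / tp - 1 / (tp + fn) + 1 / fp - 1 / (fp + tn))
    lower = math.exp(math.log(lr) - z * se_log_lr)
    upper = math.exp(math.log(lr) + z * se_log_lr)
    return lr, lower, upper


# Hypothetical counts: 26 affected infants (8 test positive),
# 195 unaffected infants (6 test positive).
lr, lower, upper = lr_positive_with_ci(tp=8, fn=18, fp=6, tn=189)
print(f"LR(+) = {lr:.1f} (95% CI {lower:.1f}, {upper:.1f})")
```

Because the interval is built on the log scale it is asymmetric around the point estimate, which is also the pattern seen in the intervals reported in Table 2. With few test-positive cases the interval is wide even when the point estimate is high.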
Discussion

Single estimates of fetal size and measurements of fetal body proportionality do not accurately predict infants
Table 2. Growth Velocity in Predicting FGR

Posttest probability and change in probability are shown as positive test / negative test.

Criterion   Velocity          Pretest           LR (+)             LR (−)              Posttest          Change in
for FGR     parameter         probability (%)   (95% CI)           (95% CI)            probability (%)   probability (%)
SKFT        2-wk              13                2.7 (0.9, 7.4)     0.9 (0.7, 1)        29 / 12           16 / 1
SKFT        4-wk              11                10.4 (3.9, 26)     0.72 (0.5, 0.8)     57 / 9            46 / 2
SKFT        6-wk              12                8.5 (4, 17)        0.6 (0.4, 0.8)      55 / 8            43 / 4
SKFT        Third trimester   13                3.9 (1.4, 10)      0.86 (0.67, 0.97)   36 / 11           23 / 2
PI          2-wk              15                2.2 (0.75, 6.15)   0.94 (0.8, 1)       29 / 15           14 / 0
PI          4-wk              16                9.5 (4.6, 19)      0.62 (0.4, 0.7)     64 / 10           48 / 6
PI          6-wk              15                7.5 (3.4, 16.1)    0.7 (0.5, 0.8)      57 / 11           42 / 4
PI          Third trimester   15                4.3 (1.7, 10.2)    0.84 (0.68, 0.95)   44 / 13           29 / 2
MAC/OFC     2-wk              8                 4.3 (1.5, 11.2)    0.82 (0.6, 1)       29 / 7            21 / 1
MAC/OFC     4-wk              7                 4.7 (2.3, 8.4)     0.5 (0.8, 0.8)      27 / 4            20 / 3
MAC/OFC     6-wk              9                 14 (6.7, 28)       0.35 (0.2, 0.6)     57 / 3            48 / 6
MAC/OFC     Third trimester   8                 5.4 (2, 13)        0.76 (0.52, 0.93)   33 / 7            25 / 1

FGR = fetal growth restriction; LR (+) = likelihood ratio of a positive test; LR (−) = likelihood ratio of a negative test; CI = confidence interval; SKFT = skinfold thickness < 10th percentile; PI = ponderal index < 25th percentile; MAC/OFC = midarm circumference–to–occipitofrontal circumference ratio < −1 SD.
with anthropometric features of intrauterine malnourishment.19,20 Serial fetal measurements, however, may show the dynamic processes of normal and abnormal fetal growth, although quantification and interpretation of serial fetal biometry are potentially fraught with difficulties.21 One approach to quantifying serial fetal measurements is the calculation of a growth velocity standard deviation score using values for gestational age-specific daily growth increment means and standard deviations.1 Calculating fetal growth velocity that way was described in the prediction of infants with FGR and pregnancies that required intrapartum cesarean delivery for fetal heart rate abnormality.2,3 In our department, fetal abdominal area measurement is preferred to abdominal circumference because circumference measurements are appropriate when the outline is circular, but only the area is truly representative of a cross-sectional fetal profile if the outline is elliptical.22

Several observational studies support measurements of neonatal anthropometry or ratios of actual to expected birth weight in identifying infants who had intrauterine malnourishment and adverse consequences.5–7 Describing fetal growth achievement in terms of a ratio of actual birth weight to an expected optimal birth weight more usefully identifies infants with low skinfold thicknesses and other features of FGR than birth weight alone.23 For those reasons, we have chosen to classify the infants in this study as growth-restricted or not on the basis of three anthropometric measures and not on birth weight. No data are available to instruct us about the relative importance of the three described anthropometric measures in terms of perinatal and long-term performance, so we included results of all three.

Standard deviation scores are increasingly used in studies of prenatal ultrasound to describe fetal size and growth2,24 because they provide more precise estimates of fetal size and growth than a description of above or below a particular centile, which might under- or overestimate any growth delay. Standard deviation scores also allow us to describe fetal biometry independent of gestational age, whereas traditional centile charts do not permit comparison of different fetuses at different gestational ages. A standard deviation score of zero is the 50th centile and scores of 1.25 and −1.25 approximate to the 90th and 10th centiles, respectively. We used likelihood ratios to evaluate growth velocity because expressing performance of a test in those terms enables us to assess how useful the test might be in clinical practice, which depends on change in pretest
probability from the test result. We believe that is the best method of evaluating diagnostic tests. Traditionally, concepts of sensitivity and specificity are used to assess diagnostic tests but these values are less useful. Evidence of usefulness of a test is considered very strong when the likelihood ratio exceeds 10.17 Among a population of SGA fetuses, Chang et al8 found that change in abdominal circumference or estimated fetal weight was useful for identifying those with anthropometric features of FGR, but they did not address influence of between-measurement interval or proximity of last measurement to delivery. We previously found that fetal growth velocity calculated with the standards used in the current study was useful in identifying growth-restricted infants when a 4-week measurement interval was used.2 The present study is an extension of our earlier work, examining the influence of between-measurement interval on fetal growth velocity. The results of this study are in broad agreement with the simulation model of Mongelli et al,9 who concluded that an interval of at least 3 weeks was necessary to minimize the false-positive rate of diagnosis of FGR; our study had the advantage of using actual cases with appropriate standards for quantifying growth and appropriate anthropometric categorization of growth restriction. There are many possible explanations for our observations. The 2-week interval probably works poorly because measurement variation will have a greater influence on velocity calculations compared with 4- and 6-week intervals. Reducing the coefficient of variation of the ultrasound measurements reduces the rate of false-positive diagnoses.9 There is only a small difference in likelihood ratios between 4- and 6-week measurement intervals, with 4-week results slightly better.
That finding is probably because growth restriction resulting in anthropometric features of malnourishment in neonates occurs late in pregnancy, and growth failure is likely to be diluted by more normal growth in the preceding 1–3 weeks when a 6-week interval is used. That assumption could be explored further by analyzing growth velocity calculated with varying between-measurement intervals at specific gestational ages. In this study, a 4-week interval earlier in the third trimester did not usefully identify growth-restricted fetuses, presumably because (in a low-risk population at least) growth failure is not manifest at that point, which raises important issues regarding applicability of growth velocity calculations when screening for FGR. Test performance over a 4-week interval was good when the last measurement before delivery was used, but establishing which is the last measurement before delivery can be made only retrospectively. Test performance appears to be poor at earlier gestational ages, thus suggesting that calculating growth velocity is
likely to have only limited clinical application when screening for FGR in low-risk populations. Further work is necessary to find whether a 4-week measurement interval in the early third trimester is useful for identifying growth abnormalities in high-risk populations, and if so whether that information can be translated into improved obstetric management.
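The argument made in the Discussion that measurement variation weighs more heavily on short intervals can be illustrated with a simple error-propagation sketch in Python. It assumes, as a simplification that the study does not state, that the two measurements carry independent errors with the same standard deviation, so that the error of their difference is √2 times that standard deviation and the error of the daily increment is this value divided by the interval. The function name and the abdominal area of 70 cm² are hypothetical; the 1.45% coefficient of variation is the intraobserver value reported in the Methods.

```python
import math


def daily_increment_error_sd(measurement_sd, interval_days):
    """SD of the error in a daily increment computed from two measurements.

    Assumes independent, equal-variance errors in the two measurements.
    """
    if interval_days <= 0:
        raise ValueError("interval_days must be positive")
    return math.sqrt(2.0) * measurement_sd / interval_days


# Hypothetical abdominal area (cm^2) combined with the reported CV of 1.45%.
hypothetical_area = 70.0
measurement_sd = 0.0145 * hypothetical_area

for interval_days in (14, 28, 42):
    error_sd = daily_increment_error_sd(measurement_sd, interval_days)
    print(f"{interval_days}-day interval: error SD = {error_sd:.3f} cm^2/day")
```

Under these assumptions the error in the daily increment for a 2-week interval is twice that for a 4-week interval and three times that for a 6-week interval, whatever the absolute size of the measurement error.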
References

1. Owen P, Donnet L, Ogston S, Christie AD, Patel N, Howie PW. Standards for ultrasound fetal growth velocity. Br J Obstet Gynaecol 1996;103:60–9.
2. Owen P, Khan KS. Fetal growth velocity in the prediction of intrauterine growth retardation in a low risk population. Br J Obstet Gynaecol 1998;105:536–49.
3. Owen P, Harrold A, Farrel T. Fetal size and growth velocity in the prediction of intrapartum caesarean section for fetal distress. Br J Obstet Gynaecol 1997;104:445–9.
4. Beattie RB, Johnson P. Practical assessment of neonatal nutrition status beyond birthweight; an imperative for the 1990s. Br J Obstet Gynaecol 1994;101:842–6.
5. Georgieff MK, Sasanow SR, Mammel SC, Pereira GR. Mid-arm circumference; head circumference ratios for the identification of symptomatic LGA, AGA and SGA newborns. J Pediatr 1986;109:316–21.
6. Hill RM, Verniaud WM, Deter RL, Tennyson LM, Rettig GM, Zion TE, et al. The effect of intrauterine malnutrition on the term infant. Acta Paediatr 1984;73:482–7.
7. Fay RA, Dey PL, Saadie CM, Buhl JA, Gebski VJ. Ponderal index: A better definition of the “at risk” group with intrauterine growth problems than birth-weight for gestational age in term infants. Aust N Z J Obstet Gynaecol 1991;31:17–9.
8. Chang TC, Robson SC, Spencer JD, Gallivan S. Identification of fetal growth retardation: Comparison of Doppler waveform indices and serial ultrasound measurements of abdominal circumference and fetal weight. Obstet Gynecol 1993;82:230–6.
9. Mongelli M, Sverker E, Tambyrajia R. Screening for fetal growth restriction: A mathematical model of the effect of time interval and ultrasound error. Obstet Gynecol 1998;92:908–12.
10. Christie AD. Standards in ultrasound fetal biometry. In: Kurjak A, ed. Progress in medical ultrasound: Reviews and comments. Vol. 2. Amsterdam: Excerpta Medica, 1981:111–24.
11. Geirsson RT, Busby-Earle RMC. Certain dates may not provide a reliable estimate of gestational age. Br J Obstet Gynaecol 1991;98:108–9.
12. Campbell S, Wilkin D. Ultrasonic measurement of fetal abdomen circumference in the estimation of fetal weight. Br J Obstet Gynaecol 1975;82:689–97.
13. Altman DG, Coles EC. Nomograms for the precise determination of birthweight for dates. Br J Obstet Gynaecol 1980;87:81–5.
14. Oakley R, Parsons RJ, Whitelaw AGL. Standards for skinfold thickness in British newborn infants. Arch Dis Child 1977;52:287–90.
15. Meadows NJ, Till J, Leaf A, Hughes E, Jani B, Larcher V. Screening for intrauterine growth retardation using ratio of mid-arm circumference to occipito-frontal circumference. Br Med J Clin Res Ed 1986;292:1039–40.
16. Lubchenco LO, Hansman C, Boyd E. Intrauterine growth in length and head circumference as estimated from live births at gestational ages from 26 to 42 weeks. Pediatrics 1966;37:403–9.
17. Jaeschke R, Guyatt G, Sackett DL, for the Evidence-Based Medicine Working Group. Users’ guide to the medical literature III. How to use an article about a diagnostic test (2 parts). JAMA 1994;271:389–91, 703–7.
18. Simel DL, Samsa GP, Matchar DB. Likelihood ratios with confidence: Sample size estimation for diagnostic test studies. J Clin Epidemiol 1990;44:763–70.
19. Sumners JE, Findley GM, Ferguson KA. Evaluation methods for intrauterine growth using neonatal fat stores instead of birth weight as outcome measures; fetal and neonatal measurements correlated with neonatal skinfold thicknesses. J Clin Ultrasound 1990;18:9–14.
20. Vintzileos AM, Lodeiro JG, Feinstein SJ, Campbell WA, Weinbaum PJ, Nochimson DJ. Value of fetal ponderal index in predicting growth retardation. Obstet Gynecol 1986;67:584–8.
21. Royston P, Altman DG. Design and analysis of longitudinal studies of fetal size. Ultrasound Obstet Gynecol 1995;6:307–12.
22. Rossavik IK, Deter RL. The effect of abdominal profile shape changes on the estimation of fetal weight. J Clin Ultrasound 1984;12:57–9.
23. Sanderson DA, Wilcox MA, Johnson IR. The individualised birthweight ratio: A new method of identifying intrauterine growth retardation. Br J Obstet Gynaecol 1994;101:310–4.
24. Kurmanavicius J, Wright E, Royston P, Wisser J, Huch R, Huch A, et al. Fetal ultrasound biometry: 1. Head reference values. Br J Obstet Gynaecol 1999;106:126–35.
Address reprint requests to:
Philip Owen, MD, MRCOG
Department of Obstetrics
Glasgow Royal Maternity Hospital
Rottenrow
Glasgow
Scotland
E-mail: [email protected]
Received July 13, 2000. Received in revised form October 20, 2000. Accepted November 9, 2000. Copyright © 2001 by The American College of Obstetricians and Gynecologists. Published by Elsevier Science Inc.