Automated fetal head detection and measurement in ultrasound images by iterative randomized hough transform


Ultrasound in Med. & Biol., Vol. 31, No. 7, pp. 929 –936, 2005 Copyright © 2005 World Federation for Ultrasound in Medicine & Biology Printed in the USA. All rights reserved 0301-5629/05/$–see front matter

doi:10.1016/j.ultrasmedbio.2005.04.002

Original Contribution

AUTOMATED FETAL HEAD DETECTION AND MEASUREMENT IN ULTRASOUND IMAGES BY ITERATIVE RANDOMIZED HOUGH TRANSFORM

WEI LU,* JINGLU TAN* and RANDALL FLOYD†
*Departments of Biological Engineering and †Obstetrics and Gynecology, University of Missouri, Columbia, MO, USA

(Received 6 July 2004, revised 31 March 2005, in final form 7 April 2005)

Abstract: An image-processing and object-detection method was developed to automate the measurements of biparietal diameter (BPD) and head circumference (HC) in ultrasound fetal images. The heads in 214 of 217 images were detected by an iterative randomized Hough transform. A head was assumed to have an elliptical shape with parameters progressively estimated by the iterative randomized Hough transform. No user input or size range of the head was required. The detection and measurement took 1.6 s on a personal computer. The interrun variations of the algorithm were small at 0.84% for BPD and 2.08% for HC. The differences between the automatic measurements and sonographers’ manual measurements were 0.12% for BPD and −0.52% for HC. The 95% limits of agreement were −3.34%, 3.58% for BPD and −5.50%, 4.45% for HC. The results demonstrated that the automatic measurements were consistent and accurate. This method provides a valuable tool for fetal examinations. © 2005 World Federation for Ultrasound in Medicine & Biology.

Key Words: Object detection, Randomized Hough transform, Fetal ultrasound, Fetal growth, Image processing, Pattern recognition.

Address correspondence to: Dr. Jinglu Tan, Department of Biological Engineering, 215 Agricultural Engineering Building, University of Missouri, Columbia, MO 65211 USA.

INTRODUCTION

Analysis of ultrasound (US) fetal head images is a daily routine for obstetricians, radiologists and sonographers. The biparietal diameter (BPD) and the head circumference (HC) are two important measurements for evaluating fetal growth (Hadlock 1988), estimating gestational age (Hadlock et al. 1981, 1983, 1984), predicting fetal maturity (Hadlock 1988) and weight (Gull et al. 2002) and diagnosing a wide range of obstetric problems (William and Roya 1998). The BPD is the distance from the outer margin of the proximal skull to the inner margin of the distal skull and is usually measured as the distance between two manually-marked endpoints. The HC is the circumference of the outer skull and is manually measured by tracing the skull or fitting an ellipse to it with a mouse-like device (Athey and Hadlock 1985). Dudley and Chapman (2002) demonstrated the clinical importance of fetal measurement accuracy and suggested that significant improvements were required. Partially because of the fuzziness of fetal head boundaries (Pathak et al. 1997) and the low resolution and signal-to-noise ratio of US images, there are 1 to 7% intra- and interobserver variations in the head measurements (Salari et al. 1990; Zador et al. 1991; Gull et al. 2002). The variations are a major contributor to inaccurate measurements. Multiple images are often taken to assure measurement consistency (Zador et al. 1991; Dudley and Chapman 2002) and additional examiners are required to reduce major discrepancies (> 10%) for critical measurements (Gull et al. 2002). In addition to the problem of measurement errors and inconsistencies, manual measurements take time and labor (Zador et al. 1991; Chalana et al. 1996).

To overcome the disadvantages of manual measurements, image-processing and object-recognition techniques have been used for automatic or semiautomatic fetal measurements (Salari et al. 1990; Thomas et al. 1991; Zador et al. 1991; Matsopoulos and Marshall 1994; Chalana et al. 1996; Hanna and Youssef 1997; Pathak et al. 1997). In one study, noncontiguous skull bones were connected by iterative dilations and the head area was filled by a grow-filling algorithm (Matsopoulos


and Marshall 1994). The algorithm was tested with one image of limited complexity. Salari et al. (1990) and Zador et al. (1991) used a Hough transform to find the center of the head in thresholded edge images and a least-squares method to determine other parameters, for 74 of 75 images. A computation time of 8 s per image was reported on a 10-MHz personal computer (PC). In another study, Hough transforms and image morphologic operations based on some a priori information about the head bone size were used to reduce noise in a preprocessing procedure (Hanna and Youssef 1997). A circular model and then a more accurate elliptical model were fitted to the head bones by using the 4-D Hough transform and the head axis ratio (eccentricity). The algorithm required approximately 4 min on a 66-MHz PC per image. It is not clear how many images were tested. Chalana et al. (1996) detected the heads in 33 of 35 images by using an active contour model. This method requires the user to specify a point near the true center of a head. The computation time for each image was 32 s on a Sun SparcStation 20/71 and 248 ms on a high-performance MS5000 system (Pathak et al. 1997). Practical issues with computerized measurements in clinical settings, including integration of algorithms into an US machine, interactive use of computerized measurements and improvement of computation efficiency, were discussed by a number of researchers (Zador et al. 1991; Chalana et al. 1996; Pathak et al. 1997). There are several limitations of the existing algorithms for fetal head detection and measurement. The standard Hough transform requires extensive preprocessing to eliminate reflections from skin and other irrelevant tissues based on a priori knowledge of the head bone size (Hanna and Youssef 1997). Least-squares fitting methods usually suffer when there are pixels off the target curve (Wu and Wang 1993). 
Methods based on iterative dilations are ineffective for images with large gaps or moderate to strong noise (Matsopoulos and Marshall 1994). The active contour-model method requires the user to specify an initial point close to the center of a head and is computationally inefficient (Chalana et al. 1996; Pathak et al. 1997). In the existing work, the image complexity was limited or user input was required. In addition, the computation time was long unless a high-performance computer system was used (Pathak et al. 1997). This paper describes an application of an automatic method for fetal head detection and measurement. The new method has the following advantages: 1. it provides reliable, consistent BPD and HC measurements; 2. it requires little user intervention; and 3. it is efficient in terms of both computation time and storage space usage. The foundation of this method is a newly-developed iterative randomized Hough transform (IRHT) (Lu


2003). Results for 217 clinical US images are compared with manual measurements.

MATERIALS AND METHODS

Image acquisition
Two sets of fetal head images were randomly selected from an archive resulting from routine pregnancy examinations at a regional hospital. A small set of 11 images was used for initial algorithm testing and evaluation of interobserver variations and a larger set of 206 images was used for statistical analysis. The first set was from six patients examined in one day and the second set included images from 96 patients whose last names started with a letter in a segment of the alphabet. All images were acquired with an HDI 5000 US machine (Philips; Bothell, WA, USA) over a period of 1 month. The gestational age (GA) of the fetuses was 18 to 34 weeks for the first set and 12 to 39 weeks for the second set. Both the time-gain control and the spatial resolution were set to 2-D optimization and the spatial resolution range was 158 to 358 pixels/cm. The images were stored in Tag Image File Format (TIFF) with 256 grey levels and 640 × 476 pixels. BPD and HC measurements were made by the imaging sonographer. Three imaging sonographers were involved, but only one made measurements for each image.

Image segmentation by K-mean classifier
Bones (head skull, femur, etc.) appear as bright objects in an US image (Fig. 1a). Often, there are other bright structures adjacent to the head skull and moderate to large gaps exist between skull segments. The image intensities are not always consistent, even among images acquired under apparently the same conditions. Furthermore, various artefacts and noise are usually present in an US image. As a result, thresholding or iterative dilation was found to be ineffective for fetal head segmentation. In this study, a K-mean algorithm (Tou and Gonzalez 1974; Duda et al. 2001) was used to classify each pixel according to its intensity value into one of three groups: bright object, grey object and background (Fig. 1b). Each image was preprocessed with a 3 × 3 low-pass filter to reduce high-frequency noise and through a white top-hat transformation with an 11 × 11 structuring element to improve the contrast. The K-mean algorithm was then applied to identify the bright objects. A morphologic binary area opening operation was used to remove small bright objects (< 20 pixels). Morphologic dilation with a 1 × 1 structuring element and closing with a 2 × 2 structuring element were used to smooth the boundaries of large bright objects. Additional details can be found in Lu and Tan (2000) and Lu (2003). After segmentation, the skeletons of the bright


Fig. 1. Skull segmentation and skeleton extraction. (a) Ultrasound fetal head image. (b) Result of segmentation by K-mean classifier into bright object, grey object and background. (c) Skeleton of bright objects.

objects were extracted (Fig. 1c). The skeleton image provided a simple representation of the skull segments and was used for head detection. Head detection by iterative randomized Hough transform Because fetal skulls are not completely closed, fetal head contours usually contain moderate to large gaps. Fetal images generally have artefacts and moderate noise. Structures adjacent to the head and skin reflections near the transducer may generate bright spots in an image (Fig. 1). For head detection, the gaps, extraneous tissues and imaging noise are disturbances. A useful head-detection algorithm must effectively deal with these disturbances. An iterative randomized Hough transform (IRHT) was recently developed for detection of parametric curves with large gaps and strong noise (random impulse or “salt and pepper” noise, random deviations in pixel coordinates from a perfect curve and nonuniform noise) (Lu 2003). It iteratively applies a randomized Hough transform (RHT) (Xu et al. 1990; Leavers 1992; Xu and Oja 1993) to an adaptively adjusted region-of-interest (ROI). The following gives a brief description of the RHT and IRHT. More details can be found in Lu (2003).


In a binary image, an n-dimensional parametric curve can be represented as $f(c, z) = 0$, where $c = [\alpha_1, \ldots, \alpha_n]^t$ comprises n parameters and $z = (x, y)$ represents the coordinates of pixels on the curve. The RHT randomly takes a sample of n object pixels $z_i = (x_i, y_i)$, $i = 1, \ldots, n$, from the image and maps the sample into one point $c \in R^n$ in the n-D parameter space by solving a set of n equations $f(c, z_i) = 0$. If $c$ is valid for the curve type of interest (ellipse for this work), the occurrence count for $c$ is increased by one. This process is repeated until some criterion is met. The often-used criteria include a maximum number of valid samples and a threshold for the maximum occurrence count, both of which are usually experimentally determined (Xu et al. 1990; Leavers 1992; Xu and Oja 1993). The counts of occurrence for different $c$ values are accumulated in an n-D array as a function of $c$, thereby resulting in a histogram in the n-D space. The location of the maximum count corresponds to the $c$ value of the highest frequency and, thus, to the most prominent curve of interest. To reduce storage and to simplify computation, n 1-D accumulators, where each accumulator stores the count for one $\alpha_i$, $i = 1, \ldots, n$, may be used instead of an n-D accumulator. This will result in n histograms in the 1-D space, each corresponding to one of the n parameters. For ellipse detection (n = 5), the following stable and parametrically linear model (Leavers 1992) was used in this work:

$$x^2 + y^2 - U(x^2 - y^2) - V\,2xy - Rx - Sy - T = 0 \tag{1}$$

where U, V, R, S and T are parameters. These values can be converted into the natural parameters of an ellipse, $c = [x_0, y_0, a, b, \phi]^t$, where $(x_0, y_0)$ is the center coordinates, a and b are, respectively, the long and short semiaxes and $\phi$ is the angle of rotation.
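As an illustration of the RHT just described, the following Python sketch solves eqn (1) for one five-pixel sample, converts (U, V, R, S, T) to the natural ellipse parameters and collects votes in five 1-D histograms. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the 1-pixel and 1-degree bin widths, the `max_tries` and `max_axis` guards and the use of NumPy are all choices made here for illustration, and the only validity test applied is that the conic is a real ellipse.

```python
import numpy as np


def solve_conic(pts):
    """Solve eqn (1) for (U, V, R, S, T) from five (x, y) pixels."""
    x, y = pts[:, 0].astype(float), pts[:, 1].astype(float)
    # Eqn (1) rearranged: U(x^2 - y^2) + V(2xy) + Rx + Sy + T = x^2 + y^2
    design = np.column_stack([x**2 - y**2, 2 * x * y, x, y, np.ones_like(x)])
    return np.linalg.solve(design, x**2 + y**2)


def to_natural(U, V, R, S, T):
    """Convert (U, V, R, S, T) to (x0, y0, a, b, phi); None if not an ellipse."""
    rho = np.hypot(U, V)
    if not rho < 1.0:  # real ellipse requires U^2 + V^2 < 1 (also rejects NaN)
        return None
    # Center: the gradient of the conic in eqn (1) vanishes there.
    x0, y0 = np.linalg.solve([[2 * (1 - U), -2 * V], [-2 * V, 2 * (1 + U)]], [R, S])
    g = T + 0.5 * (R * x0 + S * y0)
    if g <= 0:
        return None
    a = np.sqrt(g / (1 - rho))    # long semiaxis
    b = np.sqrt(g / (1 + rho))    # short semiaxis
    phi = 0.5 * np.arctan2(V, U)  # rotation of the long axis, in radians
    return x0, y0, a, b, phi


def rht_ellipse(pixels, k_valid=200, max_tries=20000, max_axis=1e4, seed=None):
    """Randomized Hough transform with five 1-D accumulators.

    max_tries and max_axis are illustrative guards (assumptions).
    Returns (x0, y0, a, b, phi) at the histogram peaks, or None.
    """
    rng = np.random.default_rng(seed)
    pixels = np.asarray(pixels, dtype=float)
    votes = []
    for _ in range(max_tries):
        if len(votes) >= k_valid:
            break
        sample = pixels[rng.choice(len(pixels), size=5, replace=False)]
        try:
            c = to_natural(*solve_conic(sample))
        except np.linalg.LinAlgError:
            continue  # degenerate sample, draw again
        if c is not None and np.all(np.isfinite(c)) and c[2] <= max_axis:
            votes.append(c)
    if not votes:
        return None
    votes = np.array(votes)
    # Bin widths (assumed): 1 pixel for x0, y0, a, b and 1 degree for phi.
    widths = [1.0, 1.0, 1.0, 1.0, np.deg2rad(1.0)]
    estimate = []
    for j, w in enumerate(widths):
        bins = np.round(votes[:, j] / w).astype(int)
        values, counts = np.unique(bins, return_counts=True)
        estimate.append(values[np.argmax(counts)] * w)  # peak of 1-D histogram
    return tuple(estimate)
```

Calling `rht_ellipse(pixels)` on an array of (x, y) foreground coordinates returns one estimate per parameter. Because each parameter is voted on independently, storage grows linearly rather than exponentially with the number of parameters, which is the motivation given in the text for the 1-D accumulators.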
For a solution of $c$ to be valid: $x_0$, $y_0$, a and b are positive; $x_0$ and $y_0$ are not larger than the image width and height, respectively; a is greater than b; and a and b are not larger than half of the diagonal length of the image. The locations of the five count maxima in the five 1-D accumulators provide an estimate for $c = [x_0, y_0, a, b, \phi]^t$. Strong disturbances and noise may completely corrupt the true maxima associated with a target curve, leading to detection failures by RHT (Lu 2003). To deal with this situation, an iterative RHT was developed and its flow diagram is shown in Fig. 2. In step 1, a ROI is initially defined as the entire image. In step 2, the algorithm scans the ROI and stores all foreground pixels (see Fig. 1c) in a pixel list. The n 1-D accumulators are initialized with zeros. In steps 3 to 4, a sample of n pixels is randomly picked from the pixel list and a parameter solution $c_i$ is found, for this sample, with eqn (1). If $c_i$ is

valid as described above, the corresponding counts for $c_i$ are increased in the n 1-D accumulators; otherwise, $c_i$ is discarded and a new sample is drawn. In the unlikely event that no valid $c_i$ is found after a given number of samples, the program exits and reports failure to detect an ellipse. This random sampling and count accumulation process (steps 3 to 5) is repeated until K valid samples are processed. In step 6, a parameter estimate, $c_{est}$, is determined from the locations of the count maxima in the 1-D accumulators. The ROI is updated with an area that contains the ellipse defined by $c_{est}$ (Fig. 3a). If the change in $c_{est}$ from the previous iteration is small (less than 2.5° in $\phi$; less than two pixels in any of $x_0$, $y_0$, a and b; and less than six pixels total in $x_0$, $y_0$, a and b), the algorithm is considered to have converged; otherwise, steps 2 to 6 are repeated in the new ROI. Figure 3a to c shows the process of the ROI converging to the target ellipse and gradual exclusion of disturbances from the region. For this example, the IRHT converged in five iterations. The resulting ellipse is superimposed on the skeleton image (Fig. 3c) and on the original image (Fig. 3d). It is evidently a very close fit to the skull contour.

Fig. 2. Flow diagram of iterative randomized Hough transform (IRHT) for curve detection.

Fig. 3. Head detection by IRHT. (a) After first iteration of IRHT, estimation is biased by noise structures. (b) After second iteration, angle of rotation is still incorrect. (c) After fifth iteration, IRHT has converged; resulting ellipse superimposed on the skeleton image. (d) Resulting ellipse superimposed on the original image.

In this study, a K value of 200 was experimentally found to give reliable results. K values greater than 200 brought little change. K needs to be large enough so that stable count peaks corresponding to the most prominent ellipse are formed in the accumulators. The minimum K value will, thus, depend on how prominent the skull contour is relative to the disturbances on the ROI (Fig. 3), which is commonly referred to as the signal-to-noise ratio. For the binary skeleton images, this ratio is a reflection of the fetal structure rather than a strong function of the detailed image properties, such as resolution. As a result, the K value used in this work is expected to have good general applicability, although it needs to be verified for different machines.

BPD and HC measurements
The BPD and HC were computed from the ellipse detected by the IRHT. Because the ellipse was fitted to the skull skeleton, which was the center-line (see Fig. 3c and d), the BPD was simply 2b. In clinical practice, HC is calculated as $1.57 \times (l + s)$, where l and s are the long and short axes of the outer skull, respectively (Athey and


Hadlock 1985). If the average thickness of the skull segments is $\bar{t}$, then $l$ equals $2a + \bar{t}$ and $s$ equals $2b + \bar{t}$. $\bar{t}$ was computed as:

$$\bar{t} = \frac{1}{M} \sum_{i=1}^{M} t_i = \frac{1}{M} \sum_{i=1}^{M} \frac{S_i}{L_i} \tag{2}$$

where $S_i$ and $L_i$ were, respectively, the area and skeleton length of a skull segment and M was the number of skull segments. A skeleton length was computed from the skeleton image (Fig. 1c), and the skull segment area was the area of the corresponding bright object in the segmented image (Fig. 1b). Only large skull objects (> 100 pixels or 25 mm²) inside the final ROI were included in the computation. The 5-mm or 10-mm markers on the images were automatically identified and used to convert pixels into area or length values in mm² or mm.

Measurement comparison and statistical analysis
The automatic measurements were compared with the conventional manual measurements. In addition to the two sonographers who did measurements for the first image set, six research staff members or graduate students were trained to measure the BPD and HC by manually fitting an ellipse to the head skull in Adobe Photoshop (Adobe, San Jose, CA, USA). This resulted in eight sets of manual BPD and HC measurements for evaluation of interobserver variations. For the second image set, each fetal head was measured by only one of the three imaging sonographers during on-line examination. Because the IRHT involves random sampling of pixels, the automatically-measured values may vary slightly from run to run. The computerized method was, thus, executed eight times to generate eight sets of automatic measurements for both image sets. Measurements were averaged over the eight observers (or runs) to establish the average manual (or automatic) measurement. The measurement consistency was evaluated by the interobserver or interrun variation as:

$$\text{Interobserver (interrun) variation} = \left[ N \binom{Q}{2} \right]^{-1} \sum_{i=1}^{N} \sum_{p<q} \left| \frac{x_{pi} - x_{qi}}{x_{qi}} \right| \times 100 \tag{3}$$

where $x_{ki}$ (k = p or q) is the measurement of the ith image by the kth observer (or run), N is the number of images, Q is the total number of observers (or runs) and $\sum_{p<q}$ is the sum over all pairs p and q such that $1 \le p < q \le Q$. The interobserver variation is the average of all absolute percent interobserver differences, which was used in the literature (Deter et al. 1982; Chalana et al. 1996; Pathak et al. 1997; Gull et al. 2002). Another criterion used was a generalized

Fig. 4. Histograms of GA and BPD and HC measurements made by sonographers for all samples.

Cohen’s kappa (κ), which measures chance-corrected agreement among multiple raters (Berry and Mielke 1988). After evaluation of measurement consistency, the mean signed percent differences between the average automatic measurements and the average manual measurements were computed to assess the agreement between the two sets of measurements. The t-test was used to test if the mean signed percent differences were significantly different from zero. The 95% limits of agreement, which are the mean difference ± 2 SDs of the difference, were computed as indicators of agreement (Bland and Altman 1999, 2003). Linear regression analysis was performed to establish how accurately conventional manual measurements can be obtained by the automatic method after some simple calibration.

RESULTS AND DISCUSSION

Figure 4 shows the histograms of GA, BPD and HC measured by the imaging sonographers for all the sample images. Based on published work (Snijders and Nicolaides 1994), the samples covered a wide range of normal fetuses in terms of these three variables. For the human fetal head, the cephalic index (or eccentricity e) has a mean ($\mu_e$) of 0.783 and a SD ($\sigma_e$) of 0.044 (Hadlock et al. 1981). This general knowledge was used to construct a constraint for validating the parameter solutions in the IRHT: $\mu_e - 3\sigma_e \le e \le \mu_e + 3\sigma_e$, or $0.651 \le e \le 0.915$. About 99.7% of fetal heads have a cephalic index in this range. For the first image set, the IRHT converged to the head for all 11 images in an average of about five iterations. For the second image set, the IRHT converged to the head in 203 of the 206 images in an average of approximately six iterations. Of


Table 1. Consistency of the observers and the algorithm in measuring BPD and HC*

Measurement†   n of images   Interobserver variation (%)   κ for observers excluding initial sonographer   κ for all 8 observers   Interrun variation (%)   κ for 8 runs
BPD_1          11            2.05 (1.79)                   0.929                                           0.926                   0.54 (0.80)              0.982
HC_1           11            1.78 (1.43)                   0.937                                           0.938                   1.61 (1.40)              0.919
BPD_2          203           NA                            NA                                              NA                      0.84 (2.79)              0.977
HC_2           201           NA                            NA                                              NA                      2.08 (3.06)              0.930

* Values in parentheses are SDs; † subscripts indicate image set number.

the 203 images, two manual HC measurements were missing; therefore, there were 203 images for BPD comparison and 201 images for HC comparison. In the three images for which the IRHT failed to converge or converged to a wrong object, the skull contrast was very low and there existed unusually large gaps between skull segments. For both image sets, the total computation time was 1.6 s per image on a PC with an Athlon 1987-MHz CPU (AMD; Sunnyvale, CA, USA). Table 1 lists the interobserver variation, interrun variation and the generalized Cohen’s kappa (κ) for both image sets. All interobserver and interrun variations are small compared with published interobserver variations of 1.0 to 7.0% (Deter et al. 1982; Chalana et al. 1996; Gull et al. 2002). The high κ values (> 0.90) indicate a high level of consistency among the observers and among the algorithm runs. The κ values changed only slightly if the initial sonographer was excluded, indicating that the remaining seven observers agreed with the initial sonographer. It should be acknowledged that these results may not be typical, because not all observers were sonographers. The data, however, were within the published ranges and they at least provided an additional indication of human (if not sonographer) variations. In fact, the interobserver variations were lower (thus, better) than most published data; therefore, the new data constituted a more rigorous standard against which the automatic method was compared. The interrun variations and the κ values for the automatic HC measurement are close to those for the manual measurement. The interrun variations for the automatic BPD measurement are smaller than the current and the reported interobserver variations for manual measurement and the κ values are higher than those for manual measurement. These results

suggest that the automatic method is at least as consistent as the manual method in measuring HC, and it is more consistent than the manual method in measuring BPD. The improved consistency in BPD measurement was apparently because BPD was computed from a fitted head contour ellipse in the automatic method, whereas it was measured from two subjectively selected points in the manual method.

Table 2 shows the results of agreement analysis between the automatic and the manual measurements. The agreement analysis was based only on the second image set, because the sample size of the first image set was too small for a meaningful comparison. The mean signed differences between the average automatic measurements and manual measurements were 0.12% for BPD and −0.52% for HC. The p value for BPD indicates that the mean signed difference was not significantly different from zero, or that there was no significant systematic bias in the automatic BPD measurement. The p value for HC indicates that, in comparison with the manual method, there was a systematic 0.52% underestimation of HC by the automatic method. It should be noted that the difference is not necessarily an error of the automatic method. If the automatic method is used to replace the conventional manual method for gestational age prediction or other purposes, problems are not expected, because a systematic difference can be easily accounted for in the prediction formulae and other uses of the measurement.

Table 2. Agreement between automatic and manual measurements*

Measurement†   n of images   Mean signed difference (%)   p value for t-test   Mean absolute difference (%)   95% limits of agreement (%)
BPD_2          203           0.12 (1.73)                  0.32                 1.19 (1.25)                    −3.34, 3.58
HC_2           201           −0.52 (2.49)                 < 0.01               1.93 (1.66)                    −5.50, 4.45

* Values in parentheses are SDs; † subscripts indicate image set number.

Figure 5 shows scatter plots of the average automatic measurements against the average manual measurements. The closeness of the data points to the line of equality indicates a high degree of agreement between the two measurement methods. Figure 6 shows the measurement differences between the two methods as a function of the mean values of both methods. No apparent trends or significant biases could be seen for BPD. The HC variance of difference seems to show a slight increase with the mean. Transformation of variables (Neter et al. 1996) did not lead to improvements. The histograms of the differences displayed approximately normal distributions for both BPD and HC (not shown). There was, therefore, no evident violation of the assumptions made for the agreement analysis. The 95% limits of agreement were −3.34%, 3.58% for BPD and −5.50%, 4.45% for HC (dashed lines in Fig. 6). These plots and narrow limits further indicate a very good level of agreement between the automatic and the manual measurements.

Linear regression analysis of the manual measurements against the automatic measurements resulted in the following two regression equations:

$$\text{Manual BPD} = 0.993 \times \text{Auto BPD} + 0.352 \ \text{(mm)} \tag{4}$$

$$\text{Manual HC} = 0.997 \times \text{Auto HC} + 2.135 \ \text{(mm)} \tag{5}$$

The coefficient of determination ($R^2$) was 0.995 for eqn (4) and 0.988 for eqn (5). The high $R^2$ values indicated that nearly all the variations in the manual BPD and HC measurements were, respectively, accounted for by the automatic BPD and HC measurements. The 95% confidence intervals of both estimated slopes included unity, indicating that there was little slope error. The 0.352-mm intercept in eqn (4) was not statistically significant. The systematic difference in HC was accounted for by the nonzero intercept in eqn (5). The agreement analysis and regression analysis both show that there was little difference between the two methods for BPD measurement and only a constant offset for HC measurement, which can be easily corrected. For the samples analyzed, minimal errors would occur if the automatic measurements, with the HC measurement corrected by a constant, were used to replace the manual measurements for fetal examinations. There was, however, only one set of manual measurements available for the second image set. To make eqns (4) and (5) generally applicable, they must be verified with additional data from more sonographers, patients and US machines.

Fig. 5. Scatterplots of automatic vs. manual measurements for (a) BPD and (b) HC.

Fig. 6. Differences between automatic and manual measurements as function of their means for (a) BPD and (b) HC. ( - - - - ) 95% limits.
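The consistency and agreement statistics used in this paper are simple enough to sketch in a few lines of Python. The block below is an illustrative sketch, not the authors' code: the function names are hypothetical, NumPy is assumed, the percent difference is assumed to use the manual measurement as its denominator and the SD is assumed to be the sample SD (ddof = 1), neither of which the text specifies.

```python
from itertools import combinations

import numpy as np


def pairwise_variation(x):
    """Eqn (3): mean absolute percent difference over all observer (or run) pairs.

    x has shape (Q, N): Q observers or runs, N images.
    """
    x = np.asarray(x, dtype=float)
    n_images = x.shape[1]
    pairs = list(combinations(range(x.shape[0]), 2))  # all pairs with p < q
    total = sum(np.abs((x[p] - x[q]) / x[q]).sum() for p, q in pairs)
    return 100.0 * total / (n_images * len(pairs))


def signed_percent_difference(auto, manual):
    """Signed percent difference of automatic vs. manual measurements.

    Assumption: the manual value is the reference in the denominator.
    """
    auto = np.asarray(auto, dtype=float)
    manual = np.asarray(manual, dtype=float)
    return 100.0 * (auto - manual) / manual


def limits_of_agreement(diff):
    """Return (mean, lower, upper) with limits at mean +/- 2 SD of the differences."""
    diff = np.asarray(diff, dtype=float)
    mean = diff.mean()
    sd = diff.std(ddof=1)  # sample SD (assumption)
    return mean, mean - 2.0 * sd, mean + 2.0 * sd
```

As a consistency check on the reported figures, the BPD row of Table 2 gives a mean of 0.12% and an SD of 1.73%, so the limits are 0.12 ± 3.46, that is −3.34% and 3.58%, which matches the values quoted for BPD.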


Until the algorithm is further improved through test use and additional experimentation, the method is not yet foolproof. There will be cases where the automatic method may fail or lead to incorrect results. Errors will occur, for example, when a head skull does not exist in an image because of improper detector positioning or it is not detectable because of very low image contrast or very large gaps. A method is needed to verify that a resulting ellipse represents a real head contour. This supervisory verification could be done automatically when enough data and experience are accumulated from test use of the algorithm. Before this happens, the algorithm will need some user attention to the resulting ellipses or the validity of the measurement values, as has also been suggested by others (Chalana et al. 1996; Pathak et al. 1997). Nonetheless, the sonographer is relieved of the tedious manual marking, tracing or fitting in most cases, so that he or she can focus on acquiring good images which, in turn, improves the head detection by the automatic method. Furthermore, the automatic method brings the benefits of reduced measurement subjectivity and improved measurement consistency.

SUMMARY

A new method is presented for automating fetal head measurements in US images. Tests with real clinical images showed that measurements made by the method are consistent and in good agreement with the conventional manual measurements. The algorithm is computationally efficient and does not require user input. It provides a potentially very useful tool for routine pregnancy examinations and obstetric diagnosis.

REFERENCES

Athey P, Hadlock F. Ultrasound in obstetrics and gynecology. St. Louis, MO: CV Mosby, 1985.
Berry K, Mielke P. A generalization of Cohen’s Kappa agreement measure to interval measurement and multiple raters. Educ Psychol Meas 1988;48:921–933.
Bland JM, Altman DG. Measuring agreement in method comparison studies. Stat Methods Med Res 1999;8(2):135–160.
Bland JM, Altman DG. Applying the right statistics: Analyses of measurement studies. Ultrasound Obstet Gynecol 2003;22(1):85–93.
Chalana V, Winter TC 3rd, Cyr DR, Haynor DR, Kim Y. Automatic fetal head measurements from sonographic images. Acad Radiol 1996;3(8):628–635.
Deter RL, Harrist RB, Hadlock FP, Carpenter RJ. Fetal head and abdominal circumferences: I. Evaluation of measurement errors. J Clin Ultrasound 1982;10(8):357–363.
Duda RO, Hart PE, Stork DG. Pattern classification. New York, NY: John Wiley & Sons, 2001.
Dudley NJ, Chapman E. The importance of quality management in fetal measurement. Ultrasound Obstet Gynecol 2002;19(2):190–196.
Gull I, Fait G, Har-Toov J, et al. Prediction of fetal weight by ultrasound: The contribution of additional examiners. Ultrasound Obstet Gynecol 2002;20(1):57–60.
Hadlock F. Ultrasound evaluation of fetal growth. In: Callen P, ed. Ultrasonography in obstetrics and gynecology. Philadelphia, PA: Saunders, 1988:129–142.
Hadlock FP, Deter RL, Carpenter RJ, Park SK. Estimating fetal age: Effect of head shape on BPD. Am J Roentgenol 1981;137(1):83–85.
Hadlock FP, Deter RL, Harrist RB, Park SK. Computer assisted analysis of fetal age in the third trimester using multiple fetal growth parameters. J Clin Ultrasound 1983;11(6):313–316.
Hadlock FP, Deter RL, Harrist RB, Park SK. Estimating fetal age: Computer-assisted analysis of multiple fetal growth parameters. Radiology 1984;152(2):497–501.
Hanna CW, Youssef ABM. Automated measurements in obstetric ultrasound images. Proceedings of the 1997 International Conference on Image Processing. Part 3, Oct 26–29 1997, Santa Barbara, CA, USA. Los Alamitos, CA: IEEE Comp Soc, 1997:504–507.
Leavers VF. The dynamic generalized Hough transform: Its relationship to the probabilistic Hough transforms and an application to the concurrent detection of circles and ellipses. CVGIP Image Understanding 1992;56(3):381–398.
Lu W. Hough transforms for shape identification and applications in medical image processing. PhD dissertation. Columbia, MO: University of Missouri - Columbia, 2003.
Lu W, Tan J. Segmentation of ultrasound fetal images. Biological quality and precision agriculture II, Nov. 6–8, 2000, Boston, MA. Soc Photo-Optical Instrum Eng 2000;4203:81–90.
Matsopoulos GK, Marshall S. Use of morphological image processing techniques for the measurement of a fetal head from ultrasound images. Pattern Recogn 1994;27(10):1317–1324.
Neter J, Kutner M, Nachtsheim CJ, Wasserman W. Applied linear regression models. Chicago, IL: IRWIN, 1996.
Pathak SD, Chalana V, Kim Y. Interactive automatic fetal head measurements from ultrasound images using multimedia computer technology. Ultrasound Med Biol 1997;23(5):665–673.
Salari V, Zador I, Chik L, Sokol R. Automated measurements of fetal head from ultrasound images. Medical imaging IV: Image processing, Feb 6–8, 1990, Newport Beach, CA, USA. vol. 1233. Bellingham, WA: Int Soc Optical Eng, 1990:213–216.
Snijders RJ, Nicolaides KH. Fetal biometry at 14–40 weeks’ gestation. Ultrasound Obstet Gynecol 1994;4(1):34–48.
Thomas JG, Peters RA II, Jeanty P. Automatic segmentation of ultrasound images using morphological operators. IEEE Trans Med Imaging 1991;10(2):180–186.
Tou JT, Gonzalez RC. Pattern recognition principles. Reading, MA: Addison-Wesley, 1974.
William J, Roya S. Introduction to ultrasound. Philadelphia, PA: WB Saunders, 1998.
Wu W-Y, Wang M-JJ. Elliptical object detection by using its geometric properties. Pattern Recogn 1993;26(10):1499–1509.
Xu L, Oja E. Randomized Hough transform (RHT): Basic mechanisms, algorithms, and computational complexities. CVGIP Image Understanding 1993;57(2):131–154.
Xu L, Oja E, Kultanen P. A new curve detection method: Randomized Hough transform (RHT). Pattern Recogn Lett 1990;11:331–338.
Zador IE, Salari V, Chik L, Sokol RJ. Ultrasound measurement of the fetal head: Computer versus operator. Ultrasound Obstet Gynecol 1991;1(3):208–211.