Ultrasound in Med. & Biol., Vol. 38, No. 9, pp. 1508–1513, 2012 Copyright © 2012 World Federation for Ultrasound in Medicine & Biology Printed in the USA. All rights reserved 0301-5629/$ - see front matter
http://dx.doi.org/10.1016/j.ultrasmedbio.2012.05.017
Original Contribution
ULTRASOUND ELASTOGRAPHY FOR THYROID NODULES: A RELIABLE STUDY?
JAE KYUN KIM,* JUNG HWAN BAEK,† JEONG HYUN LEE,† JONG LIM KIM,† EUN JU HA,† TAE YONG KIM,‡ WON BAE KIM,‡ and YOUNG KEE SHONG‡
* Department of Radiology, Chung-Ang University College of Medicine, Seoul, Korea; † Department of Radiology and Research Institute of Radiology, University of Ulsan College of Medicine, Asan Medical Center, Seoul, Korea; and ‡ Department of Endocrinology and Metabolism, University of Ulsan College of Medicine, Asan Medical Center, Songpa-Gu, Seoul, Korea
(Received 28 November 2011; revised 28 April 2012; in final form 21 May 2012)
Abstract—The aims of this study were to determine the reliability of ultrasound elastography (USE) and the factors related to reliability. One hundred six solid thyroid nodules in 78 consecutive patients were enrolled. Conventional ultrasound examination and USE were performed for each nodule. We evaluated reliability, the factors affecting reliability of USE and the interobserver and intraobserver agreement. We suggest the following three criteria for less reliable results: (1) <50% green color in the region of interest box for the thyroid parenchyma; (2) discordance in elasticity scores in the three USE images; and (3) intranodular color signal loss. Consensual reliability of USE was 68% (72/106). Multivariable logistic regression analysis revealed that rim calcification (p = 0.002), a compressive force of ≥3 (p < 0.001) and arterial pulsation (p < 0.001) were significantly associated with reliability of USE. Substantial interobserver (κ = 0.738) and intraobserver (κ = 0.765) agreement were observed in reliable USE results. Clinical application of USE should be restricted to the thyroid nodules with reliable results. (E-mail: [email protected]) © 2012 World Federation for Ultrasound in Medicine & Biology.
Key Words: Ultrasound elastography, Ultrasound, Thyroid nodule, Reliability.
INTRODUCTION
Ultrasound elastography (USE) measures tissue deformation in response to compression and displays tissue stiffness (Lerner et al. 1990; Ophir et al. 1991; Gao et al. 1996; Ophir et al. 1999; Greenleaf et al. 2003). USE is useful in differentiating between benign and malignant tumors of the breast, prostate and thyroid gland (Garra et al. 1997; Cochlin et al. 2002; Thomas et al. 2006; Pallwein et al. 2007). The sensitivity and specificity of USE for differentiating thyroid nodules have been reported to be 82%–97% and 77.5%–100%, respectively (Lyshchik et al. 2005; Rago et al. 2007; Asteria et al. 2008; Dighe et al. 2008; Hong et al. 2009) and are, therefore, superior to the respective values of 83.3% and 74% reported for grayscale ultrasound (US) (Moon et al. 2008). However, Kagoya et al. demonstrated high sensitivity (90%) but low specificity (50%) and Park et al. found a lack of reliable interobserver agreement for USE compared with grayscale US in the diagnosis of malignant thyroid nodules (Park et al. 2009; Kagoya et al. 2010). These results suggest that interpretation of the results of USE can be subjective.
Various factors may influence the rating of the elasticity score, such as nodule size, exophytic location, rim calcification, thyroiditis and motion artifacts (Lyshchik et al. 2005; Rago et al. 2007; Asteria et al. 2008; Dighe et al. 2008; Hong et al. 2009). In a study by Hong et al., incorrect USE results were obtained with nodules protruding from the thyroid capsule (Hong et al. 2009). Many authors agree that coarse or peripheral rim calcifications cause incorrect USE results (Lyshchik et al. 2005; Rago et al. 2007; Asteria et al. 2008; Hong et al. 2009). Calcifications within nodules increase nodular stiffness and, thus, probe compression does not result in tissue strain deformation (Asteria et al. 2008). Hong et al. reported that two nodules in patients with subacute thyroiditis showed malignant USE findings (Hong et al. 2009). However, Dighe et al. reported benign USE results in four patients with lymphocytic thyroiditis (Dighe et al. 2008). Several factors such as those secondary to
Address correspondence to: Jung Hwan Baek, M.D., Department of Radiology and Research Institute of Radiology, University of Ulsan College of Medicine, Asan Medical Center, 86 Asanbyeongwon-Gil, Songpa-Gu, Seoul 138-736, Korea. E-mail:
[email protected]
Ultrasound elastography for thyroid nodules ● J. K. KIM et al.
pulsation of the carotid artery, out-of-plane motion, swallowing and breathing have also been proposed as causes of incorrect USE results (Lyshchik et al. 2005; Asteria et al. 2008). Investigators may vary in their rating of elasticity, a measure that is influenced by several factors. Repeated examination may, therefore, fail to generate reliable elasticity scores.
The purpose of the present study was threefold: (1) to evaluate the reliability of USE; (2) to determine the factors affecting reliability; and (3) to assess interobserver and intraobserver agreement concerning reliability.
MATERIALS AND METHODS
Patients
The protocol of the present retrospective study was approved by the Institutional Review Board of Asan Medical Center and neither patient approval nor informed consent was required for the review of images and clinical records. Informed consent had been obtained from all patients prior to the performance of US-guided fine needle aspiration cytology (FNAC). US-guided FNAC was performed for 132 thyroid nodules in 97 consecutive patients between September 2009 and October 2009. This study was restricted to patients with solid thyroid nodules (solid portion >90%). Therefore, data concerning a total of 106 thyroid nodules from 78 patients were analyzed (mean age, 51.8 years; range, 23–76; 14 males and 64 females).
Imaging and image analysis
Immediately prior to the commencement of the present study, a 4-week USE study was performed to collect preliminary data and to formulate the present study protocol. A review of the literature concerning USE for thyroid and other lesions was performed to evaluate the factors influencing the results of USE. On the basis of the preliminary study and literature review, use of the standard USE technique was selected for the present study and factors with a potential influence on the results of USE were identified. Conventional US, USE and US-guided FNAC were performed by one radiologist (B.J.H.) 
with 17 years of experience in thyroid US and 2 years of experience in thyroid USE. Conventional US and USE examinations were performed using a Hitachi Logos E, EUB-7500 (Hitachi Medical System, Tokyo, Japan) and a 6–14 MHz linear probe. Conventional transverse and longitudinal US images were obtained for each nodule, followed by USE. During the conventional US examination, the following features were evaluated: largest diameter of the nodule; anteroposterior lengths (nodule, thyroid gland
and muscle overlying the thyroid gland); the presence of rim calcification; location (upper, mid, lower pole and isthmus; medial or lateral portion of each thyroid lobe); degree of contact with the carotid artery; nodule motion secondary to arterial pulsation; bulging of the nodule beyond the thyroid gland; and the presence of thyroiditis. Anteroposterior lengths were measured at the midportion of the respective structure. The degree of contact with the carotid artery was classified into two categories: no contact or a contact angle with the carotid artery of <90°, or a contact angle with the carotid artery of >90°. Nodule motion secondary to arterial pulsation was evaluated using real-time US and was classified as positive or negative. Pulsation was classified as positive when arterial pulsation cyclically shifted the thyroid lobe medially. During USE examination, the neck was slightly extended to prevent overstretching of the neck muscles. The probe was applied to the neck with light pressure by hand. A region-of-interest (ROI) box included the nodule, the entire thyroid gland visible in the US image and the surrounding soft tissue and superficial muscles. The USE image was displayed over the grayscale image according to a color scale that ranges from red for components with the greatest elastic strain to blue for components with no strain. The level of compressive force was maintained between 2 and 4 throughout the examination and could be monitored in real time on the US monitor. To minimize motion of the thyroid gland, the patient was asked to avoid swallowing and to hold their breath during the examination. If motion of the thyroid gland secondary to arterial pulsation was detected, the compressive force was gradually increased up to level 4. Three video files (duration: 10–20 s) were recorded for the USE examination of each nodule. All video files were reviewed and multiple static USE images were captured. Among the captured USE images, two investigators (B.J.H. 
and K.J.L., with 6 years of experience in thyroid US and 6 months of experience in USE) selected by consensus the three best-quality images (i.e., those which showed the most homogeneously green color for the thyroid parenchyma). We defined homogeneous green color as green color filling more than 90% of the ROI box. In cases of discrepancy in the evaluation of reliability between the two investigators, another investigator (L.J.H., with 13 years of thyroid US experience and 6 months of USE experience) reviewed the images to decide whether the images were reliable or not. The USE result was defined as being less reliable when one of the following criteria was observed: (1) <50% green color in the ROI box for the thyroid parenchyma; (2) discordance in elasticity scores in the three USE images; and (3) intranodular color signal loss. One month later, one investigator (B.J.H.) repeated the measurement in
the same manner in the randomly arranged subjects to evaluate intraobserver agreement. Each USE image was scored according to the classification proposed by Asteria et al. (Asteria et al. 2008). All numerical values in the present study were the average value for the three selected USE images.
Statistical analysis
Interobserver and intraobserver agreement were evaluated using Cohen's κ analysis. Bivariate analysis was performed to evaluate the relationship between the characteristics and the reliability of USE. The χ² test or Fisher's exact test was used for categorical variables. For continuous variables, the largest diameter of the nodule was dichotomized using a cutoff value of 10 mm and compressive force was converted to a dichotomous variable according to the median value. Other continuous variables (anteroposterior lengths of the nodule, thyroid gland and muscle overlying the thyroid gland) were evaluated using a t-test or the Mann–Whitney U test, depending on the result of normality testing. Reliability of USE was regarded as the dependent variable. Multivariable logistic regression analysis was performed to evaluate interaction between characteristics. All p values were calculated using the two-tailed test. A p < 0.05 was considered significant. Calculations were performed using SPSS 14.0 for Windows (SPSS, Chicago, IL, USA).
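The three criteria for a less reliable USE result, as defined under Imaging and image analysis, amount to a simple rule that can be sketched in Python. This is an illustration only and not software used in the study: the `UseExam` record, its field names and the example values are hypothetical, and reading "discordance" as "the three scores are not all identical" is an assumption, since the text does not define it further.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class UseExam:
    """One nodule's USE examination (hypothetical record layout)."""
    green_fraction: float         # share of parenchyma in the ROI box shown in green, 0 to 1
    elasticity_scores: List[int]  # elasticity scores of the three selected USE images
    color_signal_loss: bool       # intranodular color signal loss observed


def less_reliable_reasons(exam: UseExam) -> List[str]:
    """Return the criteria (1)-(3) met by this examination; empty if none."""
    reasons = []
    if exam.green_fraction < 0.50:
        reasons.append("criterion 1: <50% green color in ROI box")
    # Assumption: "discordance" means the three scores are not all identical.
    if len(set(exam.elasticity_scores)) > 1:
        reasons.append("criterion 2: discordant elasticity scores")
    if exam.color_signal_loss:
        reasons.append("criterion 3: intranodular color signal loss")
    return reasons


def is_reliable(exam: UseExam) -> bool:
    """A result counts as reliable only when none of the three criteria is met."""
    return not less_reliable_reasons(exam)


# Hypothetical examples (not study data).
stable = UseExam(green_fraction=0.92, elasticity_scores=[2, 2, 2], color_signal_loss=False)
pulsating = UseExam(green_fraction=0.80, elasticity_scores=[4, 3, 1], color_signal_loss=True)
print(is_reliable(stable), less_reliable_reasons(pulsating))
```

Returning the list of criteria met, rather than a single flag, keeps overlapping causes visible, which matters because one nodule can meet more than one criterion.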
RESULTS
Among the 78 patients, 53 patients had one nodule, 22 patients had two nodules and three patients had three nodules. The mean diameter of the thyroid nodules was 1.2 ± 1.0 cm (range, 0.3–5.7 cm).
Consensual reliability of USE was 68% (72/106). A total of 32% (34/106) of the thyroid nodules showed less reliable results. The causes of less reliable results were as follows: (1) <50% green color in the ROI box for the thyroid parenchyma (n = 13) (Fig. 1); (2) discordance in elasticity scores in the three USE images (n = 21) (Fig. 2); (3) intranodular color signal loss (n = 22) (Figs. 2 and 3); and (4) the presence of two or more of the preceding causes (n = 26). The causes of intranodular color signal loss were rim calcification (n = 6) and arterial pulsation (n = 16).
In the bivariate analyses, upper and isthmic location, rim calcification, compressive force and arterial pulsation were significantly associated with reliability (Table 1). However, largest diameter, location (medial vs. lateral), bulging of the nodule beyond the thyroid gland, thyroiditis and mean anteroposterior lengths (nodule, thyroid gland and muscle overlying the thyroid gland) were not significantly associated with reliability (Tables 1 and 2). Variables with p < 0.1 in the bivariate analyses were entered into the multivariable analysis. Multivariable logistic regression analysis revealed that rim calcification (p = 0.002), compressive force ≥3 (p < 0.001), and nodule motion secondary to arterial pulsation (p < 0.001) were significant independent variables for reliability (Table 3). Interobserver and intraobserver agreement for reliable USE results were substantial (Landis and Koch 1977), with κ values of 0.738 and 0.765, respectively.
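Cohen's κ, the agreement statistic used in this study, compares the observed proportion of agreement with the proportion expected by chance from each reader's marginal frequencies. The Python sketch below shows the calculation and the verbal categories of Landis and Koch (1977). The two rating lists are hypothetical and only illustrate the arithmetic; they are not the study readings, and the function names are illustrative.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two readings of the same items (nominal labels)."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: share of items given the same label in both readings.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: sum over labels of the product of marginal frequencies.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    p_chance = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both readings used one identical label throughout
    return (p_observed - p_chance) / (1.0 - p_chance)


def landis_koch(kappa):
    """Verbal strength-of-agreement label after Landis and Koch (1977)."""
    if kappa < 0.0:
        return "poor"
    for upper, label in ((0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")):
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical readings: R = reliable, L = less reliable (not study data).
first_reading = ["R", "R", "R", "R", "R", "R", "L", "L", "L", "L"]
second_reading = ["R", "R", "R", "R", "R", "L", "L", "L", "L", "R"]
kappa = cohen_kappa(first_reading, second_reading)
print(round(kappa, 3), landis_koch(kappa))
```

Under this scale, the κ values of 0.738 and 0.765 reported above both fall in the 0.61–0.80 band labeled substantial.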
DISCUSSION
The present study demonstrated that reliable USE results were obtained for 68% of the nodules and these were significantly associated with nodule motion, rim
Fig. 1. Longitudinal ultrasound elastography (USE) image in a 64-year-old man captured from a video file. Green color in the thyroid parenchyma does not exceed 50% in the USE image. This USE image was considered a less reliable result.
Fig. 2. Transverse and longitudinal ultrasound elastography (USE) images in a 62-year-old female with an incidental thyroid nodule show discordance of elasticity scores across serial images. The same nodule (white arrows) shows different elasticity scores across serial images. In the longitudinal scan (a–c), elasticity scores were 4, 3 and 1, respectively. In the transverse scan (d and e), arterial pulsation induced motion artifact in the thyroid parenchyma and the target nodule. The two USE images show differing elasticity scores. Color signal loss was detected in the thyroid nodule (arrowheads) and the carotid artery (black arrow).
calcification and compressive force. For nodules showing reliable USE results, interobserver and intraobserver agreement was substantial. This result, therefore, differs from that of Park et al. (Park et al. 2009), who assessed interobserver agreement for all thyroid nodules, irrespective of the reliability of the USE result.
Many investigators have reported high sensitivity and specificity of USE (Lyshchik et al. 2005; Rago et al. 2007, 2010; Asteria et al. 2008; Dighe et al. 2008; Hong et al. 2009; Bojunga et al. 2010; Sebag et al. 2010). Recently, Sebag et al. estimated the cutoff level of the elasticity index for malignancy at 65 kPa (Sebag et al. 2010). The elasticity index was 150 ± 95 kPa (range, 30–356) in malignant nodules vs. 36 ± 30 kPa (range, 0–200) in benign nodules. However, several authors have pointed out problems with USE (Park et al. 2009; Hegedus 2010; Kagoya et al. 2010). Hegedus identified several problems in USE studies, including small sample sizes, inadequate account of selection criteria, unclear definition of what constitutes a consecutive group of patients, focus on patients with solitary nodules and lack of surgical confirmation (Hegedus 2010). In addition, there have been no definite criteria for a proper USE image, so we sought to propose such criteria.
The present authors have used USE to differentiate between malignant and benign nodules in clinical
Fig. 3. Longitudinal ultrasound elastography (USE) image in a 62-year-old female with intranodular color signal loss due to calcification. In the longitudinal USE image, a thyroid nodule with rim calcification shows intranodular color signal loss (arrowhead) due to reflection of the ultrasound beam at the surface of the rim calcification.
practice. In our experience, there are several factors that preclude the application of USE results in differentiating malignant and benign thyroid nodules. To exclude these factors, reliable USE results were defined according to three criteria. Application of these criteria showed that 32% of nodules had less reliable USE results. For this reason, USE examination should not be applied to all thyroid nodules.
Nodule motion secondary to arterial pulsation was significantly associated with USE reliability. Sixteen of the 18 nodules affected by arterial pulsation showed intranodular color signal loss (Fig. 2). Inconsistent information concerning nodule position because of arterial pulsation may cause intranodular signal loss. Since elasticity scores are assigned on the basis of intranodular color signal, such signal loss renders scoring impossible. Lyshchik et al. also suggested that carotid artery pulsation causes noise in USE (Lyshchik et al. 2005). The degree of contact with the carotid artery was not significantly related to USE reliability. We suggest that nodule motion was induced by the arterial pulsation rather than by the degree of contact with the carotid artery.
Rago et al. proposed that nodules with a calcified shell should be excluded from USE evaluation since the US beam does not cross the calcification and the compression does not result in tissue strain deformation (Rago et al. 2007). Hong et al. also demonstrated that two hyperplastic nodules with rim calcifications showed malignant USE features (Hong et al. 2009). In the present study, six of the seven nodules with rim calcifications showed intranodular signal loss, resulting in less reliable USE results (Fig. 3).
To obtain reliable results, a constant level of compressive force should be maintained throughout the USE examination (Rago et al. 2007). In the present study, a compressive force of between 2 and 4 was maintained, as in the study by Asteria et al. (Asteria et al. 2008). However, a compressive force of ≥3 significantly increased USE reliability (p < 0.001). This can be explained by the fact that the increased compressive force restricted nodule movement secondary to arterial pulsation.
Table 1. Bivariate analysis revealing relationships between characteristics on conventional US and reliability of ultrasound elastography (USE)
Variables                       Reliable    Less reliable   p value
Location (upper and isthmic)    59 (82%)    20 (59%)        0.011†
Location (medial)               36 (50%)    19 (56%)        0.572
Nodule motion*                  2 (2.8%)    16 (47%)        <0.001†
Bulging                         34 (47%)    20 (59%)        0.265
Rim calcification               1 (1.4%)    6 (18%)         0.004†
Compressive force (≥3)          60 (83%)    15 (44%)        <0.001†
Thyroiditis                     31 (43%)    19 (56%)        0.217
Nodule size (≥10 mm)            16 (22%)    5 (15%)         0.365
Contact with CA (≥90°)          23 (32%)    8 (24%)         0.374
CA = carotid artery. * Nodule motion secondary to arterial pulsation. † Statistically significant finding.
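The bivariate comparison for rim calcification can be checked by hand from the counts in Table 1 (1 of 72 reliable vs. 6 of 34 less reliable nodules). The Python sketch below implements a two-sided Fisher's exact test with the standard library only. It assumes that Fisher's exact test, rather than the χ² test, was the test applied to this variable, which the text does not state explicitly; the function name and the cell layout are illustrative.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(p for p in (prob(x) for x in range(lowest, highest + 1))
               if p <= p_observed * (1 + 1e-9))


# Rows: reliable (n = 72), less reliable (n = 34).
# Columns: rim calcification present, absent (absent counts derived from Table 1).
p_value = fisher_exact_two_sided(1, 71, 6, 28)
print(f"p = {p_value:.4f}")
```

With these counts the result should land close to the p value of 0.004 listed for rim calcification in Table 1, although the original analysis was run in SPSS and small differences in method are possible.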
Table 2. Relationship between anteroposterior (A-P) lengths of nodule, thyroid gland, muscle and reliability of ultrasound elastography (USE)
Anteroposterior lengths   Reliable         Less reliable    p value
Nodule                    0.690 (0.397)    0.689 (0.518)    0.565
Thyroid gland             1.328 (0.376)    1.243 (0.456)    0.178
Muscle*                   0.298 (0.204)    0.293 (0.252)    0.509
Note: Values in parentheses are standard deviations. * Muscle overlying the thyroid gland.
Table 3. Multivariable logistic regression analysis revealing relationships between characteristics on conventional US and reliability of ultrasound elastography (USE)
Variables                 Odds ratio   95% CI       p value
Arterial pulsation        86.0         13.1–562.8   <0.001
Compressive force (≥3)    14.4         3.5–59.5     <0.001
Rim calcification         56.1         4.7–674.6    0.002
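The odds ratios in Table 3 are adjusted estimates from the multivariable model and cannot be reproduced from the published counts alone. For orientation, a crude (unadjusted) odds ratio can be computed from the Table 1 counts. The Python sketch below does this for compressive force ≥3 and adds a Woolf (log-based) 95% confidence interval. The Woolf interval is a choice made for this illustration and not a method named in the paper, and the crude value is expected to differ from the adjusted 14.4.

```python
from math import exp, log, sqrt


def crude_odds_ratio(a, b, c, d, z=1.96):
    """Crude odds ratio and Woolf (log-based) CI for the 2x2 table [[a, b], [c, d]]."""
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Woolf interval")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log)
    upper = exp(log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Rows: reliable (n = 72), less reliable (n = 34).
# Columns: compressive force >= 3, < 3 (the "< 3" counts are 72 - 60 and 34 - 15).
estimate, ci_low, ci_high = crude_odds_ratio(60, 12, 15, 19)
print(f"crude OR = {estimate:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

The gap between a crude estimate of this kind and the adjusted odds ratio reflects the other variables in the model, here arterial pulsation and rim calcification.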
In conclusion, the definition of a reliable USE image is important in clinical practice. According to the reliable USE criteria suggested in the present study, 68% of the thyroid nodules showed reliable results. For nodules showing reliable USE results, interobserver and intraobserver agreement was substantial. Therefore, clinical application of USE should be restricted to the thyroid nodules with reliable results.
REFERENCES
Asteria C, Giovanardi A, Pizzocaro A, Cozzaglio L, Morabito A, Somalvico F, Zoppo A. US-elastography in the differential diagnosis of benign and malignant thyroid nodules. Thyroid 2008;18:523–531.
Bojunga J, Herrmann E, Meyer G, Weber S, Zeuzem S, Friedrich-Rust M. Real-time elastography for the differentiation of benign and malignant thyroid nodules: A meta-analysis. Thyroid 2010;20:1145–1150.
Cochlin DL, Ganatra RH, Griffiths DF. Elastography in the detection of prostatic cancer. Clin Radiol 2002;57:1014–1020.
Dighe M, Bae U, Richardson ML, Dubinsky TJ, Minoshima S, Kim Y. Differential diagnosis of thyroid nodules with US elastography using carotid artery pulsation. Radiology 2008;248:662–669.
Gao L, Parker KJ, Lerner RM, Levinson SF. Imaging of the elastic properties of tissue–a review. Ultrasound Med Biol 1996;22:959–977.
Garra BS, Cespedes EI, Ophir J, Spratt SR, Zuurbier RA, Magnant CM, Pennanen MF. Elastography of breast lesions: Initial clinical results. Radiology 1997;202:79–86.
Greenleaf JF, Fatemi M, Insana M. Selected methods for imaging elastic properties of biological tissues. Annu Rev Biomed Eng 2003;5:57–78.
Hegedus L. Can elastography stretch our understanding of thyroid histomorphology? J Clin Endocrinol Metab 2010;95:5213–5215.
Hong Y, Liu X, Li Z, Zhang X, Chen M, Luo Z. Real-time ultrasound elastography in the differential diagnosis of benign and malignant thyroid nodules. J Ultrasound Med 2009;28:861–867.
Kagoya R, Monobe H, Tojima H. Utility of elastography for differential diagnosis of benign and malignant thyroid nodules. Otolaryngol Head Neck Surg 2010;143:230–234.
Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics 1977;33:159–174.
Lerner RM, Huang SR, Parker KJ. "Sonoelasticity" images derived from ultrasound signals in mechanically vibrated tissues. Ultrasound Med Biol 1990;16:231–239.
Lyshchik A, Higashi T, Asato R, Tanaka S, Ito J, Mai JJ, Pellot-Barakat C, Insana MF, Brill AB, Saga T, Hiraoka M, Togashi K. Thyroid gland tumor diagnosis at US elastography. Radiology 2005;237:202–211.
Moon WJ, Jung SL, Lee JH, Na DG, Baek JH, Lee YH, Kim J, Kim HS, Byun JS, Lee DH. Benign and malignant thyroid nodules: US differentiation—multicenter retrospective study. Radiology 2008;247:762–770.
Ophir J, Alam SK, Garra B, Kallel F, Konofagou E, Krouskop T, Varghese T. Elastography: Ultrasonic estimation and imaging of the elastic properties of tissues. Proc Inst Mech Eng H 1999;213:203–233.
Ophir J, Cespedes I, Ponnekanti H, Yazdi Y, Li X. Elastography: A quantitative method for imaging the elasticity of biological tissues. Ultrason Imaging 1991;13:111–134.
Pallwein L, Mitterberger M, Struve P, Pinggera G, Horninger W, Bartsch G, Aigner F, Lorenz A, Pedross F, Frauscher F. Real-time elastography for detecting prostate cancer: Preliminary experience. BJU Int 2007;100:42–46.
Park SH, Kim SJ, Kim EK, Kim MJ, Son EJ, Kwak JY. Interobserver agreement in assessing the sonographic and elastographic features of malignant thyroid nodules. AJR Am J Roentgenol 2009;193:W416–W423.
Rago T, Santini F, Scutari M, Pinchera A, Vitti P. Elastography: New developments in ultrasound for predicting malignancy in thyroid nodules. J Clin Endocrinol Metab 2007;92:2917–2922.
Rago T, Scutari M, Santini F, Loiacono V, Piaggi P, Di Coscio G, Basolo F, Berti P, Pinchera A, Vitti P. Real-time elastosonography: Useful tool for refining the presurgical diagnosis in thyroid nodules with indeterminate or nondiagnostic cytology. J Clin Endocrinol Metab 2010;95:5274–5280.
Sebag F, Vaillant-Lombard J, Berbis J, Griset V, Henry JF, Petit P, Oliver C. Shear wave elastography: A new ultrasound imaging mode for the differential diagnosis of benign and malignant thyroid nodules. J Clin Endocrinol Metab 2010;95:5281–5288.
Thomas A, Fischer T, Frey H, Ohlinger R, Grunwald S, Blohmer JU, Winzer KJ, Weber S, Kristiansen G, Ebert B, Kummel S. Real-time elastography—An advanced method of ultrasound: First results in 108 patients with breast lesions. Ultrasound Obstet Gynecol 2006;28:335–340.