Interobserver agreement in the evaluation of B-lines using bedside ultrasound


Journal of Critical Care 30 (2015) 1395–1399


Journal of Critical Care journal homepage: www.jccjournal.org

Interobserver agreement in the evaluation of B-lines using bedside ultrasound ☆

John Gullett, MD a,⁎, John P. Donnelly, MSPH b, Richard Sinert, DO c, Bill Hosek, MD, MBA d, Drew Fuller, MD, MPH, FACEP e, Hugh Hill, MD a, Isadore Feldman, MD a, Giorgio Galetto, MD a, Martin Auster, MD f, Beatrice Hoffmann, MD, PhD, RDMS g

a Department of Emergency Medicine, Johns Hopkins University, Baltimore, MD 21224
b University of Alabama at Birmingham, Department of Emergency Medicine, Birmingham, AL 35249
c Department of Emergency Medicine, SUNY Downstate Medical Center, Brooklyn, NY 11203
d Department of Emergency Medicine, Onslow Memorial Hospital, Jacksonville, NC 28546
e Department of Emergency Medicine, Calvert Memorial Hospital, Prince Frederick, MD 20678
f Department of Radiology, Johns Hopkins University, Baltimore, MD 21224
g Emergency Ultrasound Education and Fellowship, Department of Emergency Medicine, Beth Israel Deaconess Medical Center, Boston, MA 02215

Keywords: Emergency ultrasound; B-line; Lung ultrasound; Alveolar interstitial syndrome; Comet tail; Inter-rater reliability

Abstract

Purpose: We evaluated agreement among trained emergency physicians assessing the degree of B-line presence on bedside ultrasound in patients presenting to the emergency department (ED) with acute undifferentiated dyspnea. We also determined which thoracic zones offered the highest level of interobserver reliability for sonographic B-line assessment.

Materials and methods: We evaluated a prospective convenience sample of adult patients presenting with dyspnea to an academic ED. Two consecutive bedside lung ultrasounds were performed on 91 patients by a pair of physician-sonographers. The lung ultrasounds were structured 10-zone thoracic sonograms, documented as videos. Sonographer pairs were expert/expert (>100 lung ultrasounds performed) or expert/novice pairs (novices performed 5 supervised examinations after structured training) and were blinded to clinical data. Sonographers reported B-line concentration with 3 assessment methods: (1) normal (<3 B-lines) or abnormal (≥3 B-lines); (2) ordinal (normal, mild, moderate, or severe); and (3) counting B-lines (0-10; >10) in each zone. All statistical analyses were performed using SPSS version 18.0 (Chicago, IL) and Stata 12.1 (College Station, TX). We evaluated interrater and intrarater agreement using intraclass correlation coefficients (ICCs).

Results: The right and left anterior/superior lung zones showed substantial agreement in all assessment methods and demonstrated best overall agreement (ICC for right: counting, ordinal, and normal/abnormal, 0.811 [0.714-0.875], 0.875 [0.810-0.917], and 0.729 [0.590-0.821], respectively). Furthermore, both expert/expert pairs and expert/novice pairs showed substantial agreement in the right and left anterior/superior thoracic zones (expert/expert, 0.904 and 0.777, respectively; expert/novice, 0.862 and 0.834, respectively). Second best agreement was found for the lateral/superior lung zones (ICC right: counting, ordinal, and normal/abnormal, 0.744 [0.612-0.831], 0.686 [0.524-0.792], and 0.639 [0.453-0.761], respectively; ICC left: counting, ordinal, and normal/abnormal, 0.671 [0.501-0.782], 0.615 [0.417-0.746], and 0.720 [0.577-0.815], respectively). When comparing agreement to distinguish “normal vs abnormal” B-line findings, our results showed significant agreement in all zones with the exception of the right and left inferior/lateral lung fields and left posterior lung. Reinterpretation by 2 experts of all their own randomized video clips at a later date showed agreement of 0.697 (n = 733 zones) and 0.647 (n = 266 zones) for ordinal assessment of B-line concentration.

Conclusion: Interrater agreement was best in the anterior/superior thoracic zones, followed by the lateral/superior zones, for both expert/expert and expert/novice pairs. Agreement in the lateral/inferior lung zones was overall inferior. Intrarater agreement was highest at extreme high or low numbers of B-lines.

© 2015 Elsevier Inc. All rights reserved.

☆ Conflict of interest: None of the authors have a conflict of interest to declare.
⁎ Corresponding author at: Department of Emergency Medicine, University of Alabama at Birmingham, 619 19th St S, OHB 251, Birmingham, AL 35249. E-mail addresses: [email protected], [email protected] (J. Gullett).
http://dx.doi.org/10.1016/j.jcrc.2015.08.021
0883-9441/© 2015 Elsevier Inc. All rights reserved.

1. Introduction

There is a significant body of evidence that B-lines detected on lung ultrasound are a consistent artifact that occurs with interstitial lung syndrome (ILS) [1-6]. A literature review by an expert panel cited level A evidence that B-lines are the sonographic sign of lung interstitial


syndrome and that 3 or more B-lines define a positive regional scan for interstitial syndrome [6]. However, that review did not include a dedicated systematic review of studies primarily evaluating interrater reliability in counting B-lines in the clinical environment, in specific regions of the thorax, or in real time at the bedside by sonographers with different skill levels.

A review of the evidence for interobserver agreement in assessing B-lines reveals that the issue has been addressed either through a brief subanalysis within lung ultrasound studies designed to evaluate other research hypotheses, or not at all [1-5,7-12]. Many studies commenting on interobserver agreement of B-lines are based on video review that took place long after the initial patient contact and in isolation from the clinical scenario [8-10,13]. In these reviews, a set number of video clips is reviewed, B-lines are counted by different methods, and agreement is evaluated. There has been little practical assessment of the real-time clinical component of making an appraisal at the bedside, including real-time B-line counting. In addition, many studies addressing B-lines used only a small number (≤5) of expert sonographers, often reviewing video clips of lung ultrasounds after the clinical encounter [1-5,7-11].

Because most studies reporting interobserver agreement are based on physicians counting B-lines on a video and then measuring agreement, it is not surprising that some investigators have challenged the idea of counting B-lines [14]. Indeed, the B-line artifact is not stable or static. It moves with respiration and changes in shape and appearance as interstitial fluid changes and with different transducers. Work is under way on computerized B-line scoring to better quantify these dynamic artifacts, which may be promising [15,16].

This study is unique in that it is specifically designed to test interobserver agreement in quantifying B-lines, delineates this agreement in specific lung fields, and was performed at the bedside in the emergency department (ED) on acutely dyspneic patients by multiple sonographers with either expert- or beginner-level skills. As a secondary outcome, we evaluated the interobserver agreement of post hoc video review in assessing and quantifying B-lines by a group of experts.

2. Materials and methods

2.1. Study design and setting

This prospective convenience sample of patients presenting with undifferentiated dyspnea to an academic ED was used to test the agreement of emergency medicine physician sonographers on B-line assessment.

2.2. Selection of participants

Patients were screened for eligibility over a 7-month study period. Enrollment occurred when a study coordinator and 2 of the participating physicians were present in the ED. Inclusion criteria were age older than 18 years, a chief complaint of dyspnea, and a chest radiograph and/or computed tomography obtained within 2 hours of the ultrasound examination and within 3 hours of presentation to the ED. Exclusion criteria were pregnancy, trauma, inability to speak English, intubation, inability to give written consent, or unblinding of the sonographer. The local institutional review board approved the study. Emergency medicine faculty were enrolled as study sonographers and provided written consent. Faculty who had performed lung sonography on more than 100 patients before the study began were categorized as “experienced” for the purpose of the study. The “novice” sonographers were attending emergency physicians with little experience in ultrasound (credentialed in only FAST [Focused Assessment with Sonography in Trauma], procedure guidance, and 1 other emergency ultrasound indication) and no experience in lung sonography.

2.3. Methods and measurements

2.3.1. Sonographers

Each physician participating as a sonographer underwent a 1-hour didactic course on how to perform and interpret a structured pulmonary ultrasound examination. Examples of varying densities of B-lines were shown as none, less than 3, more than 3, and greater than 10. The physicians then performed 5 proctored training examinations on different patients with a clinical indication for lung ultrasound (ie, dyspnea) using the technique adopted for the study and described below. They were supervised by 1 of the experienced physician-sonographers. A training scan was considered complete when the pair (novice/experienced or experienced/experienced) communicated agreement on the degree of B-line presentation in all lung fields and their opinion correlated with the qualitative examples of the didactic reference cases. Fig. 1 shows an example of a normal lung ultrasound image (A) and B-lines (B).

Fig. 1. A, Negative, normal lung. B, Lung with B-lines.

2.3.2. Scanning approach

The technique adopted for this study has been described previously to evaluate the anterior and lateral thorax with 4 zones; however, as many pathologies can be found in the posterior lung fields, a fifth posterior zone was added for this study, bordered by the spine, posterior axillary line, diaphragm, and tip of the scapula (Fig. 2) [2]. Patients were scanned in the supine, sitting, or lateral decubitus position; however, each study subject was scanned in the same position by the two assigned sonographers. A Sonosite M Turbo machine with the Sonosite C60 5-2 MHz curvilinear transducer was used with a scanning preset of a mechanical index of 0.7, thermal index of 0.1, and tissue harmonics off. Adjustable settings for the clinician were limited to depth and overall gain. All intercostal spaces of the individual lung fields were thoroughly scanned in sagittal and coronal orientation, moving the transducer from cranial to caudal. One representative 6-second video clip of each zone was recorded by the sonographer. Two sonographers successively examined each patient, each zone was explored thoroughly, and findings were reported for each zone to a coordinator, who recorded the data on a standardized form.

Fig. 2. A 10-zone thoracic examination was used based on previously described practices and the addition of a posterior zone. Used with permission from Sonoguide.com, Beatrice Hoffmann.

2.3.3. Data collection and analysis of interrater and intrarater agreement

We asked sonographers to obtain a representative clip from each zone and to assess (1) whether a zone was normal (<3 B-lines) or abnormal (≥3 B-lines); (2) the degree of B-lines on an ordinal scale of normal, mild, moderate, or severe; and (3) the observed B-lines on a numerical scale of 0 to 10 or greater than 10. A comment of “unclear” was also accepted. Sonographers were blinded to all aspects of the patient's clinical data as well as data obtained from the other sonographer. This included blinding from auscultation or any physical examination to minimize bias, and blinding from chest radiograph results and any medications received. To test intrarater consistency of interpretation, two expert sonographers independently reinterpreted all ultrasound videos in a blinded randomized video review, including their own scans, recording the same data as the bedside sonographers. This was performed at a later date and without clinical bedside input that may cause interpretation bias. Data were reported as means with SDs or medians with interquartile ranges (IQRs) as appropriate. We evaluated interrater and intrarater agreement using intraclass correlation coefficients (ICCs) [17,18]. We repeated the analysis stratifying by whether the patient was examined by a pair of experts (E/E) or an expert and novice (E/N). We conducted these analyses using SPSS version 18.0 (Chicago, IL).

3. Results

We enrolled 95 patients with acute undifferentiated shortness of breath. During enrollment times, a total of 113 patients were approached by the study coordinator; of those, 18 declined or were unable to tolerate a complete examination. Of those enrolled, 91 patients had complete data sets available for analysis (Table 1). Reasons for incomplete data sets included incomplete examinations due to emergent circumstances, patient intolerance, or incomplete data forms. There was a nearly equal ratio of men to women (47 and 44, respectively). Six physicians participated in this study, including 4 novice and 2 experienced sonographers. The time to complete 1 lung examination ranged between 2 and 26 minutes, with a median time of 9 minutes (IQR, 7-12).

3.1. Interrater agreement (ICC)

The highest agreement for all methods (counting B-lines, ordinal assessment of B-lines, and assessment of normal vs abnormal) was in the anterior/superior right and left lung fields (zones 1 right and 1 left). This was followed by zone 3, the lateral/superior lung zones. This finding was consistent for both expert/expert (E/E) and expert/novice (E/N) pairs. The areas with least agreement were the right and left lateral/inferior zones (R4 and L4) and the left posterior zone (L5) (Table 2).

3.2. Counting B-lines

Overall agreement was substantial in the anterior/superior right and left lung fields (zone 1 left and right) for counting B-lines (right ICC, 0.811; left ICC, 0.820). Agreement was also substantial among experts (E/E ICC right, 0.845; left ICC, 0.891) and among the expert/novice pairs (E/N right, 0.756; left, 0.714). Agreement in the lateral/superior lung fields (zone 3 left and right) was significant, and this was the zone with the second best agreement (Table 2). All other zones demonstrated varied or lower levels of agreement for this method. For all lung zones combined, agreement in counting B-lines was less significant for both E/E pairs (0.603) and E/N pairs (0.593).

3.3. Ordinal assessment of B-lines

For ordinal assessment of B-line density (negative/mild/moderate/severe), the greatest overall agreement was observed in zone 1 (right ICC, 0.875; left ICC, 0.838). Consistently, substantial agreement was

Table 1
Characteristics of patients with clinical indication of lung ultrasound

Variable                                                          n (%)
Median age                                                        60 (IQR, 48-73 y)
Total patients analyzed                                           91
Male                                                              47 (52%)
Female                                                            44 (48%)
Admission diagnosis (>1 diagnosis possible)
  Total admitted                                                  71 (78%)
  Congestive heart failure                                        20 (22%)
  Chronic obstructive pulmonary disease/reactive airway disease   25 (27%)
  Pneumonia                                                       16 (18%)
  Pulmonary embolism                                              5 (5.5%)
  Coronary artery disease/acute coronary syndrome                 13 (14%)
  Arrhythmia                                                      4 (4.4%)
  Other                                                           19 (21%)


Table 2
Overall intraclass correlation for B-line assessment

Lung field   Counting B-lines       Ordinal assessment     Normal vs abnormal
             ICC (95% CI)           ICC (95% CI)           ICC (95% CI)
R1           0.811 (0.714-0.875)    0.875 (0.810-0.917)    0.729 (0.590-0.821)
R2           0.663 (0.489-0.778)    0.614 (0.415-0.745)    0.648 (0.466-0.768)
R3           0.744 (0.612-0.831)    0.686 (0.524-0.792)    0.639 (0.453-0.761)
R4           0.469 (0.195-0.649)    0.555 (0.326-0.706)    0.425 (0.131-0.619)
R5           0.613 (0.413-0.744)    0.626 (0.433-0.753)    0.655 (0.477-0.772)
L1           0.820 (0.728-0.881)    0.838 (0.754-0.893)    0.830 (0.743-0.888)
L2           0.571 (0.349-0.718)    0.564 (0.338-0.713)    0.688 (0.526-0.794)
L3           0.671 (0.501-0.782)    0.615 (0.417-0.746)    0.720 (0.577-0.815)
L4           0.466 (0.188-0.648)    0.412 (0.107-0.613)    0.576 (0.356-0.721)
L5           0.372 (0.046-0.587)    0.498 (0.238-0.670)    0.295 (−0.058-0.532)

Two-way mixed-effects model for absolute agreement. CI indicates confidence interval.

Fig. 4. Bland-Altman plot of expert B. Randomized, delayed expert review of all video clips. X-axis: mean B-line count; Y-axis: difference in B-line count.

again found in zone 1 for both the expert/expert pairings (ICC right, 0.904; left, 0.777) and expert/novice pairings (ICC right, 0.862; left, 0.834). The other zones showed moderate agreement with the exception of the lateral/lower lung zones (zones R4 and L4) and the posterior left zone (L5) (Table 2). For all zones combined, ordinal assessment yielded less significant agreement for both pairings with ICC values of E/E of 0.599 and E/N of 0.631.

3.4. Assessing normal vs abnormal

Highest agreement was again found in both anterior/superior zones (R1 and L1) and least agreement in the right and left lateral/lower zones (R4 and L4) and the left posterior zone (L5) (Table 2). The overall agreement in assessing whether any given thoracic zone was normal or abnormal by counting B-lines (<3, normal/negative) was 0.582. The highest agreement among the expert/expert group was again in zone 1 left (0.712). Agreement among experts was lower in the other zones. In the E/N group, agreement in zone 1 right and left was marginal but relatively higher at 0.642 and 0.617, respectively.
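The normal vs abnormal rule used for this method can be written down in a few lines. The sketch below is illustrative only: the function name `classify_zone` and the string encodings ">10" and "unclear" are assumptions about how a data form might store the sonographer's report, not details taken from the study.

```python
def classify_zone(reported):
    """Dichotomize one zone's reported B-line count.

    reported: an integer 0-10, the string ">10", or the string "unclear".
    These encodings are hypothetical; the threshold (abnormal at 3 or
    more B-lines) is the one stated in the methods.
    """
    if reported == "unclear":
        return None  # no dichotomous call can be made
    if reported == ">10":
        return "abnormal"
    count = int(reported)
    if not 0 <= count <= 10:
        raise ValueError("count must be 0-10, '>10', or 'unclear'")
    return "abnormal" if count >= 3 else "normal"


# Made-up reports for five zones of one examination.
for report in [0, 2, 3, ">10", "unclear"]:
    print(report, classify_zone(report))
```

Keeping "unclear" as a separate outcome, rather than forcing it into either category, avoids inflating or deflating agreement with calls the sonographer did not actually make.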

3.5. Expert blinded video overreads/intrarater agreement (ICC)

Overall self-agreement (ICC) for the 2 experts (experts A and B) rereading all randomized video clips, including their own, on a monitor at a later date was 0.697 (n = 733 zones) and 0.647 (n = 266 zones) for ordinal assessment of B-line concentration. When counting B-lines, the intrarater agreement was 0.676 for expert A and 0.586 for expert B. Bland-Altman plots of these reads showed greater agreement at extreme high or low numbers of B-lines (Figs. 3 and 4).
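For readers who want to see how an agreement coefficient of this kind is obtained, the following is a minimal sketch of a two-way, absolute-agreement ICC computed from ANOVA mean squares. It is not the authors' SPSS or Stata procedure. The function name `icc_absolute_agreement`, the choice of the single-measure form, and the example counts are all assumptions made for illustration.

```python
def icc_absolute_agreement(ratings):
    """Single-measure, two-way, absolute-agreement ICC.

    ratings: list of rows, one row per scanned zone (subject) and one
    column per rater or read. Hypothetical helper, not the study's code.
    """
    n = len(ratings)       # number of subjects
    k = len(ratings[0])    # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for subjects, raters, and residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    # Rater variance stays in the denominator, so systematic over- or
    # under-counting by one rater lowers the coefficient.
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Made-up B-line counts for six zones, each scored twice.
example = [[0, 1], [2, 2], [5, 4], [8, 9], [10, 10], [3, 5]]
print(round(icc_absolute_agreement(example), 3))
```

The same function applies to interrater data (two sonographers per zone) and to intrarater data (bedside read and delayed reread by the same expert); only the meaning of the columns changes.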

Fig. 3. Bland-Altman plot of expert A. Randomized, delayed expert review of all video clips. X-axis: mean B-line count; Y-axis: difference in B-line count.
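The quantities plotted in a Bland-Altman figure of this kind can be computed as in the sketch below. The function name `bland_altman` and the example counts are hypothetical, and the 95% limits of agreement (bias ± 1.96 SD of the differences) are the conventional Bland-Altman choice, assumed here rather than taken from the study.

```python
from statistics import mean, stdev


def bland_altman(first_read, second_read):
    """Return plot coordinates, bias, and 95% limits of agreement.

    first_read, second_read: B-line counts for the same clips from the
    bedside read and the delayed reread (hypothetical inputs).
    """
    if len(first_read) != len(second_read):
        raise ValueError("both reads must cover the same clips")
    # X-axis: mean of the two counts; Y-axis: their difference.
    means = [(a + b) / 2 for a, b in zip(first_read, second_read)]
    diffs = [a - b for a, b in zip(first_read, second_read)]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)  # sample SD of the differences
    return {
        "means": means,
        "diffs": diffs,
        "bias": bias,
        "lower": bias - half_width,
        "upper": bias + half_width,
    }


# Made-up counts for four clips read twice by one expert.
result = bland_altman([0, 2, 5, 10], [0, 3, 4, 10])
print(result["bias"], result["lower"], result["upper"])
```

Plotting `diffs` against `means` shows whether disagreement clusters at mid-range counts, which is the pattern the captions describe.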

4. Discussion

In our study of a prospective convenience series of patients presenting to the ED for undifferentiated shortness of breath, we demonstrated that physician agreement (ICC) in assessing and quantifying B-lines was consistently highest in the anterior/superior lung fields (zone 1 left and zone 1 right). Zone 1 was consistently superior to the other zones bilaterally using all interpretation methods, with only 1 ICC value less than 0.81. The next best agreement bilaterally and across methods was in zone 3, the lateral/superior zones, with all values greater than 0.61. The lowest agreement was consistently in the lateral/inferior zones (R4 and L4) and in the left posterior zone (L5).

Several factors may explain these findings. The high performance of the superior lung zones is likely due to easy anatomical access and lack of interfering elements such as the heart, breast, and adipose tissues, making it a simpler zone in which to perform and interpret lung sonography. In addition, the lung bases frequently have more pathologic features present, such as consolidations or effusions, which confound clear interpretation of B-lines. Furthermore, respiratory physiology could explain the results: the superior lung fields are the area of the lung with the least mobility, even with deep respiratory excursion. This would make sonographic evaluation more consistent. Lower lateral and posterior lung fields are known to expand with significantly greater excursion and might show more variability with respiratory effort, increasing the difficulty of interpreting a moving, changing artifact [19]. Our data show that, with the exception of L4/R4, L5/R5, and L2 (not valid for the category “normal vs abnormal” [ICC, 0.688]), ICC is substantial (>0.61). Therefore, the agreement to distinguish “normal vs abnormal” is substantial in all zones with the exception of R4/L4 [17,18,20].
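The verbal labels used here can be checked against Table 2 with a short script. The sketch below applies the agreement bands of Landis and Koch [17] to the normal vs abnormal ICC values copied from Table 2. Those bands were proposed for kappa, so applying them to ICC follows this paper's usage; the names `agreement_label` and `NORMAL_VS_ABNORMAL_ICC` are hypothetical.

```python
# Normal vs abnormal ICC point estimates, copied from Table 2.
NORMAL_VS_ABNORMAL_ICC = {
    "R1": 0.729, "R2": 0.648, "R3": 0.639, "R4": 0.425, "R5": 0.655,
    "L1": 0.830, "L2": 0.688, "L3": 0.720, "L4": 0.576, "L5": 0.295,
}

# Landis and Koch bands as (lower bound, label), highest band first.
BANDS = [
    (0.81, "almost perfect"),
    (0.61, "substantial"),
    (0.41, "moderate"),
    (0.21, "fair"),
    (0.00, "slight"),
]


def agreement_label(value):
    """Return the Landis and Koch label for one coefficient."""
    for lower, label in BANDS:
        if value >= lower:
            return label
    return "poor"  # negative coefficients


for zone, icc in NORMAL_VS_ABNORMAL_ICC.items():
    print(zone, icc, agreement_label(icc))

# Zones whose point estimate falls short of the "substantial" band.
below_substantial = [z for z, v in NORMAL_VS_ABNORMAL_ICC.items() if v < 0.61]
print(below_substantial)
```

On point estimates alone, the zones falling below 0.61 for this method are R4, L4, and L5, the same exceptions named in the abstract. The confidence intervals in Table 2 are wide, so these labels are best read as rough guides.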
The zones R5/L5 do not currently belong to the recommended evaluation of pulmonary edema [6]. These findings allow 2 main interpretations: first, training of sonographers should be more carefully focused on R4 and L4 lung zones, and second, the diagnosis of ILS should only consist of zones 1, 2, and 3. Further research is necessary to substantiate our findings. Two expert sonographers performed blinded and randomized overread of all examinations including their own studies. Both expert sonographers had best agreement for counting B-lines and categorical assessment at the extremes of B-line numbers—very high and very low (Figs. 3 and 4). These findings suggest that this sonographic application is operator and pathology dependent, and merely counting B-lines or categorizing findings might be problematic. Our results show that observations may vary even when the same person later analyzes the same images. We emphasize that the assessment of intrarater agreement was performed in a controlled setting using video review at least 2 months after the examinations and was, therefore, devoid of the clinical distractions of the ED. One surmises that under such conditions, intrarater agreement would be stronger, but this was opposite of


what we observed and could suggest that the sonographers were subject to bias merely from approaching the patient, even without any clinical information available.

4.1. Limitations

This study has several limitations. A preferable methodology for a study of interobserver agreement would be to have many sonographers examine the same patient; however, this is logistically difficult. All patients had to provide consent and, therefore, could not be sedated, mechanically ventilated, in significant distress, or have altered mental status. This eliminated patients with more severe pathology from enrollment. Furthermore, equal numbers of experts and novices would have been preferable for comparison analysis. Only 2 experts provided examinations for use in the analysis. It is also difficult to compare agreement when each pair examined different patients with different pathologies of differing degrees. Furthermore, sonographer pairs were not randomized on enrollment days, as this was logistically prohibitive.

Interrater agreement in assessing B-lines was consistently substantial in the anterior/superior thoracic zones and appeared independent of sonographer experience. However, when assessing the agreement for entire lung scans consisting of the 10 lung fields, the interrater agreement in assessing B-lines overall is moderate and also not related to prior lung ultrasound experience. Expert agreement on blinded randomized video overreads of complete lung scans, with elimination of potential bedside bias, yielded similarly moderate results. The anterior/superior lung zones alone demonstrated excellent clinician agreement in interpretation regardless of severity, experience of the clinician, or bedside bias and thus may be the best location for consistently assessing B-lines in the clinical setting. Future studies are needed to substantiate these findings.

5. Conclusions

The currently supported approach to determining whether a lung zone is defined as positive is based on the presence of 3 or more numerically counted B-lines [6]. The data presented here imply that such a specific number may not be the most generalizable and reproducible approach. The ephemeral nature of the B-line artifact, together with the logistical, technical, and pathologic factors in acquiring and interpreting it, strongly implies that there is an inherent interrater variability. Focused studies are still needed to validate the findings observed in our study. Certainly, acquisition and interpretation performance at the bedside in situ should be determined. Moving forward, a B-line-positive lung zone might, for example, be defined by a range of B-lines, and assessment could be reported as “negative vs positive.” More data are needed to validate this assertion. However, future studies evaluating B-lines and ILS with sonography should consider careful analysis of interrater agreement in methodology. In conclusion, this study reveals some potential implications for incorporating the evaluation of B-lines into clinical practice. First, although the connection between B-lines and interstitial syndrome is established and accepted, this study does not support counting or estimating severity of pathology based on the degree of B-line presence in


all areas of the lung. Agreement was good in the mid to apical lung, which is significant, but that represents only part of this large organ system. The data further imply that ultrasound experience does not necessarily improve agreement among sonographers. Furthermore, although sonographers were blinded to all clinical data and data from the paired sonographer during our study, there seems to be an intrinsic bias just by approaching, interacting, and scanning the patient. Finally, consistently high-level agreement on B-line analysis is found in the anterior/superior lung zones.

References

[1] Lichtenstein D, Meziere G. A lung ultrasound sign allowing bedside distinction between pulmonary edema and COPD: the comet-tail artifact. Intensive Care Med 1998;24:1331–4.
[2] Volpicelli G, Mussa A, Garofalo G, Cardinale L, Casoli G, Perotto F, et al. Bedside lung ultrasound in the assessment of alveolar-interstitial syndrome. Am J Emerg Med 2006;24(6):689–96.
[3] Noble VE, Murray AF, Capp R, Sylvia-Reardon MH, Steele DJ, Liteplo A. Ultrasound assessment for extravascular lung water in patients undergoing hemodialysis. Time course for resolution. Chest 2009;135:1433–9.
[4] Lichtenstein D, Mézière G, Biderman P, Gepner A, Barré O. The comet-tail artifact. An ultrasound sign of alveolar-interstitial syndrome. Am J Respir Crit Care Med 1997;156:1640–6.
[5] Reissig A, Gorg C, Mathis G. Transthoracic sonography in the diagnosis of pulmonary diseases: a systematic approach. Ultraschall Med 2009;30:438–54 [quiz 455–6].
[6] Volpicelli G, Elbarbary M, Blaivas M, Lichtenstein DA, Mathis G, Kirkpatrick AW, et al. International Liaison Committee on Lung Ultrasound (ILC-LUS) for International Consensus Conference on Lung Ultrasound (ICC-LUS). International evidence-based recommendations for point-of-care lung ultrasound. Intensive Care Med 2012;38(4):577–91 [Epub 2012 Mar 6].
[7] Agricola E, Bove T, Oppizzi M, Marino G, Zangrillo A, Margonato A, et al. “Ultrasound comet-tail images”: a marker of pulmonary edema: a comparative study with wedge pressure and extravascular lung water. Chest 2005;127(5):1690–5.
[8] Boussuges A, Coulange M, Bessereau J, Gargne O, Ayme K, Gavarry O, et al. Ultrasound lung comets induced by repeated breath-hold diving, a study in underwater fishermen. Scand J Med Sci Sports 2011;21(6):e384–92.
[9] Fagenholz PJ, Gutman JA, Murray AF, Noble VE, Thomas SH, Harris NS. Chest ultrasonography for the diagnosis and monitoring of high-altitude pulmonary edema. Chest 2007;131(4):1013–8.
[10] Jambrik Z, Monti S, Coppola V, Agricola E, Mottola G, Miniati M, et al. Usefulness of ultrasound lung comets as a nonradiologic sign of extravascular lung water. Am J Cardiol 2004;93(10):1265–70.
[11] Bataille B, Riu B, Ferre F, Moussot PE, Mari A, Brunel E, et al. Integrated use of bedside lung ultrasound and echocardiography in acute respiratory failure: a prospective observational study in ICU. Chest 2014;146(6):1586–93.
[12] Lichtenstein DA, Mezière GA. Relevance of lung ultrasound in the diagnosis of acute respiratory failure: the BLUE protocol. Chest 2008;134(1):117–25. http://dx.doi.org/10.1378/chest.07-2800 [Epub 2008 Apr 10. Erratum in: Chest. 2013 Aug;144(2):721].
[13] Anderson KL, Fields JM, Panebianco NL, Jenq KY, Marin J, Dean AJ. Inter-rater reliability of quantifying pleural B-lines using multiple counting methods. J Ultrasound Med 2013;32(1):115–20.
[14] Soldati G, Copetti R, Sher S. Can lung comets be counted as “objects”? JACC Cardiovasc Imaging 2011;4(4):438–9.
[15] Brattain LJ, Telfer BA, Liteplo AS, Noble VE. Automated B-line scoring on thoracic sonography. J Ultrasound Med 2013;32(12):2185–90. http://dx.doi.org/10.7863/ultra.32.12.2185.
[16] Weitzel WF, Hamilton J, Wang X, Bull JL, Vollmer A, Bowman A, et al. Quantitative lung ultrasound comet measurement: method and initial clinical results. Blood Purif 2015;39(1–3):37–44.
[17] Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics 1977;33:159–74.
[18] Byrt T, Bishop J, Carlin JB. Bias, prevalence and kappa. J Clin Epidemiol 1993;46:423–9.
[19] West JB. Respiratory physiology: The essentials. 9th ed. Philadelphia, PA: Lippincott Williams & Wilkins; 2011.
[20] Hallgren KA. Computing inter-rater reliability for observational data: an overview and tutorial. Tutor Quant Methods Psychol 2012;8(1):23–34.