The reliability and validity of a clinical 3D freehand ultrasound system

The reliability and validity of a clinical 3D freehand ultrasound system

computer methods and programs in biomedicine 136 (2016) 179–187 j o u r n a l h o m e p a g e : w w w. i n t l . e l s e v i e r h e a l t h . c o m ...

2MB Sizes 6 Downloads 75 Views

computer methods and programs in biomedicine 136 (2016) 179–187

j o u r n a l h o m e p a g e : w w w. i n t l . e l s e v i e r h e a l t h . c o m / j o u r n a l s / c m p b

The reliability and validity of a clinical 3D freehand ultrasound system Francesco Cenni a,b,*, Davide Monari a,b, Kaat Desloovere b,c, Erwin Aertbeliën a, Simon-Henri Schless b,c, Herman Bruyninckx a a

Department of Mechanical Engineering, KU Leuven, Celestijnenlaan 300b, 3001 Leuven, Belgium Clinical Motion Analysis Laboratory, University Hospital, Pellenberg, Weligerveld 1, 3212 Pellenberg, Belgium c Department of Rehabilitation Sciences, KU Leuven, Tervuursevest 101, 3001 Leuven, Belgium b

A R T I C L E

I N F O

A B S T R A C T

Article history:

Background and Objective: Acquiring large anatomical volumes in a feasible manner is useful

Received 13 May 2016

for clinical decision-making. A relatively new technique called 3D freehand ultrasonography

Received in revised form

is capable of this by combining a conventional 2D ultrasonography system. Currently, a thorough

28 August 2016

analysis of this technique is lacking, as the analyses are dependent on the software imple-

Accepted 2 September 2016

mentation details and the choice of measurement systems.Therefore this study starts by making this implementation available under the form of an open-source software library to perform

Keywords:

3D freehand ultrasonography. Following that, reliability and validity analyses of extracting

3D freehand ultrasonography

volumes and lengths will be carried out using two independent motion-tracking systems.

Volume and length measurements

Methods: A PC-based ultrasonography device and two optical motion-tracking systems were

Reliability and validity analysis

used for data acquisition. An in-house software library called Py3DFreeHandUS was devel-

Open source software

oped to reconstruct (off-line) the corresponding data into one 3D data set. Reliability and validity analyses of the entire experimental set-up were performed by estimating the volumes and lengths of ground truth objects. Ten water-filled balloons and six cross-wires were used. Repeat measurements were also performed by two experienced operators. Results: The software library Py3DFreeHandUS is available online, along with the relevant documentation. The reliability analyses showed high intra- and inter-operator intra-class correlation coefficient results for both volumes and lengths. The accuracy analysis revealed a discrepancy in all cases of around 3%, which corresponded to 3 ml and 1 mm for volume and length measurements, respectively. Similar results were found for both of the motion-tracking systems. Conclusions: The undertaken analyses for estimating volume and lengths acquired with 3D freehand ultrasonography demonstrated reliable design measurements and suitable performance for applications that do not require sub-mm and -ml accuracy. © 2016 Elsevier Ireland Ltd. All rights reserved.

1.

Introduction

Over the past 30 years, medical imaging modalities acquiring large anatomical 3D data sets have been routinely used to

explore features useful for clinical decision-making [1]. These modalities, such as computed tomography (CT) and magnetic resonance imaging (MRI), provide a high image quality with a large field of view. However, these systems are expensive, time consuming and uncomfortable for patients. In more

* Corresponding author. Department of Mechanical Engineering, KU Leuven, Celestijnenlaan 300b, 3001 Leuven, Belgium. Fax: +32 16338012. E-mail address: [email protected] (F. Cenni). http://dx.doi.org/10.1016/j.cmpb.2016.09.001 0169-2607/© 2016 Elsevier Ireland Ltd. All rights reserved.

180

computer methods and programs in biomedicine 136 (2016) 179–187

recent years, ultrasonography (US) has become a valid alternative, capable of providing 3D data sets, albeit with an inferior image quality when compared to CT and MRI [1,2]. Despite this limitation, it is still considered adequate for a wide variety of clinical applications [3], such as musculoskeletal diagnosis and rehabilitation [4]. Nevertheless, in clinical practice it is uncommon to use ultrasound to acquire large anatomical volumes due to the software complexity involved in producing sufficient 3D data sets. Therefore, there is a need for validated and reliable software tools that are adaptable to the hardware at hand and simple enough to use in a day-to-day clinical setting. In musculoskeletal medicine, conventional 2D US is mainly used to detect morphological features, however, confined by the width of the transducer. Consequently, large anatomical volumes cannot be visualised in a single image, constricting the clinical applications [5]. Panoramic imaging can partially negate this drawback by combining multiple 2D US images in a linear sequence [6], yet this still does not provide the thorough analysis that is possible with a 3D data set. 3D US transducers, however, are becoming more commercially available. The problem is that these transducers are limited by the size of the 2D arrays, as well as being relatively expensive [7]. There are other US techniques available for acquiring 3D data sets, such as constrained sweeping, sensor-less, and 3D freehand US (3DfUS). 3DfUS is a technique that enhances a conventional 2D US device into a platform capable of reconstructing 3D anatomical data sets. This technique was initially proposed in the 1990s [8,9], and since then its utility has increased due to the improved US devices and processors. 3DfUS is achieved by combining conventional 2D US with a motion-tracking system, obtaining the position and orientation (pose) of the US transducer during an acquisition. An essential initial step of the 3DfUS technique is to perform the temporal and spatial calibration for the US transducer. This defines the time delay between the US and the pose systems, and the spatial relationship between the US image and the pose of the transducer. These calibrations were already investigated with satisfactory results [7], enabling the acquired 2D US images to be properly transformed into a 3D data set. The 3DfUS technique has many advantages over conventional 2D US, such as visualisation of larger anatomical volumes and the freedom to view the images offline in any plane or orientation. 3DfUS has previously been used in musculoskeletal medicine, most often to acquire and visualise various muscles of the lower limb [4,10]. Appropriate hardware is essential for obtaining good quality volumetric data. However, the pose sensor is the most sensitive signal for enabling such 3D data sets. US machines can offer varying levels of image quality; however, if the pose sensors are not up to standard, the US images cannot be properly allocated and the final outcome will contain artefacts, irrespective of the US image quality. For the 3DfUS techniques, different technologies have been utilised for tracking the US transducer: mechanical, acoustical, electromagnetic and optical. The choice of one technology depends on the type of clinical application and environment where the tracking system will be used [7]. Optical motion tracking systems use multiple cameras to track rigidly affixed markers upon a structure. While systems with a large number of cameras, e.g. 10, are characterised by high precision and are commonly used for human movement analysis [11], inexpensive and portable

devices based on the same principle can be also employed. The nominal accuracy of these portable devices is usually lower than those with more cameras, but is still comparable when a large field of view is not required, such as when the subject analysed is lying on a bed. Some software applications are currently available for providing the relevant tools to apply the 3DfUS technique [12,13]. The first software application was developed in 1999 [9] with real-time acquisition and satisfactory quality measurements [14]. However, there are some limitations with their software, such as the source code not being publicly available, no realtime feedback, limited hardware device and file format compatibility. Some of these drawbacks were solved by another platform recently released, called the PLUS project [13]. This is freely available and works in real-time, also as ultrasoundguided intervention tool (i.e. in operating theatres). However, this software only supports data streams from a limited number of hardware devices, and is written in low-level (machine) languages, such as C++. This makes efficient development and prototyping of new algorithms more difficult. A new software library for implementing the 3DfUS technique would be useful to address some of the aforementioned limitations. The reconstruction of a 3D data set also requires a platform capable of reading the relevant data formats, to optimise the ideal positioning of the 2D images, and to enable efficient data storage. Furthermore, current 3DfUS techniques lack comprehensive analyses of their validity and reliability for measurement performances, and eventual translation for use in clinical practice. Therefore, the aim of the current study is twofold. Firstly, describe and provide a new open-source software library for applying the 3DfUS technique. This library utilises a new approach for off-line 3D reconstruction, compatible with the most-common file formats, and implemented in the programming language Python. In addition, the availability of the implementation allows the reproducibility of the test analyses. Secondly, the study performs a thorough evaluation of the validity and the reliability of this library and its dependency on hardware by comparing two different motion-tracking systems. These measurements consist of calculating the lengths and volumes of several phantom objects.

2.

Materials and methods

2.1.

Hardware

A PC-based US machine with a 10.0 MHz linear transducer and a 60 mm field of view was used for recording a data set of 2D B-mode US images at 30 Hz (HL9.0/60/128Z, Telemed EchoBlaster 128 Ext-1Z system, Lithuania). Four reflective markers were rigidly affixed to the US transducer (Fig. 1) to enable the pose to be acquired by means of two independent optical motiontracking systems. The first system was a portable device with 3-integrated cameras, a spatial resolution less than 1 mm, and a sample rate of 120 Hz (OptiTrack NaturalPoint, USA). The second system was a fixed device with 10 optical cameras, a spatial resolution of 0.1 mm, and a sample rate of 100 Hz (Vicon Motion Systems Ltd., Oxford, UK). To simultaneously acquire the US and pose

computer methods and programs in biomedicine 136 (2016) 179–187

181

Fig. 1 – Cluster of four reflective markers rigidly affixed to the US transducer. Visualisation from the front (A), side (B) and top (C).

data streams, a square signal trigger was produced by the US machine to the motion analysis system. DICOM (standard format for medical imaging) and C3D (3D coordinate information) files were exported by using the proprietary software provided with the systems.

2.2.

Py3DFreeHandUS

An in-house software package called Py3DFreeHandUS was developed in Python 2.7 (www.python.org). This package is able to process data previously acquired (off-line) by any synchronised US and pose systems. It is also capable of performing the temporal and spatial calibration for the specific 3DfUS setup and to reconstruct the corresponding data streams into one 3D data set.

2.2.1.

was moved through each degree of freedom [14]. In addition, a three coupled motion sequence was also added, such as a vertical translation together with each of the three orientations. By choosing any two points on the line, two independent equations were defined per image. The equation that a pixel with image coordinates (u,v) must satisfy [14] is as follows:

⎛ x⎞ ⎛ sx u⎞ ⎜ y⎟ ⎜ sy v⎟ ⎜ ⎟ = C TT T TR R TP ⎜ ⎟ ⎜ 0⎟ ⎜ 0 ⎟ ⎜⎝ 1⎟⎠ ⎜⎝ 1 ⎟⎠

US transducer temporal calibration

To test this, the US transducer was moved vertically ten times in a tank filled with water, where the only visible object in the image was a white horizontal line representing the base of the water tank [15]. In addition to the US images, the vertical coordinates of the pose sensor were also collected. These two sinelike signals were then normalised and cross-correlated. The instant at which the maximum value of the cross-correlated signal occurred defined the estimated time delay. If positive, it implies that the US system is early with respect to the motion tracking system.

2.2.2.

Fig. 2 – The different reference systems involved: US image (P), cluster of markers attached to the US transducer (R), motion tracking system (T) and calibration phantom (C).

US transducer spatial calibration

To determine the homogeneous transformation matrix from the image reference frame to the transducer reference frame (RTP), a spatial calibration was carried out. The other two transformation matrices, from the transducer to the motion tracking system (TTR) and from motion tracking system to calibration phantom (CTT), were known for every time frame of the calibration acquisition. A representation of the spatial calibration set-up, along with the location of these reference frames, is reported in Fig. 2. To achieve the RTP, a method was put forward by Prager et al. [14]. This so-called “single wall” method was further improved by Hsu et al. [16]. It consisted of using the US transducer to scan a copper surface in a water tank, where a clear white line was consistently visible. Subsequently, the US transducer

This resulted in 11 identifiable unknowns in an overdeterminated set of equations (the number of images processed is higher than the unknowns): the pixel to millimetre ratios (sx, sy), six calibration parameters from RTP and three parameters from CTT. For the latter matrix, this consisted of only two rotations and one translation as we assumed the xy-plane to coincide with the plane phantom [16]. The six parameters from R TP and three parameters from CTT could be solved by using the Levenberg–Marquardt algorithm, with the default values set in the Scipy library (www.scipy.org). Two parameters for the pixel to millimetre ratio were calculated separately, as previously recommended [16]. Additionally, since the parameters were in different units, the Jacobian was regularised [14]. To better describe this calibration problem, the condition number (k), the root mean square (RMS) of the equation residuals and the corresponding covariance matrix were also implemented as outcomes for each calibration. A calibration quality assessment was used to calculate the precision and accuracy of the calibration parameters. The precision is an indication of the spread of measures around their mean, whereas the accuracy is the difference between the mean of the measures and the real value [16]. For the spatial calibration experimental analysis, the initial parameters for the nine unknowns were manually estimated (Fig. 3) according to a previous convention [14]. For six of the parameters (x1, y1, z1, α1, β1, γ1), which mapped the

182

computer methods and programs in biomedicine 136 (2016) 179–187

Fig. 3 – The initial parameters estimation: x1, y1 and z1 are the translations of the image (P) to the transducer (R) reference frame, whereas γ1, β1, and γ1 are the corresponding rotations. z2, β2 and γ2 are rotations and translations for describing the transformation between the motion-tracking system (T) and the phantom (C).

transformation between the reference system of the US transducer and the image, we assumed variability within a standard deviation of 3 mm and 3 degrees. The calibration parameters were solved in three steps [16]. Firstly, the Levenberg–Marquardt algorithm was run 30 times, with the starting point for the angles randomly changed within the defined standard deviation, while no variations were applied for the distances. Secondly, the angles with the corresponding lowest RMS in the previous step were used first, and only the distances were randomly changed within the standard deviation. Finally, the Levenberg–Marquardt algorithm was applied using the initial values obtained during the previous stage [14]. Reconstruction precision and distance accuracy as quality measurements for the probe calibration were assessed. The reconstruction precision was analysed by scanning the same point in a metal cross-wires phantom. The distance accuracy was analysed by scanning two points in the same aforementioned metal crosswires phantom. At least 20 well visible points were recorded for reconstruction precision, as well as 12 distances for computing the distance accuracy for each trial, over a large range of angles and depths for the US scans [14].

2.2.3.

Voxel array reconstruction

For the US, a time stamp reader for each image was added to the software package library, whereas for the motion tracking system, an optimal estimation of the TTR in a least square sense was performed by applying an established method [17]. This uses the singular value decomposition of a matrix derived from the positions of the reflective markers acquired. A voxel array (VA) is then created using the pose data of each corresponding US image. Two possible solutions to reduce the VA size are available in the software: 1) manually re-orienting the

Fig. 4 – V’ is the smallest 3D voxel array able to contain all the US images, identified by applying the PCA method. Other voxel arrays can be created, such as V, but are bigger, occupying more memory and containing more empty voxels.

global reference frame to be approximately normal to the scan direction during the acquisition; 2) by using Principal Component Analysis (PCA) on the corners of all the US images, it is possible to identify the main scan direction, thereby rotating the VA according to this direction (Fig. 4). This solution was chosen for the present validations tests. If necessary, a specific US frames interval (as opposed to the entire data set) could be specified for the 3D reconstruction. Finally, the dimension for the VA was determined by defining the scale factor for each axis direction (Fig. 5). This was done according to the real dimension of the pixels in the US images, as well as their total number during the known length of a scan. For allocating the grey scale values in the VA, an algorithm called Pixel Nearest Neighbour was applied [18].This ran through each pixel in every acquired 2D image, and filled the nearest voxel with the value of that pixel. In the present analysis, multiple contributions to the same voxel were averaged.Alternatively, the user could also decide to place the maximum value in the corresponding voxel.The latter option is particularly useful when multiple parallel US scans are collected. To speed up the voxel filling procedure, each 2D scan was positioned in the 3D volume in a vectorised manner. Only two outer loops existed: one for the DICOM file number, and one for the scan number. Once all the US images are properly allocated in the VA, it is likely that some voxels remain empty. This is due to the voxels being small, relative to the distance between the acquired images. Therefore, an algorithm called Voxel Nearest Neighbour was used for voxel interpolation [18] and was applied in the present analysis. This consists of filling each gap by using the assigned grey value of the closest voxel. In the software package, another solution for performing this interpolation is also included. An original method was implemented called “averaging cube” which consists of the following steps: 1) create a cube with 3 voxel sides, centred around the gap; 2) Identify

computer methods and programs in biomedicine 136 (2016) 179–187

Fig. 5 – Application of different scale factors (top and bottom parallelepiped) for the same voxel array; need to be defined for the directions on the transverse image (x and y axis) and for the longitudinal scan direction (z).

the minimum percentage of filled voxels inside the cube (100% = number of voxels in the cube); 3) If there are more voxels than a threshold percentage, take the weighted average of the filled voxels (weighted by the Euclidean distances between the centres of the cubes); 4) If that percentage is not found, the cube size is incremented by two voxels; 5) If the cube size is less than, or equal to, a maximum size, the procedure is repeated from point 2; otherwise, the procedure is stopped and the gap is not filled. Additionally, the voxel gap filling operation was limited by the convex hull between the two slices, using the corners of each slice. Operations on a convex hull are very efficient [19] (Fig. 6). The entire VA was then subdivided in a specific number of blocks, and the gap filling was performed at the current block. Finally, both the VA scans silhouettes (previously created with the wrapping convex hulls) and the grey scale VA data were converted to VTK files/data sets (www.vtk.org), and then exported to VTI files. The total computational time for creating a VA was also recorded.

2.3.

183

Fig. 6 – Two US images consecutive in time; a rectangular parallelepiped containing these two images can be created. Alternatively, the convex hull can be applied by using the corners of each image.

arbitrarily fixated fishing wires was used. The distances between the cross-wires were measured with a caliper (1 mm resolution). For the length acquisitions, two procedures were performed. The first was similar to the volume analysis, acquiring the grid of crossing wires and creating the corresponding VA. In addition, the instrumented US transducer was also used as a spatial pointer (PaP), to visualise all the relevant crosswire points, without any reconstruction. Volume estimations were performed using a custom workflow in MeVisLab (www.mevislab.de). The border of the balloon was manually drawn, defined by a semi-interactive segmentation of structures with edge features in the reconstructed images, thus creating the corresponding 3D model (Fig. 7). With this 3D model, the corresponding volume was

Validity and reliability evaluations

For the volume analysis (Fig. 7), 10 water-filled balloons containing various volumes were used. For each balloon, the volume was measured according to its weight (1 g resolution), and assuming a water density of 1 kg/l. To estimate these volumes by means of the 3DfUS technique, a stack of 2D images using the US transducer was acquired at 30 frames per second, for each balloon. These acquisitions were performed with both Vicon and OptiTrack motion-tracking systems. The Py3DFreeHandUS was then used to produce the corresponding VAs. The known volumes were then compared with the estimated volumes. For the length analysis (Fig. 8), a grid of

Fig. 7 – Example for volume analysis: a water filled balloon (A) was acquired and reconstructed in 3D (B). The borders were extracted in the transverse view (C) and the corresponding 3D model was created (D).

184

computer methods and programs in biomedicine 136 (2016) 179–187

95% confidence intervals (CI). The standard error of measurement (SEM) was used, and also reported as a percentage of its mean (SEM%). The ICC was investigated for absolute agreement to detect any relevant systematic error between operators. The SEM was calculated from the square root of the mean square error from one-way ANOVA.

Fig. 8 – Example for length analyses: a fishing grid consisting of crossed-wires (A) was acquired by following two methods (the VA, which is performed by one single sweep over the wires; and PaP, by moving the US transducer over each of the cross-wires). One extracted representative image for the crossed-wire with the PaP analysis (B) and a 3D reconstruction of the cross-wires (C).

computed. The entire procedure lasted approximately 7 minutes per each balloon. Length estimations (Fig. 8) were performed differently for the two procedures. For the VA method, the crossing points of the grid were identified in a tool in MeVisLab able to visualise the 3D reconstruction with the relevant coordinates, which were extracted and mapped into a common reference frame. These distances were then computed in the Python package. For the PaP method, these crossing points were extracted in the raw US images, with these coordinates again mapped into a common reference frame, allowing the computation of the distances. For each case, image digitisation and distance computation lasted less than 5 minutes. To analyse the consistency for calculating volume and lengths, we assessed the corresponding reliability. For the segmentation analysis, three repetitions of the same trial were performed by two independent experienced operators, whereas for the acquisition analysis, three different trials were analysed twice by the same two experienced operators. The trials for the reliability analysis were only performed with the OptiTrack motion-tracking system.

2.4.

Statistical analyses

The acquisition and feature extraction comparisons were investigated by means of the intra-class correlation coefficients (ICC) for intra-operator (3,1) and inter-operator (3,k) [20], with

3.

Results

3.1.

Library

The software package library Py3DFreeHandUS is available online (github.com/u0078867/Py3DFreeHandUS), along with the relevant documentation. In addition, supplementary files are included, demonstrating how to perform the probe calibration, quality assessments, VA reconstruction, transformation between the reference systems and tools for extracting features from the 3D data set. By use of this library, data can be processed from any combination of US and motion tracking systems, as long as they are able to provide C3D and DICOM files. This software package is free, and can be redistributed and/or modified under the terms of the GNU General Public License, as published by the Free Software Foundation (either version 3 of the License or (at your discretion) a later version).

3.2.

Calibration analysis

Probe calibration analysis was performed in over 700 US images, which enabled us to obtain an over-determined set of equations (Table 1). RMS error and k values for each calibration analysis, over different initial conditions, are summarised in Table 1. The solution, which corresponded to the minimum RMS, i.e. 3.4 mm, was chosen to set the calibration data. The temporal calibration assessment provided reproducible results for both OptiTrack and Vicon, resulting on average 48 ± 7 ms and 14 ± 12 ms, respectively. For the spatial calibration quality assessments, the reconstruction precision and distance accuracy values were comparable to the state-of-theart methods. The reconstruction precision was 1.9 mm for OptiTrack and 2.0 mm for Vicon. The distance accuracy was 1.4 mm for OptiTrack and 1.5 mm for Vicon.

3.3.

Processing time

When producing a VA reconstruction of a balloon (Fig. 7) with a length of 90 mm (by acquiring 400 US images) on a 16 GB RAM

Table 1 – Number of equations utilised for running the Levenberg–Marquardt algorithm. The root mean squared (RMS) residuals of the set of equations and the condition number (k) was averaged over the simulations. For each trial, the corresponding minimum value is also reported. 0

Number of equations

RMS

RMS min

k

k min

#1 #2 #3

1702 1438 1630

7.9 ± 3.4 45.4 ± 22.9 9.5 ± 3.8

3.6 4.0 3.4

1915 ± 6170 880 ± 1708 1379 ± 4450

18 8 12

185

computer methods and programs in biomedicine 136 (2016) 179–187

Table 2 – Intra-class correlation coefficients (ICC), corresponding confidence interval (CI), the standard error of measurements (SEM) and the value in percentage of its mean (SEM%) were computed for the volume and lengths analysed. This analysis was performed over the same trial and different trials, i.e. segmentation and acquisition analysis. Intra- and inter-rater analyses were also carried out. Segmentation Intra-rater

Volume ICC (CI) SEM (ml) SEM % PaP length ICC (CI) SEM (mm) SEM % VA length ICC (CI) SEM (mm) SEM %

Acquisition Inter-rater

Rater2

Raters 1–2

Rater1

Rater2

Raters 1–2

0.99 (.98–.99) 2.1 1.8

0.99 (.98–.99) 2.2 1.9

0.99 (.90–.99) 1.8 1.6

0.99 (.99–1) 0.3 0.3

0.99 (.98–1) 0.9 0.8

0.99 (.86–1) 0.7 0.6

0.99 (.99–1) 0.2 0.6

0.99 (.99–1) 0.7 2.0

0.99 (.99–1) 0.5 1.4

0.99 (.98–.99) 1.2 3.4

0.99 (.98–.99) 1.1 3.1

0.99 (.98–.99) 1.2 3.4

0.99 (.99–1) 0.6 1.7

0.99 (.97–.99) 1.6 4.5

0.99 (.98–.99) 1.2 3.4

0.99 (.96–.99) 1.7 4.8

0.98 (.90–.99) 2.1 5.9

0.99 (.96–.99) 1.3 3.7

Reliability analysis for volume and length

The ICC and SEM values for the volume and length analyses were reported in Table 2. The lowest ICC was 0.99 (CI, 0.90–0.99), and the highest SEM was 2.2 ml (SEM% = 1.9). In the PaP and VA length analyses, segmentation and acquisition variability showed high ICCs, with the lowest being 0.98 (CI, 0.90–0.99). For the SEM, in the PaP method the highest value was 1.1 mm (SEM% = 3.1), whereas for the VA, the highest SEM was 2.1 mm (SEM% = 5.9).

3.5.

Inter-rater

Rater1

Intel i7 2.7 GHz machine, the data processing time with a unitary scale factor was approximately 20 seconds. When the scale factors were modified to 0.17 × 0.17 × 0.33 mm, in accordance with the 2D US image resolution and the total number of images acquired, data processing time was approximately 450 seconds.

3.4.

Intra-rater

Validity analysis for volume and length

For the volume accuracy analyses (Table 3), balloons with various volumes were evaluated, ranging from 65 ml to 177 ml. By using both OptiTrack and Vicon, the 3DUS approach demonstrated a tendency to slightly overestimate volumes. This corresponded to an average absolute difference of 3.2 ml (2.6%),

in the worst case. For the length accuracies, the 3DfUS approach also showed a tendency to slightly overestimate distances. The PaP method had slightly smaller absolute difference than the VA method, with the smallest discrepancy being 0.7 mm with OptiTrack. On average, the worst case for the PaP method was 0.9 mm with Vicon, and for the VA method with OptiTrack, it was 1.2 mm, with one single case at 2.4 mm.

4.

Discussion

Large anatomical 3D data sets are useful for clinical decisionmaking. However, they are currently acquired by means of cumbersome measurements, such as MRI and CT. When longitudinal follow-up measurements are required, it is unhealthy to use CT, and MRI is expensive. Therefore, for some clinical applications, other more feasible medical imaging methods are preferred [1]. For this reason, a technique combining US with a motion-tracking system, such as 3DfUS, was proposed, with promising outcomes [4]. However, thorough analyses of measurement performances are still lacking. Therefore, this study carried out validity tests of various volume and length measurements. The results revealed that the software library and

Table 3 – Comparison of the validity analysis, for volume and lengths, by means of OptiTrack and Vicon system. The ground truth objects were measured, and then estimated with the 3DfUS technique. Mean ± standard deviation of the absolute average difference, [range] of the values, and their corresponding results in percentage are reported.

Volume (ml) OptiTrack Vicon PaP length (mm) OptiTrack Vicon VA length (mm) OptiTrack Vicon

Measured

Estimated

Absolute difference

Absolute difference (%)

112.0 ± 28.5 [64–154] 130.7 ± 25.1 [105–177]

113.9 ± 28.7 [64.8–154.0] 133.9 ± 24.4 [106.5–177.2]

2.7 ± 1.8 [0.0–5.5] 3.2 ± 2.1 [0.2–7.2]

2.5 ± 1.7 [0.0–5.3] 2.6 ± 2.0 [0.1–6.9]

34.8 ± 18.4 [21–59] 35.0 ± 18.3 [21–59]

35.2 ± 18.4 [21.3–59.8] 35.9 ± 18.2 [21.9–60.2]

0.7 ± 0.4 [0.1–1.3] 0.9 ± 0.4 [0.3–1.5]

2.7 ± 1.8 [0.2–5.4] 3.3 ± 2.3 [0.6–7.1]

34.8 ± 18.4 [21–59] 35.0 ± 18.3 [21–59]

35.4 ± 18.5 [21.5–60.2] 35.7 ± 19.3 [21.0–60.6]

1.2 ± 0.8 [0.0–2.4] 1.1 ± 0.8 [0.0–2.1]

4.3 ± 3.6 [0.0–9.8] 2.9 ± 2.1 [0.1–5.2]

186

computer methods and programs in biomedicine 136 (2016) 179–187

the relevant evaluations were sufficiently reliable and valid. Additionally, the utilised software is available to the community as open-source, since implementation details can influence significantly the reliability and validity. There are however a number of limitations to consider in this study. The software library presented does not support realtime feedback for a VA reconstruction. This is due to the chosen programming language, which does not permit processing speeds as seen in other languages, such as C++. As for the hardware, the measurements with different motion-tracking systems were not performed simultaneously, making it impossible to perform a comparison in the same experimental condition. Nevertheless, the similar distribution for the volume and length of the phantom objects, along with the identical measurement protocol, justify comparing the results. The spatial calibration, however, was only performed with one approach, and it is possible that other calibration methods may provide better results in terms of accuracy. As this was not one of the main aims of the present study, further investigation should be carried out to improve the corresponding measure of quality assessments. The proposed custom software package offers an open source platform that is able to process data from any US and motion-tracking device combination. This was possible as the proposed approach is off-line, unlike other 3DfUS projects [12,13]. For the programming language, Python was chosen due to its high-level and easily readable syntax. This permits a fast code prototyping that helps in making custom applications for specific measurements, along with a library that can be used without any specific device configuration. This high-level language is therefore suitable for testing new algorithms that can be applied to US software platforms. Another advantage to use Python is that it works identically on all major operating systems. To the authors’ knowledge, this is the first time that the 3DfUS was implemented without any restrictions in the use of acquisition devices. For obtaining the 3D reconstructions, the most common data formats for US images and pose systems were chosen (DICOM and C3D, respectively).The first step in the data processing structure is the definition of an efficient geometrical object capable of containing all the acquired US images.This is a crucial factor in reducing the file dimensions of the reconstructions, allowing easier data management in the following steps. It was realised that the PCA analysis was the most suitable approach for identifying the main scan direction during the acquisition, and thus for optimising the dimension of the VA (Fig. 4). However, this solution is only efficient when one scan direction is dominant with respect to the others. Once the dimensions of the entire VA are set, another variable is the size of the voxel within the array. Scale factors are used to multiply each of the three VA dimensions.They are set according to the pixel to millimetre ratio in the original US images (at the cross sectional area), whereas the scale factor for the longitudinal direction is defined by dividing the total number of images acquired by the length of the corresponding US scan.The present library allows manual determination of these factors, or allows automatic scaling that sets the maximum possible resolution for the current data. For the gap-filling phase, a new approach was introduced to spare some of the RAM. This phase was restricted between two consecutive slices, instead of using the entire reconstruction (Fig. 6), and was implemented by using an established geometrical

procedure, known as convex hull [19]. This approach allowed identification of the smallest set with all the relevant pixels in two consecutive images. Afterwards, the voxel filling was implemented by using two different methods: the Voxel Nearest Neighbour, which is commonly used in literature [18], and an original one called “averaging cube”, that fills the gaps by averaging voxels in a cube around the missing values. This cube has an adaptive size, to trade-off the sharpness of the reconstructed voxels with the ability to extrapolate. In the present analysis, the Voxel Nearest Neighbour was applied as it permits faster data processing time than the averaging cube method. Once the reconstruction was performed, the voxel array was saved in a VTI format. This was chosen because a compressed formatVTK is wrapped for Python.This format is easy to visualise with software such as Paraview and Mevislab. A careful investigation of the temporal and spatial calibration is essential prior to beginning the data processing for a 3D reconstruction. For this reason, the time delay was estimated and added in the processing pipeline for the 3D reconstruction, compensating the discrepancy between the US and the pose system. The reported results for the two motiontracking systems were not similar. The different tools that were used to synchronise the devices could account for this; however, in both cases these results were comparable with the temporal resolution of the system, which was defined by the frequency of the US machine. As for the spatial calibration analysis for the US transducer, a number of different methods were already successfully performed [7,21]. The “single wall” method improved with the Cambridge phantom [16] was applied. This utilises an easy experimental set-up with regard to other calibration techniques [7]. Contrary to other studies [14], this investigation found that it is important not only to move the US transducer in the recommended sequence of movements, but also to add coupled motion sequences. This was realised by analysing the RMS and corresponding covariance matrix, ascertaining which variables were not well constrained due to the lack of movements during the calibration. This modified sequence of movements decreased the RMS and k values, but were still slightly higher than in previous analyses [7]. The experimental measurements aimed to evaluate phantom objects, resembling the methodology and dimensions of a clinical scenario, especially for muscle morphology analysis. The reliability analyses showed high intra- and inter-operator ICC values, suggesting that irrespective of the operators, the experimental data set and the software package are reliable for computing both volume and length. The accuracy analysis revealed a discrepancy around 3% in all cases, which corresponded to 3 ml and 1 mm for volume and length, respectively. For the volume analysis, these results were slightly higher than previously reported in a similar analysis [4], probably due to the various sizes (Table 3) and shapes of the analysed water-filled balloons. For the length analysis, two methods were performed. In the VA method, the lengths were computed by extracting the crossed-point wires within the 3D data set, whereas for the PaP method, the lengths were computed from the raw 2D US images acquired with the corresponding pose data. Slightly better results were found for the PaP than for the VA analysis, probably due to kinematic artefacts, which may affect the visualisation of the cross-wire feature in 3D.This clearly occurred in one case when using the OptiTrack device,

computer methods and programs in biomedicine 136 (2016) 179–187

resulting in a maximum absolute discrepancy of 2.4 mm. Overall, however, similar results were found when using two different motion-tracking systems, despite one having a lower spatial resolution. This demonstrates that these measurements do not require such a level of spatial resolution for the motiontracking system, suggesting that the overall error of the 3DfUS technique is higher than both motion-tracking system errors. However, when the grey scale values of pixels are relevant, such as echogenicity intensity [22], the use of a high quality motiontracking system may provide a more accurate pixel location in the VA, and therefore an improved visualisation of anatomical structures. For clinical applications, it is useful to perform a comprehensive analysis in anatomical features such as muscle volume and length. This has been reported in several other investigations, with satisfactory findings with respect to accuracy [4,23]. However, a comprehensive reliability and validity analysis on healthy and pathological muscles should still be undertaken to confirm these findings. For in-vivo muscle analysis, MRI data are usually considered the golden standard for comparison with other imaging modalities.

[3]

[4]

[5]

[6] [7]

[8]

[9]

[10]

5.

Conclusion [11]

The present analysis has shown the effectiveness of a new clinical 3DfUS technique. The comprehensive reliability and validity analyses for volume and lengths have demonstrated reliable measurements and suitable performance for applications that do not require sub-mm and -ml accuracy. The description and publication of the Py3DFreeHandUS software library provides a deeper understanding for the results and reproducibility of these measurements. Furthermore, this library provides a new tool that is complementary to the existing 3DfUS software. Future investigations will aim to evaluate the clinical validity of this method.

[12]

[13]

[14]

[15]

[16]

Conflict of Interest The authors have declared that no competing interests exist.

[17]

[18]

Acknowledgment [19]

This work was made possible by a grant from the Doctoral Scholarships Committee for International Collaboration (DBOF) of the KU Leuven, Belgium, awarded to Prof. Kaat Desloovere, grant number DBOF/12/058. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

REFERENCES

[1] A. Fenster, D.B. Downey, H.N. Cardinal, Three-dimensional ultrasound imaging, Phys. Med. Biol. 46 (2001) R67–R99. [2] G.G.R. Schramek, D. Stoevesandt, A. Reising, J.T. Kielstein, M. Hiss, H. Kielstein, Imaging in anatomy: a comparison of

[20]

[21]

[22]

[23]

187

imaging techniques in embalmed human cadavers, BMC Med. Educ. 13 (2013) 143. T.R.N. Nelson, D.H.P. Pretorius, Three-dimensional ultrasound imaging [Review], Ultrasound Med. Biol. 24 (1998) 1243–1270. L. Barber, R. Barrett, G. Lichtwark, Validation of a freehand 3D ultrasound system for morphological measures of the medial gastrocnemius muscle, J. Biomech. 42 (2009) 1313– 1319. L. Barber, R. Barrett, G. Lichtwark, Validity and reliability of a simple ultrasound approach to measure medial gastrocnemius muscle length, J. Anat. 218 (2011) 637–642. J. Powers, F. Kremkau, Medical ultrasound systems, Interface Focus 1 (2011) 477–489. L. Mercier, T. Langø, F. Lindseth, L.D. Collins, A review of calibration techniques for freehand 3-D ultrasound systems, Ultrasound Med. Biol. 31 (2005) 143–165. N. Rankin, B. Downey, L. Munk, F. Levin, Review article sonographic reconstruction: techniques and diagnostic applications. 1993. R.W. Prager, A. Gee, L. Berman, Stradx: real-time acquisition and visualization of freehand three-dimensional ultrasound, Med. Image Anal. 3 (1999) 129–140. N.R. Fry, M. Gough, A.E. McNee, A.P. Shortland, Changes in the volume and length of the medial gastrocnemius after surgical recession in children with spastic diplegic cerebral palsy, J. Pediatr. Orthop. 27 (2007) 769–774. A. Cappozzo, U. Della Croce, A. Leardini, L. Chiari, Human movement analysis using stereophotogrammetry, Gait Posture 21 (2005) 186–196. A. Gee, R. Prager, G. Treece, C. Cash, L. Berman, Processing and visualizing three-dimensional ultrasound data, Br. J. Radiol. 77 (2004) S186–S193. A. Lasso, T. Heffter, A. Rankin, C. Pinter, T. Ungi, G. Fichtinger, PLUS: open-source toolkit for ultrasound-guided intervention systems, IEEE Trans. Biomed. Eng. 61 (2014) 2527–2537. R.W. Prager, R.N. Rohling, A.H. Gee, L. Berman, Rapid calibration for 3-D freehand ultrasound, Ultrasound Med. Biol. 24 (1998) 855–869. G.M. Treece, A.H. Gee, R.W. Prager, C.J. Cash, L.H. Berman, High-definition freehand 3-D ultrasound, Ultrasound Med. Biol. 29 (2003) 529–546. P.-W. Hsu, R.I.W.P. Prager, A.N.H.G. Gee, G.R.M.T. Treece, Rapid, easy and reliable calibration for freehand 3D ultrasound, Ultrasound Med. Biol. 32 (2006) 823–835. I. Soderkvist, P.A. Wedin, Determining the movements of the skeleton using well-configured markers, J. Biomech. 26 (1993) 1473–1477. R. Rohling, A. Gee, L. Berman, A comparison of freehand three-dimensional ultrasound reconstruction techniques, Med. Image Anal. 3 (1999) 339–359. B. Chazelle, An optimal convex hull algorithm in any fixed dimension, Discrete Comput. Geom. 10 (1993) 377–409. J.P.J. Weir, Quantifying test-retest reliability using the intraclass correlation coefficient and the SEM, J. Strength Cond. Res. 19 (2005) 231–240. P.-W. Hsu, G.M. Treece, R.W. Prager, N.E. Houghton, A.H. Gee, Comparison of freehand 3-D ultrasound calibration techniques using a stylus, Ultrasound Med. Biol. 34 (2008) 1610–1621. C.A. Pitcher, C.M. Elliott, F.A. Panizzolo, J.P. Valentine, K. Stannage, S.L. Reid, Ultrasound characterization of medial gastrocnemius tissue composition in children with spastic cerebral palsy, Muscle Nerve 52 (2015) 397–403. N.R. Fry, C.R. Childs, L.C. Eve, M. Gough, R.O. Robinson, A.P. Shortland, Accurate measurement of muscle belly length in the motion analysis laboratory: potential for the assessment of contracture, Gait Posture 17 (2003) 119–124.