High-precision position estimation in PET using artificial neural networks

High-precision position estimation in PET using artificial neural networks

ARTICLE IN PRESS Nuclear Instruments and Methods in Physics Research A 604 (2009) 366–369 Contents lists available at ScienceDirect Nuclear Instrume...

314KB Sizes 3 Downloads 162 Views

ARTICLE IN PRESS Nuclear Instruments and Methods in Physics Research A 604 (2009) 366–369

Contents lists available at ScienceDirect

Nuclear Instruments and Methods in Physics Research A journal homepage: www.elsevier.com/locate/nima

High-precision position estimation in PET using artificial neural networks F. Mateo , R.J. Aliaga, N. Ferrando, J.D. Martı´nez, V. Herrero, Ch.W. Lerche, R.J. Colom, J.M. Monzo´, A. Sebastia´, R. Gadea ´n y de las Comunicaciones Avanzadas (ITACA), Universidad Polite ´cnica de Valencia, Camino de Vera Digital Systems Design Group (DSD), Instituto de las Tecnologı´as de la Informacio s/n, 46022 Valencia, Spain

a r t i c l e in f o

a b s t r a c t

Available online 29 January 2009

Traditionally, the most popular technique to predict the impact position of gamma photons on a PET detector has been Anger’s logic. However, it introduces nonlinearities that compress the light distribution, reducing the useful field of view and the spatial resolution, especially at the edges of the scintillator crystal. In this work, we make use of neural networks to address a bias-corrected position estimation from real stimulus obtained from a 2D PET system setup. The preprocessing and data acquisition were performed by separate custom boards, especially designed for this application. The results show that neural networks yield a more uniform field of view while improving the systematic error and the spatial resolution. Therefore, they stand as a better performing and readily available alternative to classic positioning methods. & 2009 Elsevier B.V. All rights reserved.

Keywords: Positron emission tomography Artificial neural networks Incidence position estimation Anger’s logic Multi-layer perceptron

1. Introduction The precise estimation of the incidence position of photons in a Positron Emission Tomography (PET) detector surface is a crucial issue to achieve a correct reconstruction of medical images. Usually, the reconstruction of these images is very sensitive to inevitable measurement errors. Therefore, it is essential to recover the information about the imaginary line along which the propagated photons travel, called line of response (LOR), as reliably as possible. The correct estimation of the LOR makes it possible to identify the point where the photons enter the scintillator crystal of the detector and, consequently, the centroid where the scintillation process takes place. A classical method for estimating the 2D position of the centroid of a light distribution crystal is Anger’s logic [1]. It estimates the position by applying one simple formula per coordinate. The solution proposed by Anger involves connecting the photomultiplier (PMT) outputs to a simple resistor division circuit to obtain only four signals ðX; Xþ; Y; YþÞ. However, Anger’s logic introduces some important drawbacks in the detection process: non-uniform spatial behavior, differences between each PMT gain or the deformation of the light distribution when it approaches the edge of the scintillator (border effects). These problems are alleviated by using correction maps, but cannot be entirely solved. The main consequence is an

 Corresponding author.

E-mail address: [email protected] (F. Mateo). 0168-9002/$ - see front matter & 2009 Elsevier B.V. All rights reserved. doi:10.1016/j.nima.2009.01.058

unavoidable reduction of the Useful Field Of View (UFOV) of the PET camera. We have researched on how artificial neural networks (ANNs) could be used for bias-corrected position estimation of perpendicular impinging photons in planar PET. The designed networks are going to be implemented into a specifically designed testbench to work in real time. Due to their highly parallelizable structure, ANNs represent a good option to consider if one wants to optimize the computational time. The previous positioning applications of ANNs reported in the literature cover PMTs [2], Avalanche Photodiode (APD) based detectors [3] and also Multi Anode PMTs (MA-PMTs), using simulated data [4,5]. Other recent works have dealt with the sensitivity of PET using ANNs for accurate LOR computation [6] and with position calibration of PET detectors using ANNs [7]. This paper is structured as follows. Section 2 introduces the basic features and functionality of ANNs. Section 3 outlines the structure and components of the testbench used in the experiments. In Section 4 we describe the experiments carried out and the most relevant results are presented in Section 5. Finally, Section 6 summarizes the conclusions and raises some issues for future research.

2. Artificial neural networks ANNs are highly interconnected network structures consisting of many simple processing elements that can perform many parallel computations for data processing [8]. To date, they have been widely used for many scientific applications which involve

ARTICLE IN PRESS F. Mateo et al. / Nuclear Instruments and Methods in Physics Research A 604 (2009) 366–369

367

x1

xm-1

36.75 mm

49 mm 42 mm

x2

xm N1 tansig neurons

tansig neurons 2

Outputtansig layer

Fig. 1. Scheme of a generic MLP of two-hidden layers, m inputs, N 1 and N 2 nodes in the first and second hidden layer, respectively, and one output (architecture: m=N 1 =N 2 =1). All activation functions have been considered hyperbolic tangents.

function approximation (regression), classification and data processing. A single processing element is called neuron or node, and it typically performs a nonlinear weighted sum of its inputs, as indicated below: ! X ðwi xi þ bÞ (1) y¼f i

where xi is the neuron input, y is the neuron output, wi is the neuron weight, b is the bias and f is the activation function. A neural network is organized in layers of neurons. It contains one input layer, one output layer and a number of hidden layers in between. Multi-layer perceptrons (MLPs) are feedforward neural networks that undergo supervised training with the backpropagation algorithm or any of its variants. In this work, the ANNs selected were two-hidden layer MLPs with hyperbolic tangent (tansig) activation functions in all layers (see Fig. 1). The backpropagation algorithm is based on the gradient descent, or the iterative reduction of the error function by modifying the weights of the network in each iteration or epoch. The weights are updated according to the Widrow–Hoff delta rule [9]. The weight update rule for the output nodes is indicated below: ¼ wtij  m wtþ1 ij

qEp qwij

(2)

where wij is the weight of the connection between the j-th neuron of the hidden layer and the i-th neuron of the output layer, t is the iteration, m is the learning rate and Ep is the output error function evaluated for each input pattern p, which can be expressed, for N output nodes, as Ep ¼

N 1X ðep Þ2 2 i¼1 i p di

(3) p di

being epi ¼  ppi , where are the desired output values (targets) and ppi are the predicted output values. The error signals are propagated backwards through the network, applying the chain rule, which enables the computation of the hidden layers weight updates. Standard backpropagation suffers from many problems related to slow convergence and local minima. Therefore, it has been improved in many ways giving rise to more modern algorithms.

MA-PMT Fig. 2. Scheme of the scintillation crystal projection on the MA-PMT used in the experiments. The useful area of the detector is limited by the crystal size to the 6  6 inner anodes.

the impacts consists of a 1 mm diameter 22Na radioactive source ð10 mCiÞ, attached to two stepper motors that control the position in each axis. The scintillator crystal used was a 42  42  10 mm3 continuous slab of LSO, which is coupled to a Hamamatsu H8500 Flat-Panel MA-PMT [10] with 8  8 output anodes. The useful area covered by the MA-PMT is 49  49 mm2 , but the size of the crystal limited the usable anodes, as seen in Fig. 2. This is not always a hindrance, as it also excludes samples affected by border effects when building the model. The detector electronics were specifically designed for this application [11]. The analog part consists in an Application Specific Integrated Circuit (ASIC). It includes a preamplifier stage and an implementation of a resistive readout circuit based on Siegel’s discretized positioning circuit (DPC) [12]. That is basically a symmetric network with specific resistive values that scales the values of the currents from the channels of the MA-PMT according to the node where they strike. Hence, this circuit collects the 64 currents delivered by the anodes of the MA-PMT as inputs and codes them into only 4 output currents that are obtained from the corners of the network. The four resulting signals are fed to current-sensitive shaping amplifiers and digitalized. This reduction in the number of signals will speed up the training process of the ANNs and will simplify their hardware implementation. The digital part of the design consists of a Field Programmable Gate Array (FPGA) board, where the ANNs are embedded, among other elements related to signal processing, coincidence detection and data acquisition [13]. The ANNs will only use the four signals obtained from the ASIC, instead of the original 64 MA-PMT channels. A basic scheme of the testbench is illustrated in Fig. 3.

4. Experiments The data acquisition was managed by a Java interface, which also controls the stepper motors and dumps the output data to MATLAB (2007b, The MathWorks, Inc., Natick, MA, USA). The supervised training and validation of the ANNs was carried out with the MATLAB Neural Network Toolbox. The Levenberg–Marquardt (trainlm) training algorithm [14,15] was chosen due to its good accuracy and robustness. Levenberg–Marquardt is a popular alternative to the Gauss–Newton method of finding the minimum of a function FðxÞ that is a sum of squares of nonlinear functions 1X ½f i ðxÞ2 . 2

3. Testbench

FðxÞ ¼

We have developed and mounted a high-precision PET testbench for our tests. The part of the setup used to generate

Let the Jacobian of f i ðxÞ be denoted Ji ðxÞ, then the Levenberg–Marquardt method searches in the direction given by

(4)

ARTICLE IN PRESS F. Mateo et al. / Nuclear Instruments and Methods in Physics Research A 604 (2009) 366–369

22Na

1 2 MA-PMT

Photons

LSO scintillator

368

64

Preamp. 1 2

ASIC Readout

Ia

ANNs X

Ib Ic Id

64

FPGA

Shaping

A/D converters Y

Z

Fig. 3. Block diagram of the testbench. The currents Ia, Ib, Ic and Id are used as inputs to the ANNs after being amplified and digitalized. The signal named Z is related to the depth of interaction, and is currently under study as it could imply adding a fifth input to the ANNs in the future.

the solution s of the set of equations ðJTk Jk þ lk IÞsk ¼ J Tk f k

(5)

where lk are nonnegative scalars, I is the identity matrix and k is the iteration. The ANN selected for the 2D positioning purpose was a double MLP estimator, i.e. one individual network for each coordinate (x and y), with a 4/9/6/1 architecture. The choice of this architecture stems from the optimization carried out previously that used a synthetic data set [4,5]. This dual architecture worked better and learned faster than a single MLP with two outputs to predict both coordinates. A sweep across the inner 6  6 anodes covered by the crystal was carried out. The amount of samples collected per position was 4096. These samples were filtered to remove outliers, and windowed around the energy photopeak of 511 keV to eliminate those samples affected by Compton scattering that could distort the energy histogram. The selected samples were divided in equally sized training and test sets, and several trainings were performed. The number of epochs was kept to a medium value (100 epochs) to achieve the best possible generalization without overfitting. All data were normalized between 1 and þ1 to speed up the learning process [8]. No early-stopping methods such as cross-validation were used during the training because they led to a poorer performance.

Fig. 4. 2D positioning histogram (event count map) produced by the centroid logic. The horizontal and vertical axis represent the coordinates x (mm) and y (mm), respectively.

5. Results

ANNs From the results, it can be appreciated that the centroid (Anger) approach introduces a high imprecision in the position detection. It is also noticeable that non-uniform compression artifacts appear, especially at the edges of the crystal. An example of a 2D histogram obtained from the centroid equations is shown in Fig. 4. On the other hand, the ANN estimators successfully correct these artifacts, increase the spatial resolution and produce a much wider UFOV in both dimensions. The 2D histogram for the dual ANN is shown in Fig. 5. The average MSE obtained on the test set was 0.35, while the mean systematic error was 0.20 mm in each coordinate, which improves drastically the 2.71 mm for the x axis and 2.80 mm for the y axis obtained by Anger’s equations. The training algorithm used (trainlm) improves the best results obtained previously in synthetic benchmarks [4,5] where the best performing algorithm was resilient backpropagation [16]. With regard to the full-width half-maximum spatial resolution, for the centroid logic it is better in the center of the crystal than at the edges, but the opposite occurs with the ANN estimators. A point spread function diagram representing a section of the 2D histograms along the x axis for both Anger’s logic and ANN estimators is shown in Figs. 6 and 7, respectively. For Anger’s logic, the resolution is seldom lower than 2 mm while the ANNs manage to keep it lower than 1 mm for any position. The

−15

700

−10

600 500

−5

400 0 300 5 200 10 100 15 −15

−10

−5

0

5

10

15

0

Fig. 5. 2D positioning histogram (event count map) produced by the optimal ANNs. The horizontal and vertical axis represent the coordinates x (mm) and y (mm), respectively.

ARTICLE IN PRESS F. Mateo et al. / Nuclear Instruments and Methods in Physics Research A 604 (2009) 366–369

Anger´s logic: Spatial resolution 300

Event count

250 200 150 100 50 0 −20

−15

−10

−5

0 x (mm)

5

10

15

20

Fig. 6. Point spread function of a section along the x axis of the 2D histogram produced by the centroid logic (fourth row: y ¼ 3:0625, 15:3125 mmoxo15:3125 mm).

369

that the ANNs can also be used in real setups, working in realtime. We have shown the benefits of using ANN-based position estimators for modeling a PET gamma-ray detector module response and for correcting its nonlinearities while enabling real-time LOR computation, in comparison with classical methods like centroid logic. Apart from the increase in the UFOV (almost 100% of the detector useful area), the systematic errors have been substantially reduced and the compression artifacts have been practically eliminated. The spatial resolution has also been improved with respect to previous studies. The ANNs proposed in this paper act as very good classifiers, giving close outputs to representative ‘‘centers’’, in a similar fashion as linear vector quantization works. However, further measurements will have to be carried out to make a much finer sampling to ensure that the results do not suffer from overfitting. A classification with a fine grid (using many classes) would be virtually equivalent to an accurate function approximation. Among other future lines of work we have considered the research on the hypothetical fifth variable (Z), related with the depth of interaction inside the crystal [17]. This signal could be added as a new input to the ANNs to achieve a full 3D position estimation.

ANNs: Spatial resolution 800 700

Acknowledgments

Event count

600 This work was supported by the Spanish Ministry of Science and Innovation under research project FPA2007-65013-C02-02 and grant BES-2005-9703, and by Generalitat Valenciana local government under grant (GV/2007/008).

500 400 300

References

200 100 0 −20

−15

−10

−5

0 x (mm)

5

10

15

20

[1] [2] [3] [4] [5] [6]

Fig. 7. Point spread function of a section along the x axis of the 2D histogram produced by the optimal ANNs (fourth row: y ¼ 3:0625, 15:3125 mmoxo15:3125 mm).

sections along the y axis have been omitted because of the high similarity with the x coordinate.

6. Conclusions and future work Although other studies have reported the adequacy of ANNs for positioning purposes with synthetic data [4,5], this work confirms

[7] [8] [9] [10] [11] [12] [13] [14] [15] [16] [17]

H. Anger, Rev. Sci. Instrum. 29 (1958) 27. A. Bronstein, et al., IEEE Trans. Nucl. Sci. NS-50 (3) (2003) 421. P. Bruyndonckx, et al., IEEE Trans. Nucl. Sci. NS-51 (5) (2004) 2520. R.J. Aliaga, et al., IEEE Trans. Nucl. Sci. NS-53 (3) (2006) 776. F. Mateo, et al., in: F. Sandoval, et al., (Eds.), Lecture Notes in Computer Science, vol. 4507, Springer, Berlin, 2007. J.B. Michaud, et al., in: IEEE Nuclear Science Symposium Conference Record, Honolulu, HI, USA, vol. 5, 2007, p. 3594. P. Bruyndonckx, et al., Nucl. Instr. and Meth. A 571 (2007) 304. S. Haykin, Neural Networks: A comprehensive foundation, second ed., Prentice-Hall, Englewood Cliffs, NJ, 1999. B. Widrow, M.E. Hoff, Jr., IRE WESCON Conv. 1960, vol. 4, p. 96. Hamamatsu H8500 Technical Data Sheet, hhttp://sales.hamamatsu.com/ assets/pdf/parts_H/H8500H8500B.pdfi. J.M. Monzo´, et al., IEEE Trans. Nucl. Sci. NS-55 (1) (2008) 421. S. Siegel, et al., IEEE Trans. Nucl. Sci. NS-43 (3) (1996) 1634. J.D. Martı´nez, et al., Nucl. Instr. and Meth. A 546 (2005) 28. K. Levenberg, Q. Appl. Math. 2 (1944) 164. D. Marquardt, SIAM J. Appl. Math. 11 (1963) 431. M. Riedmiller, Rprop—description and implementation details, Technical Report, University of Karlsruhe, 1994. Ch.W. Lerche, et al., Nucl. Instr. and Meth. A 537 (2005) 326.