Image and Vision Computing 28 (2010) 449–457
Contents lists available at ScienceDirect
Image and Vision Computing journal homepage: www.elsevier.com/locate/imavis
SVD lossy adaptive encoding of 3D digital images for ROI progressive transmission q Ismael Baeza a, José-Antonio Verdoy a, Rafael-Jacinto Villanueva a,*, Javier Villanueva-Oller b a b
Instituto de Matemática Multidisciplinar, Edificio 8G piso 2, Universidad Politécnica de Valencia, 46022 Valencia, Spain Ing. Técnica Informática de Sistemas, CES Felipe II, Aranjuez, Madrid, Spain
a r t i c l e
i n f o
Article history: Received 18 August 2006 Received in revised form 26 June 2009 Accepted 11 July 2009
Keywords: 3D digital images Singular value decomposition encoding Lossy progressive transmission Region of interest (ROI) transmission
a b s t r a c t In this paper, we propose an algorithm for lossy adaptive encoding of digital three-dimensional (3D) images based on singular value decomposition (SVD). This encoding allows us to design algorithms for progressive transmission and reconstruction of the 3D image, for one or several selected regions of interest (ROI) avoiding redundancy in data transmission. The main characteristic of the proposed algorithms is that the ROIs can be selected during the transmission process and it is not necessary to re-encode the image again to transmit the data corresponding to the selected ROI. An example with a data set of a CT scan consisting of 93 parallel slices where we added an implanted tumor (the ROI in this example) and a comparative with JPEG2000 are given. Ó 2009 Elsevier B.V. All rights reserved.
1. Introduction The process of progressive transmission (Fig. 1) can be described in the following seven steps: (1) An image is acquired and stored in the image server. (2) In the server, the image is decomposed into ‘‘pieces”, called regions, in a chosen way: these regions can be parts of the original image, parts of some transformation of the image or parts of an appropriate encoding of some transformation of the image. (3) A region of the decomposed image is chosen, following given criteria, and then is transmitted to the client. (4) The client uses the received data (regions) to make an approximation of the original image, using an adequate reconstruction algorithm. (5) A new region among the untransmitted ones is chosen and the server transmits it to the client. (6) The client adds the just received region to the already transmitted ones, and uses the whole set of received regions to reconstruct a new approximation of the original image with improved quality. (7) Steps 5 and 6 are repeated until
q This paper has been supported by the grant Programa de Incentivo a la Investigación 2005 de la UPV No. 5706 and the Generalitat Valenciana Grant AE06/ 001. * Corresponding author. Tel.: +34 963877700. E-mail addresses:
[email protected] (I. Baeza),
[email protected] (J.-A. Verdoy),
[email protected] (R.-J. Villanueva),
[email protected] (J. Villanueva-Oller).
0262-8856/$ - see front matter Ó 2009 Elsevier B.V. All rights reserved. doi:10.1016/j.imavis.2009.07.004
(a) the transmission is stopped if the image appears not to be of interest, or (b) the transmission is stopped if a better approximation is not needed, or (c) the complete set of data is received if the full data set seems to be desirable, or (d) a feature within the image that seems to be of interest now and the process should be modified to begin transmitting subsets corresponding to this feature.
Except in 7(c), bandwidth and a large amount of daily network traffic are saved improving the network performance. 7(a), 7(b) and 7(c) have been widely considered from different points of view, for instance, using some interpolation techniques [1,2], or wavelets [3–5], or other types of transforms [6], or decorrelation techniques [7], or introducing adaptive strategies in the progressive transmission [8]. However, in this paper we are interested in 7(d), to be precise; the scenario where, during the progressive transmission of an image, the client observes an emerging detail of interest, stops the transmission, selects a sub-image containing the detail, also called region of interest (ROI), and the process continues with the progressive transmission of the ROI. The described scenario, when the client selects the ROI, may be tackled in two ways: (a) the server receives the ROI coordinates, encodes the ROI using data from the original image and transmits it progressively; (b) the server receives the ROI coordinates and it has the image encoded in such a way that allows the progressive transmission of its ROIs immediately without additional computations.
450
I. Baeza et al. / Image and Vision Computing 28 (2010) 449–457
Fig. 1. Client–server scheme for progressive transmission.
The way (a) requires additional computation time for ROI encoding that involves delays in transmission and visualization. Moreover, it can be difficult to avoid redundancy in transmitted data. The way (b) requires a strong relation between the encoded data of the whole image and the data of any of its ROIs and that does not necessarily occur. To date the techniques developed are insatisfactory, or perform poorly. In the progressive transmission of ROIs, it is desirable not only to reconstruct the image as data are being received, but also to be able to select which part or parts of the emerging image are relevant and therefore, the client wants to receive first, and which part or parts are of no interest. For instance, an approach to this issue is commented in [3–5] using JPEG2000 where ROI-based coding is performed by the MAXSHIFT method described in [9]. But, as it is mentioned in [4], the codeblocks of the ROI and the background are interlaced in such a way that the ROI-based functionalities are not always achieved, because interlacing may induce redundancy during the transmission of ROI data. Furthermore, MAXSHIFT method requires a priori ROI selection, and then the encoding of the 3D image with the selected ROI shifted, each time a ROI is selected. Moreover, the only way to transmit two or more ROIs at the same time is to select a large ROI that contains all the others and then, to apply MAXSHIFT technique to this large ROI, including unwanted parts of the image. In this paper, we propose a method for encoding 3D images using singular value decomposition (SVD) where the ROIs can be selected at any step of the progressive transmission process and its corresponding data can be extracted from the already encoded image avoiding re-encoding and redundancy in data transmission and processing. Furthermore, the selected ROIs can be transmitted sequentially or all at the same time.
The main idea is based on the following property of SVD: Let r be one singular value, ðu1 ; . . . ; um Þ and ðv1 ; . . . ; vn Þ its associated singular vectors obtained from the SVD of a 2D image of size m n. Then, the product
rðu1 ; . . . ; um ÞT ðv1 ; . . . ; vn Þ
ð1Þ
is an approximate reconstruction to the 2D image and the product
rðui ; . . . ; uiþk ÞT ðvj ; . . . ; vjþl Þ
ð2Þ
is an approximate reconstruction to the ROI of size k l with upper left corner coordinates ði; jÞ (see Fig. 2). This idea has been tested recently for large-sized 2D images [10] with promising results. This paper is organized as follows. Section 2 describes the basis of SVD and its application to digital images. In Section 3 we present an algorithm for lossy adaptive encoding 3D images using SVD. In Section 4 we use the encoding developed to present an algorithm for transmission and reconstruction of encoded 3D images. In Section 5, an algorithm for progressive ROI transmission is given, where the complete 3D image is encoded with the algorithm presented in Section 3. An example for the progressive transmission of a 3D CT image and a comparative with JPEG2000 ROI MAXSHIFT method are presented in Sections 6 and 7, respectively. Conclusions are given in Section 8. Throughout, we use the nouns ‘‘image” and ‘‘matrix” interchangeably when the latter is a numerical description whose visualization would be the former. As well, Rst shall denote the space of s t real matrices. This paper focuses on an algorithmic point of view because one of our main goals is to present algorithms that allow to design
Fig. 2. ROI reconstruction using SVD.
I. Baeza et al. / Image and Vision Computing 28 (2010) 449–457
451
automatic progressive transmission processes, transparent for the final user.
From Corollary 5, SVD provides an useful form to define the regions of an image for an easy further reconstruction, that is,
2. On SVD decomposition
regð1Þ ¼ fu1 ; r1 ; v1 g; .. .
In this section, we recall some results about matrix theory and SVD and its relation to digital images. Details can be found in [11, Section 2.5, pp. 69–73]. Definition 1. If A ¼ ðAij Þ is an s t matrix, then rankðAÞ ¼ p, with p 6 minfs; tg, is the number of rows (columns) of A that are linearly independent. Definition 2. U 2 Rss is an orthogonal matrix if U T U ¼ Is , where Is is the identity matrix of order s. As a consequence, the entries of an orthogonal matrix U lies in interval ½1; 1. Theorem 3 (Singular value decomposition (SVD) [11, Theorem 2.5.2, p. 70]). If A 2 Rst then there exist orthogonal matrices
U ¼ ½u1 ; . . . ; us 2 Rss
and V ¼ ½v1 ; . . . ; vt 2 Rtt ;
ð3Þ
such that
U T AV ¼ diagðr1 ; . . . ; rp Þ;
p ¼ minfs; tg;
ð4Þ
where
r1 P r2 P P rp P 0:
ð5Þ
The ri are the singular values of A and the column vectors ui and vi are, respectively, the ith left singular vector and the ith right singular vector of A. SVD is efficiently implemented in packages as Matlab [12], Mathematica [13], or Fortran and C in BLAS libraries [14]. Two aspects related to SVD should be mentioned here. The first is the fact that the use of orthogonal matrices in matrix decompositions provides the best choice to preserve numerical stability and accuracy of the results [11]. The second is that the cost of the SVD is 4s2 t þ 8st 2 þ 9t 3 flops [11, pp. 253–254] for a matrix of size s t. Now, let us define the 2-norm. 2-norm is a matrix norm that measures the differences between the original and the reconstructed data, i.e., the error during the progressive transmission and it is very related to singular values. It allows us to obtain an optimal order for data transmission. Definition 4 [11, Expression (2.5.8), p. 71]. If A ¼ ðAij Þ is an s t matrix and
U T AV ¼ diagðr1 ; . . . ; rp Þ;
p ¼ minfs; tg;
ð6Þ
is its SVD given by Theorem 3, we define the 2-norm of A by
kAk2 ¼ r1 :
regðpÞ ¼ fup ; rp ; vp g: Note that the regions above defined are not parts of the original images but parts of a transformation of the original image. During SVD it is usual to remove the smaller singular values because they are assumed to not provide substantial improvement in the image reconstruction (for examples applied to some areas of image processing, see [15–17]). Therefore, if we apply SVD to a matrix A 2 Rst , for a given threshold P 0, there exists an index q such that
r1 P P rq P > rqþ1 P P rp P 0; p ¼ minfs; tg; ð10Þ and only consider fu1 ; . . . ; uq g; fv1 ; . . . ; vq g the corresponding singular vector sets. Now, the question about an appropriate selection of the threshold arises. Our proposal is based on an extension to three-dimensional images of the relative difference introduced in [18]. Definition 6. Let O ¼ ðOijk Þ be a 3D image of size r s t and O ¼ ðOijk Þ be an approximation to O. The 3D relative difference is defined by
vffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi uP P P 2 u r s t u i¼1 j¼1 k¼1 Oijk Oijk D3 ðO; O Þ ¼ t : Pr Ps Pt 2 i¼1 j¼1 k¼1 Oijk
ð11Þ
Following the reasoning of Theorem 2 of [18], if we compute the SVD of the parallel slices Oi ¼ ðOijk Þ; i ¼ 1; 2; . . . ; r, the approximations to each slice Oi ; i ¼ 1; 2; . . . ; r are constructed following the expression (8) and we define the sets
D ¼ fr such that r is a singular value of some Oi ; i ¼ 1; 2; .. .; rg; C ¼ fr such that r is not used in any reconstruction of Oi ; i ¼ 1;2;. .. ;rg; it is easy to show that
sP ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi r2 D3 ðO; O Þ ¼ Pr2C 2 :
r2D r
ð12Þ
Expression (12) allows us to determine the regions that provide a substantial improvement in the image reconstruction process. As a practical rule for 2D images, if DðA; BÞ 6 0:05, then matrix B is an acceptable approximation of A (see [19]). Nevertheless, that is unacceptable for 3D images. For example, consider the 3D data set O of Fig. 3, an isosurface reconstruction of a human head using VTK [20] from a CT scan consisting of 93 parallel slices. Each slice is a
ð7Þ
Corollary 5 [11, Theorem 2.5.3, p. 72]. Let the SVD of A 2 Rst be given by Theorem 3. If k < r ¼ rankðAÞ and
Ak ¼ r1 u1 vT1 þ r2 u2 vT2 þ þ rk uk vTk ;
ð8Þ
then
min kA Bk2 ¼ kA Ak k2 ¼ rkþ1 :
rankðBÞ¼k
ð9Þ
This corollary gives us a method to reconstruct progressive approximations (expression (8)) optimal in the sense of 2-norm (expression (9)). Expression (9) also means that the higher singular values are more relevant in the image reconstruction than the smaller ones due to their decreasing order (see (5)).
Fig. 3. Bone- and skin-level renderings of the full, 93 CT slice, test data set.
452
I. Baeza et al. / Image and Vision Computing 28 (2010) 449–457
2D CT image of size 256 256 where the pixels are 16-bit gray level. Consider O , the approximate reconstruction of O with D3 ðO; O Þ ¼ 0:05. In Fig. 4 appears a visualization of O . The wrinkled face that appears in Fig. 4 points out that 5% of error is not enough for 3D images. After some experiments, we suggest D3 ðO; O Þ 6 0:01 or even D3 ðO; O Þ 6 0:005 if more accuracy is needed. 3. Algorithm for encoding a 3D image The aim of this section is to present an algorithm for 3D digital image encoding using SVD that includes the facts mentioned in the comments developed in the above section. The main idea consists of: (1) (2) (3) (4)
Divide the 3D image into 2D slices. Apply SVD to each slice. Define the corresponding regions of each slice. Order the regions as follows: (a) First, the regions with the first (highest) singular value of each slice. (b) Second, order the remainder regions, decreasingly, by the singular value.
(5) Following the above ordering, compute the first r such that D3 ðO; O Þ 6 , with ¼ 0:01 or other prefixed threshold. (6) Reject the regions with singular values r < r . (7) Quantize the regions to be saved, ordered, in an encoded file. The suggested quantization is bval ¼ 32 bits for singular values and bvec ¼ 16 bits for each component of singular vectors (scalar quantization), but other possibilities may be considered. In any case, it is important to note that formula (12) remains true from expression (11) if vector singular quantization is exact enough, that is, a low quantization of singular vectors can lead to the invalidation of formula (12) and then, a non-accurate (noisy) approximation of the 3D image O would be obtained. Algorithm 7 (Encoding). Let O be a 3D image of size r s t; P 0 be a given threshold, bvec and bval be the number of bits in which the singular vectors and the singular values, respectively, are going to be quantized. (1) Divide O into r 2D parallel slices (matrices) Oi ; i ¼ 1; 2; . . . ; r, of size s t. (2) Apply SVD to each slice Oi , that is,
ri1 P P rip P 0; p ¼ minfs; tg; and being fui1 ; . . . ; uip g; fvi1 ; . . . ; vip g the corresponding singular vectors, for i ¼ 1; 2; . . . ; r.
(3) Define the regions as follows:
n o regði; jÞ ¼ uij ; rij ; vij ;
i ¼ 1; 2; . . . ; r; j ¼ 1; 2; . . . ; p:
(4) Order the regions as follows: (a) First, the first region of each slice, i.e., regð1; 1Þ; regð2; 1Þ; . . . ; regðr; 1Þ, (with the highest singular value of each slice). (b) Second, order the remainder regions, decreasingly according to the singular value. (5) Find the first singular value
r such that
sffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi P 2 r Prrp>r 2 6 : k¼1
rk
(6) Reject the regions where the singular value is lower than r . (7) Define qðO; Þ the number of non-rejected regions. (8) File encoding. (a) Write r; s and t, encoded using 16 bits. (b) Write bvec and bval encoded using 5 bits. (c) Write qðO; Þ encoded using 16 bits. (d) Quantize the first region of each slice, regð1; 1Þ; regð2; 1Þ; . . . ; regðr; 1Þ, in bvec bits each component of the singular vectors and bval the singular values. Write them as follows:
2 6 4
2
3
3
6 7 r1 þ u11 þ v11 þ7 5 þ þ 4 rr1 þ ur1 þ vr1 5
1 |{z}
|{z}
bval bits
sbvec bits
|{z}
|{z}
tbvec bits
bval bits
|{z}
sbvec bits
|{z}
tbvec bits
(e) Compute bs ¼ Intðlog2 rÞ þ 1, where IntðxÞ is the integer part of x, the number of bits to encode the slice number. (f) Quantize the remainder regions using bvec bits for each component of the singular vectors and using bval bits for the singular values, and write them in the file as follows:
2
3
4slice No: þ r þ u þ v 5 þ |fflfflfflfflffl{zfflfflfflfflffl} |{z} |{z} |{z} bs bits
2
bval bits
sbvec bits
tbvec bits
3
þ 4slice No: þ |{z} r þ |{z} u þ |{z} v 5: |fflfflfflfflffl{zfflfflfflfflffl} bs bits
bval bits
sbvec bits
tbvec bits
Algorithm 7, provides an encoding of a 3D image such that the relevant data (regions with singular values greater than the threshold) are stored almost completely, achieving the desirable goals of: Good quality reconstruction at first steps of progressive transmission. Final reconstruction (almost) identical to the original image. The image encoding algorithm does not generate much more data than the original image. Note that the ordering carried out in Step 4 of Algorithm 7 depends on the SVD and therefore the encoding is dependent on the image. In this sense, the above algorithm is an adaptive encoding of 3D images and consequently leads to an adaptive progressive transmission. Of course, this progressive transmission is lossy, because of data loss, mainly removing the regions with singular value smaller than the threshold and during quantization. 3.1. On the file size of the encoded 3D image
Fig. 4. Bone- and skin-level renderings of the approximate 3D image O with D3 ðO; O Þ ¼ 0:05.
The size of a 3D r s t image O is
I. Baeza et al. / Image and Vision Computing 28 (2010) 449–457
b r s t bits;
ð13Þ
where b is the number of bits used to store and save the data. If we apply Algorithm 7 to O, the encoded O will have the size
16 þ 16 þ 16 þ 5 þ 5 þ 16 |fflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl ffl{zfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl ffl} þrðsbvec þ bval þ tbvec Þ þ ðqðO; Þ rÞ
453
(c) Read the following s bvec bits and store them in the vector variable u. (d) Read the following t bvec bits and store them in the vector variable v. (e) Compute On ¼ On þ ruvT , (improvement of slice n).
74
ðbs þ sbvec þ bval þ tbvec Þ bits:
ð14Þ
Consider the 3D data set O of Fig. 3 consisting of 93 parallel slices, each of size 256 256 encoded in 16 bits. Its size is
16 bits 93 256 256 ¼ 97; 517; 568 bits ¼ 11:625 MBytes: Let us apply Algorithm 7 to this 3D data set O. After SVD of each slice we have
93 256 ¼ 23; 808 regions; and we order them following the Step 4 of the Algorithm 7. The such that
r
sffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi P 2 r Prrp>r 2 6 0:01; k¼1
rk
is r ¼ 414:256, corresponding to region #7680, i.e., there are qðO; Þ ¼ 7679 non-rejected regions (33.43%). Therefore, we can encode the 3D image O, taking bvec ¼ 16 bits, bval ¼ 32 bits and
bs ¼ Int½log2 ð93Þ þ 1 ¼ 7; into a file with (see (14))
74 þ 93ð256 16 þ 32 þ 256 16Þ þ ð7679 93Þð7 þ 256 16 þ 32 þ 256 16Þ ¼ 63; 205; 272 bits ¼ 7:534 MBytes; corresponding to 64.81% of original data.
4. Algorithm for progressive transmission and reconstruction of images The client wants to visualize a 3D image. Once the desired image is identified, the server reads the data from the encoded (by Algorithm 7) file of the image and transmits them. The client receives the data and starts to build progressive reconstructions as more data arrive. The following algorithm is based on formula (8). Each time a new region is transmitted the client reads the slice number with the data of that slice and improves its reconstruction. Algorithm 8 (Image reconstruction). Suppose the server is ready to transmit, progressively, the desired image. The server starts reading from the encoded data and transmits them sequentially. Then, in the client side:
Each region contains a small amount of data; therefore, it would be suitable to update the image visualization only when, for instance, 1% or 2% of additional data are received. For that reason, the data transmitted until Step 2 of Algorithm 8, included, can be sent completely in a row, being a good starting point to build the reconstruction and visualization, and their complete transmission avoids interpolating or similar techniques to approximate slices without any transmitted region. For instance, in the example of 93 CT data set given in Fig. 3, the size of the data transmitted until Step 2, included, is
74 þ 93ð256 16 þ 32 þ 256 16Þ ¼ 764;906 bits ¼ 93:372 KBytes; that corresponds to 1.21% of the total size of the encoded file. 5. Transmitting and reconstructing ROIs In Step 4 of Algorithm 8, let us suppose that the client visualizes an emerging feature of interest. In such a situation, the user is able to stop the process and select the region of interest (ROI). Suppose that the client selects p ROIs whose coordinates of the upper left and the lower right corners are, respectively, ðxi1 ; yi1 ; zi1 Þ and ðxi2 ; yi2 ; zi2 Þ; i ¼ 1; 2; . . . ; p where xi1 ; yi1 ; zi1 ; xi2 ; yi2 ; zi2 are integers and xi1 6 xi2 ; yi1 6 yi2 , zi1 6 zi2 (see Fig. 5). Let us define
X ¼ x11 ; x12 [ [ xp1 ; xp2 ; 1 1 p p Y ¼ y1 ; y2 [ [ y1 ; y2 ; Z ¼ z11 ; z12 [ [ zp1 ; zp2 :
ð16Þ ð17Þ
The coordinates are transmitted to the server, and the server and the client resume the transmission/reception. Now, instead of complete regions, parts of regions are going to be transmitted, doing reconstructions following the scheme depicted in Fig. 2. Thus, the sequence of data transmission/reception has to be changed, certain strategies to avoid data repetition have to be developed and some extra data for region identification have to be included in transmission. To avoid data redundancy, two matrices and a vector should be defined in both server and client sides. Suppose that q complete regions have been transmitted/received so far. In the server side, the
(1) Read the first transmitted 16 þ 16 þ 16 þ 5 þ 5 þ 16 bits and store them in variables r; s; t; bvec ; bval , and q, respectively. (2) FOR i ¼ 1; . . . ; r; (a) Read the following bval bits and store them in variable r. (b) Read the following s bvec bits and store them in the vector variable u. (c) Read the following t bvec bits and store them in the vector variable v. (d) Compute Oi ¼ ruvT . (3) Compute bs ¼ Intðlog2 rÞ þ 1. (4) Until the client stops the transmission or the end of file is reached. (a) Read the first bs bits and store them in variable n. (b) Read the following bval bits and store them in variable r.
ð15Þ
Fig. 5. Upper left and lower right coordinates of a ROI into a 3D image.
454
I. Baeza et al. / Image and Vision Computing 28 (2010) 449–457
matrices U s of size ðqðO; Þ qÞ s and V s of size ðqðO; Þ qÞ t, and the vector Ss of size qðO; Þ q are defined. The entry U s ði; jÞ is 1 if the component j of vector u of region q þ i has been already transmitted, 0 otherwise. Analogously for V s ði; jÞ with component j of vector v of region q þ i and Ss ðiÞ with the singular value of region q þ i. In the client side, let us define analogous matrices U c of size ðqðO; Þ qÞ s and V c of size ðqðO; Þ qÞ t, and the vector Sc of size qðO; Þ q. The entries U c ði; jÞ and V c ði; jÞ are the received component j of vectors u and v, respectively, of region q þ i, 0 otherwise. The entry Sc ðiÞ stores the singular value of region q þ i if it has been received, 0 otherwise. Three new aspects should be commented here: the first is that, now, the number of the transmitting region is an additional datum that the server has to transmit to the client. In Algorithm 8, this is not necessary because complete regions are transmitted. The second is that the auxiliary matrices and vectors described above only appear when ROI transmission is needed and they will be removed from the server and client memory when the progressive transmission/reception is finished. The third deals with avoiding data overlapping. When a ROI, called A, has been transmitted and, subsequently, the client selects a new ROI, called B, with data (regions) in common with A, only the corresponding untransmitted components of vectors u and v of the common regions have to be transmitted, because the remainding components were already stored in U c and V c and the singular value in Sc during the transmission of the previous ROI A. Moreover, note that in the reconstruction overlapping may occur. For instance, suppose that during ROI A reconstruction a part of a slice is reconstructed as follows
2
3 2 0 0 6 7 6 On ¼ r4 u2 5½0; v2 ; v3 ¼ 4 0 u3 0
0
0
3
ru2 v2 ru2 v3 7 5; ru3 v2 ru3 v3
ð18Þ
and, further, during ROI B reconstruction, the same slice is improved as follows:
2
2
3
0 u1 6 6 7 On ¼ On þ r4 u2 5½v1 ; v2 ; 0 ¼ 4 0 0
2
3
0
0
3
ru2 v2 ru2 v3 7 5 0 ru3 v2 ru3 v3
ru1 v1 ru1 v2 0 6 7 þ 4 ru2 v1 ru2 v2 0 5: 0
0
ð19Þ
0
Notice that overlapping appears in common component ð2; 2Þ. In order to avoid that possible overlapping overwrites the correct value in the reconstruction, a modified matrix addition is defined. Definition 9. If A ¼ ðaij Þ and B ¼ ðbij Þ are matrices of the same size, we define the non-overlapping matrix addition as
A B ¼ ðaij bij Þ;
The client selected one or several ROIs, the coordinates of the opposite corners of each ROI have been transmitted to the server and sets X; Y and Z as in (15)–(17) have been defined in both sides, server and client. In the server part, only for the first time a ROI transmission is required, Define the matrix U s of size ðqðO; Þ qÞ s and fill its entries with 00 s. Define the matrix V s of size ðqðO; Þ qÞ t and fill its entries with 00 s. Define the vector Ss of size qðO; Þ q and fill its entries with 00 s. Therefore, we assume that these matrices, U s and V s , as well as vector Ss are already defined and, maybe, some of their entries filled by non-zeros because of previous ROI transmissions. Algorithm 10 (Server ROI transmission). The server starts the transmission from region q þ 1. FOR i ¼ 1 TO qðO; Þ q or until the client stops the transmission, (1) read the first bs bits and store them in variable n, (2) IF n 2 X, (the region contains data of the current ROIs) (a) encode i to 16 bits and transmit it (region number encoding and transmission), (b) transmit n (slice number corresponding to the current transmitted region), (c) IF Ss ðiÞ ¼ 0, transmit the following bval bits, (singular value transmission), (d) FOR j ¼ 1 TO s (singular vector u transmission), (i) IF j 2 Y AND U s ði; jÞ ¼ 0, transmit the bvec bits of component j of the vector u of the current region and assign U s ði; jÞ ¼ 1, (e) FOR j ¼ 1 TO t (singular vector v transmission), (i) IF j 2 Z AND V s ði; jÞ ¼ 0, transmit the bvec bits of component j of the vector v of the current region and assign V s ði; jÞ ¼ 1. Now, in the client part and only for the first time a ROI transmission is required, Define the matrix U c of size ðqðO; Þ qÞ s and fill its entries with 00 s. Define the matrix V c of size ðqðO; Þ qÞ t and fill its entries with 00 s. Define the vector Sc of size qðO; Þ q and fill its entries with 00 s. Hence, we assume these matrices, U c and V c , as well as vector Sc are already defined and, maybe, some of their entries filled by non-zeros.
where
aij bij ¼
aij
if aij –0;
bij
if aij ¼ 0:
The above commented aspects lead us to consider both sides, server and client, separately in order to present the following algorithms for ROI progressive transmission and reconstruction. The starting conditions of both, Algorithms 10 and 11, for server ROI transmission and client ROI reconstruction are: The client stopped the transmission at certain region q.
Algorithm 11 (Client ROI reconstruction). Starts the server transmission from region q þ 1. Assign variable overlapping = False. Until the client stops the transmission or no more data are received. (1) Read the following 16 bits and store them in variable i (region number). (2) Read the following bs bits and store them in variable n (slice number). (3) Singular value construction.
I. Baeza et al. / Image and Vision Computing 28 (2010) 449–457
IF Sc ðiÞ ¼ 0, read the following bval bits, store them in variable r and assign Sc ðiÞ ¼ r. ELSE, the singular value was transmitted before and r ¼ Sc ðiÞ (in this case overlapping may occur) and assign overlapping = True. (4) Construction of vector u. Fill the components j such that j R Y with 00 s. Read the following bvec bits and store them in variable aux. FOR ALL j 2 Y. IF U c ði; jÞ ¼ 0, assign uðjÞ ¼ aux, assign U c ði; jÞ ¼ aux, read the following bvec bits and store them in variable aux. ELSE, assign uðjÞ ¼ U c ði; jÞ. (5) Construction of vector v. Fill the components j such that j R Z with 00 s. Read the following bvec bits and store them in variable aux. FOR ALL j 2 Z. IF V c ði; jÞ ¼ 0, assign vðjÞ ¼ aux, assign V c ði; jÞ ¼ aux, read the following bvec bits and store them in variable aux. ELSE, assign vðjÞ ¼ V c ði; jÞ. (6) Improvement of ROI reconstruction in slice n IF overlapping = True, improve On ¼ On ruvT . ELSE, improve On ¼ On þ ruvT . Note that, in both algorithms, every time a new ROI transmission is required, the server starts transmitting data from the region q þ 1, that is, the first region not completely transmitted. Both algorithms are prepared to transmit and receive a sequential selection of ROIs or several ROIs at the same time, and when no more ROIs are required, they allow the transmission of the remainder untransmitted data which are used to improve the reconstruction until the complete encoded data set of the 3D image is transmitted.
455
that, a priori, in a real image the existence of that ‘‘tumor” is unknown. This image has been encoded by Algorithm 7 using ¼ 0:01; bval ¼ 32 bits, bvec ¼ 16 bits and bs ¼ 7 bits (see Section 3.1). The server starts the progressive transmission to the client and after the transmission of 476 regions of the encoded image, that is 6.20% or 478.196 KBytes, the obtained reconstruction can be seen in Fig. 7. Note that the ‘‘tumor”, at left of the jawbone, is emerging in the rendering. Then, the client stops the transmission and selects the ROI determined by the opposite corners ðx1 ; y1 ; z1 Þ ¼ ð58; 75; 130Þ and ðx2 ; y2 ; z2 Þ ¼ ð72; 170; 200Þ that contains the emerging ‘‘tumor”. The coordinates are transmitted to the server that resumes the transmission from region 477, transmitting only the data corresponding to the selected ROI. Fig. 8(a) and (b) shows reconstructions built with 6.20% of complete regions transmitted, and 25% and 100%, of the ROI data. Fig. 8(b) is reconstructed with 476 complete regions and 1837 regions corresponding to that ROI, that is, 478:196 KBytes þ 1837 ð16 þ bs Þ þ 1837 32 þ 1837 16 ððy2 y1 þ 1Þ þ ðz2 z1 þ 1ÞÞ bits ¼ 8;926;878 bits ¼ 1:06417 MBytes:
Recalling that bs ¼ 7; y1 ¼ 75; z1 ¼ 130; y2 ¼ 170; z2 ¼ 200, that corresponds to 14.12% of encoded 3D image, the ‘‘tumor” has been completely visualized saving the transmission of 85.88% of the encoded data. 7. A comparative with JPEG2000 In this section, let us compare the performance of the proposed method with MAXSHIFT method. In order to simplify the experiment, we consider the 2D image of size 256 256 whose pixels are 16-bit gray level of Fig. 9. In this image, we select two ROIs whose corner coordinates are ROI A: ð60; 55Þ–ð77; 70Þ and ROI B: ð175; 187Þ–ð186; 200Þ.
Fig. 6 is a lateral ray casting reconstruction of the human head of Fig. 3 using VTK [20] where we added an implanted ‘‘tumor” to it; this provides a recognizable internal detail for visual exploration. The ‘‘tumor” is at left side of the jawbone (see Fig. 6). Note
Let us use PSNR as a measure to compare results. Good reconstructed images have PSNR values of 30 dB or more [21, p. 6]. On one hand, only one ROI can be selected with MAXSHIFT method. Therefore, we should select a ROI, say C, that contains both ROIs A and B, then this ROI C is shifted and the complete 2D image encoded using JPEG2000. The encoding is carried out at rate r, where r ¼ 0:05; 0:1; 0:15; . . . ; 2:70; 2:75. The PSNR at r ¼ 2:75 is 238.03 that corresponds to the transmission of
Fig. 6. Ray casting lateral rendering of the 93 human head CT slices with an implanted ‘‘tumor” at jawbone.
Fig. 7. At 6.20% of encoded 3D image transmitted, i.e., 476 transmitted regions, something unusual can be perceived under the jawbone.
6. Example
456
I. Baeza et al. / Image and Vision Computing 28 (2010) 449–457
Fig. 8. ROI reconstruction, (a) at 25% of the ROI data transmitted, (b) at 100% of the ROI data transmitted.
64.79 dB (higher b implies the transmission of less number of regions and lower b bad quality image reconstruction). In Fig. 10 we show the PSNR at different ratios with JPEG2000 and the proposed method using SVD with b ¼ 13 bits. SVD method does not achieve the same PSNR as MAXSHIFT method (see the right part of Fig. 10(a)) because SVD is a lossy progressive transmission. The reconstruction of the ROIs using both methods at 30 dB can be seen in Fig. 11. JPEG2000 needs to reconstruct the ROI C, containing ROIs A and B, to achieve the same PSNR at the ROIs A and B as the SVD method. As we can see in Fig. 10 both methods are comparable at low ratios. ROI reconstructions are shown in Fig. 11. After that, as more and more data are used for the reconstruction JPEG2000 turns out to provide a better PSNR at higher ratios than SVD. On the other hand, the proposed SVD method performs better when small and separated ROIs have been selected. Furthermore, when the gray-level resolution increases it can use more bits to quantize singular values and vectors and consequently better quality images can be reconstructed. 8. Conclusions
Fig. 9. 2-D image 256 256 and 16-bit gray level with two selected ROIs.
22580 bits. Kakadu software [22] is used to perform the above computations. The rate is an input in Kakadu. On the other hand, we compute SVD of 2D image in Fig. 9. Each time we send a new region of ROIs A and B, the client receives
18 þ b½ð77 60 þ 1Þ þ ð70 55 þ 1Þ þ ð186 175 þ 1Þ þ ð200 187 þ 1Þ ¼ 18 þ 60b bits; where the singular value is quantized with 18 bits and the singular vectors with b bits. Then, for b ¼ 11; 12; 13; 14; 15; 16 we simulate the transmission of 22,580 bits and the best result is obtained for b ¼ 13 bits (12 bits + 1 bit for the sign) where the PSNR is
In this paper we present an algorithm for lossy and adaptive encoding of 3D digital images based on SVD. This encoding is useful for progressive transmission of images and an algorithm for reconstruction is developed. In addition, the developed encoding allows the progressive transmission of ROIs, sequentially or concurrently, with simple changes in transmission and reconstruction algorithms, avoiding re-encoding and redundancy in data transmission. Moreover, the client can select the ROIs at any step during the transmission process. Furthermore, a comparative with JPEG2000 has been carried out, where there can be only one ROI, and this ROI must be defined a priori, as MAXSHIFT has no way to select ROIs during the process. Under these circumstances, JPEG2000 is the best method known for whole images. Nevertheless, if the client selects other ROI, the whole image must be encoded again and the transmission must be re-started from the beginning. Redundancy and interlaced data are the main drawbacks of this method. On the other hand, the SVD encoding increases the amount of data but an efficient use of data quantization can paliate this drawback in exchange to the loss of image quality. On the other hand, SVD allows definition and handling of several ROIs at the same time during the process, and there is no data redundancy as the encoding is done only once. This also saves computation time.
Fig. 10. (a) PSNR at different ratios with JPEG2000 and the proposed method. The singular vectors are quantized with 13 bits. (b) Zoom of graphic (a) around PSNR 30 dB.
I. Baeza et al. / Image and Vision Computing 28 (2010) 449–457
Fig. 11. Reconstruction of the ROIs at PSNR 30 dB using (a) the proposed method with singular vectors quantized with 13 bits, (b) JPEG2000.
Moreover, the proposed method improves performance when small and separated ROIs have been selected and when the gray-level resolution increases. Both methods are comparable at low ratios. Finally, this paper has been written under an algorithmic point of view in order to allow straightforward design and implementation of automatic processes for progressive transmission of 3D images and their ROIs, in a transparent form for the final user. References [1] E. Defez, A.G. Law, J. Villanueva-Oller, R.J. Villanueva, Matrix cubic splines for progressive transmission of images, J. Math. Imaging Vision 23-1 (2002) 41– 53. [2] T. Sigitani, Y. Iiguni, H. Maeda, Progressive cross-section display of 3D medical images, Phys. Med. Biol. 44-6 (1999) 1565–1577. [3] R.S. Dilmaghani, A. Ahmadian, M. Ghavami, A.H. Aghvami, Progressive medical image transmission and compression, IEEE Signal Process. Lett. 11-10 (2004) 806–809.
457
[4] G. Menegaz, J.P. Thiran, Lossy to lossless object-based coding of 3-D MRI data, IEEE Trans. Image Process. 11-9 (2002) 1053–1061. [5] X. Wu, T. Qiu, Wavelet coding of volumetric medical images for high throughput and operability, IEEE Trans. Medical Image Process. 24-6 (2005) 719–727. [6] H. Zhu, R.A. Brown, R.J. Villanueva, J. Villanueva-Oller, M.L. Lauzon, J.R. Mitchell, A.G. Law, Progressive imaging: S-transform order, ANZIAM J. 45 (E) (2004) C1002–C1016. [7] Y.-S. Kim, W.-Y. Kim, Reversible decorrelation method for progressive transmission of 3D medical image, IEEE Trans. Medical Imaging 17-3 (1998) 383–394. [8] I. Baeza, J. Villanueva-Oller, R.J. Villanueva, A.G. Law, Progressive transmission of images: adaptive best strategies, Math. Comput. Model. 41 (2005) 1325– 1339. [9] C. Christopoulos, A. Skodras, A. Ebrahimi, The JPEG 2000 still image coding system: an overview, IEEE Trans. Consumer Electron. 46 (2000) 1103–1127. [10] I. Baeza, J.-A. Verdoy, J. Villanueva-Oller, R.-J. Villanueva, ROI-based procedures for progressive transmission of digital images: a comparison, Math. Comput. Model. (2009), doi:10.1016/j.mcm.2009.05.014. [11] G.H. Golub, C.F. van Loan, Matrix Computations, third ed., Johns Hopkins Univ. Press, Baltimore, MA, 1996. [12] R. Pratap, Getting started with Matlab: Version 6. A Quick Introduction for Scientists and Engineers, Oxford University Press, Oxford, 2002. [13] Available from:
. [14] Available from: . [15] P.P. Kanjilal, S. Palit, On multiple pattern extraction using singular value decomposition, IEEE Trans. Signal Process. 43 (6) (1995) 1536–1540. [16] V.I. Gorodetski, L.J. Popyack, V. Samoilov, V.A. Skormin, SVD-based approach to transparent embedding data into digital images, in: Proceedings of the International Workshop on Information Assurance in Computer Networks: Methods, Models, and Architectures for Network Security, Springer, 2001, pp. 263–274. [17] V.V. Selivanov, R. Lecomte, Fast PET image reconstruction based on SVD decomposition of the system matrix, IEEE Trans. Nucl. Sci. 48 (3) (2001) 761– 767. [18] F. Pedroche, On some capabilities of the SVD expansion to handle images, WSEAS Trans. Math. 1–2 (2002) 67–70. [19] J.S. Walker, A Primer on Wavelets and their Scientific Applications, Chapman & Hall/CRC Press, London/Boca Raton, FL, 1999. [20] Available from: . [21] D.S. Taubman, M.W. Marcellin, JPEG2000: Image Compression Fundamentals, Standards and Practice, Kluwer Academic Publishers, Dordrecht, 2002. [22] Kakadu software, a comprehensive framework for JPEG2000 developers. Available from: .