Super-resolution of cardiac magnetic resonance images using Laplacian Pyramid based on Generative Adversarial Networks


Computerized Medical Imaging and Graphics 80 (2020) 101698

Contents lists available at ScienceDirect

Computerized Medical Imaging and Graphics journal homepage: www.elsevier.com/locate/compmedimag

Ming Zhao a, Xinhong Liu a, Hui Liu b, Kelvin K.L. Wong c,∗

a School of Computer Science and Engineering, Central South University, Changsha, 410000, China
b Computer Science Department, Missouri State University, Springfield, 62701, United States
c Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China

Article info

Article history: Received 23 September 2019; Received in revised form 28 December 2019; Accepted 2 January 2020

Keywords: Cardiac magnetic resonance imaging; Single image super-resolution; Generative Adversarial Networks; Laplacian Pyramid; Image enhancement

Abstract

Background and objective: Cardiac magnetic resonance imaging (MRI) can assist in both functional and structural analysis of the heart, but due to hardware and physical limitations, high-resolution MRI scans are time consuming and the peak signal-to-noise ratio (PSNR) is low. Existing super-resolution methods attempt to resolve this issue, but shortcomings remain, such as hallucinated details after super-resolution and low precision after reconstruction. To address these problems, we propose the Laplacian Pyramid Super-Resolution Generative Adversarial Network (LSRGAN) to generate visually better cardiac magnetic resonance images and thereby aid physician diagnosis and treatment.

Methods and results: To address the problem of low image resolution, we used the Laplacian Pyramid to analyze the high-frequency detail features of super-resolution (SR) reconstructions of images with different pixel sizes. To eliminate gradient disappearance, we implemented the least squares loss function in the discriminator, and we introduce the residual-dense block (RDB) as the basic network building unit to generate higher quality images. The experimental results show that LSRGAN can effectively avoid illusory details after super-resolution and has the best reconstruction quality. Compared with state-of-the-art methods, our proposed algorithm generates higher quality super-resolution images with higher peak signal-to-noise ratio and structural similarity (SSIM) scores.

Conclusion: We implemented a novel LSRGAN network model, which reduces the insufficient resolution and hallucinated details of MRI after super-resolution. Our research presents a superior super-resolution method to help medical experts diagnose and treat myocardial ischemia and myocardial infarction.

© 2020 Published by Elsevier Ltd.

1. Introduction

Cardiac magnetic resonance imaging (MRI) can perform functional and structural analysis of the heart. High tissue contrast is the "gold standard" for assessing the heart. Cardiovascular imaging can accurately display anatomy, morphology, function, blood perfusion and myocardial activity. MRI, with its large field of view, high tissue resolution and non-radiation characteristics, has been used in many clinical applications and computerized methods (Goceri and Goceri, 2015; Goceri, 2016; Dura et al., 2018; Goceri, 2013, 2011) to help medical experts. However, due to hardware and physical limitations, the cost of high-resolution MRI imaging is long scan time, small spatial coverage, and low signal-to-noise ratio. The

∗ Corresponding author at: Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China. E-mail address: [email protected] (K.K.L. Wong). https://doi.org/10.1016/j.compmedimag.2020.101698 0895-6111/© 2020 Published by Elsevier Ltd.

ability to recover high-resolution (HR) images from a single low-resolution input may overcome these deficiencies. Without such images, medical experts are unable to perform more accurate diagnosis and treatment (Atalay, 2014; Gao et al., 2018; Zhang et al., 2018; Zhao et al., 2018; Xu et al., 2017; Xu et al., 2018). Therefore, it is important to use super-resolution technology to improve the resolution of medical images at the software level.

Single image super-resolution refers to the process of recovering a high-resolution image from a low-resolution one. Although convolutional neural networks have achieved great breakthroughs in single image super-resolution, many shortcomings remain. First, existing methods reconstruct the high-resolution image after an upsampling step, which makes it more difficult to learn the mapping function under a large scale factor. Second, current super-resolution is limited by factors such as gradient disappearance, loss of high-frequency detail, artifact generation, and poor training stability. Finally, because pictures of different pixel sizes contain different high-frequency detail features, current mainstream methods cannot effectively recover the high-frequency details of the target size (Zeyde et al., 2012; Hinton and Salakhutdinov, 2006a; Gupta et al., 2011; Timofte et al., 2013; Fischer et al., 2015; Schulter et al., 2015).

We present the LSRGAN, which reconstructs high-resolution images progressively, step by step. We introduce the residual-dense block as the basic network building block and use the least squares loss function in the discriminator. At the same time, we adopt the Laplacian pyramid model to capture high-frequency detail differences between low-resolution and high-resolution images, allowing low-resolution images to recover the lost detail through the mapping function. According to pixel size, images are divided into three types: S, M, and L. For each pixel size category, a pre-trained GAN network is applied step by step, effectively restoring the high-frequency details of the target pixel size.

The main contributions of this paper are as follows: (1) We are the first to propose combining GANs with the Laplacian pyramid for image super-resolution; combining the two methods captures more high-frequency detail and improves the super-resolution effect. (2) We establish a new GAN architecture, introduce the RDB as a new network building unit, and use the least squares loss function in the discriminator. Experimental results show that our LSRGAN outperforms current methods such as LapSRN, DRCN and SRGAN, with higher PSNR and SSIM scores. (3) Unlike other methods, we use a progressive reconstruction approach.
By implementing the Laplacian pyramid model, multiple SR images are generated during reconstruction, and SR images of different pixel sizes are trained in order to learn high-resolution details. This makes our model more flexible in application: different pre-trained GAN networks run for different image sizes and recover the high-frequency details of the image. In scenarios where computing resources are limited, our 8× model can also perform 2× or 4× magnification, which, to the best of our knowledge, has not appeared in other methods.

2. Methodology

2.1. Related work

Super-resolution has been extensively studied in the literature. Here, we focus our discussion on recent example-based and single-image super-resolution approaches.

2.1.1. Image super-resolution

The concept of super-resolution was first proposed by Harris (1964) and Goodman (1968); the latest overviews of super-resolution were presented by Nasrollahi and Moeslund (2014) and Yeh et al. (2016). In general, single image super-resolution methods can be divided into three types: 1) interpolation-based, 2) statistics-based, and 3) learning-based. The earliest single image super-resolution problems were solved by interpolation-based methods, mainly nearest-neighbor, bilinear and bicubic interpolation. Rajan and Chaudhri (2001) divided image interpolation into a general model of decomposition, interpolation and fusion: the image is decomposed into different subspaces for interpolation, and the interpolation results are fed back into the image domain for fusion. These methods are fast, but usually produce an excessively smooth

texture and are limited to small magnification factors.

The learning-based super-resolution method utilizes self-similarity features in natural images to create complex mappings between low-resolution (LR) and HR images, and uses a large number of LR/HR image pairs to infer missing high-frequency information, promising to break the limitations of small magnification. A method of constructing LR-HR patch pairs from a proportional spatial pyramid of the LR input image was proposed by exploiting self-similarity in natural images (Glasner et al., 2009). GANs were also implemented for unsupervised representation learning by Radford et al. (2015). Hayat (2018) surveyed deep learning methods (Hinton and Salakhutdinov, 2006b) for the SR problem, which are attractive due to their efficient formulations (Goceri, 2018), particularly in medical image analysis (Goceri and Goceri, 2017). In one line of work, a three-layer deep convolutional network is trained between the low-resolution and high-resolution image domains, learning an end-to-end mapping that is sparse and yields better SR performance. In another, the output of each multi-scale residual network is utilized as a layered feature for global feature fusion, and all of these features are used to repair high-resolution images in the reconstruction model. The efficiency of residual blocks in deep networks has also been shown in different works (Goceri, 2019a). However, most of these convolutional methods rely on an MSE loss to learn the LR/HR mapping function, which results in an output that is too smooth when the input resolution is low and the magnification is large. There are also still open and challenging issues, such as overfitting, in these deep learning based approaches (Goceri, 2019b), although Sobolev gradient based optimizers have recently been used to increase performance (Goceri, 2019c).
To handle these problems, perceptual losses based on high-level features from a pre-trained network were proposed (Bruna et al., 2016; Johnson et al., 2016). Gulrajani et al. (2017) introduced WGAN-GP to fix gradient disappearance and gradient explosion through a new loss function. Simonyan and Zisserman (2014) implemented the VGG deep convolutional network, later widely used for image super-resolution. Ledig et al. (2016) suggested a generative adversarial network for image super-resolution, optimizing a combined function of adversarial loss and content loss. This network architecture is somewhat similar to ours, but our approach targets higher magnifications, such as 8× or 16×, while SRGAN focuses on 4×. Moreover, Ledig et al. focus on learning the LR/HR mapping function, whereas our network mainly captures missing high-frequency details for better reconstruction quality.

For MRI, Oktay et al. (2016, 2018) provide high-frequency information by training single-image and multi-image networks with a deep learning method to predict residuals; super-resolution and segmentation are performed by adding shape priors. An RNN was first used for cardiac MRI by Poudel et al. (2017), reconstructing using low-frequency features in successive frames of the cardiac cycle. To enhance cardiac MRI, a super-resolution network based on the U-net and long short-term memory layers was used (Basty and Grau, 2018). In this paper, we propose the Laplacian Pyramid Generative Adversarial Network. Our results show that, compared with other methods, LSRGAN can effectively improve reconstruction quality and avoid illusory information.

2.1.2. Laplacian Pyramid

The Laplacian Pyramid was proposed by Burt and Adelson (1983) in 1983.
The method can perform multi-scale feature extraction on images and is widely used in various visual tasks, including image fusion, texture synthesis, and semantic segmentation. Lai et al. (2019) presented the LapSRN network, which


Fig. 1. Super-resolution reconstruction framework. This technique magnifies the LR image by 4× by applying the corresponding LSRGAN model twice.

is the most relevant to our work, but differs in the following two ways:

1) LapSRN pursues higher reconstruction speed, so higher-level residual images are generated by sharing lower-level features. This method is faster at inference, but has a lower ability to capture high-frequency image features. We instead perform step-by-step training on low-resolution images according to the classified image categories, capturing the high-frequency details contained in images with different resolution features and obtaining better texture features.
2) The loss function and the architecture design are different. LapSRN uses the Charbonnier penalty function to penalize the deviation between the SR prediction and the real HR image. We implement the least squares loss to effectively solve the gradient disappearance problem and improve the reconstructed image quality.
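As a toy illustration of the Laplacian pyramid decomposition used here, the sketch below substitutes 2 × 2 average pooling and nearest-neighbour upsampling for the Gaussian filtering of Burt and Adelson; it is not the paper's implementation, only the decompose/reconstruct idea:

```python
import numpy as np

def downsample(img):
    """Halve resolution by 2x2 average pooling (stand-in for blur + decimate)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(img):
    """Double resolution by nearest-neighbour repetition."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def laplacian_pyramid(img, levels):
    """Return [residual_0, ..., residual_{levels-1}, base]: each residual holds
    the high-frequency detail lost by one downsampling step."""
    pyramid, current = [], img
    for _ in range(levels):
        low = downsample(current)
        pyramid.append(current - upsample(low))  # high-frequency residual
        current = low
    pyramid.append(current)                      # low-frequency base
    return pyramid

def reconstruct(pyramid):
    """Invert the decomposition: upsample the base and add residuals back."""
    current = pyramid[-1]
    for residual in reversed(pyramid[:-1]):
        current = upsample(current) + residual
    return current
```

Reconstruction is exact by construction, which is what makes the residuals a lossless carrier of the high-frequency detail that the LSRGAN generator predicts.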

2.1.3. Generative adversarial network

The Generative Adversarial Network (GAN) was proposed by Goodfellow et al. (2014). By training a discriminator network D and a generator network G, it is possible to capture the latent distribution of real data and output new data. This is achieved by training through a min-max two-player game. However, the discriminator can easily distinguish between the generated image and the real HR image, which impairs the balance of training and makes it difficult for the GAN to generate realistic HR images. Huang et al. addressed this with a novel introspective variational autoencoder (IntroVAE) model. Ledig et al. (2016) solved the problem by optimizing a combined function of adversarial loss and content loss.

2.2. Laplacian pyramid generative adversarial networks for super-resolution

In this section, we describe the LSRGAN design methodology, including the network architecture, the generative adversarial network, and the perceptual loss functions.

Table 1
Image classification information.

Image type | Size of image (pixels)
S | [0–64] × [0–64]
M | [64–128] × [64–128]
L | > 128 × 128
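The S/M/L routing of Table 1 might be implemented as follows; this is a sketch, and the exact criterion (here the longest side) is our assumption:

```python
def classify_image(height, width):
    """Route an input to the pre-trained model for its size class (Table 1).
    Thresholds follow the S/M/L split; using the longest side is a guess."""
    longest = max(height, width)
    if longest <= 64:
        return "S"
    elif longest <= 128:
        return "M"
    return "L"
```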

2.2.1. Network architecture

Our network architecture is based on the Laplacian pyramid framework. The resolution of the LR image is judged, the image is input to the training model of the corresponding size, and the residual image at each pyramid level is predicted step by step. Table 1 shows the details of the image size classification. The super-resolution process is shown in Fig. 1.

We used the Laplacian pyramid to predict high-frequency details, so the generator is different from other methods. First, the difference image (DR) between the HR image and the upsampled LR image is produced by the generator, and then the DR and upsampled LR images are added to obtain the final SR image. The training diagram of the LSRGAN network is shown in Fig. 2.

2.2.2. Generative adversarial network

The generative adversarial network model uses two networks that compete with each other: the generator network G and the discriminator network D are trained iteratively against each other. The min-max problem is given in Eq. (1):

\min_G \max_D \; \mathbb{E}_{I^{DR} \sim p_{train}(I^{DR})}[\log D(I^{DR})] + \mathbb{E}_{I^{LR} \sim p_G(I^{LR})}[\log(1 - D(G(I^{LR})))]   (1)
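The min-max objective of Eq. (1) can be evaluated for a batch with a quick NumPy sketch; `d_real` and `d_fake` are hypothetical discriminator outputs in (0, 1):

```python
import numpy as np

def gan_objective(d_real, d_fake):
    """Batch value of Eq. (1): d_real = D(I_DR) on real difference images,
    d_fake = D(G(I_LR)) on generated ones, both in (0, 1)."""
    eps = 1e-12  # guard against log(0)
    return np.mean(np.log(d_real + eps)) + np.mean(np.log(1.0 - d_fake + eps))
```

D maximizes this value while G minimizes the second term; the saturation of log(1 − x) once D confidently rejects fakes is what motivates alternative losses such as least squares.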

This equation encourages the generator G to produce a DR image that spoofs the discriminator network, while the discriminator D distinguishes whether an image was generated by G or is real. However, when updating the generator, this loss function causes samples that lie on the correct side of the decision boundary but far from the real data to contribute almost no gradient. In this paper, we therefore use the least squares GAN with the a–b coding scheme for the discriminator, defining the objective function as follows:

\min_D V_{LSRGAN}(D) = \frac{1}{2}\mathbb{E}_{I^{DR} \sim P_{train}(I^{DR})}[(D(I^{DR}) - b)^2] + \frac{1}{2}\mathbb{E}_{I^{LR} \sim P_G(I^{LR})}[(D(G(I^{LR})) - a)^2]   (2)
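The least squares discriminator and generator objectives can be sketched numerically; this toy NumPy version assumes the 0–1 coding scheme (a = 0, b = c = 1) that the paper adopts:

```python
import numpy as np

def d_loss_ls(d_real, d_fake, a=0.0, b=1.0):
    """Least squares discriminator loss, Eq. (2): push D(real) toward b
    and D(fake) toward a."""
    return 0.5 * np.mean((d_real - b) ** 2) + 0.5 * np.mean((d_fake - a) ** 2)

def g_loss_ls(d_fake, c=1.0):
    """Least squares generator loss: push D(G(I_LR)) toward c (= b here)."""
    return 0.5 * np.mean((d_fake - c) ** 2)
```

Unlike the log loss, these quadratics penalize even correctly classified samples that sit far from the target label, so the generator keeps receiving gradient.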


The corresponding generator objective is

\min_G V_{LSRGAN}(G) = \frac{1}{2}\mathbb{E}_{I^{LR} \sim P_G(I^{LR})}[(D(G(I^{LR})) - c)^2]   (3)

where a and b are the labels of the fake and the real data, respectively, and c represents the value that G wants D to believe for the forged data. In order to make the generated samples as realistic as possible, we adopt the setting c = b and the 0–1 coding scheme, obtaining the following objective functions:

\min_D V_{LSRGAN}(D) = \frac{1}{2}\mathbb{E}_{I^{DR} \sim P_{train}(I^{DR})}[(D(I^{DR}) - 1)^2] + \frac{1}{2}\mathbb{E}_{I^{LR} \sim P_G(I^{LR})}[(D(G(I^{LR})))^2]   (4)

\min_G V_{LSRGAN}(G) = \frac{1}{2}\mathbb{E}_{I^{LR} \sim P_G(I^{LR})}[(D(G(I^{LR})) - 1)^2]   (5)

For the generator network architecture, we compared versions with and without skip connections and found that skip connections greatly improve network performance. The RDB residual block is shown in Fig. 3. RDB integrates the residual block and the dense block, removing the batch-norm layer and the pooling layer. The outputs of the RDB modules are concatenated. We then implemented two trained sub-pixel convolutional layers to increase the resolution of the input image, where each convolutional layer uses a 3 × 3 kernel with 256 mapped features and a stride of 1, finally producing the generated image. Our discriminator network uses the LeakyReLU activation function and avoids max pooling throughout the network. The discriminator network (shown in Fig. 4) contains eight convolutional layers. Finally, two dense layers produce the probability of sample classification through the least squares loss function.

2.2.3. Perceptual loss function

The loss function is a weighted combination of the content loss l_C^{SR} and the generative loss l_{Gen}^{SR}:

l^{SR} = l_C^{SR} + 10^{-3} l_{Gen}^{SR}   (6)

The content loss is defined as follows:

l_C^{SR} = \frac{1}{r^2 W H} \sum_{x=1}^{rW} \sum_{y=1}^{rH} \left(I_{x,y}^{HR} - G(I^{LR})_{x,y}\right)^2   (7)

In addition to the content loss, we add a GAN generative component to the loss function, which encourages our network to favor solutions that reside on the manifold of natural images by attempting to trick the discriminator network. The generative loss is defined as follows:

l_{Gen}^{SR} = \sum_{n=1}^{N} \left(D(G(I^{LR})) - 1\right)^2   (8)

2.3. Data set

Cardiac magnetic resonance imaging was performed on 64 patients with a Siemens Sonata 1.5 T Syngo MR 2004A scanner (Numaris-4, serial number 21,609). A total of 2560 MRIs were obtained. We uniformly resized the HR images to 1024 × 1024 pixels. The LR images were obtained by bicubic-interpolation downsampling; the number of downsampling steps corresponding to 2×, 4×, and 8× magnification is 1, 2, and 3, respectively. For each HR image and its associated label, we performed the following augmentations: 1) horizontal and vertical flipping; and 2) random cropping.

Fig. 2. Schematic diagram of the training process. After the image is downsampled, the upsampled image is compared with the original image, and the discriminator determines whether the generated image is real or not.

2.4. Training details and parameters

Our system was implemented on Ubuntu 16.04 with an NVIDIA GeForce GTX 1080 Ti GPU. We set the range of the LR input images to [0, 1] and of the HR images to [−1, 1]. For optimization, we used the Adam optimizer (Kingma and Ba, 2014) with β1 = 0.90, a learning rate of 10^{-4}, and 10^6 iterations. Our generator network uses 16 residual blocks, as explained in Section 3.1.2.

3. Results

3.1. Experimental analysis of LSRGAN model

Through experimental analysis, we compared the basic network units, the network structure, the number of residual blocks, and the influence of the Laplacian pyramid on the GAN, and finally determined the LSRGAN model.

3.1.1. Experimental analysis of basic network unit

We analyzed the impact of using RDB versus the plain residual block as the basic network unit on the super-resolution results, and finally decided to use RDB as the basic network building unit (shown in Fig. 5). Using RDB as the basic network unit significantly improves the super-resolution results and the generation speed.

3.1.2. Experimental analysis of least squares GAN

We carried out an experimental analysis of the least squares GAN (shown in Fig. 6). By comparing the least squares GAN with the ordinary GAN, and by comparing the effect of different numbers of residual blocks on the results, we decided to use 16 residual blocks in the generator network.

3.1.3. Experimental analysis of Laplacian pyramid

We derived the PSNR and SSIM at different magnifications by analyzing the GAN network with the Laplacian pyramid (shown in Fig. 7). It can be found that at 2× super-resolution, the network


Fig. 3. Schematic diagram of the residual block network. Each residual block includes 2 convolutional layers, each of which uses a 3 × 3 kernel, contains 64 feature maps, and has a step size of 1.
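The residual block of Fig. 3 can be sketched with a naive single-channel convolution. This is an illustrative NumPy sketch under simplifying assumptions (one channel instead of 64 feature maps, no learned weights), not the paper's implementation:

```python
import numpy as np

def conv2d_same(x, kernel):
    """Naive 'same'-padded 2D cross-correlation for a single channel."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

def residual_block(x, k1, k2):
    """Two 3x3 convolutions with a ReLU between, plus the identity skip."""
    h = np.maximum(conv2d_same(x, k1), 0.0)  # conv + ReLU
    return x + conv2d_same(h, k2)            # conv + skip connection
```

The skip connection means the block only has to learn a residual correction, which is what the caption's two-convolution structure computes.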

Fig. 4. Discriminator network architecture. Each of the squares represents a convolutional layer, and the number below the square represents the number of mapped features. The number above the square represents the step size. Each convolutional layer kernel is 3 × 3, and the convolutional layer is followed by the BN layer and the Leaky Relu activation function.

Fig. 5. PSNR comparison diagram of the basic network unit. The RDB model is used as LSRGAN-RDB and the Residual Block LSRGAN model as LSRGAN-RB. Comparing the results of the two at a magnification of 4×, it was found that the use of RDB resulted in a significant increase in the results.

Fig. 7. Comparison of the Laplacian pyramid. LSRGAN-LP stands for the model with the Laplacian Pyramid and LSRGAN-NLP for the model without it. (a) shows the comparison of PSNR under different factors. (b) shows the comparison of SSIM under different factors. Under high factors, the super-resolution effect is better with the Laplacian pyramid.

Fig. 6. PSNR comparison diagram of the loss function. LSRGAN-LS indicates the use of the least squares GAN, LSRGAN-NLS the use of a normal GAN. The results are compared at a 4× factor. A better PSNR is obtained with the least squares GAN. Experimental results show that satisfactory results can be obtained with 16 residual blocks, without complicating the network structure.

structure is the same and the Laplacian pyramid has no effect on the results. But at 4× and 8× super-resolution, the Laplacian pyramid effectively improves the reconstruction quality and yields better high-frequency details.
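The PSNR scores used throughout these comparisons follow the standard definition (10 log10 of squared peak over mean squared error); a minimal NumPy version for images scaled to [0, 1]:

```python
import numpy as np

def psnr(hr, sr, peak=1.0):
    """Peak signal-to-noise ratio in dB for images in [0, peak]."""
    mse = np.mean((hr - sr) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)
```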


Fig. 8. Visual comparisons at different magnifications. In (a) and (b), at 2× magnification, our method reconstructs the details of the heart's edge, which other methods fail to recover, and the internal details are more abundant. As can be seen from (c) and (d), our method has finer high-frequency texture features at 4× magnification, and the internal details are better highlighted. In (e) and (f), at 8× magnification, our method uses Laplacian pyramids and is advantageous at high magnification, as it gives better detail and edge contours.

3.2. Comparison with state-of-the-art techniques

We compared the LSRGAN model with the most advanced methods: bicubic, RCAN, MSRN, LapSRN and DBPN. The image enlargement details after 2×, 4× and 8× super-resolution reconstruction are shown in Fig. 8. It can be seen that the LSRGAN reconstruction is superior: by combining the Laplacian pyramid, it has richer high-frequency texture detail under a large scale factor, revealing details and fine outlines that are invisible with other methods. We trained and tested our model on the internal training set and compared it with the most advanced current methods in terms of PSNR and SSIM (shown in Fig. 9). The proposed method has higher reconstruction quality, especially for large scale factors (such as 8×), where combining the Laplacian pyramid recovers more image details to help medical experts in diagnosis and treatment. We provide a solution for the low resolution of MRI,

which improves assessment of cardiac morphology imaging, coronary surgery, etc. by generating clearer magnetic resonance images for medical experts.

4. Discussion

The use of the GAN network for image super-resolution has proven successful. However, we found that an ordinary GAN may cause the gradient to disappear. One reason is that the conventional GAN uses the sigmoid cross-entropy loss function in the discriminator network, whereas we use the least squares loss function. Firstly, the traditional loss function ignores samples that are classified correctly but lie far from the decision boundary, so generated samples can end up too far from the boundary, yielding worse results. With the least squares loss function, even correctly classified samples are penalized, which pulls the generated samples closer to the


Fig. 9. Comparison with the bicubic, DBPN, LapSRN, MSRN and RCAN techniques. (a) and (b) show PSNR and SSIM at 2× magnification and (c) and (d) at 4× magnification, compared with the most advanced methods; our results are better. (e) and (f) show PSNR and SSIM at 8× magnification, where our results have an obvious advantage.

decision boundary. Therefore, the super-resolution reconstruction works better. Secondly, because samples far from the decision boundary are penalized, more gradient is generated, which effectively resolves the gradient disappearance problem and makes learning more stable. Thirdly, we utilized the RDB module for training, integrating the residual block and the dense block. To improve computation speed and reduce computational complexity, we removed the BN layer of the dense block. We also found that pooling layers discard pixel-level information, so we do not use a pooling layer. We then used local residual learning to fuse the densely connected layers and local features. With this technique we can achieve higher quality images. Fourthly, we introduced the combination of the Laplacian pyramid and GAN to capture high-frequency details. The high-frequency details of ordinary GANs are very poor at high magnification. We implemented the Laplacian pyramid to capture the high-frequency detail specific to each magnification, which can recover the high-frequency detail of images after large-scale-factor super-resolution.

Overall, our LSRGAN framework can improve the resolution of cardiac MRI. Our method can avoid reconstructing hallucinated


details, ensure the stability of training, and generate SR images with richer high-frequency details.

5. Conclusion

In this work, we present a generative adversarial network based on the Laplacian pyramid framework to achieve super-resolution of cardiac MRI. We implemented the least squares loss function in the discriminator and introduced the residual-dense block as the basic network building block to generate higher quality images. Super-resolution reconstruction is performed on low-resolution images of different pixel sizes. The experimental results show that our proposed algorithm performs better on super-resolution of cardiovascular images and achieves higher PSNR and SSIM scores. We provide a solution for the low resolution of MRI that effectively addresses the problem of hallucinated details and can help medical experts diagnose heart disease and perform cardiovascular surgery more efficiently. This technique has its limitations, however: the LSRGAN network has a complex structure and a slow reconstruction speed. In future work, we will attempt to optimize the loss function and improve the network structure in order to speed up reconstruction.

Author's contributions

K.K.L.W.: collection, organizing, and review of the literature; MZ: collection, organizing and review of the literature; XL, HL: preparing the manuscript, and manuscript review and modification; MZ: manuscript review, modification, editing, revision. All authors read and approved the final manuscript.

Declaration of Competing Interest

The authors declare that there is no conflict of interests.

Acknowledgement

This work was funded by the National Natural Science Foundation of China (Grant Number 81771927).

References

Goceri, N., Goceri, E., 2015. A neural network based kidney segmentation from MR images: preliminary results. The 14th IEEE International Conference on Machine Learning and Applications (ICMLA'15), 1195–1198.
Goceri, E., 2016.
Automatic labeling of portal and hepatic veins from MR images prior to liver transplantation. Int. J. Comput. Assist. Radiol. Surg. 11 (12), 2153–2161.
Dura, E., Domingo, J., Goceri, E., 2018. A method for liver segmentation in perfusion MR images using probabilistic atlases and viscous reconstruction. Pattern Anal. Appl. 21 (4), 1083–1195.
Goceri, E., 2013. A Comparative Evaluation for Liver Segmentation from SPIR Images and a Novel Level Set Method Using Signed Pressure Force Function. Izmir Institute of Technology, PhD thesis.
Goceri, E., 2011. Automatic kidney segmentation using Gaussian mixture model on MRI sequences. Electrical Power Systems and Computers, Lecture Notes in Electrical Engineering 99, 23–29.
Atalay, M., 2014. Cardiac magnetic resonance imaging: update. Topics in Magnetic Resonance Imaging 23 (1), 1.
Gao, Z., Li, Y., Sun, Y., Yang, J., et al., 2018. Motion tracking of the carotid artery wall from ultrasound image sequences: a nonlinear state-space approach. IEEE Trans. Med. Imaging 37 (1), 273–283.
Zhang, H., Gao, Z., Xu, L., et al., 2018. A meshfree representation for cardiac medical image computing. IEEE Journal of Translational Engineering in Health and Medicine 6, 1–12.
Zhao, S., Gao, Z., Zhang, H., et al., 2018. Robust segmentation of intima-media borders with different morphologies and dynamics during the cardiac cycle. IEEE J. Biomed. Health Inf. 22 (5), 1571–1582.
Xu, L., Huang, X., Ma, J., et al., 2017. Value of three-dimensional strain parameters for predicting left ventricular remodeling after ST-elevation myocardial infarction. The International Journal of Cardiovascular Imaging 33 (5), 663–673.
Xu, C., Xu, L., Gao, Z., et al., 2018. Direct delineation of myocardial infarction without contrast agents using a joint motion feature learning architecture. Med. Image Anal. 50, 82–94.
Zeyde, R., Elad, M., Protter, M., 2012. On single image scale-up using sparse-representations. Curves and Surfaces, Springer, 711–730.
Hinton, G.E., Salakhutdinov, R.R., 2006a. Reducing the dimensionality of data with neural networks. Science 313 (5786), 504–507.
Gupta, P., Srivastava, P., Bhardwaj, S., Bhateja, V., 2011. A modified PSNR metric based on HVS for quality assessment of color images. IEEE International Conference on Communication and Industrial Application (ICCIA).
Timofte, R., Smet, V., Gool, L., 2013. Anchored neighborhood regression for fast example-based super-resolution. IEEE International Conference on Computer Vision, 1920–1927.
Fischer, P., Dosovitskiy, A., Ilg, E., et al., 2015. FlowNet: learning optical flow with convolutional networks. IEEE International Conference on Computer Vision, 2758–2766.
Schulter, S., Leistner, C., Bischof, H., 2015. Fast and accurate image upscaling with super-resolution forests. IEEE Conference on Computer Vision and Pattern Recognition, 3791–3799.
Harris, J.L., 1964. Diffraction and resolving power. J. Opt. Soc. Am. 54 (7), 931–933.
Goodman, J., 1968. Introduction to Fourier Optics. Roberts and Company Publishers, New York.
Nasrollahi, K., Moeslund, T.B., 2014. Super-resolution: a comprehensive survey. Machine Vision and Applications 25, 1423–1468.
Yeh, R., Chen, C., Lim, T., et al., 2016. Semantic image inpainting with perceptual and contextual losses.
Rajan, D., Chaudhri, S., 2001. Generalized interpolation and its application in super-resolution imaging. Image Vision Comput. 19 (13), 957–969.
Glasner, D., Bagon, S., Irani, M., 2009. Super-resolution from a single image. IEEE International Conference on Computer Vision, 349–356.
Radford, A., Metz, L., Chintala, S., 2015. Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434.
Hayat, K., 2018. Multimedia super-resolution via deep learning: a survey. Digital Signal Processing 81, 198–217.
Hinton, G.E., Salakhutdinov, R.R., 2006b. Reducing the dimensionality of data with neural networks. Science 313 (5786), 504–507.
Goceri, E., 2018. Formulas behind deep learning success. International Conference on Applied Analysis and Mathematical Modeling (ICAAMM2018), 156.
Goceri, E., Goceri, N., 2017. Deep learning in medical image analysis: recent advances and future trends. 11th Int. Conf. on Computer Graphics, Visualization, Computer Vision and Image Processing (CGVCVIP 2017), 305–311.
Goceri, E., 2019a. Analysis of deep networks with residual blocks and different activation functions: classification of skin diseases. The 9th Int. Conf. on Image Processing Theory, Tools and Applications (IPTA 2019), in press.
Goceri, E., 2019b. Challenges and recent solutions for image segmentation in the era of deep learning. The 9th Int. Conf. on Image Processing Theory, Tools and Applications (IPTA 2019), in press.
Goceri, E., 2019c. Diagnosis of Alzheimer's disease with Sobolev gradient based optimization and 3D convolutional neural network. International Journal for Numerical Methods in Biomedical Engineering 35 (7), e3225.
Bruna, J., Sprechmann, P., LeCun, Y., 2016. Super-resolution with deep convolutional sufficient statistics. International Conference on Learning Representations (ICLR).
Johnson, J., Alahi, A., Li, F., 2016. Perceptual losses for real-time style transfer and super-resolution. European Conference on Computer Vision (ECCV), Springer, 694–711.
Gulrajani, I., Ahmed, F., Arjovsky, M., et al., 2017. Improved training of Wasserstein GANs. 5767–5777.
Simonyan, K., Zisserman, A., 2014. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556.
Ledig, C., Theis, L., Huszar, F., et al., 2016. Photo-realistic single image super-resolution using a generative adversarial network.
Oktay, O., Bai, W., Lee, M., et al., 2016. Multi-input cardiac image super-resolution using convolutional neural networks. In: Ourselin, S., Joskowicz, L., Sabuncu, M.R., Unal, G., Wells, W. (Eds.), MICCAI 2016. LNCS, vol. 9902. Springer, Cham, 246–254.
Oktay, O., Ferrante, E., Kamnitsas, K., et al., 2018. Anatomically constrained neural networks (ACNNs): application to cardiac image enhancement and segmentation. IEEE Trans. Med. Imaging 37 (2), 384–395.
Poudel, R.P.K., Lamata, P., Montana, G., 2017. Recurrent fully convolutional neural networks for multi-slice MRI cardiac segmentation. In: Zuluaga, M.A., Bhatia, K., Kainz, B., Moghari, M.H., Pace, D.F. (Eds.), RAMBO/HVSMR 2016. LNCS, vol. 10129. Springer, Cham, 83–94.
Basty, N., Grau, V., 2018. Super resolution of cardiac cine MRI sequences using deep learning. Lecture Notes in Computer Science 11040, 23–31.
Burt, P., Adelson, E., 1983. The Laplacian pyramid as a compact image code. IEEE Trans. Commun. 31 (4), 532–540.

M. Zhao, X. Liu, H. Liu et al. / Computerized Medical Imaging and Graphics 80 (2020) 101698 Lai, W.S., Huang, J.B., Ahuja, N., Yang, M.H., 2019. Fast and accurate image super-Resolution with deep laplacian pyramid networks. IEEE Trans. Pattern Anal. Mach. Intelli. 41 (11), 2599–2613.

9

Goodfellow, J., Pouget-Abadie, J., Mirza, M., et al., 2014. Generative adversarial nets. International Conference on Neural Information Processing Systems, 4., pp. 12. Kingma, P., Ba, J., 2014. Adam: a method for stochastic optimization. Comput. Sci.