Image reconstruction for sub-sampled atomic force microscopy images using deep neural networks

Yufan Luo, Sean B. Andersson

Journal Pre-proof

PII: S0968-4328(19)30084-8
DOI: https://doi.org/10.1016/j.micron.2019.102814
Reference: JMIC 102814
To appear in: Micron
Received Date: 7 March 2019
Revised Date: 18 December 2019
Accepted Date: 18 December 2019

Please cite this article as: Yufan Luo, Sean B. Andersson, Image reconstruction for sub-sampled atomic force microscopy images using deep neural networks, Micron (2019), doi: https://doi.org/10.1016/j.micron.2019.102814

This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain. © 2019 Published by Elsevier.

Image reconstruction for sub-sampled atomic force microscopy images using deep neural networks

Yufan Luo (a), Sean B. Andersson (a,b)

(a) Division of Systems Engineering, Boston University, Boston, MA 02215, USA
(b) Department of Mechanical Engineering, Boston University, Boston, MA 02215, USA

Abstract

Undersampling is a simple but efficient way to increase the imaging rate of atomic force microscopy (AFM). One major challenge in this approach is that of accurate image reconstruction from a limited number of measurements. In this work, we present a deep neural network (DNN) approach to reconstruct µ-path sub-sampled AFM images. Our network consists of two sub-networks, namely a RED-net and a U-net, in series, and is trained end-to-end on random images masked according to µ-path sub-sampling patterns. Using both simulation and experiments, the DNN is shown to yield better image quality than three existing optimization-based methods for reconstruction: basis pursuit, a variant of total variation minimization, and inpainting.

Keywords: atomic force microscopy, undersampling, image reconstruction, deep neural networks

1. Introduction

The Atomic Force Microscope (AFM) is a powerful instrument for interrogating material properties such as topography, stiffness, and elasticity with nanometer-scale resolution. Information is revealed through interactions between the sample and a sharp tip on the end of a cantilever, and an image is built pixel-by-pixel as the tip is rastered across the sample. The image acquisition times of many conventional AFMs are typically measured on the order of minutes, severely limiting the ability of the instrument to study dynamics in many systems of interest [1]. Because of the importance of developing high-speed AFM (HS-AFM), there has been significant work in this area. Approaches typically follow three main schemes: improving instrument dynamics through faster actuators and cantilevers [2, 3, 4], using advanced controller designs to move the mechanical system faster [5, 6, 7], and applying alternative scan paths that manage the harmonic content of the drive waveforms so as to respect the highly resonant character of the actuators [8, 9, 10]. While combinations of these approaches have yielded modern instruments with frame rates on the order of 1-10 Hz [11], there is both a large installed base of older, slower instruments as well as many lower-cost but lower-performing AFMs on the market.

A complementary HS-AFM approach based on undersampling was introduced in [12, 13]. Under this paradigm, the increase in imaging rate is achieved purely by reducing the number of pixels to be acquired. A complete image is formed from the undersampled data through post-processing image recovery algorithms. Due to their simplicity, undersampling methods can be readily implemented on many existing instruments so long as the instrument provides the ability to program different sampling patterns, either through the manufacturer-provided software or by accessing signals through a commonly available break-out box. In addition to reducing the overall imaging time, the reduced interaction between the sharp probe and the sample surface can mitigate possible damage to the specimen or the tip.

Sub-sampling patterns, such as spirals [14] and µ-path patterns [15], have been successfully implemented on commercial AFMs and shown to increase imaging rate. The µ-path pattern in particular is used in the present work. This pattern, originally proposed in [16], is composed of randomly placed short scans and was specifically designed to yield effective compressive sensing (CS)-based reconstruction from the undersampled images. Examples of two µ-path patterns (of two different sizes) are shown in Fig. 1. The µ-path size refers to the number of pixels in each short scan. Scanning according to a µ-path pattern involves moving the tip to the next starting pixel, engaging with the surface, scanning along a short path, lifting the tip, and then repeating.

Figure 1: Example of the horizontal µ-path sampling pattern of size 35 (left image) and 65 (right image) for a 256×256 pixel image.

Although sub-sampling patterns such as spirals, Lissajous scans [9], or even row-subsampling (where only a fraction of the rows are actually scanned) are somewhat easier to implement than the µ-path approach, those patterns have limited randomness (or, more exactly, exhibit strong mutual coherence between the signal and the sampling pattern), leading to poorer reconstruction in general. The µ-path pattern involves random, short scans, allowing the user to trade off increased randomness (by using sizes as short as a single pixel) against total scan time (by reducing the number of tip engagements through longer scans) [17, 15]. In addition, the µ-path pattern always scans the sample in the same direction, ensuring consistency in the data. For patterns such as spirals or Lissajous figures, different portions of the image are acquired using different sides of the tip, possibly leading to different imaging artifacts in different regions.

Regardless of the sampling pattern used, final images must be reconstructed from the sub-sampled data. Existing methods are primarily based on optimization, using either inpainting or algorithms for sparse reconstruction developed in CS. Inpainting methods seek to "fill in" missing pixels by, in essence, diffusing information from known areas into the unsampled regions [18], while CS-based methods seek to find the sparsest solution consistent with the data. In general, inpainting tends to produce the best results when the underlying image contains primarily low spatial-frequency content. CS-based approaches do nearly as well as inpainting on that class of images but significantly outperform when there is significant high-frequency information [19]. While both inpainting and CS-based schemes have shown some level of success, the need for reconstructions of the highest possible quality continues to drive research in new algorithms, leading to methods that leverage inherent structure in the image [14], use Bayesian-based schemes [20], and take advantage of samples across a sequence of images [21].

In this work we turn to deep convolutional neural networks (DNNs). Driven by advances in graphics processing unit (GPU)-based training hardware, DNNs [22] have been widely applied in image processing and computer vision, achieving excellent results on many tasks including image classification [23], object detection [24], and image segmentation [25].

The remainder of this paper is organized as follows. In the next section, we briefly review the existing reconstruction algorithms we use for comparison before describing our DNN scheme. In Sec. 3, we provide a controlled comparison between the algorithms by using simulations (based on real, raster-scanned data) to remove any confounding factors arising from instrument dynamics during the scanning process. In Sec. 4, we describe the results of physical experiments that demonstrate the practicality and effectiveness of the scheme, before providing a few concluding remarks in Sec. 5.

This work was supported in part by NSF through grants CMMI-1234845, DBI-1352729 and CMMI-1562031. Email addresses: [email protected] (Yufan Luo), [email protected] (Sean B. Andersson).

Preprint submitted to Micron, December 18, 2019
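As a concrete illustration, a random horizontal µ-path mask like those in Fig. 1 can be generated as in the following minimal sketch (the function name and the particular way rows and starting columns are drawn are illustrative assumptions, not the authors' implementation):

```python
import numpy as np

def mu_path_mask(n=256, path_len=35, sample_frac=0.2, seed=None):
    """Random horizontal mu-path mask; True marks a sampled pixel."""
    rng = np.random.default_rng(seed)
    mask = np.zeros((n, n), dtype=bool)
    target = int(sample_frac * n * n)
    while mask.sum() < target:
        row = rng.integers(0, n)                 # random row for the next short scan
        col = rng.integers(0, n - path_len + 1)  # keep the short scan inside the image
        mask[row, col:col + path_len] = True     # one short scan of path_len pixels
    return mask
```

Scanning then visits only the marked pixels, path by path, and the reconstruction algorithms discussed below fill in the rest.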


2. Reconstruction algorithms


As briefly discussed in Sec. 1, there are a large number of algorithms for reconstructing AFM images from sub-sampled data. Before describing our proposed DNN-based method, we first introduce three existing algorithms which are representative of current approaches, one based on inpainting and two based on CS.

2.1. Inpainting and Basis Pursuit


Inpainting describes a broad class of algorithms for recovering an original, ideal image from a partially observed version of it. These approaches essentially "diffuse" information from the measurements into unsampled regions. They tend to work best for images whose information content is concentrated at low spatial frequencies [19]. In this work, we use low-curvature image simplifiers (LCIS) [26] for all inpainting-based results. This algorithm is able to connect broken edges over large distances by minimizing the curvature of the image [27] and is thus suitable for reconstructing images that were sub-sampled using a µ-path pattern.

Figure 2: The proposed neural network structure consists of a U-net (in the grey box) and a RED-net (in the red box) in series.

Unlike inpainting, CS methods take advantage of the approximate sparsity of real-world signals, that is, that many of the coefficients describing such signals are close to zero when represented in an appropriate basis [28]. CS methods seek the true image signal x ∈ Rⁿ from the following observation equation,

    y = Φx = ΦΨη,    (1)

where y ∈ Rᵐ is the observation vector, Φ is an m × n matrix defining the measurements, Ψ is an n × n sparsity basis, and η is the sparse representation of x in the domain of Ψ. In general, m ≪ n. In AFM, the probe can only measure a single pixel at a time. This implies that Φ is a sparse matrix with each row having a single non-zero entry and that y is a subset of x. We choose the Discrete Cosine Transform (DCT) as the sparsity basis, as in practice this has a good balance between producing high sparsity in the image representation and having a low mutual coherence with the sampling matrices. In essence, a low mutual coherence implies the measurements defined by Φ are "spread out" in the domain of Ψ [29]. One common realization of reconstruction based on (1) is Basis Pursuit (BP), given by

    minimize ‖Ψ⁻¹x‖₁  subject to  y = Φx = ΦΨη,    (2)

where ‖·‖₁ denotes the ℓ₁ norm. This problem essentially searches for the sparsest signal among all candidates that match the measurements. While computationally demanding, particularly for large images, it in general provides very good reconstructions. In the reconstructions in this work, we use the package ℓ₁-magic to solve the BP problem [30]. We note that a variety of more efficient, greedy algorithms have been developed, including orthogonal matching pursuit [31], CoSaMP [32], and a version previously developed by the authors specifically for the sampling matrices defined by the µ-path pattern [17]. We use BP here primarily as a high-quality baseline algorithm.

While BP is effective in the general setting, it usually exhibits artifacts in the vertical direction when applied to µ-path samples. These artifacts become more severe when longer µ-paths are used, in direct opposition to the need to use longer paths to reduce the overall scanning time. To overcome this, we have previously introduced Basis Pursuit with Vertical Variation (BPVV). Similar to total variation algorithms, BPVV adds a vertical-only total variation penalty to the optimization objective, modifying (2) to

    minimize ‖Ψ⁻¹x‖₁ + ‖∇ᵥx‖₁  subject to  y = Φx,    (3)

where ‖∇ᵥx‖₁ is the variation of the signal in the vertical direction. Equation (3) can be viewed as a combination of CS and inpainting, in which the first term seeks a sparse solution and the second term seeks to diffuse pixel information.
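To make the ℓ₁ principle behind (2) concrete, here is a small self-contained toy (not the ℓ₁-magic solver used in the paper): basis pursuit recast as a linear program by splitting η into nonnegative parts and solved with SciPy. The dimensions and the Gaussian stand-in for ΦΨ are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import linprog

# Toy basis pursuit: recover a sparse eta from y = A eta with A standing in
# for Phi Psi, via  min ||eta||_1  s.t.  A eta = y, written as an LP with
# eta = u - v and u, v >= 0.
rng = np.random.default_rng(0)
n, m = 40, 20
eta_true = np.zeros(n)
eta_true[[3, 17, 28]] = [1.5, -2.0, 0.7]   # 3-sparse representation
A = rng.standard_normal((m, n))            # illustrative stand-in for Phi @ Psi
y = A @ eta_true                           # m << n measurements

c = np.ones(2 * n)                         # objective: sum(u) + sum(v) = ||eta||_1
res = linprog(c, A_eq=np.hstack([A, -A]), b_eq=y, bounds=(0, None))
eta_hat = res.x[:n] - res.x[n:]            # recovered sparse coefficients
```

With enough incoherent measurements the LP typically recovers η exactly; BPVV in (3) augments this objective with the vertical-variation term.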

2.2. Proposed method

The structure of the proposed neural network, shown in Fig. 2, consists of two subnets in series. The first one (grey dashed box) is the U-net, proposed in [33] for pixel-wise image segmentation. The network consists of a downsampling path and a symmetrical upsampling path, with the layers in the two paths linked by skip-layer connections (indicated by grey arrows). The feature channels are doubled at each downsampling step, allowing the network to propagate context information to higher resolution layers. The goal of this network in our application is to reconstruct the overall structure of µ-path sub-sampled images.

The second subnet (red dashed box in Fig. 2) is the very deep Residual Encoder-Decoder Network (RED-Net), originally proposed for image denoising and super-resolution [34]. In this network, convolutional and deconvolutional layers are symmetrically linked with skip-layer connections (indicated by blue arrows). The convolutional layers extract features from corrupted images while the deconvolutional layers recover the details from those features. The RED-net is used in our proposed network to enhance the recovery of image details from the output of the U-net.

In both sub-networks, the skip connections pass µ-path samples directly to the top layers. These connections are beneficial in recovering the original image as they help avoid the loss of detailed information in the processing of the previous layers. In addition, they allow the signal to be back-propagated directly to the bottom layers, thereby addressing the gradient vanishing problem [23].

We implemented our proposed network using a U-net of depth three (three pooling layers) and a 12-layer RED-net (six convolutional and six deconvolutional layers). Each layer uses 3 × 3 kernels as indicated in Fig. 2. The network starts with 64 feature channels in the U-net, doubles that number at each downsampling step, and symmetrically reduces it by half at each upsampling step. At the final layer of the U-net, a 1 × 1 convolution is used to map the feature channels to the initial reconstruction. For the RED-net, we used 128 feature channels and a rectified linear unit (ReLU) as the nonlinear activation function.

Because of the unique structure of the measurement matrix Φ that describes a µ-path pattern, existing pre-trained models for the U-net and RED-net would not yield reasonably accurate reconstructions, and ideally the network should be trained on AFM images to learn AFM image features. However, AFM images are expensive and slow to obtain. Therefore, in this work, we simply used a training dataset of animal images from the Kaggle database, where the grey-level value in each pixel can be interpreted as the corresponding height in an AFM image [35]. 200 images were used for validation, guiding the selection of the network structure described above, and 2000 were used for training. Images were masked with different random µ-path patterns and used as input data. The mean squared error between the original images and the network output from the masked images was used as the loss function. The loss was minimized using the Adam optimization algorithm with a learning rate of 10⁻⁴ [36]. To overcome any possible issues with vanishing gradients, in addition to the back-propagation inherent in the network structure, we used a multi-level approach by first training the U-net to obtain initial coefficients and then training the combined network to get the final coefficients. The training process took around 8 hours in TensorFlow using an Nvidia GeForce GTX 750Ti. The results were tested using an additional 400 images from the Kaggle database.

To demonstrate the contribution of each of the subnets to µ-path reconstruction, we performed a simulation by sampling from a standard raster-scanned AFM image of a grating using a µ-path pattern of length 40 and a total of 25% sampling. The original and a zoom of the region in the red box are shown in Fig. 3a, while the samples acquired from the µ-path pattern are shown in Fig. 3b. The reconstruction from the U-Net portion of the network alone is shown in Fig. 3c, together with the error with respect to the original image (reported as Peak Signal-to-Noise Ratio (PSNR)). The reconstruction captures the main features of the image but the result is overly smooth. The results of the RED-Net alone are shown in Fig. 3d. This reconstruction is visually worse than the U-Net image but does not overly smooth the image; the edges of the grating, in fact, vary too much. The result of the combined network is shown in Fig. 3e and is both visually very close to the original and of significantly higher PSNR than either the U-Net or RED-Net alone.

Figure 3: The output of the proposed network for the sub-sampled square grating image compared to U-Net and RED-net: (a) Ground truth, (b) µ-path samples, (c) U-Net (24.10 dB), (d) RED-Net (21.84 dB), (e) Proposed (24.89 dB). The bottom row shows the details of the top row in the red box. Peak Signal-to-Noise Ratio (PSNR) is indicated for each image.

Table 1: Average reconstruction quality (PSNR) over 33 µ-path sub-sampled AFM images using BP, BPVV, LCIS and the proposed DNN.

µ-path size    BP (dB)    BPVV (dB)    LCIS (dB)    DNN (dB)
20             20.25      21.97        21.90        23.85
35             19.91      21.61        21.28        22.95
50             19.60      21.38        20.07        22.45
65             19.34      20.78        19.27        21.61
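For reference, the PSNR figures reported here can be computed as in the following sketch (the peak-value convention, the dynamic range of the reference image versus a fixed scale, is an assumption, since the paper does not state it):

```python
import numpy as np

def psnr(ref, rec, peak=None):
    """Peak Signal-to-Noise Ratio in dB of reconstruction `rec` against `ref`."""
    ref = np.asarray(ref, dtype=float)
    rec = np.asarray(rec, dtype=float)
    mse = np.mean((ref - rec) ** 2)
    if peak is None:
        peak = ref.max() - ref.min()  # assumed convention: dynamic range of reference
    return 10.0 * np.log10(peak ** 2 / mse)
```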


3. Simulation results

To demonstrate and compare the performance of our proposed approach relative to other reconstruction algorithms, we used 33 AFM images, each of size 256 × 256 pixels. Images were selected from those acquired by our research group and from the public domain. Note that the images acquired from the public domain did not have x-y image sizes or height information. However, the reconstruction algorithms are agnostic to the absolute size of the image, working directly in pixels. Similarly, they do not depend on the scaling of pixel intensity to measured height. Each image was sub-sampled (20% of the pixels) using random µ-path patterns of size 20, 35, 50 and 65. Reconstructions were then made using the proposed network as well as using BP, BPVV, and LCIS.

3.1. Results

Typical results from three of the test AFM images are shown in Figs. 4, 5 and 6. Fig. 4 is of a sample of DNA (acquired using a standard raster scan on an Agilent 5500 AFM). The second sample (Fig. 5) is a sample of BiFeO3/SrRuO3/DyScO3, taken from the public domain [37], and the third (Fig. 6) is the surface of a charge-coupled device sensor, also taken from the public domain [38]. For the first two images, the proposed DNN outperformed the other three methods. The BP reconstruction, in particular, shows significant artifacts arising from the sampling pattern. These are largely removed by BPVV, but BPVV does a poor job connecting long, continuous, thin lines in the image. LCIS is slightly worse than BPVV but with similar errors in the reconstruction. The proposed DNN network, however, is significantly better than all other reconstructions in both of the first two test images. For the CCD sensor surface image, however, BP produced the highest quality reconstruction. As discussed further in Sec. 3.2, this is related to the sparsity level of this image in the DCT basis.

The average reconstruction quality across all 33 AFM images, measured in PSNR, for each method and µ-path size is presented in Table 1. The proposed DNN outperforms the other three methods by a margin of more than 1 dB. The margin decreases as the µ-path size increases. It is also important to note that image generation using the DNN is very fast (on the order of one second on a typical laptop computer), making it suitable for real-time use. The other three methods all take approximately 30 minutes to generate a reconstruction.

Of course, as in any AFM imaging, the quality of a final image depends also on other parameters such as scan speed. (For comparisons of performance at different scan speeds, see [15].) It is also important to note that, unlike raster scanning, it can be difficult to quantify the final resolution in a reconstructed image. In raster scans, any feature at least as large as the distance between scan lines is sure to show up, at least in part, in the final image. With an image reconstructed from µ-path data, a specific feature could still appear in the final image if it is recurring in some way in the image or if it is a "natural" continuation (as in the DNA images). Understanding this resolution is a topic of ongoing work.

Figure 4: Example reconstructions of DNA images (sub-sampled with a µ-path pattern of size 35) using BP, BPVV, LCIS and the proposed DNN: (a) Ground truth, (b) BP (16.78 dB), (c) BPVV (19.77 dB), (d) LCIS (19.74 dB), (e) DNN (22.33 dB). The bottom row shows the details of each reconstruction in the red box. PSNR is indicated for each image.

Figure 5: Example reconstructions of BiFeO3/SrRuO3/DyScO3 surface images (sub-sampled with a µ-path pattern of size 50) using BP, BPVV, LCIS and the proposed DNN: (a) Ground truth, (b) BP (17.08 dB), (c) BPVV (17.92 dB), (d) LCIS (16.80 dB), (e) DNN (20.08 dB). The bottom row shows the details of each reconstruction in the red box.

3.2. Selecting an appropriate reconstruction algorithm

While the results in Table 1 indicate that the proposed DNN algorithm is significantly better on average, the third test example (shown in Fig. 6) also highlights that it is not always the best choice. In previous work, we addressed a similar question of when to choose between BP and LCIS [19]. Two quantities were introduced to characterize an image. The first (ρ) is defined as the proportion of energy at low frequencies relative to the total energy of the image. The second (γ) is a measure of sparsity and is defined as the ratio of the energy in the k largest DCT coefficients of the image to the total energy (for a user-defined k). Numerical experiments in [19] showed that LCIS generally produces superior results when the image contains primarily low-frequency content, while BP is better when the images have mixed, but sparse, frequency content.

A similar analysis was carried out here to compare the proposed DNN reconstruction scheme to CS-based approaches (using BP as a canonical example) and to inpainting approaches (using LCIS). For each of the

Figure 6: Example reconstructions of CCD surface images (sub-sampled with a µ-path pattern of size 65) using BP, BPVV, LCIS and the proposed DNN: (a) Ground truth, (b) BP (25.40 dB), (c) BPVV (17.56 dB), (d) LCIS (17.41 dB), (e) DNN (17.48 dB). The bottom row shows the details of each reconstruction in the red box.


images in the test set, reconstructions from 20% sampling using a size 35 µ−path pattern were compared in terms of PSNR. The results of DNN vs. BP are shown in Fig. 7a. In this figure, each individual test image is plotted according to its ρ and γ parameters. If the DNNbased reconstruction was more than 0.2 dB higher than that of the BP, the result is shown as a blue diamond; if BP was more than 0.2 dB higher than the DNN then a red triangle is shown. If the difference was less than 0.2 dB then the reconstructions were deemed to be of a similar quality and the result is not shown. The results show that for images with a high measure of sparsity and significant content outside of the low frequency-range, BP will outperform our DNN algorithm but that in most cases, DNN yields the better result. The comparison between DNN and LCIS is shown in Fig. 7b, indicating that the DNN results (again marked by a blue diamond) outperformed the LCIS approach in all cases.
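The two descriptors can be sketched as follows (the low-frequency cut-off and the value of k are illustrative choices here; [19] defines the exact conventions):

```python
import numpy as np
from scipy.fft import dctn

def rho_gamma(img, low_frac=0.1, k=1000):
    """Sketch of rho (fraction of energy at low frequencies) and gamma (energy in
    the k largest DCT coefficients over total energy); cut-offs are illustrative."""
    coeffs = dctn(np.asarray(img, dtype=float), norm="ortho")
    energy = coeffs ** 2
    total = energy.sum()
    n_low = max(1, int(low_frac * min(img.shape)))
    rho = energy[:n_low, :n_low].sum() / total               # low-frequency block
    gamma = np.sort(energy.ravel())[::-1][:k].sum() / total  # k largest coefficients
    return rho, gamma
```

A smooth image scores high on ρ; an image that is sparse in the DCT basis (even with high-frequency content) scores high on γ.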


Figure 7: Reconstruction comparison results, plotting each test image by its sparsity level (horizontal axis) and low-frequency level (vertical axis): (a) the DNN vs. BP comparison and (b) the DNN vs. LCIS comparison. Images for which DNN yielded a higher PSNR are shown as open blue diamonds, while images for which BP or LCIS yielded a better reconstruction are indicated as filled red triangles.

4. Experimental results

To demonstrate the effectiveness of the proposed network in recovering images from real AFM sub-sampled data, we performed two imaging experiments on a commercial AFM (Agilent 5500, Keysight Technologies). Images were made of a grating with 1.9 µm high features in a chessboard pattern on a 10 µm pitch (TGX11, Ultrasharp). In each experiment, a 30 µm × 30 µm region of the sample was imaged three times: once using a slow raster scan, once with a fast raster scan, and once with a µ-path pattern scan of size 35 with 20% of the pixels sampled. The only difference between the experiments was the region of the grating selected for imaging. The µ-path samples of the two experiments are shown in Fig. 8. The slow raster scan took 30 min and the resulting 256 × 256 images (Fig. 9a and 10a) were used as the ground truth.

The fast raster scan and the µ-path scan both took approximately 70 sec. to complete. (Implementation details of the µ-path scan approach can be found in [15].) The images using the fast raster scan are shown in Fig. 9b and 10b. For comparison purposes, reconstructions from the µ-path samples were made using both BPVV (which generally outperforms BP) and DNN. LCIS was omitted since, as demonstrated in Sec. 3.2, it

does not produce better images than the DNN scheme. The final recovered surface images are shown in Fig. 9c and 10c using BPVV, and in Fig. 9d and 10d using the proposed DNN. In both experiments, both BPVV and DNN yielded significantly better images, both visually and in terms of PSNR, with DNN showing an improvement of 4.5 dB over the equivalent-time raster scan. Because of the periodic nature of these gratings, their DCT-domain description is somewhat sparse, though real-life details in the faster scans decrease the overall sparsity. As a result, the improvement of the DNN over BPVV is somewhat small in terms of the PSNR gain, though still significant. Perhaps more importantly, the zoomed-in sections show that the DNN more faithfully captured the grating edges.

Figure 8: µ-path samples acquired using the Agilent 5500 AFM. The left image shows the samples used to generate Fig. 9c and 9d; the right image shows the samples used to generate Fig. 10c and 10d.

Figure 9: Reconstruction of a square grating sample (µ-path size 35): (a) Ground truth, (b) Raster scan (18.38 dB), (c) BPVV (25.36 dB), (d) DNN (26.27 dB). The bottom row shows the details of the top row in the red box.

Figure 10: Reconstruction of a square grating sample (µ-path size 35): (a) Ground truth, (b) Raster scan (19.29 dB), (c) BPVV (23.36 dB), (d) DNN (23.79 dB). The bottom row shows the details of the top row in the red box.

5. Conclusions

In this work, we proposed a deep neural network approach to reconstruct µ-path pattern sub-sampled AFM images. The network was trained using only low-cost natural images. The performance of the proposed network was demonstrated through simulations and experiments. Once trained, the network produces images significantly faster than optimization-based methods such as basis pursuit or inpainting schemes (seconds versus tens of minutes on a standard laptop) while also producing higher quality reconstructions.

References

[1] D. Y. Abramovitch, S. B. Andersson, L. Y. Pao, G. Schitter, A tutorial on the mechanisms, dynamics, and control of atomic force microscopes, in: American Control Conference (ACC), 2007, IEEE, 2007, pp. 3488–3502.
[2] B. J. Kenton, K. K. Leang, Design and control of a three-axis serial-kinematic high-bandwidth nanopositioner, IEEE Transactions on Mechatronics 17 (2) (2012) 356–369.
[3] A. Mohammadi, A. G. Fowler, Y. K. Yong, S. O. R. Moheimani, A feedback controlled MEMS nanopositioner for on-chip high-speed AFM, Journal of Microelectromechanical Systems 23 (3) (2014) 610–619.
[4] G.-Y. Gu, L.-M. Zhu, C.-Y. Su, H. Ding, S. Fatikow, Modeling and control of piezo-actuated nanopositioning stages: A survey, IEEE Transactions on Automation Science and Engineering 13 (1) (2015) 313–332.
[5] Y. Yong, S. R. Moheimani, B. J. Kenton, K. Leang, Invited review article: High-speed flexure-guided nanopositioning: Mechanical design and control issues, Review of Scientific Instruments 83 (12) (2012) 121101.
[6] P. Huang, S. B. Andersson, High speed atomic force microscopy enabled by a sample profile estimator, Applied Physics Letters 102 (21) (2013) 213118.
[7] T. Uchihashi, N. Kodera, T. Ando, High-speed atomic force microscopy, in: Noncontact Atomic Force Microscopy, Springer, 2015, pp. 481–518.
[8] I. Mahmood, S. R. Moheimani, Fast spiral-scan atomic force microscopy, Nanotechnology 20 (36) (2009) 365503.
[9] T. Tuma, J. Lygeros, V. Kartik, A. Sebastian, A. Pantazi, High-speed multiresolution scanning probe microscopy based on Lissajous scan trajectories, Nanotechnology 23 (18) (2012) 185501.
[10] M. S. Rana, H. R. Pota, I. R. Petersen, Spiral scanning with improved control for faster imaging of AFM, IEEE Transactions on Nanotechnology 13 (3) (2014) 541–550.
[11] T. Ando, T. Uchihashi, S. Scheuring, Filming biomolecular processes by high-speed atomic force microscopy, Chemical Reviews 114 (6) (2014) 3120–3188.
[12] B. Song, N. Xi, R. Yang, K. W. C. Lai, C. Qu, Video rate atomic force microscopy (AFM) imaging using compressive sensing, in: 2011 11th IEEE Conference on Nanotechnology (IEEE-NANO), IEEE, 2011, pp. 1056–1059.
[13] S. B. Andersson, L. Y. Pao, Non-raster sampling in atomic force microscopy: A compressed sensing approach, in: American Control Conference (ACC), 2012, IEEE, 2012, pp. 2485–2490.
[14] C. S. Oxvig, T. Arildsen, T. Larsen, Structure assisted compressed sensing reconstruction of undersampled AFM images, Ultramicroscopy 172 (2017) 1–9.
[15] R. A. Braker, Y. Luo, L. Y. Pao, S. B. Andersson, Hardware demonstration of atomic force microscopy imaging via compressive sensing and µ-path scans, in: 2018 Annual American Control Conference (ACC), IEEE, 2018, pp. 6037–6042.
[16] B. D. Maxwell, S. B. Andersson, A compressed sensing measurement matrix for atomic force microscopy, in: American Control Conference (ACC), 2014, IEEE, 2014, pp. 1631–1636.
[17] Y. Luo, S. B. Andersson, A fast image reconstruction algorithm for compressed sensing-based atomic force microscopy, in: American Control Conference (ACC), 2015, IEEE, 2015, pp. 3503–3508.
[18] A. Chen, A. L. Bertozzi, P. D. Ashby, P. Getreuer, Y. Lou, Enhancement and recovery in atomic force microscopy images, in: Excursions in Harmonic Analysis, Vol. 2, Birkhäuser Boston, Boston, 2012, pp. 311–332.
[19] Y. Luo, S. B. Andersson, A comparison of reconstruction methods for undersampled atomic force microscopy images, Nanotechnology 26 (50) (2015) 505703.
[20] Y. Zhang, Y. Li, Z. Wang, Z. Song, R. Lin, J. Qian, J. Yao, A fast image reconstruction method based on Bayesian compressed sensing for the undersampled AFM data with noise, Measurement Science and Technology 30 (2) (2019) 025402.
[21] Y. Luo, S. B. Andersson, A compressive sensing-based pixel sharing algorithm for high-speed atomic force microscopy, in: IEEE Conference on Decision and Control, 2016, pp. 2834–2839.
[22] A. Krizhevsky, I. Sutskever, G. E. Hinton, ImageNet classification with deep convolutional neural networks, in: Advances in Neural Information Processing Systems, 2012, pp. 1097–1105.
[23] K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778.
[24] S. Ren, K. He, R. Girshick, J. Sun, Faster R-CNN: Towards real-time object detection with region proposal networks, in: Advances in Neural Information Processing Systems, 2015, pp. 91–99.
[25] V. Badrinarayanan, A. Kendall, R. Cipolla, SegNet: A deep convolutional encoder-decoder architecture for image segmentation, arXiv preprint arXiv:1511.00561 (2015).
[26] J. Tumblin, G. Turk, LCIS: A boundary hierarchy for detail-preserving contrast reduction, in: Proceedings of the 26th Annual Conference on Computer Graphics and Interactive Techniques, ACM Press/Addison-Wesley, 1999, pp. 83–90.
[27] A. Bertozzi, C.-B. Schönlieb, Unconditionally stable schemes for higher order inpainting, Communications in Mathematical Sciences 9 (2) (2011) 413–457.
[28] A. Y. Carmi, L. Mihaylova, S. J. Godsill (Eds.), Compressed Sensing & Sparse Filtering, Signals and Communication Technology, Springer, Berlin, Heidelberg, 2014.
[29] E. J. Candès, J. Romberg, Sparsity and incoherence in compressive sampling, Inverse Problems 23 (3) (2007) 969–985.
[30] l1-MAGIC, https://statweb.stanford.edu/~candes/l1magic/, accessed: 2010-01-28.
[31] J. A. Tropp, A. C. Gilbert, Signal recovery from random measurements via orthogonal matching pursuit, IEEE Transactions on Information Theory 53 (12) (2007) 4655–4666.
[32] D. Needell, J. A. Tropp, CoSaMP: Iterative signal recovery from incomplete and inaccurate samples, Applied and Computational Harmonic Analysis 26 (3) (2009) 301–321.
[33] O. Ronneberger, P. Fischer, T. Brox, U-Net: Convolutional networks for biomedical image segmentation, in: International Conference on Medical Image Computing and Computer-Assisted Intervention, Springer, 2015, pp. 234–241.
[34] X. Mao, C. Shen, Y.-B. Yang, Image restoration using very deep convolutional encoder-decoder networks with symmetric skip connections, in: Advances in Neural Information Processing Systems, 2016, pp. 2802–2810.
[35] Kaggle database, www.kaggle.com, accessed: 2019-02-20.
[36] D. P. Kingma, J. Ba, Adam: A method for stochastic optimization, arXiv preprint arXiv:1412.6980 (2014).
[37] In-plane image of BiFeO3/SrRuO3/DyScO3 surface, http://www.asylumresearch.com/Gallery/Materials/Piezo/Piezo28.shtml, accessed: 2017-07-18.
[38] CCD sensor removed from Canon A75 digital camera, https://commons.wikimedia.org/wiki/File:CCD_CANON_A75_AFM_HR_JANUSZ_REBIS.jpg, accessed: 2017-07-18.