Quantification of sheet nacre morphogenesis using X-ray nanotomography and deep learning

Journal Pre-proofs

Maksim Beliaev, Dana Zöllner, Alexandra Pacureanu, Paul Zaslansky, Luca Bertinetti, Igor Zlotnikov

PII: S1047-8477(19)30258-8
DOI: https://doi.org/10.1016/j.jsb.2019.107432
Reference: YJSBI 107432

To appear in: Journal of Structural Biology

Received Date: 31 August 2019
Revised Date: 12 November 2019
Accepted Date: 3 December 2019

Please cite this article as: Beliaev, M., Zöllner, D., Pacureanu, A., Zaslansky, P., Bertinetti, L., Zlotnikov, I., Quantification of sheet nacre morphogenesis using X-ray nanotomography and deep learning, Journal of Structural Biology (2019), doi: https://doi.org/10.1016/j.jsb.2019.107432

This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

© 2019 Published by Elsevier Inc.

Quantification of sheet nacre morphogenesis using X-ray nanotomography and deep learning

Maksim Beliaev1, Dana Zöllner1, Alexandra Pacureanu2, Paul Zaslansky3, Luca Bertinetti1 and Igor Zlotnikov1*

1 B CUBE - Center for Molecular Bioengineering, Technische Universität Dresden, Germany
2 The European Synchrotron Facility, Grenoble, France
3 Julius Wolff Institute for Biomechanics and Musculoskeletal Regeneration, Berlin, Germany

Abstract

High-resolution three-dimensional imaging is key to our understanding of biological tissue formation and function. Recent developments in synchrotron-based X-ray tomography techniques provide unprecedented morphological information on relatively large sample volumes with a spatial resolution better than 50 nm. However, the analysis of the generated data, in particular image segmentation (separation into structure and background), still presents a significant challenge, especially when considering complex biomineralized structures that exhibit a hierarchical arrangement of their constituents across many length scales, from millimeters down to nanometers. In the present work, synchrotron-based holographic nano-tomography data are combined with state-of-the-art machine learning methods to image and analyze the nacreous architecture in the bivalve Unio pictorum in 3D. Using kinetic and thermodynamic considerations known from physics of materials, the obtained spatial information is then used to provide a quantitative description of the structural and topological evolution of nacre during shell formation. Ultimately, this study establishes a workflow for high-resolution three-dimensional analysis of fine, highly mineralized biological tissues while providing a detailed analytical view on nacre morphogenesis.

Highlights

- The morphology of nacre in 3D is imaged using X-ray nanotomography
- Convolutional Neural Networks (Machine Learning) enable automatic data segmentation
- Classical grain growth theories from physics of materials describe nacre formation well
- We provide a quantitative description of the structural and topological evolution of nacre

1. Introduction

Nacre is one of the most studied molluscan shell ultrastructures and biomineralized tissues in general (Bøggild, 1930; Cartwright and Checa, 2007; Nudelman, 2015; Sun and Bhushan, 2012a; Taylor et al., 1969). It is abundant in nature as it constitutes the inner layer of many bivalve, gastropod and cephalopod shells and the outer layer of pearls. Moreover, the nacreous architecture is one of the most studied biocomposites in terms of its structure-to-function relationship and mechanical performance (Lemanis and Zlotnikov, 2018; Meyers et al., 2013; Taylor and Layman, 1972) and is one of the most synthetically mimicked naturally occurring structures (Wang et al., 2012; Wegst et al., 2015). However, although nacre has been an exemplary model system for studying biomineralization and mineral morphogenesis for almost two centuries (Bevelander and Nakahara, 1969; Cartwright et al., 2009; Cartwright and Checa, 2007; Devol et al., 2015; Marin et al., 2012; Mutvei, 1977; Nakahara et al., 1982; Nudelman, 2015; Olson et al., 2013; Sun and Bhushan, 2012b; Wada, 1966), its formation is still not fully understood.

The shells of molluscs typically consist of a number of mineral-organic biocomposite ultrastructures that are arranged in layers parallel to the surface of the shell, which is covered by a purely organic periostracum (Bøggild, 1930). The growth of the shell in thickness proceeds on the internal part of the periostracum in one direction, towards the soft tissue of the organism. The different ultrastructures grow sequentially. Some bivalves, for example, initially form a prismatic assembly, which consists of relatively large columnar mineral building blocks (up to 60 microns in diameter) joined together by an approximately 1 µm thick interprismatic matrix. Then, sheet nacre, made of approximately 1 µm thick platelets joined together by an extremely thin (less than 50 nm) organic matrix, is deposited.
Thus, the so-called “brick and mortar” layered architecture is formed on top of the prisms (Checa and Rodriguez-Navarro, 2005). Whereas nacre is always made of aragonite, the prismatic architectures can be either aragonitic or calcitic (Bøggild, 1930; Marin et al., 2012; Taylor et al., 1969). In both cases, the mineral building blocks are not purely inorganic, but contain up to 5% of intracrystalline organic matter (Dauphin, 2002).

In the last few decades, significant progress has been made in understanding the biochemical and physical processes that are responsible for biomineral formation and its assembly into the various molluscan shell ultrastructures. More recently, by considering the different mineral building blocks to be individual grains in a polycrystalline material, a number of studies demonstrated that the structural and textural evolution of some ultrastructures can be quantitatively described employing analytical correlations that are usually used to compute grain growth and coarsening phenomena in generic materials systems (Bayerlein et al., 2014; Reich et al., 2019; Zöllner and Zlotnikov, 2019a). Specifically, by analyzing the 3D morphology of the relatively coarse prismatic ultrastructure in a number of bivalves, it was suggested that the formation of the prismatic assembly is a spontaneous process that is thermodynamically driven by the cellular tissue of the organism, the mantle. These studies also hypothesized that similar concepts can be applied to quantitatively describe the formation of the nacreous architecture (Schoeppler et al., 2018).

However, whereas the prismatic ultrastructure and the constituent columnar building blocks can be visualized using increasingly available X-ray-based computed microtomography techniques, 3D imaging of the much finer nacre poses an exceptional challenge. Recent state-of-the-art nanotomography (nano-CT) synchrotron instruments are able to image such dense structures with voxel sizes below 50 nm (Pacureanu et al., 2019). Yet, the analysis of the collected data is a non-trivial task. Nano-CT pushes the limits not only of resolution but also of the signal-to-noise ratio, so that the analysis relies strongly on tedious image processing and segmentation, which are severely affected by noise, artifacts and poor contrast. In many cases, it is easier to analyze the visual data manually rather than automatically. However, in the case of contemporary 3D high-resolution nanotomography, each dataset contains thousands of images with numerous features of differing contrast, and their manual segmentation is almost impossible to perform with reasonably consistent accuracy. In fact, manual analysis can introduce inconsistencies as a result of bias and errors that are difficult to predict, correct or even account for (Iassonov et al., 2009). A number of standard methods can help to automate the image segmentation process.
The most common approaches are based on algorithms such as filtering, thresholding, watershed segmentation and morphological operations. Still, these procedures generally show poor results when the data are noisy and have poor contrast. Such datasets are better suited for machine learning (ML) techniques. The main advantage of ML over task-specific algorithms is its ability to learn how to properly handle enormous amounts of data from the data itself (Qiu et al., 2016). Indeed, Deep Learning (DL), a subclass of ML that has demonstrated groundbreaking performance in 3D data analysis, is becoming a very popular tool in image segmentation (Garcia-Garcia et al., 2017).

In this work, synchrotron-based holographic X-ray nano-tomography was employed to visualize the structural evolution of sheet nacre in the nacre-prismatic bivalve Unio pictorum. The morphological and textural evolution of the transition from the prismatic ultrastructure to nacre in 2D in this species was recently systematically described (Schoeppler et al., 2018), making it an ideal model system for analysis in 3D. The obtained nanotomography data were successfully segmented using Convolutional Neural Networks (CNNs), one of the most common DL methods, which have been applied to image processing since the 1990s (Hasegawa et al., 1994; Litjens et al., 2017; Shen et al., 2017). Subsequently, the obtained structural data were used to quantitatively describe sheet nacre morphogenesis using analytical theories borrowed from classical materials science. Ultimately, with the assistance of X-ray nanotomography in combination with machine learning methods, this work not only presents the first detailed three-dimensional model of a biomineralized nacreous tissue, but also provides the first quantitative description of its formation.

2. Methods

Sample Preparation

Specimens approximately 1 x 1 x 1 mm3 in size, containing the prismatic to nacre transition zone from the shell of U. pictorum, were cut out from the entire shell using a diamond saw. The samples were not subjected to any additional mechanical or chemical treatment.

X-ray nano-tomography

X-ray holographic nano-tomography experiments were conducted at the ID16A nanoimaging beamline of the ESRF. X-ray holography is a full-field, free-space propagation, phase contrast technique. By using a highly focused beam, the spatial resolution can be pushed to sub-50 nm (Hubert et al., 2018). First, the areas of interest were identified employing a series of single X-ray projections used to obtain a general overview of the samples. Then, full datasets were recorded with voxel sizes of 40 nm using a beam energy of 17 keV with a monochromaticity of 1%. In order to obtain each 3D image, four sets of 2000 angular projections of the sample rotated over 180° were collected at four distances of the sample with respect to the X-ray beam focus and the detector. The exposure time for each projection was 0.3 s and the detector was equipped with a 23 µm GGG:Eu scintillator and a Frelon CCD camera.
Sets of four projections corresponding to the same rotation angle were aligned together, brought to the same magnification and combined to obtain a phase map through a phase retrieval algorithm. The resulting phase maps were used for tomographic reconstruction through filtered back-projection (PyHST). To protect the sample from radiation damage (with the total dose estimated to be 4.6 x 10^7 Gy), the data were collected under cryogenic conditions.

Image Analysis

In this work, a CNN architecture inspired by the U-Net CNN was used (Weigert et al., 2018). Cross-sectional and top-view pairs of training and “ground truth” images were used to train two different networks. Their dimensions were 256x256 and 512x512 pixels for cross-section and top-view, respectively. The ratios of training, validation and test samples were 19:2:0 and 7:1:0 during hyperparameter optimization and 15:4:2 and 5:2:1 during k-fold cross-validation, with k=4 (Devijver and Kittler, 1982). For consistency, the networks were trained for 40 epochs. After each epoch, the validation set was used to evaluate the model. On average, the loss function reached a plateau after approximately 20 epochs. At the end, the best model was selected. All data processing was performed on a workstation with an NVIDIA Quadro M6000 GPU, using Python 3.6.6 and the tensorflow-gpu 1.4.1 (Abadi et al., 2015), keras-gpu 2.0.8 (Chollet, 2015) and csbdeep 0.1.1 (Weigert et al., 2018) packages.

Data Analysis

Analysis and associated least-squares fitting of the tomography data handled by deep learning were performed using in-house code in MATLAB.

3. Results and Discussion

The shell of Unio pictorum

U. pictorum (Fig. 1A), also known as the painter’s mussel, is a freshwater bivalve that can be found in most European lakes, rivers and streams (Van Damme, 2011). Like many other bivalves, the shell of U. pictorum, as depicted by scanning electron microscopy, consists of two mineralized ultrastructures: the outer prismatic and the inner nacreous architectures (Fig. 1B and Fig. 1C) (Dauphin et al., 2018; Marie et al., 2007a, 2007b). Unlike the shells of bivalves from other commonly studied species, such as those from the order Pteriida, in which the prismatic layer is made of calcite (Bøggild, 1930; Marin et al., 2012; Reich et al., 2019; Taylor et al., 1969), the mineral building blocks in both layers in U. pictorum are made of aragonite. Moreover, the structural transition from the prismatic assembly to nacre in this species is not abrupt, but gradual (Fig. 1B) (Dauphin et al., 2018; Schoeppler et al., 2018). This is another reason why this shell is a perfect candidate to study morphological transitions of molluscan shell ultrastructures in 3D.

X-ray Nanotomography Data

Segments of the shell of U. pictorum extracted from the transition zone between the prismatic and the nacreous ultrastructure were imaged using synchrotron-based computed nanotomography. Similarly to the nacreous architecture in other molluscs (Bertoldi et al., 2008; Levi-Kalisman et al., 2001; Nakahara et al., 1982), in U. pictorum the thickness of the organic layers separating the mineral layers is smaller than 50 nm (Fig. 1C). As a compromise between the need to analyze the shape of individual platelets and to reach a field of view that is sufficient for a statistical analysis of the nacreous ultrastructure, a pixel size of 40 nm was chosen. Representative two-dimensional tomographic sections, obtained perpendicular and parallel to the layered nacre, are presented in Fig. 2A and Fig. 2B (cross-section and top-view, respectively). In both directions, our ability to resolve the interlamellar organic interfaces separating the individual nacre layers (Fig. 2A and Fig. 1D) and the intertabular organic interfaces separating the individual platelets in a single layer (Fig. 2B and Fig. 1D) is clearly evident.

However, a number of factors affect our ability to perform automated segmentation of the tomography data. The achieved spatial resolution is close to the limit of resolving the organic interfaces, which means that the separating structures are only a few voxels thick and have faint contrast. Moreover, the presence of nanoasperities on the surface of the platelets and mineral bridges between the platelets results in significant intensity variations that make automatic segmentation of individual platelets using task-specific algorithms nearly impossible (Younis et al., 2012). As an example, Fig. 2A-I and Fig. 2B-I demonstrate the effect of Otsu thresholding of representative segments taken from the original data in Fig. 2A and Fig. 2B, respectively. Clearly, this approach does not allow automated segmentation of the mineral building blocks.

To further illustrate the infeasibility of automated segmentation using thresholding, the segments in Fig. 2A and Fig. 2B were manually segmented by assigning each pixel to a corresponding material class. In fact, in all our analyses, the intensity in the obtained tomographic data was divided into four different categories. The first corresponds to the interlamellar organic matrix, which can be viewed as horizontal lines in Fig. 2A and as dark circular “clouds” on top of platelets in Fig. 2B. The second is the intertabular organic matrix, which appears to have a honeycomb morphology in Fig. 2B and vertically separates the platelets in Fig. 2A. In these images and throughout the rest of the paper, the interlamellar organic phase is color-coded in green (Fig. 2A-I) and the intertabular organic phase is color-coded in blue (Fig. 2B-I). The last two categories are assigned to the biogenic aragonite (the platelets), in which the different intensities correspond to different densities of the biomineral phase and are color-coded using a brown scale. Here, dense mineral zones are white, whereas less dense zones that are most probably rich in intracrystalline organic matter are dark brown (Fig. 2A-I and Fig. 2B-I). For each category, the probability distribution function (PDF) of its intensity was estimated using Gaussian Kernel Density Estimation (GKDE). Fig. 2C shows the distribution for all the pixels in the segments that are marked in Fig. 2A and Fig. 2B, whereas in Fig. 2D and Fig. 2E, intensity distributions for all four manually assigned categories are presented.
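The comparison of class-wise intensity distributions can be sketched numerically. The following is an illustration only: the two synthetic intensity samples, the bandwidth choice (Silverman's rule of thumb) and the helper names `gaussian_kde_1d`, `integrate` and `overlap_coefficient` are assumptions made for this example and are not taken from the study. It estimates two PDFs with a Gaussian kernel and measures how much they overlap; a large overlap means that no single intensity threshold can separate the two classes.

```python
import numpy as np


def gaussian_kde_1d(samples, grid, bandwidth=None):
    """Evaluate a 1D Gaussian kernel density estimate on `grid`."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    if bandwidth is None:
        # Silverman's rule of thumb (an assumed bandwidth choice).
        bandwidth = 1.06 * samples.std(ddof=1) * samples.size ** (-0.2)
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernel = np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)
    return kernel.mean(axis=1) / bandwidth


def integrate(values, grid):
    """Trapezoidal integration of `values` sampled on `grid`."""
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(grid)))


def overlap_coefficient(pdf_a, pdf_b, grid):
    """Integral of min(pdf_a, pdf_b): 0 = fully separable, 1 = identical."""
    return integrate(np.minimum(pdf_a, pdf_b), grid)


# Synthetic stand-ins for two manually labelled intensity classes.
rng = np.random.default_rng(0)
organic = rng.normal(0.40, 0.10, size=2000)
mineral = rng.normal(0.55, 0.10, size=2000)

grid = np.linspace(-0.2, 1.2, 512)
pdf_organic = gaussian_kde_1d(organic, grid)
pdf_mineral = gaussian_kde_1d(mineral, grid)
overlap = overlap_coefficient(pdf_organic, pdf_mineral, grid)
print(f"overlap coefficient: {overlap:.2f}")
```

With real data, `organic` and `mineral` would be replaced by the intensities of the manually labelled pixels of each category.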
It is clear that the PDFs overlap and that automated segmentation and labeling using thresholding is impossible.

Data Analysis Using Deep Learning

Among the various ML methods, Convolutional Neural Networks (CNNs) show the best results when applied to supervised image segmentation (Litjens et al., 2017; Shen et al., 2017). At its core, a CNN is a network of computational modules called neurons. Each individual neuron and the network as a whole compute non-linear input-output mappings or, in other words, assign a unique new value to every pixel in the original data. Neurons, figuratively speaking, are comprised of “knobs” (better known as weights or parameters), which are tuned by the feedback loop of the network in an attempt to minimize the fitting discrepancy, also called the loss function (e.g. Mean Absolute Error, MAE (Willmott and Matsuura, 2005), Dice, DSC (Dice, 1945) and Jaccard, J (Jaccard, 1901)), between the output calculated by the network and the provided ground truth (the expected result). This way, the network is trained on “training data” that contain both the original data and the optimal mapping. The final states of the neurons are compiled into a computational model, which is then used to process real data.

When applied to image data, the input of a standard neuron is a matrix of all pixel intensities in that image. In a convolutional neuron, the input is an N x N kernel, which consists of the pixel that has to be mapped and its N^2-1 neighbors. In contrast to the “knobs”, which are learned by the network, N is a parameter that defines the dimension of a kernel and belongs to a set of parameters that have to be defined prior to CNN training, called “hyperparameters”. Other examples of hyperparameters include the batch size (the number of simultaneously processed inputs, which is limited by the available computational power), the learning rate (the rate at which the weights are changed) and the optimizer (the algorithm by which the fitting error is minimized). Some hyperparameters are task-specific and can be carried over unchanged to different datasets (Falk et al., 2019). However, other hyperparameters are data-specific and must be adjusted for an optimal performance of the network.

Normally, all input pixels have corresponding neurons that are arranged in layers. The advantage of a convolutional layer in comparison to a standard one is that an entire convolutional layer learns one mapping having N x N + 1 “knobs” (N x N weights + 1 bias), which are shared between all neurons in a layer. The small number of “knobs” per layer allows multiple layers to be stacked in parallel (in “width”) to learn multiple mappings from the same input and thus produce multiple output channels. In other words, a larger number of channels allows the extraction of a larger number of features.
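To make the idea of shared “knobs” concrete, the following minimal NumPy sketch applies a single 3 x 3 kernel with one bias to every pixel of a small image, followed by a ReLU activation and 2 x 2 max pooling. It is an illustration under stated assumptions, not the implementation used in this work: the toy image, the averaging kernel and the function names `conv2d_same`, `relu` and `max_pool_2x2` are hypothetical, and zero padding is assumed at the image border.

```python
import numpy as np


def conv2d_same(image, weights, bias):
    """Slide one N x N kernel (N odd) over every pixel, with zero padding.

    As in most CNN frameworks, this is a cross-correlation: the kernel is
    not flipped. The same weights and bias are reused at every position.
    """
    n = weights.shape[0]
    pad = n // 2
    padded = np.pad(image, pad, mode="constant")
    out = np.empty(image.shape, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.sum(padded[i:i + n, j:j + n] * weights) + bias
    return out


def relu(x):
    """Rectified linear unit activation."""
    return np.maximum(x, 0.0)


def max_pool_2x2(x):
    """Keep the maximum of each non-overlapping 2 x 2 block."""
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    return x[:h, :w].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))


image = np.arange(36, dtype=float).reshape(6, 6)   # toy 6 x 6 "image"
weights = np.full((3, 3), 1.0 / 9.0)               # 3 x 3 averaging kernel
bias = 0.0

convolved = conv2d_same(image, weights, bias)
features = max_pool_2x2(relu(convolved))
n_params = weights.size + 1                        # N x N weights + 1 bias

print(features.shape, n_params)
```

In a trained network the kernel values would be learned rather than fixed, and each output channel would have its own kernel and bias.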
To learn a complex non-linear mapping, the layers are stacked in “depth” as well, so that the output of each layer is transformed by a standard predefined non-linear mathematical function (the activation function hyperparameter) and fed as an input to the following layer. In general, the overall architecture of a network ultimately depends on the task at hand and, in most cases, is empirically constructed.

In this work, we employed a U-Net-like architecture that was shown to be successful in analyzing features obtained from biological tissue imaging (Chatfield et al., 2014; Ronneberger et al., 2015; Weigert et al., 2018). This architecture, in addition to convolutional layers, which extract features from the data, also uses “max pooling” layers, which reduce the dimensions of these features. This greatly improves the generalization capacity of a network, as the model is forced to come to a general representation (abstraction) of the features it has extracted. The additional “upsampling” layers are then used to bring the data back to its original size. The name of the architecture corresponds to its U-like schematic representation (Fig. 3). Here, the downward flow (encoding) subjects the data to dimensionality reduction (max pooling), which allows increasing the number of channels. The upward flow (decoding) scales the data back by upsampling while at the same time reducing the number of channels. In addition, block-like connectivity between the encoding and the corresponding decoding channels using algebraic operations, such as concatenation and addition, is used to smooth errors that arise from the reduction of dimensionality. For example, the U-Net-like architecture in Fig. 3 is built from 2 blocks and 16 channels. In the Base Block, the data is first passed through a convolutional layer, which uses 3 x 3 kernels to produce 16 output channels. In Block 1, these channels are passed through another convolutional layer, followed by max pooling to reduce the size of the data four-fold and another convolution, which doubles the number of channels. Similar data processing steps are performed in Block 2. Similar principles are applied during decoding (Fig. 3).

During data analysis, CNNs were initially used to remove blurring and noise that come with tomographic image acquisition. A pre-trained model, provided by Weigert et al. (Weigert et al., 2018), was used to perform blind pre-processing of our data. This network was initially trained to segment filament-like features in a biological microstructure and is part of an open-source distribution (Weigert et al., 2018). Application of this model to our dataset results in more clearly defined organic-rich areas (Fig. 2A-II and Fig. 2B-II). Nevertheless, it still did not allow us to process the data by intensity thresholding (Fig. 2A-III and Fig. 2B-III). Therefore, the next step was to train our own CNN.
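The flow of data sizes through the 2-block, 16-channel example of Fig. 3 can be traced with a few lines of plain Python. This is a simplified bookkeeping sketch rather than the network itself: it assumes 2 x 2 max pooling (a four-fold reduction in the number of pixels), a doubling of channels in every encoding block and the mirror-image operations during decoding, and it ignores the skip connections. The function name `unet_shapes` and its arguments are hypothetical.

```python
def unet_shapes(height, width, n_blocks=2, base_channels=16):
    """Trace (name, height, width, channels) through a U-Net-like layout."""
    if height % 2 ** n_blocks or width % 2 ** n_blocks:
        raise ValueError("image size must be divisible by 2 ** n_blocks")
    h, w, c = height, width, base_channels
    shapes = [("base", h, w, c)]
    # Encoding: each block halves both image dimensions (2 x 2 max pooling,
    # i.e. four times fewer pixels) and doubles the number of channels.
    for block in range(1, n_blocks + 1):
        h, w, c = h // 2, w // 2, c * 2
        shapes.append((f"encode_{block}", h, w, c))
    # Decoding: each block upsamples by 2 x 2 and halves the channels.
    for block in range(n_blocks, 0, -1):
        h, w, c = h * 2, w * 2, c // 2
        shapes.append((f"decode_{block}", h, w, c))
    return shapes


for name, h, w, c in unet_shapes(256, 256, n_blocks=2, base_channels=16):
    print(f"{name:>9}: {h} x {w} pixels, {c} channels")
```

The trace returns to the input size at the end of decoding, which is what allows the network to output one value per input pixel.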
With the goal of segmenting the shape of individual nacre platelets, two CNN models were created. One was trained to segment the interlamellar organic matrix from the cross-sectional 2D images (Fig. 2A) and the second was trained to segment the intertabular matrix from the top-view images (Fig. 2B). In Fig. 2A-IV and Fig. 2B-IV, examples of the manually labeled ground truths that were used to train the networks are presented.

Table 1. Pre-selected hyperparameters used to train CNNs.

| CNN kernel | Loss function | Optimizer | Activation | Batch size | Learning rate |
|---|---|---|---|---|---|
| 3x3 | MAE | Adam (Kingma and Ba, 2014) | ReLU (Hahnloser et al., 2000) | 16 | 0.0001 |

Whereas most of the major hyperparameters used to train our CNNs are summarized in Table 1, we also performed a fine-tuning of the two models by analyzing their performance and varying the number of blocks and channels. Both hyperparameters characterize the complexity of the U-Net architecture and are commonly used for network optimization (Weigert et al., 2018). Typically, the performance of a network is characterized by one of the above-mentioned scoring metrics: MAE, DSC or J coefficients. While MAE shows the degree of an overall mismatch between predicted and labelled ground truth images, the DSC and J coefficients represent the degree of overlap of the foreground features between two images. A perfect match corresponds to an MAE of zero and to a value of one in the case of DSC and J. In this work, we defined a grid of possible combinations for the number of blocks and channels, trained a separate model for each and assessed it using the DSC score (Fig. 4A and Fig. 4B). It is clear that the performance of both models increases with the increasing complexity of the network (Fig. 4A-I and Fig. 4B-I). However, in the case of the cross-sectional images (Fig. 4A-I), it drops beyond 4 blocks and 128 channels. This is expected, as the pattern that the network should recognize is simpler in the case of the interlamellar matrix, while it is more complex and has a much higher spatial variability in the case of the intertabular one. In the next fine-tuning step, the models with the best score in the previous analysis (Fig. 4A-I and Fig. 4B-I, 4-128 for cross-section and 5-160 for top-view) were used to probe the effect of other data-specific hyperparameters, such as the influence of pre-processing, decreased size of input images, decreased size of training data, introduction of additional layers and data augmentation. Data augmentation, such as rotations, shearing and scaling, slightly improved the score (Fig. 4A-II-a and Fig. 4B-II-a).
Halving the training data decreased the performance of the network (Fig. 4A-II-b and Fig. 4B-II-b). Introduction of supplementary functional layers, which are sometimes used to prevent network overfitting (Ioffe and Szegedy, 2015; Srivastava et al., 2016), also resulted in a reduction of the DSC score (Fig. 4A-II-c and Fig. 4B-II-c). The use of original images instead of the pre-processed ones again produced a slight negative effect (Fig. 4A-II-d and Fig. 4B-II-d). Lastly, a smaller size of input images, although it greatly reduces the amount of computational resources required to train the network, decreases the performance of the network as well (Fig. 4A-II-e and Fig. 4B-II-e). Based on these results, in the final evaluation we used CNNs without supplementary layers and trained them on pre-processed and augmented data. In addition, given the limited memory of our GPU, we prioritized the complexity of the network over the input size, so that for the selected number of blocks and channels the dimensions of the input images would be as large as possible.

Finally, to make a concluding assessment of the performance of the constructed models, the commonly used cross-validation (Devijver and Kittler, 1982) was performed on “shallow” (3-32) and “deep” (4-128 for cross-section and 5-160 for top-view) models (Table 2). As one can see, according to all three scoring approaches, the more complex models perform slightly better than their simpler counterparts. To visually demonstrate the superiority of the complex networks, they were tested on challenging regions of the data, in the area where the prismatic ultrastructure transitions into nacre and the organic matrices are either extremely thin or disappear completely. The segmented organic phase in cross-sectional images (Fig. 4C and Fig. 4D) and in top-view (Fig. 4E and Fig. 4F), employing “shallow” and “deep” models, respectively, is color-coded to highlight true positives in green, false positives in orange and false negatives in blue. It is clear that whereas the “shallow” network performs reasonably well when segmenting the interlamellar matrix (Fig. 4C), it introduces a significant number of disconnected patches in top-view images (Fig. 4E). In comparison, the “deeper” network performs extremely well in both directions (Fig. 4D and Fig. 4F). See Supplementary Material (Deep Learning Models) for the code used to implement the CNN models.

Table 2. Evaluation of the performance of the trained CNNs as a function of the complexity of their architecture.

| Blocks-Channels | Orientation | MAE | DSC | J |
|---|---|---|---|---|
| 3-32 | cross-section | 0.0273 | 0.812 | 0.688 |
| 4-128 | cross-section | 0.0252 | 0.827 | 0.708 |
| 3-32 | top-view | 0.0134 | 0.754 | 0.609 |
| 5-160 | top-view | 0.0124 | 0.777 | 0.638 |
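For reference, the three scores reported in Table 2 can be computed for a pair of binary images as in the sketch below. It follows the standard textbook definitions of MAE, DSC and J; the function names and the toy 4 x 4 masks are assumptions for illustration and are not taken from the supplementary code.

```python
import numpy as np


def mae(pred, truth):
    """Mean absolute error over all pixels (0 = perfect match)."""
    pred = np.asarray(pred, dtype=float)
    truth = np.asarray(truth, dtype=float)
    return float(np.mean(np.abs(pred - truth)))


def dice(pred, truth):
    """Dice coefficient of the foreground: 2|A and B| / (|A| + |B|)."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    total = pred.sum() + truth.sum()
    if total == 0:
        return 1.0  # both masks empty: treated as a perfect match
    return float(2.0 * np.logical_and(pred, truth).sum() / total)


def jaccard(pred, truth):
    """Jaccard index of the foreground: |A and B| / |A or B|."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    union = np.logical_or(pred, truth).sum()
    if union == 0:
        return 1.0  # both masks empty: treated as a perfect match
    return float(np.logical_and(pred, truth).sum() / union)


# Toy masks: 4 foreground pixels each, 3 of them shared.
truth = np.array([[1, 1, 0, 0],
                  [1, 1, 0, 0],
                  [0, 0, 0, 0],
                  [0, 0, 0, 0]])
pred = np.array([[1, 1, 1, 0],
                 [1, 0, 0, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]])

print(mae(pred, truth), dice(pred, truth), jaccard(pred, truth))
# prints 0.125 0.75 0.6
```

For a single pair of images the two overlap scores are related by J = DSC / (2 - DSC), so they rank models in the same order; MAE additionally counts background pixels, which is why it is small when the foreground occupies only a minor fraction of the image.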

In Fig. 4H and Fig. 4J, the capacity of our trained CNNs, after the introduction of task-specific and the tuning of data-specific hyperparameters, to process the data in Fig. 4G and Fig. 4I, respectively, is demonstrated. In both directions, the networks are shown to be fully capable of identifying the mineral phase and the organic matrices.

Quantitative Description of Nacre Formation

After hyperparameter optimization and selection of the best model for both networks, processing and segmentation of an entire nanotomography dataset were performed. See Supplementary Material (Raw and Processed Data) for the raw, deconvoluted and segmented data. The outputs of the two networks were fused by selecting the maximum pixel values out of both models to outline the mineral in the prismatic to nacre ultrastructural transition in 3D. Fig. 5A and the Supplementary Video S1 show a representative mesh that corresponds to 12% of the dataset, covering a volume of 50x50x30 µm3. Here, the organic intercrystalline material is color-coded in grayscale and the mineral phase is represented by the empty space between the organic membranes. While Fig. 5A provides a side view of the transition between the two ultrastructures, a close-up view of a segment taken from fully developed nacre is presented in Fig. 5B.

In agreement with previous reports, a gradual continuous transition between the two ultrastructures in U. pictorum is observed (Dauphin et al., 2018; Schoeppler et al., 2018). The prismatic mineral units gradually morph into nacre platelets. The initial nacre layers are irregular, discontinuous and curved (compare with Fig. 1C). However, with each additional layer, the nacreous ultrastructure becomes increasingly ordered until a highly regular layered morphology is observed. In addition, in accordance with multiple previous studies, the data clearly demonstrate that nacre is divided into sub-domains in which the platelets are stacked on top of each other to form jagged column-like sub-structures (Fig. 1D) (Maier et al., 2014; Olson et al., 2013; Schoeppler et al., 2018). In the 2D and 3D data representations, in Fig. 4H and Fig. 5B, respectively, these domains are outlined by intertabular interfaces color-coded in blue.
Each domain, having a unique crystallographic orientation, was previously postulated to be the result of near-epitaxial crystal growth (Olson et al., 2013). Here, the platelets in consecutive layers grow sequentially on top of each other, inheriting their crystallographic properties through the mineral bridges that connect them (Checa et al., 2011). These bridges are evident as gaps in the green-colored interlamellar matrix in Fig. 5B. The three-dimensional data not only provide a general overview of the transition between the two ultrastructures, but also allow us to analyze the geometry of individual platelets at different stages of nacre formation. Specifically, the evolution of the morphology, topology and size of the platelets was quantitatively described as a function of the direction of growth, which is normal to the layered assembly (upwards in Fig. 1B and Fig. 5A). The data were collected from 20 consecutive nacre layers representing a total thickness of approximately 25 µm. In each layer, at least 60 platelets were characterized. Three two-dimensional sections perpendicular to the direction
of growth that represent three different stages of nacre biomineralization are shown in Figs. 5C-5E. Fig. 5C was taken from the center of a single nacre layer close to the prismatic ultrastructure and, therefore, characterizes the early stages of nacre formation. Fig. 5D shows the morphology of the platelets 9 layers away from Fig. 5C and, similarly, Fig. 5E is located 9 layers away from Fig. 5D. In these images, platelets that belong to the same columnar sub-domain are color-coded with a unique color. This allowed us to quantitatively follow the formation of nacre as a function of the direction of growth by treating it as a column-like ultrastructure that grows in discrete steps, layer by layer. Following Figs. 5C-5E, a coarsening of the microstructure and a reduction of the average number of platelets are observed. The grain growth and coarsening of polycrystalline microstructures is a thoroughly studied phenomenon in materials science that is well understood from a theoretical viewpoint (Atkinson, 1988). A large number of analytical approaches have already been successfully applied to quantitatively describe these processes in a variety of materials systems, including the prismatic architecture in a number of molluscs (Bayerlein et al., 2014; Reich et al., 2019; Zöllner et al., 2017; Zöllner and Zlotnikov, 2019b). In particular, Burke and Turnbull (Burke and Turnbull, 1952) proposed one of the first physically motivated grain growth models, which describes the temporal evolution of the average grain size in a polycrystalline network such that

$$\langle R \rangle (t) = \left( b t + \langle R \rangle_0^{1/\delta} \right)^{\delta}. \tag{1}$$
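As a numerical illustration of Eq. (1), the sketch below evaluates the average radius for the ideal exponent 𝛿 = 0.5 and checks that the area of a circle of that radius then grows by the same amount, π𝑏, in every time step. The parameter values and function names are hypothetical and were chosen only for this example.

```python
import math


def mean_radius(t: float, b: float, r0: float, delta: float) -> float:
    """Average grain radius from Eq. (1): <R>(t) = (b*t + <R>_0**(1/delta))**delta."""
    return (b * t + r0 ** (1.0 / delta)) ** delta


def circle_area(radius: float) -> float:
    """Area of the circle with the given radius."""
    return math.pi * radius ** 2


# Hypothetical parameters in arbitrary units, not values from the paper.
b, r0, delta = 0.8, 1.5, 0.5

# Areas at six equally spaced time steps and the step-to-step increments.
areas = [circle_area(mean_radius(t, b, r0, delta)) for t in range(6)]
increments = [later - earlier for earlier, later in zip(areas, areas[1:])]
```

For 𝛿 = 0.5 the radius grows as a square root while the area grows linearly, which is why a straight line in an area-versus-step plot is the signature of ideal coarsening.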

Here 〈𝑅〉 is the average (linear) grain size, i.e. the average grain radius, which is often calculated as the radius of the circle with the same area as the grain in 2D or of the sphere with the same volume in 3D. Its initial value is 〈𝑅〉_0, 𝑏 is the growth constant that depends on the grain boundary mobility, 𝑚, and the boundary energy per boundary length, γ, and 𝛿 is the growth exponent. For the special case of ideal coarsening, where all boundaries in the system are assumed to have identical physical properties and the reduction of the boundary energy is the sole driving force, the growth exponent equals 𝛿 = 0.5. While this is hardly the case for most metals or alloys, it has recently been shown that the ideal case of Eq. (1) can be applied to describe the structural evolution of the prismatic ultrastructure in the shell of Atrina vexillum (Zöllner and Zlotnikov, 2019b). It is important to note that, so far, the studies that focused on the analytical description of grain growth in molluscan shells
assumed a linear correlation between the growth direction, 𝑧, and the time, 𝑡, and, therefore, the data were presented as a function of 𝑧. In the present case, the quantification of nacre morphogenesis is presented as a function of the layer number, 𝑖, which is assumed to correlate linearly with the direction of growth in mature nacre. The average nacre platelet area as a function of the layer number, 𝑖, is given in Fig. 6A. The data points follow a linear function, which is consistent with Eq. (1) when recalculated from radius to area, i.e. from 〈𝑅〉(𝑖) ∝ 𝑖^(1/2) to 〈𝐴〉(𝑖) ∝ 𝑖. Hence, the growth exponent 𝛿 = 0.5 describes the data very well. In addition, the error bars in Fig. 6A, which represent the width of the associated size distribution, demonstrate that a platelet may have an area up to nearly three times the average value. Naturally, an increase in average size is coupled to a decrease in the number of grains or, in the current case, platelets. Hence, individual platelets have to shrink and disappear, whereas others grow. This process is not random, but was shown to depend on grain topology (the number of edges) according to the classical von Neumann-Mullins law (Mullins, 1956; von Neumann, 1952):

$$\frac{dA}{dt} = \frac{\pi}{3} m \gamma \left( n - n_c \right). \tag{2}$$
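A fit of Eq. (2) can be sketched as an ordinary least-squares line through the area change rate versus the number of edges: the slope estimates the prefactor (π/3)𝑚γ and the zero crossing of the line gives the critical number of edges. The data below are synthetic and the helper names are hypothetical. In line with the layer-based description used in the text, the rate could be taken per layer instead of per unit time.

```python
from typing import Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit y = slope * x + intercept."""
    count = len(x)
    mean_x = sum(x) / count
    mean_y = sum(y) / count
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def critical_edges(
    edges: Sequence[float], rates: Sequence[float]
) -> Tuple[float, float]:
    """Return (k, n_c) for dA/dt = k * (n - n_c), where k stands for (pi/3)*m*gamma."""
    slope, intercept = fit_line(edges, rates)
    return slope, -intercept / slope


# Synthetic example generated with k = 0.4 and n_c = 5 (not measured data).
edges = [3, 4, 5, 6, 7, 8]
rates = [0.4 * (n - 5) for n in edges]
k, n_c = critical_edges(edges, rates)
```

Platelets with more edges than the fitted critical number have a positive rate and grow, while those with fewer edges shrink.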

Here 𝐴 is the area of a grain, 𝑛 is the number of edges of the grain, and 𝑛_𝑐 is the critical number of edges, which takes a value of six in ideal coarsening. The analysis of a total of 1450 platelets from all 19 segmented nacre layers in accordance with Eq. (2) is presented in Fig. 6B. The linear least-squares fit to the data describes the coarsening as a function of the topological number very well, whereas the critical number of edges deviates from the value predicted for the case of ideal coarsening. The value of 𝑛_𝑐 ≈ 5 indicates that in nacre, on average, the six-edged platelets grow and the five-edged platelets preserve their size. Apart from following the structural evolution of nacre during growth, individual nacre layers can be analyzed with respect to their platelet size and topological characteristics. In particular, in polycrystalline and cellular materials, the relationship between grain size and grain topology is anticipated to follow the Lewis law (Lewis, 1931):

$$\rho = \frac{A}{\langle A \rangle} = \alpha n + \left( 1 - 6\alpha \right). \tag{3}$$
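Eq. (3) offers two routes to the geometrical factor from a single unconstrained straight-line fit of scaled area against number of edges: the slope is the factor itself, while the intercept equals one minus six times the factor. The sketch below illustrates this on synthetic platelet areas. All names and values are hypothetical, and the data were constructed to follow Eq. (3) exactly, so both estimates coincide.

```python
from typing import List, Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit y = slope * x + intercept."""
    count = len(x)
    mean_x = sum(x) / count
    mean_y = sum(y) / count
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def scale_areas(areas: Sequence[float]) -> List[float]:
    """Scaled area rho = A / <A>, with <A> the mean area within one layer."""
    mean_area = sum(areas) / len(areas)
    return [area / mean_area for area in areas]


def lewis_alpha(
    edges: Sequence[float], areas: Sequence[float]
) -> Tuple[float, float]:
    """Estimate alpha from the slope and from the intercept (1 - 6*alpha) of Eq. (3)."""
    slope, intercept = fit_line(edges, scale_areas(areas))
    return slope, (1.0 - intercept) / 6.0


# Synthetic layer: areas in arbitrary units built from alpha = 0.21 (not measured data).
edges = [4, 5, 6, 7, 8]
areas = [2.0 * (0.21 * n + (1 - 6 * 0.21)) for n in edges]
alpha_from_slope, alpha_from_intercept = lewis_alpha(edges, areas)
```

With real data the two estimates need not agree, so comparing them is a simple consistency check on how well a layer follows the Lewis law.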

Here, the scaled platelet area in layer 𝑖, 𝜌 = 𝐴/〈𝐴〉_𝑖, is related to the number of edges, 𝑛, of that platelet,
where 𝛼 is a geometrical factor. Fig. 6C shows that the size and the topology of the platelets follow such a linear correlation in five independently analyzed equidistant nacre layers. The unconstrained linear least-squares fit to the data from all five layers yields 𝛼 = 0.21 from the first and the second term of Eq. (3), independently. This value is smaller than the values previously calculated for the columnar prismatic ultrastructures in the shells of Atrina vexillum (𝛼 ≈ 0.36), Atrina rigida (𝛼 ≈ 0.41) and Pinna nobilis (𝛼 ≈ 0.42) (Reich et al., 2019). In addition, the Aboav-Weaire law (Aboav, 1970; Mombach et al., 1990; Weaire, 1974), a purely topological relationship that describes how grains in a network are arranged in space, relates the number of edges, 𝑛, of a platelet to the average number of edges of all neighboring platelets, 𝑛_𝑚, by

$$n_m n = \left( \langle n \rangle - \beta \right) n + \left( \langle n \rangle \beta + \mu_2 \right). \tag{4}$$
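The quantities entering Eq. (4) can be obtained from a platelet adjacency map. The sketch below assumes that such a map is available as a dictionary from platelet label to neighbor labels and that the number of edges of a platelet equals its number of neighbors. Both are assumptions made for illustration, as are all names and the toy data. It computes the edge counts, their mean and second moment, the neighbor averages, and the two estimates of the geometrical constant that follow from the slope and the constant term of a linear fit.

```python
from typing import Dict, List, Tuple


def edge_statistics(
    neighbors: Dict[int, List[int]]
) -> Tuple[Dict[int, int], float, float]:
    """Return per-platelet edge counts n, their mean <n> and second moment mu_2."""
    n = {label: len(adjacent) for label, adjacent in neighbors.items()}
    mean_n = sum(n.values()) / len(n)
    mu_2 = sum((value - mean_n) ** 2 for value in n.values()) / len(n)
    return n, mean_n, mu_2


def neighbor_mean_edges(
    neighbors: Dict[int, List[int]], n: Dict[int, int]
) -> Dict[int, float]:
    """n_m: average number of edges of the neighbors of each platelet."""
    return {
        label: sum(n[other] for other in adjacent) / len(adjacent)
        for label, adjacent in neighbors.items()
    }


def beta_estimates(
    slope: float, intercept: float, mean_n: float, mu_2: float
) -> Tuple[float, float]:
    """beta from slope = <n> - beta and from intercept = <n>*beta + mu_2 (Eq. 4)."""
    return mean_n - slope, (intercept - mu_2) / mean_n


# Toy adjacency map of four cells (not nacre data).
neighbors = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2]}
n, mean_n, mu_2 = edge_statistics(neighbors)
n_m = neighbor_mean_edges(neighbors, n)
products = {label: n_m[label] * n[label] for label in neighbors}  # left side of Eq. (4)
```

In practice the products would be plotted against the edge count over many platelets, fitted with a straight line, and the fitted slope and intercept passed to `beta_estimates`.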

Here, 𝛽 is a geometrical constant, which is usually close to unity, 𝜇_2 is the second moment of the distribution of the number of edges, and 〈𝑛〉 is the average number of edges of the entire microstructure, which is six in the case of ideal coarsening. It is important to note that, in contrast to the Lewis law, this relationship has been shown to describe a broad variety of cellular networks, such as nanocrystalline metals, colloidal soap froth structures and even vegetable tissues (Mombach et al., 1990). In the present case, the linear correlation predicted by Eq. (4) between the product of the average number of edges of all neighboring platelets and the number of edges of the corresponding central platelet, 𝑛_𝑚𝑛, and 𝑛 itself is indeed fulfilled (Fig. 6D). The unconstrained linear least-squares fit to the data from five nacre layers yields geometrical constant values of 𝛽 = 1.05 and 𝛽 = 1.10 from the slope and the constant term of Eq. (4), respectively. Here, the average number of edges of all the platelets in the considered layers is 〈𝑛〉 = 5.92, indeed close to 6, and the associated second moment is 𝜇_2 = 1.36. These values are again smaller than the geometrical constant obtained in the prismatic layer of, e.g., Atrina vexillum with 𝛽 ≈ 1.20 (Zöllner and Zlotnikov, 2019b). This analysis clearly demonstrates that concepts from classical materials science are not only capable of describing the formation of the coarse prismatic architecture, but also provide an analytical framework to quantify the deposition of nacre. The morphological and topological behavior of the individual mineral platelets and the entire nacreous ultrastructure is well predicted by mathematical models that are commonly used to analyze spontaneous growth and coarsening
processes in generic materials systems. Together with previously obtained data on the prismatic layers in bivalves (Bayerlein et al., 2014; Reich et al., 2019; Zöllner and Zlotnikov, 2019a), this outcome supports the hypothesis that molluscan shell biomineralization is a biologically regulated, thermodynamically driven self-assembly process (Schoeppler et al., 2018).

4. Conclusions

In summary, in this study, the combination of a state-of-the-art X-ray-based imaging technique with advanced machine-learning-based image processing methods provides the first detailed quantitative view of nacre morphogenesis in 3D. It is important to note that the approach used for automatic segmentation can possibly be improved in a variety of ways. For example, 3D information can be included in network training and subsequent segmentation. Also, other network architectures may prove to be better suited to the studied structure and, of course, additional optimization might decrease the number of parameters and increase the training speed and the quality of data segmentation. Finally, the current approach may suffer from errors introduced by its strong dependence on the manually labeled ground truth. Nevertheless, this work clearly demonstrates the potential of deep learning approaches in advancing our understanding of biomineralized tissue formation based on 3D imaging.

References

Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G., Davis, A., Dean, J., Devin, M., Ghemawat, S., Goodfellow, I., Harp, A., Irving, G., Isard, M., Jia, Y., Jozefowicz, R., Kaiser, L., Kudlur, M., Levenberg, J., Mane, D., Monga, R., Moore, S., Murray, D., Olah, C., Schuster, M., Shlens, J., Steiner, B., Sutskever, I., Talwar, K., Tucker, P., Vanhoucke, V., Vasudevan, V., Viegas, F., Vinyals, O., Warden, P., Wattenberg, M., Wicke, M., Yu, Y., Zheng, X., 2015. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems.
Aboav, D.A., 1970. The arrangement of grains in a polycrystal. Metallography 3, 383–390. https://doi.org/10.1016/0026-0800(70)90038-8
Atkinson, H.V., 1988. Theories of normal grain growth in pure single phase systems. Acta Metall. 36, 469–491.
Bayerlein, B., Zaslansky, P., Dauphin, Y., Rack, A., Fratzl, P., Zlotnikov, I., 2014. Self-similar mesostructure evolution of the growing mollusc shell reminiscent of thermodynamically driven grain growth. Nat. Mater. 13, 1102–1107.
Bertoldi, K., Bigoni, D., Drugan, W.J., 2008. Nacre: An orthotropic and bimodular elastic material. Compos. Sci. Technol. 68, 1363–1375. https://doi.org/10.1016/j.compscitech.2007.11.016
Bevelander, G., Nakahara, H., 1969. An electron microscope study of the formation of the nacreous layer in the shell of certain bivalve molluscs. Calcif. Tissue Res. 3, 84–92. https://doi.org/10.1007/BF02058648
Bøggild, O.B., 1930. The shell structure of the mollusks. K. Danks. Selsk. Skr. naturh. Math. Afd. 9, 231. https://doi.org/10.1088/0022-3700/9/5/001
Burke, J., Turnbull, D., 1952. Recrystallization and grain growth. Prog. Met. Phys. 3, 220.
Cartwright, J.H.E., Checa, A.G., 2007. The dynamics of nacre self-assembly. J. R. Soc. Interface 4, 491–504. https://doi.org/10.1098/rsif.2006.0188
Cartwright, J.H.E., Checa, A.G., Escribano, B., Sainz-Díaz, C.I., 2009. Spiral and target patterns in bivalve nacre manifest a natural excitable medium from layer growth of a biological liquid crystal. Proc. Natl. Acad. Sci. U. S. A. 106, 10499–504. https://doi.org/10.1073/pnas.0900867106
Chatfield, K., Simonyan, K., Vedaldi, A., Zisserman, A., 2014. Return of the Devil in the Details: Delving Deep into Convolutional Nets. Proc. Br. Mach. Vis. Conf. 2014 6.1-6.12. https://doi.org/10.5244/C.28.6
Checa, A.G., Cartwright, J.H.E., Willinger, M.G., 2011. Mineral bridges in nacre. J. Struct. Biol. 176, 330–339. https://doi.org/10.1016/j.jsb.2011.09.011
Checa, A.G., Rodriguez-Navarro, A.B., 2005. Self-organisation of nacre in the shells of Pterioida (Bivalvia: Mollusca). Biomaterials 26, 1071–1079. https://doi.org/10.1016/j.biomaterials.2004.04.007
Chollet, F., 2015. Keras Documentation. Keras.Io.
Dauphin, Y., 2002. Comparison of the soluble matrices of the calcitic prismatic layer of Pinna nobilis (Mollusca, Bivalvia, Pteriomorpha). Comp. Biochem. Physiol. A. Mol. Integr. Physiol. 132, 577–590.
Dauphin, Y., Luquet, G., Salome, M., Bellot-Gurlet, L., Cuif, J.P., 2018. Structure and composition of Unio pictorum shell: arguments for the diversity of the nacroprismatic arrangement in molluscs. J. Microsc. 270, 156–169. https://doi.org/10.1111/jmi.12669
Devol, R.T., Sun, C.Y., Marcus, M.A., Coppersmith, S.N., Myneni, S.C.B., Gilbert, P.U.P.A., 2015. Nanoscale transforming mineral phases in fresh nacre. J. Am. Chem. Soc. 137, 13325–13333. https://doi.org/10.1021/jacs.5b07931
Devijver, P.A., Kittler, J., 1982. Pattern Recognition: A Statistical Approach. Prentice Hall International, London.
Dice, L.R., 1945. Measures of the Amount of Ecologic Association Between Species. Ecology. https://doi.org/10.2307/1932409
Falk, T., Mai, D., Bensch, R., Çiçek, Ö., Abdulkadir, A., Marrakchi, Y., Böhm, A., Deubner, J., Jäckel, Z., Seiwald, K., Dovzhenko, A., Tietz, O., Dal Bosco, C., Walsh, S., Saltukoglu, D., Tay, T.L., Prinz, M., Palme, K., Simons, M., Diester, I., Brox, T., Ronneberger, O., 2019. U-Net: deep learning for cell counting, detection, and morphometry. Nat. Methods 16, 67–70. https://doi.org/10.1038/s41592-018-0261-2
Garcia-Garcia, A., Orts-Escolano, S., Oprea, S., Villena-Martinez, V., Garcia-Rodriguez, J., 2017. A Review on Deep Learning Techniques Applied to Semantic Segmentation 1–23.
Hahnloser, R.H.R., Sarpeshkar, R., Mahowald, M.A., Douglas, R.J., Seung, H.S., 2000. Digital selection and analogue amplification coexist in a cortex-inspired silicon circuit. Nature 405, 947–951. https://doi.org/10.1038/35016072
Hasegawa, A., Lo, S.-C.B., Freedman, M.T., Mun, S.K., 1994. Convolution neural-network-based detection of lung structures. Proc. SPIE Image Process. 2167, 654–662. https://doi.org/10.1117/12.175101
Hubert, M., Pacureanu, A., Guilloud, C., Yang, Y., Da Silva, J.C., Laurencin, J., Lefebvre-Joud, F., Cloetens, P., 2018. Efficient correction of wavefront inhomogeneities in X-ray holographic nanotomography by random sample displacement. Appl. Phys. Lett. 112. https://doi.org/10.1063/1.5026462
Iassonov, P., Gebrenegus, T., Tuller, M., 2009. Segmentation of X-ray computed tomography images of porous materials: A crucial step for characterization and quantitative analysis of pore structures. Water Resour. Res. 45, 1–12. https://doi.org/10.1029/2009WR008087
Ioffe, S., Szegedy, C., 2015. Batch normalization: Accelerating deep network training by reducing internal covariate shift. 32nd Int. Conf. Mach. Learn. ICML 2015 1, 448–456.
Jaccard, P., 1901. Etude de la distribution florale dans une portion des Alpes et du Jura. Bull. la Soc. Vaudoise des Sci. Nat. 37, 547–579.
Kingma, D.P., Ba, J., 2014. Adam: A Method for Stochastic Optimization 1–15.
Lemanis, R., Zlotnikov, I., 2018. Finite Element Analysis as a Method to Study Molluscan Shell Mechanics. Adv. Eng. Mater. 20, 1–24. https://doi.org/10.1002/adem.201700939
Levi-Kalisman, Y., Falini, G., Addadi, L., Weiner, S., 2001. Structure of the nacreous organic matrix of a bivalve mollusk shell examined in the hydrated state using cryo-TEM. J. Struct. Biol. 135, 8–17. https://doi.org/10.1006/jsbi.2001.4372
Lewis, F.T., 1931. A comparison between the mosaic of polygons in a film of artificial emulsion and the pattern of simple epithelium in surface view (cucumber epidermis and human amnion). Anat. Rec. 50, 235–265. https://doi.org/10.1002/ar.1090500303
Litjens, G., Kooi, T., Bejnordi, B.E., Setio, A.A.A., Ciompi, F., Ghafoorian, M., van der Laak, J.A.W.M., van Ginneken, B., Sánchez, C.I., 2017. A survey on deep learning in medical image analysis. Med. Image Anal. 42, 60–88. https://doi.org/10.1016/j.media.2017.07.005
Maier, B.J., Griesshaber, E., Alexa, P., Ziegler, A., Ubhi, H.S., Schmahl, W.W., 2014. Biological control of crystallographic architecture: Hierarchy and co-alignment parameters. Acta Biomater. 10, 3866–3874. https://doi.org/10.1016/j.actbio.2014.02.039
Marie, B., Guichard, N., Barros, J.P. De, Luquet, G., Marin, F., 2007a. Calcification in the Shell of the Freshwater Bivalve Unio pictorum. Biominer. from Paleontol. to Mater. Sci. 273–280.
Marie, B., Luquet, G., Pais De Barros, J.-P., Guichard, N., Morel, S., Alcaraz, G., Bollache, L., Marin, F., 2007b. The shell matrix of the freshwater mussel Unio pictorum (Paleoheterodonta, Unionoida). Involvement of acidic polysaccharides from glycoproteins in nacre mineralization. FEBS J. 274, 2933–45. https://doi.org/10.1111/j.1742-4658.2007.05825.x
Marin, F., Le Roy, N., Marie, B., 2012. The formation and mineralization of mollusk shell. Front. Biosci. S4, 1099–1125. https://doi.org/10.2741/S321
Meyers, M.A., McKittrick, J., Chen, P.-Y., 2013. Structural Biological Materials: Critical Mechanics-Materials Connections. Science 339, 773–779. https://doi.org/10.1126/science.1220854
Mombach, J.C.M., Vasconcellos, M.A.Z., De Almeida, R.M.C., 1990. Arrangement of cells in vegetable tissues. J. Phys. D. Appl. Phys. 23, 600–606. https://doi.org/10.1088/0022-3727/23/5/021
Mullins, W.W., 1956. Two-dimensional motion of idealized grain boundaries. J. Appl. Phys. 27, 900–904. https://doi.org/10.1063/1.1722511
Mutvei, H., 1977. The nacreous layer in Mytilus, Nucula, and Unio (Bivalvia) - Crystalline composition and nucleation of nacreous tablets. Calcif. Tissue Res. 24, 11–18. https://doi.org/10.1007/BF02223291
Nakahara, H., Bevelander, G., Kakei, M., 1982. Electron microscopic and amino acid studies on the outer and inner shell layers of Haliotis rufescens. Venus (Japanese J. Malacol.) 41, 32–46. https://doi.org/10.18941/venusjjm.41.1_32
Nudelman, F., 2015. Nacre biomineralisation: A review on the mechanisms of crystal nucleation. Semin. Cell Dev. Biol. 46, 2–10. https://doi.org/10.1016/j.semcdb.2015.07.004
Olson, I.C., Blonsky, A.Z., Tamura, N., Kunz, M., Pokroy, B., Romao, C.P., White, M.A., Gilbert, P.U.P.A., 2013. Crystal nucleation and near-epitaxial growth in nacre. J. Struct. Biol. 184, 454–463. https://doi.org/10.1016/j.jsb.2013.10.002
Pacureanu, A., Maniates-Selvin, J., Kuan, A.T., Thomas, L.A., Chen, C.-L., Cloetens, P., Lee, W.-C.A., 2019. Dense neuronal reconstruction through X-ray holographic nanotomography. bioRxiv 653188. https://doi.org/10.1101/653188
Qiu, J., Wu, Q., Ding, G., Xu, Y., Feng, S., 2016. A survey of machine learning for big data processing. EURASIP J. Adv. Signal Process. 2016. https://doi.org/10.1186/s13634-016-0355-x
Reich, E., Schoeppler, V., Lemanis, R., Lakin, E., Zolotoyabko, E., Zöllner, D., Zlotnikov, I., 2019. Morphological and textural evolution of the prismatic ultrastructure in mollusc shells: A comparative study of Pinnidae species. Acta Biomater. 85, 272–281. https://doi.org/10.1016/J.ACTBIO.2018.12.023
Ronneberger, O., Fischer, P., Brox, T., 2015. U-net: Convolutional networks for biomedical image segmentation. Lect. Notes Comput. Sci. (including Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinformatics) 9351, 234–241. https://doi.org/10.1007/978-3-319-24574-4_28
Schoeppler, V., Gránásy, L., Reich, E., Poulsen, N., de Kloe, R., Cook, P., Rack, A., Pusztai, T., Zlotnikov, I., 2018. Biomineralization as a Paradigm of Directional Solidification: A Physical Model for Molluscan Shell Ultrastructural Morphogenesis. Adv. Mater. 30, 1803855. https://doi.org/10.1002/adma.201803855
Shen, D., Wu, G., Suk, H.-I., 2017. Deep Learning in Medical Image Analysis. Annu. Rev. Biomed. Eng. 19, 221–248. https://doi.org/10.1146/annurev-bioeng-071516-044442
Srivastava, R., Yadav, N., Chattopadhyay, J., 2016. Growth and Form of Self-organized Branched Crystal Pattern in Nonlinear Chemical System 65. https://doi.org/10.1007/978-981-10-0864-1
Sun, J., Bhushan, B., 2012a. Hierarchical structure and mechanical properties of nacre: a review. RSC Adv. 2, 7617. https://doi.org/10.1039/c2ra20218b
Sun, J., Bhushan, B., 2012b. Hierarchical structure and mechanical properties of nacre: A review. RSC Adv. 2, 7617–7632. https://doi.org/10.1039/c2ra20218b
Taylor, J.D., Kennedy, W.J., Hall, A., 1969. The shell structure and mineralogy of the Bivalvia. Nuculacea–Trigonacea. Bull. Brit. Mus. 3, 1–125.
Taylor, J.D., Layman, M., 1972. The mechanical properties of bivalve (Mollusca) shell structures. Palaeontology 15, 73–87.
Van Damme, D., 2011. Unio pictorum. IUCN Red List Threat. Species. https://doi.org/10.2305/IUCN.UK.2011-2.RLTS.T155543A4795613.en
von Neumann, J., 1952. Metal interfaces. American Society of Metals, Cleveland.
Wada, K., 1966. Spiral Growth of Nacre. Nature. https://doi.org/10.1038/2111427a0
Wang, J., Cheng, Q., Tang, Z., 2012. Layered nanocomposites inspired by the structure and mechanical properties of nacre. Chem. Soc. Rev. 41, 1111–1129. https://doi.org/10.1039/C1CS15106A
Weaire, D., 1974. Some remarks on the arrangement of grains in a polycrystal. Metallography 7, 157–160. https://doi.org/10.1016/0026-0800(74)90004-4
Wegst, U.G.K., Bai, H., Saiz, E., Tomsia, A.P., Ritchie, R.O., 2015. Bioinspired structural materials. Nat. Mater. 14, 23–36.
Weigert, M., Schmidt, U., Boothe, T., Müller, A., Dibrov, A., Jain, A., Wilhelm, B., Schmidt, D., Broaddus, C., Culley, S., Rocha-Martins, M., Segovia-Miranda, F., Norden, C., Henriques, R., Zerial, M., Solimena, M., Rink, J., Tomancak, P., Royer, L., Jug, F., Myers, E.W., 2018. Content-aware image restoration: pushing the limits of fluorescence microscopy. Nat. Methods 15, 1090–1097. https://doi.org/10.1038/s41592-018-0216-7
Willmott, C., Matsuura, K., 2005. Advantages of the mean absolute error (MAE) over the root mean square error (RMSE) in assessing average model performance. Clim. Res. 30, 79–82.
Younis, S., Kauffmann, Y., Bloch, L., Zolotoyabko, E., 2012. Inhomogeneity of nacre lamellae on the nanometer length scale. Cryst. Growth Des. 12, 4574–4579.
Zöllner, D., Reich, E., Zlotnikov, I., 2017. Morphogenesis of biomineralized calcitic prismatic tissue in mollusca fully described by classical hierarchical grain boundary motion. Cryst. Growth Des. 17, 5023–5027. https://doi.org/10.1021/acs.cgd.7b00965
Zöllner, D., Zlotnikov, I., 2019a. Biomineralized tissue formation as an archetype of ideal grain growth. Mater. Horizons 6, 751–757. https://doi.org/10.1039/c8mh01153b
Zöllner, D., Zlotnikov, I., 2019b. Biomineralized tissue formation as an archetype of ideal grain growth. Mater. Horizons. https://doi.org/10.1039/C8MH01153B

Figure 1. The bivalve shell of U. pictorum. (A) Two valves of the shell showing the outer (top) and the inner (bottom) surfaces of the shell. (B) Scanning electron microscopy image of a fractured cross-section of the shell showing the nacreous (top) and prismatic (bottom) ultrastructures. Scale bar is 20 µm. (C) Detailed view of the nacreous structure. Scale bar is 3 µm. (D) A schematic representation of the nacreous architecture.

Figure 2. Tomographic data analysis. (A-B) Synchrotron-based 2D tomographic sections of nacre in U. pictorum obtained perpendicular and parallel to the layered arrangement, in cross-section and top view, respectively. Scale bars are 1 µm. I - Otsu thresholding of the data marked in (A) and (B). II - Processed segments marked in (A) and (B) using CNNs provided by Weigert et al. (Weigert et al., 2018). III - Otsu thresholding of the data in II. IV - Manual labeling of the data in (A) and (B) used to train our CNNs. Here, the interlamellar organic phase is color-coded in green, the intertabular organic phase is color-coded in blue and the different densities in the biomineral phase are color-coded using a brown scale in which the dense mineral zones are white and the intracrystalline organics-rich zones are brown. (C) Intensity distribution of all pixels in the data marked in (A) and (B). (D) Intensity distributions of all four material classes in the data in (C) using the same color-coding scheme. (E) A closer look at the intensity distributions as marked in (D).

Figure 3. Example of a U-Net architecture used in this work comprising 2 blocks and 16 channels. “Channels” here refers to the number of channels in the first convolutional layer, which are highlighted in green. “Blocks” refers to the number of iterations of downscaling of features and subsequent upscaling, which are highlighted in blue and red.

Figure 4. Evaluation of CNN performance. (A-B) Dice coefficient evaluation obtained by training networks to process the cross-section and the top view of the nacreous ultrastructure, respectively, showing (I) the effect of varying the number of blocks and channels and (II) the effect on the Dice coefficient of: a - data augmentation, b - halving the size of the training data, c - introducing supplementary functional layers to the network, d - using original images instead of the preprocessed ones, e - halving the size of the input images. (C-D) Segmented organic phase in a cross-section. (E-F) Segmented organic phase in top-view images. In (C) and (E) the segmentation was performed employing the “shallow” model (3-32). In (D) and (F) the segmentation was performed employing “deep” models (4-128 for cross-section and 5-160 for top view). The images are color-coded to highlight true positives in green, false positives in orange and false negatives in blue. (G-H) Original and processed tomographic sections in cross-section, respectively. (I-J) Original and processed tomographic sections in top view, respectively.

Figure 5. Tomography data visualization and segmentation. (A) A representative 3D segment of the prismatic-to-nacre transition in U. pictorum. (B) A color-coded representative 3D segment of the nacreous assembly. Here, the interlamellar organic phase is color-coded in green and the intertabular organic phase is color-coded in blue. (C-E) Segmented 2D tomographic sections taken perpendicular to the growth direction of nacre at different distances from the prismatic ultrastructure: (C) in close proximity to the nacre-prismatic transition zone; (D) 9 nacre layers away from (C); (E) 9 layers away from (D). Scale bars are 5 µm. Here, platelets that belong to the same structural sub-domain are color-coded with a unique color.

Figure 6. Quantification of nacre formation. (A) Average growth law relating platelet areas to the layer number, i, in the direction of growth from 19 successive segmented nacre layers together with a least-squares fit to Eq. (1). (B) von Neumann-Mullins law describing platelet area change rate as a function of the number of edges, n, in all 19 segmented nacre layers together with a least-squares fit to Eq. (2). (C) Lewis law analysis showing the scaled area of individual platelets as a function of the number of edges within five different nacre layers together with a least-squares fit to Eq. (3). (D) Aboav-Weaire law analysis within five different nacre layers together with a least-squares fit to Eq. (4).

Declaration of interests

☒ The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.


Author contributions:
Maksim Beliaev: Investigation, Software, Methodology, Visualization, Writing - Original Draft
Dana Zöllner: Software, Methodology, Writing - Review & Editing
Alexandra Pacureanu: Investigation, Methodology, Writing - Review & Editing
Paul Zaslansky: Investigation, Writing - Review & Editing
Luca Bertinetti: Software, Writing - Review & Editing
Igor Zlotnikov: Conceptualization, Supervision, Methodology, Investigation, Writing - Original Draft
