A machine-learning approach to predicting and understanding the properties of amorphous metallic alloys

A machine-learning approach to predicting and understanding the properties of amorphous metallic alloys

Journal Pre-proof A machine-learning approach to predicting and understanding the properties of amorphous metallic alloys Jie Xiong, San-Qiang Shi, T...

2MB Sizes 0 Downloads 64 Views

Journal Pre-proof A machine-learning approach to predicting and understanding the properties of amorphous metallic alloys

Jie Xiong, San-Qiang Shi, Tong-Yi Zhang PII:

S0264-1275(19)30816-0

DOI:

https://doi.org/10.1016/j.matdes.2019.108378

Reference:

JMADE 108378

To appear in:

Materials & Design

Received date:

17 October 2019

Revised date:

11 November 2019

Accepted date:

19 November 2019

Please cite this article as: J. Xiong, S.-Q. Shi and T.-Y. Zhang, A machine-learning approach to predicting and understanding the properties of amorphous metallic alloys, Materials & Design(2019), https://doi.org/10.1016/j.matdes.2019.108378

This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

© 2019 Published by Elsevier.

Journaland Pre-proof A machine-learning approach to predicting understanding the properties of amorphous

metallic alloys Jie Xiongb, San-Qiang Shib*, Tong-Yi Zhanga,b* a

Material Genome Institute, Shanghai University, Shanghai, China; bDepartment of Mechanical Engineering, The Hong Kong

Polytechnic University, Hong Kong, China

Abstract There is a pressing need to shorten the development period for new materials possessing desired properties. For example, bulk metallic glasses (BMGs) are a unique class of alloy materials utilized in a wide variety of

f

applications due to their attractive physical properties. However, the lack of predictive tools for uncovering

oo

the relationships between BMGs’ alloy composition and desired properties limits the further application of these materials. In this study, a machine-learning (ML) approach was developed, based on a dataset of 6,471

pr

alloys, to enable the construction of a predictive ML model to describe the glass-forming ability and elastic

e-

moduli of BMGs. The model’s predictions of unseen data were found to be in good agreement with most experimental values. Consequently, we determined that an alloy with a large critical-casting diameter would

Pr

likely have a high mixing entropy, a high thermal conductivity, and a mixing enthalpy of approximately -28 kJ/mol, and that a BMG with a small average atomic volume would likely have a high elastic modulus. The

Jo u

1. Introduction

rn

metallic-glass properties.

al

efficacy of ML was demonstrated in furnishing a mechanistic understanding and enabling the prediction of

The amorphous structure of metallic glasses results in distinct mechanical and physical properties that are not exhibited by conventional crystalline alloys. For example, the disordered structure leads to high yield strength and wear resistance [1, 2], high magnetic permeability [3], and high corrosion resistance [4] in some metallic glasses, meaning that bulk metallic glasses (BMGs) are promising materials for engineering applications. Paradoxically, however, the absence of structural order in metallic glasses is not conducive to their bulk manufacture, which hinders their use and application. Fewer than 1,000 BMGs have been developed in the last 50 years, and many of these comprised multiple components; indeed, BMGs usually contain three or more components [5]. In addition, the nonequilibrium nature of the amorphous phase indicates that various processing parameters, such as cooling rates and enhanced surface diffusion, affect the formation ability of amorphous metallic glass [6]. This generates an enormous combinatorial space of composition-processing parameters for BMGs, rendering a trial-and-error based design method extremely challenging. The conventional development of metallic glasses mainly relies on experience; for instance, metallic 1

Pre-proof glasses are known to form in systems nearJournal deep eutectics [7]. In recent decades, various empirical models based on factors such as thermodynamics [8-10], valence electron distribution [11], atomic size mismatches [12], and transformation temperatures [13] have provided fundamental insights into metallic glass-forming conditions. However, these models are usually limited to certain metallic glass compositions, and there is no universal model capable of predicting the glass-forming ability of unknown alloys. Recently, machine-learning (ML) approaches have been utilized to predict the properties of metallic materials. For example, Xue et al. developed ML models to search for shape memory alloys with targeted transformation temperatures [14, 15]. Cheng et al. formulated a materials design strategy combining ML models with experiments to find high entropy alloys with large hardness [16]. Feng et al. utilized a deep neural network to predict the defects in stainless steel [17]. Sun et al. developed ML models to predict the

f

glass-forming ability of binary metallic alloys [18]. Ward et al. constructed a ML framework for

oo

accelerating the design of engineered metallic glasses and validated it via commercially viable fabrication methods [19]. Ren et al. introduced ML-based iteration and high-throughput experiments to rapidly discover

pr

new glass-forming systems [20]. Several studies have used ML approaches to identify the correlations between glass-forming ability and the experimentally measured properties of an alloy [21, 22]. These studies

e-

thus confirmed that ML methods were reliable and efficient in discovering new metallic glasses and

Pr

predicting their properties.

In this work, we developed and validated a general ML framework for the prediction of the

al

properties and design of metallic glasses, based on their compositions. A dataset of metallic glasses was created by collecting data from several related studies [23-34] and the Landolt-Bornstein handbook on

rn

amorphous ternary alloys [35]. These data covered variables such as BMG glass-forming ability, critical

Jo u

casting diameter, and elastic properties. ML tools such as random forest (RF) and symbolic regression were used to create models for predicting the desired properties of metallic glass compositions. Key descriptors or features were screened by a three-step selection method during model constructions, and symbolic regression was used to develop mathematical expressions based on these features. This ML framework was shown to accelerate the development of new metallic glass-forming systems and provide a greater understanding of the physics underpinning the properties of metallic glasses.

2. Method 2.1 Data collection The data on BMGs were collected from several publicly available resources and partitioned into four training sets: glass-forming ability (GFA); critical casting diameter (Dmax); characteristic transformation temperatures (CTTs); and elastic moduli (EM). The training sets represented a wide range of elements, including metals and metalloids, with the GFA and Dmax datasets containing 54 elements, the CTT datasets containing 42 elements, and the EM datasets containing 48 elements. 2

Journal GFA describes whether an alloy can form an Pre-proof amorphous bulk (i.e., a BMG) via rapid cooling (e.g., by copper-mold casting, injection molding, or suction casting), an amorphous ribbon (i.e. a ribbon metallic glass, or RMG), or a crystalline alloy (CRA). If there have been conflicting reports about the GFA of individual alloys, the highest GFA was selected for our dataset. For example, Al25Gd55Ni20 was reported in the Landolt-Bornstein handbook [35] to form an RMG, but Fang et al. found that this alloy can form a bulk sample with a diameter of 3 mm [32], so it was denoted a BMG in our dataset. This GFA dataset consisted of 6,471 unique alloy compositions in total, comprising 1,211 BMGs, 1,552 CRAs, and 3,708 RMGs. In terms of physical metallurgy, the critical cooling rate of an alloy is the most reliable and quantifiable measurement of its GFA [34]. This is the slowest cooling rate above which no crystallization occurs during the solidification of an alloy, and is determined using data from multiple solidifications at

f

different cooling rates. Dmax is a slightly less rigorous parameter and much easier to experimentally obtain

oo

than the critical cooling rate, as it is the largest diameter or the largest section-thickness of an alloy when it is cast into a fully amorphous rod or plate. In general, the slower the cooling rate, the larger the Dmax, and the

pr

higher the GFA.

e-

The Dmax dataset contained values for 5,934 entries. For BMG alloys, only the critical copper-mold casting diameter values reported in the literature were included, i.e., the Dmax of samples from injection

Pr

molding and suction casting were not considered. For RMG alloys, a single and small value of 0.1 mm for Dmax was assumed as appropriate for this dataset, as this was half the Dmax of the smallest reported BMG,

Jo u

rn

of Dmax, the highest was used.

al

Ti50Cu42.5Ni7.5 [34]. The Dmax of CRAs was set at 0. As with the GFA dataset, if there were multiple values

Fig. 1. The elements included in the four constructed datasets, which described a BMG alloy in terms of its GFA (red fill), Dmax (green fill), CTT (yellow fill), and EM (blue fill).

3

Journal Pre-proof Various criteria have been proposed to compare the GFA of alloys, with most based on CTTs [36]. Varying combinations of transformation temperatures have been found to determine the relationship of Dmax to GFA criteria, such as the supercooled liquid range temperature

[36] and the reduced glass transition

[37]. Here, Tg is the glass transition temperature, Tx is the onset of crystallization

temperature, and Tl is the liquidus temperature. The CTT dataset was assembled from 674 measurements of differential thermal analysis or differential scanning calorimetry at a constant heating rate, with most of the entries obtained at rate of 20 K/min. When multiple values of a CTT were reported, their averaged value was used here. The EM data included the shear moduli G and bulk moduli B of BMGs. The ratio of G to B correlated with the fracture toughness, intrinsic plasticity, and GFA of metallic glasses [34]. The Young’s

f

moduli can be calculated with G and B based on the isotropic mechanical property of BMGs. In addition,

oo

strong linear relationships between fracture tensile strength and Young’s moduli, hardness and Young’s moduli, yield shear stress and shear moduli were found in the previous works [34, 38, 39]. The EM dataset

pr

contained 278 unique BMGs, and their moduli were measured with resonant ultrasonic spectroscopy (RUC).

e-

The dataset shows all reported values of a property for each given alloy, indicating that there are multiple values for some properties of a given alloy. However, the difference between the multiple values is small.

Pr

To simplify the following ML process, the average of multiple values was used for such property of a particular BMG. For example, the values of shear modulus of Mg 65Cu25Tb10 were 19.4 GPa and 19.6 GPa reported by Wang et. al [34] and Chen et al. [39], respectively. Thus, the mean of 19.5 GPa was used in the

al

following analysis for the shear modulus of Mg65Cu25Tb10.

rn

2.2 Feature Candidates

Jo u

Candidate features (X) are the input for an ML model (f) to predict the desired properties (Y), i.e., Y = f(X). Thus, for a given target property Y, an adequate set of X had to be identified to ensure that a well-performing ML model was generated. As the physics underlying the glass transformation of an alloy are not well understood, as many potential descriptions as possible should be included to predict the GFA and Dmax. The feature candidates of an alloy were thus compiled with the basic elemental parameters, thermodynamic parameters, valence electron distribution, and atomic volume of the alloy elements. Table 1. The used 20 basic elemental properties and their values can be seen in Table S1.

Atomic Number (AN)

Density (D)

Mulliken Electronegativity (XM)

Atomic Weight (AW)

Melting Point (Tm)

Pauling Electronegativity (XP)

Group (Gp)

Boiling Point (Tb)

First Ionization Potential (I1)

Period (P)

Heat Capacity (Cp)

Second Ionization Potential (I2)

Covalent Radius (Rc)

Thermal Conductivity (K)

Electron Affinity (Eea)

Metallic Radius (Rm)

Heat of Fusion (Hf)

Work Function (W) 4

Valence Electrons (VEC)

Journal Lattice Volume Pre-proof (LP)

The elemental parameters were based on the elemental properties from experiments or density functional theory (DFT) simulations, and comprised, inter alia, atomic fundamental properties (e.g., atomic number, period, group in the periodic table), chemical properties (e.g., Pauling electronegativity, Mulliken electronegativity), and physical properties (density, melting temperature, specific heat capacity). The alloy feature candidates were calculated from the corresponding elemental properties (shown in Table 1) based on the linear mixture rule (x1) [40] and the reciprocal mixture rule (x2) [41], and their deviation (xD) [42] and discrepancy (xd) were also calculated [42], with Equations (1)–(4). ∑ (∑

f

(1)

(

pr

√∑

)

e-

(

⁄ ) ,

(2)

(3)

(4)

Pr

√∑

oo

)

where ai and xi are the atomic fraction and elemental properties of the i-th constituent, respectively.

al

Thermodynamic parameters reflect the driving force for glass transformation, and these parameters (which are defined in the equations below) include the mixing enthalpy (Hmix) based on Miedema’s

rn

empirical method [43] (Equation (5)), the normalized mixing entropy (Smix/R), based on fundamental

Jo u

thermodynamics [44] (Equation (6)), the normalized mismatch entropy (Smis/kB), estimated via a relationship given by Takeuchi et al. [12] (Equation (7)), as well as two parameters denoted PHSmis (Equation (8)) and PHSS (Equation (10)) proposed by Rao et al. [45], and a similar self-defined parameter PHSmix (Equation (9)): ∑ ⁄ ⁄



(5)

∑ (

(6) ) ⁄

(7) (8) ⁄

(9) ⁄

,

(10)

where ΔHij is the molar mixing enthalpy for binary liquid alloys [12], R is the gas constant, kB is the Boltzmann constant, Rmd is the discrepancy of Rm calculated with Equation (4). 5

Journal Pre-proof The valence electron distributions comprised the number (sVEC, pVEC, dVEC) and the fraction (fs, fp, fd) of electrons in the s, p, d valence orbitals of an element, based on the linear mixture rule [46], as given by Equation (11). The equation for the average atomic volume (Vmm and Vmc) was constructed by Wang et al. [47] to predict the bulk modulus, which may affect the GFA, and is given by Equation (12): (

(

)

(

)

(11)



)

(

(12)

)

where VEC1 is the average number of valence electrons, as calculated by Equation (1). In summary, 94 feature candidates, termed as zero-generation features, were generated for further study.

oo

f

2.3 Feature scaling As the range of magnitudes of the generated feature candidate alloys varied broadly, the use of Euclidian distances in computations such as decision trees, neural networks, and K-nearest neighbors (KNN), typical

pr

of some ML models, may not be appropriate. Therefore, the feature-scaling method was adopted to

e-

normalize the scale of features, thus serving as a preprocessing step prior to feature selection and model construction [48]. The min-max normalization (or min-max rescaling) method was applied to rescale

Pr

features in this work, and the feature candidate alloys were scaled to a predefined domain [a, b], according to Equation (13): )

)(

(13)

rn

al

(

where a = 0.2, b = 0.8, x is the original feature, x’ is the normalized feature, and min x and max x are the

Jo u

minimum and maximum value of the original feature. Tang et. al [49] found that the general domain [0, 1] might affect the performance of some ML algorithms, e.g. neural networks, and they recommended the use of smaller domains, such as [0.2, 0.8]. 2.4 Feature selection

Feature selection is a key process that has a critical effect on the performance of the ML model. It involves selecting a subset of relevant features for model construction, and enables some irrelevant features to be discarded with minimal loss of information. Thus, a three-step feature-selection method (TFS) was adopted to screen the normalized feature candidates in the GFA and Dmax dataset (as shown in Fig. 2). Initially, a feature that is highly correlated with other features can be considered to contain similar information. Thus, a correlation-based feature selection (CFS) method was used to remove these redundant features. The Pearson correlation coefficient (PCC) is a measurement of the linear correlation between two features X1 and X2, and is given by Equation (14): (

)

[(

[

])(

[

])]

,

(14) 6

where

and

Journal Pre-proof denote the standard deviation of features X1 and X2, and E is the expectation. PCC has a

value between -1 and 1, where -1 represents a wholly negative linear correlation, 1 represents a wholly positive linear correlation, 0 represents no linear correlation, and a value >0.75 or <-0.75 indicates a strong correlation. The highly cross-correlated features are thus revealed and able to be removed, and the remaining n features are denoted the first-generation features. The selection of the first-generation features is also based on domain knowledge. For example, metallic glass structures are generally built based on a hard spheres model, wherein the atoms are assumed to be densely and randomly packed. Thus, Rc1 is removed

rn

al

Pr

e-

pr

oo

f

when it is highly correlated with Rm1.

Fig. 2. The three-step feature selection method, involving the selection of first-generation features by correlation-based feature

Jo u

selection from the feature candidates, followed by the selection of second-generation features by comparison of the performance of the variance threshold and the ReliefF algorithm, and finally the determination of the final feature subset by comparing the results of sequential forward selection (SFS) and sequential backward selection (SBS).

After this, a widely used filter method known as variance threshold (VT) was adopted to further screen the first-generation features. The VT method assumes that a low-variance feature generally has very little predictive power, and thus m features whose variance (Var) is below a certain threshold (as determined by Equation (15) below) are removed: ( )

[(

[ ]) ] .

(15)

The ReliefF feature selection algorithm [50] searches for k nearest neighbor samples from the same class H(x), and for k nearest neighbors from each different class M(x). During this process, the weights of features are adjusted by comparing in-class distances and inter-class distances after N iterations. This procedure is given by 7

(∑

Journal ∑ | ( | ( )|Pre-proof



( ) ( )

(

( ))

( )

)

)|

( )

,

(16)

where P(x) is the probability of x, and range(x) is the difference between the maximum value and minimum value of each feature, which is 0.6 for the normalized features in this study. Next, a subset containing (n-m) features is selected using of the ReliefF algorithm to accurately compare the performance of these features, and the results are denoted the second-generation features. Finally, wrapper methods were used to determine the final feature subsets. Wrapper methods are based on greedy search algorithms, including sequential forward selection (SFS) and sequential backward selection (SBS). These evaluate possible feature subsets and select the subset that produces the best performance for a specific ML algorithm (which was an RF in this work). Thus, for a given dimensional

f

feature set X*, SFS starts from the null set, and sequentially adds the best feature x+ that maximizes the

oo

prediction accuracy (J) when combined with the already selected feature subset Xk, until the accuracy is satisfied [51]. This is expressed by )

pr

(

.

(17)

e-

In contrast, SBS starts from the full set X* and sequentially removes the worst feature x- that least reduces

2.5 Machine learning models

)

.

(18)

al

(

Pr

the prediction accuracy from Xk, until the accuracy is satisfied [51], as given by

rn

Around 20 ML algorithms available in the WEKA library, including linear ML algorithms (linear regression, logistic regression, elastic net, ridge regression), nonlinear ML algorithms (naive bayes, bayes net, decision

Jo u

tree, k-nearest neighbors, locally weighted learning, support vector machines, gaussian regression, neural network), ensemble ML algorithms (random forest, bagging, stacking), were used evaluated in terms of second-generation features via the cross-validation test [52]. The random forest (RF) algorithm superperformed over the other ML algorithms and thus was used for further in the study. Since mathematical expressions for regression problems were not available in the RF model, symbolic regression based on genetic programming was used to distil mathematical formulas from the desired properties and chosen features [53]. The initial expressions were generated by randomly combining mathematical operators, functions, constants, and non-normalized chosen features. Next, new formulas were continuously iterated by genetic programming and finally evolve to the optimal formula. The performance of the classification models was measured by their classification accuracy, which describes the proportion of the samples that were correctly classified. The performance of the regression models was measured by the correlation coefficient r 𝑟

∑𝑛 (𝑦̂𝑖 𝑦̅)

√ ∑𝑛

(𝑦

𝑦̅)

8

Journal Pre-proof

(19) and the root mean square error (RMSE) √∑𝑛

𝑀

𝑛

(𝑦̂

𝑦)

(20) where 𝑦̂𝑖 is the prediction and 𝑦̅ is the average of 𝑦 , and the r value lies between 0 and 1, with 1 indicating a perfect fit. 2.6 Cross-validation test The k-fold (with k = 100) cross-validation (CV) test was conducted here to evaluate the performance of

oo

f

different ML models. The initial dataset was randomly and equally split into k sub-datasets, a training set with (k-1) sub-datasets, and a testing set with the remaining sub-datasets. The ML model was built on the

pr

training set, and the performance was measured with the testing set. This process was repeated k times, and the cross-validation accuracy was obtained by the average value of the k results; this could be regarded as an

3

Pr

e-

estimation of the testing accuracy.

Results and discussion

al

Here we describe the building of a classification model to determine the GFA of alloys, a regression model based on alloy composition, another regression model based on CTT to predict Dmax, and two regression

rn

models to predict the EM of an alloy. In addition, mathematical expressions were developed by symbolic

3.1 GFA model

Jo u

regression and the linear least-squares method for regression problems.

The Pearson correlation coefficient (PCC) between two features X1 and X2 were calculated, where Xi (i = 1, 2, 3, … 94) denotes the 94 zero-generation features, and the target variable of the GFA data was not included. Based on the PCC values between -0.75 and 0.75, 35 first-generation features were selected from the 94 zero-generation features and thus from the GFA dataset. Fig. 3c shows that the CV classification accuracy of the ML model built on the CFS subset (CFS-GFA) can reach 89.52%, which is greater than that of the ML models constructed on the feature candidates and other subsets. However, the dimensionality of feature space must be further reduced to achieve a balance between performance and complexity, with a tolerance of 2% for CV classification accuracy. Thus, as shown in Fig. 3a and 3b, 14 features with a variance greater than 0.01 were selected by VT, and another subset containing 14 features was given by use of the ReliefF algorithm. The VT subset leads to a cross-validation classification accuracy (88.89%), more or less the same as that of 88.21% based on the ReliefF subset. Feature selection was further conducted with the two subsets by using the SBS and SFS methods. The 9

Journal Pre-proof results show that the SBS and SFS give the same subset. This feature selection yields the VT-SS-6 with the six features of VEC1, sVEC, Hfd, Tb2, Gp1, and Wd from the VT subset and the ReF-SS-6 with the six features of KD, Kd, VECd, Wd, Smix/R, and Tbd from the ReliefF subset, clearly indicating that VT-SS-6 and ReF-SS-6 have completely different features. Fig. 3c shows that the ML model with the six VT-SS-6 features (VTS6-GFA) had a CV classification accuracy of 88.13% and the ML model with the ReF-SS-6 features had a CV classification accuracy of 87.15%. The difference between the two CV classification accuracies is about 1%, although the two six-features subsets are completely different. This result might indicate that there are some correlations between the two subsets. Nevertheless, to simplify the following ML analysis, we took the VT-SS-6 subset as the final feature subset. A good ML model should be able to provide accurate prediction with as less as possible number of

f

features. The CFS-GFA model had the best classification accuracy of 89.52% with 36 features. The VTS6-

oo

GFA model uses only six features and has a classification accuracy of 88.13%, which indicates that the six-

Jo u

rn

al

Pr

e-

pr

features play the major role in the classification, although the classification accuracy is reduced by 1.39%.

Fig.3. (a) The variance and (b) the weight of the first-generation features. (c) The k-fold CV classification accuracy of a random

10

Journal Pre-proof

e-

pr

oo

f

forest model of various feature subsets.

Pr

Fig.4. The performance of the VTS6-GFA model shown as (a) the confusion matrix; (b) the receiver operating characteristic curve for distinguishing the CRA (red curve), RMG (green curve) and BMG (blue curve) alloys; the GFA distributions with (c) Hfd-Wd pair, (d) VEC1-sVEC pair and (e) Gp1-Tb2 pair, where the incorrect classifications of the VTS6-GFA model are marked with white

rn

al

crosses.

Fig. 4a shows the confusion matrix of the VTS6-GFA model, which has an accuracy of 88.13%. In

Jo u

this context, precision is the ratio of relevant samples to retrieved samples. For example, 1,151 alloys were classified as BMG alloys by the ML model, and 1,073 were labelled as BMGs in our dataset, meaning that the precision of BMG classification was 93.2%. In addition, recall or true positive rate is the fraction of retrieved relevant samples among the total relevant samples: for instance, 3,463 out of 3,708 RMG-labeled alloys were classified as RMG alloys by the ML model, and thus the recall of RMG was 93.4%. Thus, if one wants to use this ML model to distinguish GFAs (BMG or RMG) from non-GFAs (CRA), the precision of GFA predictions is 92.5%, and the recall can reach as high as 96.4%. Another widely used measurement, the receiver operating characteristic (ROC) curve, is shown in Fig. 4b. The ROC curve is created by plotting the true positive rate against the false positive rate. The false positive rate is the probability of falsely rejecting the null hypothesis. For example, 70 of 1,552 RMGlabeled and 8 of 3,708 CRA-labeled alloys were categorized as BMG alloys by our VTS6-GFA model, and thus the false positive rate of predicting BMG using this model was 1.5%. The area under the ROC curve (AUC) can be applied to describe the performance of a classification model. AUC usually varies between 0.5 and 1, where 0.5 represents the uninformative classifier or so-called “random guess”, a value > 0.95 11

Journal Pre-proof represents an excellent classifier and a value = 1 is the perfect classifier. The AUC of classifying crystalline alloys and amorphous alloys classifications is 0.95, and the AUC of distinguishing BMG-labeled alloys from other alloys can reach 0.98. This excellent prediction accuracy indicates that our model can be used to discover new GFAs. The GFA distributions of 6,471 alloys with six SBS-chosen features are shown in Fig. 4c–4e, and provide some information on the key features required for the formation of BMGs. Alloys should be composed of elements with significant differences in work function and heat of fusion; valence electrons might weaken the glass-forming ability; the boiling temperature of alloys cannot be high; and the presence of subfamily elements, such as La and Zr, might enhance the glass forming ability. The 764 incorrect predictions of our ML model are marked with white crosses, and it was possible that the ML model

oo

f

incorrectly classified an alloy whose neighbors in the feature space had a different GFA. 3.2 The Dmax model based on compositions

pr

The Dmax dataset was constructed from the GFA dataset by removing these BMGs which did not have the value of measured Dmax. In this way, the Dmax dataset has the same 35 first-generation features as

e-

those in the GFA dataset. Subsequently, 13 high-influence features were selected by VT and ReliefF to

reduction of the correlation coefficient (r).

Pr

compare their performance. Finally, two six-feature subsets were selected by SBF and SFS with a 0.01

Fig. 5a shows that the ML model (CFS-Dmax) on the CFS subset outperforms other feature subsets in

al

the k-fold CV test. The ML model on the ReliefF subset (r = 0.8542) performs much better than the VT

rn

subset (r = 0.8339) in the k-fold CV test, i.e., features in the ReliefF subset are the second-generation features. It can be seen that the ML model based on the SBS-6 subset (r = 0.8503) has a similar r value to

Jo u

that of the SFS-6 subset (r = 0.8511). Fig. 5b indicates that the ML model applied to the SFS-6 subset (Smix/R, Hfd, Hmix, Wd, Tb2, K1) had a smaller RMSE than does SBS-6 (AN1, KD, Tmd, Tb2, ANd, Tbd) in the kfold CV test. The ML model applied to the SFS-5 subset (Smix/R, Hfd, Hmix, Wd, Tb2) had an r value of 0.8419 and an RMSE of 1.2389 mm, and the performance increased significantly (r = 0.8511, RMSE = 1.2063 mm) when adding feature K1. Compared with the best-performing CFS-Dmax model, 28 features were removed with only a 0.0089 reduction of r and 0.0272 mm increase of RMSE in the SFS-6 subset.

12

Journal Pre-proof

Fig. 5. The (a) r and (b) RMSE of the random forest model applied to various feature subsets in k-fold CV. (c) ML-predicted

oo

f

values versus the measured values, with a 10 mm predicting error.

Alloy

Zr41.2Ti13.8Cu12.5Ni10Be22.5

Predicted Dmax

5 mm

21.8 mm

50 mm

5.3 mm

25 mm

7.4 mm

25 mm

5.1 mm

27 mm

6.2 mm

25 mm

0.1 mm

Pr

Y36Sc20Al24Co20 Y36Sc20Al24Co10Ni10

rn

al

Mg59.5Cu22.9Ag6.6Gd11 Pd40Ni40P20

Measured Dmax

e-

Zr41Ti14Cu12.5Ni8C2Be22.5

pr

Table 2. Measured and predicted Dmax of six outliers whose predicting error is bigger than 15 mm

Jo u

Fig. 5c shows a plot of the ML-predicted values against the measured values for Dmax with the SFS-6 subset. Setting a 10 mm predicting error yields nine outliers, six of them shown in Table 2 have an error bigger than 15 mm. The ML model applied to the SFS-6 subset (SFS6-Dmax) significantly overestimated the Dmax of Zr41Ti14Cu12.5Ni8C2Be22.5, which can be explained by the fact that it has a much lower Dmax than a similar composition of Zr41.2Ti13.8Cu12.5Ni10Be22.5. This also might be the reason for the underestimated Dmax of Zr41.2Ti13.8Cu12.5Ni10Be22.5. There were four Al-Co-Sc-Y alloys and three X36Y20Al24Co20 (X = Sc, Gd, Y) alloys in our dataset, and most of these had measured Dmax values < 5mm, except Y36Sc20Al24Co20. Thus, the Dmax of Y36Sc20Al24Co20 and Y36Sc20Al24Co10Ni10 were underestimated. The alloys Mg65Cu15Ag10Gd10 (measured Dmax = 7.5 mm) and Ni-P-Pd RMG (measured Dmax = 0.1 mm) perturbed the prediction of Mg59.5Cu22.9Ag6.6Gd11 and Pd40Ni40 P20, respectively. In summary, the discrepancies between predictions and observations might be caused by two reasons. The first is that the ML model is not able to predict the property that changes abruptly from its neighbors in the feature space. In this case, a more rigorous ML model should be developed to enhance the model capacity. The other reason is that some observations may not be reliable. More experiments should be conducted to ensure the reported experimental data. Reliable data are essential to ML, especially when the dataset is small. Overall, the SFS613

Journal Pre-proof Dmax model can predict the Dmax of alloys well. Symbolic regression (SR) was used to search for a parameter that can be generated using nonnormalized features in the SFS-6 subset without the six outliers mentioned above, and with the integer constant, and the operators {

}. A series of parameters of Dmax were generated with 1, 2,



and 3 features, respectively. Next, three linear least-square (LLS) models were built with these parameters, revealing that the LLS model using γ3 had the highest r value of 0.7125. Its performance in the k-fold CV test is shown in Fig. 6a. Equations (21) and (22) indicate that a high Smix (mixing entropy,

) can promote

GFA, which meets the confusion principle [55]. Equations (22) and (23) show that K1 (the thermal conductivity,

) of a solid alloy will affect its GFA, and that high thermal conductivity might

oo

f

lead to a slow critical cooling rate, i.e. big casting diameter [56]. The influence of Hmix (mixing enthalpy, ) can be seen in Equation (23), where a weak mixing enthalpy fosters the formation of a solid-

pr

solution phase, and an ultra-negative mixing enthalpy leads to the formation of a intermetallic phase [56]; both of these hinder glass formation. Thus, three prerequisites for forming BMGs are (1) a high mixing

((

⁄ )

)

)

(21)

(

(22)

)

(23)

Jo u

(

)

al



rn

(

).

Pr

approximately equal to -28 kJ/mol (

e-

entropy, (2) a high average thermal conductivity, and (3) an appropriate negative mixing enthalpy,

3.3 Dmax model based on CTTs

One RF model, one LLS model based on symbolic regression, and 20 LLS models based on previously proposed GFA criteria were built on the CTT dataset without six outliers described in Section 3.2. The performance of these models was compared by a k-fold CV text. It can be seen in Table 3 that the ML model (Fig. 6b) outperformed other formulas in the k-fold CV test, as it had a much higher r (0.7723) and smaller RMSE (2.8915 mm). The next best was the LLS model based on SR (r = 0.6721), and then the χ-criteria. The PCC values of 94 feature candidates and three CTTs were calculated. The CTTs and six features, namely Tm1, Hf1, Gpd, Gp1, XP1, and XPd, were highly correlated with each other. Thus, the symbolic regression was conducted using these six non-normalized features, the integer constant, and the operators {



}. SR parameters were generated for Tg, Tx, and Tl, respectively, as given by Equations

(24), (25), and (26): (

)

(24) 14

Journal Pre-proof

(25)

.

(26)

The SR results showed that Tg, Tx, and Tl (critical transition temperature, K) are largely and directly influenced by Tm1 (average melting temperature, K), which is shown by the fact that r > 0.93 in Fig. 7a. Equation (24) indicates that the glass transition temperature will increase then decrease as Hf1 (average heat of fusion,

) increases. The liquidus temperature will increase as Rm1 (average atomic radius, nm)

decreases, and the small atomic radius might enhance the resistance of the alloy to phase transition. Table 3. r values and RMSE values of the RF model and LLS model applied to the symbolic regression parameter and 20 GFA criteria. Formula

ref

R

[36]

0.4261

4.1121

[37]

0.2047

4.4493

[23]

0.4713

4.0089

[57]

0.5054

3.9221

[58]

0.4088

4.1482

[58]

0.4847

3.9757

[59]

0.2951

4.3430

[60]

0.4934

3.9535

[61]

0.4685

4.0157

[62]

0.4649

4.0243

[63]

0.3957

4.1745

[64]

0.5362

3.8369

[65]

0.4307

4.1024

[66]

0.4438

4.0731

[67]

0.4972

3.9438

[68]

0.5136

3.9001

)

[69]

0.4890

3.9649

)

[70]

0.5036

3.9270

[21]

0.5520

3.7901

[71]

0.5527

3.7883

This work

0.7723

2.8915

This work

0.6721

3.3655

oo

f

GFA criteria

⁄ ⁄( (

) )⁄(

)



α ( ⁄ )

β

( ⁄ )

⁄(

)

e-

δ

)⁄

( φ

⁄ )

[ (

θ

[(

[

⁄(

rn

[ ⁄(

𝐺 χ RF SR

( ⁄ )

Jo u

𝜔

(

[ (

[(

6 6

)]

)⁄ ] [(

)]

)⁄ (

00

( ⁄ )

( ⁄ )⁄ ( )]⁄[(

) ]

)] [ ⁄( (

6 (

)⁄ ]

)⁄

(

β

)]

)]⁄[ (

al

𝜔𝐽

⁄ ) )

( ⁄ )

𝜔𝐿

(

)⁄(

(

β𝑌 ⁄ 𝜔𝐿

0

Pr

( ( ⁄ )

ξ

𝜔𝐴

pr

γ

)] )

)

(

(

) )

RMSE

15

Fig.6. (a) The performance of LLSs based on the symbolic regression parameter γ3 in the k-fold CV test. (b) The performance of Journal Pre-proof the RF model on transition temperatures in the k-fold CV test.

Fig.7. (a) The linear relationship between the critical transformation temperature and average melting temperatures. The

f

performances of linear least square models based on symbolic regression parameter (b) φg and (c) φl in a k-fold cross-validation

Jo u

rn

al

Pr

e-

pr

oo

test

Fig. 8. The performance of RF models on all possible subsets of four influential features in predicting (a) shear moduli and (b) bulk moduli. The performance of the ML model on the best subset of three features in predicting (c) shear moduli with a 15 GPa error and (d) bulk moduli with a 30 GPa error.

16

Journal Pre-proof 3.4 EM model In our previous work on the prediction of the mechanical properties of metal-metal amorphous alloys [72], four features (Xp1, Vmm, Rmd, and Smix/R) were found to be crucial. Extending the results to the dataset built in this work, ML models have been constructed to predict the EM of metallic-metallic and metalloidmetallic bulk metallic glasses. The global optimization method and the exhaustive feature selection method was utilized in this low-dimensional feature space. Fifteen ML models were built on all possible subsets of four influential features, and the performance of these ML models was compared (as shown in Fig. 8a and 8b) to find the best predictive model for the shear (G) and bulk (B) moduli, respectively. The r value of the ML model for predicting shear moduli reached a maximum of 0.9836 when the subset contained Rmd and

f

Xp1, and the RMSE reached a minimum of 3.609 GPa. The r value for bulk moduli reached a maximum of

oo

0.9843 when the subset contained all features, and this decreased slightly to 0.9840 when Rmd was removed. Thus, the subset containing Xp1, Vmm, and Smix/R, which had an RMSE of 9.531 GPa, was considered as the

pr

best subset for bulk moduli.

e-

Fig. 8c and 8d show the excellent predictive power of this subset, in the plot of the ML-predicted values against the measured values for G and B, respectively, with the best selected subset having three

Pr

features. There are four exceptions, shown as Table 4 and dots outside the blue region. The high measured shear modulus (>75 GPa) of Fe49Cr15Mo14C15B6Er1 and Fe50Mn10Mo14Cr4C16B6, whose Xp1 are close to

al

Ni80P20, led to the overestimation of Ni80P20. The Vmm value of Fe71Nb6B23 (measured B = 182.6 GPa) is similar to that of Fe80P11C9; therefore, the ML model overestimated the modulus of Fe80P11C9. The similarity

rn

of Vmm and Xp1 values between Pt 74.7Cu1.5Ag0.3P18.0B4.0Si1.5 and Au49.0Ag5.5Pd2.3Cu26.9Si16.3 led to the

Jo u

incorrect ML model-based prediction of their bulk moduli. Table 4. Measured and predicted elastic modulus of four exceptions Alloy

Measured EM

Predicted EM

Ni80P20

G = 36.7 GPa

G = 56.1 GPa

Fe80P11C9

B = 148 GPa

B = 187 GPa

Pt74.7Cu1.5Ag0.3P18.0B4.0Si1.5

B = 216.7 GPa

B = 161.9 GPa

Au49.0Ag5.5Pd2.3Cu26.9Si16.3

B = 132.3 GPa

B = 203.4 GPa

The SR parameters α and β were generated using four non-normalized features, the integer constant, and the operators {



}. The performances of LLS models built on two-features parameters in

the k-fold CV test are shown in Fig. 9. The LLS models based on these SR parameters all performed well, with r > 0.93. The SR results are given by Equations (27)–(29), and show that a smaller Vmm (average atomic volume, Å3) results in a higher G (shear modulus, GPa) and B (bulk modulus, GPa). Equation (28) supports 17

Pre-proof Park’s work, as it shows that high mixingJournal entropy coupled with disordered amorphous structure manifests small shear transformation zone sizes, which could make the shear bands more difficult to form and propagate, leading to higher shear moduli [73]. When considering Pauling electronegativity, higher electronegativity corresponds to stronger atomic bonding, and therefore higher bulk moduli. 𝐺

(27)

𝐺

(28)



(29)

Jo u

rn

al

Pr

e-

pr

oo

f

(30)

Fig. 9. The performance of the LLS based on SR parameters in predicting (a) shear moduli and (b) bulk moduli.

3.5 Experimental validation and iterations We address a design problem as finding new kinds of BMG alloys. In this section, we will show a ML solution for this issue. The ability to form a bulk metallic glass was not measured during melt spinning, i.e., the bulkforming ability of an RMG alloy is unknown. This limits the ability of the ML model to classify RMG alloys, especially distinguish RMG from BMG alloys. Thus, we can find new BMG alloys from RMG alloys (shown as Table S2) which are categorized as BMG alloys by the VTS6-GFA model. Thirteen RMGs with a high probability (>60%) of belonging to BMG were chosen and their glass-forming ability were rechecked, five of them, La55Al20Cu25 [74], Zr60Al15Cu25 [75], Pd56Ni24P20 [76], Mg65Cu25Ce10 [77], and La50Al25Cu25 [78] are found to form bulk samples. The classification performance of the VTS6-GFA model improved as shown in Fig. 10. Those validations and iterations confirm the correctness, generalization and predictive 18

Journal Pre-proof

ability of our ML model.

Fig.10. (a) The performance of the updated VTS6-GFA model shown as the confusion matrix; (b) comparison of the classification

oo

f

accuracy and precision of BMG on initial model and updated model with feedback.

pr

4. Conclusion

e-

In this study, we have presented a general ML framework for the prediction, design, and understanding of metallic glasses. These models were constructed based on datasets built from metallic glass experiments

Pr

comprising over 6,000 samples. We used these datasets to train and cross-validate ML models with an RF algorithm to predict the GFA, Dmax, shear moduli, and bulk moduli. A three-step feature selection method

al

was proposed for ML, and it performed well with the GFA and Dmax models.

rn

Based on the ML-based GFA model, BMGs should be composed of elements with significant differences in work function and heat of fusion; valence electrons might weaken the glass-forming ability;

glass forming ability.

Jo u

the boiling temperature of alloys cannot be high; and the presence of subfamily elements might enhance the

In addition, mathematical expressions were generated from SR and LLSs for regression problems. Three properties that favor the formation of BMGs with large casting diameters were proposed, as follows: (1) high mixing entropy, (2) high average thermal conductivity, and (3) appropriate negative mixing enthalpy, of approximately -28 kJ/mol. The CTTs of metallic glasses were thus determined to be largely and directly influenced by the average melting temperatures of the alloys of which they are composed. The shear and bulk moduli of BMGs are negatively correlated with the average atomic volume of the constituent alloys, the mixing entropy enhance the shear moduli and the average Pauling electronegativity influences the bulk moduli of BMGs. This ML framework can guide the discovery and understanding of new BMGs with desired properties. Acknowledgments 19

Journal I would like to thank Doctor Ouyang Runhai for Pre-proof his expert advices. This work was supported by the National Key Research and Development Program of China (No. 2018YFB0704400), the Hong Kong Polytechnic University (internal grant Nos. 1-ZE8R and G-YBDH), and the 111 Project (grant No. D16002) from the State Administration of Foreign Experts Affairs, PRC.

References [1]

L. Zhang, H. Huang, Micro machining of bulk metallic glasses: a review, Int. J. Adv. Manuf. Technol. (2019). doi:10.1007/s00170-018-2726-y.

[2]

M. Jafary-Zadeh, G.P. Kumar, P.S. Branicio, M. Seifi, J.J. Lewandowski, F. Cui, A critical review on metallic glasses as

M.M. Khan, A. Nemati, Z.U. Rahman, U.H. Shah, H. Asgar, W. Haider, Recent advancements in bulk metallic glasses

oo

[3]

f

structural materials for cardiovascular stent applications, J. Funct. Biomater. (2018). doi:10.3390/jfb9010019.

and their applications: a review, Crit. Rev. Solid State Mater. Sci. (2018). doi:10.1080/10408436.2017.1358149. B. Nair, B.G. Priyadarshini, Process, structure, property and applications of metallic glasses, AIMS Mater. Sci. (2016).

pr

[4]

doi:10.3934/matersci.2016.3.1022.

J.D. Schuler, T.J. Rupert, Materials selection rules for amorphous complexion formation in binary metallic alloys, Acta

e-

[5]

Pr

Mater. 140 (2017) 196–205. doi:10.1016/j.actamat.2017.08.042. [6]

M.W. Chen, A brief overview of bulk metallic glasses, NPG Asia Mater. 3 (2011) 82–90. doi:10.1038/asiamat.2011.30.

[7]

Y. Li, Y. Liu, Z. Liu, S. Sohn, F.J. Walker, J. Schroers, S. Ding, Combinatorial development of bulk metallic glasses, Nat.

D. Kim, B.J. Lee, N.J. Kim, Prediction of composition dependency of glass forming ability of Mg-Cu-Y alloys by

rn

[8]

al

Mater. 13 (2014) 494–500. doi:10.1038/nmat3939.

thermodynamic approach, Scr. Mater. 52 (2005) 969–972. doi:10.1016/j.scriptamat.2005.01.038. S.W. Kao, C.C. Hwang, T.S. Chin, Simulation of reduced glass transition temperature of Cu-Zr alloys by molecular

Jo u

[9]

dynamics, J. Appl. Phys. 105 (2009). doi:10.1063/1.3086623. [10]

D. V Louzguine-Luzgin, R. Belosludov, M. Saito, Y. Kawazoe, A. Inoue, Glass-transition behavior of Ni: calculation, prediction, and experiment, J. Appl. Phys. 104 (2008). doi:10.1063/1.3042240.

[11]

J.J. Pang, M.J. Tan, K.M. Liew, On valence electron density, energy dissipation and plasticity of bulk metallic glasses, J. Alloys Compd. 577 (2013) S56–S65. doi:10.1016/j.jallcom.2012.03.036.

[12]

A. Takeuchi, A. Inoue, Classification of bulk metallic glasses by atomic size difference, heat of mixing and period of constituent elements and its application to characterization of the main alloying element, Mater. Trans. 46 (2005) 2817– 2829. doi:10.2320/matertrans.46.2817.

[13]

Y. Sato, C. Nakai, M. Wakeda, S. Ogata, Predictive modeling of time-temperature-transformation diagram of metallic glasses based on atomistically-informed classical nucleation theory, Sci. Rep. 7 (2017). doi:10.1038/s41598-017-06482-8.

[14]

D.Z. Xue, P. V Balachandran, J. Hogden, J. Theiler, D.Q. Xue, T. Lookman, Accelerated search for materials with targeted properties by adaptive design, Nat. Commun. 7 (2016). doi:10.1038/ncomms11241.

[15]

D.Z. Xue, D.Q. Xue, R.H. Yuan, Y.M. Zhou, P. V Balachandran, X.D. Ding, J. Sun, T. Lookman, An informatics

20

approach to transformation temperaturesJournal of NiTi-based shape memory alloys, Acta Mater. 125 (2017) 532–541. Pre-proof doi:10.1016/j.actamat.2016.12.009. [16]

C. Wen, Y. Zhang, C.X. Wang, D.Z. Xue, Y. Bai, S. Antonov, L.H. Dai, T. Lookman, Y.J. Su, Machine learning assisted design of high entropy alloys with desired property, Acta Mater. 170 (2019) 109–117. doi:10.1016/j.actamat.2019.03.010.

[17]

S. Feng, H.Y. Zhou, H.B. Dong, Using deep neural network with small dataset to predict material defects, Mater. Des. 162 (2019) 300–310. doi:10.1016/j.matdes.2018.11.060.

[18]

Y.T. Sun, H.Y. Bai, M.Z. Li, W.H. Wang, Machine learning approach for prediction and understanding of glass-forming ability, J. Phys. Chem. Lett. 8 (2017) 3434–3439. doi:10.1021/acs.jpclett.7b01046.

[19]

C. Wolverton, G.R. Jelbert, S.C. O’Keeffe, J. Stevick, M. Aykol, L. Ward, A machine learning approach for engineering bulk metallic glass alloys, Acta Mater. 159 (2018) 102–111. doi:10.1016/j.actamat.2018.08.002. F. Ren, L. Ward, T. Williams, K.J. Laws, C. Wolverton, J. Hattrick-Simpers, A. Mehta, Accelerated discovery of metallic through

iteration

of

machine learning

and high-throughput

doi:10.1126/sciadv.aaq1566.

Sci.

Adv.

4

(2018).

M.K. Tripathi, S. Ganguly, P. Dey, P.P. Chattopadhyay, Evolution of glass forming ability indicator by genetic

pr

[21]

experiments,

f

glasses

oo

[20]

programming, Comput. Mater. Sci. 118 (2016) 56–65. doi:10.1016/j.commatsci.2016.02.037. M.K. Tripathi, P.P. Chattopadhyay, S. Ganguly, Multivariate analysis and classification of bulk metallic glasses using

e-

[22]

[23]

Pr

principal component analysis, Comput. Mater. Sci. 107 (2015) 79–87. doi:10.1016/j.commatsci.2015.05.010. Z.P. Lu, C.T. Liu, A new glass-forming ability criterion for bulk metallic glasses, Acta Mater. 50 (2002) 3501–3512. doi:10.1016/S1359-6454(02)00166-0.

Z.L. Long, H.Q. Wei, Y.H. Ding, P. Zhang, G.Q. Xie, A. Inoue, A new criterion for predicting the glass-forming ability of

al

[24]

[25]

rn

bulk metallic glasses, J. Alloys Compd. 475 (2009) 207–219. doi:10.1016/j.jallcom.2008.07.087. J.W. Li, A.N. He, B.L. Shen, Effect of Tb addition on the thermal stability, glass-forming ability and magnetic properties

[26]

Jo u

of Fe-B-Si-Nb bulk metallic glass, J. Alloys Compd. 586 (2014) S46–S49. doi:10.1016/j.jallcom.2012.09.087. S. Tao, T.Y. Ma, H. Jian, Z. Ahmad, H. Tong, M. Yan, Glass forming ability, magnetic and mechanical properties of (Fe72Mo4B24)(100-x)Dyx

(x=4-7)

bulk

metallic

glasses,

Mater.

Sci.

Eng.

A.

528

(2010)

161–164.

doi:10.1016/j.msea.2010.08.092. [27]

K.F. Xie, K.F. Yao, T.Y. Huang, Preparation of (Ti 0.45Cu0.378Zr0.10Ni0.072)(100-x)Snx bulk metallic glasses, J. Alloys Compd. 504 (2010) S22–S26. doi:10.1016/j.jallcom.2010.02.199.

[28]

N.B. Hua, R. Li, H. Wang, J.F. Wang, Y. Li, T. Zhang, Formation and mechanical properties of Ni-free Zr-based bulk metallic glasses, J. Alloys Compd. 509 (2011) S175–S178. doi:10.1016/j.jallcom.2011.01.078.

[29]

C.

Suryanarayana,

A. Inoue,

Iron-based bulk metallic glasses,

Int. Mater. Rev.

58 (2012) 131–166.

doi:10.1179/1743280412y.0000000007. [30]

Q.S. Zhang, W. Zhang, D. V. Louzguine-Luzgin, A. Inoue, Effect of substituting elements on glass-forming ability of the new

Zr48Cu36Al8Ag8

bulk

metallic

glass-forming

alloy,

J.

Alloys

Compd.

504

(2010)

S18–S21.

doi:10.1016/j.jallcom.2010.02.052. [31]

W. Jiao, D.Q. Zhao, D.W. Ding, H. Bai, W.H. Wang, Effect of free electron concentration on glass-forming ability of CaMg-Cu system, J. Non-Cryst. Solids. 358 (2012) 711–714. doi:10.1016/j.jnoncrysol.2011.10.033.

21

[32]

F. Yuan, J. Du, B.L. Shen, Controllable spin-glass behavior and large magnetocaloric effect in Gd-Ni-Al bulk metallic Journal Pre-proof glasses, Appl. Phys. Lett. 101 (2012). doi: 10.1063/1.4738778.

[33]

M.X. Pan, A.L. Greer, W.H. Wang, B. Zhang, D.Q. Zhao, Amorphous metallic plastic, Phys. Rev. Lett. 94 (2005) 1–4. doi:10.1103/physrevlett.94.205502.

[34]

W.H. Wang, The elastic properties, elastic models and elastic perspectives of metallic glasses, Prog. Mater. Sci. 57 (2012) 487–656. doi:10.1016/j.pmatsci.2011.07.001.

[35]

Y. Kawazoe, T. Masumoto, K. Suzuki, A. Inoue, A.-P. Tsai, J.-Z. Yu, T. Aihara Jr., T. Nakanomyo, Nonequilibrium phase diagrams of ternary amorphous alloys, Springer-Verlag, Berlin/Heidelberg, 1997. doi:10.1007/b58222.

[36]

A. Inoue, Stabilization of metallic supercooled liquid, Acta Mater. 48 (2000) 279-306. https://doi:10.1016/S13596454(99)00300-6..

[37]

Z.P. Lu, Y. Li, S.C. Ng, Reduced glass transition temperature and glass forming ability of bulk glass forming alloys, J.

oo

[38]

W.H. Wang, Correlations between elastic moduli and properties in bulk metallic glasses, J. Appl. Phys. 99 (2006). doi:

X.Q. Chen, H. Niu, D. Li, Y. Li, Modeling hardness of polycrystalline materials and bulk metallic glasses, Intermetallics.

e-

19 (2011) 1275–1281. doi:10.1016/j.intermet.2011.03.026.

C. Carruthers, H. Teitelbaum, The linear mixture rule in chemical-kinetics. Ⅱ. thermal-dissociation of diatomic-molecules,

Pr

[40]

pr

10.1063/1.2193060. [39]

f

Non-Cryst. Solids. (2000). doi:10.1016/S0022-3093(00)00064-8.

Chem. Phys. 127 (1988) 351–362. doi:10.1016/0301-0104(88)87133-7. [41]

R.H. McKee, A.M. Medeiros, W.C. Daughtrey, A proposed methodology for setting occupational exposure limits for

S.S. Fang, X. Xiao, X. Lei, W.H. Li, Y.D. Dong, Relationship between the widths of supercooled liquid regions and bond

rn

[42]

al

hydrocarbon solvents, J. Occup. Environ. Hyg. 2 (2005) 524–542. doi:10.1080/15459620500299754.

parameters of Mg-based bulk metallic glasses, J. Non-Cryst. Solids. 321 (2003) 120–125. doi:10.1016/S0022-

[43]

Jo u

3093(03)00155-8.

P.K. Ray, M. Akinc, M.J. Kramer, Applications of an extended Miedema’s model for ternary alloys, J. Alloys Compd. 489 (2010) 357–361. doi:10.1016/j.jallcom.2009.07.062.

[44]

W.M. White, Energy, entropy and fundamental thermodynamic concepts, Geochemistry, 2011.

[45]

B.R. Rao, M. Srinivas, A.K. Shah, A.S. Gandhi, B.S. Murty, A new thermodynamic parameter to predict glass forming ability in

iron

based multi-component

systems

containing zirconium,

Intermetallics.

35 (2013) 73–81.

doi:10.1016/j.intermet.2012.11.020. [46]

L. Ward, A. Agrawal, A. Choudhary, C. Wolverton, A general-purpose machine learning framework for predicting properties of inorganic materials, NPJ Comput. Mater. 2 (2016) 1–7. doi:10.1038/npjcompumats.2016.28.

[47]

J.Q. Wang, W.H. Wang, H.B. Yu, H.Y. Bai, Correlations between elastic moduli and molar volume in metallic glasses, Appl. Phys. Lett. 94 (2009). doi:10.1063/1.3106110.

[48]

L. Friedman, O. V Komogortsev, Assessment of the effectiveness of seven biometric feature normalization techniques, IEEE T. Inf. Foren. Sec. 14 (2019) 2528–2536. doi:10.1109/Tifs.2019.2904844.

[49]

Z. Tang, P.A. Fishwick, Feedforward neural nets as models for time series forecasting, ORSA J. Comput. (1993).

22

Journal Pre-proof

doi:10.1287/ijoc.5.4.374. [50]

R.J. Palma-Mendoza, D. Rodriguez, L. de-Marcos, Distributed ReliefF-based feature selection in Spark, Knowl. Inf. Syst. 57 (2018) 1–20. doi:10.1007/s10115-017-1145-y.

[51]

J.D. Li, K.W. Cheng, S.H. Wang, F. Morstatter, R.P. Trevino, J.L. Tang, H. Liu, Feature selection: a data perspective, ACM Comput. Surv. 50 (2018). doi: 10.1145/3136625.

[52]

R.R. Bouckaert, E. Frank, M. Hall, R. Kirkby, P. Reutemann, A. Seewald, D. Scuse, WEKA manual for version 3-8-3, University of Waikato, 2018.

[53]

S. Sun, R.H. Ouyang, B.C. Zhang, T.Y. Zhang, Data-driven discovery of formulas by symbolic regression, MRS Bull. 44 (2019) 559–564. doi:10.1557/mrs.2019.156. M.L. McHugh, Interrater reliability: the kappa statistic, Biochem. Medica. (2012). doi:10.11613/bm.2012.031.

[55]

C. Chattopadhyay, B.S. Murty, Kinetic modification of the “confusion principle” for metallic glass formation, Scr. Mater.

f

[54]

[56]

D.B. Miracle, O.N. Senkov, A critical review of high entropy alloys and related concepts, Acta Mater. 122 (2017) 448–

pr

511. doi:10.1016/j.actamat.2016.08.081.

X.S. Xiao, F. Shoushi, G.M. Wang, H. Qin, Y. Dong, Influence of beryllium on thermal stability and glass-forming ability

e-

[57]

oo

116 (2016) 7–10. doi:10.1016/j.scriptamat.2016.01.022.

of Zr-Al-Ni-Cu bulk amorphous alloys, J. Alloys Compd. 376 (2004) 145–148. doi:10.1016/j.jallcom.2004.01.014. K. Mondal, B.S. Murty, On the parameters to assess the glass forming ability of liquids, J. Non-Cryst. Solids. 351 (2005)

Pr

[58]

1366–1371. doi:10.1016/j.jnoncrysol.2005.03.006.

Q. Chen, J. Shen, D. Zhang, H. Fan, J. Sun, D.G. McCartney, A new criterion for evaluating the glass-forming ability of

al

[59]

bulk metallic glasses, Mater. Sci. Eng. A. 433 (2006) 155–160. doi:10.1016/j.msea.2006.06.053. X.H. Du, J.C. Huang, C.T. Liu, Z.P. Lu, New criterion of glass forming ability for bulk metallic glasses, J. Appl. Phys.

rn

[60]

[61]

Jo u

101 (2007) 2005–2008. doi:10.1063/1.2718286. G.J. Fan, H. Choo, P.K. Liaw, A new criterion for the glass-forming ability of liquids, J. Non-Cryst. Solids. (2007). doi:10.1016/j.jnoncrysol.2006.08.049. [62]

X.H. Du, J.C. Huang, New criterion in predicting glass forming ability of various glass-forming systems, Chinese Phys. B. 17 (2008) 249–254. doi:10.1088/1674-1056/17/1/043.

[63]

Z.Z. Yuan, S.L. Bao, Y. Lu, D.P. Zhang, L. Yao, A new criterion for evaluating the glass-forming ability of bulk glass forming alloys, J. Alloys Compd. 459 (2008) 251–260. doi:10.1016/j.jallcom.2007.05.037.

[64]

Z.L. Long, G.Q. Xie, H.Q. Wei, X.P. Su, J. Peng, P. Zhang, A. Inoue, On the new criterion to assess the glass-forming ability of metallic alloys, Mater. Sci. Eng. A. 509 (2009) 23–30. doi:10.1016/j.msea.2009.01.063.

[65]

X.L. Ji, Y. Pan, A thermodynamic approach to assess glass-forming ability of bulk metallic glasses, T. Nonferr. Metal SOC. 19 (2009) 1271–1279. doi:10.1016/S1003-6326(08)60438-0.

[66]

G.H. Zhang, K.C. Chou, A criterion for evaluating glass-forming ability of alloys, J. Appl. Phys. 106 (2009). doi:10.1063/1.3255952.

[67]

H.Q. Wei, Z.L. Long, Z.C. Zhang, X.A. Li, J. Peng, P. Zhang, Correlations between viscosity and glass-forming ability in bulk amorphous alloys, Acta Phys. Sin. 58 (2009) 2556–2564. doi:10.7498/aps.58.2556.

23

[68]

S. Guo, C.T. Liu, New glass forming ability criterion derived from cooling consideration, Intermetallics. 18 (2010) 2065– Journal Pre-proof 2068. doi:10.1016/j.intermet.2010.06.012.

[69]

B.S. Dong, S.X. Zhou, D.R. Li, C.W. Lu, F. Guo, X.J. Ni, Z.C. Lu, A new criterion for predicting glass forming ability of bulk metallic glasses and some critical discussions, Prog. Nat. Sci. Mater. Int. 21 (2011) 164–172. doi:10.1016/S10020071(12)60051-3.

[70]

P. Blyskun, P. Maj, M. Kowalczyk, J. Latuch, T. Kulik, Relation of various GFA indicators to the critical diameter of Zrbased BMGs, J. Alloys Compd. 625 (2015) 13–17. doi:10.1016/j.jallcom.2014.11.112.

[71]

Y. Zhang, M. Zhao, Z. Chen, W. Liu, Z. Long, M. Zhong, G. Liao, A new correlation between the characteristics temperature and glass-forming ability for bulk metallic glasses, J. Therm. Anal. Calorim. 132 (2018) 1645–1660. doi:10.1007/s10973-018-7050-0.

[72]

J. Xiong, T.Y. Zhang, S.Q. Shi, Machine learning prediction of elastic properties and glass-forming ability of bulk

oo

f

metallic glasses, MRS Commun. 9 (2019) 576–585. doi:10.1557/mrc.2019.44. E.S. Park, Understanding of the shear bands in amorphous metals, Appl. Microsc. (2015). doi:10.9729/am.2015.45.2.63.

[74]

A.S. Argon, The physics of deformation and fracture of polymers, Cambridge University Press, 2010.

pr

[73]

doi:10.1017/CBO9781139033046.

P.J. Tao, Q. Tu, W.W. Zhang, D.Y. Li, Investigation for biocompatibility of zirconium-based bulk metallic glasses with

e-

[75]

Pr

and without Ni element addition, Basic Clin. Pharmacol. 119 (2016) 10. [76]

R. Xu, W. Pang, Q. Huo, Modern Inorganic Synthetic Chemistry, Elsevier, 2011. doi:10.1016/C2009-0-30549-8.

[77]

X.K. Xi, D.Q. Zhao, M.X. Pan, W.H. Wang, On the criteria of bulk metallic glass formation in MgCu-based alloys,

rn

V. Kokotin, Polyhedra-based analysis of computer simulated, Technical University Dresden, 2010.

Jo u

[78]

al

Intermetallics. 13 (2005) 638–641. doi:10.1016/j.intermet.2004.10.003.

24

Journal Pre-proof

CRediT author statement

Jie XIONG: Conceptualization, Investigation, Methodology, Data Curation, Writing-Original Draft San-Qiang SHI: Writing-Review and Editing, Funding acquisition

Jo u

rn

al

Pr

e-

pr

oo

f

Tong-Yi ZHANG: Writing-Review and Editing, Funding acquisition

25

Journal Pre-proof

Declaration of interests

The authors declare that they have no known competing financial interests or personal relationships that

Jo u

rn

al

Pr

e-

pr

oo

f

could have appeared to influence the work reported in this paper.

26

Journal Pre-proof

Jo u

rn

al

Pr

e-

pr

oo

f

Graphical abstract

27

Journal Pre-proof

Highlights

The machine learning based models for predicting glass-forming ability, critical transition temperatures and elastic moduli of metallic glasses are developed.

Key features which might affect the glass-forming ability, critical transition temperatures and elastic moduli are selected, and expressions based on those features are generated by symbolic regression.

Jo u

rn

al

Pr

e-

pr

oo

f

The machine learning based models and expressions show good predictive and generalization ability, and outperformance the traditional criteria.

28