Determination of minor metal elements in steel using laser-induced breakdown spectroscopy combined with machine learning algorithms


Journal Pre-proof

Yuqing Zhang, Chen Sun, Liang Gao, Zengqi Yue, Sahar Shabbir, Weijie Xu, Mengting Wu, Jin Yu

PII: S0584-8547(19)30606-8

DOI: https://doi.org/10.1016/j.sab.2020.105802

Reference: SAB 105802

To appear in: Spectrochimica Acta Part B: Atomic Spectroscopy

Received date: 22 November 2019

Revised date: 21 February 2020

Accepted date: 21 February 2020

Please cite this article as: Y. Zhang, C. Sun, L. Gao, et al., Determination of minor metal elements in steel using laser-induced breakdown spectroscopy combined with machine learning algorithms, Spectrochimica Acta Part B: Atomic Spectroscopy (2019), https://doi.org/10.1016/j.sab.2020.105802

This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

© 2019 Published by Elsevier.

Yuqing Zhang, Chen Sun, Liang Gao, Zengqi Yue, Sahar Shabbir, Weijie Xu, Mengting Wu, and Jin Yu#

School of Physics and Astronomy, Shanghai Jiao Tong University, Shanghai 200240, China

#Corresponding author: [email protected]

ABSTRACT

The properties of a steel are crucially influenced by the minor elements it contains, including metals such as Mn, Cr and Ni. Determining their concentrations with laser-induced breakdown spectroscopy (LIBS) is of great help in many application scenarios, especially those requiring in situ and online measurements. Such determination can be significantly perturbed by spectral interference with Fe I and Fe II lines, which are particularly dense in the visible and near-UV ranges. As a result, univariate regression can sometimes lead to calibration models with modest analytical performance. In this work, multivariate calibration models are developed using a machine learning approach. We first show the regression results obtained with univariate models. The development of the multivariate models is then briefly presented in successive steps: data pretreatment, feature selection with the SelectKBest algorithm, and regression model training with a back-propagation neural network (BPNN). The analytical performances obtained with the multivariate models are compared with those of the univariate models. We thereby demonstrate the efficiency of the machine learning approach in developing multivariate models for calibration and prediction with LIBS spectra acquired from steel samples. In particular, the prediction trueness (relative error of prediction) reaches 1.13%, 2.85% and 7.20% for Mn, Cr and Ni respectively, and the prediction precision (relative standard deviation) reaches 6.68%, 3.96% and 6.52% for the same elements, under the experimental conditions and measurement protocol used.

Keywords: Steels, Minor metal elements, LIBS, Spectral interference, Multivariate regression, Machine learning


1. Introduction

Steel is the most important base material for heavy industries. Although iron is the main element in steel, special properties of a steel, such as toughness, strength, hardness and corrosion resistance, are mainly determined by minor and trace elements [1-3]. These elements can be divided into two categories: non-metal elements, such as carbon, nitrogen, phosphorus and sulfur, and metal elements, such as manganese, chromium, nickel, molybdenum and vanadium. Among the latter, manganese is an efficient deoxidizer and desulfurizer, which can improve the quenchability and hot workability of steel [4]. Chromium can significantly improve the strength, hardness, oxidation resistance, corrosion resistance and wear resistance of steel [5]. It is therefore of crucial importance to precisely determine the concentrations of minor and trace metal elements in steel.

Currently, metal element analysis methods for steel include X-ray fluorescence (XRF), particularly energy-dispersive X-ray fluorescence (EDXRF) [6-10], spark optical emission spectroscopy [11], inductively coupled plasma atomic emission spectrometry (ICP-AES) [12,13], inductively coupled plasma mass spectrometry (ICP-MS) [14-17], isotope dilution spark source mass spectrometry (ID-SSMS) [18] and flame atomic absorption spectrometry (FAAS) [19-22]. These techniques generally need complex and time-consuming sample preparation, and are therefore typically laboratory-based. Although XRF and spark emission spectroscopy can provide direct analysis of steel samples without complex sample pretreatment, their measurement accuracy is in general quite modest, ranging from 2% for EDXRF [6] to above 10%. Spark emission spectroscopy moreover requires that contact be ensured between the sample and the spark in order to excite a discharge plasma on the sample surface.

Laser-induced breakdown spectroscopy (LIBS) presents a number of specific features: direct analysis of materials, possibly at a remote distance, without complex sample preparation. Such features may become decisive for many applications, especially in industrial process control, where other analytical techniques meet serious difficulties. For the steel industry more specifically, these applications include, for example, in-process analysis of molten steels [23], in situ analysis of steel materials, and analysis and sorting of belt-transported steel pieces or scraps [24].

To fully satisfy the requirements of the above-mentioned applications, LIBS in its current stage of development, and for some specific application cases, still needs better quantitative analytical performance, especially concerning the precision and the trueness of the measurements. A major difficulty in LIBS analysis of steels and their alloys comes from spectral interference due to the emission lines of iron. These lines, present in very large numbers in the spectral range of interest, not only overlap with many emission lines of the analyzed elements, but also induce fluctuations of the spectral baseline. Univariate regression models can therefore lead to modest analytical performance [25]. The development of more elaborate spectroscopic data treatment methods is thus of primary importance for LIBS applications in the steel industry. Chemometric data treatment methods are often used for multivariate classification and regression with LIBS spectra of steels and alloys. Among the algorithms tested for qualitative and quantitative analysis of steels with LIBS spectra, one often finds random forest (RF) [26], K-nearest neighbor (KNN) [27], soft independent modeling of class analogy (SIMCA) [27-29], support vector machine (SVM) [30], partial least squares (PLS) models [29,31-35], neural networks (NN) [31], and principal components regression (PCR) [29,35,36].

In this work, we developed a new multivariate regression method based on machine learning algorithms for the determination of minor elements in steels. Beyond the classical chemometric methods, the machine learning approach [37], initially developed for artificial intelligence, has progressed remarkably in recent years, with a large number of advanced and constantly renewed algorithms available for developing efficient data treatment methods. Machine learning now provides powerful algorithms for data treatment in a wide variety of applications, especially in computer vision and image processing [38,39] as well as in medicine [40]. In our previous work, we developed regression models based on machine learning for the quantitative analysis of trace elements in soils with LIBS [41]. The purpose was to efficiently reduce the matrix effect, which strongly affects the LIBS analysis of soils. The method is implemented in the present work for the treatment of LIBS spectra of steel samples. The purpose here is to build regression models able to perform precise and accurate predictions of the concentrations of important metal elements in steel, such as Mn, Cr and Ni.

In the following parts of the paper, after a brief presentation of the experiment, univariate regression models are first developed using the raw spectral intensities of the analyzed elements and their intensities normalized with an emission line of the matrix element, iron. The developed multivariate model is then presented in more detail, although its general principle can also be found in our published work on LIBS analysis of soils [41]. The analytical performances of the multivariate regression models are presented before the conclusion of the paper.

2. Samples, experimental setup and measurement protocol

2.1 Samples

Twenty-five certified reference steel samples were purchased from NCS Testing Technology Co., Ltd. These reference materials contain the typical metal and non-metal elements found in steel. In this experiment, we focused on calibration models for 3 metal elements: manganese, chromium and nickel. Their concentrations range from 320 to 242000 ppm for Mn, from 53 to 249300 ppm for Cr, and from 30 to 242300 ppm for Ni. The detailed concentration information is presented in Tab. 1. The samples were cylindrical, about 3.5 cm in both diameter and height. The surface of each sample was polished (with 600-mesh then 800-mesh sandpaper) and cleaned (with 99.999% alcohol then distilled water) before the LIBS measurements were performed on it.

Table 1 Certified concentrations (in ppm) of the 3 analyzed elements and iron in the certified reference steel samples provided by NCS Testing Technology Co., Ltd.

| Sample name | Mn | Cr | Ni | Fe | Others |
|---|---|---|---|---|---|
| S1 | 940 | 0 | 0 | 954249 | 44811 |
| S2 | 23500 | 15800 | 35700 | 834064 | 90936 |
| S3 | 4700 | 250 | 5900 | 932659 | 56491 |
| S4 | 2960 | 15900 | 18900 | 885611 | 76629 |
| S5 | 8800 | 249300 | 5000 | 692690 | 44210 |
| S6 | 14800 | 1200 | 13100 | 916630 | 54270 |
| S7 | 1220 | 117200 | 1800 | 853440 | 26340 |
| S8 | 137100 | 33800 | 0 | 792657 | 36443 |
| S9 | 242000 | 18600 | 16500 | 694739 | 28161 |
| S10 | 9570 | 9730 | 620 | 950735 | 29345 |
| S11 | 8800 | 730 | 280 | 976734 | 13456 |
| S12 | 5720 | 8720 | 600 | 973710 | 11250 |
| S13 | 7430 | 10600 | 360 | 972675 | 8935 |
| S14 | 12600 | 240 | 270 | 981295 | 5595 |
| S15 | 11500 | 172400 | 92400 | 708237 | 15463 |
| S16 | 16800 | 2560 | 880 | 975210 | 4550.5 |
| S17 | 11800 | 175700 | 116500 | 677041 | 18959 |
| S18 | 4720 | 0 | 0 | 979984 | 15296 |
| S19 | 11700 | 196700 | 242300 | 483719 | 65581 |
| S20 | 4240 | 0 | 0 | 982163 | 13597 |
| S21 | 3430 | 0 | 0 | 984067 | 12503 |
| S22 | 11900 | 170 | 50 | 985950 | 1930 |
| S23 | 320 | 53 | 160 | 997417 | 2050 |
| S24 | 11600 | 200 | 50 | 985970 | 2180 |
| S25 | 1240 | 160 | 30 | 997297 | 1273 |

NCS reference codes of the samples, in the order in which they are listed in the source (their alignment with the sample rows is not preserved): K144, YSBS11080-2003, CSBS11088, YSBS4510188-15, K084, CZ950, HLBS11035-2012, JZK14-355, CZ951, YSBS16209-2007, YJZ0301, YSBS11230-2013, YSBS11222-2014, YSBS23302-200, YSBS35102-2015, YSBS11389-2008, YSBS45375-2013, YSBS281039-13, YSBS11208-2015, YSBS451073-13, YSBS11208a-2015, YSBS11078c-2012, YSBS45374-2013, YSBS45371-2013, K146.

2.2 Experimental setup

The experimental setup is shown in Fig. 1. The ablation source was a Q-switched Nd:YAG laser (Beamtech Optronics Co., Ltd.) operating at 1064 nm with a repetition rate of 10 Hz and delivering pulses of 7 ns duration with a maximum pulse energy of 40 mJ. The laser beam passed through an optical attenuator, consisting of a half-wave plate and a Glan prism, and was directed into a doublet lens of 50 mm focal length (f0). Laser pulses with energy adjusted to 32 mJ after the attenuator were focused slightly below the sample surface, with a downward shift of 0.5 mm, which allowed the generation of stable plasmas above the sample surface. The focused laser spot on the target surface was estimated to be 400 µm in diameter, resulting in a laser fluence of 25.5 J/cm² and an irradiance of 3.64 GW/cm² delivered to the sample, without considering absorption by the plasma. The sample was mounted on an x-y-z 3-D displacement stage so that the sample surface could be scanned during a measurement. A θ-φ tilting plate supporting the 3-D stage allowed the sample surface to be set in the horizontal plane. The sample surface was kept at a constant height during the experiment by combining a laser pointer at oblique incidence on the sample surface with a CCD camera focused on the sample surface.

Fig.1 Schematic presentation of the experimental setup.
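The fluence and irradiance quoted for the focused beam can be checked with a few lines of arithmetic. The sketch below assumes a uniform (top-hat) energy distribution over a circular spot, which the text does not state explicitly; the variable names are illustrative.

```python
import math

# Values quoted in the description of the setup.
pulse_energy_j = 32e-3       # 32 mJ after the optical attenuator
spot_diameter_cm = 400e-4    # 400 µm expressed in cm
pulse_duration_s = 7e-9      # 7 ns pulse duration

# Assumption: uniform energy distribution over a circular spot.
spot_area_cm2 = math.pi * (spot_diameter_cm / 2) ** 2
fluence_j_cm2 = pulse_energy_j / spot_area_cm2
irradiance_gw_cm2 = fluence_j_cm2 / pulse_duration_s / 1e9

print(f"fluence    = {fluence_j_cm2:.1f} J/cm^2")
print(f"irradiance = {irradiance_gw_cm2:.2f} GW/cm^2")
```

Under this assumption the computed values round to the 25.5 J/cm² and 3.64 GW/cm² given in the text.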

The generated plasmas were imaged along a horizontal direction by a combination of two quartz lenses (f1 and f2) with the same focal length of 50 mm. In the image plane, an optical fiber with a core diameter of 50 µm captured part of the plasma emission from within the plasma image, which was about 2 mm high and 2 mm wide. The capture point of the fiber was set at the middle of the plasma and at two-fifths of its height from the sample surface. The output end of the fiber was connected to the entrance of an echelle spectrometer (Mechelle 5000, Andor Technology), which was in turn coupled to an intensified charge-coupled device (ICCD) camera (iStar, Andor Technology). The spectral range of the spectrometer was 220 nm – 900 nm. The ICCD was triggered by the laser pulses via a fast photodiode and set with a detection delay of 1 µs and a detection gate width of 3 µs. The same intensifier gain was applied for all the measurements in our experiment.

2.3 Experimental protocol

For each steel sample, 400 replicate spectra were taken. Each spectrum was an accumulation of 15 successive laser shots on one ablation site. A distance of 500 µm was left between 2 neighboring ablation sites to avoid overlap. As shown in Fig. 1, the 400 replicate measurements were distributed on the sample surface within 4 matrices of 10 × 10 ablation sites. Ablations were first performed along a line of 10 ablation sites, giving 10 spectra. A second line was then ablated below the first one, and so on up to the last line of the 10 × 10 matrix. The second matrix was then ablated after rotating the sample to a fresh area. The sample surface was kept in the same horizontal plane during the rotation thanks to the sample surface monitoring system. Homemade software automatically managed the synchronization between the opening of the beam shutter, the acquisition by the ICCD camera and the translation of the sample.
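The sampling pattern described above can be sketched as follows. This is only an illustration of the bookkeeping (sites, spectra and shots per sample); the coordinate origin, the axis directions and all names are assumptions, and the code is not the authors' acquisition software.

```python
PITCH_UM = 500           # distance between neighboring ablation sites
SITES_PER_LINE = 10
LINES_PER_MATRIX = 10
MATRICES_PER_SAMPLE = 4
SHOTS_PER_SITE = 15      # laser shots accumulated into one spectrum


def matrix_sites(pitch_um=PITCH_UM):
    """Return (x, y) offsets in µm for one 10 x 10 matrix, scanned line by line."""
    return [(col * pitch_um, row * pitch_um)
            for row in range(LINES_PER_MATRIX)
            for col in range(SITES_PER_LINE)]


sites = matrix_sites()
spectra_per_sample = MATRICES_PER_SAMPLE * len(sites)
shots_per_sample = spectra_per_sample * SHOTS_PER_SITE

print(len(sites), spectra_per_sample, shots_per_sample)
```

With the numbers given in the protocol, each sample receives 400 spectra built from 6000 laser shots in total.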

3. Results and discussions

3.1 Raw spectra and analytical line selection

Figure 2 shows the replicate-averaged spectrum recorded for sample S5, with insets showing, on an enlarged wavelength scale, emission lines of the 3 analyzed elements (Mn, Cr and Ni). The spectral lines selected to represent the emission intensities of the 3 analyzed elements are the Mn II 293.93 nm, Cr II 313.21 nm and Ni II 229.7 nm lines, shown in red letters in the insets of Fig. 2. They were selected first for their relatively strong intensities in the detected spectra, and above all because, compared with the other detected lines of the same element, they were relatively free of interference with spectral lines emitted by other elements in the same sample, especially Fe. Another important criterion was that the line should not be significantly affected by self-absorption: the line shapes were inspected to avoid obvious self-absorption and self-reversal. In addition, the relatively high R² value of the calibration curve, compared with the values obtained with other lines of the same element, further justifies the choice.

Fig. 2 Replicate-averaged spectrum of the sample S5. The insets show the spectral lines selected to measure the emission intensities of the 3 analyzed elements, Mn, Cr and Ni.

3.2 Univariate regression

As shown in Table 2, a subset of the certified reference steel samples was selected for each analyzed element, with a reduced concentration range more suitable for calibration. The selected samples were further separated into a set of calibration samples and a set of validation samples. LIBS measurements were performed on each sample of both sets to obtain the spectral intensities of the selected lines of the analyzed elements. For each sample, the 400 replicate spectra were averaged, resulting in a mean spectrum and the associated standard deviation (SD),

$$SD = \sqrt{\frac{\sum_{i=1}^{n} (I_i - \bar{I})^2}{n-1}},$$

with $I_i$ being the line intensity of the $i$-th replicate measurement, $\bar{I}$ the average intensity and $n$ the number of replicates. The background of the mean spectrum was fitted and removed. The averaged intensities of the selected lines for the 3 analyzed elements were then extracted together with the corresponding SD values. Univariate calibration models for the analyzed elements were obtained by fitting the averaged intensities of the selected lines as a function of the respective elemental concentrations.

Tab. 2 Calibration and validation sample sets for the 3 analyzed elements.

| Element | Calibration sample set | Validation sample set |
|---|---|---|
| Mn | S1, S3, S4, S5, S6, S7, S10, S11, S12, S13, S14, S15, S16, S17, S18, S19, S21, S22, S23, S25 | S20, S24 |
| Cr | S2, S4, S6, S7, S8, S9, S10, S11, S12, S14, S16, S17, S19 | S13, S15 |
| Ni | S2, S3, S5, S6, S7, S9, S10, S11, S12, S13, S14, S16, S17, S19, S23 | S4, S15 |
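The univariate calibration described above amounts to an ordinary least-squares line fit of mean line intensity against certified concentration, followed by inversion of that line for prediction. The sketch below illustrates this with hypothetical numbers; the data values and function names are assumptions, not the measurements of this work.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept; also returns R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


def predict_concentration(intensity, slope, intercept):
    """Invert the calibration line to turn a line intensity into a concentration."""
    return (intensity - intercept) / slope


# Hypothetical calibration data: concentrations (ppm) and mean intensities (a.u.).
conc = [1000.0, 5000.0, 10000.0, 15000.0]
inten = [210.0, 1030.0, 1980.0, 3050.0]

slope, intercept, r2 = fit_line(conc, inten)
c_pred = predict_concentration(1500.0, slope, intercept)
print(f"R^2 = {r2:.4f}, predicted concentration = {c_pred:.0f} ppm")
```

In practice the fit would use the replicate-averaged intensities of the calibration samples listed in Table 2, and the validation samples would be passed through `predict_concentration`.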

Figure 3 shows the replicate-averaged line intensities of the calibration samples as a function of the corresponding elemental concentrations for Mn (Fig. 3a), Cr (Fig. 3b) and Ni (Fig. 3c), together with the calibration curves resulting from a linear fit of the experimental points. The calibration curves are then used to predict the concentrations of the validation samples from the corresponding line intensities. The validation data points are also plotted in Fig. 3 as crosses. The error bars in the figure are the standard deviations (±SD) of the intensities calculated over the 400 replicate measurements performed for each sample. The determination coefficients R² are clearly lower than unity because of the dispersion of the average line intensities about the linear regression, which indicates the influence of the matrix effect and a limited experimental repeatability from one sample to another. The error bars are also relatively large, indicating the dispersion of the individual replicate measurements for a given sample. Such dispersion can arise from experimental fluctuations as well as from inhomogeneity of the sample. This means that the dispersion can increase with the number of replicates n (n = 400 in our case). At the same time, the precision of the mean intensity determination is improved, since such precision corresponds to the standard error (SE). SE is defined as the standard deviation of the means of the measurements, and is related to SD through the relation $SE = SD/\sqrt{n}$.
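The distinction between SD and SE can be made concrete with a short computation. The sketch below uses a handful of hypothetical replicate intensities; the helper name `sd_and_se` is illustrative.

```python
import math
import statistics


def sd_and_se(intensities):
    """Sample standard deviation (n - 1 denominator) and standard error of the mean."""
    n = len(intensities)
    sd = statistics.stdev(intensities)
    se = sd / math.sqrt(n)
    return sd, se


# Hypothetical replicate intensities (arbitrary units) of one line for one sample.
replicates = [98.0, 102.0, 101.0, 99.0]
sd, se = sd_and_se(replicates)
print(f"SD = {sd:.3f}, SE = {se:.3f}")
```

With n = 400 replicates, as in this experiment, the SE is 20 times smaller than the SD.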

Fig. 3 Univariate calibration curves resulting from linear fitting of the average intensities of the selected lines of the analyzed elements (a: Mn, b: Cr and c: Ni) for the samples of the calibration set (open symbols), and validation data (crosses) for each element from the samples of the validation set. The error bars in the figures correspond to the standard deviations (±SD) over the 400 replicate measurements of a given sample.

From the data presented in Fig. 3, we can extract parameters indicating the analytical performances of the calibration models: the determination coefficient R², the relative error of calibration REC (%), the limit of detection LOD (ppm), the relative error of prediction REP (%) and the relative standard deviation RSD (%) of the predicted concentrations. This set of parameters is commonly used for the assessment of calibration models; their definitions can be found elsewhere [31,41]. Table 3 shows the extracted parameters. Such analytical performances clearly leave room for improvement in order to satisfy the requirements of quantitative analysis. The remark made above about the SD of the intensities (error bars in Fig. 3) also applies to the RSD of the predicted concentrations in Table 3, which describes the dispersion of the concentrations predicted with individual replicate spectra. This implies smaller relative standard errors (RSE), according to $RSE = RSD/\sqrt{m}$, for mean concentrations obtained by averaging the concentrations predicted by the model from $m$ replicate spectra ($1 < m \le n$). Similarly, since the LOD values in Table 3 are calculated from the SD of the individual replicate measurements, they can be reduced when mean concentrations are considered.
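As a worked illustration of this relation, take the RSD of 11.8% reported for Mn in Table 3 (regression of average line intensities) and assume that all m = n = 400 replicate spectra are averaged:

$$RSE = \frac{RSD}{\sqrt{m}} = \frac{11.8\%}{\sqrt{400}} = \frac{11.8\%}{20} = 0.59\%$$

The choice m = 400 is only an example; any smaller m gives a proportionally larger RSE.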

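Tab. 3 Parameters indicating the analytical performances of the univariate calibration models.

| Calibration type | Element | Calibration model: r² | Calibration model: REC (%) | Calibration model: LOD (ppm) | Validation: REP (%) | Validation: RSD (%) |
|---|---|---|---|---|---|---|
| Regression of average line intensities | Mn | 0.8796 | 43.0 | 887.6 | 21.6 | 11.8 |
| Regression of average line intensities | Cr | 0.9095 | 33.4 | 1909.0 | 40.8 | 8.23 |
| Regression of average line intensities | Ni | 0.9852 | 705.9 | 7169.9 | 19.1 | 14.5 |
| Regression of normalized average line intensities | Mn | 0.9405 | 32.4 | - | 18.0 | 21.5 |
| Regression of normalized average line intensities | Cr | 0.9996 | 21.8 | - | 26.0 | 14.5 |
| Regression of normalized average line intensities | Ni | 0.9916 | 759.3 | - | 12.3 | 18.2 |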

The internal reference method was thus used to improve the performances of the univariate calibration models. The Fe II 245.9 nm line was used to normalize the intensities of the selected lines of the analyzed elements; compared with other iron ion lines, it gave better results. The calibration curves resulting from linear regression of the normalized line intensities as a function of the concentration ratio between the element concerned and iron are shown in Fig. 4, in the same way as in Fig. 3. The parameters indicating the performances of these calibration models are shown in Table 3, alongside those of the calibration models with raw line intensities. As the table shows, the normalization improves R² and REP, but degrades RSD. This means that the normalization efficiently corrects the matrix effect from one sample to another, but introduces additional noise in the replicate intensities of a given sample. Additional efforts are therefore required for further improvements.
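The internal reference normalization pairs an intensity ratio with a concentration ratio for each calibration point. The sketch below shows how one such point could be formed; the certified concentrations are those of sample S3 in Table 1, while the line intensities and function names are hypothetical.

```python
def normalized_intensity(analyte_intensity, iron_intensity):
    """Analyte line intensity divided by the internal reference (Fe) line intensity."""
    return analyte_intensity / iron_intensity


def concentration_ratio(analyte_ppm, iron_ppm):
    """Analyte concentration divided by the iron concentration."""
    return analyte_ppm / iron_ppm


# Certified values of sample S3 (Table 1): Mn 4700 ppm, Fe 932659 ppm.
x_s3 = concentration_ratio(4700.0, 932659.0)

# Hypothetical intensities (a.u.) of the Mn II 293.93 nm and Fe II 245.9 nm lines.
y_s3 = normalized_intensity(1200.0, 48000.0)

print(f"calibration point: x = {x_s3:.5f}, y = {y_s3:.3f}")
```

The calibration line is then fitted through the (x, y) points of all calibration samples, exactly as for the raw intensities.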

Fig. 4 Univariate calibration curves resulting from linear fitting of the normalized average intensities of the selected lines of the analyzed elements (a: Mn, b: Cr and c: Ni) for the samples of the calibration set (open symbols), and validation data (crosses) for each element from the samples of the validation set. The error bars in the figures correspond to the standard deviations (±SD) over the replicate measurements of a given sample.

3.3 Multivariate regression

3.3.1. Principle and model training

In this work we used the method developed in our previous work devoted to the treatment of LIBS spectra of soil samples [41]; detailed information about the principle and the implementation of the method can be found in that publication. In the following, we focus on the application of the method to LIBS spectra of steels, with the necessary adaptations.

a. Flowchart

Figure 5 shows the simplified flowchart of the model training procedure, which proceeds in successive steps. The experimental spectra are divided into a calibration set and a validation set as indicated in Table 2. Pretreatment is performed on the raw spectra and consists of i) normalization and ii) feature selection. The normalization, applied to all the raw spectra of the calibration and validation sets, is a simple operation which transforms the intensity range of a raw spectrum into the interval between 0 and 1. The particularity of the normalization used in this work is that it is performed for a given pixel over the spectra of all the replicate measurements of all the samples. Such normalization reduces the contrast among the pixel intensities of a raw spectrum, which can initially exceed one order of magnitude for a large part of the pixels, as shown in Fig. 2. The feature selection algorithm is then applied to the normalized spectra of the calibration set. The selected features of a spectrum are used as input variables of a back-propagation neural network algorithm for model training. Finally, the trueness and the precision of the model are tested using the selected features from the normalized spectra of the validation set.

Fig. 5 Flowchart of the multivariate calibration model training procedure.
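The pixel-wise normalization step can be sketched as a min-max scaling applied column by column to the matrix of spectra. The min-max formula itself is an assumption here, since the text only states that intensities are mapped to the interval between 0 and 1; the toy data and the function name are hypothetical.

```python
def minmax_per_pixel(spectra):
    """Scale every pixel (column) to [0, 1] using its min and max over all spectra (rows)."""
    n_pixels = len(spectra[0])
    lows = [min(s[j] for s in spectra) for j in range(n_pixels)]
    highs = [max(s[j] for s in spectra) for j in range(n_pixels)]
    normalized = []
    for s in spectra:
        row = []
        for j in range(n_pixels):
            span = highs[j] - lows[j]
            # A pixel that never varies carries no information; map it to 0.
            row.append((s[j] - lows[j]) / span if span > 0 else 0.0)
        normalized.append(row)
    return normalized


# Hypothetical toy data: 3 spectra of 3 pixels each.
spectra = [[10.0, 200.0, 5.0],
           [20.0, 800.0, 5.0],
           [30.0, 500.0, 5.0]]
norm = minmax_per_pixel(spectra)
print(norm)
```

Because each pixel is scaled independently, a weak line and a strong line end up on the same 0 to 1 scale, which is the contrast reduction mentioned in the text.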

b. Feature selection

The feature selection is performed using the SelectKBest (SKB) algorithm [41] on the normalized spectra of the calibration set. The principle consists in keeping, in each individual spectrum and for further processing, only the pixel intensities that correlate strongly enough with the series of analyte concentrations of the calibration sample set. In this work, the same procedure was applied separately to the 3 analyzed elements to select 150 spectral features for each of them. This number of pixels corresponds to the criterion of a Pearson's correlation coefficient [42] greater than 0.75 between the pixel intensity and the analyte concentration. Using a reduced number of pixel intensities as input variables to train the calibration model decreases the risk of overfitting. In this work, this risk is further reduced by a necessary trade-off between the calibration performance (REC) of the model and its prediction capacity (REP and RSD).

Figure 6 shows the results of the feature selection procedure. For all 3 elements, the emission lines selected manually for the above univariate regression, Mn II 293.93 nm, Cr II 313.21 nm and Ni II 229.7 nm, obtain high scores in the feature selection by the SKB algorithm (indicated by dotted ovals in Fig. 6). According to the relative line intensities in the NIST atomic spectra database, these lines are not the most intense ones of the respective elements, but they suffer less from interference with other emission lines, especially those of Fe II and Fe I. This means that the algorithm offers a tool for automatic selection of the most suitable lines for analytical purposes, taking into account the usual criteria in LIBS analysis of relatively high intensity and low spectral interference.
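The selection principle can be illustrated with a small standard-library stand-in that scores each pixel by its Pearson correlation with the analyte concentration and keeps the best k pixels above a threshold. This is a sketch of the idea, not the SelectKBest implementation used in this work; the use of the signed (rather than absolute) correlation, the toy data and all names are assumptions.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant pixel is treated as uncorrelated
    return sxy / math.sqrt(sxx * syy)


def select_k_best(spectra, concentrations, k, r_min=0.75):
    """Return the indices of up to k pixels with the highest r above r_min, and all scores."""
    n_pixels = len(spectra[0])
    scores = [pearson_r([s[j] for s in spectra], concentrations)
              for j in range(n_pixels)]
    ranked = sorted(range(n_pixels), key=lambda j: scores[j], reverse=True)
    return [j for j in ranked[:k] if scores[j] > r_min], scores


# Hypothetical toy data: 4 normalized spectra of 4 pixels, one concentration each.
spectra = [[0.0, 0.9, 0.1, 0.5],
           [0.3, 0.1, 0.4, 0.5],
           [0.7, 0.6, 0.6, 0.5],
           [1.0, 0.2, 1.0, 0.5]]
conc = [100.0, 400.0, 700.0, 1000.0]

selected, scores = select_k_best(spectra, conc, k=2)
print(selected)
```

In the toy data, pixels 0 and 2 follow the concentration and are kept, while the anti-correlated pixel 1 and the constant pixel 3 are rejected. In this work the same idea is applied with k = 150 per element.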

Fig. 6 Results of spectral feature selection using the SelectKBest algorithm for the 3 analyzed elements, Mn (a), Cr (b) and Ni (c). For each element, the upper part of the figure shows the raw spectrum (blue line) and the selected pixels (red dots) in the raw spectrum. The insets in the upper part of the figures show detailed spectra in the spectral ranges with high-score features, together with the selected pixels as red dots. In the bottom part of each figure, the scores of the 150 selected features are shown.

c. Model training

The 150 selected features were used as input variables to train a back-propagation neural network (BPNN) [43,44]. The resulting calibration models were validated using spectra obtained from the validation samples, which were not involved in the model training. The performance of the calibration models is assessed with the determination coefficient R², the relative error of calibration (REC) and the limit of detection (LOD). The trueness and precision of the predictions by the calibration models are assessed by the relative error of prediction (REP) and the relative standard deviation (RSD) of the predicted values.
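To make the training step more tangible, the sketch below trains a minimal one-hidden-layer network by back-propagation on a toy regression problem. The architecture (3 tanh hidden units, linear output), the learning rate, the number of epochs and the data are all assumptions chosen for illustration; the network actually used in this work is not specified at this level of detail.

```python
import math
import random


def train_bpnn(inputs, targets, n_hidden=3, lr=0.1, epochs=2000, seed=0):
    """Train a one-hidden-layer network (tanh hidden units, linear output)
    by stochastic gradient descent with back-propagated errors."""
    rng = random.Random(seed)
    n_in = len(inputs[0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b2 = 0.0

    def forward(x):
        h = [math.tanh(sum(w * xi for w, xi in zip(w1[j], x)) + b1[j])
             for j in range(n_hidden)]
        return h, sum(w2[j] * h[j] for j in range(n_hidden)) + b2

    for _ in range(epochs):
        for x, t in zip(inputs, targets):
            h, y = forward(x)
            err = y - t  # derivative of 0.5 * (y - t)^2 with respect to y
            for j in range(n_hidden):
                # Error propagated back to hidden unit j (uses w2[j] before its update).
                grad_h = err * w2[j] * (1.0 - h[j] ** 2)
                w2[j] -= lr * err * h[j]
                for i in range(n_in):
                    w1[j][i] -= lr * grad_h * x[i]
                b1[j] -= lr * grad_h
            b2 -= lr * err

    return lambda x: forward(x)[1]


# Hypothetical toy data: 2 selected features per spectrum, concentration scaled to [0, 1].
X = [[0.0, 0.1], [0.3, 0.4], [0.7, 0.6], [1.0, 1.0]]
t = [0.0, 0.33, 0.67, 1.0]

predict = train_bpnn(X, t)
mse = sum((predict(x) - ti) ** 2 for x, ti in zip(X, t)) / len(X)
print(f"training MSE = {mse:.5f}")
```

In the real procedure the inputs would be the 150 selected pixel intensities of each normalized spectrum, and the trained `predict` function would be applied to the spectra of the validation samples.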

3.3.2. Calibration curves with the multivariate regression

The multivariate calibration models, together with the validation data of the analyzed elements, are shown in Fig. 7, and the parameters assessing their analytical performances are presented in Table 4. The multivariate calibration curves exhibit R² values very close to unity, within the range of 0.99969 - 0.99997. This means that the multivariate models efficiently compensate for experimental fluctuations and the matrix effect from one reference sample to another. The fluctuations from one replicate to another for a given sample are also significantly reduced, which leads to reduced error bars (±SD) on the predicted concentrations. A direct consequence of this reduction is significantly improved LODs. However, the large concentration ranges of the certified reference samples used in the experiment did not allow us to evaluate the LODs of the multivariate models under optimized conditions, that is, with concentration ranges of smaller extension and mean value. Concerning the prediction performances of the models, the REPs of the 3 analyzed elements, which represent the trueness of the predictions, are 1.13%, 2.85% and 7.20% for Mn, Cr and Ni respectively, significantly improved with respect to the univariate models. The precision of the predictions, assessed by the RSDs, is 6.68%, 3.96% and 6.52% for Mn, Cr and Ni respectively, also significantly improved compared with the univariate models. Finally, the remarks made about the RSD and LOD of the univariate models also apply here: considering mean concentrations, obtained by averaging those predicted by the multivariate model from individual replicate spectra, can further improve the precision and LOD of the concentration determination.

Fig. 7 Multivariate calibration models and the predicted concentrations as a function of the certified concentrations of the calibration samples (open symbols) for the 3 analyzed elements (a: Mn, b: Cr and c: Ni). Predicted concentrations of the validation samples are presented in the figures with crosses. The error bars in the figures correspond to the standard deviations (±SD) of the replicate measurements for a given sample.

Tab. 4 Parameters showing the analytical performances of the multivariate regression models.

| Calibration type | Element | Calibration model: r² | Calibration model: REC (%) | Calibration model: LOD (ppm) | Validation: REP (%) | Validation: RSD (%) |
|---|---|---|---|---|---|---|
| Multivariate regression | Mn | 0.99969 | 1.61 | 614.7 | 1.13 | 6.68 |
| Multivariate regression | Cr | 0.99997 | 5.60 | 432.5 | 2.85 | 3.96 |
| Multivariate regression | Ni | 0.99977 | 28.9 | 893.9 | 7.20 | 6.52 |

4. Conclusion

In this work, we have developed a LIBS spectrum data treatment method, based on a machine learning approach, for determining the concentrations of minor metal elements in steel, here Mn, Cr and Ni. The approach mainly consists of a data pretreatment procedure with normalization and spectral feature selection, followed by the training and validation of a multivariate prediction model. The feature selection results obtained with SelectKBest showed the efficiency of this automatic selection method, which can be of practical interest for applications. The selected spectral features were used as the input variables to train multivariate models based on a back-propagation neural network. The trained algorithms provided prediction models for the analyzed elements. The obtained multivariate calibration curves exhibit 𝑅² values very close to unity, in the range of 0.99969 to 0.99997. Compared to the univariate models, the analytical performances of the multivariate models are clearly improved, with 𝑅𝐸𝑃s and 𝑅𝑆𝐷s in the ranges of 1.13% to 7.20% and 3.96% to 6.68% respectively. Our work has therefore demonstrated the efficiency of the machine learning approach in the data treatment of LIBS spectra of steels. As a next step in our laboratory, this approach will be applied to the precise determination of nonmetal elements in steel, which is of great interest for applications in the steel industry and many related domains.

Declaration of interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This study was supported by the National Natural Science Foundation of China (Grant Nos. 11574209, 11805126, and 61975190).

Author Statement

Yuqing Zhang performed the experiment, made the calculations for the univariate and multivariate models and contributed to the paper writing; Chen Sun developed the machine learning program; Jin Yu designed the research program and supervised the operations; Liang Gao, Yuqing Zhang and Zengqi Yue contributed to the development of the experimental setup; the other co-authors participated in the experiments and the data treatments.



Highlights

• Quantitative determination of Mn, Cr, Ni in steels with trueness and precision of 1.13%, 2.85%, 7.20% and 6.68%, 3.96%, 6.52% respectively.
• Machine learning approach for LIBS spectrum treatment.
• SelectKBest algorithm for spectral feature selection.
• Back-propagation neural network for building multivariate calibration models.