Study on distributed lithium-ion power battery grouping scheme for efficiency and consistency improvement




Journal of Cleaner Production 233 (2019) 429-445

Contents lists available at ScienceDirect

Journal of Cleaner Production journal homepage: www.elsevier.com/locate/jclepro

Xiwei Bai a,b, Jie Tan a,*, Xuelei Wang a, Lianjing Wang a, Chengbao Liu a,b, Liyong Shi c, Wei Sun c

a Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
b University of Chinese Academy of Sciences, Beijing, 100049, China
c Zhejiang Tianneng Energy Technology Co., Ltd., Changxing, 313100, China

Article info

Article history: Received 12 March 2019; received in revised form 30 May 2019; accepted 31 May 2019; available online 3 June 2019

Abstract

The service life, safety, and capacity of lithium-ion power battery packs rely heavily on the consistency among battery cells. Grouping is an effective procedure to improve consistency by screening cells with similar performance and assembling them into the same group. Battery grouping can be achieved via clustering techniques based on characteristics such as static capacity and internal resistance. The dynamic characteristics-based method considers the battery performance during the entire charging-discharging process and has become one of the most promising grouping methods. However, it suffers from high computational complexity. Facing stricter quality standards and an increasing demand for battery products, existing dynamic characteristics-based grouping schemes cannot meet the performance and time requirements of modern high-quality, large-scale battery manufacturing. In this paper, a novel grouping scheme based on distributed time-series clustering is proposed to meet the need for both efficiency and consistency improvement. The proposed scheme designs an effective “cloud-edge” mode and utilizes a two-stage strategy to achieve parallel processing, which splits the original centralized clustering approach into local clustering and global merging. The host computers for data acquisition and battery control are regarded as distributed edge computing resources that implement local clustering on the acquired battery discharging voltage sequence set. Cluster contours are extracted via a denoising contour extraction algorithm that accounts for the irregularity of the discharging voltage sequences. The results of this preliminary processing are uploaded to a cloud data center. A pragmatic merging scheme based on an integrated merging indicator is established to solve the decision-making problem of global merging at the cloud data center. The final global cluster set is transmitted back to the host computers to instruct the cell unloading operation. Experimental results based on real battery discharging voltage sequence data suggest that the proposed scheme can reduce the inconsistency rate by 43.56% and the time cost by 92.87%. The computing efficiency and resource utilization rate of the distributed scheme are much higher than those of the centralized scheme. Meanwhile, compared with three existing advanced grouping approaches, our scheme performs the best in reducing the inconsistency rate. © 2019 Published by Elsevier Ltd.

Keywords: Lithium-ion power battery grouping; Consistency improvement; Efficiency improvement; Edge computing; Distributed time-series clustering

1. Introduction

As the core component of electric vehicles (EVs), lithium-ion power batteries boast high energy density, a low self-discharge rate, no pollution, no memory effect and good cycle performance (Lipu et al., 2018; Liu et al., 2018; Lu et al.,

* Corresponding author. Institute of Automation, Chinese Academy of Sciences, 95 East Zhongguancun Road Haidian District, Beijing 100190, China. E-mail address: [email protected] (J. Tan). https://doi.org/10.1016/j.jclepro.2019.05.401 0959-6526/© 2019 Published by Elsevier Ltd.

2013; Wang et al., 2016; Wei et al., 2017) and have gradually become one of the most prevalent and promising power sources (Oliveira et al., 2015; Zhang et al., 2017a). However, due to limitations in voltage and capacity, battery cells are generally grouped together through series-parallel connections to form a power battery pack, with an additional battery management system (BMS), to meet the requirements of EVs (Barreras et al., 2016; Zhang et al., 2017a, b). In a power battery pack, inconsistencies inevitably exist in characteristics such as internal resistance, nominal capacity, open-circuit voltage (OCV), thermal distribution and charging-discharging performance among


cells (Chiu et al., 2014; Feng et al., 2018; Gogoana et al., 2014; Rumpf et al., 2017). These inconsistencies can cause severe problems because of the barrel effect (Love et al., 2014). In practical applications, a BMS controls the charging and discharging process of the battery pack. The cell with the lowest capacity limits this process, reducing the available capacity of the entire battery pack (Niu et al., 2019). Gogoana et al. (2014) showed that inconsistency in ohmic resistance might cause a much larger lifetime reduction. More seriously, the inconsistency among cells gets worse with aging (Baumhöfer et al., 2014; Jaguemont et al., 2016; Zheng et al., 2015). Therefore, to ensure safety and prolong the service life, the inconsistency of the battery pack needs to be minimized.

The inconsistency problem mainly stems from the quality variation of raw materials, the precision and stability of equipment, and the production environment during the battery manufacturing process (Baumhöfer et al., 2014; Väyrynen and Salminen, 2012). However, reducing the effects of these factors might be difficult and costly. Fortunately, the grouping process allows us to assemble cells with similar characteristics into a battery pack to improve consistency (Huang et al., 2018; Lai et al., 2019). How to improve the performance of the battery grouping process has recently become a research focus.

Essentially, battery grouping aims to categorize battery cells according to their differences in various characteristics. These characteristics mainly comprise static capacity, voltage, internal resistance (Li, 2014) and thermal behavior (Fang et al., 2013). Battery grouping can be achieved via a similarity analysis of any of the above characteristics. The static capacity-based grouping method conducts a charging-discharging experiment under specific conditions and calculates the static capacity according to the charging and discharging time.
Cells are then grouped by their respective static capacity (Kim et al., 2012). This method is convenient to operate, but it is vulnerable to variations in environmental factors such as temperature. The voltage-based grouping method divides cells into different groups according to their OCVs or load voltages. Although measuring the voltage is simple and direct, the grouping results are usually unsatisfactory because the voltage variation during the charging-discharging process is neglected. By contrast, the internal resistance-based grouping method can obtain cell groups with better consistency, but accurate measurement of this characteristic is difficult (Gogoana et al., 2014).

The aforementioned grouping methods have different advantages and disadvantages. In practical applications, manufacturers often combine two or more methods to obtain better grouping performance (Zeng et al., 2016). In (He et al., 2014; Yun et al., 2019), the self-organizing map (SOM) is adopted for battery grouping based on cell temperature and capacity. The reported grouping experiments demonstrate its validity in reducing the variation of these two parameters. However, this kind of combinatorial method leads to more complex operations and longer execution time, and it cannot get rid of the disadvantages of the basic methods. The desired method should take as many cell characteristics as possible into account and be convenient to implement.

So far, the most promising grouping method is the dynamic characteristics-based method (He et al., 2016). This method makes use of the dynamic characteristic curves, namely, the voltage or current curves during the charging-discharging process. Cells with similar curves are grouped into the same pack. These curves contain the majority of the battery characteristics and can be readily acquired and recorded with sensors and computers.
The core issue of the dynamic characteristics-based grouping method comes down to a whole time-series clustering problem. The similarity between every two time-series behind the charging or discharging curve of specific cells determines whether they can

be grouped into one pack. Aghabozorgi et al. (2015) summarize five major components of a typical whole time-series clustering solution: representation, distance measurement, prototype, algorithm, and evaluation. Firstly, if necessary, the raw time-series data are represented in a lower-dimensional space to extract significant features and reduce computational complexity. Next, a suitable distance is selected or designed to measure the similarity between two time-series. Then, a prototype and its respective algorithm are utilized to implement clustering. Finally, the clusters are evaluated to determine whether the result is acceptable. Time-series clustering for battery grouping follows the above procedure. In (He et al., 2016), an affinity propagation clustering algorithm is adopted for battery grouping after data cleaning and alignment of the acquired charging and discharging voltage sequences. In (Wang et al., 2017), clustering is achieved via a proposed squeeze algorithm, which compares the acquired curves with templates stored in the database and assigns each curve to the most similar template. In (Li et al., 2015), a modified equal-number K-means clustering is proposed to meet the quantity requirements of the battery pack. These existing methods reduce the inconsistency of battery groups by maximizing the silhouette index of the clustering results or minimizing the average difference among all battery voltage sequences in each cluster.

Nowadays, the rapid development of EVs requires larger volumes of battery production. In a typical battery factory in China, about 300,000 cells are produced per day. This situation creates a new demand for reducing the complexity of the grouping methods. So far, few researchers have focused on this large-scale grouping problem or proposed any solutions.
With the development and promotion of cloud computing, the traditional server room is gradually being replaced by the cloud data center, where data are centrally stored and processed with higher efficiency. In a typical lithium-ion battery grouping process, the charging and discharging data are collected by formation cabinets and sent to host computers for temporary storage. Each host computer manages a formation cabinet group and controls the behavior of all cabinets in the group. After the charging and discharging operations, these host computers stand idle. The collected data are not processed until they are uploaded to the cloud data center. Meanwhile, according to the above analysis, the large-scale grouping problem boils down to a clustering problem for a large volume of sequential data. This can be handled easily by improving the computing power of the cloud data center. However, for a private cloud in a small or medium-sized enterprise, this means that more servers must be deployed and the cost of production will rise. Clearly, this plan will not satisfy the enterprise. Therefore, the major research objectives are to reduce the inconsistency among cells and improve computing efficiency without incurring any cost for extra computing resources.

In recent years, a new paradigm named edge computing has become a research hotspot. It is defined as a technique that enables computation to be performed at the edge of the network (Shi et al., 2016). With this paradigm, the computing burden of the cloud data center can be shared by a group of edge computing devices. On this basis, we put forward our hypothesis that a “cloud-edge” collaborative grouping mode could solve the large-scale grouping problem. The idle host computers are regarded as the edge computing devices for preliminary grouping, and the cloud data center is utilized to integrate all local groups.
Considering that the edge computing devices can work in parallel to improve efficiency, the objectives can be achieved with the help of the distributed time-series clustering technique.

In this paper, we propose a specific two-stage distributed time-series clustering scheme for battery grouping inspired by (Chen et al., 2016; Tong et al., 2017). This scheme mainly contains two steps: local clustering and global merging. The first step, local


clustering, is implemented on host computers. After data preprocessing, sequence alignment and similarity measurement, the remaining collected sequences in each host computer form the local cluster set via a density peaks clustering (DPC) algorithm, and the contour of every cluster is extracted. Next, all sequences and clustering results are uploaded to the cloud data center for global cluster merging, which forms a global cluster set. Finally, the global cluster set is checked and reorganized if similar clusters exist. The implementation of battery grouping is based on the final global cluster set.

The feasibility of the proposed scheme is demonstrated in the following aspects. First, the host computers in the workshop are available for operation and can thus be used as edge computing devices with the requisite software. Second, the clusters can be represented by their contours because the discharging sequences encircled by the contours have similar characteristics and are densely distributed. Third, the global merging procedure uses only cluster contours, which reduces computational complexity. Fourth, the consistency can be largely maintained with proper similarity measurement indices and a proper merging scheme.

This work is driven by the demands of real lithium-ion power battery manufacturers. The procedures of the proposed scheme follow the general dynamic characteristics-based grouping method. Meanwhile, we are inspired by existing advanced distributed schemes and edge computing concepts and try to integrate them into the grouping process. Although our scheme is closely related to the mainstream approaches, we make several innovative improvements. The contributions of this work are listed below:

• A “cloud-edge” collaborative grouping mode is established to utilize the idle computing resources of host computers in the lithium-ion battery industry.
• A two-stage distributed battery grouping scheme that splits the original centralized clustering approach into local clustering and global merging is proposed for consistency and efficiency improvement. These two stages are implemented on edge computing devices and the cloud data center respectively.
• A denoising contour extraction algorithm is designed to detect cluster boundaries of the irregular discharging voltage sequences. The contours are used as effective simplified representations of their respective clusters.
• A multiple-group merging scheme based on three fitness evaluation indices concerning the range of terminal time, relative position and distance is designed for global merging. The merging scheme utilizes only cluster contours in order to improve efficiency.

The contribution of the proposed scheme towards computational load reduction is very important for practical engineering applications because workers need to know the grouping results of cells and start sorting right after the end of the charging and discharging process. A long time delay is therefore unacceptable for the enterprise. For this reason, we make use of the host computers as extra computing resources to achieve fast distributed time-series clustering. Effective algorithms are designed to align inhomogeneous data, extract contour curves and achieve cluster merging for the proposed scheme.

The rest of this paper is organized as follows. Firstly, in Section 2, an overview of the lithium-ion battery grouping process is presented. Next, in Section 3, the proposed scheme is illustrated in detail. Then, in Section 4, experimental results are analyzed and discussed. Finally, Section 5 concludes this paper.

2. Description of lithium-ion power battery grouping process

The grouping process aims to find cells with similar performance


and place them into homogeneous groups to improve consistency. For a particular lithium-ion battery manufacturer in China, a capacity-resistance-based combinatorial method is applied for battery grouping. Firstly, in the battery formation workshop (Fig. 1), cells are mounted on a charging-discharging device called a formation cabinet. This cabinet holds a maximum of 512 cells. A host computer controls the behavior of 10 cabinets and executes operations according to established procedures. The charging-discharging process comprises 9 steps: standing, constant current charging, constant current charging, constant voltage charging, standing, constant current discharging, standing, constant current charging and standing. Data including the dynamic voltage, current and capacity of all 512 cells are measured and transmitted to the host computers at an interval of 1 min. After the last standing step, the terminal voltage of each cell is compared with the acceptance criterion. Qualified cells are divided into different groups according to their capacity values, which are recorded right after the constant current discharging step. Afterwards, the grouped cells are sent to the battery module assembly workshop (Fig. 2) for internal resistance-based grouping. Cells in the same group are arranged in the same module.

To achieve battery grouping using the dynamic characteristics-based method, the charging and discharging data are needed. In this paper, we only utilize the discharging voltage sequences during the whole constant current discharging step because the discharging voltage curve contains enough information to reflect the major characteristics of a lithium-ion battery. In practical applications, the proposed two-stage distributed time-series clustering scheme is implemented in the battery formation workshop. Battery groups are then transported to the module assembly workshop for further grouping to improve consistency.

3. Two-stage distributed grouping scheme

The overall framework of the proposed grouping scheme is presented in Fig. 3. At the battery formation workshop, formation cabinets are managed and controlled by their respective host computers. These host computers carry out operation instructions and collect and store data temporarily. While workers are mounting or demounting cells, the host computers lie idle, which wastes a large amount of computing power. With the edge computing technique, they can all be utilized as additional computing resources and play an important role in the battery grouping process. The cloud data center then only needs to analyse and process the preliminary results rather than the raw data.

The flowchart of the proposed grouping scheme is presented in Fig. 4. Firstly, the whole voltage sequence of each cell in the constant current discharging step (step 6 in Fig. 3) is acquired from the formation cabinets and stored in the host computers. A preliminary local clustering, including data preprocessing, sequence alignment, similarity measurement, local density clustering, and contour extraction, is implemented on each host computer. Afterwards, the local cluster set and cluster contours are uploaded to the cloud data center for global merging, including fitness evaluation and cluster merging. It is worth noting that only the cluster contours participate in the global merging process, which reduces the computation load significantly. A global cluster set is formed according to the merging results. A similarity check is executed to merge possibly similar clusters in this set. After the global cluster set is transmitted back to the host computers, all cells are assigned to different groups. For each group in each host computer, cells are divided into several subgroups according to their capacity values. The unloading instruction is programmed and sent to the formation cabinet groups to guide workers in demounting cells using the indicator light.
In the following sections, all procedures of the two-stage


Fig. 1. Battery formation workshop.

Fig. 2. Battery module assembly workshop.

distributed grouping scheme will be illustrated in detail. Necessary notations are listed in Table 1.

3.1. Local clustering

At the local clustering stage, the aim is to achieve preliminary clustering and extract cluster contours for global merging. The flowchart is shown in the left part of Fig. 4. Note that the local clustering is implemented in parallel by the edge computing devices; thus, the execution time of this stage depends on the device that takes the longest time. The following section elaborates this stage in four parts: data preprocessing, sequence alignment and similarity measurement, local density clustering and contour extraction.

3.1.1. Data preprocessing

The discharging voltage sequence contains abnormalities that

need to be removed before further analysis. According to the technical standard, the capacity value of a qualified cell has a lower bound and the terminal voltage value is supposed to lie in a specific range. The data preprocessing step identifies unqualified cells that violate the standard and removes them from the data set. Fig. 5 shows a comparison of discharging voltage sequences before and after data preprocessing. Fig. 5 (a) shows the original terminal voltage and capacity distribution. The blue lines mark the range of qualification. A cell passes the standard inspection only when its capacity value is larger than the value marked by the vertical line and its terminal voltage value lies between the two values marked by the horizontal lines. Three types of unqualified cells are plotted in different shapes (triangle, circle, and square) and colors (red, green, and yellow). Their corresponding discharging sequences are displayed in Fig. 5 (c). The data preprocessing step realizes data cleaning by removing these abnormal data. In Fig. 5 (b, d), the cyan crosses and sequences represent the qualified cells. Data preprocessing cannot filter out all abnormal data. In Fig. 5 (d), the sequence indicated by the arrow fulfills the standard but shows apparent deviations. It is classified as an outlier and will be removed during the local density clustering.

3.1.2. Sequence alignment and similarity measurement

Due to the measurement delay, the recorded discharging voltage values of different cells in the same batch have slight variations in sampling time. Meanwhile, these sequences have different lengths because of irregular sampling and stipulated stopping rules. The data acquisition unit stops instantly when the voltage of a cell reaches 2.75 V, which is the lower bound. In this situation, traditional distance measures are unable to handle sequences with unequal intervals and time distortion (Folgado et al., 2018).
Therefore, sequence alignment is necessary before similarity measurement and clustering. Dynamic time warping (DTW) (Berndt and Clifford, 1994) can find an optimal match between two sequences through a nonlinear warping of their respective time axes. Given two time-series $X = (x_1, x_2, \ldots, x_{n_X})$ and $Y = (y_1, y_2, \ldots, y_{n_Y})$, DTW calculates a cost matrix $C_{XY} \in$

Fig. 3. Overall framework of the two-stage distributed grouping scheme.

warping path $P = (p_1, p_2, \ldots, p_l)$ with length $l$, where $p_i = (p_i^X, p_i^Y)^T$, $i = 1, 2, \ldots, l$. $p_i^X$ and $p_i^Y$ are the indices of two corresponding values in $X$ and $Y$. Dynamic programming (DP) is employed to find the optimal $P$ with minimal total cost.

Unfortunately, DTW in this form is not appropriate for aligning battery discharging voltage sequences because it does not take the time values into account, and these are an important feature for measuring the discharging performance of a battery. It is important to maintain the time information and align two sequences according to their time value at each sampling time. Therefore, DTW should act on the time axis (time alignment) instead of the voltage axis (value alignment), after which traditional distances (the Euclidean distance in this paper) can be used for similarity measurement. Let the sampling times of $X$ and $Y$ be $T_X = (t_1^X, t_2^X, \ldots, t_{n_X}^X)$ and $T_Y = (t_1^Y, t_2^Y, \ldots, t_{n_Y}^Y)$. The sequence alignment and similarity measurement contain three steps. Firstly, the optimal warping path $P$ between $T_X$ and $T_Y$ is obtained using DTW; then, all corresponding voltage values between $X$ and $Y$ along $P$ are recorded; finally, the Euclidean distance $d_{XY}$ between $X$ and $Y$ is calculated.

The comparison between time and value alignment is shown in Fig. 6. The voltage sequences are obtained from a real discharging process. In this case, the DTW distance between the red and blue sequences is 127.8 by value alignment and 1800.7 by time alignment. To clarify, the distance is not the length of the green line but the vertical distance (representing the difference in discharging voltage value) between two corresponding samples. Obviously, these two curves differ in discharging rate and terminal time; time alignment therefore performs better because it stresses the differences in discharging time between two time-series. When two sequences are aligned using DTW, their lengths change accordingly.
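The three steps of time alignment described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the absolute difference between time stamps as local cost, the classic diagonal/vertical/horizontal step pattern, and the function names `dtw_path` and `time_aligned_distance` are all assumptions made for the example.

```python
import numpy as np


def dtw_path(tx, ty):
    """Return the optimal DTW warping path between two 1-D sequences.

    Assumptions (not taken from the paper): the local cost is the absolute
    difference, and the path may step diagonally, vertically or horizontally.
    The path is a list of 0-based index pairs (i, j).
    """
    n, m = len(tx), len(ty)
    acc = np.full((n + 1, m + 1), np.inf)  # accumulated cost, padded border
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(tx[i - 1] - ty[j - 1])
            acc[i, j] = cost + min(acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1])
    # Backtrack from the last cell to the first one to recover the path.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        candidates = [
            (acc[i - 1, j - 1], i - 1, j - 1),
            (acc[i - 1, j], i - 1, j),
            (acc[i, j - 1], i, j - 1),
        ]
        _, i, j = min(candidates)
        path.append((i - 1, j - 1))
    path.reverse()
    return path


def time_aligned_distance(tx, x, ty, y):
    """Euclidean distance between voltages paired by DTW on the time axes.

    Step 1: warp the sampling times tx and ty.
    Step 2: collect the voltage pairs along the warping path.
    Step 3: compute the Euclidean distance over those pairs.
    The path length is returned as well so that a caller can rescale the
    distance by the aligned length.
    """
    path = dtw_path(tx, ty)
    diffs = np.array([x[i] - y[j] for i, j in path], dtype=float)
    return float(np.sqrt(np.sum(diffs ** 2))), len(path)
```

For two cells sampled at times `[0, 1, 2, 3]` and `[0, 1, 2]` (in minutes), the path pairs equal time stamps and maps the extra final sample of the longer sequence onto the last sample of the shorter one, so a difference in terminal time shows up directly in the voltage distance.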
For the purpose of similarity measurement, the calculated distance should be scaled in accordance with the ratio of the sequence length after alignment to a basic length. For a group of time-series data $\{X_1, X_2, \ldots, X_r\}$, a DTW distance matrix $D$ is formed by $D_{ij} = d_{X_i X_j}$, where $i \in [1:r]$ and $j \in [1:r]$. The calculation of $D$ is a heavy load, especially when $r$ is very large,

therefore it is usually calculated ahead of time before clustering.

Remark 1. The proposed distributed clustering scheme can reduce the computational complexity of $D$ significantly with the help of edge computing devices. Suppose there are $a$ cells and $b$ host computers. The centralized computing structure needs $a(a-1)/2$ similarity measurements in total to obtain $D$, while the distributed computing structure needs only $a(a-b)/(2b)$ in total and $a(a-b)/(2b^2)$ in each host computer. Clearly, the distributed structure is more efficient.
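The counts in Remark 1 can be checked with a few lines of arithmetic. The sketch below assumes that the $a$ cells are split evenly over the $b$ host computers; the function names and the figures used in the usage note are illustrative only.

```python
def centralized_pairs(a):
    """Number of pairwise similarity measurements for a cells: a(a-1)/2."""
    return a * (a - 1) // 2


def distributed_pairs(a, b):
    """Pairwise measurements when a cells are split evenly over b hosts.

    Assumption: b divides a, so every host holds a/b cells.
    Returns (per_host, total), which equal a(a-b)/(2b^2) and a(a-b)/(2b).
    """
    if a % b != 0:
        raise ValueError("this sketch assumes b divides a")
    cells_per_host = a // b
    per_host = cells_per_host * (cells_per_host - 1) // 2
    return per_host, b * per_host
```

As an illustration, with 10 host computers that each manage 10 cabinets of 512 cells (51,200 cells in total), the centralized structure needs 1,310,694,400 measurements, whereas each host needs 13,104,640 and all hosts together need 131,046,400, roughly a tenth of the centralized count.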

3.1.3. Local density clustering

A density-peak clustering (DPC) algorithm (Rodriguez and Laio, 2014) is employed in this paper for time-series clustering. DPC can find cluster centers automatically and discover clusters with arbitrary shapes. Besides, the distance matrix $D$ can be used directly as input for DPC. An ideal cluster center should have two characteristics: (1) a higher local density than its neighbors; (2) a relatively long distance from other centers (Du et al., 2016). The local density $\rho_{X_i}$ is defined as:

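$$\rho_{X_i} = \sum_{j \neq i} \exp\left(-\left(d_{X_i X_j} / d_c\right)^2\right) \tag{1}$$

where the cut-off distance $d_c$ is specified manually. The distance $\delta_{X_i}$ is defined as the minimum distance from a higher density center:

$$\delta_{X_i} = \min_{j:\, \rho_{X_j} > \rho_{X_i}} d_{X_i X_j} \tag{2}$$

For $X_i$ with the largest local density:

$$\delta_{X_i} = \max_{j} d_{X_i X_j} \tag{3}$$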

A decision graph is constructed to select centers with larger $\rho$ and $\delta$ as well as to determine the proper number of cluster centers $n_c$.
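The two quantities plotted in the decision graph, defined in Eqs. (1)-(3), can be computed directly from a precomputed distance matrix. The following is a minimal sketch under the assumption that the matrix is symmetric with a zero diagonal; the function name `density_peaks_quantities` is hypothetical.

```python
import numpy as np


def density_peaks_quantities(D, dc):
    """Compute the local density (Eq. 1) and the distance to the nearest
    point of higher density (Eqs. 2 and 3) from a distance matrix D.

    Assumption: D is square and symmetric with zeros on the diagonal;
    dc is the manually chosen cut-off distance.
    """
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    # Gaussian kernel; subtract the j == i term, which is exp(0) = 1.
    rho = np.exp(-(D / dc) ** 2).sum(axis=1) - 1.0
    delta = np.empty(n)
    for i in range(n):
        higher = rho > rho[i]
        if higher.any():
            delta[i] = D[i, higher].min()  # Eq. (2)
        else:
            delta[i] = D[i].max()          # Eq. (3), densest point
    return rho, delta
```

Points with both a large density and a large distance value are candidate centers, while a point with a large distance value but a very small density is more likely to be an isolated sample.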


Fig. 4. Flowchart of the two-stage distributed grouping scheme.

Data are allocated directly to the centers with the minimum distance in $D$. The membership of each cluster is stored in $S_{id}$. Usually, for the purpose of cluster merging, the number of centers should not be too small.
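The allocation rule stated here, where each sequence joins the center at minimum distance in $D$, can be sketched as follows. The function names and the dictionary used to hold the membership are assumptions for illustration; the centers are taken as already selected from the decision graph.

```python
import numpy as np


def assign_to_centers(D, center_idx):
    """Assign every sample to the selected center with the smallest distance.

    D is the precomputed distance matrix and center_idx lists the row
    indices of the chosen cluster centers. The returned array holds, for
    each sample, the index of the center it belongs to.
    """
    D = np.asarray(D, dtype=float)
    centers = np.asarray(center_idx)
    nearest = np.argmin(D[:, centers], axis=1)  # position within center_idx
    return centers[nearest]


def membership(labels):
    """Group sample indices by their center (a stand-in for S_id)."""
    groups = {}
    for i, c in enumerate(labels):
        groups.setdefault(int(c), []).append(i)
    return groups
```

Because the diagonal of $D$ is zero, each center is assigned to itself, so every selected center yields a non-empty cluster.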

Owing to the detrimental effect of outliers on clustering and contour extraction, they should be removed beforehand. Inspired by the idea of the density-based spatial clustering of applications with noise (DBSCAN) algorithm (Ester et al., 1996), outliers are identified


Table 1. The notation list.

Symbol                Description
$X$                   A time-series
$Y$                   A time-series
$x$                   Sampling value in $X$
$y$                   Sampling value in $Y$
$n_X$                 Number of values in $X$
$n_Y$                 Number of values in $Y$
$C_{XY}$              Cost matrix of $X$ and $Y$
$P$                   Optimal warping alignment path
$p$                   Value pair in $P$
$l$                   Length of $P$
$t$                   Sampling time
$T_X$                 Sampling time sequence of $X$
$T_Y$                 Sampling time sequence of $Y$
$d_{XY}$              Euclidean distance between $X$ and $Y$
$r$                   Number of time-series in a group
$D$                   Dynamic time warping (DTW) distance matrix
$a$                   Number of cells
$b$                   Number of host computers
$\rho$                Local density
$d_c$                 Cut-off distance
$\delta$              Minimum distance from higher density centers
$n_c$                 Number of cluster centers
$S_{id}$              Membership of clusters
$\varepsilon$         Radius parameter of the DBSCAN algorithm
$m_p$                 Density threshold of the DBSCAN algorithm
$S_V$                 Discharging voltage set
$S_T$                 Discharging time set
$ic$                  Identifier of cluster
$\alpha$              Noise level
$\lambda$             Threshold parameter for boundary curve smoothing
$I$                   Set of indices that belong to cluster $ic$
$B_v$                 Voltage sequence with the largest terminal time in $S_T(I)$
$l_t$                 Length of $B_t$
$l_n$                 Length of $I$
$S_{TC}$              Set of time values that have the minimal difference with $B_t$
$S_{VC}$              Corresponding set of voltage values of $S_{TC}$
$B_u$                 Upper boundary of a cluster
$B_l$                 Lower boundary of a cluster
$B_t$                 Corresponding time sequence of $B_u$ and $B_l$
$u$                   Number of clusters in a cluster group
$v$                   Number of clusters in a cluster group
$n_f$                 Number of clusters in the final global cluster group
$i_T(\cdot)$          Terminal time index
$i_P(\cdot)$          Relative position index
$i_D(\cdot)$          Distance index
$t_u$                 Terminal time of the upper boundary curve
$t_l$                 Terminal time of the lower boundary curve
$C$                   Cluster contour
$D_{bin}(\cdot)$      Binary distance metric
$D_{l_1}(\cdot)$      Average $l_1$-norm distance metric
$i_S(\cdot)$          Synthesized fitness index
$I_T$                 Terminal time index table
$I_P$                 Relative position index table
$I_D$                 Distance index table
$I_S$                 Synthesized fitness index table
$T_s$                 Fitness degree threshold
$C_r$                 Consistency ratio
$T_r$                 Time ratio
$n$                   Total number of sequences in the discharging sequence set
$m$                   Number of groups after clustering
$n_i$                 Number of discharging sequences in group $i$
$T_D$                 Time cost for the $D$ matrix establishment (centralized scheme)
$T_C$                 Time cost for clustering (centralized scheme)
$T_{D_i}$             Time cost for the $D$ matrix establishment of group $i$ (distributed scheme)
$T_{C_i}$             Time cost for local clustering of group $i$ (distributed scheme)
$T_M$                 Time cost for global merging (distributed scheme)

by scanning the number of neighbors within a specified radius $\varepsilon$. Those with fewer neighbors than a given threshold $m_p$ are regarded as outliers. Although the parameter setting of DBSCAN is subjective, reference values can be set according to experience and manual search. In general, outliers are few in number and differ noticeably from normal samples; thus, they can be easily separated with proper $\varepsilon$ and $m_p$.

3.1.4. Contour extraction

Contour is defined as a set of boundary points in a cluster. As


[Fig. 5 panels: (a) and (b) terminal voltage and capacity distribution, terminal voltage (mV) versus capacity (mAh); (c) and (d) discharging voltage curves, discharging voltage (mV) versus time (min), with outliers annotated.]

Fig. 5. Comparison of discharging sequences before and after data preprocessing.

Value alignment

4000 3800 3600 3400 3200 3000 2800 2600

0

10

20

30

40

50

Time alignment

4200

Dicharging voltage (mV)

Dicharging voltage (mV)

4200

60

70

80

Time (min) (a)

4000 3800 3600 3400 3200 3000 2800 2600

0

10

20

30

40

50

60

70

80

Time (min) (b)

Fig. 6. Comparison between time and value alignment.

shown in Fig. 7, the contour of a time-series cluster comprises an upper boundary and a lower boundary, which correspond to two sequences that encase most of the samples in this cluster. These two boundaries contain vital information such as the variation trend and the range of terminal time. To reduce computation time, only the contour of each local cluster participates in global merging. Cluster contours are therefore crucial to the quality of merging.

The contour extraction algorithm is depicted as Algorithm 1. The input contains the discharging voltage set $S_V$ and time set $S_T$, the identifier of the cluster $ic$, the assumed noise level $\alpha$, the membership of each cluster $S_{id}$ and a threshold $\lambda$. The output contains the upper boundary $B_u$, the lower boundary $B_l$ and the corresponding time sequence of the two boundaries $B_t$. To account for the irregular sampling problem, Algorithm 1 finds the discharging time sequence with the largest terminal time and regards it as the time reference $B_t$ of the boundaries. Next, for each time sequence in $S_T$, it finds the time value that has the minimal sampling time difference with a specific value in $B_t$. The corresponding voltage value is stored as a boundary sample candidate. This step repeats until all values in $B_t$ are checked. A threshold $\lambda$ is set to smooth the boundaries: if the minimal sampling time difference is larger than $\lambda$, voltage values might change dramatically and the obtained boundaries would be rugged. In that case, the original voltage value corresponding to the specific time value in $B_t$ becomes the boundary sample candidate instead. Then, the voltage values in $S_{VC}$ are sorted in ascending order. The samples of the two boundaries are composed of the first and last $\alpha$ percent of values in order to remove noise and disturbances. The above procedure is repeated until all samples of the upper and lower boundaries are found.

Algorithm 1. Contour extraction
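As a rough illustration of the steps described above, the following Python sketch walks through one possible reading of the contour extraction procedure. It is not the authors' listing. The function name `extract_contour` is hypothetical, and the way the noise level $\alpha$ is turned into boundaries (taking the $(1-\alpha)$ and $\alpha$ empirical quantiles of the sorted candidates) is an assumption, since the text only states that the first and last $\alpha$ percent of values are used.

```python
import numpy as np

def extract_contour(S_V, S_T, S_id, ic, alpha=0.95, lam=1.5):
    """Sketch of contour extraction for cluster `ic` (hypothetical helper).

    S_V, S_T : lists of voltage / time sequences (possibly unequal lengths)
    S_id     : cluster label of each sequence
    alpha    : noise level; boundaries are taken as the (1 - alpha) and alpha
               quantiles of the candidates (an assumption, see lead-in)
    lam      : maximal tolerated sampling time difference
    """
    I = [k for k, label in enumerate(S_id) if label == ic]
    if not I:
        raise ValueError("cluster has no members")
    # Reference sequence: the member with the largest terminal time.
    ref = max(I, key=lambda k: S_T[k][-1])
    B_t = np.asarray(S_T[ref], dtype=float)
    B_v = np.asarray(S_V[ref], dtype=float)
    B_u = np.empty_like(B_t)
    B_l = np.empty_like(B_t)
    for p, t in enumerate(B_t):
        candidates = []
        for k in I:
            times = np.asarray(S_T[k], dtype=float)
            q = int(np.argmin(np.abs(times - t)))  # closest sampling time
            if abs(times[q] - t) <= lam:
                candidates.append(float(S_V[k][q]))
            else:
                # Too far in time: fall back to the reference voltage.
                candidates.append(float(B_v[p]))
        S_VC = np.sort(candidates)  # ascending order, as in the description
        B_l[p] = np.quantile(S_VC, 1.0 - alpha)
        B_u[p] = np.quantile(S_VC, alpha)
    return B_u, B_l, B_t

# Toy data (not from the paper): two sequences, the second ends early.
B_u, B_l, B_t = extract_contour(
    S_V=[[4000, 3800, 3600, 3000], [4100, 3900]],
    S_T=[[0, 1, 2, 3], [0, 1]],
    S_id=[0, 0], ic=0, alpha=1.0, lam=1.5,
)
print(B_u, B_l, B_t)
```

With `alpha=1.0` the boundaries reduce to the pointwise maximum and minimum of the candidates; at the last time value the short sequence is more than `lam` away, so the reference voltage is used for it.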

Fig. 7. Illustration of the cluster contour.

3.2. Global merging

The global merging process finds similar clusters at different local sites and amalgamates them by analyzing their contours. A two-group merging scheme is illustrated in Fig. 8. Suppose group 1 has $u$ clusters and group 2 has $v$ ($v < u$) clusters; global merging then searches group 1 for a matching cluster for each cluster in group 2. Two matching clusters are expected to have a high fitness degree. If no match is found for a cluster, it is considered independent and added to the cluster group as a new cluster. Therefore, the number of clusters in the final cluster group, $n_f$, fulfils $n_f \geq u$. It is worth noting that cluster matching and merging are executed alternately, namely the merging operation starts right after two matching clusters are found. In this way, the occurrence of erroneous matches can be reduced. For example, if the fitness degrees between cluster 2 in group 2 and all clusters in group 1 change after the first merging operation, cluster 2 can still be matched to the most appropriate cluster, whereas a one-stop matching solution would not consider those changes. To merge multiple groups, a basic cluster group is determined ahead of time, then this two-group merging scheme is repeated until all groups are merged. The flowchart of global merging is shown in the right part of Fig. 4. Details about the fitness evaluation index computation and the cluster merging approach are elaborated in the following sections.
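The alternating match-and-merge idea can be sketched in Python as below. This is a minimal sketch under stated assumptions, not the authors' implementation: the names `merge_two_groups`, `merge_all_groups`, `overlap_fitness` and `hull` are hypothetical, each "contour" is reduced to a toy interval of terminal times, and a simple overlap ratio stands in for the fitness degree. Treating all remaining target clusters as new clusters once the best remaining score falls below the threshold is likewise one reading of the text.

```python
def overlap_fitness(a, b):
    """Toy fitness: overlap length divided by the shorter interval length."""
    overlap = min(a[1], b[1]) - max(a[0], b[0])
    return overlap / min(a[1] - a[0], b[1] - b[0])

def hull(a, b):
    """Toy merge: smallest interval covering both inputs."""
    return (min(a[0], b[0]), max(a[1], b[1]))

def merge_two_groups(basic, target, fitness, merge, threshold):
    """Merge `target` clusters into `basic`, re-scoring after every merge."""
    basic = list(basic)
    pending = list(target)
    while pending:
        # Recompute all pairwise scores: merged contours may have changed.
        score, bi, ti = max(
            ((fitness(b, t), bi, ti)
             for bi, b in enumerate(basic)
             for ti, t in enumerate(pending)),
            key=lambda item: item[0],
        )
        if score > threshold:
            basic[bi] = merge(basic[bi], pending.pop(ti))
        else:
            # No remaining cluster matches: each becomes a new basic cluster.
            basic.extend(pending)
            pending.clear()
    return basic

def merge_all_groups(groups, fitness, merge, threshold):
    """Use the group with the most clusters as the basic group, fold in the rest."""
    basic_idx = max(range(len(groups)), key=lambda g: len(groups[g]))
    merged = list(groups[basic_idx])
    for g, group in enumerate(groups):
        if g != basic_idx:
            merged = merge_two_groups(merged, group, fitness, merge, threshold)
    return merged

# Toy intervals (not from the paper).
groups = [[(50, 60), (62, 70), (75, 80)], [(55, 61), (90, 95)]]
result = merge_all_groups(groups, overlap_fitness, hull, threshold=0.6)
print(result)
```

In this toy run the interval (55, 61) is absorbed by (50, 60), while (90, 95) overlaps nothing and is appended as a new cluster, so the final group has at least as many clusters as the basic group.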

3.2.1. Fitness evaluation

Fitness evaluation is realized by a collaborative decision-making process among several fitness indices. A fitness index is defined as a metric that determines whether two clusters can be merged into one. In this paper, three types of fitness indices are designed: the terminal time index $i_T$, the relative position index $i_P$, and the distance index $i_D$. Fig. 9 illustrates the contrast between two contours in different high and low fitness situations. The terminal time index $i_T$ measures the overlap ratio of the range of terminal time between the upper and lower boundary curves. It is defined as:


Fig. 8. A two-group merging scheme.

Fig. 9. Illustration of three fitness indices.



$$
i_T(C^1, C^2) = \frac{\min(t_u^1, t_u^2) - \max(t_l^1, t_l^2)}{\min(t_u^1 - t_l^1,\; t_u^2 - t_l^2)}
\tag{4}
$$

where $t_u^i$ and $t_l^i$ ($i = 1, 2$) are the terminal times of the upper and lower boundary curves of two different contours $C^1$ and $C^2$. For compatible situations (Fig. 9 (a-1)), where the overlap ratio is higher than 0, $i_T$ is bounded by 0 and 1, and a larger $i_T$ means higher fitness. $i_T$ becomes negative when leftward or rightward deviations occur (Fig. 9 (a-2, a-3)); such cases are considered low fitness.

The relative position index $i_P$ measures the consistency of the relative position between two boundary curves. It is defined as:

$$
\begin{aligned}
i_P(C^1, C^2) &= D_{bin}(B_u^1, B_u^2)\, D_{bin}(B_l^1, B_l^2) \\
D_{bin}(B_u^1, B_u^2) &= \frac{1}{l_u} \sum_{i=1}^{l_u} \mathrm{sgn}\big( DTW(B_u^1)(i) - DTW(B_u^2)(i) \big) \\
D_{bin}(B_l^1, B_l^2) &= \frac{1}{l_l} \sum_{j=1}^{l_l} \mathrm{sgn}\big( DTW(B_l^1)(j) - DTW(B_l^2)(j) \big)
\end{aligned}
\tag{5}
$$

where $B_u^i$ and $B_l^i$ ($i = 1, 2$) are the upper and lower boundary curves of two contours $C^1$ and $C^2$, and $D_{bin}$ is a binary distance metric. Two boundary curves are first aligned via the DTW technique, and the difference between each pair of corresponding points is mapped to 0 or 1 using the $\mathrm{sgn}(\cdot)$ function. On this basis, $D_{bin}$ is defined as the average binary distance, which fulfills $D_{bin} \in [0, 1]$, and $i_P$ is the product of $D_{bin}(B_u^1, B_u^2)$ and $D_{bin}(B_l^1, B_l^2)$. In the high fitness situation of Fig. 9 (b-1), one of the contours surrounds or is surrounded by the other, as shown by the green and blue dotted curves. In this case, $i_P$ approaches 0 and 1 for the green and blue contours respectively. On the contrary, $i_P$ fluctuates around the median value in situations like Fig. 9 (b-2, b-3), which indicates low fitness.

The distance index $i_D$ measures the similarity between two contours intuitively. It is defined as:

$$
\begin{aligned}
i_D(C^1, C^2) &= D_{l_1}(B_u^1, B_u^2)\, D_{l_1}(B_l^1, B_l^2) \\
D_{l_1}(B_u^1, B_u^2) &= \frac{1}{l_u} \left\| DTW(B_u^1) - DTW(B_u^2) \right\|_1 \\
D_{l_1}(B_l^1, B_l^2) &= \frac{1}{l_l} \left\| DTW(B_l^1) - DTW(B_l^2) \right\|_1
\end{aligned}
\tag{6}
$$

where $D_{l_1}$ is the average $l_1$-norm distance between two curves after they are aligned by DTW. $i_D$ pursues the proximity of each pair of corresponding points; a high fitness situation is shown in Fig. 9 (c-1). Fig. 9 (c-2, c-3) shows several low fitness situations with regular and irregular deviations. Although $i_D$ conflicts with $i_P$ in the regular deviation situation, they can reach a compromise for the best compatibility.

3.2.2. Cluster merging

To find the two clusters with the highest fitness, the three indices need to be integrated. In this paper, a synthesized fitness evaluation index $i_S$ is formed by the product of $i_T$, $i_P$ and $i_D$ after standardization. Given two groups with $u$ and $v$ ($v < u$) clusters, three fitness index tables $I_T$, $I_P$ and $I_D$ are established, and the synthesized index is:

$$
i_S(C^1, C^2) = i_T(C^1, C^2)\, i_P(C^1, C^2)\, i_D(C^1, C^2)
\tag{7}
$$

All $i_S$ form the synthesized fitness index table $I_S$; the column and row of the maximal $i_S$ in each column define the merging cluster pair. To form a new cluster, the contours of the two original clusters should be merged. The contour extraction algorithm (Algorithm 1) is employed again to find the new contour among the 4 boundary curves. According to our group merging scheme, the above procedures are repeated to merge multiple groups. The synthesized fitness evaluation index $i_S$ determines whether two clusters can be merged. If the indices between a target cluster and all clusters in the basic group are low, this cluster becomes a new cluster of the basic group. To measure the level of fitness degree, a threshold $T_s$ is set. If any $i_S$ between a target cluster and the clusters in the basic group is larger than $T_s$, they can be merged. Otherwise, the target cluster forms a new basic cluster.

4. Experiment and discussion

4.1. Experiment configuration, process and result

In this section, the proposed improved distributed grouping scheme is tested with real discharging voltage sequences acquired from a complete discharging process in the grouping procedure of a lithium-ion power battery manufacturer in China. The manufactured battery cells are standard 18650-size cylindrical cells with 2000 mAh nominal capacity, 3.7 V nominal voltage, and internal resistance less than 60 mΩ. During the discharging process, the discharging current is fixed at 2000 mA and the terminal voltage is set to 2.75 V. The process stops immediately when the measured voltage reaches the terminal value. The interval of data collection is set to 1 min; however, the real sampling sequences are usually inhomogeneous. In our experiments, data were collected by 4 independent formation cabinets from 4 different cabinet groups. All batteries are of the same type and from an identical batch. The total of 1974 discharging sequences comprises 4 groups (A, B, C and D) with 496, 491, 500 and 487 sequences respectively. The local clustering procedure should be implemented on all 4 groups simultaneously by their corresponding host computers. In our experiments, the whole process was completed on a desktop PC with an Intel Core i7-6700 CPU (3.40 GHz clock speed) and 16 GB RAM.

After the data preprocessing step, abnormal samples that violate the technical standard were removed from the dataset. The numbers of sequences were reduced to 460, 466, 399 and 471 correspondingly. The DTW distance matrix for each group was computed ahead of time.

Results of local density clustering are shown in Fig. 10. First, a proper radius and threshold are selected to achieve the secondary data screening. The numbers of samples are further reduced to 443, 459, 388 and 455. Decision graphs are employed to determine the number of clusters and the cluster centers. From Fig. 10 (a-1) to (a-4), it is clear that the samples marked by red triangles have relatively larger local density $\rho$ and distance $\delta$ than the others; thus they are selected as cluster centers and the other samples are assigned to their nearest center. The numbers of clusters for groups A, B, C and D are 4, 3, 3 and 3. Clustering results are shown in Fig. 10 (b-1) to (b-4).

After the local density clustering, the contour of each cluster in each group was extracted using Algorithm 1. These contours are displayed as black curves in Fig. 11 (a-1) to (a-4). For convenient demonstration, we separate them by translating along their voltage axes. The percentage $\alpha$ is set to 95% to reduce the influence of noise and avoid quick expansion of contours. The threshold $\lambda$ is set to 1.5 min for boundary smoothing. The threshold $T_s$ is set to 0.6. In real application, these works should be finished on the host computers and all contours will be transmitted to the cloud data center for global merging.

According to the proposed two-group merging scheme, group A is selected as the basic unit because it has the maximum number of clusters. The results of global merging are shown in Fig. 11 (b-1) to (b-4) and the decision-making process is shown in Fig. 12. The first merging process between contour group A and contour group B is illustrated in Fig. 11 (b-2) and Fig. 12 (a). Synthesized fitness index tables are calculated to determine the most suitable matching pair. In this case, those pairs are B2-A1, B3-A3 and B1-A1,

Fig. 10. Results of local density clustering. Panels (a-1) to (a-4): decision graphs A to D with the selected cluster centers labeled; panels (b-1) to (b-4): clustering results A to D, voltage (mV) versus time (min).

Fig. 11. Contour extraction and global merging.

Fig. 12. The decision-making process of global merging.

as shown in Fig. 11 (a-2). The synthesized fitness indices between all merged pairs are larger than $T_s$. In Fig. 12 (a), according to the synthesized fitness index tables, cluster B2 is selected and merged with cluster A1 because of its high $i_S$. To clarify, $i_S = 1$ does not mean that these two clusters are identical; it is the result of data standardization. After the first merging step, B2 is deleted and the table is recalculated. Next, B3 is merged with A3 and deleted from the table. At last, B1 is merged with A1 and the first two-group merging finishes. In Fig. 11 (b-2), the integration of B into A expands the original A1 and A3 contours. Group C is then merged with the new basic group A. In Fig. 11 (b-3) and Fig. 12 (b), the merging pairs are C1-A1, C2-A1 and C3-A3, and the contours of A1 and A3 expand due to the integration. In Fig. 11 (b-4) and Fig. 12 (c), the merging pairs are D1-A1 and D2-A3. In this case, the contours of A1 and A3 remain basically unchanged and assimilate D1 and D2 directly. In the last fitness index table between D3 and A in Fig. 12 (c), none of the indices is larger than $T_s$; thus D3 becomes a new basic cluster, namely A5. A comparison between all contours before and after merging is shown in Fig. 11 (b-1) and (a-1); it is clear that the proposed scheme can achieve multiple cluster merging effectively. In real application, these works can be completed through big data processing on the cloud data center.

To avoid inappropriate local clustering or cluster merging, a check of the global cluster set is executed at last. The fitness indices between any two clusters in the global cluster set are calculated. The fitness index table is shown in Fig. 13. The asymmetry of the fitness index table is due to the operation of data standardization. Therefore, the correct fitness index between A1 and A2, for example, should be the element in row A2 and column A1. None of these indices exceeds $T_s$, which means the clustering and merging results are acceptable.

4.2. Performance evaluation and comparative analysis

To test the performance and prove the validity of our improved distributed grouping scheme, two assessment indices, the consistency ratio $C_r$ and the time ratio $T_r$, are proposed in this paper. The consistency ratio $C_r$ is defined as the ratio between the weighted average of the average DTW distances within all clusters and the total average DTW distance within the original set without clustering:

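$$
\begin{aligned}
C_r &= \frac{\text{Cluster-level weighted average distance}}{\text{Total average distance}} \times 100\% \\
&= \frac{\sum_{i=1}^{m} \frac{n_i}{\sum_i n_i} \cdot \frac{2}{n_i (n_i - 1)} \sum_{j=1}^{n_i - 1} \sum_{k=j+1}^{n_i} \left\| DTW(X_j) - DTW(X_k) \right\|_2}{\frac{2}{n (n - 1)} \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \left\| DTW(X_i) - DTW(X_j) \right\|_2} \times 100\% \\
&= \frac{\frac{n (n - 1)}{\sum_i n_i} \sum_{i=1}^{m} \frac{1}{n_i - 1} \sum_{j=1}^{n_i - 1} \sum_{k=j+1}^{n_i} \left\| DTW(X_j) - DTW(X_k) \right\|_2}{\sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \left\| DTW(X_i) - DTW(X_j) \right\|_2} \times 100\%
\end{aligned}
\tag{8}
$$

Fig. 13. Check of global cluster set.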

where $n$ is the total number of sequences in the discharging sequence set, which is presented as $\{X_1, X_2, \ldots, X_n\}$. The whole set is divided into $m$ groups $\{\{X_1, \ldots, X_{n_1}\}, \ldots, \{X_1, \ldots, X_{n_m}\}\}$ after the improved distributed grouping process, where $n_i$ ($i = 1, 2, \ldots, m$) is the number of sequences within a group. On the basis of $C_r$, the inconsistency reduction rate can be represented as $1 - C_r$.

To evaluate the performance of the proposed distributed grouping scheme, a centralized grouping scheme based on DPC, three relevant works and a baseline are adopted for comparison:

• Centralized grouping scheme based on DPC: This scheme is a centralized one-stage version of the proposed scheme. The difference is that the local density clustering based on DPC acts on the whole discharging voltage set to obtain the grouping results directly.
• Grouping scheme based on affinity propagation (AP) (He et al., 2016): This scheme applies wavelet denoising and AP-based clustering with the DTW distance for battery grouping. We omit the first step because the noise level of the experimental data is low. Meanwhile, we use time alignment for comparison.
• Grouping scheme based on curve fitting and improved k-means (Huang et al., 2018): In this scheme, the parameters of the polynomial fitting model are used to represent the original sequence. DPC is employed to determine the initial cluster centers and the k-means algorithm is used for battery grouping.
• Grouping scheme based on fuzzy c-means (Guo and Liu, 2012): The original scheme does not consider the irregularity of the voltage sequences; therefore, we use the curve fitting technique to achieve uniform sampling. The fuzzy c-means algorithm with Manhattan distance is then used for battery grouping. Because the discharging times differ, we use the terminal voltage value to extend the shorter sequences.
• Random grouping scheme: This scheme regards the whole dataset as one group. It serves as a baseline for all other schemes.
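To make the definition of $C_r$ in Eq. (8) concrete, the sketch below computes it from a precomputed pairwise distance matrix and a label vector. This is a minimal sketch, assuming the matrix `D` already holds the pairwise distances (DTW distances in the paper's setting; plain absolute differences in the toy example). The function names, the `noise_label` convention for screened-out samples, and the choice to give singleton groups an average distance of zero are all assumptions made for illustration.

```python
import numpy as np

def mean_pairwise(D, idx):
    """Average of D over all unordered pairs of the given indices."""
    idx = np.asarray(idx)
    n = len(idx)
    if n < 2:
        return 0.0  # assumption: a singleton group contributes zero distance
    sub = D[np.ix_(idx, idx)]
    upper = np.triu_indices(n, k=1)
    return float(sub[upper].mean())

def consistency_ratio(D, labels, noise_label=-1):
    """C_r in percent: weighted within-group average over the total average."""
    D = np.asarray(D, dtype=float)
    labels = np.asarray(labels)
    # Denominator: average distance over the whole original set (n samples).
    total_avg = mean_pairwise(D, np.arange(len(labels)))
    # Numerator: group averages weighted by group size n_i.
    groups = np.unique(labels[labels != noise_label])
    sizes = np.array([np.sum(labels == g) for g in groups], dtype=float)
    avgs = np.array([mean_pairwise(D, np.flatnonzero(labels == g)) for g in groups])
    weighted_avg = float(np.sum(sizes * avgs) / np.sum(sizes))
    return 100.0 * weighted_avg / total_avg

# Toy example (not from the paper): four points on a line, two groups.
x = np.array([0.0, 1.0, 10.0, 12.0])
D = np.abs(x[:, None] - x[None, :])
c_r = consistency_ratio(D, labels=[0, 0, 1, 1])
print(f"C_r = {c_r:.1f}%, inconsistency reduced by {100.0 - c_r:.1f}%")
```

In the toy example the within-group averages are 1 and 2, the weighted average is 1.5 and the total average is 7.5, so $C_r$ is 20% and $1 - C_r$ is 80%.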


The maximal, minimal and average DTW distances in each group, the calculated total weighted average distance, $C_r$ and $1 - C_r$ obtained by the above six grouping schemes, as well as their computation times, are listed in Table 2. From the table, we observe that the groups obtained via the distributed and centralized schemes share some similar measurement indices. The maximal, minimal and average distances are alike. All indices for the 4th distributed group (G14) and the 4th centralized group (G24) are identical. A majority of samples are assigned to G11 via the distributed scheme, whereas they are further assigned into G22, G23 and G25 via the centralized scheme. G15 achieves the best performance in all measurement indices. The obtained weighted average distances are 239.56 via the distributed scheme and 213.29 via the centralized scheme, and the corresponding $C_r$ are 56.44% and 50.25%, which means the inconsistency among batteries is reduced by 43.56% and 49.75% compared with the baseline. In short, the proposed scheme achieves relatively high performance that approaches that of the centralized scheme. This result is acceptable considering the savings in time cost.

The proposed distributed grouping scheme performs the best in reducing inconsistency compared with all three schemes in the literature. The AP-based scheme cannot achieve an effective grouping division for the experimental data; thus its $C_r$ is very high. Meanwhile, the computation of the DTW distance matrix increases its computation time. The k-means-based scheme extracts useful features through curve fitting for dimensionality reduction, which makes it the fastest algorithm. However, the parameters of the fitting model do not contain time information, which leads to inaccurate similarity measurement. The fuzzy c-means-based scheme maintains the time information, whereas its $C_r$ is still not satisfactory because of the lack of outlier detection and the insufficient precision of curve fitting.

Further, for both the centralized and distributed schemes, we analyse the time consumption of each step. The time ratio $T_r$ is defined as the ratio of the computational time cost of the distributed clustering scheme to that of the centralized clustering scheme for battery grouping, which is represented as:

Table 2. The performance measurements of grouping schemes.

| Battery grouping scheme | Group | Number of samples | Maximum distance | Minimum distance | Average distance | Total weighted average distance | $C_r$ | $1 - C_r$ | Computation time |
|---|---|---|---|---|---|---|---|---|---|
| Distributed grouping scheme based on DPC | G11 | 1174 | 952.94 | 6.13 | 261.70 | 239.56 | 56.44% | 43.56% | 126.56 s |
| | G12 | 113 | 733.40 | 9.34 | 200.33 | | | | |
| | G13 | 267 | 592.08 | 6.87 | 184.38 | | | | |
| | G14 | 49 | 892.87 | 18.37 | 330.74 | | | | |
| | G15 | 142 | 527.53 | 8.00 | 160.16 | | | | |
| Centralized grouping scheme based on DPC | G21 | 399 | 970.81 | 6.87 | 250.15 | 213.29 | 50.25% | 49.75% | 1775.51 s |
| | G22 | 643 | 711.49 | 9.86 | 204.47 | | | | |
| | G23 | 390 | 599.74 | 7.77 | 171.00 | | | | |
| | G24 | 49 | 892.87 | 18.37 | 330.74 | | | | |
| | G25 | 286 | 1065.23 | 6.13 | 212.15 | | | | |
| Grouping scheme based on AP | G31 | 1669 | 2569.82 | 6.13 | 356.62 | 375.45 | 88.46% | 11.54% | 1779.74 s |
| | G32 | 127 | 2320.35 | 12.10 | 622.88 | | | | |
| Grouping scheme based on curve fitting and improved K-means | G41 | 665 | 1349.13 | 7.16 | 314.70 | 335.45 | 79.03% | 20.97% | 1.64 s |
| | G42 | 401 | 1033.34 | 9.86 | 284.18 | | | | |
| | G43 | 402 | 1465.66 | 6.13 | 279.81 | | | | |
| | G44 | 294 | 2208.39 | 9.34 | 511.94 | | | | |
| | G45 | 34 | 1466.17 | 58.62 | 477.87 | | | | |
| Grouping scheme based on Fuzzy C-Means | G51 | 542 | 1297.22 | 7.77 | 273.99 | 298.54 | 70.34% | 29.66% | 10.58 s |
| | G52 | 58 | 1994.78 | 18.37 | 477.74 | | | | |
| | G53 | 354 | 1249.52 | 9.86 | 291.65 | | | | |
| | G54 | 409 | 1607.61 | 6.13 | 265.44 | | | | |
| | G55 | 433 | 1178.45 | 9.34 | 342.16 | | | | |
| Baseline: Random grouping scheme | \ | 1796 | 2703.79 | 6.13 | 424.45 | 424.45 | 100% | 0% | \ |
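As a rough arithmetic check of the distributed-scheme rows of Table 2, the snippet below recomputes the size-weighted average distance from the per-group sample counts and average distances, and divides it by the baseline average distance. The variable names are hypothetical; the numbers are copied from Table 2. Because the tabulated averages are already rounded, the recomputed weighted average is only expected to agree with the reported 239.56 to within about 0.01, while the resulting $C_r$ rounds to the reported 56.44%.

```python
# Per-group values of the distributed scheme (G11..G15), copied from Table 2.
sizes = [1174, 113, 267, 49, 142]
averages = [261.70, 200.33, 184.38, 330.74, 160.16]
baseline_average = 424.45  # average distance of the random-grouping baseline

# Size-weighted average of the within-group average distances.
weighted_average = sum(n * a for n, a in zip(sizes, averages)) / sum(sizes)

# Consistency ratio and inconsistency reduction, in percent.
c_r = 100.0 * weighted_average / baseline_average
reduction = 100.0 - c_r

print(f"weighted average distance: {weighted_average:.2f}")
print(f"C_r: {c_r:.2f}%   1 - C_r: {reduction:.2f}%")
```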

$$
T_r = \frac{\max\left(T_i^D + T_i^C\right) + T^M}{T^D + T^C} \times 100\%, \quad i = 1, 2, \ldots, m
\tag{9}
$$

where $T^D$ and $T^C$ are the time costs for DTW distance matrix establishment and clustering of the centralized clustering scheme, $T_i^D$ and $T_i^C$ are the time costs for DTW distance matrix establishment and local clustering of each group of the distributed clustering scheme, and $T^M$ is the time cost for global merging. The comparison of computation time between the centralized and distributed schemes is shown in Fig. 14. The main computational burden is the DTW distance matrix establishment step. It costs 1774.59 s for the centralized scheme and 126.17 s for the longest grouping duration of the distributed scheme. Due to the efficiency of the DPC algorithm, the clustering process takes only 0.92 s for the centralized scheme. Likewise, the clustering and global merging processes take 0.96 s in total for the distributed scheme. Therefore, the computation times of the centralized and distributed schemes are 1775.51 s and 126.56 s. The time ratio $T_r$ is equal to 7.13%, which means the proposed scheme reduces the time cost by about 92.87% compared with the existing centralized scheme.
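The time ratio of Eq. (9) can be sketched as a small Python function. The function name `time_ratio` and the per-site timings in the first example are hypothetical and chosen only to make the arithmetic easy to follow; the last lines simply divide the two total computation times reported above.

```python
def time_ratio(local_costs, t_merge, t_d_central, t_c_central):
    """T_r in percent, following Eq. (9).

    local_costs : list of (T_i^D, T_i^C) pairs, one per local site
    t_merge     : global merging time T^M
    t_d_central : centralized distance-matrix time T^D
    t_c_central : centralized clustering time T^C
    """
    # Local sites run in parallel, so only the slowest one counts.
    slowest_site = max(t_d + t_c for t_d, t_c in local_costs)
    return 100.0 * (slowest_site + t_merge) / (t_d_central + t_c_central)

# Hypothetical timings in seconds (not from the paper).
example = time_ratio(
    local_costs=[(10.0, 1.0), (12.0, 0.5), (8.0, 2.0)],
    t_merge=0.5, t_d_central=95.0, t_c_central=5.0,
)
print(f"hypothetical T_r: {example:.2f}%")

# Ratio of the reported totals: 126.56 s (distributed) vs 1775.51 s (centralized).
reported = 100.0 * 126.56 / 1775.51
print(f"reported T_r: {reported:.2f}%   time saved: {100.0 - reported:.2f}%")
```

Taking the maximum over sites rather than the sum reflects that the host computers work simultaneously, which is where the saving comes from.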

4.3. Discussion

In this paper, we introduce the "cloud-edge" collaborative mode into the battery grouping process and propose a "local-global" two-stage distributed grouping scheme. Our work is oriented towards the demand of massive battery production. The aim is to optimize the original grouping scheme, reduce inconsistency rates and increase production efficiency. Compared with existing approaches, our scheme puts forward the idea that the idle host computers at the workshop can be utilized as edge computing resources. Meanwhile, we focus on the characteristics of the discharging voltage sequences and propose an effective algorithm to extract cluster contours after local clustering. Only these contours are involved in global merging, which reduces the computational complexity significantly. Novel fitness evaluation indices for global cluster merging are designed to maintain consistency. In our experiments, the respective 43.56% and 92.87% reductions of the inconsistency rate and time cost verify the validity and feasibility of the proposed hypothesis and methodology. Besides, the resource utilization rate is increased due to the proper usage of the host computers.

Although this work achieves its preliminary objectives, it still has some limitations. First, the computational complexity of the local clustering algorithm is relatively high. Second, the combination of the fitness evaluation indices is straightforward. Third, the amount of experimental data is not sufficient. In future research, we will collect more samples, replace DPC with state-of-the-art clustering algorithms designed for large amounts of data, and consider more factors to improve the synthesized fitness index. One of the main advantages of this work lies in its generalization ability. The proposed scheme can be applied to different factories due to the similarity in battery production. The proposed mode provides a new idea that utilizes idle resources to accelerate computation and production, which is meaningful for various industries.

5. Conclusion

In summary, based on the edge computing technique, an effective two-stage distributed lithium-ion power battery grouping scheme is proposed in this paper for consistency improvement of battery packs and efficiency improvement of battery production. The idle periods of host computers are utilized to implement local clustering on the battery discharging voltage sequences collected by all formation cabinets they manage. Cluster contours are extracted and uploaded to the cloud data center for fast global cluster merging. Batteries are assigned to their respective clusters in the final global cluster set and wait to be unloaded by group. Experiments on a total of 1974 discharging voltage sequences indicate that the proposed scheme achieves a 43.56% inconsistency reduction of battery groups, which beats the existing grouping schemes and rivals the proposed centralized grouping scheme, whose reduction is about 49.75%. What is remarkable about this distributed scheme is that the time cost is reduced by 92.87%, from 1775.51 s to 126.56 s, compared with the centralized scheme. We have reasons to believe the proposed scheme can cope with the situation of massive production, achieving fast battery grouping with high performance and increasing resource utilization. Future research will further focus on shortening the gap in consistency improvement relative to the advanced centralized scheme and on testing the scheme on larger datasets.

Acknowledgements

This work was supported by the National Natural Science Foundation of China under Grant U1701262 and U1801263.

Fig. 14. Computation time comparison between centralized and distributed scheme.

References

Aghabozorgi, S., Shirkhorshidi, A.S., Wah, T.Y., 2015. Time-series clustering - A decade review. Inf. Syst. 53, 16-38.
Barreras, J.V., Fleischer, C., Christensen, A.E., Swiercynski, M., Schaltz, E., Andreasen, S.J., Sauer, D.U., 2016. An advanced HIL simulation battery model for battery management system testing. IEEE Trans. Ind. Appl. 52 (6), 5086-5099.
Baumhöfer, T., Brühl, M., Rothgang, S., Sauer, D.U., 2014. Production caused variation in capacity aging trend and correlation to initial cell performance. J. Power Sources 247, 332-338.
Berndt, D.J., Clifford, J., 1994. Using Dynamic Time Warping to Find Patterns in Time Series. KDD workshop, Seattle, WA, pp. 359-370.
Chen, Y., Tang, S., Zhou, L., Cheng, W., Du, J., Tian, W., Pei, S., 2016. Decentralized Clustering by Finding Loose and Distributed Density Cores. Information Sciences. S0020025516305795.
Chiu, K.C., Lin, C.H., Yeh, S.F., Lin, Y.H., Huang, C.S., Chen, K.C., 2014. Cycle life analysis of series connected lithium-ion batteries with temperature difference. J. Power Sources 263 (4), 75-84.
Du, M., Ding, S., Jia, H., 2016. Study on density peaks clustering based on k-nearest neighbors and principal component analysis. Knowl. Based Syst. 99, 135-145.
Ester, M., Kriegel, H.-P., Sander, J., Xu, X., 1996. A density-based algorithm for discovering clusters in large spatial databases with noise. Kdd 226-231.
Fang, K., Shi, C., Mu, D., Wu, B., Feng, W., 2013. Investigation of nickel-metal hydride battery sorting based on charging thermal behavior. J. Power Sources 224 (224), 120-124.
Feng, X., Xu, C., He, X., Wang, L., Zhang, G., Ouyang, M., 2018. Mechanisms for the evolution of cell variations within a LiNixCoyMnzO2/graphite lithium-ion battery pack caused by temperature non-uniformity. J. Clean. Prod. 205, 447-462.
Folgado, D., Barandas, M., Matias, R., Martins, R., Carvalho, M., Gamboa, H., 2018. Time alignment measurement for time series. Pattern Recogn. 81, 268-279.
Gogoana, R., Pinson, M.B., Bazant, M.Z., Sarma, S.E., 2014. Internal resistance matching for parallel-connected lithium-ion cells and impacts on battery pack cycle life. J. Power Sources 252, 8-13.
Guo, L., Liu, G.W., 2012. Research of lithium-ion battery sorting method based on fuzzy C-means algorithm. In: Advanced Materials Research. Trans Tech Publ, pp. 983-988.
He, F., Shen, W.X., Song, Q., Kapoor, A., Honnery, D., Dayawansa, D., 2014. Clustering LiFePO4 Cells for Battery Pack Based on Neural Network in EVs, Transportation Electrification Asia-Pacific.
He, Z., Gao, M., Ma, G., Liu, Y., Tang, L., 2016. Battery grouping with time series clustering based on affinity propagation. Energies 9 (7), 561.
Huang, J., Chen, D., Yang, Y., Gao, M., 2018. Battery grouping based on improved K-means with curve fitting, 2018 13th IEEE Conference on Industrial Electronics and Applications (ICIEA). IEEE 1966-1971.
Jaguemont, J., Boulon, L., Venet, P., Dube, Y., Sari, A., 2016. Lithium ion battery aging experiments at sub-zero temperatures and model development for capacity fade estimation. IEEE Trans. Veh. Technol. 65 (6), 4328-4343.
Kim, J., Shin, J., Chun, C., Cho, B., 2012. Stable configuration of a Li-ion series battery pack based on a screening process for improved voltage/SOC balancing. IEEE Trans. Power Electron. 27 (1), 411-424.
Lai, X., Qiao, D., Zheng, Y., Ouyang, M., Han, X., Zhou, L., 2019. A rapid screening and regrouping approach based on neural networks for large-scale retired lithium-ion cells in second-use applications. J. Clean. Prod. 213, 776-791.
Li, X., 2014. A comparative study of sorting methods for lithium-ion batteries. In: Transportation Electrification Asia-Pacific.
Li, X., Song, K., Guo, W., Lu, R., Zhu, C., Stewart, P., 2015. A novel grouping method for lithium iron phosphate batteries based on a fractional joint Kalman filter and a new modified K-means clustering algorithm. Energies 8 (8), 7703-7728.
Lipu, M.S.H., Hannan, M.A., Hussain, A., Hoque, M.M., Ker, P.J., Saad, M.H.M., Ayob, A., 2018. A review of state of health and remaining useful life estimation methods for lithium-ion battery in electric vehicles: challenges and recommendations. J. Clean. Prod. 205, 115-133.
Liu, C., Tan, J., Shi, H., Wang, X., 2018. Lithium-ion cell screening with convolutional neural networks based on two-step time-series clustering and hybrid resampling for imbalanced data. IEEE Access 6, 59001-59014.
Love, C.T., Swider-Lyons, K.E., Virji, M.B.V., Rocheleau, R.E., 2014. State-of-health monitoring of 18650 4S packs with a single-point impedance diagnostic. J. Power Sources 266 (1), 512-519.
Lu, L., Han, X., Li, J., Hua, J., Ouyang, M., 2013. A review on the key issues for lithium-ion battery management in electric vehicles. J. Power Sources 226 (3), 272-288.
Niu, X., Garg, A., Goyal, A., Simeone, A., Bao, N., Zhang, J., Peng, X., 2019. A coupled electrochemical-mechanical performance evaluation for safety design of lithium-ion batteries in electric vehicles: an integrated cell and system level approach. J. Clean. Prod. 222, 633-645.
Oliveira, L., Messagie, M., Rangaraju, S., Sanfelix, J., Hernandez Rivas, M., Van Mierlo, J., 2015. Key issues of lithium-ion batteries - from resource depletion to environmental performance indicators. J. Clean. Prod. 108, 354-362.
Rodriguez, A., Laio, A., 2014. Clustering by fast search and find of density peaks. Science 344 (6191), 1492-1496.
Rumpf, K., Naumann, M., Jossen, A., 2017. Experimental investigation of parametric cell-to-cell variation and correlation based on 1100 commercial lithium-ion cells. J. Energy Storage 14, 224-243.
Shi, W., Jie, C., Quan, Z., Li, Y., Xu, L., 2016. Edge computing: vision and challenges. IEEE Internet of Things Journal 3 (5), 637-646.
Tong, Q., Xiu, L., Bo, Y., 2017. Efficient distributed clustering using boundary information. Neurocomputing 275, 2355-2366.
Väyrynen, A., Salminen, J., 2012. Lithium ion battery production. J. Chem. Thermodyn. 46 (1), 80-85.
Wang, Q., Cheng, X.Z., Wang, J., 2017. A new algorithm for a fast testing and sorting system applied to battery clustering. In: International Conference on Clean Electrical Power.
Wang, S., Shang, L., Li, Z., Hu, D., Li, J., 2016. Online dynamic equalization adjustment of high-power lithium-ion battery packs based on the state of balance estimation. Appl. Energy 166, 44-58.
Wei, J., Dong, G., Chen, Z., Yu, K., 2017. System state estimation and optimal energy control framework for multicell lithium-ion battery system. Appl. Energy 187 (Complete), 37-49.
Yun, L., Sandoval, J., Zhang, J., Gao, L., Garg, A., Wang, C.-T., 2019. Lithium-ion battery packs formation with improved electrochemical performance for electric vehicles: experimental and clustering analysis. J. Electrochem. Energy Conv. Storage 16 (2), 021011.
Zeng, Y., Yang, Y., He, Z., Gao, M., Wang, C., Hong, M., 2016. Lead-acid battery automatic grouping system based on graph cuts. Electr. Power Compon. Syst. 44 (4), 450-458.
Zhang, C., Jiang, Y., Jiang, J., Cheng, G., Diao, W., Zhang, W., 2017a. Study on battery pack consistency evolutions and equilibrium diagnosis for serial-connected lithium-ion batteries. Appl. Energy 207, 510-519.
Zhang, X., Wang, Y., Liu, C., Chen, Z., 2017b. A novel approach of remaining discharge energy prediction for large format lithium-ion battery pack. J. Power Sources 343, 216-225.
Zheng, Y., Ouyang, M., Lu, L., Li, J., 2015. Understanding aging mechanisms in lithium-ion battery packs: from cell capacity loss to pack capacity evolution. J. Power Sources 278, 287-295.