Available online at www.sciencedirect.com
ScienceDirect
Procedia CIRP 81 (2019) 1183–1188
www.elsevier.com/locate/procedia
52nd CIRP Conference on Manufacturing Systems
Forecasting Changes in Material Flow Networks with Stochastic Block Models
Thorben Funke a,*, Till Becker a,b
a BIBA - Bremer Institut für Produktion und Logistik GmbH at the University of Bremen, Hochschulring 20, 28359 Bremen, Germany
b Faculty of Business Studies, University of Applied Sciences Hochschule Emden/Leer, Constantiaplatz 4, 26723 Emden, Germany
* Corresponding author. Tel.: +49 421 218 50046; E-mail address: [email protected]
Abstract
Material flows in logistics and manufacturing become increasingly dynamic and flexible and at the same time those systems grow with increasing number of goods and partners. Existing forecasting solutions require detailed information to create an appropriate prediction about future states of the system. Therefore, we propose a new approach using the Stochastic Block Model (SBM), which needs only the aggregated representation of the considered system as a material flow network. Based on an inferred clustering of the investigated material flow network, we are able to predict the usage of specific paths and the system as a whole.

© 2019 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/3.0/)
Peer-review under responsibility of the scientific committee of the 52nd CIRP Conference on Manufacturing Systems.
ISSN 2212-8271; DOI 10.1016/j.procir.2019.03.289

Keywords: material flow networks; link prediction; stochastic block models; forecasting; ensemble learning

1. Introduction
1.1. Motivation
Systems in manufacturing and production have become more complex due to various developments like smart sensors or the demand for a reduced time to market [1]. These developments interplay with other effects like distributed productions and an increased number of products or services for each participating company. The resulting system is complex and coupled on all levels from the shop floor over locations of a company to dependencies in the respective supply chains [2]. Forecasting future states of such a system usually requires a lot of different detailed information [3]. But important information is beyond the accessibility, such as timing and characteristic of new orders or strategic planning of partners. Therefore, we propose a new approach using solely basic material flow data represented as a complex network. Our new prediction method is based on clustering with the Stochastic Block Model (SBM). Using multiple inferred partitions and by combining the information of different time steps, we are able to achieve a higher accuracy than heuristic methods.

1.2. Related Literature
Our work aligns with those authors transferring the recent progress in the field of complex systems, which is inspired and contributed by various disciplines from physics to social science, to the field of engineering [4]. Results deduced in this manner include but are not limited to relationships between topological structures and dynamic behavior of manufacturing networks [5], studies of the worldwide maritime network [6, 7], or the global air transport network [8, 9]. As supply chain networks already share a part of their name, authors applied the method of complex networks to assess the relationship of topological properties and the robustness of supply chain networks [10]. Watts and Strogatz analyzed properties of complex networks such as the American power grid [11].
The aim of our contribution is not to compete with or to replace existing methods of production planning or production control [12, 13]. Instead, we strive to give advice in those situations where these methods fail because of lacking data, such as an unreliable sales forecast caused by an increased product portfolio and rapidly changing markets.

Our proposed method as well as our baseline methods are instances of link predictors. Link predictors assign scores to all possible relationships, and a high score (relative to the others) should correlate with a high probability of observing this relationship. Haghani and Keyvanpour made a systematic review of recent link prediction methods in social networks, which can be translated to the general field of complex networks [14]. This publication is closest to the work of Liao et al. [15], who analyzed the performance of different link prediction heuristics on a selection of real-world networks. In contrast to our work, they solely focused on static cases, where new networks are generated from a single network by removing a certain rate of edges to measure the retrieval performance of the original network afterwards.

Finally, as our proposed method is a combination of heuristic link prediction and link prediction based on the result of an inferred SBM clustering, this work builds on the formulation of the degree-corrected SBM by Karrer and Newman [16]. This SBM extension increases the applicability of SBM to real-world networks, which often have broader degree distributions than those networks described by the standard SBM. Guimerà and Sales-Pardo proposed together with different researchers [17-20] the application of SBM as link predictor. However, the pure application of link prediction based on SBM lacks the desired accuracy.

1.3. Article Structure
After this introduction, we describe the real-world example of a material flow network used for the validation of our proposed method. Then, we present the link prediction methods used as baseline for our comparison. In the same section, we state the basics of SBM such as clustering and link prediction based on the clustering result as well as our extension based on ensemble learning. Having the methods prepared, we describe the applied performance measurement and depict the results of applying the different link predictors to our real-world material flow networks. In our conclusion, we finish with a discussion of our results and give an outlook on further enhancements and applications.

2. Material Flow Networks
Our proposed method is generic in its formulation and applicability. In our case, we strive to increase the knowledge about material flow networks by supplying a method to forecast future states, i.e. the used material flow paths during a so far not observed (future) time period. Material flow networks can be constructed at any scale, from the global perspective such as the global maritime network or the global air transportation network to much more detailed views such as a single production facility. In this publication we selected the latter because the dynamics at this level are much greater than in the slower systems of larger size.

At the level of a single production or company, we are able to construct material flow networks based on the completion confirmation data. Such data is usually available as a byproduct of manufacturing control systems and usually recorded on a very detailed time scale by logging the steps of all orders on machines down to minutes or even seconds. A material flow network for a given time period can be created by simply adding all used machines and connecting all pairs of machines between which material has been transported. For a detailed description of the transformation see Funke and Becker [21].

The specific example used in this publication is from data of a job-shop production and covers about one year. The production under consideration works in a dynamic environment with heterogeneous orders and sometimes short lead times. This results in a lot of dynamics in the aggregated system, and our new approach will focus on predicting future material flow paths to support analysis and optimization in such contexts. To ensure anonymity of the specific company, we are limited to an abstract description. We slice the data into weeks, resulting in 51 material flow networks. For simplicity we regard material flow networks as instances of undirected graphs with self-loops. In each week between 110 and 156 machines were used and on average 475 material flows were recorded.

3. Link Prediction
3.1. Heuristic Link Prediction
Before presenting different solutions to the link prediction task, we introduce the task itself and the notation needed. We start with an example: Think of a set of companies and their business relationships. From their public internet domains only a subset of present business-to-business ties can be assessed, and there are limited resources available to verify some other possible relationships. The task of link prediction is to order all non-observed links in a way that the higher a link's score, the more likely a relationship between these two companies can be positively verified. A higher accuracy of a link predictor in this example means more positively verified business ties and less money spent on unlikely relationships.

The above formulation of link prediction is driven by the motivation to add missing observations to an incompletely observed static network, where the verification of edges needs a substantial amount of money or time. In our case of material flow networks, the desired task is different because we already know all links until a certain time point. We are interested in predicting the links of next week given the information of last week(s). Luckily, this does not change the mathematical formulation of a link predictor and only affects the way of evaluating the performance of the different approaches.

Formally, a link predictor is the following: Let $G = (V, E)$ be a graph with $|V| = N$ nodes and $|E| = M$ edges. A link predictor is a function $F_G : V \times V \to \mathbb{R}$, which assigns each possible edge $(i, j) \in V \times V$ a value $F_G(i, j)$ in a totally ordered set like the real numbers $\mathbb{R}$.
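The construction from Section 2 and the predictor signature defined above can be sketched in a few lines of Python. This is only an illustration under assumptions: the record layout `(timestamp, from_machine, to_machine)`, the use of ISO calendar weeks for slicing, and all function names are hypothetical, since the actual transformation is described in [21] and not reproduced here.

```python
from collections import defaultdict
from datetime import datetime
from typing import Callable, Dict, Iterable, List, Set, Tuple

Edge = Tuple[str, str]
LinkPredictor = Callable[[str, str], float]  # F_G: V x V -> R


def undirected(u: str, v: str) -> Edge:
    """Canonical form of an undirected edge; u == v encodes a self-loop."""
    return (u, v) if u <= v else (v, u)


def weekly_networks(
    transports: Iterable[Tuple[datetime, str, str]],
) -> Dict[Tuple[int, int], Set[Edge]]:
    """Collect the machine pairs used in each ISO week as one edge set per week."""
    weeks: Dict[Tuple[int, int], Set[Edge]] = defaultdict(set)
    for timestamp, source, target in transports:
        iso = timestamp.isocalendar()  # (ISO year, ISO week, weekday)
        weeks[(iso[0], iso[1])].add(undirected(source, target))
    return dict(weeks)


def rank_edges(predictor: LinkPredictor, candidates: Iterable[Edge]) -> List[Edge]:
    """Order candidate edges from highest to lowest predictor score."""
    return sorted(candidates, key=lambda e: predictor(e[0], e[1]), reverse=True)
```

Storing each week as a set of canonical node pairs keeps the graph undirected and lets a self-loop appear as a pair with two identical entries.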
Specific examples of link predictors help to illustrate this formal definition. Our selection of basic link predictors is taken from Liao et al. [15] and includes common neighbor (CN), Jaccard coefficient (JC), resource allocation (RA), and local path index (LPI). According to the categories presented by Haghani and Keyvanpour, all these link predictors are based on heuristics [14]. The common neighbor (CN) link predictor assumes that two nodes are more likely to have a tie if they share many neighbors. With $\Gamma(i)$ the neighborhood of a node $i$, the common neighbor score for two nodes $i$ and $j$ is
$$F_G^{CN}(i, j) = |\Gamma(i) \cap \Gamma(j)|.$$
Newman used the CN to deduce a relation in scientific collaboration networks between number of shared collaborators and future collaborations [22]. Obviously, this measure favors nodes with high degree because they simply have more candidates to share with other nodes. One way to overcome this weakness is to calculate the Jaccard coefficient (JC) of the neighborhoods
$$F_G^{JC}(i, j) = \frac{|\Gamma(i) \cap \Gamma(j)|}{|\Gamma(i) \cup \Gamma(j)|}.$$
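As a minimal sketch of the two neighborhood-based scores above, assume the graph is stored as a dictionary that maps each node to its neighbor set $\Gamma(i)$. The representation and the function names are illustrative choices, and returning 0 for two isolated nodes is an assumption the text does not cover.

```python
from typing import Dict, Hashable, Set

Graph = Dict[Hashable, Set[Hashable]]  # node -> neighborhood Gamma(node)


def common_neighbors(g: Graph, i: Hashable, j: Hashable) -> int:
    """CN score: number of neighbors shared by i and j."""
    return len(g[i] & g[j])


def jaccard(g: Graph, i: Hashable, j: Hashable) -> float:
    """JC score: shared neighbors normalized by the size of the joint neighborhood."""
    union = g[i] | g[j]
    if not union:  # assumption: two isolated nodes get score 0
        return 0.0
    return len(g[i] & g[j]) / len(union)
```

For the graph with edges 1-2, 1-3, 2-3 and 3-4, the pairs (1, 2) and (1, 4) both share exactly one neighbor, so CN cannot separate them, while JC assigns 1/3 and 1/2 respectively.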
The JC link predictor is only one way to normalize the CN measure. For further alternatives and their performances we refer to Zhou et al. [23]. The resource allocation (RA) measure is an alternative inspired by the resource allocation process on networks. In the simplest case each node has one unit of resource and will equally distribute it between all its neighbors. The resulting similarity of the edge $(i, j)$ is according to Zhou et al. [23] the amount of resource $i$ received from $j$
$$F_G^{RA}(i, j) = \sum_{k \in \Gamma(i) \cap \Gamma(j)} \frac{1}{|\Gamma(k)|}.$$
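The RA sum can be written directly from the formula. The sketch below again assumes a dictionary from nodes to neighbor sets; the function name is a hypothetical choice.

```python
from typing import Dict, Hashable, Set

Graph = Dict[Hashable, Set[Hashable]]  # node -> neighborhood Gamma(node)


def resource_allocation(g: Graph, i: Hashable, j: Hashable) -> float:
    """RA score: each shared neighbor k contributes 1 / |Gamma(k)|."""
    return sum(1.0 / len(g[k]) for k in g[i] & g[j])
```

In contrast to CN, a shared neighbor with many connections contributes less than one with few connections, because its unit of resource is split among more nodes.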
The last heuristic-based measure of our comparison is the local path index (LPI). It was introduced to break up the small number of assigned CN values, i.e. only 0, 1 in the example of Fig. 1. To improve the accuracy of CN, the LPI takes paths of a higher order into account. Let $A$ be the adjacency matrix of $G$. Then CN can be written simply as $F_G^{CN} \equiv A^2$ in the way that $F_G^{CN}(i, j)$ is the element at $(i, j)$ of $A^2$. In the same manner, LPI is defined by
$$F_G^{LPI} \equiv A^2 + \varepsilon A^3,$$
where $\varepsilon$ is a free parameter. We select with $\varepsilon = 10^{-3}$ the same value as Zhou et al. [23], who introduced this measure.

3.2. Link Prediction with the Stochastic Block Model
All the presented link predictors are based on the direct neighborhood or, like LPI, a close region of each node. This results in fast calculation, but may miss some important global structures. Therefore, we now introduce the degree-corrected Stochastic Block Model (SBM) of Karrer and Newman [16]. The basic idea of SBM is to assign the nodes into blocks with other nodes, which share similar neighborhood relationships to members of other blocks. If we first assume that such a partition $\vec{b} = \{b_1, \dots, b_K\}$ of $V$ is given, the likelihood of an edge between a node $i \in b_r$ and $j \in b_s$ is
$$\exp(-\omega_{rs} |\Gamma(i)||\Gamma(j)|), \tag{1}$$
where $\omega_{rs}$ is the likelihood for an edge between block $r$ and block $s$, which can be replaced by its maximum-likelihood value
$$\omega_{rs} = \frac{e_{rs}}{|b_r||b_s|}.$$
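Two of the quantities defined above are plain matrix computations and can be sketched without any library: the LPI matrix $A^2 + \varepsilon A^3$ and the block-level value $\omega_{rs}$, with $e_{rs}$ read as the number of observed edges between blocks $r$ and $s$. The function names are hypothetical. Counting every within-block edge twice on the diagonal follows the usual Karrer-Newman convention and is an assumption about the intended counting, not something the text states.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def local_path_index(adj, eps=1e-3):
    """LPI matrix A^2 + eps * A^3 for a symmetric 0/1 adjacency matrix."""
    a2 = matmul(adj, adj)
    a3 = matmul(a2, adj)
    n = len(adj)
    return [[a2[i][j] + eps * a3[i][j] for j in range(n)] for i in range(n)]


def block_edge_rates(edges, block_of, n_blocks):
    """Return (e, omega) with omega[r][s] = e[r][s] / (|b_r| * |b_s|)."""
    sizes = [0] * n_blocks
    for block in block_of.values():
        sizes[block] += 1
    e = [[0] * n_blocks for _ in range(n_blocks)]
    for u, v in edges:
        r, s = block_of[u], block_of[v]
        e[r][s] += 1
        e[s][r] += 1  # assumption: within-block edges count twice on the diagonal
    omega = [[e[r][s] / (sizes[r] * sizes[s]) if sizes[r] and sizes[s] else 0.0
              for s in range(n_blocks)] for r in range(n_blocks)]
    return e, omega
```

On the path 0-1-2-3 the end nodes share no neighbor, so CN gives 0 for the pair (0, 3), whereas LPI gives $\varepsilon$ because one walk of length three connects them. This is the tie-breaking effect the LPI was introduced for.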
In the maximum-likelihood value $\omega_{rs}$, $e_{rs}$ is the number of observed edges between block $r$ and block $s$. As this link prediction requires a suitable partition, the likelihood of a partition belonging to a given graph can be calculated. We take the formulation of Karrer and Newman, who first proposed the degree-corrected SBM,
$$L(G \mid \vec{b}) = \sum_{r,s} e_{rs} \log \frac{e_{rs}}{\kappa_r \kappa_s},$$
where $\kappa_r$ is the sum of the degrees of all nodes in group $r$, i.e. $\kappa_r = \sum_{i \in b_r} |\Gamma(i)|$. To maximize this equation and infer a good partition, many algorithms, such as belief propagation, spectral clustering, or the Kernighan-Lin algorithm, can be used. We used a simple Metropolis-Hastings algorithm, which can be easily implemented and has a sufficient performance for the regarded material flow networks. In each step of the Metropolis-Hastings algorithm, a move of a node $i$ from block $r$ to another block $s$ is proposed and accepted with probability $\exp(-\Delta L)$, where $\Delta L$ is the change of the likelihood caused by the move. For the proposal of new blocks, we implemented Peixoto's variant [24].

Fig. 1. Visualization of the presented five different link predictors evaluated on a small example graph (top left) for all possible edges. The resulting scores of the edges are highlighted by a greyscale from black (high value) to light grey (low value). For the SBM link prediction the four nodes were assigned to two blocks (1,2) and (3,4), which is also shown by the node color.

After inferring a partition, we follow Ghasemian et al. [25] and use as score for an edge between a node $i \in b_r$ and a node $j \in b_s$
$$F_{G,\vec{b}}^{SBM}(i, j) = e_{rs} |\Gamma(i)||\Gamma(j)|,$$
which has the advantage of a similar scaling of values in comparison with the likelihood of Eq. (1).

A usually challenging task is selecting the free parameters, like the number of steps of the Metropolis-Hastings algorithm or the number of blocks $K$. Since our aim is not the inference of an optimal clustering but link prediction, we choose a single fixed number of blocks $K = 25$. This value is higher than the number of blocks derived by other means, but fits our case. For the second free choice we take 250,000 steps, because the runtime is below a minute for each network and, in comparison to values used by other authors [26], this choice is sufficient for the sizes of our networks.

3.3. Ensemble Learning
To improve the quality of prediction with SBM, we can take multiple partitions into account. The general concept we want to use is called ensemble learning. To get an idea of the basic approach, we take a look at the local path index again. We can regard the LPI as a combination of the common neighbor predictor, which delivers $A^2$, and another predictor given by $A^3$. With the weighting of $\varepsilon$, we receive LPI. The combination of SBM link predictors of different partitions is straightforward. If we assume that different predictors have the same accuracy and range of assigned scores, we can simply aggregate them without any additional weights. Consequently, the scores of the new link predictor are given by
$$F_G^{ensemble\,SBM} = \mathrm{mean}\{F_{G,\vec{b}_1}^{SBM}, F_{G,\vec{b}_2}^{SBM}, \dots\}.$$
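The chain from partition inference to the ensemble score can be sketched as follows. This is a deliberately simplified illustration and not the implementation used for the results: it proposes a uniformly random block instead of Peixoto's variant [24], recomputes the full likelihood in every step, and accepts a move with probability $\min(1, \exp(L_{new} - L_{old}))$, which matches the acceptance rule in the text only if $\Delta L$ is read as the decrease of the log-likelihood. All function names are hypothetical, and within-block edges are counted twice in $e_{rr}$ so that $\kappa_r = \sum_s e_{rs}$ holds.

```python
import math
import random
from collections import defaultdict


def block_stats(edges, block_of):
    """Block edge counts e_rs, block degree sums kappa_r and node degrees."""
    e, kappa, degree = defaultdict(int), defaultdict(int), defaultdict(int)
    for u, v in edges:
        r, s = block_of[u], block_of[v]
        e[(r, s)] += 1
        e[(s, r)] += 1  # symmetric; within-block edges are counted twice
        kappa[r] += 1
        kappa[s] += 1
        degree[u] += 1
        degree[v] += 1
    return e, kappa, degree


def log_likelihood(edges, block_of):
    """L(G | b) = sum over r, s of e_rs * log(e_rs / (kappa_r * kappa_s))."""
    e, kappa, _ = block_stats(edges, block_of)
    return sum(m * math.log(m / (kappa[r] * kappa[s])) for (r, s), m in e.items())


def infer_partition(edges, nodes, n_blocks, steps, rng):
    """Metropolis-Hastings search with uniform block proposals (a simplification)."""
    block_of = {v: rng.randrange(n_blocks) for v in nodes}
    current = log_likelihood(edges, block_of)
    for _ in range(steps):
        node = rng.choice(nodes)
        old_block = block_of[node]
        block_of[node] = rng.randrange(n_blocks)
        proposed = log_likelihood(edges, block_of)
        if proposed >= current or rng.random() < math.exp(proposed - current):
            current = proposed  # accept the move
        else:
            block_of[node] = old_block  # reject and restore the old block
    return block_of


def sbm_predictor(edges, block_of):
    """Score e_rs * |Gamma(i)| * |Gamma(j)| for i in block r and j in block s."""
    e, _, degree = block_stats(edges, block_of)

    def score(i, j):
        return e.get((block_of[i], block_of[j]), 0) * degree.get(i, 0) * degree.get(j, 0)

    return score


def ensemble(predictors):
    """Unweighted mean of several link predictors."""
    def score(i, j):
        return sum(p(i, j) for p in predictors) / len(predictors)

    return score
```

A hypothetical call such as `ensemble([sbm_predictor(edges, infer_partition(edges, nodes, 25, 250000, rng)) for _ in range(5)])` would mirror the parameter choices of the text, although recomputing the likelihood from scratch makes this sketch far slower than an incremental update of $\Delta L$.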
Now we have to choose how many partitions we want to insert into the link predictor. Increasing the number of regarded partitions should increase the performance until a saturation is reached. Since in our use case we have a stream of graphs, i.e. $G_{-1}, G_{-2}, \dots$, we extended this approach to the dynamic situation by taking multiple graphs and their predictions into account
$$F_{\vec{G}}^{Time} = \mathrm{mean}\{\lambda^0 F_{G_{-1}}, \lambda^1 F_{G_{-2}}, \dots\},$$
where each $F$ on the right-hand side can be any link predictor and $0 < \lambda < 1$ is a free parameter. As the weights fall exponentially, the information from more distant networks has less impact on the prediction of the next network $G_0$. We test this new approach with all link predictors presented so far using the last two networks and $\lambda = 1/2$. For example, for CN this results in
$$F_{(G_{-1}, G_{-2})}^{CN,2} = \mathrm{mean}\left\{F_{G_{-1}}^{CN}, \frac{F_{G_{-2}}^{CN}}{2}\right\}.$$

Fig. 2. Performance of the presented link predictors for the different material flow networks. Each predictor had the information of the week $t - 1$ for the link prediction of the material flow network of week $t$.

4. Application
4.1. Performance Measure
To evaluate the performance of different link prediction methods, we group the material flow networks into pairs of subsequent weeks such as $(G_1, G_2), (G_2, G_3), \dots, (G_{T-1}, G_T)$. For each of those pairs $(G_t, G_{t+1})$ we supplied each of the presented link predictors with the information of $G_t$ and evaluated the performance on $G_{t+1}$ with the area under the receiver operating characteristic curve (AUROC) [27]. The SBM link prediction requires as an intermediate step the clustering of $G_t$ with the Metropolis-Hastings algorithm. The calculation of the exact AUROC is time consuming because $|V|^2$ values need to be calculated, ordered and aggregated. A sufficient approximation can be easily calculated with a comparison of the scores of a sufficient number $n$ of randomly selected edge pairs. In each step, we select a random existing and a random non-existing edge of $G_{t+1}$ and calculate the scores of both edges. We record the number of times $n_1$ the existing edge has a higher score than the non-existing one and the number of times $n_2$ their scores are equal. An approximation of the AUROC is given by
$$AUROC = \frac{n_1 + 0.5\, n_2}{n}.$$
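The sampling procedure translates almost literally into code. In the sketch below the function name, the argument names and the choice to pass existing and non-existing edges of $G_{t+1}$ as two lists are assumptions made for illustration.

```python
import random


def approx_auroc(score, existing, non_existing, n, rng):
    """Estimate the AUROC from n random (existing, non-existing) edge pairs."""
    n1 = 0  # existing edge scored strictly higher
    n2 = 0  # both edges scored equally
    for _ in range(n):
        pos = rng.choice(existing)
        neg = rng.choice(non_existing)
        s_pos, s_neg = score(*pos), score(*neg)
        if s_pos > s_neg:
            n1 += 1
        elif s_pos == s_neg:
            n2 += 1
    return (n1 + 0.5 * n2) / n
```

A predictor that assigns the same value to every edge produces only ties and therefore an estimate of exactly 0.5, while a predictor that always ranks existing edges first reaches 1.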
To ensure sufficient accuracy, we calculate all AUROC with $n = 100{,}000$ samples. The AUROC has the interpretation of the likelihood that a random existing edge has a higher or equal score than a random non-existing edge. This equivalent formulation is used by the formula above and gives a natural interpretation of results. Assigning all edges the same value would yield an AUROC of 0.5, and a perfect predictor would have an AUROC of 1. Since we apply our link prediction based on the knowledge of $G_t$, we introduce an additional baseline, which simply assumes that next week is the same as last week. We call the resulting link predictor simply `like before` (LB):
$$F_{G_t = (V_t, E_t)}^{LB}(i, j) = \begin{cases} 1 & \text{if } (i, j) \in E_t \\ 0 & \text{else} \end{cases}$$

Fig. 3. Results of the presented link predictors based on the knowledge of two previous weeks.

4.2. Results
Now we have everything prepared to present and discuss the results. Fig. 2 shows the results of all link predictors based on the data of one week. The performances of all link predictors vary throughout the regarded time period and roughly seem to follow the LB baseline, which measures the similarity of two consecutive weeks. The JC link prediction has the worst performance, which is close to the LB baseline and sometimes performs even worse. The degree correction seems to be unsuitable for our material flow networks. All other variants deliver results better than the LB baseline. To our surprise the results of RA and CN are consistently close to each other. The LPI shows the best performance and is only surpassed by the SBM link predictor in some cases. The SBM link prediction based on a single partition has a huge variance in its results. Since the representation of a network as SBM is a vast reduction from the whole graph needed by the other methods, even this result is impressive. To further demonstrate the capabilities of forecasting with SBM, we inferred five independent partitions for each weekly material flow network. Then we transformed the result into a single ensemble link predictor, as described in Section 3.3. For the vast majority of weeks our SBM ensemble outperforms the LPI. See Table 1 for an overview of the overall performance.
The table contains the average AUROC over all weekly material flow networks.

As described in Section 3.3, we propose to use multiple weeks for the forecast. Fig. 3 shows the results of all presented link predictors applied with our new approach. The performance of all approaches increases when the information of two weeks is taken into account. The link prediction based on SBM is now clearly the best predictor. Since inferred partitions can be reused, the additional week does not require an additional inference step. The ensemble SBM link prediction with five partitions for each week consistently delivers the best results and has on average an AUROC of 89%, i.e., in only 11% of the cases our approach assigns a non-existing material flow a higher score than an existing one. The improvement by our ensemble approach is clearly visible in Table 1, which shows the aggregated results of both approaches. The LB baseline and the SBM prediction based on a single partition show the highest gains, of 0.05 or more. Other predictors, such as the LPI and the JC, improve far less.

To emphasize the strength of our new approach, think of selecting the 10 realized material flows out of 20 possible ones. The naïve predictor based on one week would yield on average 7.6 out of 10, whereas our ensemble approach would select on average 8.9 realized material flow paths. This substantial improvement allows further processing of the prediction, since the results become more accurate and reliable.

Table 1. Overview of results averaged over all weekly material flow networks

Link Predictor                   Averaged AUROC based on one week   Averaged AUROC based on two weeks
Common Neighbor (CN)             0.810                              0.832
Jaccard Coefficient (JC)         0.763                              0.782
Resource Allocation (RA)         0.813                              0.853
Local Path Index (LPI)           0.830                              0.844
Like Before (LB)                 0.760                              0.830
Single Stochastic Block Model    0.812                              0.867
Ensemble of five SBM             0.847                              0.890
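The AUROC reading used above (an AUROC of 89% means that in 11% of the comparisons a non-existing material flow is ranked above an existing one) follows from the pairwise definition of the measure [27]. The following minimal sketch computes the AUROC in exactly this way; the function name, the toy scores, and the convention of counting ties as one half are assumptions made for illustration.

```python
def auroc(scores, existing):
    """Share of (existing, non-existing) link pairs that are ranked correctly.

    scores   -- dict mapping a node pair to its predicted score
    existing -- set of node pairs that are realized in the following week
    Ties are counted as half a correct ranking (a common convention, assumed here).
    """
    pos = [s for pair, s in scores.items() if pair in existing]
    neg = [s for pair, s in scores.items() if pair not in existing]
    if not pos or not neg:
        raise ValueError("need at least one existing and one non-existing link")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical scores for four candidate material flows.
toy_scores = {("a", "b"): 0.9, ("a", "c"): 0.4, ("b", "c"): 0.7, ("c", "a"): 0.4}
toy_existing = {("a", "b"), ("a", "c")}
value = auroc(toy_scores, toy_existing)
print(f"AUROC = {value:.3f}, misordered share = {1 - value:.3f}")
```

The quadratic loop over all pairs keeps the definition visible; for large networks a rank-based computation gives the same value with less effort.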
5. Conclusion and Outlook

We have introduced different heuristic link prediction methods and the link prediction based on inferred SBM partition(s), and we have evaluated their performance on weekly material flow networks created from data of a real-world production system. To further boost their performance, we propose the application of ensemble learning. The weighted consideration of multiple material flow networks improves the performance of all methods. Among all methods presented, the combination of using the data of two weeks with the SBM link prediction based on five partitions provides the best results. Our new approach has proven its capabilities in the context of the regarded dynamic and flexible manufacturing system. Moreover, the dynamics and ongoing changes of the system are
shown in the rather modest and fluctuating results of the naïve predictor (LB) in Fig. 2 and Fig. 3. With the improvement of the forecasting results by our method, material flow paths can be analyzed and optimized with reduced uncertainty, even in a dynamic manufacturing system.

An open question, which will be addressed in future research, is the comparison with more advanced methods such as deep models or tensor factorization, which are beyond the scope of this work. Since our comparison is based on data from only one production system, we want to verify our findings on other data sets as well. In addition, we want to study the influence of external effects, such as the introduction of new products, on the observed structure and on the accuracy of our proposed method. This could be one part of a future analysis of the performance variance, which is probably partly caused by such effects. Relaxing the strict, fixed time scale of one week may help to improve the proposed method in such a dynamic environment. Additionally, the parameters we have fixed for this first study of using SBM as a basis for forecasting material flow networks need to be studied: from the choice between different SBMs, such as hierarchical versions [28] or weighted SBM [29], through the so far fixed number of blocks used in the partition, to the number of partitions used for the forecast. Another open question is the influence of the choices made in ensemble learning, such as the choice of λ, which probably depends on the selected time scale. As the investigated field has many open questions regarding the application of link prediction methods to material flow networks, the presented results are only a first milestone, and more complex models using SBM will be investigated in future work.

Acknowledgements

This work was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation), BE 5538/2-1.
References

[1] ElMaraghy W, ElMaraghy H, Tomiyama T, Monostori L. Complexity in engineering design and manufacturing. CIRP Annals 2012;61(2):793–814.
[2] Aitken J, Bozarth C, Garn W. To eliminate or absorb supply chain complexity: a conceptual model and case study. Supply Chain Management: An International Journal 2016;21(6):759–774.
[3] Gao R, Wang L, Teti R, Dornfeld D, Kumara S, Mori M, Helu M. Cloud-enabled prognosis for manufacturing. CIRP Annals 2015;64(2):749–772.
[4] Newman ME. The structure and function of complex networks. SIAM Review 2003;45(2):167–256.
[5] Becker T, Meyer M, Windt K. A manufacturing systems network model for the evaluation of complex manufacturing systems. International Journal of Productivity and Performance Management 2014;63(3):324–340. doi:10.1108/IJPPM-03-2013-0047.
[6] Kaluza P, Kölzsch A, Gastner MT, Blasius B. The complex network of global cargo ship movements. Journal of the Royal Society Interface 2010;7(48):1093–1103.
[7] Ducruet C, Notteboom T. The worldwide maritime network of container shipping: spatial structure and regional dynamics. Global Networks 2012;12(3):395–423.
[8] Guimerà R, Mossa S, Turtschi A, Amaral LN. The worldwide air transportation network: Anomalous centrality, community structure, and cities' global roles. Proceedings of the National Academy of Sciences 2005;102(22):7794–7799.
[9] Rocha LE. Dynamics of air transport networks: A review from a complex systems perspective. Chinese Journal of Aeronautics 2017;30(2):469–478.
[10] Perera SS, Bell MG, Piraveenan M, Kasthurirathna D, Parhi M. Topological structure of manufacturing industry supply chain networks. Complexity 2018;2018.
[11] Watts DJ, Strogatz SH. Collective dynamics of 'small-world' networks. Nature 1998;393(6684):440.
[12] Vollmann TE, Berry WL, Whybark DC. Manufacturing planning and control systems. Irwin; 1988.
[13] Guide Jr VDR. Production planning and control for remanufacturing: industry practice and research needs. Journal of Operations Management 2000;18(4):467–483.
[14] Haghani S, Keyvanpour MR. A systemic analysis of link prediction in social network. Artificial Intelligence Review 2017:1–35.
[15] Liao H, Zeng A, Zhang YC. Predicting missing links via correlation between nodes. Physica A: Statistical Mechanics and its Applications 2015;436:216–223.
[16] Karrer B, Newman ME. Stochastic blockmodels and community structure in networks. Physical Review E 2011;83(1):016107.
[17] Guimerà R, Sales-Pardo M. Missing and spurious interactions and the reconstruction of complex networks. Proceedings of the National Academy of Sciences 2009;106(52):22073–22078.
[18] Guimerà R, Llorente A, Moro E, Sales-Pardo M. Predicting human preferences using the block structure of complex social networks. PLoS ONE 2012;7(9):e44620.
[19] Vallès-Català T, Peixoto TP, Sales-Pardo M, Guimerà R. Consistencies and inconsistencies between model selection and link prediction in networks. Physical Review E 2018;97(6):062316.
[20] Cobo-López S, Godoy-Lorite A, Duch J, Sales-Pardo M, Guimerà R. Optimal prediction of decisions and model selection in social dilemmas using block models. EPJ Data Science 2018;7(1):48.
[21] Funke T, Becker T. Stochastic block models as a modeling approach for dynamic material flow networks in manufacturing and logistics. Procedia CIRP 2018;72:539–544.
[22] Newman ME. Clustering and preferential attachment in growing networks. Physical Review E 2001;64(2):025102.
[23] Zhou T, Lü L, Zhang YC. Predicting missing links via local information. The European Physical Journal B 2009;71(4):623–630.
[24] Peixoto TP. Efficient Monte Carlo and greedy heuristic for the inference of stochastic block models. Physical Review E 2014;89(1):012804.
[25] Ghasemian A, Hosseinmardi H, Clauset A. Evaluating overfit and underfit in models of network community structure. arXiv preprint arXiv:1802.10582; 2018.
[26] Newman ME, Reinert G. Estimating the number of communities in a network. Physical Review Letters 2016;117(7):078301.
[27] Hanley JA, McNeil BJ. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 1982;143(1):29–36.
[28] Peixoto TP. Hierarchical block structures and high-resolution model selection in large networks. Physical Review X 2014;4(1):011047.
[29] Mariadassou M, Robin S, Vacher C. Uncovering latent structure in valued graphs: a variational approach. The Annals of Applied Statistics 2010;4(2):715–742.