Computers in Industry 95 (2018) 81–92
Identifying experts for engineering changes using product data analytics
Namchul Do
Dept. of Industrial and Systems Engineering, ERI, Gyoengsang National University, 501 Jinju-daero, Jinju City, Gyoengnam, 52828, Republic of Korea
ARTICLE INFO
Article history:
Received 25 February 2017
Received in revised form 1 September 2017
Accepted 1 December 2017
Available online xxx
Keywords:
Expert identification
Engineering changes (ECs)
Product data analytics (PDA)
Product data management (PDM)
Product lifecycle management (PLM)
ABSTRACT
This paper aims to provide an expert identification procedure in an organization where design engineers share an integrated product data management (PDM) database for their product development and engineering changes (ECs). To identify experts for ECs, the procedure follows a product data analytics (PDA) approach that uses PDM databases as its operational data source to analyze different aspects of product development processes managed by PDM systems. It also employs a two-phase analysis procedure that considers the artefact and actor networks of the PDM system and participating engineers. The procedure also introduces EC history-centered multidimensional data analysis and social network analysis (SNA) for the two phases, respectively. To demonstrate the feasibility of the procedure, this study implemented it using a research-purpose PDM system, extract-transform-load (ETL) module, data cube with on-line analytical processing and SNA engines. It also provides a product design example with multiple engineering changes applied to the implemented prototype system as proof of the implementation and the procedure. © 2017 Elsevier B.V. All rights reserved.
1. Introduction Engineering changes (ECs) are inevitable processes in manufacturing companies, and their efficient management is a critical factor in maintaining the competitiveness of a company. To reduce the response time to engineering change requests (ECRs) and enhance the quality of ECs, it is important to identify suitable experts for engineering problems related to the current EC. If experts are limited to employees in an organization, they should not only have knowledge of general engineering domains but also experience of specific items such as parts or products developed in the organization. This paper aims to provide a data analysis procedure to identify suitable experts who can solve engineering problems during ECs in an organization, where design engineers can share an integrated product data management (PDM) database for their product development and ECs. Not only automotive or aerospace manufacturers, but also fabric and fashion companies are introducing PLM or PDM for their competitive product development [1,2]. Thus, many manufacturers use PDM systems to manage product design and ECs, and designers in the company share product design and EC data in PDM databases. The proposed expert identification
https://doi.org/10.1016/j.compind.2017.12.004
procedure can be applied to companies that manage product design and ECs using a PDM database, regardless of industry. To support the implementation of this procedure, this paper also provides specifications of supporting information systems using a system architecture, product data models and prototype implementation. To identify suitable experts for the current EC, this paper proposes a two-phase analysis procedure. The first phase evaluates and selects similar ECs by comparing EC histories and other EC attributes. The second phase applies social network analysis (SNA) to identify suitable experts for the selected ECs. To provide an effective selection procedure, this paper introduces the following three features, which also differentiate the proposed approach from others. First, the proposed procedure follows the product data analytics (PDA) concept [3]. PDA is a data analysis approach that uses PDM databases as its operational data to analyze and evaluate different aspects of product development processes. It requires data-driven analysis methods and suitable measures for the analysis and evaluation of product development processes. Following the PDA concept, the proposed procedure uses an integrated PDM database as its source of operational data for analysis. For data analysis, this study uses both multidimensional data analysis and SNA to extract measures to find similar ECs and determine roles in networks of
engineers to identify suitable experts, respectively. This study also provides different measures for the two types of analysis. Second, the proposed procedure uses a two-phase analysis procedure based on the Artefact-Actor-Networks approach [4–6]. This approach views information systems and their users as separate data and human networks that interact with each other. Based on this concept, this study represents an artefact network from the participating old and new product structures in ECs and actor networks from the participating engineers in a set of ECs, and considers a two-phase analysis method that detects similarities using the old and new product structures (the artefact network) and analyzes roles of engineers from the network of participants of the selected ECs (the actor network). The PDM database provides the necessary product and participation data to build all the actor and artefact networks. Third, the proposed procedure introduces an EC history-based approach to detect similarities in ECs. While an existing approach [7] uses simple manual counting of the same items in the product structures of participating items, this study uses multidimensional data from EC histories in a PDM database to automatically find similarities in ECs. To extract and determine related measures, this study proposes a multidimensional product data model and its data cube using an ETL module for PDM databases. In addition, since the retrieval requires ill-structured and complex query processes, the proposed procedure introduces case-based reasoning (CBR) to detect similar ECs. This paper proposes a CBR method that uses EC history and a set of attribute values of EC objects. To demonstrate the feasibility of the procedure and supporting information systems, this study implements a prototype system consisting of a PDM database, ETL module, data cube and data analysis and visualization tools.
Then, an example configuration control process with a PDM database is applied to the implemented system to show how to select suitable experts from analyzed PDM databases. The prototype system is based on the online analytical mining (OLAM) framework [8] as its supporting information system architecture. OLAM integrates data cubes and on-line analytical processing (OLAP) for flexible multidimensional data analysis with data mining models. It supports exploratory data analysis and the preparation of data sets for different data mining models. The proposed procedure uses OLAM to prepare input data for EC analysis and SNA models for suitable expert identification, respectively. It shows the potential of the proposed procedure, which is expanded to analyze similar ECs as well as the individual contributions of the participants of selected ECs. The remainder of this paper is organized as follows. Section 2 reviews related work. Section 3 introduces the overall process for identifying experts for ECs. Section 4 proposes a supporting information system architecture with product and multidimensional data models. It also describes details of the analysis procedure. Section 5 describes the implementation of a prototype supporting system, including an experimental PDM system with application examples as proof of the suggested concept. Section 6 concludes the study with further research topics.
2. Related work
Topics related to this study are expert identification (or expert location), engineering change analysis (ECA) and product data analytics (PDA). Fig. 1 shows that this study associates several features from each topic (see the features in each topic in Fig. 1). It considers internal and two-phase expert identification procedures. It uses EC history objects to analyze similarities in ECs through ECA. Through PDA, it uses an integrated PDM database as its operational database and multidimensional data analysis and SNA as its data-driven data analysis methods. This section reviews related work regarding the three topics.
Fig. 1. Related topic.
2.1. Expert identification
Lappas et al. [9] define an expert as a person or agent with a high degree of skill or knowledge of a certain subject, and expert identification as the exercise of efficiently identifying the right expert (or set of experts) that can perform the given task from a set of candidates. Yiman-Seid and Kobsa [10] identified two main motives for seeking an expert, namely as a source of information (information need) and as someone who can perform a given organizational or social function (expertise need). In addition, they mentioned the needs of internal expert seeking while the existing approach focused on external expert seeking issues. To support expert identification, many automated supporting systems based on different information and communication technologies have been proposed [10]. One of these is a group approach based on networks among a pool of experts. They evaluate measures for suitable experts through score propagation, weighting or constraints of networks. For example, Song et al. [11] supported an expert identification method based on expertise networks with relationship and evolutionary representations. Smallblue [12] also supported expert identification using networks built from emails of participants in an organization. They use an agent system that analyzes emails of each user and gather analysis data from distributed agents to build networks between participants. The expert identification procedure proposed in this study is an internal expert identification and network-based method. It is different from existing approaches in that its application domain is EC analysis, and it uses a two-phase analysis approach that considers not only participating experts but also associated EC objects manipulated by them based on the Artefact-Actor-Network approach [5].
2.2. Engineering change analysis
Engineering changes (ECs) are modifications to dimensions, fits, forms, functions and materials in products or components after the product design has been released [13,14]. In order to maintain product data consistency, manufacturers establish a strict company-wide engineering change management (ECM) procedure that controls the processes and associated product data for ECs. Engineering change analysis (ECA) is a base data processing system for evaluation of ECs and predicting EC propagations, which
can support various decision-making tasks during ECM. Information systems supporting ECA should be flexible and expressive enough to support analysis of unstructured and uncertain EC processes. Among the tasks in ECA, typical tasks addressed in research include EC evaluation and EC propagation [15–19]. For both tasks, detecting similar ECs is a critical component in their data processing. In terms of PDM integration, existing approaches to ECA support tools fall into three groups [3]. The first group [20–24] establishes matrices that represent dependency relationships between ECs and related items, functions or design parameters. Since the relationships depend on the acquisition and representation of human expert knowledge, it is difficult for the approach to provide automated and flexible supporting information tools for ECA. The second group [25] builds an EC database separate from the PDM database and aims to provide a fully computerized ECA system. The EC database records all ECs in detail and scans the data to evaluate the EC every time an ECR is submitted. This group uses past EC data in the database to reveal functional dependencies without defining the dependencies between design components. However, since the EC database is separated from the PDM database, product data managed in the PDM database must be duplicated, which may lead to inefficiencies or errors in product data management. The third group [3], including the approach proposed in this study, provides a flexible and interactive environment for ECA using the OLAM architecture and integrated PDM databases as its operational data for ECA. One approach analyzed similar ECs without dedicated information tools for ECA [7]. To find similar ECs from past EC archives, it uses case-based reasoning (CBR).
To rate ECs in CBR, they provide 5 dimensions for ECs (solution, product, process, problem and component) with their weighting factors decided by the analytic hierarchy process (AHP) [26]. However, they did not use automated information tools to gather or process EC data. For example, to calculate measures for the component dimension, they manually counted participating parts for ECs using displayed CAD models. This study differs from existing ECA approaches to finding similar ECs. It uses EC history data prepared by multidimensional data analysis that is connected to an operational PDM database. Using computerized data, it can automate CBR calculation using the EC history and other attributes of aggregated EC objects.
2.3. Product data analytics
Product data analytics (PDA) is a data analysis approach that uses a PDM database as the operational data. It also selects data-driven analysis methods and suitable measures to analyze and evaluate different aspects of product development processes [3,27,28]. Using PDA, manufacturers can gain several advantages. First, they can reuse the accumulated product data in their PDM databases. The more product data gathered in the PDM database, the more accurate the data-driven analysis results are. Second, since the PDM database and analysis tools can be implemented as a computerized information system, they can automate the analysis procedure using computerized information systems. Third, since it can integrate the analysis procedure within the main product development process using PDM systems, the automated environment can support engineers with flexible and interactive analysis tools to answer various engineering problems during the main product development process. The proposed analysis procedure is based on the PDA approach, so it can enjoy all the advantages of PDA. In addition, it integrates two different analysis methods, multidimensional data analysis to find similar ECs and SNA for participating engineers, and can extract both EC and participation data from a PDM database.
3. An expert identification procedure for ECs
3.1. Configuration control processes and EC objects
This paper aims to evaluate and identify suitable experts for a specific EC problem using EC objects in a PDM database during configuration control processes. Configuration control processes provide steps of reviews or approvals through participating design engineers and decision makers of the current ECs represented by EC objects. Fig. 2 shows a configuration control process with participating engineers and an EC object. During the configuration control process, participating engineers review and approve an EC represented by the EC object (see the EC object in Fig. 2). They have certain access authorities to the EC object and use communication media to exchange opinions and knowledge. The access authorities and media exchanges will build networks between engineers (see the participating engineers in Fig. 2). The process also manages EC objects that link old and new items, representing changes in product specifications (see the old and new items in Fig. 2). The old and new items have their own product structures, which link other items as their components, and may share common items in their product structures. EC history represents the differences between old and new items. EC history can be recorded during changes or calculated from the differences between the old and new product structures (see EC history in Fig. 2). Current PDM databases contain recorded information about all participating engineers, EC objects, product structures and even change histories during configuration control processes (see PDM database in Fig. 2). To manage the product data, the PDM database requires expressive product data models that can represent the configuration control processes with the associated engineers, items and EC data. Section 4.1 describes a product data model that can represent all the objects in a PDM database, and Section 5 describes implementation of the PDM database as a prototype supporting information system.
Fig. 2. A configuration control process and EC object using a PDM database.
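As a minimal sketch of the remark in Section 3.1 that EC history can be calculated from the differences between the old and new product structures, the Python snippet below compares two structures given as (parent, child) pairs. The function name, the item names and the "add" operator label are assumptions made for illustration; only the 'cut' operator value is mentioned in this paper, and the prototype may derive the history differently.

```python
def ec_history_from_structures(old_edges, new_edges):
    """Derive a simple EC history from two product structures.

    Each structure is an iterable of (parent_item, child_item) pairs.
    Returns a list of (parent, child, operator) tuples, where the operator
    is "cut" for a relationship removed from the old structure and "add"
    (a hypothetical label) for a relationship that only the new one has.
    """
    old_set, new_set = set(old_edges), set(new_edges)
    # Relationships present only in the old structure were removed.
    history = [(p, c, "cut") for p, c in sorted(old_set - new_set)]
    # Relationships present only in the new structure were introduced.
    history += [(p, c, "add") for p, c in sorted(new_set - old_set)]
    return history


# Hypothetical old and new structures of one item under change.
old_structure = [("pump", "housing"), ("pump", "motor_a")]
new_structure = [("pump", "housing"), ("pump", "motor_b")]
for entry in ec_history_from_structures(old_structure, new_structure):
    print(entry)
```

Representing a structure as a set of unit parent-child relationships keeps the comparison to two set differences; a recorded history would additionally carry timestamps, which this sketch omits.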
3.2. Artefact-Actor-Networks and evaluation procedure
As Reinhardt et al. [4] suggested, the data networks around the configuration control process in Fig. 2 can be represented with the actor and artefact networks. The actor network consists of the participating engineers, who build the network during their participation in terms of review or approval. The artefact network consists of ECs and old and new participating items with their product structures. The configuration control process links the two networks and manages the review and approval steps. This study considers a two-phase data analysis procedure to identify experts based on the actor and artefact networks. First, it compares EC histories from the artefact networks to find similar ECs from the archived EC objects in a PDM database. Then, in the second phase, it analyzes the actor networks associated with the ECs to identify suitable experts.
Phase 1: Select similar ECs by comparing EC histories and their attributes in artefact networks
Phase 2: Identify experts by analyzing actor networks of selected ECs
The two phases are refined using the applied data analysis procedure in Section 4.3. Establishing the analysis procedure requires measures to evaluate the networks derived from the PDM databases. This paper suggests a set of measures in the next section.
3.3. Measures for finding ECs and identifying experts
To find similar ECs and identify suitable experts, the procedure requires appropriate measures for the evaluation. Different measures can be used depending on the applied analysis methods and data availability. For the applied data analysis methods, analysts consider both the types of measures needed for the analysis and the availability of input data from PDM databases. This study is limited to measures that can be derived from PDM databases because it follows the PDA approach. Within that constraint, each manufacturer can choose or develop different measures for its own expert identification procedure.
As an illustrative example, this section selects a set of representative measures for the two evaluation procedures for expert identification, which can be developed from data in a general PDM database. They are applied to example applications discussed in Section 5. Table 1 shows a set of measures that can be used to find ECs and identify experts. To match similar ECs, it uses the types of EC attributes of the two ECs. The types of ECs represent the types of problems that cause ECs. The types of old and new items represent the types of target products of ECs. Similarities in EC history represent similarities between product structure changes among ECs. The types of EC processes classify ECs for pilot product development from those for general product development. The data structures of EC history and how to detect the similarities will be described in Section 4.3. To detect similarities among ECs, the procedure uses CBR, which calculates a final similarity rate using the value of each measure through predefined weighting factors.
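The weighted combination described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the prototype's code: the function and dictionary names are hypothetical, the weights are the values that Table 2 reports from [7], and renormalizing over only the dimensions that have a score (for instance when the solution dimension is not available) is an assumption of this sketch.

```python
# Dimension weights as reported in Table 2 (taken from [7]).
WEIGHTS = {
    "product": 0.094,
    "component": 0.447,
    "problem": 0.21,
    "solution": 0.214,
    "process": 0.035,
}


def weighted_similarity(scores, weights=WEIGHTS):
    """Combine per-dimension similarities (each in [0, 1]) into one rate.

    Computes sum(w_i * s_i) / sum(w_i) over the dimensions that have a
    score. Skipping unscored dimensions is an assumption of this sketch.
    """
    used = [d for d in weights if d in scores]
    if not used:
        raise ValueError("no scored dimension matches the weight table")
    for d in used:
        if not 0.0 <= scores[d] <= 1.0:
            raise ValueError(f"similarity for {d!r} must lie in [0, 1]")
    total_weight = sum(weights[d] for d in used)
    return sum(weights[d] * scores[d] for d in used) / total_weight


# Hypothetical comparison of two ECs: same problem type and process type,
# half of the changed components in common, different products.
example = {"product": 0.0, "component": 0.5, "problem": 1.0, "process": 1.0}
print(round(weighted_similarity(example), 3))
```

Dividing by the sum of the weights that were actually used keeps the result on the 0 to 1 scale even when a dimension is left out.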
To evaluate participating engineers, this study applies SNA to the groups of engineers that participated in a configuration control process. The SNA calculates the centralities of each engineer in networks built during configuration control processes. Table 1 considers two types of centrality for engineers in a network, closeness and betweenness centralities. How to identify experts for specific ECs using the measures is described in Section 4.3. 4. Architecture and data models of the supporting information system This section introduces the architecture of a computer-based data analysis tool to illustrate how the proposed analysis procedure works using different data models, databases and analysis models. Fig. 3 shows the architecture, which consists of a PDM system with a product data model and PDM database (Section 4.1), a data cube with a multidimensional data model, an extract, transform and load (ETL) module and OLAP (Section 4.2), and data analysis models with SNA/CBR models (Section 4.3). 4.1. Product data model and PDM databases PDM is a computer-based information system that manages comprehensive product data and processes to support efficient product development. It consists of a PDM system and database (see the PDM system and database in Fig. 3). The PDM database stores all product development data represented by core objects, including items, engineering documents, product structures and engineering changes. As described in Section 3, the analysis procedure uses relationships between participating engineers (the actor networks) and the product data (the artefact networks) in the PDM database generated during product configuration control processes. A product data model represents how product data for product development are specified in a PDM database (see the product data model in Fig. 3). The ETL module and data cube should consider the product data model to prepare their input data. Fig. 
4 shows core objects and their relationships in the proposed product data model. The person object represents a set of authorized engineers who can create, own or read objects in PDM systems (see the person object in Fig. 4). They can create items, engineering documents and product structures, and associate them to represent comprehensive and consistent product specifications.
Fig. 3. Proposed architecture of the supporting information system.
Table 1
Measures for the expert identification analysis.
Types | Measures
Phase 1: Finding Similar ECs | Type of ECs
Phase 1: Finding Similar ECs | Type of participating products
Phase 1: Finding Similar ECs | Similarities in EC history
Phase 1: Finding Similar ECs | Type of EC processes
Phase 2: Evaluate Engineers | Closeness centrality in actor network
Phase 2: Evaluate Engineers | Betweenness centrality in actor network
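To make the two Phase 2 measures in Table 1 concrete, the following Python sketch computes closeness and betweenness centrality on a small undirected network using only the standard library. It follows the definitions given in Section 4.3.3 (closeness as n - 1 divided by the sum of shortest-path distances, betweenness as the summed share of shortest paths that pass through a node). The engineer names, the one-mode network, the use of Brandes' algorithm and the counting of each unordered node pair once are assumptions of this sketch; the paper itself works with 2-mode networks of engineers and EC objects and relies on an SNA engine.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def closeness(adj):
    """(n - 1) / sum of distances; 0.0 if a node cannot reach all others."""
    n = len(adj)
    result = {}
    for v in adj:
        dist = bfs_distances(adj, v)
        total = sum(dist.values())
        result[v] = (n - 1) / total if len(dist) == n and total > 0 else 0.0
    return result


def betweenness(adj):
    """Brandes' algorithm for an unweighted, undirected network."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        order = []                      # nodes in non-decreasing distance
        preds = {v: [] for v in adj}    # predecessors on shortest paths
        sigma = {v: 0 for v in adj}     # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
                if dist[w] == dist[u] + 1:
                    sigma[w] += sigma[u]
                    preds[w].append(u)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):       # accumulate dependencies backwards
            for u in preds[w]:
                delta[u] += sigma[u] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was visited from both ends, so halve the totals.
    return {v: b / 2 for v, b in score.items()}


# Hypothetical actor network: engineer e1 reviewed ECs with e2, e3 and e4.
network = {"e1": ["e2", "e3", "e4"], "e2": ["e1"], "e3": ["e1"], "e4": ["e1"]}
print(closeness(network))
print(betweenness(network))
```

In this star-shaped example the central engineer obtains the highest value on both measures, which is the kind of position the procedure looks for when locating key actors.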
Fig. 4. Core objects and their relationships in the proposed product data model.
The object represents general objects including the item and EC objects (see the object, item and EC objects in Fig. 4). This will help the two subclasses, the item and EC objects, share relationships with the owner, action and document objects. The owner object associates the person with the item or EC objects to represent specific access authorities, such as read, edit, approval and delete operations (see the owner object in Fig. 4). One of the subclasses of the object, the item object, represents products or parts that design engineers are developing or sharing during their product development. The structure object represents constituent relationships between items, and usually shows the assembly structures of the participating items (see the structure object in Fig. 4). Its relating attribute represents its assembly part and the related attribute represents its component part. A set of constituent relationships linked to each other through the participating items forms product structures. The EC objects represent changes to products through two attributes, the input and output attributes, which represent the old item before the changes and the new item after the changes, respectively (see the EC object in Fig. 4). The action objects represent a sequence of stages a target item or EC object should go through (see the action object in Fig. 4). The stages can change the status of items or ECs and include working, reviewed or released conditions. Therefore, action objects can control and manage the lifecycle of items or ECs. The proposed procedure identifies experts for ECs using only the core objects in a PDM database, which many manufacturing companies have applied to their product development.
4.2. Multidimensional data model, data cube and OLAP
The ETL module extracts, transforms and loads product data in the PDM database to a data cube that stores product data for multidimensional data analysis and preparation of input data for different data analysis models (see the extract/transform/load module and the data cube in Fig. 3). The data cube should have a multidimensional data model, which reorganizes the product data into special-purpose data structures: facts and their associated dimensions. Fig. 5 shows the proposed multidimensional data model represented in mUML [29]. The multidimensional data model supplies input data for the EC history analysis and SNA during the proposed analysis procedure. The multidimensional data model introduces a star constellation schema that consists of the EC history and owner fact classes (see the EC_history and owner classes in Fig. 5). The EC_history fact and associated dimension classes are borrowed from the EC history-centered multidimensional data model proposed by Do [3]. The multidimensional data model selects the EC history as one of its facts (see the EC history fact in Fig. 5). A fact is a central theme of a multidimensional data model to which all dimensions are linked [8]. The attribute value of the fact used as a unit for decision-making can be the weight, cost or other value. The proposed model uses unique identifications of EC history as its basic measure. The dimensions related to the EC history fact include the time_create, rel_parent, rel_child, operator and belonging_EC dimensions (see the dimensions and their relationships with the
Fig. 5. Proposed multidimensional data model.
fact in Fig. 5). A dimension is a view or feature of the fact being studied [8]. The dimensions represent attributes of the EC history in Fig. 5. The rel_parent (relationship with the parent product) and rel_child (relationship with the component) represent the target unit product structures of the EC history. Each dimension may have hierarchies. For example, the time dimension has the hierarchy of day, month and year. Hierarchies allow roll-up and drill-down OLAP operations during multidimensional data analysis. From the view of data structure, the EC history fact consists of a unit product structure and associated operations for product structure changes. Because the EC history fact represents a product structure with applied operations, it has two relationships with the item dimension, the rel_parent and rel_child. Through the relationships with the item dimension, it can locate exact items that participate in EC histories. The relationship with the EC dimension reveals which EC the fact belongs to, and the EC can group a set of EC history facts as its history. By determining the relationships between dimensions around the fact in the proposed multidimensional data model, engineers can evaluate and analyze ECs. For example, if an engineer is interested in which EC has the largest number of deleted products in its product structures during the changes, he or she can associate and analyze relationships among the rel_child, operator (its value may be ‘cut') and belonging_EC dimensions around the EC history fact. During the analysis, OLAP can provide various queries and navigation operations in the data cube. The other fact class, the owner class, contains the owner, target and time_create dimensions. They enable the proposed procedure to build social networks of participating engineers for a specific EC. The owner fact allows analysts to determine relationships between the owner and EC dimensions. 
Its owner_type attribute, which specifies the types of access privileges of a person, also allows analysts to analyze and retrieve different relationships between the EC and owner objects. The OLAP in Fig. 3 supports OLAP operations including slicing, pivoting and drill down/up [8] on the data cube, which help analysts to clarify relationships between different dimensions around the facts (see the dimensions and facts in Fig. 5). Multidimensional data analysis is also an effective tool for the exploratory data analysis phase, where analysts try to determine the scope of an unknown target analysis domain and reveal its characteristics. In addition, it can be used to prepare input data for different data mining models. In this study, the OLAP specifies EC
history dimensions for similarity detection and networks of participating engineers for SNA. 4.3. Analysis of EC history, case-based reasoning and social network analysis This section describes the proposed procedure for calculating the similarities between ECs and selecting experts who participate in selected ECs. Fig. 6 shows each step of the proposed procedure in IDEF0 notation [30]. The following sections detail each step in Fig. 6. 4.3.1. Analysis of EC history As mentioned in the introduction section, this study uses EC history data from PDM databases to automatically detect similarities in ECs using the components of its participating items. To prepare the EC history analysis, this study uses the EC object, EC history and EC history-centered multidimensional data model proposed by Do [3]. The study introduces the EC objects of management data, input/output items and EC history. It also describes operations of product structures that cause changes to product structures. Based on the EC object and operations, it defines EC history as a sequence of unit product structure and applied operation pairs that record the change in product structures. It defines the element of EC history using a predicate that has a unit product structure, operator and timestamp attributes. It also proposes an EC history-centered multidimensional data model that considers the EC object and EC history predicate. This study uses the data model as part of the proposed multidimensional model (see the EC_history class and associated dimension classes in Fig. 5). The procedure extracts EC data from EC objects (see the input of the extract EC data step in Fig. 6) and prepares a multidimensional database (a data cube) using the EC data (see the prepare data cube step). Using OLAP operations on the multidimensional database (a cube) based on the proposed multidimensional model, analysts can retrieve all participating items (object identifiers of the items) in the EC history of each EC object. 
By comparing and counting the same items in different EC histories from two EC objects, it is possible to compare similarities between them in terms of participating components of ECs. Section 5.3 describes details of the comparisons and calculations to check similarities of example ECs.
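The comparison of participating items described in Section 4.3.1 can be sketched in Python as below. The sketch assumes that the EC history of each EC has already been retrieved from the data cube as (parent, child, operator) tuples; the function names, item identifiers and the normalization by the size of the union (a Jaccard ratio) are assumptions for illustration, since the paper only states that the same items are compared and counted.

```python
def participating_items(history):
    """Collect the identifiers of all items that appear in an EC history.

    history is an iterable of (parent_item, child_item, operator) tuples.
    """
    items = set()
    for parent, child, _operator in history:
        items.add(parent)
        items.add(child)
    return items


def component_similarity(history_a, history_b):
    """Return (number of common items, common / union ratio in [0, 1])."""
    items_a = participating_items(history_a)
    items_b = participating_items(history_b)
    common = items_a & items_b
    union = items_a | items_b
    ratio = len(common) / len(union) if union else 0.0
    return len(common), ratio


# Hypothetical EC histories of two archived EC objects.
ec_1 = [("pump", "motor_a", "cut"), ("pump", "motor_b", "add")]
ec_2 = [("pump", "motor_b", "cut"), ("pump", "motor_c", "add")]
print(component_similarity(ec_1, ec_2))
```

A ratio on the 0 to 1 scale is convenient because it can be used directly as the component dimension score in a weighted similarity calculation.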
Fig. 6. Proposed PDA procedure for identifying experts for ECs.
4.3.2. Case-based reasoning
The CBR model in the architecture is used to detect similarities in ECs from EC objects in the PDM database (see CBR in Fig. 3 and the CBR for similarity step in Fig. 6). Matching similar ECs is an ill-structured and complex query problem, since the EC object is an aggregate of related objects, and the retrieval should find matching ECs using the attributes of the aggregated objects (see the EC and the related item, owner, action, document and EC_history objects in Figs. 4 and 5). CBR can be used to solve the problem by allowing retrieval and calculation of attribute values from the EC and its aggregated objects including EC history. To compare different attributes of aggregated objects during the similarity calculation, the model requires dimensions that classify the attribute values, so that the value of each dimension can be aggregated using pre-defined weighting factors to produce a single final similarity measure. For the dimensions, this study uses a portion of the key dimensions of performance measures in automobile development projects [7,31]. They include the product, components, problem, solution and process dimensions. For each dimension, this study uses specific attributes of the EC and EC history objects in Table 2. All matching criteria except the component dimension use simple keyword mapping among attribute values of ECs and their aggregated objects. The component dimension, which indicates how many of the same components have changed during ECs, is a leading factor in determining similarities in ECs. Lee et al. [7] reported the weights of dimensions determined by professionals using AHP. For example, the relative normalized weight of the component dimension is 0.447 (see the weights column in Table 2). To calculate the final similarity measure between two EC objects, this study uses a weighted sum of the similarity measures for each dimension:
\[ \mathrm{similarity}(E_a, E_b) = \sum_i w_i \, \mathrm{similarity}(E_a.d_i, E_b.d_i) \Big/ \sum_i w_i \]
where $w_i$ is the weight of dimension $i$ (see the weights column in Table 2), and $E_a.d_i$ and $E_b.d_i$ represent dimension $i$ of the respective EC objects. The similarity function of each dimension returns a value on a scale of 0–1.

4.3.3. Social network analysis

Social network analysis (SNA) is a set of analysis methods based on graph theory that describe social networks and their features [32]. This study uses SNA to identify experts among participating engineers in similar ECs selected from the first phase. To identify experts, the proposed procedure applies two measures (closeness and betweenness centralities) in the networks among participating engineers in ECs. The closeness centrality is measured based on distances between nodes in networks [33]. This study defines the closeness of node $v$ in a network with $n$ nodes as follows:

$$C_c(v) = \frac{n - 1}{\sum_{j \neq v} d(v, j)}, \quad v = 1, \ldots, n$$
Table 2. The dimensions for similarity calculation in CBR.
Dimension | Matching criteria | Weight [7]
Product | Old and new product IDs of two ECs | 0.094
Component | Number of same components in EC histories | 0.447
Problem | Application type of ECs | 0.21
Solution | Type of solutions for ECs (N/A in this study) | 0.214
Process | Regular ECs or ECs for pilot production | 0.035
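The weighted sum of Section 4.3.2 can be illustrated with a minimal Python sketch that uses the weights of Table 2. The dictionary layout of an EC, the helper names (`keyword_match`, `component_overlap`, `ec_similarity`) and the example values are hypothetical. Scaling the component count by the size of the first EC's history is only one assumed way to bring that dimension onto the 0–1 scale; the source does not state how it is normalized.

```python
# Dimension weights copied from Table 2 (reported by Lee et al. [7]).
WEIGHTS = {
    "product": 0.094,
    "component": 0.447,
    "problem": 0.21,
    "solution": 0.214,
    "process": 0.035,
}


def keyword_match(a, b):
    """Simple keyword matching: 1.0 if both values are present and equal, else 0.0."""
    return 1.0 if a is not None and a == b else 0.0


def component_overlap(items_a, items_b):
    """Share of EC a's changed items that also changed in EC b (0..1).

    Assumption: the paper counts common components; this scaling is illustrative.
    """
    items_a, items_b = set(items_a), set(items_b)
    return len(items_a & items_b) / len(items_a) if items_a else 0.0


def ec_similarity(ec_a, ec_b, weights=WEIGHTS):
    """Return sum_i(w_i * sim_i) / sum_i(w_i) over the dimensions in `weights`."""
    per_dimension = {
        "product": keyword_match(ec_a["product"], ec_b["product"]),
        "component": component_overlap(ec_a["items"], ec_b["items"]),
        "problem": keyword_match(ec_a["problem"], ec_b["problem"]),
        "solution": keyword_match(ec_a.get("solution"), ec_b.get("solution")),
        "process": keyword_match(ec_a["process"], ec_b["process"]),
    }
    total_weight = sum(weights.values())
    return sum(weights[d] * per_dimension[d] for d in weights) / total_weight


# Hypothetical EC records; the solution dimension is omitted (N/A in this study).
ec_a = {"product": "PUMP-A", "items": [1, 2, 3, 4],
        "problem": "design error", "process": "regular"}
ec_b = {"product": "PUMP-A", "items": [2, 3, 9],
        "problem": "cost reduction", "process": "regular"}

weights_without_solution = {d: w for d, w in WEIGHTS.items() if d != "solution"}
score = ec_similarity(ec_a, ec_b, weights_without_solution)
print(round(score, 3))
```

Because the formula divides by the sum of the weights actually used, dropping the solution dimension simply renormalizes the remaining weights, so the result stays on the 0–1 scale.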
In the closeness centrality, $d(v, j)$ is the distance between the two nodes $v$ and $j$. The betweenness centrality counts how many times a node lies on the shortest paths between two other nodes in a network:

$$C_b(v) = \sum_{j \neq v \neq i} \frac{g_{ivj}}{g_{ij}}, \quad v = 1, \ldots, n$$

where $g_{ij}$ is the number of shortest paths between nodes $i$ and $j$, and $g_{ivj}$ is the number of shortest paths between nodes $i$ and $j$ that pass through node $v$. The closeness and betweenness centralities detect the position of actors in a network, which shows how often each actor is connected to other participants and links them. Therefore, they can be used to locate key actors or leaders who communicate with other actors more often and have access to more information and resources. From the view of the whole analysis procedure, SNA locates key or leader experts who worked on selected similar ECs. The procedure builds adjust networks from the data for participating engineers and experts during configuration control processes for specific EC objects (see the build adjust network step in Fig. 6). The result will be 2-mode social adjust networks that consist of participating engineers and selected EC objects. Using SNA, the procedure selects experts from the social network (see the calculate centrality step in Fig. 6). The 2-mode social networks and their analysis using SNA are described in Section 5.4 with illustrative examples.

5. Implementation of the expert identification procedure

This section describes the implementation of the proposed product data analysis procedure using a prototype PDM system. To illustrate the implementation, it introduces an example product (Section 5.1). Using the example, it describes preparation of a data cube from a PDM database (Section 5.2), analysis of EC history to find similar ECs (Section 5.3), and social network analysis of participating engineers (Section 5.4).

5.1. Example product design

The product design data in Fig. 7 show an example product and its product structures. The product structures are depicted with solid lines (see the line between Pump with Motor and Housing with Motor, which signifies that the housing is a subcomponent of the pump). The example product design data in Fig.
7 are represented using a prototype PDM system [34] that supports the proposed product data model (see the proposed product data model in Fig. 4). Fig. 8 shows the output of the PDM system that implements the example items in Fig. 7 with their product structures. Using the EC management functions of the PDM system, this study has changed the example product structure ten times. Fig. 8 illustrates the ten ECs, including the first EC (the introduction of a new product), with their EC numbers indicated. The PDM system links each EC to a configuration control process represented by the action object (see the action object in Fig. 4) for review and approval by participating engineers (see Fig. 9). This example uses twenty participating engineers as reviewers or approvers for the configuration control processes. The product data in Fig. 8, including data from participating engineers, are stored in a PDM database that is implemented using the MySQL database management system [35]. This study extracts, transforms and loads the product data to a pivot table, and implements the data cube in Fig. 3 using Microsoft Excel [36]. Microsoft Excel also supplies both an OLAP engine and its client functions, including search and filter modules. The implementation uses database views, Open Database Connectivity (ODBC) connections and query utility programs for the ETL function in Fig. 3.

Fig. 7. Example product.

Fig. 8. Visualization of engineering changes applied to the example.

5.2. Data cube from a PDM database

A database query statement in SQL (Structured Query Language) for the ETL function joins tables in the PDM database to build aggregated product data that will be translated into multidimensional product data in a pivot table (a data cube). This produces aggregated product data that contain the EC_history and owner facts and associated dimensions of the proposed multidimensional data model, as shown in Fig. 5.

As mentioned in Sections 3.1 and 4.2, EC history represents differences between the old and new products in an EC and can be represented with a set of unit product structures using applied operations. To extract participating items in the EC history, the implementation includes a database application that can calculate differences between the old and new product structures in an EC, and extract items from the calculated EC history. Fig. 10 shows a pivot table that is transferred from the EC data in the PDM database to find similar ECs. Its fact is the EC_history object and its associated dimension is the EC object (see the EC_history and EC objects in Fig. 5). The numbers in the cid row in Fig. 10 are the identifiers of items from the extracted EC histories. Therefore, the pivot table itself provides an occurrence matrix of items in ECs. The pivot table shows which ECs (the ec_no column in Fig. 10) each participating item in the EC history (the cid row in Fig. 10) belongs to. For example, the participating item that has 2 as its identifier belongs to EC002, EC004 and EC005.

5.3. Multidimensional data analysis to find similar ECs
From the pivot table in Fig. 10, one can calculate measures to find similar ECs. Fig. 11 shows calculations for finding ECs that are similar to EC002 based on the EC history. The EC002 row shows occurrence numbers for its participating items. The occurrence matrix below it, calculated from the EC history pivot table, indicates which ECs share participating items with EC002. If an EC has the same participating item as EC002, it is given the value 1 in the column representing that item's system identifier. For example, EC004 has an occurrence value of 1 for item 2, since item 2 is one of its participating items and also participated in the EC history of EC002. To calculate the EC history measure, the occurrence values for each EC are totaled (see the ratio column in Fig. 11). Using the EC history measure and other performance indicators through the CBR approach described in Section 4.3.2, one can calculate the final measure for similar ECs and find a set of ECs using the result.
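The occurrence counting described above can be sketched in a few lines of Python. The rows below are hypothetical stand-ins for the EC_history fact table; only the membership of item 2 in EC002, EC004 and EC005 is taken from the text. The function names are illustrative, and the sketch totals raw occurrence values; any further normalization behind the ratio column in Fig. 11 is not specified in the text and is not reproduced here.

```python
from collections import defaultdict

# Hypothetical (ec_no, cid) rows standing in for the EC_history fact table.
EC_HISTORY_ROWS = [
    ("EC002", 2), ("EC002", 5), ("EC002", 7),
    ("EC003", 9),
    ("EC004", 2), ("EC004", 5),
    ("EC005", 2), ("EC005", 8),
]


def build_occurrence(rows):
    """Pivot (ec_no, cid) rows into {ec_no: set of participating item ids}."""
    occurrence = defaultdict(set)
    for ec_no, cid in rows:
        occurrence[ec_no].add(cid)
    return dict(occurrence)


def ec_history_measure(occurrence, target):
    """For every other EC, total the items it shares with the target EC."""
    target_items = occurrence[target]
    return {
        ec_no: len(items & target_items)
        for ec_no, items in occurrence.items()
        if ec_no != target
    }


occurrence = build_occurrence(EC_HISTORY_ROWS)
measure = ec_history_measure(occurrence, "EC002")

# Rank candidate ECs by shared items, breaking ties by EC number.
ranked = sorted(measure.items(), key=lambda kv: (-kv[1], kv[0]))
print(ranked)
```

Storing each EC's items as a set makes the comparison with the target EC a plain set intersection, which mirrors the 0/1 occurrence values totaled per EC.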
Fig. 9. Implementation of a configuration control process using the action objects.
Fig. 10. Pivot tables from engineering change history data of example ECs.
Fig. 11. Calculation of measures from EC history.
Fig. 12. Adjacency matrix representing participating engineers.
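To make the closeness and betweenness measures of Section 4.3.3 concrete for a 2-mode network of the kind captioned in Fig. 12, the following Python sketch computes both on a small, connected example. The participation data and function names are hypothetical, and plain Python stands in for the sna package in R used by the study. The betweenness sum here runs over unordered node pairs, which is an assumption; tools that count ordered pairs report doubled values on undirected graphs.

```python
from collections import deque
from itertools import combinations

# Hypothetical data: EC -> engineers in its configuration control process.
PARTICIPATION = {
    "EC004": ["P001", "P002", "P003"],
    "EC005": ["P003", "P012"],
    "EC010": ["P012", "P015"],
}


def build_two_mode_graph(participation):
    """Undirected adjacency sets linking each EC node to its engineer nodes."""
    graph = {}
    for ec, engineers in participation.items():
        for person in engineers:
            graph.setdefault(ec, set()).add(person)
            graph.setdefault(person, set()).add(ec)
    return graph


def bfs(graph, source):
    """Return (distances, shortest-path counts) from source to reachable nodes."""
    dist, sigma = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in graph[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:
                sigma[w] += sigma[u]
    return dist, sigma


def centralities(graph):
    """Closeness (n - 1) / sum_j d(v, j) and betweenness sum g_ivj / g_ij."""
    nodes = sorted(graph)
    n = len(nodes)
    info = {v: bfs(graph, v) for v in nodes}
    closeness, betweenness = {}, {}
    for v in nodes:
        dist_v, sigma_v = info[v]
        total = sum(dist_v.values())
        # Closeness is left at 0.0 when some node is unreachable from v.
        closeness[v] = (n - 1) / total if len(dist_v) == n and total else 0.0
        score = 0.0
        for i, j in combinations(nodes, 2):
            dist_i, sigma_i = info[i]
            if v in (i, j) or j not in dist_i or v not in dist_i:
                continue
            # v lies on a shortest i-j path iff d(i,v) + d(v,j) == d(i,j).
            if dist_i[v] + dist_v[j] == dist_i[j]:
                score += sigma_i[v] * sigma_v[j] / sigma_i[j]
        betweenness[v] = score
    return closeness, betweenness


graph = build_two_mode_graph(PARTICIPATION)
closeness, betweenness = centralities(graph)
engineers = sorted(p for p in graph if p.startswith("P"))
top = max(engineers, key=lambda p: betweenness[p])
print(top, round(closeness[top], 2), betweenness[top])
```

In this toy network the engineer who bridges the largest groups of ECs and colleagues receives the highest betweenness, which is the kind of linking role that centrality analysis is meant to surface.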
5.4. Social network analysis

The example built a network of participating engineers in a set of selected ECs using the owner objects, which associate the person and EC objects. If two different engineers participate in a configuration control process, they have a relationship through the EC for the configuration control. As a result, the owner objects provide a 2-mode network that represents relationships between participating engineers and selected ECs. To calculate measures of the extracted network, an adjacency matrix that represents the network is prepared in R [37]. Fig. 12 shows an adjacency matrix that represents the 2-mode network of ten engineers and five selected ECs in R. The five selected ECs (see EC004–EC010 in Fig. 12) came from the first phase of the analysis procedure, and the ten participating engineers (see P001–P015 in Fig. 12) are those who participated in the configuration control processes for the selected ECs. Fig. 13 displays a visualized graph of the network in Fig. 12. The visualization shows that there are two groups of ECs and that engineers P003 and P012 have close relationships with other participants. Fig. 14 shows the result of SNA using the sna package in R [38]. The variable R is a matrix that stores the result of SNA, and round() is a function that rounds the output numbers. The closeness and betweenness represent the closeness and betweenness centralities in Table 1. The output lists the values of each centrality in order of the sequence of the participating engineers (see P001 to P015 in Fig. 12). From the result, engineers P003 and P012 (the third and eighth engineers) have high closeness and betweenness centralities {Cc = (0.56, 0.61), Cb = (18, 40)}. In the example scenario, since engineer P003 is an approver in charge of the ECs, he/she has a high centrality value. In the case of engineer P012, since he/she participated in both groups of ECs and performed roles to connect the ECs, he/she has a high centrality value. As a result, through the first and second phases, the procedure can recommend a set of engineers who have experienced similar ECs and performed central roles during the ECs.

Fig. 13. Visualized social network from the matrix in Fig. 12.

Fig. 14. SNA result.

6. Discussion

Due to the limited unit of data analysis and closed commercial PDM databases, there may be constraints when it comes to solving real problems. First, isolation of the unit of data analysis to a PDM database can limit the applicability of the proposed procedure. Since the unit of data analysis of the expert identification procedure is limited to integrated data in PDM databases, the proposed procedure may not be applicable to extended ECs and expert evaluation problems. For example, if evaluation of experts needs additional attributes that are not managed by PDM databases, it may require additional data from other company-wide information systems. Second, implementation of the proposed procedure may be difficult because commercial PDM or PLM systems provide limited access to their databases. Commercial PLM systems allow limited access to the product data in their databases through specific application programming interfaces. Because commercial PLM databases are closed, real-world application of the procedure may face problems in interfacing the EC and expert data in a PDM database with multidimensional data analysis and SNA engines.

Despite the limitations of the current implementation, the author can list several possibilities for the proposed procedure. First, the examples show that the operational data in a PDM database can support the expert identification procedure, which requires different EC and expert data. It supports all necessary data for the analysis and evaluation procedure using operational product data in a PDM database without additional effort for data preparation. Second, the examples show that OLAP using a multidimensional database enables flexible and interactive data processing that can support data preparation for both similar ECs and expert identification. Flexible and interactive OLAP allows analysts to explore the unit of domain through exploratory data analysis and easily prepare input data for different data analysis and evaluation processes. The unified data preparation environment based on OLAP can relieve analysts' burden for data preparation. Third, the proposed approach can utilize the large amount of EC and product data accumulated in operational PDM databases to increase the quality of the expert identification procedure. Many reports indicate that increasing the amount of input data for analysis or estimation processes can dramatically increase the quality of the analysis results [39]. Since product development projects will produce a large amount of EC data in PDM databases, the same effect can be expected in real-world application of the procedure based on accumulated EC data in PDM databases.

7. Conclusion
This paper proposed a data analysis procedure to identify suitable experts during ECM. To support an efficient data analysis procedure, it applies the PDA approach so that it can reuse ordinary PDM databases as its source of analysis data. It devises a two-phase analysis procedure based on the actor and artefact networks, which provide flexible and modular analysis models. Its EC history-based approach to finding similar ECs also enables analysts to implement fully automated analysis procedures based on computerized information management tools. This study also provides an information system architecture that can support the proposed analysis procedure. It consists of a product data model, an ETL module, a multidimensional data model and OLAP/SNA models and engines. Using the architecture, this study implemented a prototype PDM system with its database based on the proposed product data model, a data cube based on the multidimensional data model, and a data analysis system using commercial SNA and OLAP engines. To evaluate the proposed procedure and its implementation, this study analyzed example product design data with multiple EC applications using the implemented prototype analysis system. Using the product and multidimensional data models, it selects similar ECs by comparing the EC history of a given EC with those of existing ECs. From the selected ECs, it can identify experts who have played key roles in the actor networks using SNA models and engines. The result shows that it can identify a set of suitable engineers for ECs through a series of applications of the PDM database, data models, data cube and data analysis models and engines. This approach has several limitations. First, it is applied to a synthesized example as proof of concept without real industrial application. It is difficult to find real industrial examples since the approach requires detailed data for ECs with their aggregated product data, which is usually classified information for manufacturers.
Second, there is no connection of analysis results for each EC between the first and second phases of the procedure. Even though the first phase is likely to affect the second one, the procedure does not consider the result of the first phase for the selected ECs during the second phase. Finally, the second phase leaves room for improvement. This study builds the actor networks from engineers' participation data in ECs. If there are messages or question/answer data between participating engineers during configuration control processes, a new analysis procedure can select engineers from a contextual analysis of the exchanged messages or questions/answers. As a further study, enhancement of the second phase is planned. If recommendations for experts or question-and-answer events between experts occur, these facts can be used to enhance the expert identification procedure in the second phase. The question-and-answer facts enable analysts to use other expert identification tools such as link analysis or text mining, and this can further improve the effectiveness of the analysis procedure.

Acknowledgements

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT and Future Planning (No. 2016R1A2B4006819).

References

[1] E. Vezzetti, M. Alemanni, J. Macheda, Supporting product development in the textile industry through the use of a product lifecycle management approach: a preliminary set of guidelines, Int. J. Adv. Manuf. Technol. 79 (9–12) (2015) 1493–1504.
[2] E. Vezzetti, M. Alemanni, B. Morelli, New product development (NPD) of ‘family business’ dealing in the luxury industry: evaluating maturity stage for implementing a PLM solution, Int. J. Fash. Des. Technol. Educ. 10 (2) (2017) 219–229.
[3] N. Do, Integration of engineering change objects in product data management databases to support engineering change analysis, Comput. Ind. 73 (2015) 69–81.
[4] W. Reinhardt, T. Varlemann, M. Moi, Artefact-Actor-Networks as tie between social networks and artefact networks, Proceedings of 5th International Conference on Collaborative Computing: Networking, Applications and Worksharing, Washington, DC, 2009.
[5] W. Reinhardt, T. Varlemann, M. Moi, A. Wilke, Modeling, obtaining and storing data from social media tools with Artefact-Actor-Networks, Proceedings of the 18th Intl. Workshop on Personalization and Recommendation on the Web and Beyond, Kassel, Germany, 2010.
[6] W. Reinhardt, T. Varlemann, M. Moi, H. Drachsler, P. Sloep, Mining and visualizing research networks using the Artefact-Actor-Network approach, Comput. Soc. Netw. 233–267 (2012).
[7] H.J. Lee, H.J. Ahn, J.W. Kim, S.J. Park, Capturing and reusing knowledge in engineering change management: a case of automobile development, Inf. Syst. Front. 8 (2006) 375–394.
[8] J. Han, M. Kamber, J. Pei, Data Mining, Concepts and Techniques, third edition, Morgan Kaufmann, 2012.
[9] T. Lappas, K. Liu, E. Terzi, A survey of algorithms and systems for expert location in social networks, Soc. Netw. Data Anal. 215–241 (2011).
[10] D. Yiman-Seid, A. Kobsa, Expert finding systems for organizations: problem and domain analysis and the DEMOIR approach, J. Organ. Comput. Electron. Commerce 13 (1) (2003) 1–24.
[11] X. Song, B.L. Tseng, C.Y. Lin, M.T. Sun, ExpertiseNet: relational and evolutionary expert modeling, Chapter User Modeling 2005, Vol. 3538 of the series Lecture Notes in Computer Science, 99–108 (2005).
[12] C.Y. Lin, L. Wu, Z. Wen, H. Tong, V. Griffiths-Fisher, L. Shi, D. Lubensky, Social network analysis in enterprise, Proc. IEEE 100 (9) (2012) 2759–2776.
[13] I.C. Wright, A review of research into engineering change management: implications for product design, Des. Stud. 18 (1) (1997) 33–42.
[14] T.A.W. Jarratt, C.M. Eckert, N.H.M. Caldwell, P.J. Clarkson, Engineering change: an overview and perspective on the literature, Res. Eng. Des. 22 (2) (2011) 103–124.
[15] C.J. Ho, J. Li, Progressive engineering changes in multi-level product structures, Omega 25 (5) (1997) 585–594.
[16] C.M. Eckert, P.J. Clarkson, W. Zanker, Change and customisation in complex engineering domains, Res. Eng. Des. 15 (1) (2004) 1–21.
[17] N. Do, I.J. Choi, M. Song, Propagation of engineering changes to multiple product data views using history of product structure changes, Int. J. Comp. Integr. Manuf. 21 (1) (2008) 19–32.
[18] M.Z. Ouertani, Supporting conflict management in collaborative design: an approach to assess engineering change impacts, Comput. Ind. 59 (9) (2008) 882–893.
[19] O.O. Ariyo, C.M. Eckert, P.J. Clarkson, Challenges in identifying the knock-on effects of engineering change, Int. J. Des. Eng. 2 (4) (2009) 414–431.
[20] K. Rouibah, K.R. Caskey, Change management in concurrent engineering from a parameter perspective, Comput. Ind. 50 (1) (2003) 15–34.
[21] P.J. Clarkson, C. Simons, C.M. Eckert, Predicting change propagation in complex design, J. Mech. Des. 126 (5) (2004) 788–797.
[22] T. Eger, C.M. Eckert, P.J. Clarkson, Engineering change analysis during ongoing product development, International Conference on Engineering Design ICED’07, Paris, France, 2007.
[23] G. Fei, J. Gao, O. Owodunni, X. Tang, A method for engineering design change analysis using system modelling and knowledge management techniques, Int. J. Comp. Integr. Manuf. 24 (6) (2011) 535–551.
[24] B. Hamraz, N.H.M. Caldwell, P.J. Clarkson, A matrix-calculation-based algorithm for numerical change propagation analysis, IEEE Trans. Eng. Manage. 60 (1) (2013) 186–198.
[25] V. Kocar, A. Akgunduz, ADVICE: a virtual environment for engineering change management, Comput. Ind. 61 (1) (2010) 15–28.
[26] T.L. Saaty, How to make a decision: the analytic hierarchy process, Eur. J. Oper. Res. 48 (1990) 9–26.
[27] N. Do, Application of OLAP to a PDM database for interactive performance evaluation of in-progress product development project, Comput. Ind. 65 (4) (2014) 636–645.
[28] N. Do, S. Bae, C. Park, Interactive analysis of product development experiments using on-line analytical mining, Comput. Ind. 66 (2015) 52–62.
[29] O. Herden, A design methodology for data warehouse, in: A. Hinze, B.J. Hommes (Eds.), Proceedings of Doctoral Workshop at CAiSE’00, Stockholm, 2000.
[30] Defense Acquisition University, Systems Engineering Fundamentals, Defense Acquisition University Press, Virginia, 2001.
[31] J. Golebiowska, R. Dieng-Kuntz, O. Corby, D. Mousseau, Building and exploiting ontologies for an automobile project memory, Proceedings of K-CAP’01, Victoria, 2001.
[32] J. Scott, Social Network Analysis a Handbook, Sage Publications, London, 1987.
[33] M.H. Huh, Introduction to Social Network Analysis Using R, Jayu Academy, Paju, 2014.
[34] Team Engineering Environment Product Data Management System, http://tee.gnu.ac.kr.
[35] MySQL Database Management System, http://mysql.com.
[36] Microsoft Excel, http://www.microsoft.com/office.
[37] R project, https://www.r-project.org.
[38] C.T. Butts, Social network analysis with sna, J. Stat. Softw. 24 (6) (2008) 1–51.
[39] V. Mayer-Schonberger, K. Cukier, Big Data: A Revolution That Will Transform How We Live, Work, and Think, Houghton Mifflin Harcourt, London, 2013.