Available online at www.sciencedirect.com
ScienceDirect
Procedia CIRP 83 (2019) 132–138
www.elsevier.com/locate/procedia
11th CIRP Conference on Industrial Product-Service Systems
Collaborative Optimization of Service Scheduling for Industrial Cloud Robotics Based on Knowledge Sharing
Hang Du1,2, Wenjun Xu1,2, Bitao Yao3,2,*, Zude Zhou1, Yang Hu4
1 School of Information Engineering, Wuhan University of Technology, Wuhan 430070, China
2 Hubei Key Laboratory of Broadband Wireless Communication and Sensor Networks, Wuhan University of Technology, Wuhan 430070, China
3 School of Mechanical and Electronic Engineering, Wuhan University of Technology, Wuhan 430070, China
4 China Ship Development and Design Center, Wuhan 430064, China
* Corresponding author. Tel.: +86-13971624993. E-mail address: [email protected]
Abstract
Industrial Cloud Robotics (ICR), which has the characteristics of resource sharing, convenient access and high efficiency, is the combination of Cloud Computing and Industrial Robots. In the current manufacturing workshop, most industrial robots are not connected to each other and use onboard processors and memories with limited resources, which constrains multi-robot information sharing. However, knowledge sharing is very important for multi-robot collaborative optimization. In the service scheduling optimization of industrial robots oriented to workshop manufacturing tasks, the lack of knowledge sharing seriously restricts further performance improvement of the collaborative optimization. To address this problem, a collaborative optimization framework of service scheduling for industrial cloud robotics is built, and then a cloud-based knowledge sharing mechanism for industrial robots and a collaborative optimization method of service scheduling based on Deep Reinforcement Learning (DRL) are proposed, so as to realize a comprehensive performance improvement of the whole manufacturing system. Finally, a case study is implemented to verify the effectiveness of the proposed method.
© 2019 The Authors. Published by Elsevier B.V.
Peer-review under responsibility of the scientific committee of the 11th CIRP Conference on Industrial Product-Service Systems.
Keywords: Industrial Cloud Robotics; knowledge sharing; manufacturing service scheduling; Deep Reinforcement Learning
2212-8271 © 2019 The Authors. Published by Elsevier B.V. doi: 10.1016/j.procir.2019.03.142

1. Introduction

With the rapid development of sensor technology, control technology and artificial intelligence, Industrial Robots (IRs) have the characteristics of high repetition accuracy, adaptability and flexibility under harsh working conditions [1], and have been widely used in various manufacturing industries, such as automobile, ship and aerospace. However, most industrial robots rely on onboard processors and memories, and they face enormous computational and storage issues in the context of intelligent manufacturing, especially in the service scheduling optimization of industrial robots oriented to workshop manufacturing tasks [2]. Therefore, a single robot cannot complete tasks efficiently using only its own knowledge, such as control strategy and sensory information. The lack of knowledge sharing seriously restricts further performance improvement of the collaborative optimization.

Cloud Robotics (CR) is a concept proposed by Professor James Kuffner at Carnegie Mellon University in 2010 [3]. There are some applications of CR. The RoboEarth [4] project is a World Wide Web for robots, which provides a giant network database repository where robots can share information and learn from each other about their behaviors and environments. RoboBrain [5] is a large-scale robotic knowledge base which learns from publicly available network resources, computer simulations and real-life robot trials. The great advantages and potential of CR have attracted significant attention from academia and the manufacturing industry, and the concept of "Industrial Cloud Robotics (ICR)" has emerged. ICR, with the characteristics of resource sharing, convenient access and high efficiency, is the combination of Cloud Computing and Industrial Robots [6], which can encapsulate related resources of IRs into services to support wide-area publication, retrieval, query and subscription. Compared with traditional IRs that are independent of each other, ICR has a great advantage in learning capability and sharing capability. ICR can interact and share knowledge across a wide area to complete manufacturing tasks better.
Fig.1 Collaborative optimization framework of service scheduling for industrial cloud robotics.
Knowledge sharing is of great significance for the scheduling optimization of multiple industrial robots. In the field of industrial robot scheduling, much research has been done on energy-efficient optimization, multi-level collaborative optimization and multi-robot scheduling. The goal of energy-efficient optimization is to reduce the energy consumption of the equipment while ensuring the production performance of the manufacturing system [1]. For the multi-level hierarchical framework of robotic manufacturing systems, multi-level collaborative optimization models, mechanisms and corresponding optimization methods were proposed in [7, 8]. Xu et al. presented a multi-objective joint model of energy consumption and production efficiency, together with an enhanced Pareto-based algorithm for the multi-objective dynamic scheduling optimization of manufacturing services in the workshop [9]. Tao et al. proposed an improved hybrid genetic algorithm based on a case library and Pareto solution to solve the service scheduling problem [10]. Bello et al. presented a framework to tackle combinatorial optimization problems using neural networks and reinforcement learning [11]. However, in multi-robot service scheduling optimization, most research targets independent systems that lack the necessary information interaction and knowledge sharing between robots, which makes it difficult to achieve optimal manufacturing efficiency. Making robots share knowledge with each other requires methods to effectively encode, exchange and reuse knowledge [12]. Wang et al. developed a robot social network called Numbots, which assists knowledge sharing and skill transfer among robots [13]. Moritz Tenorth discussed the representation and exchange of knowledge about actions, objects and environments in the RoboEarth framework, and developed a formal language which was used on a robot platform [14].

This paper aims to realize knowledge sharing among industrial robots over a wide area based on the knowledge sharing mechanism, so as to achieve a comprehensive, energy-efficient collaborative optimization of the whole industrial cloud robotic manufacturing system.

2. Collaborative Optimization Framework of Service Scheduling for ICR

As shown in Fig. 1, the proposed collaborative optimization framework of service scheduling for ICR has three layers [15]: the robot equipment layer, the local center server layer and the cloud layer.

The robot equipment layer is the lowest layer and is the main research object of collaborative optimization of service scheduling. Industrial robots can be used in assembly, processing, welding, spraying, handling and so on.

The local center server layer (local cloud layer) is the middle layer of the collaborative optimization framework. On the one hand, the local center server layer can integrate and manage the related resources of robots in the local area to a certain extent, and realize knowledge interaction and sharing among robots in the local area. On the other hand, the local center server layer acts as an interface between the cloud layer and the robot equipment layer and is responsible for data filtering and pre-processing.

The cloud layer is the top layer of the collaborative optimization framework of service scheduling. First, the cloud, as a resource service center, is an intermediary for knowledge sharing among robots. Second, the cloud is the service coordinator and supervisor for the whole manufacturing system. During the service execution phase, the cloud service management center can monitor and ensure
that the services are properly applied to the manufacturing process.

In the collaborative optimization framework of service scheduling, IRs upload their own resources (such as control strategy and perception information) to the cloud platform, which integrates the various resources and encapsulates them into services for real-time resource scheduling and management. IRs take the cloud as an intermediary to realize knowledge interaction and sharing with each other. When a service user initiates a cloud task request through the cloud platform, the task is decomposed into atomic manufacturing tasks. Then, these tasks are matched with the services encapsulated from robots' capabilities through logical language parsing, mapping relationships and rule-based reasoning. Finally, the cloud service management center generates a collaborative optimization solution of service scheduling for industrial robots based on the knowledge, the constraints and the optimization objectives.

The collaborative optimization framework of service scheduling for ICR can upgrade traditional robotic manufacturing systems to industrial cloud robotic manufacturing systems (ICRMSs) by enhancing the data interaction and knowledge sharing among robots, and can support more efficient collaborative optimization of service scheduling for multiple industrial robots based on knowledge sharing. IRs can share information and learn from each other about their behaviors and environments based on the network architecture, database repository and other components carried by the cloud. At the same time, the cloud can provide more powerful computing and data (or knowledge) management capabilities.

3. Cloud-Based Knowledge Sharing Mechanism for ICR

In the cloud environment, knowledge sharing among IRs includes three stages: knowledge coding, knowledge exchange and knowledge reuse, as shown in Fig. 2.
Fig. 2 Flow diagram of knowledge sharing mechanism.

3.1. Knowledge Coding

The first step of knowledge sharing is to encode the knowledge related to the production process of ICR in the whole manufacturing system. Web Ontology Language (OWL) can be used to encode task-oriented knowledge into a knowledge ontology, which encapsulates the knowledge applied by robots to complete specific tasks into web services. At the same time, a distributed knowledge base is established to store and manage knowledge services, and these services are exposed as SOAP services to support publication, retrieval, query and subscription.

Task-oriented knowledge (TOK) can express information about the robots involved in a particular manufacturing task and their capabilities, processes, results, etc. by using a standardized conceptual model. It can be expressed as:

$$TOK = \{TaskInfo, RobotInfo, ProcessInfo, ResultInfo\} \tag{1}$$

where $TaskInfo$ represents the task-related information involved in completing a specific task requirement, and it can be expressed as:

$$TaskInfo = \{TaskID, TaskName, SubTaskID, SubTaskName\} \tag{2}$$

where $TaskID$ and $TaskName$ respectively represent the unique identifier and the name description of the manufacturing task; $SubTaskID$ and $SubTaskName$ respectively represent the identifier and the name description of the corresponding subtask.

$RobotInfo$ represents the information about a robot that is required to complete a specific task, and it can be expressed as:

$$RobotInfo = \{RobotID, RobotName, SkillSet\} \tag{3}$$

where $RobotID$ and $RobotName$ respectively represent the unique identifier and the name of each robot that completes a specific manufacturing task; $SkillSet$ represents the robot capability set, and it can be expressed as:

$$SkillSet = \{SkillID, SkillName\} \tag{4}$$

where $SkillID$ and $SkillName$ respectively represent the unique identifier and the name description of a certain capability.

$ProcessInfo$ represents the method and process of each robot task execution involved in completing a specific task requirement, and it can be expressed as:

$$ProcessInfo = \{ProcessID, TaskName, RobotName, SkillName\} \tag{5}$$

where $ProcessID$ represents the unique identifier of each process flow; $TaskName$, $RobotName$ and $SkillName$ represent the subtask name, robot name and capability name corresponding to a process flow.

$ResultInfo$ represents the results of each robot task execution related to the completion of a specific task requirement, including the cost, quality, energy consumption and time of each subtask execution. It can be defined as:

$$ResultInfo = \{ProcessID, Cost, Quality, Energy, Time\} \tag{6}$$

where $ProcessID$ represents the unique identifier of each process flow; $Cost$, $Quality$, $Energy$ and $Time$ respectively represent the cost, quality, energy consumption and time. $Quality$ is the collection of availability, reliability and so on.
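The conceptual model of Eqs. (1)-(6) can be illustrated with a small data-structure sketch. This is only one possible in-memory representation, not the OWL ontology used by the authors: the class layout, the `candidate_results` helper, the use of a single float for Quality, and the sample values are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Skill:                       # one element of SkillSet, Eq. (4)
    skill_id: str
    skill_name: str


@dataclass
class RobotInfo:                   # Eq. (3)
    robot_id: str
    robot_name: str
    skill_set: List[Skill] = field(default_factory=list)


@dataclass
class TaskInfo:                    # Eq. (2)
    task_id: str
    task_name: str
    sub_task_id: str
    sub_task_name: str


@dataclass
class ProcessInfo:                 # Eq. (5); task_name holds the subtask name
    process_id: str
    task_name: str
    robot_name: str
    skill_name: str


@dataclass
class ResultInfo:                  # Eq. (6); quality collapsed to one number (assumption)
    process_id: str
    cost: float
    quality: float
    energy: float
    time: float


@dataclass
class TOK:                         # Eq. (1)
    task_info: List[TaskInfo]
    robot_info: List[RobotInfo]
    process_info: List[ProcessInfo]
    result_info: List[ResultInfo]

    def candidate_results(self, sub_task_name: str) -> List[ResultInfo]:
        """Hypothetical helper: results of all process flows that serve a subtask."""
        ids = {p.process_id for p in self.process_info if p.task_name == sub_task_name}
        return [r for r in self.result_info if r.process_id in ids]


# Illustrative instance with made-up values.
tok = TOK(
    task_info=[TaskInfo("T1", "vehicle body assembly", "S1", "wheel assembly")],
    robot_info=[
        RobotInfo("R1", "robot-A", [Skill("K1", "wheel mounting")]),
        RobotInfo("R2", "robot-B", [Skill("K1", "wheel mounting")]),
    ],
    process_info=[
        ProcessInfo("P1", "wheel assembly", "robot-A", "wheel mounting"),
        ProcessInfo("P2", "wheel assembly", "robot-B", "wheel mounting"),
        ProcessInfo("P3", "engine assembly", "robot-A", "engine mounting"),
    ],
    result_info=[
        ResultInfo("P1", cost=3.0, quality=0.9, energy=12.0, time=40.0),
        ResultInfo("P2", cost=2.5, quality=0.8, energy=15.0, time=35.0),
        ResultInfo("P3", cost=6.0, quality=0.95, energy=30.0, time=90.0),
    ],
)
wheel_candidates = tok.candidate_results("wheel assembly")
```

Linking ResultInfo to ProcessInfo through ProcessID, as in Eqs. (5) and (6), is what lets a subtask name be resolved to a set of candidate results for scheduling.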
3.2. Knowledge Exchange
In the stage of knowledge exchange, the knowledge exchange among robots is actually the synchronous updating of knowledge between the local knowledge base (LKB) and the cloud knowledge base (CKB). The knowledge exchange process is shown in Fig. 3.
In this paper, the process of uplink knowledge exchange between the robot and the cloud is called R2C (Robot to Cloud) interactive sharing, that is, the knowledge synchronization mechanism from local knowledge base to cloud knowledge base. The process of downlink knowledge exchange between cloud and robot is called C2R (Cloud to Robot) interactive sharing, that is, the knowledge synchronization mechanism of cloud knowledge base to local knowledge base. The R2R (Robot to Robot) mechanism relies on the R2C and C2R mechanisms to form a closed loop. The principles of the R2C and C2R exchange mechanisms are given below.
Fig. 3 Process of knowledge exchange.

(1) R2C mechanism: Assuming that the synchronous update interval is constant and the initial time is $t_0$, the LKB and CKB are matched at $t_0$ by timestamp management. When the timestamp of $[k_i, t_i]_{\mathrm{LKB}}$ is later than that of $[k_i, t_i]_{\mathrm{CKB}}$, the knowledge in the LKB is updated to the CKB, namely $[k_i, t_i]_{\mathrm{CKB}} \leftarrow [k_i, t_i]_{\mathrm{LKB}}$, where $[k_i, t_i]_{\mathrm{CKB}}$ is the i-th changed knowledge in the CKB and $[k_i, t_i]_{\mathrm{LKB}}$ is the i-th changed knowledge in the LKB.

(2) C2R mechanism: When the knowledge in the LKB is insufficient to support the robot in completing the corresponding task requirements, it is necessary to check whether there is updated knowledge in the CKB that can meet the corresponding tasks. If $Composition([k_i, t_i])$ satisfies $Task$, that is, the knowledge $[k_i, t_i]$ in the LKB and CKB together is sufficient to complete the task, the corresponding knowledge in the CKB should be updated to the LKB. The robot then uses the newly updated knowledge to complete the corresponding task.

3.3. Knowledge Reuse

The key to the collaborative optimization of service scheduling for ICR is to reuse the task-oriented knowledge acquired in the knowledge coding and knowledge exchange stages. As shown in Fig. 4, when a new manufacturing task arrives, the SPARQL [16] language is used to query the task information from the cloud knowledge base. After semantic analysis, the input task is decomposed into atomic tasks. When the task does not exist in the cloud knowledge base, it will be decomposed into atomic tasks through logical language parsing, mapping relationships and rule-based reasoning.

Fig. 4 Process of knowledge reuse.
Then, according to the task requirements, the robot information $RobotInfo$ and process information $ProcessInfo$ for the manufacturing task are queried, and the capability matching based on the knowledge semantic information is completed. Dynamic Description Logic (DDL) [17] is a dynamic extension of description logic that combines static domain knowledge with dynamic domain knowledge. The capability matching based on knowledge and semantic information can be accomplished by using DDL. Finally, the result information $ResultInfo$ in the knowledge base is obtained as the candidate service set for the collaborative optimization of service scheduling, so that the optimal scheduling scheme of the service can be found by the optimization algorithm and the knowledge is reused.

4. Collaborative Optimization Method of Service Scheduling based on DRL

In the service-oriented manufacturing model, the manufacturing capability of industrial robots becomes a service after virtualization and servitization, and connects with the manufacturing system of ICR in the form of a service. The collaborative optimization of service scheduling for ICR is of great significance for the efficiency of the manufacturing workshop. The traditional service scheduling method combines the services within the local area, while service scheduling based on knowledge sharing covers a wider range of services.

Deep Q-Network (DQN) [18] is a deep reinforcement learning algorithm developed from Q learning. Q learning is only applicable to small-scale discrete space problems, while DQN can solve large-scale or continuous space problems. Moreover, DQN has been successfully applied to combinatorial optimization, so we use the DQN algorithm to realize the collaborative optimization of service scheduling for ICR based on knowledge sharing.

Q learning is formulated on the Markov Decision Process (MDP). The basic mathematical form of Q learning is shown in Eqs. (7) and (8).
$$Q^{*}(s_t, a_t) = r(s_t, a_t) + \gamma \sum_{s_{t+1} \in S} p(s_t, a_t, s_{t+1}) \max_{a \in A} Q^{*}(s_{t+1}, a) \tag{7}$$

$$Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha \left[ r_t + \gamma \max_{a \in A} Q(s_{t+1}, a) - Q(s_t, a_t) \right] \tag{8}$$

In Eq. (7), $A$ is the action set, $S$ is the state set, $\gamma$ is the discount factor, and $Q^{*}(s_t, a_t)$ represents the optimal discounted sum of rewards obtained by the agent taking the action $a_t$ in the state $s_t$. From this formula, the optimal strategy is to select the action with the largest Q value in the state $s_t$. Eq. (8) is the Q-value update formula, where $\alpha$ is the learning rate: the larger $\alpha$ is, the faster the convergence, but an excessive $\alpha$ may prevent convergence.

Q learning treats the state of the system as a finite set and uses a Q table to store and iteratively compute the state-action value function, so it cannot handle large-scale or continuous state spaces. Therefore, in DQN a deep neural network is used as a function approximator to estimate the action-value function, i.e.:

$$Q^{*}(s, a) \approx Q(s, a; \theta) \tag{9}$$

where $\theta$ is the parameter of the deep neural network and $Q(s, a; \theta)$ is the output of the deep neural network for the input state $s$ and action $a$.

DQN adopts an experience replay mechanism during the training process. A small batch of transition samples is randomly drawn from the replay memory $D$ at each training step, and the network parameters are updated using a stochastic gradient descent (SGD) algorithm. In addition, the improved DQN consists of two networks, the current value network $Q(s, a; \theta_i)$ and the target value network $\hat{Q}(s, a; \theta_i^{-})$. The two network structures are identical, but the weight parameters are different. $Q(s, a; \theta_i)$ is used to evaluate the current state-action pair, and $r + \gamma \max_{a'} \hat{Q}(s', a'; \theta_i^{-})$ is used to represent the target Q value. The network parameters are updated by minimizing the mean square error between the current Q value and the target Q value. The error function [19] can be expressed as:

$$L(\theta_i) = \mathbb{E}_{s, a, r, s'} \left[ \left( r + \gamma \max_{a'} \hat{Q}(s', a'; \theta_i^{-}) - Q(s, a; \theta_i) \right)^{2} \right] \tag{10}$$
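As a minimal sketch of the tabular update in Eq. (8) and the loss in Eq. (10), the following uses plain Python with callables standing in for the current and target value networks. The function names, the dictionary-based Q table, and the learning rate value are assumptions for illustration; the discount factor default of 0.5 follows Table 1.

```python
from typing import Callable, Dict, Sequence, Tuple

Transition = Tuple[int, int, float, int]   # (s, a, r, s_next)


def q_learning_update(q: Dict[Tuple[int, int], float], s: int, a: int, r: float,
                      s_next: int, actions: Sequence[int],
                      alpha: float = 0.1, gamma: float = 0.5) -> float:
    """One application of Eq. (8) on a Q table; unseen entries count as 0."""
    best_next = max(q.get((s_next, b), 0.0) for b in actions)
    td_error = r + gamma * best_next - q.get((s, a), 0.0)
    q[(s, a)] = q.get((s, a), 0.0) + alpha * td_error
    return q[(s, a)]


def dqn_loss(batch: Sequence[Transition],
             q_current: Callable[[int, int], float],
             q_target: Callable[[int, int], float],
             actions: Sequence[int], gamma: float = 0.5) -> float:
    """Minibatch estimate of Eq. (10): mean squared gap between the target
    r + gamma * max_a' Q_hat(s', a') and the current estimate Q(s, a)."""
    total = 0.0
    for s, a, r, s_next in batch:
        y = r + gamma * max(q_target(s_next, b) for b in actions)
        total += (y - q_current(s, a)) ** 2
    return total / len(batch)


# Usage: two updates of the same state-action pair, then one loss evaluation.
table: Dict[Tuple[int, int], float] = {}
first = q_learning_update(table, s=0, a=1, r=1.0, s_next=1, actions=[0, 1], alpha=0.5)
second = q_learning_update(table, s=0, a=1, r=1.0, s_next=1, actions=[0, 1], alpha=0.5)
loss = dqn_loss([(0, 0, 1.0, 1)], lambda s, a: 0.0, lambda s, a: 1.0, actions=[0, 1])
```

Keeping the target network as a separate callable mirrors the role of $\hat{Q}$: the target is computed from weights that stay fixed while the current network is being updated.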
In this paper, with production efficiency and energy consumption as the goals, the task requirements and the candidate service set obtained through the knowledge reuse stage are used as the input of the DQN algorithm. The core problem of the DQN algorithm is to model the Markov Decision Process of collaborative optimization of service scheduling for ICR. An action is defined as an industrial robot selecting a certain service from the candidate service set corresponding to each subtask at each process, so as to change its task status. The task state variables of an industrial cloud robot include task status, quality of service and energy consumption, i.e.:

$$S = \{TaskState, WQoS, Energy\} \tag{11}$$

where $TaskState$ is a preset task state, $WQoS$ is the weighted quality of service and $Energy$ is the energy consumption of the service. The DQN system guides the training based on the reward value, and the design of the reward function matches the learning objectives. The goal of collaborative optimization of
service scheduling for ICR is to reduce energy consumption without degrading service quality. Therefore, the reward function is designed as follows.

(1) When the training system is in a given process task and the task is not completed, the reward function is as shown in Eq. (12).

$$r = \begin{cases} -1 & \text{if } En > En_{max} \text{ or } WQoS > WQoS_{max} \\ 0 & \text{if } En \le En_{max} \text{ and } WQoS \le WQoS_{max} \end{cases} \tag{12}$$

(2) When the training system completes the task, the reward function is as shown in Eq. (13).

$$r = \begin{cases} -1 & \text{if } En > En_{max} \text{ or } WQoS > WQoS_{max} \\ \dfrac{EnWQoS_{max} - EnWQoS}{EnWQoS_{max}} & \text{if } En \le En_{max} \text{ and } WQoS \le WQoS_{max} \end{cases} \tag{13}$$

Here $WQoS$ and $En$ represent the total service quality and total energy consumption of the completed task, $EnWQoS$ is the weighted sum of $WQoS$ and $En$, and $En_{max}$ and $WQoS_{max}$ are the maximum values of total energy consumption and total service quality. Therefore, the meaning of the reward function is that the industrial robot performs the specified tasks within the service quality limit $WQoS_{max}$ and the energy consumption limit $En_{max}$, and the total energy consumption is as small as possible. The process of the service scheduling collaborative optimization method based on DQN is shown in Fig. 5.
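The two-case reward described above can be sketched as a single function. Several details are assumptions rather than facts from the paper: the penalty value of -1 and the "greater than" direction of the limit checks are inferred from the surrounding description, the weights `w_en` and `w_qos` of the weighted sum are hypothetical, and $EnWQoS_{max}$ is taken to be the weighted sum evaluated at the two limits.

```python
def weighted_objective(en: float, wqos: float,
                       w_en: float = 0.5, w_qos: float = 0.5) -> float:
    """EnWQoS: weighted sum of energy and weighted QoS (weights are hypothetical)."""
    return w_en * en + w_qos * wqos


def reward(en: float, wqos: float, en_max: float, wqos_max: float, done: bool,
           w_en: float = 0.5, w_qos: float = 0.5) -> float:
    """Reward for one step of service selection.

    Assumed reading: exceeding either limit is penalised with -1; an unfinished
    task within limits earns 0; a finished task within limits earns the relative
    saving of EnWQoS against its value at the limits.
    """
    if en > en_max or wqos > wqos_max:
        return -1.0
    if not done:
        return 0.0
    limit = weighted_objective(en_max, wqos_max, w_en, w_qos)
    return (limit - weighted_objective(en, wqos, w_en, w_qos)) / limit


# Usage with the WQoS limit from Table 1 and an illustrative energy limit of 10.
r_running = reward(en=5.0, wqos=1.0, en_max=10.0, wqos_max=2.94, done=False)
r_violate = reward(en=11.0, wqos=1.0, en_max=10.0, wqos_max=2.94, done=False)
r_finished = reward(en=5.0, wqos=1.47, en_max=10.0, wqos_max=2.94, done=True)
```

Under this reading the terminal reward lies in [0, 1] whenever both limits are respected, so schedules with lower combined energy and WQoS cost are ranked higher.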
Initialization: replay memory $D$ with capacity $N$, action-value function $Q$ with random weights $\theta$, and target action-value function $\hat{Q}$ with weights $\theta^{-} = \theta$.
Repeat:
    Initialize the manufacturing task candidate service set.
    For each step $t$ of the service combination
        (a) Produce a random number $rand$ uniformly distributed on (0, 1). If $rand$ is less than the probability $\varepsilon$, the action $a_t$ is selected randomly; otherwise $a_t = \arg\max_a Q(s_t, a; \theta)$.
        (b) Execute action $a_t$ in the emulator, observe reward $r_t$, reach the next state $s_{t+1}$, and store the transition $\{s_t, a_t, r_t, s_{t+1}\}$ in $D$.
        (c) Sample a random minibatch of transitions $\hat{D}$ from $D$, where $\hat{D} = \{s_i, a_i, r_i, s_{i+1}\}_{i=1}^{n}$.
        (d) For $i = 1, \dots, n$ do
            (i) Define the virtual label for each sample as
                \[
                y_i = \begin{cases}
                r_i + \gamma \max_a \hat{Q}(s_{i+1}, a; \theta^{-}) & \text{if } Ter(i) = 1 \\
                r_i & \text{if } Ter(i) = 0
                \end{cases}
                \]
                where $Ter(i) = 0$ if the task is completed in step $i+1$, and $Ter(i) = 1$ otherwise.
            (ii) Update the parameters $\theta$ of the current value network by stochastic gradient descent according to Eq. (10).
        End For
        (e) Every $C$ steps reset $\hat{Q} = Q$.
    End For
Until the number of iterations reaches the upper limit.

Fig. 5 Process of service scheduling collaborative optimization method based on DQN.
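Steps (a), (c) and (d)(i) of the procedure in Fig. 5 can be sketched in plain Python as follows. This is a hedged illustration only: the neural networks are replaced by lists of Q-values, and all function names (`make_replay_memory`, `select_action`, `sample_minibatch`, `virtual_label`) are hypothetical and do not come from the paper or from TensorFlow.

```python
import random
from collections import deque


def make_replay_memory(capacity):
    """Replay memory D with capacity N; the oldest transitions are dropped first."""
    return deque(maxlen=capacity)


def select_action(q_values, epsilon, rng=random):
    """Step (a): with probability epsilon pick a random action, else the greedy one.

    q_values is a list holding Q(s_t, a) for every candidate action a.
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def sample_minibatch(memory, n, rng=random):
    """Step (c): uniform sample of at most n stored transitions."""
    transitions = list(memory)
    return rng.sample(transitions, min(n, len(transitions)))


def virtual_label(reward, next_q_target, task_completed, gamma):
    """Step (d)(i): target value y_i for one sampled transition.

    next_q_target holds the target-network values for every action in s_{i+1}.
    task_completed=True corresponds to Ter(i) = 0 in Fig. 5.
    """
    if task_completed:
        return reward
    return reward + gamma * max(next_q_target)
```

For example, with the discount factor 0.5 from Table 1, a non-terminal transition with reward 0 and target values `[0.2, 0.4]` yields a label of 0.2, while a terminal transition simply keeps its own reward.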
5. Implementation and Case Study
In order to verify the effectiveness of the proposed framework, theories and approaches for ICRMSs, a simple vehicle body assembly case in industrial robotic assembly is illustrated in this section. As shown in Fig. 6, this assembly line has 7 process steps: the frame assembly, the wheel assembly, the engine assembly, the instrument assembly, the brake assembly, the components assembly and the vehicle assembly. This means that there are 7 types of industrial robot manufacturing capability services at the manufacturing cell level.

Fig. 6 A simple vehicle body assembly case.

The DQN algorithm proposed above is used in this case. The main algorithm parameters of this simulation experiment are shown in Table 1, and the simulation is performed with TensorFlow in PyCharm.

Table 1. The main algorithm parameters.
parameters                                    value
reward function parameter $WQoS_{max}$        2.94
reward function parameter $En_{max}$          3.10
empirical cache pool $D$ size $N$             3000
minibatch size $n$                            300
discount factor                               0.5
random action selection probability           0.1
learning rate                                 0.001
target network update interval $C$            600

Here, $WQoS_{max}$ and $En_{max}$ have been normalized by Eqs. (14) and (15).

\[
x' = \begin{cases}
\dfrac{x_{max} - x}{x_{max} - x_{min}} & \text{if } x_{max} - x_{min} \ne 0 \\
1 & \text{if } x_{max} - x_{min} = 0
\end{cases}
\tag{14}
\]

\[
x' = \begin{cases}
\dfrac{x - x_{min}}{x_{max} - x_{min}} & \text{if } x_{max} - x_{min} \ne 0 \\
1 & \text{if } x_{max} - x_{min} = 0
\end{cases}
\tag{15}
\]

where $x$ is a certain performance indicator of a service, $x_{max}$ and $x_{min}$ are respectively the maximum and minimum values of that performance indicator in the corresponding candidate service set, and $x'$ is the value after normalization.

Figure 7 shows the changes of the loss of the neural network during the training of the DQN algorithm. As the training progressed, the loss gradually decreased and eventually approached a stable value.

Fig. 7 The changes of the Loss during the training.

Figure 8 shows the changes of the $EnWQoS$ during the training of the DQN algorithm. As the training progressed, the overall $EnWQoS$ showed a downward trend. After 1600 rounds of training, the overall objective tends to be stable. It can be seen that the DQN algorithm can optimize both quality of service and energy consumption in the service scheduling of ICR. Thus, the collaborative optimization method of service scheduling based on DQN is effective.

Fig. 8 The changes of the EnWQoS during the training.
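The min-max normalization of Eqs. (14) and (15) can be illustrated with the short Python sketch below. The function names and the sample indicator values are hypothetical; the paper does not state which indicators use which formula, so the sketch leaves that choice to the caller.

```python
def normalize_eq14(x, x_min, x_max):
    """Eq. (14): larger raw values map to smaller normalized values."""
    span = x_max - x_min
    return (x_max - x) / span if span != 0 else 1.0


def normalize_eq15(x, x_min, x_max):
    """Eq. (15): larger raw values map to larger normalized values."""
    span = x_max - x_min
    return (x - x_min) / span if span != 0 else 1.0


def normalize_candidates(values, formula):
    """Normalize one indicator over a candidate service set.

    values  : raw indicator values, one per candidate service
    formula : normalize_eq14 or normalize_eq15
    """
    x_min, x_max = min(values), max(values)
    return [formula(v, x_min, x_max) for v in values]


# Hypothetical raw indicator values of three candidate services.
raw = [2.0, 4.0, 3.0]
print(normalize_candidates(raw, normalize_eq15))  # [0.0, 1.0, 0.5]
print(normalize_candidates(raw, normalize_eq14))  # [1.0, 0.0, 0.5]
```

When all candidates share the same value, the range is zero and both formulas return 1, which avoids a division by zero.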
The knowledge formed by the service combination scheme for completing the automobile body assembly task is encoded by the method of Section 3, as shown in Fig. 9, and is updated to the cloud knowledge base during the knowledge exchange for wide-area use.

Fig. 9 Knowledge of vehicle body assembly task.

6. Conclusion

In this paper, to address the lack of knowledge sharing among robots in the service scheduling process, a collaborative optimization framework of service scheduling for industrial cloud robotics is built, and an industrial robot knowledge sharing mechanism in the cloud environment and a service scheduling collaborative optimization method based on deep reinforcement learning are proposed. The case study demonstrates that the collaborative optimization method of service scheduling based on DQN is effective. Follow-up work will use an industrial robot production line to build a prototype of the industrial cloud robot service platform based on knowledge sharing for the workshop environment, in order to guide the operation of the actual workshop.

Acknowledgements

This research is supported by National Natural Science Foundation of China (Grant No. 51775399) and DITDP (Grant No. JCKY2016207C038).