European Journal of Operational Research 140 (2002) 590–605 www.elsevier.com/locate/dsw
Discrete Optimization
A reduction technique for weighted grouping problems

Timo Knuutila *, Olli Nevalainen

Department of Mathematical Sciences and Turku Centre for Computer Science (TUCS), University of Turku, Lemminkäisenkatu 14A, FIN-20014 Turku, Finland

Received 18 January 2000; accepted 3 June 2001

* Corresponding author. Tel.: +358-2-3338635; fax: +358-2-3338600. E-mail address: [email protected].fi (T. Knuutila).
Abstract

Weighted grouping problems are shown to have an equivalent reduced form, which is often considerably smaller than the original problem. Although the reduction may be small for randomly generated problems, real-life problems often contain non-random properties that greatly increase the effect of reduction. We give an efficient algorithm to build the reduced problem instance, and analyse the expected amount of reduction for certain statistical distributions and real-life data. In addition, we briefly discuss the effect of reduction on traditional solving methods of the grouping problem. The results show clearly the usefulness of problem reduction: it is computationally cheap to apply and may make the reduced problem solvable in a practical time whilst the original one is not. The method is readily applicable to the job grouping problem of the printed circuit board (PCB) printing industry. © 2002 Elsevier Science B.V. All rights reserved.

Keywords: Flexible manufacturing systems; Combinatorial optimization; Job grouping; Electronics assembly
1. Introduction

Let us consider the automated assembly of electronic components on printed circuit boards (PCBs) by means of a flexible printing robot. It is a common feature of these machines that they accept one or more partially processed PCBs and install a set of electric components of varying sizes and shapes on predefined sites on the board. The input of the components is via a feeder unit, the capacity of which is a limiting parameter of the robot. Extremely high operation speed and accuracy are necessities for the robots to be useful. These two properties are achieved by high technology only, which makes the robots expensive. In addition, their proper operation should be monitored by skilled workers. The number of assembly robots should therefore be kept low, which makes production planning a critical task of the assembly plant.

Among the many aspects of PCB assembly planning [4] we are interested in the possibilities of decreasing the overhead caused by setup operations of single assembly robots by the job grouping technique [17,18]. The key observation here is that the number of setup occasions (i.e. the times of
performing the setup operations, or the number of job groups) can be minimized if the same feeder setup serves several different PCB batches. This means that the feeder unit is filled with such a selection of component types that all PCB types of the job group can be manufactured without an intermediate setup. This saves the fixed setup starting costs and facilitates the manual work of the engineer. A common way of stating the job grouping problem is to determine the feeder setups in such a way that the number of setup occasions is minimal.

The job grouping problem can be formulated as a 0/1-optimization problem and it can be shown to be NP-hard [3]. Exact solution of small scale problem instances is, however, possible by efficient 0/1-problem solvers (like CPLEX) [13]. Constraint programming is another way of finding the exact solution of the problem. An advantage of the approach is its flexibility to fine-tune the problem statement and, at the same time, the solution process. Surprisingly enough, the constraint programming solution is also capable of solving real-life cases to optimality [13].

Several efficient heuristics have been developed to solve the job grouping problem for larger problem instances. Leon and Peters [14] use hierarchical clustering based on Jaccard's similarity coefficient. In addition to a small number of job groups the algorithm determines an advantageous component-to-feeder assignment. Shtub and Maimon [16] use Jaccard's coefficients and also consider the different setup times of various components. Bhaskar and Narendran [1] base the grouping on a different coefficient (cosine similarity) and manipulation of spanning trees. In their model a board may be processed in several different groups. Smed et al. [17] give several variants of using local search operators with similarity measures.

A different view of the setup problem is the minimization of the total setup operations by sequencing the jobs (see [7,9] for some recent methods). Finally, the job grouping problem is in many cases a part of a more complex multiobjective problem in which there are several conflicting and imprecise goals. Fuzzy set theory has turned out to be useful in these cases [10].
The present paper is a further study of our previous work on the job grouping problem. When developing new practical heuristics, we observed that certain components always appear together. It is therefore interesting to see whether one could utilize this property to reduce the problem size without sacrificing the exactness of the solution process. A further motivation is that the running times of the 0/1-solver and the constraint programming solution were observed to increase rapidly with increasing problem size.

In the following we discuss the job grouping problem in somewhat more general terms. Instead of electronic components, PCBs and job groups we consider a general weight constrained set partitioning problem which can easily be interpreted as the concrete job grouping problem of the PCB industry. In this general weighted grouping problem (GWGP), we have a basic set of elements, each element being of a known weight, and a set of collections each consisting of a number of elements. From the collections we want to form groups in such a way that the total weight of the elements in each group does not exceed a given limit, all collections belong to at least one group, and the number of groups is minimal.

We will show that it is possible to reduce the size of the GWGP by merging the elements which are always used jointly in every collection in which they appear. It is also shown that the solution of the original problem instance can be easily obtained from the solution of the reduced problem. Furthermore, we give an efficient reduction algorithm which performs maximal reduction and runs fast. Tests with statistical models and real problem instances from industry demonstrate that a considerable amount of reduction is often obtained. Finally, we analyse briefly the effect of reduction on the running times of basic GWGP solving algorithms.
2. Grouping problem

We here define a mathematical model of the grouping problem and show how any problem of this kind can be transformed into an equivalent reduced problem.
2.1. A general formulation of the grouping problem

Let us start with some basic notations (see Fig. 1 for an illustration).

• E is a set of elements e. Each element e has an associated weight w(e) ∈ ℕ.
• C is a set of collections c, where each c ⊆ E. We assume in the sequel that E can be obtained indirectly via C as E = ⋃_{c∈C} c.¹
• Subsets g of C are called groups. We use the abbreviation e(g) for the elements of collections in group g, that is e(g) = ⋃_{c∈g} c.
• A grouping G is a set of groups.

In the context of PCB printing problems, the notation can be used as follows: e is a component, c is a PCB (to be filled with elements), and g is a fillup of printing feeders with components.

The weights of the elements generalize to collections and groups in the natural way:

• for c ∈ C, w(c) = Σ_{e∈c} w(e), and
• for g ∈ G, w(g) = Σ_{e∈e(g)} w(e).

Note that w(g) ≤ Σ_{c∈g} w(c) and the inequality is strict if any collections in g intersect each other.

We are especially interested in groupings G that are set covers of C (⋃_{g∈G} g = C), and where the weight of each group is less than some predefined maximum. In the PCB context, this means that each PCB appears at least in one group and the feeder capacity is not violated in any group.

Definition 2.1 (Legal grouping). Let w_M ∈ ℕ be some maximal allowed weight, and C a set of collections of weighted elements as defined above. We say that a grouping G is legal, if G is a set cover of C and w(g) ≤ w_M for all g ∈ G.

Definition 2.2 (GWGP, solution). The GWGP (C, w_M) is to find a legal grouping G of C with minimal index |G|. Such a G is called the solution of the problem.
¹ This interpretation also assures that each e ∈ E is relevant to the grouping problem.
Fig. 1. Elements, collections and groups. A sample case of E = {e1, e2, e3, e4, e5}, C = {c1, c2, c3, c4} = {{e1, e2, e3}, {e2, e3, e4, e5}, {e1, e4, e5}, {e4, e5}}, G = {g1, g2} = {{c2, c4}, {c1, c3}}.
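To make the notation concrete, the following small Python sketch (ours, not part of the original paper) encodes the sample instance of Fig. 1 and checks the legality of a grouping; the element weights and the limit w_M used here are invented for illustration.

    from itertools import chain

    # Sample instance of Fig. 1; the weights and the limit w_M are invented here.
    weights = {"e1": 2, "e2": 1, "e3": 1, "e4": 3, "e5": 2}           # w(e)
    collections = {
        "c1": {"e1", "e2", "e3"},
        "c2": {"e2", "e3", "e4", "e5"},
        "c3": {"e1", "e4", "e5"},
        "c4": {"e4", "e5"},
    }
    w_M = 9

    def elements_of(group):
        """e(g): the union of the collections in a group."""
        return set(chain.from_iterable(collections[c] for c in group))

    def weight(group):
        """w(g): total weight of the distinct elements used by the group."""
        return sum(weights[e] for e in elements_of(group))

    def is_legal(grouping):
        """Legal grouping: covers C and no group exceeds w_M."""
        covered = set(chain.from_iterable(grouping))
        return covered == set(collections) and all(weight(g) <= w_M for g in grouping)

    G = [{"c2", "c4"}, {"c1", "c3"}]        # the grouping G of Fig. 1
    print(weight({"c2", "c4"}), weight({"c1", "c3"}), is_legal(G))   # 7 9 True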
In the PCB printing context, the numbers w(e) denote the component widths, w_M is the maximum feeder width, and GWGP (C, w_M) means finding a minimum number of feeder setups within the feeder width limits.

2.2. Known complexity results

The GWGP problem is computationally extremely hard: many well-known NP-hard problems are 'only' special cases of it. For example, the minimum bin packing problem (see [2], for example) is a special case of a GWGP (C, w_M) where c_i ∩ c_j = ∅ for all c_i ≠ c_j ∈ C [18]. In GWGP problems the common elements do not increase the load of the bins. In the batch selection problem we seek a single fillup of the feeders with a maximal number of jobs in it. The problem is easier than the job grouping problem (and hence also easier than GWGP) and it has been shown to be NP-complete (by a reduction from the graph clique problem) even when |c| = 2 for all c ∈ C [8]. In the set covering problem we want to decide whether there is a collection of sets such that their elements cover all the elements of a universal set. It has been shown that also the set covering problem is a special case of the job grouping problem [3,5].
The set covering problem is well studied in complexity theory, and it is known that no polynomial approximation algorithm (using the greedy selection method) with a constant worst-case bound exists (unless P = NP). Also, for any δ < 1/4, GWGP cannot be approximated within a factor of δ log|C| even if w(e) = 1 for all e ∈ E and w_M = |E| − 1 (unless NP ⊆ DTIME[n^polylog n]) [6].
2.3. Collection equivalence

Let us denote with C(e) the set of collections in which the element e appears, i.e.

    C(e) = {c ∈ C | e ∈ c}.

We say that two elements e1 and e2 are collection equivalent if C(e1) = C(e2), and denote this with e1 ≡_c e2. It is easy to verify that ≡_c is an equivalence relation (reflexive, symmetric and transitive) on E. This justifies the usual quotient set or partition notation E/≡_c = {e/≡_c | e ∈ E}, where e/≡_c = {e' ∈ E | e ≡_c e'}. We can give ≡_c also via the set E/≡_c in the form E/≡_c = {E1, ..., Em}, where the sets Ei are the classes of ≡_c.

It follows from the definition of ≡_c that for any F ∈ E/≡_c and c ∈ C, either c ∈ C(e) for all e ∈ F or c ∉ C(e) for all e ∈ F. In other words, either F ⊆ c or F ∩ c = ∅. Thus, ≡_c partitions also each c ∈ C, and c/≡_c is well-defined. The quotient notation can now be extended to groups and groupings: we define g/≡_c = {c/≡_c | c ∈ g} and G/≡_c = {g/≡_c | g ∈ G}.

Example. Consider the case of Fig. 1, where C = {c1, c2, c3, c4} and c1 = {e1, e2, e3}, c2 = {e2, e3, e4, e5}, c3 = {e1, e4, e5}, c4 = {e4, e5}. Then C(e1) = {c1, c3}, C(e2) = C(e3) = {c1, c2}, and C(e4) = C(e5) = {c2, c3, c4}. Thus, E/≡_c = {{e1}, {e2, e3}, {e4, e5}},

    c1/≡_c = {{e1}, {e2, e3}},
    c2/≡_c = {{e2, e3}, {e4, e5}},
    c3/≡_c = {{e1}, {e4, e5}},
    c4/≡_c = {{e4, e5}},

and
    g1/≡_c = {{{e2, e3}, {e4, e5}}, {{e4, e5}}},
    g2/≡_c = {{{e1}, {e2, e3}}, {{e1}, {e4, e5}}}.

Note that

    e(g1)/≡_c = {e2, e3, e4, e5}/≡_c = {{e2, e3}, {e4, e5}} = e(g1/≡_c).

This is no coincidence: the next lemma shows that e(g/≡_c) = e(g)/≡_c, i.e. one can always compute e(g/≡_c) directly from e(g). This property is needed later in the proof of Lemma 2.5.

Lemma 2.3. Let g ⊆ C. Then e(g/≡_c) = e(g)/≡_c.

Proof. Let F ∈ E/≡_c. Then

    F ∈ e(g/≡_c) ⟺ F ∈ ⋃_{c∈g} (c/≡_c)        % def. of e(g)
                 ⟺ (∃c ∈ g) F ∈ c/≡_c         % ≡_c partitions each c
                 ⟺ F ∈ (⋃_{c∈g} c)/≡_c        % idem
                 ⟺ F ∈ e(g)/≡_c.  □
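For illustration, E/≡_c can also be computed directly from the definition by grouping elements with identical sets C(e); the short sketch below (ours, reusing the collections dictionary of the earlier snippet) does exactly that for the running example.

    from collections import defaultdict

    def collection_sets(collections):
        """C(e) for every element e, read off from the collections."""
        C_of = defaultdict(set)
        for name, members in collections.items():
            for e in members:
                C_of[e].add(name)
        return C_of

    def quotient(collections):
        """E/≡_c: elements with identical C(e) fall into the same class."""
        classes = defaultdict(set)
        for e, cs in collection_sets(collections).items():
            classes[frozenset(cs)].add(e)
        return list(classes.values())

    print(quotient(collections))            # three classes: {e1}, {e2, e3}, {e4, e5}

Grouping by the full sets C(e) is exactly the explicit comparison that the algorithm of Section 3 avoids.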
2.4. Merging and reduction

Let ei ≠ ej ∈ E. Then we call e_{i,j} with w(e_{i,j}) = w(ei) + w(ej) the merge of ei and ej. The element e_{i,j} can be considered as ei and ej collapsed into one. This notation extends to sets of elements.

Definition 2.4. Let J = {i | ei ∈ F} be the index set of F ⊆ E. Then e_J with w(e_J) = Σ_{i∈J} w(ei) (= Σ_{e∈F} w(e) = w(F)) is the merge of F.

In the PCB printing context, merging means replacing two or more components with one 'supercomponent' the width of which equals the sum of the widths of the original ones.

We next define how merges of elements affect the collections. Let c ∈ C, F ⊆ E, and J the index set of F. Then r(c, F), the result of reducing c with respect to F, is defined as

    r(c, F) = (c \ F) ∪ {e_J}  if F ⊆ c,
    r(c, F) = c                if F ⊄ c.

Informally, the subset F is replaced by its merge in c. Again, this notation can be extended to the whole C, to groups g ∈ G, and to groupings G as

    r(C, F) = {r(c, F) | c ∈ C},
    r(g, F) = {r(c, F) | c ∈ g},
    r(G, F) = {r(g, F) | g ∈ G}.

We next extend the reduction operator over all of the sets in E/≡_c. Let E/≡_c = {E1, ..., Em} with index sets I1, ..., Im. Then r(c, ≡_c), the result of reducing c over ≡_c, is defined as the result of replacing all Ej with the corresponding merged element e_{Ij} in c. The resulting collection will then contain only merged elements. Note that the order in which the reductions are done can be chosen freely since the sets Ej do not overlap. Similarly, we define r(g, ≡_c) and r(G, ≡_c) as results of performing the merging of all ≡_c-classes. Continuing the example of Section 2.3 we have

    r(c1, ≡_c) = {e_{1}, e_{2,3}},
    r(c2, ≡_c) = {e_{2,3}, e_{4,5}},
    r(c3, ≡_c) = {e_{1}, e_{4,5}},
    r(c4, ≡_c) = {e_{4,5}}.

The following lemma shows that reduction preserves group weights. It will be essential later in the proof of Proposition 2.7.

Lemma 2.5. For all g ⊆ C, w(r(g, ≡_c)) = w(g).

Proof. In the proof below we use the fact that each e_J ∈ e(r(g, ≡_c)) is a result of reducing some F ∈ e(g/≡_c). Then

    w(r(g, ≡_c)) = Σ_{e_J ∈ e(r(g,≡_c))} w(e_J)    % definition of w(g)
                 = Σ_{F ∈ e(g/≡_c)} w(F)           % w(e_J) = w(F) and the fact above
                 = Σ_{F ∈ e(g)/≡_c} w(F)           % Lemma 2.3
                 = Σ_{e ∈ e(g)} w(e)               % ≡_c partitions e(g)
                 = w(g).                           % definition of w(g)  □

2.5. Problem reduction

Let P = GWGP(C, w_M). We denote by r(P, ≡_c) the reduced problem GWGP(r(C, ≡_c), w_M). In this section we will show essentially that any GWGP problem P can be solved by solving the reduced (and hopefully computationally easier) problem r(P, ≡_c). In order to do this, we need the property that the mappings r(c, ≡_c) are actually bijections, where the inverse mapping r⁻¹ replaces the merged elements with the sets of original elements. This mapping can be defined (analogously to r) as

    r⁻¹(c, e_J) = (c \ {e_J}) ∪ F  if e_J ∈ c,
    r⁻¹(c, e_J) = c                if e_J ∉ c,

where J is the index set of F ⊆ E. This operation is then extended to r⁻¹(c, ≡_c) to mean the simultaneous application of all r⁻¹(c, e_I), where e_I ranges over the merges of ≡_c-classes. It should be straightforward to verify that r⁻¹(r(c, ≡_c), ≡_c) = c. Using this inverse, we can easily show the 'mirror image' of Lemma 2.5.

Corollary 2.6. For all g ⊆ r(C, ≡_c), w(r⁻¹(g, ≡_c)) = w(g).

The following proposition shows that solutions of the original problem can always be transformed into solutions of the reduced problem.

Proposition 2.7. If G is a solution of P = GWGP(C, w_M), then r(G, ≡_c) is a solution of r(P, ≡_c).

Proof. Recall that solutions are legal groupings with minimal index. Grouping r(G, ≡_c) is legal because none of the group weights change in the reduction (Lemma 2.5). What remains is to show that r(G, ≡_c) also has a minimal index. Suppose r(P, ≡_c) has a solution G' with |G'| < |r(G, ≡_c)|. Then, by Corollary 2.6, the grouping r⁻¹(G', ≡_c) must be a legal grouping of P and furthermore |r⁻¹(G', ≡_c)| = |G'| < |r(G, ≡_c)| = |G|. This is clearly a contradiction, because G had a minimal index.  □
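The following sketch (ours, reusing the quotient function and the collections and weights of the earlier snippets) builds the reduced collections r(C, ≡_c) with merged weights and confirms on the running example that group weights are unchanged, as Lemma 2.5 states.

    def reduce_problem(collections, weights):
        """Build r(C, ≡_c): one merged element per ≡_c-class, carrying the class weight."""
        classes = quotient(collections)                       # from the previous sketch
        class_of = {e: i for i, cls in enumerate(classes) for e in cls}
        merged_w = {i: sum(weights[e] for e in cls) for i, cls in enumerate(classes)}
        reduced_C = {name: {class_of[e] for e in members}     # r(c, ≡_c) for every c
                     for name, members in collections.items()}
        return reduced_C, merged_w, classes

    def group_weight(group, C, w):
        """w(g) computed against a given collection map C and weight map w."""
        return sum(w[x] for x in set().union(*(C[c] for c in group)))

    reduced_C, merged_w, classes = reduce_problem(collections, weights)
    for g in [{"c2", "c4"}, {"c1", "c3"}]:                    # the grouping of Fig. 1
        print(group_weight(g, reduced_C, merged_w), group_weight(g, collections, weights))

Because a group is just a set of collection names, a legal grouping found for the reduced instance transfers directly to the original one (and vice versa), which is the content of the corollary and theorem that follow.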
The mirror image of Proposition 2.7 can be shown similarly. This gives us the following corollary.

Corollary 2.8. Let P = GWGP(C, w_M). If G is a solution of r(P, ≡_c), then r⁻¹(G, ≡_c) is a solution of P.

Our main theorem simply collects the two results above into one.

Theorem 2.9 (Reduction Theorem). Let P = GWGP(C, w_M). Grouping G is a solution of P if and only if r(G, ≡_c) is a solution of r(P, ≡_c).

There is also an interesting corollary we obtain by using Theorem 2.9 'backwards'. Any weighted grouping problem can be transformed into a non-weighted one (with standard weight 1 on all elements), where the weight limit w_M simply bounds the number of elements in a group. This can be shown by treating each e ∈ E as a merge of w(e) unit-weighted elements (assuming the weights are natural numbers) and then applying the inverse mapping to e. This corollary has many practical applications, because many existing algorithms and theoretical results for job grouping problems consider only non-weighted problems.

Corollary 2.10. Let P = GWGP(C, w_M), E = ⋃_{c∈C} c = {e1, ..., em} and w(E) ⊆ ℕ. Let A(C, k) be an algorithm returning a solution of the non-weighted set grouping problem, where the maximum group size is k. Then there exists a bijection s such that s⁻¹(A(s(C), w_M)) is a solution of P.

Proof. The required mapping s can be defined similarly to r⁻¹ as

    s(c, ei) = (c \ {ei}) ∪ {ei^1, ..., ei^{w(ei)}}  if ei ∈ c,
    s(c, ei) = c                                     if ei ∉ c,

for all ei ∈ E.  □
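Corollary 2.10 can be illustrated with a small sketch (ours, reusing the earlier example data; the copy-naming scheme e#k is invented) that expands every element into w(e) unit-weight copies; collapsing the copies back gives s⁻¹.

    def to_unit_weights(collections, weights):
        """s(C) of Corollary 2.10: replace every element e by w(e) unit-weight copies;
        the group size limit of the non-weighted problem is then the original w_M."""
        return {name: {f"{e}#{k}" for e in members for k in range(weights[e])}
                for name, members in collections.items()}

    print(sorted(to_unit_weights(collections, weights)["c4"]))
    # ['e4#0', 'e4#1', 'e4#2', 'e5#0', 'e5#1']  (w(e4) = 3 and w(e5) = 2 copies)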
3. Reduction algorithm

We here present and analyse a non-naive algorithm to compute E/≡_c, which can then be used to construct r(C, ≡_c). Although (in principle) any polynomial-time algorithm would do as a preprocessing stage for an NP-hard problem, we put some effort into the efficient implementation, because the reduction can also be used with fast heuristic algorithms (in addition to exact solution approaches).

3.1. Algorithm
In the construction of ≡_c we avoid comparing the sets C(e) explicitly, since this would be rather costly. Instead, we start from an initial universal relation on E and refine it by iteratively splitting each class F of the current relation into classes F1 = F ∩ c and F2 = F \ c with respect to each collection c ∈ C. Lemma 3.1 shows (in a constructive way) that the relation resulting from all these iterations is ≡_c. This kind of iterative refining is a standard approach in algorithms finding the coarsest refinement (of some initial partitioning) containing a given binary relation on the elements (e.g. e1 ≡ e2 iff f(e1) = f(e2) for some given function f) [15]. Our approach is different in the sense that we avoid the explicit computation of the basic condition (C(e1) = C(e2)), too.

function C-equal(C, E): equivalence relation on E
    ≡ = E × E (i.e. E/≡ = {E})
    for each c ∈ C do
        for each F ∈ E/≡ do
            F1 = ∅, F2 = ∅
            for each e ∈ F do
                if e ∈ c then F1 = F1 ∪ {e} else F2 = F2 ∪ {e}
            if F1 ≠ ∅ and F2 ≠ ∅ then E/≡ = (E/≡ \ {F}) ∪ {F1, F2}
    return ≡

Let us trace the execution of the algorithm with our running example.

    ≡_0 : {{e1, e2, e3, e4, e5}}
    ≡_1 : {{e1, e2, e3}, {e4, e5}}          % c1 = {e1, e2, e3}
    ≡_2 : {{e1}, {e2, e3}, {e4, e5}}        % c2 = {e2, e3, e4, e5}
    ≡_3 : {{e1}, {e2, e3}, {e4, e5}}        % c3 = {e1, e4, e5}
    ≡_4 : {{e1}, {e2, e3}, {e4, e5}}        % c4 = {e4, e5}
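For completeness, a direct, runnable transcription of C-equal (ours; it follows the pseudocode above rather than the array-based implementation discussed in Section 3.2, and reuses the collections of the earlier snippets):

    def c_equal(collections, elements):
        """Refine the universal partition of `elements` against every collection,
        as in function C-equal; returns E/≡_c as a list of classes."""
        partition = [set(elements)]                   # E/≡ = {E}
        for c in collections.values():
            refined = []
            for F in partition:
                F1, F2 = F & c, F - c                 # split F with respect to c
                if F1 and F2:
                    refined.extend([F1, F2])
                else:
                    refined.append(F)
            partition = refined
        return partition

    elements = set().union(*collections.values())
    print(c_equal(collections, elements))             # [{e1}, {e2, e3}, {e4, e5}]

Each pass over a collection touches every element at most once, so with constant-time membership tests the loop performs O(|C||E|) work, in line with the analysis of Section 3.2.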
Lemma 3.1. Function C-equal computes the relation ≡_c.

Proof. Let C = {c1, ..., cm} and denote with ≡_i the relation after loop i. Our aim is to show that e1 ≡_m e2 if and only if e1 ≡_c e2. Let us denote with C^i(e) the set C(e) ∩ {c1, ..., ci}. So C^i(e) is C(e) restricted to the first i collections (the ones the algorithm has already used in the refinement process). Note that the sets C^i(e) are obtained from the sets C^{i−1}(e) as follows:

    C^i(e) = C^{i−1}(e) ∪ {ci}  if e ∈ ci,
    C^i(e) = C^{i−1}(e)         if e ∉ ci.

We show first that e1 ≡_i e2 if and only if C^i(e1) = C^i(e2). At i = 0 we have e1 ≡_0 e2 and C^0(e1) = C(e1) ∩ ∅ = ∅ = C(e2) ∩ ∅ = C^0(e2) for all e1, e2 ∈ E. Assume now that the claim holds for all i ≤ k − 1 (k > 0) and consider the situation after loop k for some e1, e2 ∈ E.

• C^k(e1) = C^k(e2) ⟹ e1 ≡_k e2. It follows from the definition that C^{k−1}(e1) = C^{k−1}(e2), and from the inductive assumption that e1 ≡_{k−1} e2. If both e1, e2 ∈ ck, the algorithm moves both e1 and e2 to the set F1, and otherwise both remain in F2; in either case they end up in the same class of ≡_k.

• C^k(e1) ≠ C^k(e2) ⟹ e1 ≢_k e2. Here we have two possibilities: either we had already C^{k−1}(e1) ≠ C^{k−1}(e2), or it was ck that made the sets different. In the first case we also have e1 ≢_{k−1} e2 and consequently e1 ≢_k e2. In the latter case either C^k(e1) or C^k(e2) contains ck and the other one does not. Supposing ck ∈ C^k(e1) we see that the algorithm moves e1 to F1 and e2 to F2, and consequently e1 ≢_k e2.

Using the loop invariant just shown, after all c ∈ C have been considered we have that

    e1 ≡_m e2 ⟺ C(e1) ∩ C = C(e2) ∩ C ⟺ C(e1) = C(e2) ⟺ e1 ≡_c e2.  □

After ≡_c has been built it is straightforward to compute the sum of the weights of the elements in all equivalence classes (these weights will be assigned to the merged elements) and to store them with each class. We also assign a unique number key(e/≡_c) to each class e/≡_c and store pointers to the classes in an array indexed with these numbers. These class numbers will be the elements of the reduced problem. The reduced collections can then be built simply by taking the union of key(e/≡_c) over e ∈ c for each c ∈ C (an O(|C||E|) operation). The solution of the original problem is obtained from the reduced one by finding the original classes via the key(e/≡_c)-values.

3.2. Implementation details and time analysis

Elements and collections can be implemented simply as numbers 1, ..., |E| and 1, ..., |C|, and the membership of e in the sets c as a Boolean |E| × |C| matrix. This matrix contains the sets c in its columns and the sets C(e) in its rows. The equivalence class implementation has to support the refining (splitting) of classes only. The data structures used in [12], where class elements are stored as segments of an array of size |E|, and where each class header contains the starting index and size of the class, work well here. This implementation makes it possible to refine classes with constant work per moved element. Note that the line E/≡ = (E/≡ \ {F}) ∪ {F1, F2} of the algorithm can be implemented effectively by 're-using' F for the larger one of F1 and F2 and moving only the elements of the smaller half to a new class.

With the above implementation, the two innermost for-loops perform O(|E|) work per iteration. This gives O(|C||E|) time complexity for the whole algorithm. In our testing, the algorithm performed extremely fast; computation of ≡_c for a problem case of 217 elements and 16 collections took about a second on a slow Unix machine (75 MHz SPARC).
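One possible realisation of the array-segment class representation sketched above (our reading of the idea in [12], not the authors' code): elements are kept in a single array, each class is a contiguous segment, and after a split only the smaller half is relabelled with a new class number.

    class RefinablePartition:
        """Elements are stored in one array; every class is a contiguous segment
        described by (start, size). A split relabels only the smaller half with a
        new class number (the 're-use the larger half' trick mentioned above)."""

        def __init__(self, elements):
            self.order = list(elements)
            self.pos = {e: i for i, e in enumerate(self.order)}
            self.cls = {e: 0 for e in self.order}            # class number of each element
            self.bounds = {0: (0, len(self.order))}          # class number -> (start, size)

        def split(self, cid, marked):
            """Split class cid against the set `marked`; returns the new class number,
            or None if the class lies entirely inside or outside `marked`."""
            start, size = self.bounds[cid]
            inside = [e for e in self.order[start:start + size] if e in marked]
            k = len(inside)
            if k == 0 or k == size:
                return None
            # swap the marked elements to the front of the segment
            i = start
            for e in inside:
                j = self.pos[e]
                other = self.order[i]
                self.order[i], self.order[j] = e, other
                self.pos[e], self.pos[other] = i, j
                i += 1
            # keep the old number for the larger half, relabel only the smaller one
            new_id = max(self.bounds) + 1
            if k <= size - k:
                self.bounds[cid], self.bounds[new_id] = (start + k, size - k), (start, k)
            else:
                self.bounds[cid], self.bounds[new_id] = (start, k), (start + k, size - k)
            ns, nz = self.bounds[new_id]
            for e in self.order[ns:ns + nz]:
                self.cls[e] = new_id
            return new_id

    # In C-equal, each collection c is processed by calling split(cid, c) for every
    # existing class number; relabelling is charged to the moved smaller-half elements,
    # the scan itself to the surrounding O(|C||E|) loop.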
4. Expected reduction ratio

We have now a theoretically sound and computationally effective technique for reducing
GWGP problems. The next natural question is how much we can expect it to help us in practice. That is, given some random problem instance, can we say anything in general terms about the expected amount of reduction? Our interest is in the reduction ratio rr = |E|/|E/≡_c|, which tells how much our technique can reduce the original problem. For example, a reduction ratio of 1.33 means that |E/≡_c| is 75% of the original |E|. Note that this reduction ratio is the same as the average size of the equivalence classes. Thus, in what follows, we typically compute rr using the probabilities Pr(ei ≡_c ej) to decide the expected size of each class ei/≡_c, and rr is then obtained as the average over these sizes. These computations become quite simple when we assume that the occurrences of elements in collections are independent of each other, and particularly simple when Pr(ei ≡_c ej) (i ≠ j) is a constant.

4.1. Problem reduction with random data

Let us first consider the following simple model for the distribution of elements into collections.

• The absence and presence of any e in any c are equally likely, that is, Pr(e ∈ c) = 1/2.
• The occurrences of elements in collections are independent of the occurrences of other elements.

In this model, the probability of merging two elements ei ≠ ej, Pr(ei ≡_c ej), is (1/2)^|C| (the sets C(ei) and C(ej) are the same with this probability). Given some ei ∈ E, each ej ≠ ei can now be interpreted as a Bernoulli trial for ei ≡_c ej with probability (1/2)^|C|, which leads to a binomial distribution. The expected number of elements ej equivalent to a given ei, i.e. the expected reduction ratio, is hence

    Σ_{j=1}^{|E|} Pr(ei ≡_c ej) = 1 + (|E| − 1)/2^|C|

in this model. The number 1 comes from the observation that Pr(ei ≡_c ei) = 1. If, for example, |C| = 30, then 1 + (|E| − 1)/2^|C| ≥ 1.01 only for |E| of about 10^7 and greater, which is seldom the case in real-life applications. Thus, the reduction ratio is likely to be near 1 with any reasonable |E|.

4.2. Problem reduction with real data

The result in the previous section looks quite disappointing. But in our opinion it is disappointing because the chosen simple model does not model real-life problem cases at all. To illustrate this, we present reduction results obtained for samples taken from a real PCB assembly line of 341 different PCB types where the overall number of component types was 406. We tested our algorithm with sample sizes of 10, 15, ..., 50 PCB types. For each sample size 100 random samples were drawn from the pool of 341 PCBs, after which the reduction ratio for E = ⋃_{c∈C} c (the element set of the actual sample) was computed for each sample C.²

Fig. 2 shows the average number of different elements for each sample size. The value of |E| varied between 150 and 350 and it correlates strongly (as expected) with the sample size |C|. The average reduction ratios for the same sets of samples are shown in Fig. 3. We note that the observed ratio is very high for small sample sizes (3.1 for |C| = 10) and considerably different from 1 (1.32) even for |C| = 50. This is in strong contrast to the expected reduction ratios one gets using the random distribution model. The usual production batch consists of 30 to 40 PCB types in the assembly line the data was collected from, which implies that the expected reduction ratio is notable also in practice.

² Note that the use of random sampling should be taken with some caution. One could expect that the PCB types in active production have some kinds of interdependencies. We did not, however, have access to the production history of the plant to verify this.

Fig. 2. The average number of different elements in randomly selected PCB samples, as a function of the sample size (real production data).

Fig. 3. Reduction ratios (rr) for different |C| (averages of 100 executions, real production data).

4.3. Problem reduction in uniform distributions

We assumed Pr(e ∈ c) = 1/2 for all e ∈ E and c ∈ C to create the random model of Section 4.1. If the elements are uniformly and independently
distributed among the collections, then Pr(e ∈ c) can be (crudely) approximated as

    Pr(e ∈ c) ≈ (Σ_{e∈E} |C(e)|/|C|) / |E|.

We computed this number for the test data of Section 4.2 (|E| = 406, |C| = 341) and obtained Pr(e ∈ c) = 0.083, which means that the absence of e in c is much more likely than its presence. In this section we attempt to find out how much effect this base probability has on the expected rr. To make the subsequent formulae a bit more compact, we denote this probability with b in the sequel.

Our next task is to derive a formulation for Pr(ei ≡_c ej) given some base probability b. This is best illustrated by analysing the execution of the reduction algorithm. The algorithm moves at step k elements ei ≡_{k−1} ej to the same class if both ei, ej ∈ ck or neither of them is. The joint probability for this to happen is (using the independence assumption)

    Pr((ei ∈ ck ∧ ej ∈ ck) ∨ (ei ∉ ck ∧ ej ∉ ck))
        = Pr(ei ∈ ck) Pr(ej ∈ ck) + Pr(ei ∉ ck) Pr(ej ∉ ck)
        = b² + (1 − b)² = 2b² − 2b + 1.

The above process has to succeed |C| times before ei ≡_c ej. The overall probability Pr(ei ≡_c ej) is slightly greater than (2b² − 2b + 1)^|C|, however, since we can assume that each element is used in at least one collection. Let these two 'obligatory' collections of ei and ej be c1 and c2, respectively. We have two cases to consider:

• c1 ≠ c2. This happens with probability (|C| − 1)/|C|, and Pr(ei ∈ c2) = Pr(ej ∈ c1) = b. The remaining |C| − 2 memberships are the same with probability (2b² − 2b + 1)^{|C|−2}.
• c1 = c2. This happens with probability 1/|C|, and then we have a sure match at c1 = c2. The remaining |C| − 1 memberships are the same with probability (2b² − 2b + 1)^{|C|−1}.

Combining these cases gives us

    Pr(ei ≡_c ej) = ((|C| − 1)/|C|) b² (2b² − 2b + 1)^{|C|−2} + (1/|C|) (2b² − 2b + 1)^{|C|−1}
                  = (2b² − 2b + 1)^{|C|−2} ((|C| + 1)b² − 2b + 1) / |C|.

One can see from the formula above that the probability is slightly greater than in the unrestricted case (where C(e) = ∅ is allowed) and that the probability decreases when |C| increases. Although |E| may grow arbitrarily when |C| grows, this growth is quite small in our example (Fig. 2 shows sublinear progress). However, the decrease in the probability of merging is always exponential (as seen from the previous formula). Thus, at least with our data, the expected reduction ratio decreases when |C| increases. The main reason for the decreasing tendency of rr-values is that for
small |C| there are many e ∈ E that appear only in one and the same c (and thus get merged). So, this simple model captures at least this property of our real-life case.

Let us next analyse the connection between b and Pr(ei ≡_c ej). For the unrestricted case (with no obligatory memberships), b = 1/2 actually gives the minimal Pr(e1 ≡_c e2), independently of |C|. This is easy to verify by observing the derivative (with respect to b) of (2b² − 2b + 1)^|C|. In the restricted case the minimal value is slightly smaller than 1/2, but the derivative is much more complex. We derived the (rather lengthy) function of |C| for the b0 minimizing Pr(e1 ≡_c e2) using Maple (a symbolic mathematics program) and plotted its values for varying |C|. The results are shown in Fig. 4, and they illustrate the fact that b0 is quite stable (and close to 1/2) from |C| ≥ 20 onwards, and it is greater than 1/3 for any sensible sample size.

The overall form of Pr(e1 ≡_c e2) as a function of b is steeply descending (ascending) for small (large) values of b, and it is almost zero for values between 0.2 and 0.7 (see Fig. 5). In practice, values of b close to 1 are rather unnatural, since this would imply that most PCBs contain almost all (and consequently the same) components. So, if we know that Pr(e ∈ c) is indeed close to a constant b
for some problem instance, then we also know that the reduction is the more effective the closer b is to 0. With our test data, the observed value b = 0.083 gives that Pr(e1 ≡_c e2) (for |C| = 20) is about 2500 times greater than in the case b = 1/2. Fig. 7 shows a comparison of results obtained by using the uniform distribution model and our observations using b = 0.083. One can see from the figure that the shape of the estimation curve resembles the one we observed. However, it (greatly) over-estimates the actual one for small |C| and similarly under-estimates it for larger |C|.

4.4. Non-uniform distribution

The uniform distribution with some base probability b is usually too simplistic to model real problem instances accurately. In our test data the probabilities Pr(e ∈ c) vary a lot, because there are some special components that are used only in a few PCBs whilst some common components are used in almost all of them. Fig. 6 contains the frequency histogram of these probabilities. It shows clearly how probabilities below 0.05 dominate the distribution.

Suppose we could model the individual membership probabilities with some base probability function f : E → [0, 1] (f(ei) gives the constant base probability of each element ei).
Fig. 4. Base probabilities (b) leading to minimal merging probability Pr(ei ≡_c ej) for different |C| (restricted case, uniform distribution).
Fig. 5. Merging probability Pr(ei ≡_c ej) as a function of the base probability b for |C| = 20 (restricted case, uniform distribution).
Estimates for this function could be derived using, for example, standard parametric curve fitting methods for a model template (the distribution in Fig. 6 could be a binomial one, for example). Then it is quite straightforward to extend the formulae of Section 4.3 to cover this more general model.
The probability g(ei, ej) that ei and ej are moved to the same equivalence class when considering collection ck is

    g(ei, ej) = Pr((ei ∈ ck ∧ ej ∈ ck) ∨ (ei ∉ ck ∧ ej ∉ ck))
              = f(ei) f(ej) + (1 − f(ei)) (1 − f(ej)).
Pr(ei ≡_c ej) can (under the hypothesis of independence of element occurrences) be derived with suitable modifications to the 'obligatory collection' analysis:

• c1 ≠ c2: Pr(ej ∈ c1) = f(ej), Pr(ei ∈ c2) = f(ei), and the remaining |C| − 2 memberships are the same with probability g(ei, ej)^{|C|−2}.
• c1 = c2: the remaining |C| − 1 memberships are the same with probability g(ei, ej)^{|C|−1}.

Combining these cases gives us

    Pr(ei ≡_c ej) = ((|C| − 1) f(ei) f(ej) g(ei, ej)^{|C|−2} + g(ei, ej)^{|C|−1}) / |C|.
Fig. 6. Frequency histogram for Pr(e ∈ c). The bars indicate the number of elements with an observed |C(e)|/|C|-value within a given range (real production data).
The computation of rr can no longer be done using a binomial distribution, however. Our analysis below is based on the idea that we first compute how many elements fall into the same class as e1, and after that, how many of the remaining elements are expected to be equivalent to e2, etc. In
general, the expected number of elements ej such that j > i and ej has not already been merged with some earlier element ek (k < i) in ei/≡_c is

    Σ_{j=i+1}^{|E|} Pr(ei ≡_c ej) Pr(ej ≢_c ek, k = 1, ..., i − 1)
        = Σ_{j=i+1}^{|E|} Pr(ei ≡_c ej) Π_{k=1}^{i−1} (1 − Pr(ej ≡_c ek)).

We then get the expected rr by summing up all of the above (i = 1, ..., |E|) and taking the average value.
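A small sketch of this estimator (ours, not from the paper); the function f can be a constant b, which reproduces the uniform model of Section 4.3, or per-element frequencies. The final averaging step is our reading of the description above.

    from math import prod

    def g(fi, fj):
        """Probability that e_i and e_j agree on one collection (both in or both out)."""
        return fi * fj + (1 - fi) * (1 - fj)

    def p_equiv(fi, fj, C):
        """Pr(e_i ≡_c e_j) under the non-uniform model with obligatory collections."""
        return ((C - 1) * fi * fj * g(fi, fj) ** (C - 2) + g(fi, fj) ** (C - 1)) / C

    def expected_new_members(i, f, C):
        """The displayed sum: expected number of e_j (j > i) that join e_i's class
        and were not already merged with an earlier e_k (k < i)."""
        return sum(p_equiv(f[i], f[j], C) * prod(1 - p_equiv(f[j], f[k], C) for k in range(i))
                   for j in range(i + 1, len(f)))

    def estimate_rr(f, C):
        """One reading of 'summing up all of the above and taking the average value':
        the average of the expected class sizes seen from each e_i."""
        return sum(1 + expected_new_members(i, f, C) for i in range(len(f))) / len(f)

    print(estimate_rr([0.083] * 50, 20))     # uniform special case f(e) = b = 0.083, |C| = 20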
Instead of finding some appropriate theoretical model for our data, we used the observed probabilities |C(e)|/|C| as such for the values of f(e), and computed the expected reduction ratios. As seen in Fig. 7, the results were quite accurate for |C| ≥ 20, but the over-estimation was still notable for smaller |C|.

4.5. Discussion

We observed that neither the uniform nor the non-uniform model was able to estimate the reduction ratio accurately over the whole range of sample sizes. However, even the plain base probability b can be used as a simple test value for a given problem instance (small values of b indicate a good reduction result). Using the individual frequencies for each element gives an even more accurate model. Better models could be obtained by enhancing our models to consider (at least) the following things.

Dependency of element occurrences. There are at least two practical properties of PCBs violating our independence assumption. First, PCB types are often variants of one common 'kernel design', and the components of the kernel part always appear together in all variants. Second, even if the kernel design idea does not apply, components often form sub-designs where the components always appear together when that sub-design is used in a PCB. Fig. 8 illustrates the presence of these dependencies in our test data. The E × C bit matrix presents the memberships e ∈ c with light dots, and the rows have been ordered according to a clustering with the Hamming metric as the distance measure. The clustering was performed by a single application of the GLA using soft centroids and a random initial clustering of 50 clusters [11]. Larger light areas – which clearly are present – indicate similar membership patterns of elements. We assume that using some more informed metric (like Jaccard's similarity used in [14]) and systematically experimenting with different numbers of clusters would reveal even more dependencies of this kind in the data.
Fig. 7. Observed and estimated reduction ratios.
Fig. 8. Component families revealed by a simple clustering analysis (real production data, Hamming distance).
Collection sizes. PCB types are not of arbitrary sizes; it is against the nature of the field that some board would require all the available components. Our simple models consider neither the effect of |cj| nor the effect of the distribution of |cj| on the overall merging probabilities.
5. Time reduction in grouping algorithms

The previous section showed that the reduction ratios are often notable. The next logical question is how much this reduction affects the running times of the solution methods for the GWGP problem. We discuss this aspect at a quite general level because we do not want to commit ourselves to the details of any particular grouping algorithm.

5.1. Mixed integer programming

Consider first a typical integer programming formulation of the problem (see [4], for example). Suppose that |G| is fixed and we simply wish to see whether there exists a legal grouping of the given size. This can easily be modified into an optimization algorithm by running the checking procedure with decreasing grouping sizes. The memberships ei ∈ e(gk) and cj ∈ gk are represented with 0/1 decision variables xik and yjk, respectively. The problem is to find a grouping under the following constraints:

    xik ≥ yjk  for all ei ∈ cj,                       (1)

    Σ_{k=1}^{|G|} yjk = 1  for all cj ∈ C,            (2)

    Σ_{i=1}^{|E|} xik w(ei) ≤ wM  for all gk ∈ G.     (3)

Constraint (1) ensures that a collection may belong to a group only if all of its elements do, (2) states that collections must appear in some group, and (3) takes care of the capacity constraint. The number of decision variables in the formulation (1)–(3) is |G|(|C| + |E|), and the number of constraints on the variables is

    |G| Σ_{c∈C} |c| + |C| + |G||E|.
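As an illustration, the feasibility check (1)–(3) can be written, for example, with the PuLP modelling library (our sketch, with our own names; the experiments reported below used CPLEX).

    import pulp  # assumed available; any MIP solver could be used instead

    def legal_grouping_exists(collections, weights, w_M, n_groups):
        """Feasibility version of (1)-(3): is there a legal grouping of n_groups groups?"""
        elements = sorted(set().union(*collections.values()))
        names = sorted(collections)
        prob = pulp.LpProblem("job_grouping")                 # no objective: feasibility only
        x = pulp.LpVariable.dicts("x", (elements, range(n_groups)), cat="Binary")
        y = pulp.LpVariable.dicts("y", (names, range(n_groups)), cat="Binary")
        for c in names:
            prob += pulp.lpSum(y[c][k] for k in range(n_groups)) == 1            # (2)
            for e in collections[c]:
                for k in range(n_groups):
                    prob += x[e][k] >= y[c][k]                                    # (1)
        for k in range(n_groups):
            prob += pulp.lpSum(weights[e] * x[e][k] for e in elements) <= w_M    # (3)
        prob.solve(pulp.PULP_CBC_CMD(msg=False))
        return pulp.LpStatus[prob.status] == "Optimal"

    def minimal_group_count(collections, weights, w_M):
        """Optimization by decreasing |G| as described above; assumes the trivial
        grouping with one group per collection is legal."""
        k = len(collections)
        while k > 1 and legal_grouping_exists(collections, weights, w_M, k - 1):
            k -= 1
        return k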
Any reduction in |E| thus decreases both the number of variables and the number of constraints. This can lead to remarkable differences in the running times of MIP solvers. We tested the effect of reduction with 30 random problems of size |C| = 10 drawn randomly from the data used in Section 4.2. Although the small problem size gives a good reduction ratio, the main reason for selecting such a small size was the sheer running time of the MIP solver (CPLEX). Some test runs (for the non-reduced cases) took over 3 hours even for problems of this size. Running times were measured both for finding a solution (given the minimal |G|) and for verifying that there were no solutions for |G| − 1. The latter time is a more reliable measure, since the verification requires the examination of all of the search space. Most (24 out of 30) of the test cases had an optimal solution with |G| = 3 or |G| = 4. Table 1 contains the reduction in running times (running time of the original problem divided by the running time of the reduced problem) separately for two of the most common |G| (3 and 4) and for the overall data.

The results of Table 1 illustrate that reduction has a significant impact on the running times. Although the variances of the running times were high, the reduced version was always at least 2.39 times faster to solve than its original counterpart. An average speedup of 15 times may become a crucial factor when running times are measured in hours and days.
Table 1
The effect of problem reduction for an integer programming formulation (real production data)

            |G| = 3                  |G| = 4                  All
            Min     Max     Avg      Min     Max     Avg      Min     Max     Avg     Dev
  Find      3.70    27.75   13.59    5.82    37.44   15.41    3.70    37.44   14.59   9.08
  Verify    2.39    28.85   6.14     4.68    73.48   17.09    2.39    94.30   15.39   21.69

Find: reduction of the running time for finding an optimal solution. Verify: reduction of the running time for verifying the nonexistence of a smaller solution.
5.2. Exhaustive search algorithms

A typical exhaustive branch-and-bound search algorithm first sets an upper bound for |G| using some fast heuristic algorithm giving as a result some grouping of size k, and then starts to check whether there exists a legal grouping of size k − 1. This process is continued until no improvement can be made. The optimization process can thus be understood as a series of executions of a simpler procedure that attempts to find a legal grouping of a given size. These checking algorithms can be coarsely divided into two categories.

• Collection-based approaches. The primitive action is to assign a collection to a group. The cost of each action is O(|E|) (because we have to check the feasibility of the resulting group), and the upper bound on the size of the search space is |G|^|C|. The upper bound on the work in this approach is O(|E| |G|^|C|).

• Element-based approaches. Here the primitive action is to assign an element to some groups. The possible assignments for e ∈ E are all non-empty subsets of G. The cost of the primitive action is O(|G|), and the upper bound on the search space is (2^|G| − 1)^|E|. An upper bound on the work in this approach is then O(|G| (2^|G| − 1)^|E|). Of course, many states in this space are not legal or possible to generate using a given set of collections.

The above consideration means that element reduction would give exponential savings in the element-based approach and linear savings in the collection-based approaches. According to our experience, the collection-based approaches are much more popular in the literature of grouping problems. If the constants of proportionality of the two approaches are ke and kc, then the element-based approach is a better choice when

    ke |G| (2^|G| − 1)^|E| < kc |E| |G|^|C|
        ⟺ |E| log(2^|G| − 1) < log(kc/ke) + log|E| + (|C| − 1) log|G|.

This is the case when |E| is small. For example, if |C| = 32, |G| = 4 and kc = 2^8 ke, then the element-based approach has a better upper bound when |E| < 19. Note also that the upper bound for element-based methods is very coarse, so the element-based approaches might be better in practice even for relatively large |E|.
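The crossover condition is easy to evaluate numerically; a small check of the example above (our sketch, logarithms taken in base 2, which cancels out as long as one base is used consistently):

    from math import log2

    def element_based_wins(E, C, G, kc_over_ke):
        """Compare the two upper bounds: k_e |G| (2^|G| - 1)^|E|  versus  k_c |E| |G|^|C|."""
        return E * log2(2 ** G - 1) < log2(kc_over_ke) + log2(E) + (C - 1) * log2(G)

    # The example above: |C| = 32, |G| = 4, k_c = 2^8 k_e.
    print(element_based_wins(18, 32, 4, 2 ** 8), element_based_wins(25, 32, 4, 2 ** 8))  # True False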
6. Conclusion and further work

We gave a formal definition for weighted grouping problems and showed that these problems have a theoretically sound reduction property which can be used to decrease the problem size. Because the problem is NP-hard, even a small reduction may lead to significant improvements in the running times of solving algorithms. The given reduction algorithm is so fast to execute that we lose computationally nothing by checking whether the problem reduces or not, whilst the savings may be large. Especially real-life problems seem to contain statistical properties that make them quite prone to reduction.

There are many possible continuations for the initial analysis done in this paper. We should study
the merging probabilities under other distributions. Even more important is to examine real-life problems and try to find out their characteristic distribution properties. These could be exploited in creating a common set of benchmark problems (which we are badly lacking) for the researchers in the field. The effect of reduction on existing greedy and heuristic algorithms, along with constraint programming methods, should be studied more extensively. Most heuristic solving algorithms known to us are collection-based, so we can expect linear time savings with them. Since the number of elements seems to be the crucial factor in deciding whether to use an element- or a collection-based algorithm, our reduction technique may be just what is required to make the element-based approach a viable alternative. This suggests a further study in developing these kinds of methods.

Finally, since the reduction property is mainly based on the structure of the sets needed to define the GWGP problem, we expect that the technique is applicable to other problems on sets, too. Especially the set cover problem looks like a good candidate here.
Acknowledgements The authors would like to thank Dr. T. Kaukoranta for providing the clustering results of Fig. 8. The pertinent comments of the anonymous referees were invaluable in improving the quality of this work.
References

[1] G. Bhaskar, T.T. Narendran, Grouping PCBs for set-up reduction: A maximum spanning tree approach, International Journal of Production Research 34 (3) (1996) 621–632.
[2] E.G. Coffman Jr., M.R. Garey, D.S. Johnson, Approximation algorithms for bin packing: A survey, in: Hochbaum (Ed.), Approximation Algorithms for NP-hard Problems, PWS Publishing Company, 1996, pp. 46–93.
[3] Y. Crama, A. Oerlemans, A column generation approach to job grouping for flexible manufacturing systems, European Journal of Operational Research 78 (1994) 58–80.
[4] Y. Crama, A. Oerlemans, F. Spieksma, Production Planning in Automated Manufacturing, Lecture Notes in Economics and Mathematical Systems, vol. 414, Springer, Berlin, 1994.
[5] Y. Crama, J. van de Klundert, The approximability of tool management problems, Technical Report RM 96034, Maastricht Economic Research School on Technology and Organisation, 1996.
[6] Y. Crama, J. van de Klundert, Worst-case performance of approximation algorithms for tool management problems, Naval Research Logistics 46 (1998) 445–462.
[7] S. Dillon, R. Jones, C.J. Hinde, I. Hunt, PCB assembly line setup optimization using component commonality matrices, Journal of Electronics Manufacturing 8 (2) (1998) 77–87.
[8] G. Gallo, P.L. Hammer, B. Simeone, Quadratic knapsack problems, Mathematical Programming Studies 12 (1980) 132–149.
[9] H.O. Günther, M. Gronalt, R. Zeller, Job sequencing and component set-up on a surface mount placement machine, Production Planning and Control 9 (2) (1998) 201–211.
[10] T. Johtela, J. Smed, M. Johnsson, O. Nevalainen, Fuzzy approach for modeling multiple criteria in the job grouping problem, in: M.I. Dessoyky, S.M. Waly, M.S. Eid (Eds.), Proceedings of the 25th International Conference on Computers and Industrial Engineering, New Orleans, LA, March 1999, pp. 447–450.
[11] T. Kaukoranta, Iterative and Hierarchical Methods for Codebook Generation in Vector Quantization, Ph.D. Thesis, TUCS Dissertation 22, University of Turku, 2000.
[12] T. Knuutila, Re-describing an algorithm by Hopcroft, Journal of Theoretical Computer Science 250 (2001) 333–363.
[13] T. Knuutila, M. Puranen, M. Johnsson, O. Nevalainen, Three perspectives for solving the job grouping problem, International Journal of Production Research 39 (2001) 4261–4280.
[14] V.J. Leon, B.A. Peters, Replanning and analysis of partial setup strategies in printed circuit board assembly systems, International Journal of Flexible Manufacturing Systems 8 (4) (1996) 389–412.
[15] R. Paige, R.E. Tarjan, Three partition refinement algorithms, SIAM Journal on Computing 16 (6) (1987) 973–989.
[16] A. Shtub, O. Maimon, Role of similarity in PCB grouping procedures, International Journal of Production Research 30 (5) (1992) 973–983.
[17] J. Smed, M. Johnsson, M. Puranen, T. Leipälä, O. Nevalainen, Job grouping in surface mounted component printing, Robotics and Computer-Integrated Manufacturing 15 (1) (1999) 39–49.
[18] C.S. Tang, E.V. Denardo, Models arising from a flexible manufacturing machine, Part II: Minimization of the number of switching instants, Operations Research 36 (5) (1988) 778–784.