Mechanism and Machine Theory 92 (2015) 464–483
Contents lists available at ScienceDirect
Mechanism and Machine Theory journal homepage: www.elsevier.com/locate/mechmt
Generalized coordinate partitioning for complex mechanisms based on kinematic substructuring Kristopher T. Wehage a,⁎, Roger A. Wehage b, Bahram Ravani a a b
Department of Mechanical and Aerospace Engineering, University of California, Davis, CA 95616, United States Retired Technical Steward, Caterpillar, Inc, Peoria, IL 61523, United States
a r t i c l e
i n f o
Article history: Received 6 October 2014 Received in revised form 10 June 2015 Accepted 11 June 2015 Available online xxxx Keywords: Kinematic substructures Generalized Coordinate Partitioning Jacobian analysis Constrained multibody dynamics Kinematics Matrix block diagonalization
a b s t r a c t Generalized Coordinate Partitioning (GCP), a partitioning of the Jacobian matrix, allows for the automatic identification of dependent and independent variables and primary and secondary systems of equations for constrained multibody mechanisms. The GCP method is achieved through the row and column permutations associated with applying Gaussian Elimination with Complete Pivoting (GECP) to the constraint Jacobian matrices. GCP forms the basis of robust and efficient kinematic path planning and solution of dynamic equations of motion, with the potential to achieve orders of magnitude speed up over least squares and iterative methods. Despite these benefits, a significant disadvantage is that the GCP method is numerically expensive. This paper presents a technique to automatically rearrange groups of intersecting kinematic loops into non-overlapping kinematic substructures, which effectively block-diagonalizes the Jacobian Matrix. The method is applied to a hydraulic excavator, an illustrative example of a complex mechanism, and an approximate order of magnitude reduction in floating point operations required to perform the GCP method is demonstrated. Numerical results comparing the run time of the kinematic substructuring technique to sparse matrix methods and full dense GECP are provided. Furthermore, a compact representation of the equations of motion is formulated, accounting for the kinematic substructures identified in the mechanism. © 2015 Elsevier Ltd. All rights reserved.
1. Introduction While the kinematic and dynamic behavior of constrained mechanical systems is well-understood, the numerical solution of kinematic posture equations and the dynamic equations of motion remains challenging. The sheer number and complexity of the approaches that have been devised over the years highlight the difficulty in achieving efficient and robust numerical solutions for general problems [1–3]. In the present work, a systematic method based on graph theoretic concepts is presented that allows setting up a general mechanism's governing equations for a wide range of parametric and topological variations. The underlying processing methods rely minimally on the analyst for successful execution. In this vein the initial equations are simple but difficult to process. However, they are amenable to handsoff, symbolic and numerical reduction techniques for generating algorithms suitable for robust and efficient processing. The constrained dynamical equations of motion, a form of Lagrange's equations of motion expressed in joint coordinate space, may be represented compactly as ( ‘ ) ‘ T € f ϕ M J ¼ ð1Þ −γ −λ J 0
⁎ Corresponding author. E-mail address:
[email protected] (K.T. Wehage).
http://dx.doi.org/10.1016/j.mechmachtheory.2015.06.006 0094-114X/© 2015 Elsevier Ltd. All rights reserved.
K.T. Wehage et al. / Mechanism and Machine Theory 92 (2015) 464–483
465
where M‘ is a symmetric, positive semidefinite generalized mass matrix, J is a rectangular constraint Jacobian matrix that may lack full € is a column matrix of generrow and column rank, f ‘ is a column matrix of externally applied and generalized dynamic wrenches, ϕ alized coordinate second time derivatives, λ is a column matrix of Lagrange multipliers, and γ ¼J ϕ þ e t , where e t represents the time derivative of constraint partial derivatives, e t , with respect to time. If the coefficient matrix in Eq. (1) is nonsingular, a unique solution can easily be obtained using basic linear algebra methods [3]. However, in practice, the Jacobian matrix almost never has full row rank, so the coefficient matrix in Eq. (1) is generally singular. In addition, bifurcations, end-of-stroke conditions, or instantaneous alignment of joints may occur as a system moves to different postures. Only the latter of these changes Jacobian matrix rank and overall mobility, but the solution processes remain complicated even for relatively simple mechanisms with few degrees of freedom. Robust solution methods must reliably handle these situations with minimal demands on the analyst, who may lack the training to intervene. The various approaches to deal with rank-deficient Jacobian matrices can be classified into two groups. In the first group, all variables and equations are considered to contribute equally to the solution, and the entire set of equations is solved using a best root– mean–squared (RMS) approximation approach, such as the Moore–Penrose pseudo-inverse method described in [4]. Unfortunately, the RMS approach not only requires many additional floating-point operations (FLOPs), but it also may fail to converge in conditions of poor mechanical advantage (see the discussion in Section 6.5 of [5]). In the second group, a cutset, or minimal set of independent variables (often referred to as generalized coordinates); a minimal set of constraint equations (often referred to as primary constraints); and a minimal set of equations of motion (often referred to as generalized equations of motion) are identified. Once the independent variables are computed from the equations of motion, the primary constraint equations are used to evaluate the dependent variables. This second, minimal approach has demonstrated an order of magnitude speedup over the RMS method [6] and has therefore attracted attention in recent decades—although splitting the complete set of equations into primary and secondary sets was first proposed as early as the beginning of the 20th century: see, for example [7–9]. While the second approach is appealing from an efficiency standpoint, ensuring robust solutions requires care in selecting the primary equations and independent variables. Particularly, choosing variables as independent that would result in large changes in system configuration for small incremental inputs must be avoided, because this may cause numerical instability and inaccuracy problems. Therefore, the optimal independent generalized coordinates are those that induce the smallest changes in system configuration. It follows that this set of generalized coordinates will have roughly the greatest mechanical advantage. Historically with the second approach, specialists choose the cutset of primary constraints and generalized coordinates based on their experience with similar classes of problems. Manual selection, however, has its limitations. For complex systems, the manual approach is prone to error and imposes a burden on the analyst. The optimal set of generalized coordinates often depends on a system's posture and may require periodic updating during an analysis. In addition, a mechanism's changing posture may periodically require a different set of primary constraint equations, or even a different number of primary equations when the Jacobian matrix rank changes. In recent decades numerical methods have been devised to automatically identify optimal sets of independent generalized coordinates and primary constraint equations. Generalized Coordinate Partitioning (GCP), introduced by Uicker and Sheth in the context of kinematics [10,11] and extended by Wehage and Haug in the context of dynamics [12] (see also [5] for a more recent discussion), is a row and column permutation and partitioning method applied to the Jacobian matrix. The GCP method is a byproduct of Gaussian Elimination with Complete Pivoting (GECP). GECP factors a matrix into row-echelon form, reliably revealing its rank [13]. At each successive factorization step, GECP permutes the largest remaining pivotal element to the upper-left position in the residual matrix, thereby selecting the next dependent variable with possibly the smallest mechanical advantage. After factorization is complete, rows of the largest nonsingular matrix identify the primary constraint equations, and columns identify the dependent variables. Matrices obtained from GECP are used to precondition the primary constraints and simplify the constrained dynamics equations. GECP is a triple-loop procedure requiring r(6mn − 3(m + n)r + 2r2)/3 FLOPS, where m is the number of rows, n is the number of columns and r is the rank of the matrix. Other algorithms have also been proposed, for example, based on Singular Value Decomposition (SVD) [14,15] or Congruency Transformations [16]. Note that the run-time of these alternative methods is increased over GECP; for example, SVD requires 4m2n + 22n3 FLOPS using the R-SVD algorithm [17]. Although the modified GECP algorithm is more efficient than alternative partitioning strategies, its Oðn3 Þ complexity presents a bottleneck for numerical simulation and should be avoided whenever possible. Certainly, one strategy to improve performance would be to call GECP only once at the beginning of a program's execution. As kinematic bifurcations are common in many mechanisms and the primary system of constraints may change, this approach comes at the expense of robustness and should be avoided. Alternatively, one could conservatively call GECP at every time step—and pay a substantial performance penalty. Other approaches call GECP at regular intervals or use a heuristic method to determine when to repartition the Jacobian matrix. Regardless of the specific method used, the robust numerical solution of dynamic equations may require performing GECP many times in the course of a simulation, and strategies to improve GECP algorithm performance are important. Block-diagonalization is a well-known technique for breaking a large system of equations into smaller, more manageable pieces, thereby increasing computational efficiency. Several algorithms exist in the literature to block partition matrices, such as described in [18–22]. However, these methods are generally unsuitable for block partitioning mechanism Jacobian matrices because they involve taking linear combinations of variables in addition to row and column permutations; are designed for special-purpose matrices (i.e. square or hermitian); or do not fully uncouple substructures (i.e. the resulting partitioning scheme has overlap or is in single-bordered, K-way form [21]). The present paper introduces a simple technique to automatically identify and uncouple independent kinematic substructures within a mechanism. After the kinematic substructuring process is complete, the Jacobian matrix will have been permuted to a
466
K.T. Wehage et al. / Mechanism and Machine Theory 92 (2015) 464–483
Fig. 1. An illustrative example of a complex mechanism with five kinematic loops, a hydraulic excavator.
block-diagonal form, allowing each block submatrix to be processed independently, thereby reducing the overhead of applying the GECP algorithm. In other words, this method takes advantage of the kinematic substructures to block-diagonalize the Jacobian matrix. Throughout the paper, kinematic substructuring is applied to a representative example of a complex mechanism, a hydraulic excavator, shown in Fig. 1, which contains five kinematic loops. 2. Preliminaries For completeness, the topology of the excavator is described using graph theory [23]. The preliminary process requires setting up a directed graph that describes how bodies, i.e. nodes, are connected by joints, i.e. edges (The terms body and node, and joint and edge are used synonymously throughout the paper, and sometimes a node or body may refer to a floating Cartesian frame, which is designated as a massless, virtual body.). One form of a directed graph, called an incidence matrix, Γ, is shown in Eq. (2) below for the excavator model. Manual generation of incidence matrices for complex mechanisms presents a burden to the analyst, and the process is prone to errors; therefore, automated methods are preferred. It is more convenient for an analyst to provide body–joint–body descriptions and shape matrices [5], preferably exported directly from a CAD program. For the present example, the bodies and joints of a hydraulic excavator are provided in an arbitrary, user-defined order as described in Table 1. Each body is assigned a unique number, and each joint is assigned a unique letter. As an aid to following the development in this paper, the arbitrary letter assigned to each joint has an associated descriptive name shown in Table 1. This name includes an arrow to indicate the user-defined sense or direction of the joint. Note that the zero identifier is reserved for the fixed inertial reference frame. From the body–joint–body information in Table 1, the incidence matrix for the excavator is given in Eq. (2). Zeros are removed from sparse matrices to improve clarity.
ð2Þ
The bodies and joints are permuted into a natural order using row and column permutations, and the sense of some joints are reversed so that all are oriented away from the fixed inertial reference frame. These joints are reversed for the convenience of obtaining
K.T. Wehage et al. / Mechanism and Machine Theory 92 (2015) 464–483
467
Table 1 Identification of bodies and joints in the excavator example. ID
Body name
ID
Joint name
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Undercarriage Cabin Boom Stick Bucket Bucket linkage 1 Bucket linkage 2 Boom cylinder 1 Boom cylinder 2 Boom rod 1 Boom rod 2 Stick cylinder Stick rod Bucket cylinder Bucket rod
A B C D E F G H I J K L M N O P Q R S T
Fixed → undercarriage Undercarriage → cabin Cabin → boom Boom → stick Stick → bucket Stick → bucket linkage 2 Bucket → bucket linkage 1 Cabin → boom cylinder 1 Cabin → boom cylinder 2 Boom → stick cylinder Stick → bucket cylinder Boom cylinder 1 → boom rod 1 Boom cylinder 2 → boom rod 2 Stick cylinder → stick rod Bucket cylinder → bucket rod Boom rod 1 → boom Boom rod 2 → boom Stick rod → stick Bucket rod → bucket linkage 1 Bucket linkage 1 → bucket linkage 2
positive entries on the diagonal of the matrix shown in Eq. (3) below. To effectively reverse joints without changing the sign of their variables, the sign of their influence coefficient matrices are reversed when evaluated. The sign of variables are not changed to remain consistent with any user-defined constraint equations and force models. At this point, the first column of the incidence matrix, which corresponds to the fixed inertial reference frame, can be dropped because the fixed frame is not movable and has no state variables. After the body and joint reordering process, the revised incidence matrix becomes
ð3Þ
The subset of graph edges containing all nodes, but no loops, is known as the kinematic tree or spanning tree [5]. The edges comprising this subset are often referred to as the arcs of the directed graph. A fundamental property of a spanning tree is that it provides a unique kinematic path between any two nodes or bodies, including the root node. The remaining edges required to reconnect the spanning tree are the chords of the directed graph. Consequently, the number of chords is equal to the number of independent loops in a directed graph [5]. The new incidence matrix in Eq. (3) has 15 joints comprising the arcs of the directed graph shown above the dashed line; they represent the row-space of ΓT. The 5 joints below the dashed line are the chords of the directed graph. The sense of joints P, Q, R
468
K.T. Wehage et al. / Mechanism and Machine Theory 92 (2015) 464–483
and T have been reversed to align with the spanning tree and obtain positive entries on the diagonal of ΓT. The upper block has been converted to lower triangular form with 1 s along the diagonal. Note that there is not a unique choice of arcs and chords in a given mechanism. In the present example, the permutations were chosen to maximize the breadth of the spanning tree, as shown in Fig. 2. The solid black arrows correspond to the joints that form the arcs in the mechanism, and the dashed blue arrows correspond to the chords. A corollary of the incidence matrix, the connectivity matrix, C, is used to navigate the topology of the mechanism and form the Jacobian matrix. The connectivity matrix is formed by element-wise replacement of ±1 s in ΓT with positive or negative 6 × 6 identity matrices, or screw bases. The connectivity matrix yields the relative velocities across joints as
Cq abs ¼ q rel
ð4Þ
where q abs and q rel are respective column matrices of absolute and relative body twists across the joints that form the arcs and chords. Inspecting rows of the incidence matrix in Eq. (3) reveals that the product in Eq. (4) effectively takes the difference between absolute body twists to obtain relative body twists across the arcs and chords. Additionally, the Partial Velocity Matrix, H, a block diagonal matrix of 6 × 1 joint influence coefficient matrices [24] (i.e. unit velocity screws or twists transformed to the fixed inertial reference frame) provides the mapping of joint rates, ϕ, to relative twist velocities expressed in the fixed inertial reference frame as
Cq abs ¼ Hϕ :
ð5Þ
Fig. 2. Spanning tree for hydraulic excavator. (For interpretation of the references to color in this figure, the reader is referred to the web version of this article.)
K.T. Wehage et al. / Mechanism and Machine Theory 92 (2015) 464–483
469
Eq. (5) may be block partitioned into
C arc Harc qabs ¼ 0 C chord
(
0
ϕ arc
H chord
ϕ chord
) :
ð6Þ
Eq. (6) yields −1
q abs ¼ C arc Harc ϕ arc
ð7Þ
and
C chord q abs ¼ Hchord ϕ chord :
ð8Þ
Substituting Eq. (7) into Eq. (8) and rearranging yields −1
−C chord C arc Harc ϕ arc þ Hchord ϕ chord ¼ 0:
ð9Þ
Eq. (9) represents the chord-loop velocity constraints for the mechanism, and from it is identified the constraint Jacobian matrix h J ¼ −C chord C −1 arc H arc
i H chord :
ð10Þ
introduced in Eq. (1). Using the mapping for q abs and q rel defined in Eq. (9), the Jacobian matrix for the excavator becomes
ð11Þ
where the hj are 6 × 1 influence coefficient matrices for each of the respective joints. The block row identifier labels, i through v, correspond to a minimal set of kinematic loops in the mechanism, where the number of loops is equal to the number of chords in the directed graph. 3. Kinematic substructuring At this point in the analysis, information regarding the system topology required to solve Eq. (1) has been determined. Performing GECP on Eq. (11) would require 9000 FLOPs, assuming that the Jacobian matrix has full-row rank (i.e m = 30, n = 20 and r = 15). In the following section, it will be shown how kinematic substructures can be identified and uncoupled, allowing for an increase in computational efficiency. The kinematic substructuring algorithm requires traversing the directed graph of the mechanism, and the spanning tree is an essential tool for this purpose. Note, however, that the spanning tree contains only the subset of user-defined joints that were classified as arcs in the previous section. A simple modification to Eq. (3) will effectively convert all user-defined joints to arcs of the directed graph, which is useful in the following discussion. Cut the child-side Cartesian frame of each chord joint from its embedding body and consider this released frame as a virtual, massless body. Each chord joint will then have its own unique child body and will become an integral part of an expanded spanning tree. This process will have effectively cut open all kinematic loops and replaced all chords by arcs. For each virtual body, six chord-loop constraint equations are then imposed to connect the virtual body back to its original embedding body. Table 2 Virtual and embedding body id for each kinematic loop. Loop id
Virtual body id
Embedding body id
i ii iii iv v
16 17 18 19 20
10 11 13 6 6
470
K.T. Wehage et al. / Mechanism and Machine Theory 92 (2015) 464–483
Table 3 Compact representation of spanning tree using parent–child list and binary tree. List
Node id
Index P L R
0 −1 1 0
1 0 2 0
2 1 3 0
3 2 4 8
4 3 5 10
5 4 19 7
6 7 0 0
7 4 6 13
8 2 16 9
9 2 17 0
10 3 0 11
11 3 0 12
12 3 18 0
13 4 0 14
14 4 15 0
15 14 20 0
16 8 0 0
17 9 0 0
18 12 0 0
19 5 0 0
20 15 0 0
In the excavator example, the child-side frames of the five chord joints are cut from their embedding bodies and converted into virtual, massless bodies. The virtual bodies are then constrained so that their position, orientation, velocity, and acceleration match the node locations in the respective embedding bodies from which they were cut. With this modification, the chords of the directed graph have been converted to rigid constraints. This simple modification requires no changes to Eq. (11). When the chord loops are cut, the virtual body number and the original chord's embedding (positively oriented) body id are recorded for later reuse. Table 2 shows this information for the hydraulic excavator model. A new spanning tree, modified to include the virtual bodies, is shown in Fig. 4 on Page 28. The solid arrows correspond to the joints that form the arcs in the mechanism, and the dashed lines correspond to the new constraints imposed. Virtual, massless bodies are designated by octagons. While the incidence matrix in Eq. (3) provides all the necessary information for traversing the directed graph, navigating through its many ones and zeros may be cumbersome. To avoid this problem the excavator's graph may also be represented more compactly as a parent–child list and binary tree [25]. The parent–child list facilitates traversal from any node of the spanning tree upward, towards the root element. The index into the parent–child list, P, is each body or node's unique ID number, and the value at that index is the node's unique parent node ID number. A binary tree facilitates traversal of the graph from any node of the directed graph away from the root element, towards the leaves of the spanning tree. Two lists, L and R, are used to represent the binary tree. The first list, L, is called the left or descendent list and contains the index of each node's first descendant node, if present. The second list, R, is called the right or sibling list and contains the node's first sibling, if present. Additionally, an index list contains the body indices in original user-defined order. For the excavator model, the four lists are shown in Table 3. The kinematic substructuring process is divided into the following five steps: 3.1. Identify the level of each node or body in the spanning tree Using the parent–child list and binary tree, the increasing level of each body in the spanning tree, moving away from the fixed inertial reference frame, is identified and stored in a node-level list. The parent–child list allows one to easily determine if node k is in the kinematic path of node j using Algorithm 1: Algorithm 1. InPath(j,k,P) inPath ← false while j ≥ 0 do if j is k then inPath ← true break end if j ← P[j] end while return inPath Using an iterative preorder binary tree traversal [25] and a temporary stack variable, S, a levels list can be generated. The levels list is of equal length to the index list introduced above. In the following algorithm descriptions, list manipulation is described as follows: list. pop() indicates removing an element from the end of the list, and list. pop(j) indicates removing an element from the list at the specified index, j. Similarly, list. push(1) indicates appending the value 1 to the end of the list. All pseudocode is described using the C programming convention for indexing. Therefore, list. pop(0) indicates removing the first element from the list. Additionally, list[1 : end] represents a subset of the list containing elements from the second value up to and including the last element. Table 4 Levels list for the excavator model. List
Node id
Index Levels
0 0
1 1
2 2
3 3
4 4
5 5
6 6
7 5
8 3
9 3
10 4
11 4
12 4
13 5
14 5
15 6
16 4
17 4
18 5
19 6
20 7
K.T. Wehage et al. / Mechanism and Machine Theory 92 (2015) 464–483
Algorithm 2. DetectLevels(P,L,R) Initialize empty stack, S Initialize empty list, levels levels[0] ← 0 k ← L[0] while not S.isempty() or k N 0 do if k N 0 then levels[k] = levels[P[k]] + 1 if R[k] N 0 then S.push(R[k]) end if k ← L[k] else k ← S. pop() end if end while return levels
Fig. 3. Loop v members and nodes in kinematic path of loop v's base node.
471
472
K.T. Wehage et al. / Mechanism and Machine Theory 92 (2015) 464–483
The levels list for the excavator model is shown in Table 4.
3.2. Identify loop members Each loop's base node and the nodes in the loop can also be readily identified from the parent–child list and Table 2. Let the respective kinematic paths from the spanning tree's root to each chord joint's virtual body and its user-defined embedding body represents the positive and negative branch segments of the loop, as shown for loop v in Fig. 3 below. In Fig. 3, edges F and T are oriented negatively with respect to the loop direction, edges K, O and S are oriented positively with respect to the loop direction, node 4 is loop v's unique base node, and nodes 0, 1, 2 and 3 are in the ancestor path of node 4. From Table 2, let the virtual body id for loop j be stored in positiveNodes[ j] and the corresponding embedding body id stored in negativeNodes[ j]. For each grouping in Table 2, each loop's members can be identified using Algorithm 3 below and stored in the list members, an initially empty list of lists equal to the number of chord loops, i–v. The first and second while loops are used to decrement each node index in the kinematic path until node indices in the positive and negative branches of the loop are at the same level in the spanning tree. Once the decremented node indices have reached the same level, they will be decremented together by level until the loop's unique base node is identified. During this process, the node indices are stored in a temporary stack variable. At the end of the process, the last node index that was pushed to stack is removed from the end of the list, as the loop's base node will have been written twice. The list sequence is then reversed and stored in members[ j]. The final members list contains each loop's base node first, followed by the loop members in monotonically increasing order by level. Algorithm 3. DetectLoopMembers(P, levels, negativeNodes, positiveNodes) Initialize empty stack, S Initialize empty list of lists, loopMembers for j ← 0 to number of loops do negative ← negativeNodes[j] positive ← positiveNodes[j] S.push(negative) S.push(positive) while levels[positive] N levels[negative] do positive ← P[positive] S.push(positive) end while while levels[negative] N levels[positive] do negative ← P[negative] S.push(negative) end while while positive is not negative do negative ← P[negative] positive ← P[positive] S.push(negative) S.push(positive) end while S.pop() while not S.isempty() do members[j].push(S.pop()) end while end for return members
3.3. Order loops by level At this point the loops are rearranged by sorting their base nodes into monotonically increasing order by level in the spanning trees. After processing by Algorithm 3 and sorting, the loop ordering sequence and loop member lists are shown in Table 5. In the excavator example, no reordering of the loops was necessary, as their base nodes were already arranged by monotonically increasing level order.
K.T. Wehage et al. / Mechanism and Machine Theory 92 (2015) 464–483
473
Table 5 Members of each kinematic loop with loops listed in monotonically increasing order by level in the spanning tree. The base node associated with each loop is the first member of each list. Loop id
Base node level
Loop members
i ii iii iv v
2 2 3 4 4
2, 3, 8, 10, 16 2, 3, 9, 11, 17 3, 4, 12, 13, 18 4, 7, 5, 6, 19 4, 7, 14, 15, 6, 20
3.4. Merge loops into kinematic substructures Any given mechanism will consist of at least one kinematic substructure, i.e. the entire mechanism. Therefore, the loop-merging process begins by assigning the first loop, i, to the first substructure. Each successive loop is then inspected to determine if it should be added to any of the existing kinematic substructures or if the loop should be assigned to a new substructure. A candidate chord-loop will be merged with a kinematic substructure if the substructure and the merge candidate have at least one body and one joint in common. Note, however, that the graph edge (joint) labels are not required for merging. The merging process can be performed using only the parent–child list and loop member assignments. A candidate loop will be merged with a kinematic substructure if the substructure's base node matches the candidate's base node or if it is in the candidate base node's ancestor path, and if the candidate loop and substructure have at least one descendant node in common. The term descendant node refers to any loop member that is not the loop's base node, i.e. the subset of loop members, loop[1 : end]. This applies to kinematic substructures as well, where loop[1 : end] is the union of all merged chord-loop bodies, excluding the unique base body at the lowest level in the spanning tree. Algorithm 4. DetectSubstructures(members,P) initialize empty list of lists, substructures substructures.push (empty list) substructures[0].push(members.pop(0)) while not members.isempty() do mergeCandidate ← members.pop(0) merge ← false for substructure in substructures do for loop in substructure do if InPath(mergeCandidate[0], loop[0], P) then if any mergeCandidate[1: end] in loop[1: end] then substructure.push(mergeCandidate) merge ← true break end if end if end for if merge then break end if end for if not merge then initialize empty list, S S.push(mergeCandidate) substructures.push(S) end if end while return substructures In the present example, the base node of loop i is found to be in the ancestor path of loop ii's base node. Inspecting the descendant members of loop ii reveals that node 3 is also a child of loop i, indicating that loop ii should be merged into the first kinematic substructure.
474
K.T. Wehage et al. / Mechanism and Machine Theory 92 (2015) 464–483 Table 6 Kinematic substructures and their corresponding chord-loops. Substructure id
Loops
1 2 3
i, ii iii iv, v
Moving on to loop iii, an ancestor of loop iii's base node is found to be the base node of the first kinematic substructure. However, none of the descendant members of loop iii are members of loops i or ii, and therefore loop iii is not a member of the first kinematic substructure. Thus loop iii is assigned to a second kinematic substructure. Loops iv and v are then analyzed. An ancestor of the base node of loop iv is found to be in the kinematic path of the base node of loops i and ii in the first kinematic substructure, however, none of the children of loop iv's base node is a member of loops i and ii. Therefore, loop iv is not a member of the first substructure. Similarly loop iv is not a member of the second substructure. Thus a third substructure is designated, to which loop iv is assigned. Similarly, loop v is not a member of the first and second substructures. However, one child of loop v's base node is a member of the third substructure, so loop v is assigned to it. The substructures and their corresponding loops are shown in Table 6. 3.5. Reorder bodies and joints In this last step the joints are ordered by level in the spanning tree and permuted according to their kinematic substructure assignments, which requires creating a short list of bodies that are either: • not a member of any kinematic substructure, or • are the base node of a kinematic substructure. In the excavator example, the short list of bodies, ordered by level in the spanning tree becomes: (0) fixed, (1) undercarriage, (2) cabin, (3) boom, and (4) stick. From the bodies list, a short list of joints is also created. Each joint in the joint short list is the parent joint associated with the corresponding body in the body short list. The (0) fixed body has no parent, so the joints in the joint short list for the present example are: A, B, C, and D. From the joint short list, an expanded joint list is created. For each joint in the joint short list, perform the following operations: • if the joint is not a member of a kinematic substructure, append the joint to the expanded joint list, otherwise: • append the joint associated with the base node of the kinematic substructure to the expanded joint list only if it has not been appended in a previous step. Then, append the remaining joints associated with nodes in the kinematic substructure, ordered by level in the spanning tree. After the final sorting and expansion operation, the revised Jacobian matrix becomes
ð12Þ
and the block submatrices J1–J3 associated with each kinematic substructure become
ð13Þ
ð14Þ
and
K.T. Wehage et al. / Mechanism and Machine Theory 92 (2015) 464–483
475
ð15Þ
The permuted Jacobian matrix in Eq. (12) now shows the block-diagonalized structure. Joints A and B are not in any substructure and are therefore always independent. Chord loops i and ii and joints C, H, I, Q, P, L and M comprise kinematic substructure 1, chord loop iii and joints D, J, R, and N comprise kinematic substructure 2, and chord loops iv and v and joints F, E, K, T, O, G and S comprise substructure 3. The revised spanning tree shown in Fig. 4 has been assigned colors according to the kinematic substructure assignments: (Red: Substructure 1, Green: Substructure 2, and Blue: Substructure 3). The arcs, chords, and nodes are colored according to which kinematic substructure they appear in, with the exception that base nodes are colored according to their associated kinematic substructures. Likewise, labels J1, J2 and J3 are assigned to each of the respective Jacobian submatrices. GECP can then be applied independently to each of the Jacobian submatrices. Assuming that each Jacobian submatrix has full row rank, GECP would require a total of 1008 FLOPs, an approximate order of magnitude reduction in floating point operations. As the operations count is not always a direct measure of numerical efficiency, in the next section a numerical experiment is carried out to compare the kinematic substructuring technique to other methods. Although the process for automatically determining kinematic substructures is described in specific context to mechanisms, the algorithm is a general graph partitioning method that may have applications in other disciplines. To facilitate transposition to other subject areas, note that the Jacobian matrix, J, is a byproduct of the right null space of the directed graph, ΓT, as described in [10]. See also Eq. (10). In graph theory, this null space matrix is often referred to as the loop matrix [26]. Therefore, the methods described in this section may be adapted to other subjects by replacing any reference to the Jacobian matrix with the more general loop matrix.
Fig. 4. Modified spanning tree for hydraulic excavator with chords converted to arcs. (For interpretation of the references to color in this figure, the reader is referred to the web version of this article.)
476
K.T. Wehage et al. / Mechanism and Machine Theory 92 (2015) 464–483
4. Numerical experiment To investigate the numerical efficiency of kinematic substructuring, GECP of the block-diagonalized Jacobian matrix in Eq. (12) is computed using (a) the kinematic substructuring method applied to each of the three block submatrices, J1 –J3, (b) the sparse matrix algebra MA48 factorization algorithm (part of the Harwell Subroutine Library (HSL) [27]) applied only to the nonzero terms of the 30 × 20 Jacobian matrix, J and (c) the full, dense matrix algebra applied to the full 30 × 20 Jacobian matrix, J. In sparse matrix methods, only the non-zero terms of the coefficient matrices are stored. Therefore, memory usage and floating point operations required to carry out sparse-matrix algebra is reduced compared to full, dense algebra. Indeed, many large scientific problems, such as the numerical solution of finite element analysis (FEA) equations, would be intractable without the use of sparse-matrix methods. Note, however, that in sparse-matrix methods, the set of instructions to perform the operations is more complicated than full, dense matrix algebra. Furthermore, matrix coefficients may not be stored or accessed in contiguous memory, which may increase cache misses [28]. The use of sparse-matrix algebra is expected to offer little or no benefit when the coefficient matrices are small and/or dense. In contrast to the large, highly sparse equations found in other disciplines, the equations of motion for many multibody systems are typically smaller and moderately sparse. Nonetheless, Gonzalez et al. [29] showed that sparse-matrix methods do offer some performance benefit over full, dense matrix algebra, considering the special structure of dynamics equations, even for small to moderate problem sizes. Block-diagonalization via kinematic substructuring is essentially a hybrid of full and sparse matrix methods. As discussed in the previous section, full, dense matrix algebra can be applied to each of the Jacobian matrices J1–J3 at a small fraction of the operations required to apply full matrix algebra to the entire Jacobian matrix, J. Furthermore, these block submatrices are dense, and sparse-matrix algebra is expected to have increased complexity compared to full, dense algebra applied to them individually. The HSL MA48 factorization algorithm was chosen as the reference sparse implementation because HSL is well-known for its efficient, well-tested, and high-quality numerical algorithms. HSL subroutines are commonly used in many commercial and research-oriented multibody dynamics solvers [29,30], and the MA48 algorithm is suitable for nonsymmetric, rectangular matrices. The HSL MA48 algorithm consists of three primary components. The first analyze component analyzes the specific structure of a coefficient matrix and generates an initial factorization. The second factorize component is used to compute the factors of a matrix with a known sparsity pattern determined in the analyze phase. In factorize, a “fast” flag can be set. If the fast flag evaluates to true, the factorization sequence from the previous call to analyze or factorize is used; if the fast flag evaluates to false, the factorization sequence may change. The third component of HSL MA48 is a solve phase, which uses the matrix factors to solve a set of linear equations. For benchmarking purposes, only the first and second components of HSL MA48 are considered. The analyze component of HSL MA48 is analogous to block-diagonalization via kinematic substructuring described in Section 3. The analyze phase would be called once at the beginning of a simulation for the particular sparsity pattern of the Jacobian matrix. Analyze would not be called again unless a pivot has come too close to zero or the system's topology has changed. Factorize, with the “fast” flag set to false, is analogous to gaussian elimination with partial pivoting, and factorize, with the “fast” flag set to true, is analogous to gaussian elimination without pivoting. To compare the efficiency of kinematic substructuring to sparse and full matrix factorization, a computer program was written in the C programming language that closely follows the development of this paper. The bodies and joints of the excavator model are specified in an arbitrary user-defined order in a text file. When the program is launched, the bodies and joints are loaded into computer memory using a linked-list data structure. Bodies and joints are permuted according to Section 3 using memory pointers. Once the permutation process is complete, quantities are copied to contiguous memory, and Newton– Raphson iteration is applied to eliminate constraint errors, if present. Finally, the spanning tree is traversed to update the terms of either (a) the block diagonal submatrices J1 –J 3, (b) a sparse matrix representation of the Jacobian, or (c) the full dense Jacobian matrix, J. Once the terms are computed, the Jacobian matrix is passed to one of the three algorithms for matrix factorization. The Jacobian update and factorization steps are repeated 10,000 times for each method. The numerical experiments were carried out on a 6-core AMD Phenom II X6 1090 T Processor with 16 GB of RAM, running the 64 bit 3.18.6-1 Arch Linux kernel. The program was compiled using the GNU Compiler Collection (GCC) C compiler, version 4.9.2-4. The HSL MA48 algorithm is used in conjunction with the ATLAS (Automatically Tuned Linear Algebra Software) [31] implementation of BLAS (Basic Linear Algebra Subprograms), whereas the GECP algorithm for full, dense matrix factorization was programmed wholly without the use of BLAS. Refer to Ref. [29] for more details of various BLAS implementations in multibody dynamics. Results of the experiment are provided in Fig. 5. Kinematic substructuring provides an approximate 5 × speedup over full, dense matrix algebra. The speedup is not directly proportional to the 9 × reduction in FLOPS calculated in Section 3 due to the increased number of function evaluations. Nonetheless the 5 × speedup represents a considerable savings over full, dense GECP. MA48's performance was between the two methods. Kinematic substructuring was more than two times faster than MA48, even without the use of BLAS. The results depicted in Fig. 5 represent the portion of simulation time required to determine the set of independent and dependent generalized coordinates if GECP were conservatively called at every time step during numerical integration. As discussed in the Introduction, GECP need not be called at every time step. If the set of dependent and independent variables
K.T. Wehage et al. / Mechanism and Machine Theory 92 (2015) 464–483
477
0.036214
A
0.090873
B
0.188134
C 0
0.05
0.1
0.15
0.2
time (seconds) for 10000 evaluations, less is better Fig. 5. Run times for 10,000 gaussian elimination evaluations with complete row and column pivoting for A: Block-diagonalization via kinematic substructuring, B: Sparse factorization using HSL MA48 algorithm, and C: Full, dense gaussian elimination.
remains unchanged between two integration steps, the factorization pivot sequence may be reused. However, it is strongly emphasized that if this approach is taken, for robustness, GECP must be reapplied as the mechanism approaches a bifurcation configuration. In the present example, a heuristic method for monitoring the stability of the choice of dependent and independent variables based on [32] is used to determine when to reapply GECP. The ratio of largest to smallest pivots of the largest nonsingular submatrix of J in full, dense GECP or of the block-diagonal matrices J1–J3 in block-diagonalized GECP is monitored. If the pivot ratio becomes greater than a user-specified threshold obtained from the initial starting value, then GECP is applied again. The logic to check the pivot ratio is not added to the HSL MA48 “fast factorization” subroutine. Instead, MA48 returns an error when a pivot approaches machine roundoff error during the fast factorization process, which may be used as a basis to call factorization with the “fast” flag set to false again to determine a new set of dependent and independent variables. When the pivot sequence is reused from the initial call to GECP, block-diagonalization via kinematic substructuring is approximately 7 × faster than MA48's fast factorize routine, even accounting for the additional check on pivot ratio. The factorization and pivot ratio calculation for the entire full, dense Jacobian was also more than twice as fast as sparse factorization with the “fast” flag set to true. Another implication of using a heuristic method, together with the kinematic substructuring technique, is that when the mechanism approaches a bifurcation configuration, GECP need not be reapplied simultaneously to all three block submatrices, J1–J3. Each kinematic substructure can be monitored individually, and full GECP can be reapplied on an as-needed basis to each submatrix in approximately 1/3 the time indicated in Fig. 5A. On the other hand, if the mechanism approaches a bifurcation configuration when using full, dense algebra or sparse-matrix methods, GECP must be reapplied to the entire Jacobian matrix, and the time required identifying a new set of dependent and independent variables shown in Fig. 5B and C will always be roughly the same. The results of Figs. 5 and 6 indicate that block-diagonalization via kinematic substructuring requires a minimal computational expense to select a set of dependent and independent variables, which is a critical part of a robust solution of dynamics equations. The next section summarizes the role of kinematic substructures in setting up and processing the dynamic equations of motion.
5. Solving dynamics equations In this section, the introduction of incidence and connectivity matrices into the dynamic equations of motion is discussed; a minimal representation of the equations of motion is derived, accounting for each of the kinematic substructures; and a preconditioning process, hereafter referred to as the method of Preconditioned Primary Constraints (PPC), is applied to the primary constraint equations to simplify the equations of motion.
0.009111
A
0.064939
B
0.026275
C 0
0.05
0.1
0.15
0.2
time (seconds) for 10000 evaluations, less is better Fig. 6. Run times for 10,000 gaussian elimination evaluations without pivoting for A: Block-diagonalization via kinematic substructuring, B: Sparse “fast-factorization” using HSL MA48 algorithm, and C: Full, dense gaussian elimination without pivoting.
478
K.T. Wehage et al. / Mechanism and Machine Theory 92 (2015) 464–483
With the addition of virtual bodies, the incidence matrix, ΓT, is now square and invertible. Accounting for the joint and body ordering and block partitioning identified in Section 3, ΓT becomes
ð16Þ
The block pattern in Eq. (16) results because the kinematic substructures are coupled to each other only through their base bodies. Recall that the connectivity matrix, C, must also be updated by element-wise replacement of the ones and zeros in Eq. (16) with positive or negative 6 × 6 identity matrices. The column matrix q abs in Eq. (5) is modified to include the absolute twist velocities of the virtual bodies. With these modifications, Eq. (5) is differentiated to yield
€ H ϕ ¼ Hϕ € þγ € abs ¼ H ϕþ Cq
ð17Þ
where
γ ¼ H ϕ:
ð18Þ
Additionally, the constraints, denoted as e, and their first and second time derivatives must satisfy e ϕ; t ¼ 0
e ¼ Jϕ þ e t ¼ 0
ð19Þ
ð20Þ
and € þγ ¼0 €e ¼ J ϕ
ð21Þ
where
γ ¼J ϕ þ e t :
ð22Þ
In Eqs. (19)–(21), the symbol t represents time, which may appear explicitly in some constraint equations. Generally the chordloop constraint equations don't depend on time, but Eqs. (19)–(21) may also include additional constraints imposed on joint variables that do depend on time. The unconstrained Newton–Euler equations of motion, expressed in Cartesian coordinates, are [33] € abs ¼ g: Mq
ð23Þ
After kinematic substructuring has been performed, matrix M will be block-diagonal with each body's 6 × 6 spatial inertia matrix appearing along the diagonal, arranged according to the body ordering in Eq. (16). Interspersed along the block-diagonal submatrices of M will be zere or more 6 × 6 all-zero submatrices, one for each floating child frame of a chord joint variable and one for each floating child frame of selected arc joint variables. Thus M will generally be singular, and this singularity will propagate into the generalized mass matrix Mℓ introduced in Eq. (1).
K.T. Wehage et al. / Mechanism and Machine Theory 92 (2015) 464–483
479
As the bodies are now grouped according to their kinematic substructure assignments, as illustrated in Fig. 4, the mass matrix may be block-partitioned as 2 6 M¼6 4
3
Mii
7 7 5
M11 M 22
ð24Þ
M33 where Mii consists of inertia terms associated with bodies 1 and 2, M11 consists of inertia terms associated with bodies 3, 8, 9, 11, 10, 16 and 17, M22 consists of inertia terms associated with bodies 4, 12, 13 and 18, and M33 consists of inertia terms associated with bodies 7, 5, 14, 6, 15, 19 and 20. Note that a body may technically be a member of more than one kinematic substructure, but its parent joint will be included in at most one kinematic substructure. Therefore, to better understand the related body ordering and substructure assignments in Eq. (24), refer to Fig. 4 and the ordering of each body's unique parent joint in Eq. (12) or to the ordering within kinematic substructure in Eqs. (13) to (15). € abs is a column matrix of 6 × 1 submatrices of absolute body twist accelerations, and g is a column matrix of 6 × 1 As noted earlier, q submatrices of dynamic wrenches acting on the bodies, including inertial and gravitational effects. Eq. (17) may be expressed in factored matrix form as ( ½C
−H
€ q abs € ϕ
) ¼ γ:
ð25Þ
Eqs. (23) and (25), along with a column matrix of constraint reaction forces, k, are combined to obtain the unconstrained, augmented equations of motion as [34] 2
M 40 C
9 8 9 38 T € =
0 0 −H
ð26Þ
The equation T
H k¼f
ð27Þ
in Eq. (26) projects the constraint reaction wrenches, k, onto joint space. The column matrix, f , represents working generalized forces (such as from springs, dampers, actuators, friction, etc.) applied within the arc and converted chord joints [35,36]. If no such effects exist within any joint, then f will be zero. € abs , and Because the revised connectivity matrix is structurally nonsingular and invertible, the absolute accelerations, q reaction wrenches, k, may be solved for in Eq. (26) and symbolically eliminated to obtain the generalized equations of motion as T
H C
−T
MC
−1
€ ¼ f þH C Hϕ T
−T
−1 g−MC γ
ð28Þ
or ‘€ ‘ Mϕ ¼f
ð29Þ
where ‘
T
M ¼H C
−T
MC
−1
H
ð30Þ
and ‘
T
f ¼ f þH C
−T
−1 g−MC γ :
ð31Þ
The symmetric, positive semidefinite matrix M‘ introduced in Eq. (1) and defined in Eq. (30) corresponds to the generalized mass matrix. It contains the same quantities that would have been obtained by expanding a Lagrangian, hence the superscript ‘ [37]. The column matrix f ‘ contains the model's generalized forces. It also contains the same quantities that would have been obtained by expanding a Lagrangian. Appending Eq. (29) to Eq. (21) yields the constrained generalized equations of motion in Eq. (1). Mapping of the sparse, block-diagonal matrix, M, onto the generalized mass matrix, M‘, is determined by the congruency transformation shown in Eq. (30) and follows a unique structure precisely defined by the mechanism's topology. Efficient recursive strategies
480
K.T. Wehage et al. / Mechanism and Machine Theory 92 (2015) 464–483
for computing M‘ have been discussed in the literature, for example in [38,39]. In Eq. (16), the order of bodies was permuted to follow the order of joints. As suggested by the block partitioning in Eq. (16), the generalized mass matrix, M‘, may be block partitioned into the form 2
‘
M ii 6 ‘ ‘ 6M M ¼ 6 1i 4 M ‘2i ‘ M 3i
‘
M i1 ‘ M 11 ‘ M 21 ‘ M 31
‘
Mi2 ‘ M12 ‘ M22 ‘ M32
3 ‘ M i3 ‘ 7 M 13 7 ‘ 7 M 23 5 ‘ M 33
ð32Þ
where M‘ii, M‘11, M‘22, and M‘33 each have dimensions 1/6th the size of the respective submatrices Mii, M11, M22, and M33 in Eq. (24). The block-partitioned Jacobian submatrices in Eq. (12) and the constraint equations may be interspersed within the generalized equations of motion as
ð33Þ
The kinematic substructures appear as the three block 2 × 2 submatrices along the diagonal of the coefficient matrix in Eq. (33). Coupling between these substructures is through the off-diagonal submatrices, which are relatively sparse. As discussed earlier, if the Jacobian submatrices in Eq. (33) do not have full row rank, then it will be just as difficult to process this system of equations. Therefore, GECP, GCP and PPC are applied to each Jacobian submatrix to eliminate these problems. First consider how GECP would handle the constraint equations for a typical kinematic structure. The permuted and partitioned equations may be represented as
" #T ( d ) " d pd p € ϕ Pv J Pe J ¼ sd s i i € Pe J Pv ϕ
pi
J si J
#(
€d ϕ €i ϕ
)
¼
p s
−γ −γ
:
ð34Þ
In Eq. (34) the matrices Pep and Pes represent two pieces of an orthonormal permutation matrix that rearrange and partition rows of the Jacobian matrix and equations into primary and secondary components, and the matrices Pvd and Pvi represent two pieces of an orthonormal permutation matrix that rearrange and partition columns of the Jacobian matrix and rows of the column matrix of var€ and γ have been appropriately permuted and iables into dependent and independent components. Thus the column matrices ϕ partitioned according to the equation and variable permutation matrices. As the GECP algorithm factors each sample Jacobian matrix, it generates the appropriate equation and variable permutation information, which is stored and available for subsequent use. The primary constraint equations are extracted from Eq. (34) as h J
pi
J
pd
( i ) i ϕ € p ¼ −γ €d ϕ
ð35Þ
where this reduced Jacobian matrix now has full row rank. Because the primary Jacobian matrix in Eq. (35) for each kinematic substructure has full row rank, the primary constraint equations may be appended to a partitioned and permuted version of the generalized equations of motion instead of the full Jacobian matrix as was done in Eq. (33), yielding
K.T. Wehage et al. / Mechanism and Machine Theory 92 (2015) 464–483
481
ð36Þ
For well-posed models the coefficient matrix in Eq. (36) will be nonsingular, and all unknowns may be solved for. Starting at pd the lower right corner of the large matrix in Eq. (36), the submatrix J3 is nonsingular and the block 2 × 2 submatrix containing it and its transpose is also nonsingular, regardless of the condition of M‘dd 33 , which could be entirely zero. Thus one may proceed to € d and λ p from Eq. (36) and then ϕ € i . With these quantities eliminated, the same processes may be apeliminate the unknowns ϕ 3 3 3 € i may be computed. One drawback of plied to the second kinematic substructure, then the first kinematic structure, and finally ϕ this approach is that inverting the block 2 × 2 submatrices in Eq. (36) and computing the Schur complement [40] matrices can be a somewhat complicated procedure. However, there is an easier way, using the method of Preconditioned Primary Constraints (PPC), which is described next. To demonstrate the PPC method, start by expressing Eq. (34) as "
pd
pi
#(
J si J
J sd J
€d ϕ €i ϕ
)
¼
L ½U LR
( UR
€d ϕ €i ϕ
)
( ¼
p
−γ s −γ
) ð37Þ
where L represents an invertible lower-triangular matrix with unity diagonal, U represents an invertible upper-triangular matrix with pivots on the diagonal, and LR and UR represent residual matrices. From Eq. (37) an alternate set of primary constraint equations may be extracted as (
L½ U
€d ϕ UR €i ϕ
) p
¼ −γ :
ð38Þ
Since L and U are invertible, Eq. (38) may be revised to h I
U
−1
( d) i ϕ h € ¼ I UR i € ϕ
( d) i ϕ € −1 −1 p d ¼ −U L γ ¼ −γ : B €i ϕ di
ð39Þ
Eq. (39) identifies a replacement for the primary equations in Eq. (35) as h
di
B
i I
(
€i ϕ €d ϕ
) d
¼ −γ :
ð40Þ
Eq. (40) is the PPC template for each kinematic substructure. Using the notation BdiT = Bid, the system of equations for the entire mechanism may be simplified as
482
K.T. Wehage et al. / Mechanism and Machine Theory 92 (2015) 464–483
ð41Þ
The block-partitioned equations in Eq. (41) are much easier to uncouple because inverting the block 2 × 2 pivotal submatrices containing the identity matrices is trivial. These preconditioned equations of motion may now be more easily solved using standard methods. 6. Conclusions This paper has developed a method of block-diagonalizing the Jacobian matrix of complex mechanisms by automatically detecting their kinematic substructures. For the hydraulic excavator example, three kinematic substructures were found, allowing the Jacobian matrix to be decomposed into three block-diagonal elements. Applying GECP separately to each block submatrix resulted in a 5-fold speedup over applying full, dense algebra to the entire Jacobian matrix and a 2-fold speedup over applying a sparse-matrix method. Furthermore, the value in processing each block Jacobian submatrix independently of the others was demonstrated—if GECP is not called at every time step during numerical integration, a heuristic method can be used to update individual submatrices of the Jacobian matrix on an as-needed basis at reduced computational cost. Furthermore, a compact representation of the dynamic equations of motion was developed, accounting for the substructures. Although the kinematic substructuring method described involves a nontrivial sorting operation, unless the topology of the mechanism changes, the sorting process would only need to be performed once at the beginning of program execution. It should be pointed out, however, that not all mechanisms would benefit from kinematic substructuring. The kinematic substructuring procedure is suited to large, complex, constrained mechanisms, particularly those systems which stand to benefit the most from increased algorithmic performance. References [1] A. Laulusa, O. Bauchau, Review of classical approaches for constraint enforcement in multibody systems, J. Comput. Nonlinear Dyn. 3 (1) (2008) 011004. [2] O. Bauchau, A. Laulusa, Review of contemporary approaches for constraint enforcement in multibody systems, J. Comput. Nonlinear Dyn. 3 (1) (2008) 011005. [3] P.E. Nikravesh, Some methods for dynamic analysis of constrained mechanical systems: A survey, in: E.J. Haug(Ed.), Computer Aided Analysis and Optimization of Mechanical System Dynamics, Vol. 9 of NATO ASI SeriesSpringer, Berlin Heidelberg 1984, pp. 351–368. [4] J.J. Uicker, J. Denavit, R.S. Hartenburg, An iterative method for the displacement analysis of spatial mechanisms, J. Appl. Mech. ASME Trans. (1964) 309–314. [5] J.J. Uicker, B. Ravani, P.N. Sheth, Matrix Methods in the Design Analysis of Mechanisms and Multibody Systems, Cambridge University Press, 2013. [6] R.A. Wehage, Generalized Coordinate Partitioning in Dynamic Analysis of Mechanical Systems(Ph.D. thesis) 1981. [7] G.A. Maggi, Principii della Teoria Matematica del Movimento Dei Corpi: Corso di Meccanica Razionale, Ulrico Hoepli, 1896. [8] G.A. Maggi, Di alcune nuove forme delle equazioni della dinamica applicabili ai systemi analonomi, Atti Accad. Naz. Lincei Rend. Cl. Fis. Mat., X 1901, pp. 287–291. [9] P. Woronetz, Über die Bewegung eines starren Körpers, der ohne Gleitung auf einer beliebigen Fläche rollt, 1911. [10] P.N. Sheth, A Digital Computer Based Simulation Procedure for Multiple Degree of Freedom Mechanical Systems With Geometric Constraints(Ph.D. thesis) 1972. [11] P.N. Sheth, J.J. Uicker, IMP (Integrated Mechanisms Program), a computer-aided design analysis system for mechanisms and linkage, J. Manuf. Sci. Eng. 94 (2) (1972) 454–464. [12] R.A. Wehage, E.J. Haug, Generalized coordinate partitioning for dimension reduction in analysis of constrained dynamic systems, J. Mech. Des. 104 (1982) 247–255. [13] A.M. Turing, Rounding-off errors in matrix processes, Quart. J. Mech. Appl. Math. 1 (1948) 287–308. [14] N.K. Mani, E.J. Haug, K.E. Atkinson, Application of singular value decomposition for analysis of mechanical system dynamics, J. Mech. Des. 107 (1) (1985) 82–87. [15] R.A. Wehage, W.Y. Loh, Application of svd to independent variable definition in constrained mechanical systems, ASME Dyn Sys Control Div Publ DSC, 52, ASME, New York, NY,(USA) 1993, pp. 71–79. [16] T.A. Loduha, B. Ravani, On first-order decoupling of equations of motion for constrained dynamical systems, J. Appl. Mech. 62 (1995) 216–222. [17] G.H. Golub, C.F. Van Loan, Matrix Computations, Johns Hopkins Studies in the Mathematical Sciences, Johns Hopkins University Press, 1996. [18] S. Acer, A Recursive Graph Bipartitioning Algorithm by Vertex Separators With Fixed Vertices for Permuting Sparse Matrices Into Block Diagonal Form With Overlap(Ph.D. thesis) Bilkent University, 2011. [19] B. Uçar, C. Aykanat, Partitioning sparse matrices for parallel preconditioned iterative methods, SIAM J. Sci. Comput. 29 (4) (2007) 1683–1709. [20] D.V. Steward, Partitioning and tearing systems of equations, J. Soc. Ind. Appl. Math. Ser. B Numer. Anal. 2 (2) (1965) 345–365. [21] C. Aykanat, A. Pinar, Ü.V. Catalyürek, Permuting sparse rectangular matrices into block-diagonal form, SIAM J. Sci. Comput. 25 (6) (2004) 1860–1879. [22] R.J. Cave, J.F. Stanton, Block diagonalization of the equation-of-motion coupled cluster effective hamiltonian: treatment of diabatic potential constants and triple excitations, J. Chem. Phys. 140 (21) (2014) 214112. [23] J.A. Bondy, U.S.R. Murty, Graph theory, Grad. Texts in Math2008.
K.T. Wehage et al. / Mechanism and Machine Theory 92 (2015) 464–483 [24] [25] [26] [27] [28] [29] [30] [31] [32] [33] [34] [35] [36] [37] [38] [39] [40]
483
L.W. Tsai, Robot analysis: the Mechanics of Serial and Parallel Manipulators, John Wiley & Sons, 1999. T. Cormen, Introduction to Algorithms, 3rd edition MIT Press, 2009. S. Seshu, M. Reed, Linear Graphs and Electrical Networks, Addison-Wesley, Reading, Mass., 1961 I. Duff, J. Reid, The design of ma48: a code for the direct solution of sparse unsymmetric linear systems of equations, ACM Trans. Math. Softw. (TOMS) 22 (2) (1996) 187–226. S. Toledo, Improving the memory-system performance of sparse-matrix vector multiplication, IBM J. Res. Dev. 41 (6) (1997) 711–725. M. González, F. González, D. Dopico, A. Luaces, On the effect of linear algebra implementations in real-time multibody system dynamics, Comput. Mech. 41 (4) (2008) 607–615. A.F. Hidalgo, F.J. García de Jalón de la Fuente, S. Tapia Fernandez, High performance algorithms and implementations using sparse and parallelization techniques on MBS, Proc. of ECCOMAS Thematic Conference, Brussels, Belgium, 2011. C. Whaley, A. Petitet, Minimizing development and maintenance costs in supporting persistently optimized BLAS, Softw. Pract. Exp. 35 (2) (2005) 101–121. R. Mittra, C. Klein, The use of pivot ratios as a guide to stability of matrix equations arising in the method of moments, IEEE Trans. Antennas Propag. 23 (3) (1975) 448–450. J.M. Selig, Geometric fundamentals of robotics, Monographs in Computer Science, Springer, 2005. R.A. Wehage, Application of matrix partitioning and recursive projection to O(n) solution of constrained equations of motion, Trends and Developments in Mechanisms, Machines and Robotics, DE, 15-2 1988, pp. 221–230. R.E. Roberson, R. Schwertassek, Dynamics of Multibody Systems, John Wiley and Sons, 1988. R. Featherstone, Rigid Body Dynamics Algorithms, Springer-Verlag New York, Inc., New York, NY, USA, 2008. L.A. Pars, A Treatise on Analytical Dynamics, Ox Bow Press, 1979. K. Lilly, D. Orin, Efficient O(n) recursive computation of the operational space inertia matrix, IEEE Trans. Syst. Man Cybern 23 (5) (1993). G. Rodriguez, A. Jain, K. Kreutz-Delgado, A spatial operator algebra for manipulator modeling and control, Int. J. Robot. Res. 10 (4) (1991) 371–381. F. Zhang, The Schur complement and its applications, vol. 4, Springer, 2006.