Deterministic Global Optimization and Transition States

Deterministic Global Optimization and Transition States

Krist V. Gernaey, Jakob K. Huusom and Rafiqul Gani (Eds.), 12th International Symposium on Process Systems Engineering and 25th European Symposium on C...

200KB Sizes 10 Downloads 119 Views

Krist V. Gernaey, Jakob K. Huusom and Rafiqul Gani (Eds.), 12th International Symposium on Process Systems Engineering and 25th European Symposium on Computer Aided Process Engineering. c 2015 Elsevier B.V. All rights reserved. 31 May - 4 June 2015, Copenhagen, Denmark. 

Deterministic Global Optimization and Transition States Dimitrios Nerantzis and Claire S. Adjiman Department of Chemical Engineering; Centre for Process Systems Engineering; Imperial College London; London SW7 2AZ; United Kingdom [email protected]; [email protected]

Abstract Transition states (index-1 saddle points) play a crucial role in determining the rates of chemical transformations but their reliable identification remains challenging in many applications. In the current literature deterministic global optimization methods have been employed for the location of transition states (TSs) by initially finding all stationary points and then identifying the TSs among the set of solutions. Aiming to focus the computational effort on the location of TSs of general nonlinear, three times continuously differentiable functions, we introduce regional tests for the identification of hyper-rectangular areas that do not contain any TS or that may contain a unique TS. These tests can be used within the framework of global optimization methods with the potential of reducing the computational time for TS location. We present the theory behind the tests and results on a simple benchmark function. Keywords: global optimization, transition states, interval matrix, eigenvalue bounding

1. Introduction We consider the following problem: Given a function f : B ⊆ Rn → R, f ∈ C3 we want to find all the critical points, x∗ ∈ B : ∇ f (x∗ ) = 0, of f for which the Hessian matrix ∇2 f (x∗ ) has eigenvalues λn < 0 < λn−1 ≤ ... ≤ λ1 . Such points are called Transition States (TSs) or index-1 saddle points. TSs play a crucial role in determining the rates of chemical transformations (Wales, 2003) and are also of interest in robotics and economics (Ellabaan et al., 2009). A number of local methods have been proposed in the literature for the identification of transition states. For example, in the Rational Function Optimization (RFO) method (Banerjee et al., 1985) a local search for a single TS is performed while in the Nudged Elastic Band method (Henkelman and J´onsson, 2000) an approximation of the minimum energy path between two minima is built. The point with the maximum energy on the path is a TS. Stochastic methods such as simulated annealing (Chaudhury and Bhattacharyya, 1998) and genetic algorithms (Ellabaan et al., 2009) have also been employed for locating TSs. While more computationally expensive, such methods do not require any starting points to locate a TS and may find multiple TSs. Our focus in this paper is on deterministic global methods, that can guarantee the identification of all TSs within a specified domain. In the existing literature, the use of such methods for TS location includes the work of Westerberg and Floudas (1999) using the α BB algorithm (Androulakis et al., 1995; Adjiman et al., 1998) and the work of Lin and Stadtherr (2004) using the interval Newton method (Hansen and Walster, 2003). In (Westerberg and Floudas, 1999) and (Lin and Stadtherr, 2004) the authors locate all critical points of a potential energy function and then identify each type of solution based on the eigenvalues of the corresponding Hessian matrix. The drawback of this approach is that computational time is spent for the location of critical points with index > 1 ( i.e., with a number of negative eigenvalues > 1).

852

Nerantzis and Adjiman

Because of the computational cost associated with deterministic global optimization, it may be beneficial to focus the search on regions that contain TSs only. In this paper, we propose several regional tests that allow the elimination of certain regions and we apply this approach to a test function. We explore the trade-off between the cost of the tests and the number of iterations and CPU time required to identify all TSs. The paper is organized as follows: In section 2, we introduce the general approach briefly. In section 3, the regional tests are discussed. The algorithm is applied to an example in Section 4 and conclusions are drawn in Section 5.

2. Proposed approach We use the α BB algorithm and the formulation proposed in (Westerberg and Floudas, 1999) in order to search for critical points. However, aiming to focus the computational effort on the location of TSs, we introduce a number of tests which can be used to bound the number of negative and positive eigenvalues of an interval matrix. Since α BB is a branch & bound optimization algorithm, at any given iteration, it proceeds by calculating valid lower and upper bounds over hyper-rectangular subsets R of the initial domain B and then dividing each subset area to obtain better lower and upper bounds. Whenever the lower bound of a given area is found to be greater than the best upper bound so far, the area is fathomed. We modify the approach by applying, prior to each bounding step, a test on the interval Hessian matrix, [∇2 f (R)], calculated over R by the natural interval extension (Hansen and Walster, 2003) of the second derivatives ∂ 2 f /∂ xi ∂ x j . The interval Hessian can be seen as a superset of {∇2 f (x) : x ∈ R}. If the test reveals that every matrix in [∇2 f (R)] has index > 1 then we fathom the area R. If the test reveals that every matrix in [∇2 f (R)] is index-1 then we perform a local search, since it can be shown that this implies that there can be at most one TS in R and: if we find a TS we fathom the area. Otherwise the test is inconclusive and so we proceed with the next step of the modified α BB algorithm.

3. Regional tests In this section, we introduce three tests to assess whether a given region may contain a TS. 3.1. Recursive inertia test Based on Haynsworth’s theorem (Carlson et al., 1974) we can construct algorithms for obtaining bounds on the number of negative and positive eigenvalues of interval matrices. First we give the following definition: Definition 3.1 (Inertia of a symmetric scalar matrix) Given a symmetric matrix A, the inertia of A, In(A), is the triplet (π (A), ν (A), δ (A)) of the numbers of positive, negative and zero eigenvalues of A respectively. Theorem 3.2 (Haynsworth) Given a symmetric matrix M partitioned in the form   A B M= T and assuming A is non-singular, then In(M) = In(A) + In(C − BT A−1 B). B C We can make use of Haynsworth’s theorem recursively, as shown in Cottle (1974). Cottle considers scalar matrices and chooses A to be a single non-zero entry in the diagonal. By interchanging corresponding rows and columns simultaneously, thus not affecting the eigenvalues, we bring the selected entry A to the top right position of the matrix. We note the sign of A, we then calculate

Deterministic Global Optimization and Transition States

853

C − BT A−1 B (this is the Schur complement of A in M) and we repeat. If all the elements  in the  0 a diagonal are zero we are either left with a zero matrix or we can choose A to be of the form . a 0 This way we can always calculate the complete inertia of a scalar matrix. A straightforward adaptation of this recursive scheme for interval matrices and for our purposes is simply to scan the diagonal for intervals that do not contain zero, calculate the interval Schur complement [C] − [B]T [A]−1 [B] and repeat. We should give priority to negative intervals. If at any point all the diagonal interval elements contain zero then we cannot proceed further with the analysis and stop. Note that in the interval case, each time we find a negative (resp. positive) interval in the diagonal of a subsequent Schur complement, this means that all the scalar matrices contained in the initial interval matrix have a further negative (resp. positive) eigenvalue. In a similar manner Meyer and Swartz (1998) used Schur’s formula, det(M) = det(A)det(C − BT A−1 B), for a convexity test applied to interval matrices (such a test was mentioned in Cottle (1974) for scalar matrices) along with a branch-and-bound method. In algorithm 1 we give a pseudocode for the proposed recursive inertia algorithm, RecIn. Algorithm 1 Pseudocode for RecIn algorithm: O(n3 ) 1: Set n = dim([M]). Initialize neg=0, pos=0 (number of negative and positive (interval) eigenvalues). 2: Search for a diagonal interval [maa ] with 0 ∈ [maa ]. Give priority to negative intervals 3: if none found then 4: Stop, test is inconclusive. 5: else 6: pos = pos + 1, if [maa ] > 0 or neg = neg + 1, if [maa ] < 0 7: if neg > 1 or pos == n then 8: Fathom area. 9: else if neg == 1 and pos == n − 1. then 10: Optional: Local search. 11: else if dim([M]) > 1 then 12: Calculate the interval Schur complement of A = [maa ] in [M], set [M] to the Schur complement and repeat from step 2. 13: else 14: Test is inconclusive. 15: end if 16: end if 3.2. 2 × 2 Inertia test Another possible way to make use of theorem 3.2 for our purpose is to choose [A], in[M] =    [A] [B] [mii ] [mi j ] . The , to be any of the 2 × 2 diagonal sub-matrices of [M], [Ai j ] = [m ji ] [m j j ] [B]T [C] maximum eigenvalue, λi j = max λ1 (Ai j ), of each of these matrices is Ai j ∈[Ai j ]

λi j =

mii + m j j +

(

(mii − m j j )2 + 4 max{mi j 2 , mi j 2 } 2

,

(1)

where mi j and mi j are the lower and upper bounds of the interval entry [mi j ] of [M]. If λi j < 0 for any of the sub-matrices then by theorem 3.2 we know that every M ∈ [M] has at least two negative eigenvalues and thus we can fathom the corresponding area. In algorithm 2 we give a pseudocode of this test to which we will refer as the 2 × 2 inertia test.

Nerantzis and Adjiman

854

Algorithm 2 Pseudocode for the 2 × 2 inertia test: O(n2 ) 1: Set n = dim([M]). 2: for i=1:n do 3: for j>i:n do ( mii +m j j + (mii −m j j )2 +4 max{mi j 2 ,mi j 2 }

4: 5: 6: 7: 8: 9:

λ= 2 if λ < 0 then Fathom area. Exit test. end if end for end for

Note that the 2 × 2 inertia test does not remove minima and TSs and that it might be inconclusive even for a scalar matrix. However it is computationally cheap and easy to implement, and is thus worth investigating. 3.3. Rohn test A third test is based on Rohn’s method (Hladk et al., 2010) which is derived from the interval extension of Weyl’s inequality (Golub and Van Loan, 1996). Theorem 3.3 (Weyl) Given symmetric (scalar) matrices C and E then

λk (C) + λn (E) ≤ λk (C + E) ≤ λk (C) + λ1 (E). Any given interval matrix [M] can be written as C + [E] where ci j = (mi j + mi j )/2 and [ei j ] = [ci j − mi j , mi j − ci j ]. By calculating lower and upper bounds, λn and λ1 , for λn ([E]) = {λn (E) : E ∈ [E]} and λ1 ([E]) = {λ1 (E) : E ∈ [E]} respectively then we would have, Theorem 3.4 (Rohn) Given a symmetric interval matrix [M] = C + [E] then

λk (C) + λn ≤ λk (C + [E]) ≤ λk (C) + λ1 . Note that λn = −λ1 and also that the widths of the intervals λk ([M]) are all the same. We can calculate λn (and λ1 ) using a number of methods (Adjiman et al., 1998), the simplest being the interval extension of Gerschgorin’s theorem (O(n2 )) and the most expensive being the Hertz-Rohn method (O(2n−1 )). The Rohn test is summarized in algorithm 3. Algorithm 3 Pseudocode for Rohn’s test: O(n2 ) - O(2n−1 ) 1: Set n = dim([M]). 2: Calculate [λn ] and [λn−1 ] for [M] using Theorem 3.4 and an eigenvalue bounding method. 3: if λn ≥ 0 then 4: Fathom area (convex area). 5: else if λn < 0 and λn−1 > 0 then 6: Optional: Local search. 7: else if λn−1 ≤ 0 then 8: Fathom area. 9: else 10: Test is inconclusive. 11: end if

Deterministic Global Optimization and Transition States

855

4. Application to a benchmark problem. The proposed tests have been implemented in the α BB algorithm (Adjiman et al., 1998). We investigate the performance the proposed tests on a small example, Ackley’s function in three dimensions. The form and domain are given below. (   f (x) = −20 exp(−0.2 1n ∑ni=1 xi2 ) − exp 1n ∑ni=1 cos(2π xi ) + 20 + e, x ∈ [0.5, 3]3 , n = 3. (2) We perform six runs, one run without any regional tests, one run with the 2 × 2 inertia test, two runs with the Rohn test (with and without the local search step) and two runs with the RecIn test. For bounding the eigenvalues in Rohn’s test we use the interval extension of Gerschgorin’s theorem (Adjiman et al., 1998). The CPU time and number of solutions of each type (minima, TSs, higher-index critical points) found are reported in table 1. The number of unfathomed nodes at each iteration for each run is shown in figure 1. The computations were performed on a 3060 MHz computer using an absolute convergence tolerance of 10−6 and a minimum box size of 10−6 . All 81 TSs are identified in all cases, and significant decreases in the CPU time can be observed, with all tests performing well. The local search is found to reduce the number of iterations and to lead to a further decrease in CPU time. This is likely to be critical when dealing with larger problems. Table 1: CPU times without local search / with local search and number of solutions of each type found for each run for the Ackley function. Test CPU Time #Mins #TSs #index > 1 No test 65 sec 27 81 84 2x2 Inertia 33 sec 27 81 0 Rohn 30/21 sec 0 81 0 RecIn 28/19 sec 0 81 0

Figure 1: Number of unfathomed nodes at each iteration for each run.

856

Nerantzis and Adjiman

5. Conclusions In this paper we considered the problem of locating all transition states (TSs) of general nonlinear functions, in C3 , using global deterministic methods. In particular we used the α BB optimization method and we introduced a number of tests which we applied prior to the bounding step of the α BB algorithm. These tests help to identify areas of the search space which do not contain any TSs or may contain at most one. In the first case we fathom/remove the area while in the second we perform a local search and if a solution is found we fathom the area. Without the tests the algorithm simply locates all critical points while with the tests we aim to focus the computational effort on the location of TSs. We have presented the successful application of the proposed tests to a benchmark function. The results indicate that the addition of the tests can reduce the computational time significantly. Finally we note that the proposed tests can be used within any branch and bound algorithm or within the interval Newton method and that they can be altered in order to locate any index-k critical point, with the exception of the 2 × 2 inertia test.

6. Acknowledgements Financial support from the Engineering and Physical Sciences Research Council (EPSRC) of the UK, via a Leadership Fellowship (EP/J003840/1), is gratefully acknowledged.

References Adjiman, C. S., Dallwig, S., Floudas, C. A., Neumaier, A., 1998. A global optimization method, α BB, for general twice-differentiable constrained NLPs I. Theoretical advances. Comp & Chem Eng 22, 1137 – 1158. Androulakis, I. P., Maranas, C. D., Floudas, C. A., 1995. α BB: A global optimization method for general constrained nonconvex problems. J Global Optim 7 (4), 337–363. Banerjee, A., Adams, N., Simons, J., Shepard, R., 1985. Search for stationary points on surfaces. J Phys Chem 89 (1), 52–57. Carlson, D., Haynsworth, E., Markham, T., 1974. A generalization of the Schur complement by means of the moorepenrose inverse. SIAM J Appl Math 26, 169–175. Chaudhury, P., Bhattacharyya, S., 1998. A simulated annealing based technique for locating first-order saddle points on multidimensional surfaces and constructing reaction paths: several model studies. J. Mol. Struct. (Theochem) 429, 175 – 186. Cottle, R. W., 1974. Manifestations of the Schur complement. Lin Alg Appl 8, 189 – 211. Ellabaan, M., Ong, Y. S., Lim, M. H., Kuo, J. L., 2009. Finding multiple first order saddle points using a valley adaptive clearing genetic algorithm. In: IEEE CIRA. pp. 457–462. Golub, G. H., Van Loan, C. F., 1996. Matrix Computations (Johns Hopkins Studies in Mathematical Sciences)(3rd Edition). The Johns Hopkins University Press. Hansen, E., Walster, G., 2003. Global optimization using interval analysis. Pure and applied mathematics. M. Dekker, New York. Henkelman, G., J´onsson, H., 2000. Improved tangent estimate in the nudged elastic band method for finding minimum energy paths and saddle points. J Chem Phys 113 (22), 9978–9985. Hladk, M., Daney, D., Tsigaridas, E. P., 2010. Bounds on real eigenvalues and singular values of interval matrices. SIAM J Matrix Anal Appl 31 (4), 2116–2129. Lin, Y., Stadtherr, M. A., 2004. Locating stationary points of sorbate-zeolite potential energy surfaces using interval analysis. J Chem Phys 121 (20), 10159–10166. Meyer, C. A., Swartz, C. L. E., 1998. A regional convexity test for global optimization: Application to the phase equilibrium problem. Comp and Chem Eng 22, 1407 – 1418. Wales, D., 2003. Energy Landscapes: Applications to Clusters, Biomolecules and Glasses. Cambridge University Press. Westerberg, K. M., Floudas, C. A., 1999. Locating all transition states and studying the reaction pathways of potential energy surfaces. J Chem Phys 110 (18), 9259–9295.