PHYSICA ELSEVIER
Physica D 107 (1997) 240-254
Correlation length, isotropy and meta-stable states
Ricardo García-Pelayo a, Peter F. Stadler b,c
a Instituto de Física, Universidad Nacional Autónoma de México, México
b Institut f. Theoretische Chemie, Univ. Wien, Währingerstr. 17, A-1090 Wien, Austria
c The Santa Fe Institute, Santa Fe, NM, USA
Abstract
A landscape is rugged if it has many local optima, if it gives rise to short adaptive walks, and if it exhibits a rapidly decreasing pair-correlation function (and hence a short correlation length). The "correlation length conjecture" allows one to estimate the number of meta-stable states from the correlation length, provided the landscape is "typical". Isotropy, originally introduced as a geometrical condition on the covariance matrix of a random field, can be re-interpreted as a maximum entropy condition that lends a precise meaning to the notion of a "typical" landscape. The XY-Hamiltonian, which violates isotropy only to a relatively small extent, is an ideal model for investigating the influence of anisotropies. Numerical estimates for the number of local optima deviate from the predictions of the correlation length conjecture, and the deviations increase with the extent of the anisotropies in the model.
Keywords: Rugged landscapes; XY-Hamiltonian; Isotropy; Graph Laplacian; Meta-stable states
1. Introduction
Spin glass Hamiltonians, cost functions of combinatorial optimization problems, and "fitness functions" in models of biological evolution can be regarded as mappings $f$ from the vertex set $V$ of a (usually huge, but finite) graph $\Gamma$ into the real numbers. The vertex set $V$ is the set of all possible configurations (e.g. spin orientations, tours of a traveling salesman problem (TSP), or DNA sequences). The set $E$ of edges of the graph $\Gamma = (V, E)$ is introduced by defining a "move set" that inter-converts "neighboring" configurations, see e.g. [19,51,60]. A move set may consist, for instance, of flipping single spins, of exchanging two cities along the salesman's tour, or of mutating a single nucleotide of a DNA sequence. A mapping $f : V \to \mathbb{R}$ has been termed a landscape, following a picture of evolutionary optimization originally proposed by Wright [63].

Most landscape models contain a stochastic element in their definition: a particular instance is generated by assigning a (usually large) number of parameters at random. Such models are called random fields [3]. A typical example is the Sherrington-Kirkpatrick (SK) Hamiltonian [46],

$$f(x) \stackrel{\rm def}{=} \sum_{i<j} J_{ij}\, x_i x_j ,$$

where the spins take the values $x_i = \pm 1$ and the interaction coefficients $J_{ij}$ are i.i.d. Gaussian random variables. Instead of specifying the distribution of the random parameters one may as well define the joint distribution $P(u) \stackrel{\rm def}{=} \mathrm{Prob}[f(x) \le u_x \mid x \in V]$ of the "fitness values" of all configurations $x \in V$. Mathematically speaking, a random field is the probability space with point set $\{f : V \to \mathbb{R}\}$ and measure $dP$, see e.g. [56]. We shall use the notation $\mathcal{E}[\,\cdot\,] \stackrel{\rm def}{=} \int \cdot \; dP$ for the average of a quantity w.r.t. this measure, i.e. the average over the disorder. In the SK model this amounts to integrating over the Gaussian distributions of the interaction coefficients $J_{ij}$. We shall restrict the use of the term landscape to individual mappings $f : V \to \mathbb{R}$. An element (a realization, an instance) of a random field on $\Gamma$ is therefore a landscape, corresponding to quenched disorder.

The number $|V|$ of vertices is very large in general, hence we are interested in a simple statistical description of a landscape or a random field model. The most important characteristic is ruggedness, a notion that is closely related to the hardness of the optimization problem for heuristic algorithms [37]. Three distinct approaches have been proposed to measure and quantify ruggedness. Sorkin [47], Eigen et al. [18] and Weinberger [60] used pair correlation functions. Kauffman and Levin [30] proposed adaptive walks, and Palmer [40] based his discussion on the number of meta-stable states (local optima). Of course one expects a close relationship between these different characterizations of ruggedness. In this contribution we elaborate on the relation between the number of local optima and pair-correlation functions.

In Section 2 we introduce the notion of an elementary landscape and discuss correlation functions and the correlation length. In Section 3 we introduce the notion of isotropic random fields. Section 4 presents the "correlation length conjecture", which provides an estimate for the number of local optima (meta-stable states) of isotropic elementary random fields.
In Section 5 we study the discrete XY-Hamiltonian and show that this model is elementary on two fairly different types of graphs but violates isotropy to a different extent in these two cases. This fact makes the XY-Hamiltonian an ideal model for a systematic investigation of the effects of anisotropy. Section 6 contains the discussion of our results. Appendix A contains some technical details.
2. Correlation length and elementary landscapes
The mathematical investigation of a landscape $f$ on a graph $\Gamma$ requires an algebraic description of the graph itself. The most straightforward encoding of $\Gamma$ is the adjacency matrix $A$ with entries $A_{xy} = 1$ if the vertices $x$ and $y$ are connected by an edge, and $A_{xy} = 0$ if $x$ and $y$ are not neighbors of each other. The number of neighbors of a vertex $x$ is called the degree of $x$. The degree matrix $\mathbf{D}$ of $\Gamma$ is the diagonal matrix containing the individual vertex degrees. The graph $\Gamma$ is regular if all vertices have the same number $D$ of neighbors, i.e., if $\mathbf{D} = D\,I$, where $I$ denotes the identity matrix. For our purposes it will be more useful to use the graph Laplacian $-\Delta \stackrel{\rm def}{=} \mathbf{D} - A$. This matrix can be regarded as a discretization of the familiar Laplacian differential operator; see [5,38,39] for reviews on graph Laplacians. The operator $-\Delta$ is symmetric and non-negative definite, and its smallest eigenvalue is $\Lambda_0 = 0$, corresponding to a constant (flat) eigenfunction. Furthermore, the multiplicity of $\Lambda_0$ equals the number of connected components of $\Gamma$; hence $\Gamma$ is connected if and only if $\Lambda_0$ is a simple eigenvalue of $-\Delta$. Throughout this paper we shall assume that $\Gamma$ is connected and $D$-regular.
Definition [55]. A landscape $f$ on $\Gamma$ is elementary if and only if

$$\tilde f(x) \stackrel{\rm def}{=} f(x) - \frac{1}{|V|} \sum_{z \in V} f(z) \tag{1}$$

is an eigenfunction of the graph Laplacian with an eigenvalue $\Lambda > 0$.
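The definition can be checked by brute force on a small graph. The following is a minimal sketch, assuming the Boolean hypercube with single spin flips as move set and a 2-spin (SK-type) landscape with Gaussian couplings; the function names, the seed and the size $n = 6$ are illustrative choices and not part of the original text. It applies $-\Delta = \mathbf{D} - A$ to $\tilde f$ and compares the result with $\Lambda \tilde f$ for $\Lambda = 4$.

```python
import itertools
import random


def two_spin_landscape(n, seed=0):
    """Return f(x) = sum_{i<j} J_ij x_i x_j with i.i.d. Gaussian J_ij (spins are +/-1)."""
    rng = random.Random(seed)
    J = {(i, j): rng.gauss(0.0, 1.0) for i in range(n) for j in range(i + 1, n)}

    def f(x):
        return sum(c * x[i] * x[j] for (i, j), c in J.items())

    return f


def flip_neighbors(x):
    """All configurations that differ from x in exactly one spin."""
    for i in range(len(x)):
        yield x[:i] + (-x[i],) + x[i + 1:]


def laplacian_residual(f, n, eigenvalue):
    """Largest |(-Delta f~)(x) - eigenvalue * f~(x)| over all hypercube vertices."""
    vertices = list(itertools.product((-1, 1), repeat=n))
    mean = sum(f(x) for x in vertices) / len(vertices)
    worst = 0.0
    for x in vertices:
        # (-Delta g)(x) = sum over neighbors y of [g(x) - g(y)], because -Delta = D - A.
        lap = sum((f(x) - mean) - (f(y) - mean) for y in flip_neighbors(x))
        worst = max(worst, abs(lap - eigenvalue * (f(x) - mean)))
    return worst


residual = laplacian_residual(two_spin_landscape(6), n=6, eigenvalue=4.0)
print(residual)  # numerically zero if the landscape is elementary with eigenvalue 4
```

Flipping spin $i$ changes the sign of every term that contains $x_i$, and each pair $(i,j)$ is hit twice, which is where the eigenvalue 4 comes from in this particular example.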
Elementary landscapes play an important role because of their algebraic properties [28,52,54,55] and because a large number of well-studied examples from spin glass physics (such as Derrida's p-spin models) and from combinatorial optimization (among them the TSP) are elementary, see [27,52,54,55]. A landscape $f$ can be decomposed into a superposition of elementary landscapes: since $-\Delta$ is symmetric there is an orthonormal basis $\{\varphi_k\}$ of eigenvectors. The eigenvector $\varphi_0$ belonging to the eigenvalue $\Lambda_0 = 0$ is constant. An expansion

$$f(x) = \sum_{k=0}^{N-1} a_k \varphi_k(x) \tag{2}$$

in terms of this so-called Fourier basis may be called a Fourier series of the landscape [61].

In this contribution we restrict our attention to a class of (highly symmetric) graphs for which the Fourier basis can be computed explicitly. These so-called Cayley graphs are constructed from a finite group $\mathcal{G}$ and a set of generators $\Phi \subset \mathcal{G}$ of $\mathcal{G}$ (i.e., each element $u \in \mathcal{G}$ can be obtained as a finite product of elements from $\Phi$). In addition we assume that (i) the group identity $\iota \notin \Phi$ and that (ii) $x \in \Phi$ implies $x^{-1} \in \Phi$. The Cayley graph $\Gamma(\mathcal{G}, \Phi)$ has the vertex set $\mathcal{G}$, and there is an edge connecting two elements $x$ and $y$ if and only if $x y^{-1} \in \Phi$. For instance, the Boolean hypercube is a Cayley graph. It is not hard to show that the product graph of two Cayley graphs $\Gamma(\mathcal{G}_1, \Phi_1)$ and $\Gamma(\mathcal{G}_2, \Phi_2)$ is the Cayley graph $\Gamma(\mathcal{G}_1 \times \mathcal{G}_2,\; \Phi_1 \times \{\iota_2\} \cup \{\iota_1\} \times \Phi_2)$, where $\iota_1$ and $\iota_2$ are the group identities of $\mathcal{G}_1$ and $\mathcal{G}_2$, respectively. The spectrum and the eigenbasis of a product graph can be obtained easily from its components [10]: $\Lambda_{i_1 i_2} = \Lambda_{i_1} + \Lambda_{i_2}$ is an eigenvalue of $\Gamma_1 \times \Gamma_2$ with eigenfunction $\varphi_{i_1 i_2}(x_1, x_2) = \varphi_{i_1}(x_1)\,\varphi_{i_2}(x_2)$.

The eigenvalues and eigenfunctions of the Laplacian of a Cayley graph with a commutative group are not hard to construct explicitly, see e.g. [34,61]. An eigenbasis is given by the characters of $\mathcal{G}$,

$$\varphi_g(x) = \exp\left( 2\pi i \sum_k \frac{x_k g_k}{N_k} \right) = \prod_k \exp\left( 2\pi i\, \frac{x_k g_k}{N_k} \right), \qquad 0 \le x_k, g_k < N_k, \tag{3}$$

where the constants $N_k$ are the orders of the cyclic subgroups $\mathbb{Z}_{N_k}$ into which the commutative group can be decomposed, see [25] for details. The corresponding eigenvalues of the Laplacian are $\Lambda_g = \sum_{x \in \Phi} [1 - \varphi_g(x)]$. We shall use these expressions in Section 5.

Let us now return to the properties of a landscape $f$ on $\Gamma$. Two types of correlation functions have been investigated as a means of quantifying the ruggedness of a landscape. Eigen et al. [18] introduced $\rho(d)$, which measures the pair correlation as a function of the distance between the vertices of $\Gamma$. Weinberger [60] used the autocorrelation function $r(s)$ of the "time series" $\{f(x_0), f(x_1), \ldots\}$ generated by a simple random walk [48] on $\Gamma$ in order to measure properties of $f$. The relationship between $r(s)$ and $\rho(d)$ is discussed in [20,60]. The correlation function $r(s)$ is intimately related to the Fourier series expansion of the landscape [55]. Elementary landscapes belonging to the eigenvalue $\Lambda_p$ have exponential autocorrelation functions of the form $r(s) = (1 - \Lambda_p/D)^s$. For a general landscape we have

$$r(s) = \sum_{p \ne 0} B_p \left(1 - \Lambda_p/D\right)^s . \tag{4}$$

The amplitudes $B_p$ are determined by the Fourier coefficients $a_k$ in Eq. (2):

$$B_p = \sum_{k \in I_p} |a_k|^2 \Big/ \sum_{k \ne 0} |a_k|^2 \;\ge\; 0, \tag{5}$$

where $I_p$ denotes the set of the indices $j$ for which $-\Delta \varphi_j = \Lambda_p \varphi_j$. The crucial information about a landscape is therefore contained in the eigenvalues $\Lambda_p$ of the graph Laplacian, which determine the ruggedness of a component, and in the amplitudes $B_p$, which determine the relative importance of the different components.
A particularly useful measure for the ruggedness of a landscape is the correlation length [20,23,28,60]. In most of the earlier work on the Nk model and on RNA landscapes a "correlation length" $\hat\ell$ is used that is defined such that $r(\hat\ell) = 1/e$, see e.g. [20,60]. It is more convenient, however, to use the definition

$$\ell \stackrel{\rm def}{=} \sum_{s=0}^{\infty} r(s) = D \sum_{p \ne 0} \frac{B_p}{\Lambda_p} . \tag{6}$$

The discrepancy between the two definitions, $\ell - \hat\ell = 1/2 + O(\Lambda_p/D)$, is negligible because we have in general $\Lambda_p = O(1)$ and $D = O(n)$.
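The size of the discrepancy is easy to check numerically for an elementary landscape, where $r(s) = (1 - \Lambda/D)^s$. The sketch below is only an illustration; the function names and the values $\Lambda = 4$, $D = 400$ are arbitrary assumptions. It sums the geometric series of Eq. (6) and compares it with the length at which $r$ has decayed to $1/e$.

```python
import math


def autocorrelation(s, eigenvalue, degree):
    """r(s) = (1 - Lambda/D)^s for an elementary landscape."""
    return (1.0 - eigenvalue / degree) ** s


def correlation_length_sum(eigenvalue, degree, tol=1e-15):
    """Correlation length as the sum over r(s), truncated once terms drop below tol."""
    total, s = 0.0, 0
    while True:
        term = autocorrelation(s, eigenvalue, degree)
        if term < tol:
            return total
        total += term
        s += 1


def correlation_length_1_over_e(eigenvalue, degree):
    """Correlation length defined by r(length) = 1/e."""
    return -1.0 / math.log(1.0 - eigenvalue / degree)


ell = correlation_length_sum(4.0, 400.0)
ell_hat = correlation_length_1_over_e(4.0, 400.0)
print(ell, ell_hat, ell - ell_hat)
```

For these values the sum equals $D/\Lambda = 100$ up to truncation error, and the difference between the two lengths is close to $1/2$.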
3. Isotropy as a maximum entropy condition
A quantity that is closely related to the autocorrelation functions discussed above is the covariance matrix $C$ of a random field, with the entries

$$C_{xy} = \mathcal{E}[f(x) f(y)] - \mathcal{E}[f(x)]\,\mathcal{E}[f(y)] . \tag{7}$$

Clearly, $C$ does not depend on the neighborhood structure among the configurations. A random field on a graph $\Gamma$ will be called isotropic if its covariance matrix has the same symmetry as $\Gamma$ itself. Isotropy thus depends on the structure of $\Gamma$. In order to make this contribution self-contained we briefly outline the precise meaning of this symmetry condition. An automorphism of a graph $\Gamma$ is a one-to-one map $a : V \to V$ that preserves adjacency, i.e., $\{a(x), a(y)\}$ is an edge of $\Gamma$ if and only if $\{x, y\}$ is an edge. The set of all automorphisms forms the automorphism group $\mathrm{Aut}[\Gamma]$. The action of $\mathrm{Aut}[\Gamma]$ on the set of all ordered pairs of vertices induces a partition $\mathcal{R}$ of $V \times V$. The classes of this partition are the orbits of $\mathrm{Aut}[\Gamma]$. The class $\mathcal{X}$ containing a particular pair $(x_0, y_0)$ of vertices is $\mathcal{X} = \{(a(x_0), a(y_0)) \mid a \in \mathrm{Aut}[\Gamma]\}$, i.e., all vertex pairs that belong to the same class $\mathcal{X}$ are equivalent w.r.t. the symmetry of $\Gamma$.
Definition. A random field is isotropic if and only if (i) there is a constant $a_0$ such that $\mathcal{E}[f(x)] = a_0$ for all $x \in V$ and (ii) there is a function $c : \mathcal{R} \to \mathbb{R}$ such that $C_{xy} = c(\mathcal{X})$ for all $(x, y) \in \mathcal{X}$, i.e., the covariance matrix of the random field is constant on the symmetry classes of the underlying graph $\Gamma$.

A fairly general algebraic theory of isotropy is laid out in [56]. For our discussion we shall need only a simplified version of Theorem 3 from that paper:

Proposition. Let $\Gamma$ be a Cayley graph with a commutative group. Then a random field on $\Gamma$ is isotropic if and only if its Fourier coefficients $\{a_k\}$ fulfill
(i) $\mathcal{E}[a_k] = 0$ for all $k \ne 0$;
(ii) $\mathcal{E}[a_k a_l^*] = \delta_{kl}\, \mathcal{E}[|a_k|^2]$;
(iii) $\mathcal{E}[|a_j|^2] = \mathcal{E}[|a_k|^2] = \beta_p$ whenever the corresponding eigenfunctions $\varphi_j$ and $\varphi_k$ belong to the same eigenvalue $\Lambda_p$ of the graph Laplacian.

In the case of Ising models this condition means that there is no structure in the interaction coefficients $J_{ij}$, i.e., all the $J_{ij}$ are assigned as i.i.d. random numbers. Thus the Sherrington-Kirkpatrick Hamiltonian is isotropic, while short range spin glasses do not have this property.
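For the SK-type landscape the isotropy condition can be verified directly. The sketch below is a hedged illustration, not the authors' procedure: it assumes $f(x) = \sum_{i<j} J_{ij} x_i x_j$ with i.i.d. zero-mean, unit-variance couplings, for which $C_{xy} = \sum_{i<j} x_i x_j y_i y_j$ exactly, and it uses the fact that the hypercube is distance transitive, so that the symmetry classes of vertex pairs are the Hamming distance classes. All names and the size $n = 5$ are illustrative.

```python
import itertools


def sk_covariance(x, y):
    """Exact covariance of f(x) and f(y) for f = sum_{i<j} J_ij x_i x_j with
    i.i.d. zero-mean, unit-variance couplings: only identical pairs (i, j) survive."""
    n = len(x)
    return sum(x[i] * x[j] * y[i] * y[j] for i in range(n) for j in range(i + 1, n))


def covariance_by_distance(n):
    """Map each Hamming distance d to the set of covariance values found at that distance."""
    vertices = list(itertools.product((-1, 1), repeat=n))
    classes = {}
    for x in vertices:
        for y in vertices:
            d = sum(a != b for a, b in zip(x, y))
            classes.setdefault(d, set()).add(sk_covariance(x, y))
    return classes


classes = covariance_by_distance(5)
for d in sorted(classes):
    print(d, classes[d])  # a single value per distance class signals isotropy
```

Each distance class carries exactly one covariance value, namely $((n - 2d)^2 - n)/2$, which is what condition (ii) of the definition requires.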
In general, the number of uncorrelated non-zero Fourier coefficients in an isotropic elementary random field equals the dimension $m(\Lambda_p) = |I_p|$ of the corresponding eigenspace of the graph Laplacian $-\Delta$. This observation suggests interpreting isotropy as a maximum-entropy-like condition: given the parameters $\beta_p$, the "most random" choice of coupling constants are Gaussian random variables fulfilling (i)-(iii). On the other hand, the $\beta_p$ are closely related to the expected autocorrelation function $\mathcal{E}[r(s)]$ in the isotropic case because

$$\mathcal{E}\left[ \sum_{k \in I_p} |a_k|^2 \right] = m(\Lambda_p)\, \beta_p , \tag{8}$$

where $m(\Lambda_p)$ denotes the multiplicity of the eigenvalue $\Lambda_p$. Eq. (5) then implies that the autocorrelation function is determined by the $\beta_p$. Conversely, the $\beta_p$ are determined uniquely by $r(s)$, see [55,56] for the details. In particular, a Gaussian isotropic elementary random field is therefore the maximum entropy model subject to prescribed parameters $\beta_p$. Derrida's p-spin models [13], for instance, are the maximum entropy models with the single constraint that only one order of interaction contributes to the Hamiltonian. The random energy model [14] can be regarded as the maximum entropy model subject to the constraint that the constants $\beta_p$ are all equal [56].

The multiplicities of the eigenvalues of "sufficiently" symmetric graphs are large. As a consequence the cost function of an isotropic random field is composed of a large number of uncorrelated random variables (at each point $x$ in configuration space). Thus the central limit theorem tells us that the distribution of the cost function $f(x)$ across the configurations $x \in V$ should be approximately Gaussian. Deviations from a normal distribution can therefore be interpreted as a sign of substantial anisotropy.
4. The expected number of local optima
Meta-stable states (local optima) are an intrinsic property of a rugged landscape. In fact, Palmer [40] used the existence of a large number of local optima to define ruggedness. We say that $x \in V$ is a local minimum of the landscape $f$ if $f(x) \le f(y)$ for all neighbors $y$ of $x$. The use of $\le$ instead of $<$ is conventional [33,42]; it does not make a significant difference for spin glass models. Local maxima are defined analogously. The number $\mathcal{N}$ of local optima of a landscape, however, is much harder to determine than its autocorrelation function $r(s)$ or its correlation length $\ell$. As it appears that $\mathcal{N}$ and $\ell$ are two sides of the same coin, we search for a connection between the two quantities.

For a random field model defined on a Hamming graph $Q_\alpha^n$ it will be convenient to use the parameter

$$\mu \stackrel{\rm def}{=} \lim_{n \to \infty} \sqrt[n]{\mathcal{E}[\mathcal{N}_n] / \alpha^n} \tag{9}$$

for describing the scaling of the number of local optima (we use the subscript $n$ to emphasize the dependence on the number of spins). Some papers use $\gamma \stackrel{\rm def}{=} \lim_{n \to \infty} (1/n) \log(\mathcal{E}[\mathcal{N}_n])$ or $\psi = \log\mu / \log\alpha$ instead of $\mu$.

The expected number $\mathcal{E}[\mathcal{N}]$ of meta-stable states has been determined by various statistical mechanics methods for a selection of Ising models with different assignments of the interaction coefficients. These can be drawn independently from a Gaussian distribution as in the SK spin glass [46], or one may assume that the spins are arranged on a two- or three-dimensional lattice with non-zero interactions only between lattice neighbors. At least three groups have computed the number of local minima of the SK model by means of what are now considered standard methods in statistical mechanics. Tanaka and Edwards [58] computed the expected number of local optima $\mathcal{E}[\mathcal{N}]$, while Bray and Moore [6] and De Dominicis et al. [11] used a replica approach to evaluate $\mathcal{E}[\ln \mathcal{N}]$. For the case of short range spin glasses, in which only a small number $z$ of coupling constants $J_{ij}$ are non-zero for any given spin $i$, a slightly larger number of local optima has been found [7,58]. The only known case in which the logarithmic average deviates from the direct average is the linear spin chain [16].

Since all Ising models have the same correlation length $\ell_n = \frac{1}{4} n$ [52,62] but somewhat different values of $\mathcal{N}$, we cannot hope for an exact formula relating $\psi$ and $\mathcal{E}[\ell]$ for general random field models. Using the maximum entropy condition, however, we know that the expected density of meta-stable states in an isotropic Gaussian random field is determined completely by the expected correlation function $\mathcal{E}[r(s)]$, because the model does not contain more information. In the case of an elementary isotropic random field the correlation length $\ell$ already determines $r(s)$, and hence there must be a direct relationship between $\ell$ and the expected number of meta-stable states $\mathcal{E}[\mathcal{N}]$. Its functional form will of course depend on the geometric properties of $\Gamma$.

A heuristic argument [50] linking local optima and correlation measures runs as follows: for a typical elementary landscape we expect that the correlation length $\ell$ gives a good description of its structure because the landscape does not have any other distinctive features. By construction $\ell$ determines the size of the mountains and valleys. As there are many directions available at each configuration we expect that there are only very few meta-stable states besides the summit of each of these $\ell$-sized mountains; almost all of the configurations will be saddle points with at least a few superior neighbors. We measure $\ell$ along a random walk, but the radius $R(\ell)$ of a mountain is more conveniently described in terms of the distance between vertices on $\Gamma$. Here $R(\ell)$ is the average distance that is reached by the random walk in $\ell$ steps. With the notation $B(R)$ for the number of vertices contained in a ball of radius $R$ in $\Gamma$ we obtain the following.
Conjecture. An isotropic elementary random field has $\mathcal{E}[\mathcal{N}] \approx |V| / B(R(\ell))$ local optima.

A more convenient form is $\mu \approx \lim_{n \to \infty} 1 / \sqrt[n]{B(R(\ell_n))}$. Using the scaled correlation length $\xi \stackrel{\rm def}{=} \lim_{n \to \infty} \ell_n / n$ and the scaled correlation radius $\varrho \stackrel{\rm def}{=} \lim_{n \to \infty} R(n\xi)/n$, the conjecture takes the form $\mu \approx \lim_{n \to \infty} 1 / \sqrt[n]{B(n\varrho)}$.

There is a fair amount of computational evidence for the correlation length conjecture, some of which is compiled in Table 1. In addition, one obtains reasonable estimates for Kauffman's Nk landscapes [36]. A few counterexamples are known as well; all of them strongly violate the maximum entropy assumption.
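Numerical evidence of this kind ultimately rests on counting local optima in sampled instances. The following minimal sketch shows the counting step only, assuming an SK-type landscape on the Boolean hypercube with single spin flips as moves and the $\le$ convention for local minima; the function name, the seed and $n = 8$ are illustrative assumptions, and exhaustive enumeration is feasible only for small $n$, far from the limit in which $\mu$ is defined.

```python
import itertools
import random


def count_local_minima(n, seed=0):
    """Count configurations x with f(x) <= f(y) for all single-spin-flip neighbors y,
    for f(x) = sum_{i<j} J_ij x_i x_j with i.i.d. standard Gaussian couplings."""
    rng = random.Random(seed)
    J = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            J[i][j] = J[j][i] = rng.gauss(0.0, 1.0)

    def energy(x):
        return sum(J[i][j] * x[i] * x[j] for i in range(n) for j in range(i + 1, n))

    count = 0
    for x in itertools.product((-1, 1), repeat=n):
        fx = energy(x)
        # x is a local minimum if no single spin flip lowers the energy.
        if all(fx <= energy(x[:i] + (-x[i],) + x[i + 1:]) for i in range(n)):
            count += 1
    return count


count = count_local_minima(8)
print(count, count / 2 ** 8)  # number and fraction of local minima in one instance
```

Because $f(-x) = f(x)$ for this Hamiltonian, local minima come in mirror pairs, so the count is always even; averaging over many instances and extrapolating in $n$ would be needed before comparing with the conjecture.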
Table 1
Meta-stable states in elementary isotropic random fields; $\mu = \lim \sqrt[n]{\mathrm{Prob}\{\text{loc. opt.}\}}$

| Model | Graph | Best estimate (a), Ref. | Conjecture, Ref. | Relative error (%) |
|---|---|---|---|---|
| SK (c) | $Q_2^n$ | 0.61023 S [6,11,58] | 0.60906 [57] | 0.2 |
| 3-spin | $Q_2^n$ | 0.6708 N [57] | 0.6649 [57] | 0.9 |
| 4-spin | $Q_2^n$ | 0.7102 N [57] | 0.7062 [57] | 0.6 |
| 5-spin | $Q_2^n$ | 0.7391 N [57] | 0.7379 [57] | 0.2 |
| 6-spin | $Q_2^n$ | 0.7546 N [57] | 0.7628 [57] | 1.1 |
| GBP (d) | $J(n, n/2)$ | 0.594(7) N [32] | 0.60906 [32] | 2.5 ± 1.1 |
| Symmetric TSP | $\Gamma(S_n, \mathcal{T})$ | non-exp. N [50] | $2.773^{-n}/(n/4)!$ [36] | < 5 (b) |

a The best estimates for $\mu$ are obtained either by statistical mechanics methods (S), such as replica calculations, or by means of numerical simulations (N).
b Relative error of $\log(\mathcal{N}_n / n!)$ compared with $\log[2.773^{-n}/(n/4)!]$, averaged over the available numerical data.
c Derrida's p-spin Hamiltonian is $\mathcal{H}(\sigma) = \sum_{i_1} \ldots$
5. The discrete XY-Hamiltonian
The discrete version of the XY model was introduced by Tanaka and Edwards [59]. There are $n$ spins in the plane, each of which can point in one of $\alpha$ directions,

$$\sigma_i = \begin{pmatrix} \cos\theta_i \\ \sin\theta_i \end{pmatrix} \quad \text{with} \quad \theta_i = \frac{2\pi}{\alpha}\, x_i, \qquad 0 \le x_i < \alpha. \tag{10}$$

The set $V$ of all possible spin configurations contains $|V| = \alpha^n$ points. The interaction energy between two spins $i$ and $j$ is given by

$$J_{ij} \langle \sigma_i, \sigma_j \rangle = J_{ij} \cos(\theta_i - \theta_j), \tag{11}$$

with i.i.d. Gaussian coupling constants $J_{ij}$. The Hamiltonian takes the form

$$\mathcal{H}(x) = \sum_{i<j} J_{ij} \cos(\theta_i - \theta_j) = \sum_{i<j} J_{ij} \cos\left( \frac{2\pi}{\alpha} (x_i - x_j) \right). \tag{12}$$
Two definitions of neighborhood between spin configurations are natural for the XY model.
(i) Two configurations are neighbors if they differ in the orientation of a single spin. This arranges the $\alpha^n$ spin configurations as a Hamming graph $Q_\alpha^n$, which is the $n$-fold (Cartesian or direct) product of the complete graph $Q_\alpha$ with $\alpha$ vertices. Each configuration $x \in V$ has $D = n(\alpha - 1)$ neighbors in $Q_\alpha^n$.
(ii) Two configurations are neighbors of each other if a single spin orientation differs by $\pm(2\pi/\alpha)$ while all the other spins are the same. The corresponding graphs $C_\alpha^n$ are the $n$-fold direct products of the cycles $C_\alpha$. This notion of neighborhood was used in [59]. Each spin configuration has $D = 2n$ neighbors in $C_\alpha^n$ if $\alpha \ge 3$, and only $D = n$ neighbors if $\alpha = 2$.

It is trivial to check that the complete graphs $Q_\alpha$ and the cycle graphs $C_\alpha$ are Cayley graphs of the commutative group $\mathbb{Z}_\alpha = \{\iota, \eta, \eta^2, \ldots, \eta^{\alpha-1}\}$ with the sets of generators $\Phi_Q = \mathbb{Z}_\alpha \setminus \{\iota\}$ and $\Phi_C = \{\eta, \eta^{\alpha-1} = \eta^{-1}\}$, respectively. Consequently, both the Hamming graphs $Q_\alpha^n$ and the graphs $C_\alpha^n$ are Cayley graphs of the $n$-fold Cartesian product of the group $\mathbb{Z}_\alpha$ with itself, which is of course again a commutative group. Note furthermore that $C_2^n = Q_2^n$ and $C_3^n = Q_3^n$. It is easy to verify that $C_4^n \cong Q_2^{2n}$. Direct products of larger cycles are not isomorphic to Hamming graphs.

Theorem 1. The discrete XY-Hamiltonian is elementary on $Q_\alpha^n$ with eigenvalue $\Lambda = 2\alpha$ for all $n \ge 2$ and all $\alpha \ge 2$. The correlation length is $\ell = D/\Lambda = n(\alpha - 1)/(2\alpha)$.

Proof. A short computation verifies that $f$ is an eigenfunction of $-\Delta$ for $Q_\alpha^n$. □
The eigenvalues of the Laplacian of $Q_\alpha^n$ are $\Lambda_p = p\alpha$ for $0 \le p \le n$, with the multiplicities $m_p = (\alpha - 1)^p \binom{n}{p}$, see e.g. [8]. The $\binom{n}{2}$ independent coupling constants $J_{ij}$ imply that $\mathcal{H}$ is contained in a $\binom{n}{2}$-dimensional subspace of the $(\alpha - 1)^2 \binom{n}{2}$-dimensional eigenspace of the third eigenvalue $\Lambda_2$. Restricting the spin orientations to a plane hence severely restricts the structure of the Hamiltonian for larger $\alpha$ in comparison to an isotropic model with the same correlation structure.

Evaluating $\mu$ for the XY-Hamiltonian on $Q_\alpha^n$ is straightforward. We find:

Theorem 2.
$$\mu(\xi) = (1 - \varrho)^{1 - \varrho} \left( \frac{\varrho}{\alpha - 1} \right)^{\varrho} \quad \text{with} \quad \varrho = \frac{\alpha - 1}{\alpha} \left( 1 - \frac{1}{\sqrt{e}} \right),$$
where $\varrho$ is the scaled correlation radius.

Proof. The average radius $R(s)$ that is reached by an unbiased walker after $s$ steps is

$$R(s) = n\, \frac{\alpha - 1}{\alpha} \left[ 1 - \left( 1 - \frac{\alpha}{\alpha - 1}\, \frac{1}{n} \right)^{s} \right], \tag{13}$$

see [2]. A short calculation with $s = \ell = n(\alpha - 1)/(2\alpha)$ then yields $\varrho$. Let $\theta(d)$ denote the number of vertices with Hamming distance $d$ from an arbitrary vertex $x$ (of course this number is independent of the starting point $x$ for a vertex transitive graph$^{1}$ such as $Q_\alpha^n$). The volume of a ball is of course $B(R) = \sum_{d=0}^{R} \theta(d)$. The distance degrees $\theta(d)$ are well known for many graphs [9]. For Hamming graphs we have $\theta(d) = (\alpha - 1)^d \binom{n}{d}$. Since $\varrho \le (\alpha - 1)/\alpha$ we have $B(n\varrho) \approx \theta(n\varrho)$ up to a multiplicative correction of at most $O(n)$. Using Stirling's approximation completes the proof. □
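The closed form of Theorem 2 is straightforward to evaluate. The sketch below is an illustration under the formulas stated above; the function names are hypothetical. It computes the scaled correlation radius, $\mu$ and $\psi = \log\mu/\log\alpha$ for Hamming graphs, and also checks that the finite-$n$ radius of Eq. (13), evaluated at $s = \ell$, approaches $n\varrho$.

```python
import math


def mean_distance_after(s, n, alpha):
    """Eq. (13): average Hamming distance reached after s steps on the Hamming graph."""
    return n * (alpha - 1) / alpha * (1.0 - (1.0 - alpha / ((alpha - 1) * n)) ** s)


def scaled_radius_hamming(alpha):
    """Scaled correlation radius of Theorem 2."""
    return (alpha - 1) / alpha * (1.0 - math.exp(-0.5))


def mu_hamming(alpha):
    """Conjectured mu of Theorem 2."""
    r = scaled_radius_hamming(alpha)
    return (1.0 - r) ** (1.0 - r) * (r / (alpha - 1)) ** r


def psi(mu, alpha):
    """psi = log(mu) / log(alpha)."""
    return math.log(mu) / math.log(alpha)


for alpha in range(2, 7):
    m = mu_hamming(alpha)
    print(alpha, round(m, 6), round(psi(m, alpha), 6))

# Finite-n check of the limit: R(ell)/n for a large number of spins.
n, alpha = 10 ** 6, 3
ell = n * (alpha - 1) / (2 * alpha)
finite_n_radius = mean_distance_after(ell, n, alpha) / n
```

For $\alpha = 2$ the result is the hypercube value $\mu \approx 0.609$ quoted for the SK model in Table 1.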
Theorem 3. The discrete XY-Hamiltonian is elementary on $C_\alpha^n$ with eigenvalue

$$\Lambda = \begin{cases} 4 & \text{if } \alpha = 2, \\ 8 \sin^2(\pi/\alpha) & \text{if } \alpha \ge 3 \end{cases}$$

for all $n \ge 2$ and all $\alpha \ge 2$. The correlation length is $\ell = n / (4 \sin^2(\pi/\alpha))$.

Proof. A simple calculation verifies that $f(x)$ is an eigenfunction of the graph Laplacian of $C_\alpha^n$. □
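The "short computations" behind Theorems 1 and 3 can be mirrored numerically on small instances. The following sketch is a brute-force check under the definitions of Eq. (12) and the two move sets; the function names, seed and system sizes are illustrative assumptions. It applies $-\Delta$ to the XY-Hamiltonian on both graph types and measures the deviation from $\Lambda \tilde f$.

```python
import itertools
import math
import random


def xy_hamiltonian(n, alpha, seed=0):
    """H(x) = sum_{i<j} J_ij cos(2 pi (x_i - x_j) / alpha) with Gaussian J_ij, as in Eq. (12)."""
    rng = random.Random(seed)
    J = {(i, j): rng.gauss(0.0, 1.0) for i in range(n) for j in range(i + 1, n)}

    def h(x):
        return sum(c * math.cos(2 * math.pi * (x[i] - x[j]) / alpha)
                   for (i, j), c in J.items())

    return h


def hamming_neighbors(x, alpha):
    """Move set (i): re-orient one spin into any other direction."""
    for i, xi in enumerate(x):
        for v in range(alpha):
            if v != xi:
                yield x[:i] + (v,) + x[i + 1:]


def cycle_neighbors(x, alpha):
    """Move set (ii): rotate one spin by +/- 2 pi / alpha (one neighbor per spin if alpha == 2)."""
    for i, xi in enumerate(x):
        for v in {(xi + 1) % alpha, (xi - 1) % alpha}:
            yield x[:i] + (v,) + x[i + 1:]


def eigen_residual(n, alpha, neighbors, eigenvalue):
    """Largest |(-Delta H~)(x) - eigenvalue * H~(x)| over all configurations."""
    h = xy_hamiltonian(n, alpha)
    vertices = list(itertools.product(range(alpha), repeat=n))
    mean = sum(h(x) for x in vertices) / len(vertices)
    return max(abs(sum(h(x) - h(y) for y in neighbors(x, alpha))
                   - eigenvalue * (h(x) - mean)) for x in vertices)


res_hamming = eigen_residual(3, 4, hamming_neighbors, 2 * 4)
res_cycle = eigen_residual(3, 5, cycle_neighbors, 8 * math.sin(math.pi / 5) ** 2)
res_ising = eigen_residual(4, 2, cycle_neighbors, 4.0)
print(res_hamming, res_cycle, res_ising)  # all numerically zero
```

A passing check on a few small cases is of course not a proof; it only guards against sign or factor errors in the stated eigenvalues.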
The eigenvalues and eigenvectors of the cycle graphs $C_\alpha$ are also well known. We have $\lambda_k = 4 \sin^2(\pi k/\alpha)$ with multiplicities $m_k = 2$, except for $k = 0$ and $k = \frac{1}{2}\alpha$ where $m_k = 1$. It is easily verified then that the XY-Hamiltonian belongs to the third eigenvalue $\Lambda_{1,1}$ of the Laplacian of the graph $C_\alpha^n$, which is $4\binom{n}{2}$-fold degenerate for $\alpha > 3$. As in the case of the Hamming graphs, the XY models do not span a complete eigenspace. However, their contribution to the full eigenspace does not decrease with increasing $\alpha$ on the graphs $C_\alpha^n$, since the XY-Hamiltonian contains $\binom{n}{2}$ i.i.d. coupling constants and hence spans a $\binom{n}{2}$-dimensional vector space.

Evaluating $\mu$ is much more complicated for $C_\alpha^n$ than for Hamming graphs. The scaled correlation length of the XY-Hamiltonian on $C_\alpha^n$ is $\xi = 1/(4 \sin^2(\pi/\alpha))$. For the scaled correlation radius we find

$$\varrho = g_\alpha(\infty) + \sum_{k \ge 1} w_{\alpha;k} \exp\left[ -\xi \left( 1 - \cos\frac{2\pi k}{\alpha} \right) \right] \tag{14}$$

for $\alpha > 3$. The coefficients $g_\alpha(\infty)$ and $w_{\alpha;k}$ are computed in Appendix A.

The graphs $C_\alpha^n$ are not distance transitive for $\alpha \ge 5$, i.e., not all pairs of vertices with given distance are equivalent. We have not succeeded in finding explicit expressions for $\theta(d)$ in this case. It is possible, however, to compute $\lim_{n \to \infty} (1/n) \log \theta(n\varrho)$: since the Cayley graph $C_\alpha^n$ is vertex transitive we may choose an arbitrary reference configuration, say all spins 'up', and classify the configurations by the numbers $n_k$, $0 \le k \le \frac{1}{2}\alpha$, of spins that differ by $\pm(2\pi k/\alpha)$ from the reference configuration. The number of such configurations can be estimated by the multinomial coefficients

$$S_n = \binom{n}{n_0 \;\; n_1/2 \;\; n_1/2 \;\; \ldots \;\; n_\kappa} \quad \text{and} \quad S_n = \binom{n}{n_0 \;\; n_1/2 \;\; n_1/2 \;\; \ldots \;\; n_\kappa/2 \;\; n_\kappa/2} \tag{15}$$

for even and odd values of $\alpha$, respectively. These terms assume that the positions are distributed evenly between $+k$ and $-k$, which gives rise to the factors $\frac{1}{2}$. Neglecting uneven distributions between $\pm k$ amounts only to a polynomial error term. All these configurations have the same distance $d = \sum_k k\, n_k$ to the reference vertex. Thus $\theta(d)$ is approximated by the largest of the above terms with given $d$, up to a multiplicative error that is at most polynomial in $n$.

1 A graph is vertex transitive if $\mathrm{Aut}[\Gamma]$ acts transitively on $V$, i.e., if for any two vertices $x$ and $y$ there is an automorphism $a$ such that $y = a(x)$. For instance, all Cayley graphs have this property, see e.g. [43].

Using the fractions $q_k \stackrel{\rm def}{=} n_k/n$ and Stirling's approximation we introduce

$$G(q) \stackrel{\rm def}{=} \lim_{n \to \infty} \frac{1}{n} \log S_n = -\left[ q_0 \log q_0 + \sum_{d=1}^{\kappa - 1} q_d \log(q_d/2) + q_\kappa \log(q_\kappa / m_\kappa) \right] \tag{16}$$

with $m_\kappa = 1$ or $2$ depending on whether $\alpha$ is even or odd, and $\kappa = \lfloor \frac{1}{2}\alpha \rfloor$. Write $G_{\max}(\delta)$ for the maximum of $G(q)$ subject to the constraints $\sum_j q_j = 1$ and $\sum_j j\, q_j = d/n = \delta$. Then $\theta_\alpha(d) \approx \exp(n\, G_{\max}(d/n))$ with a multiplicative polynomial error term. Introducing Lagrange multipliers $u$ and $v$ we have to compute the unconstrained maximum of

$$F(q; u, v) \stackrel{\rm def}{=} G(q_0, q_1, \ldots, q_\kappa) + u \left( \sum_j q_j - 1 \right) + v \left( \sum_j j\, q_j - \delta \right). \tag{17}$$

The stationary point fulfills $\log(q_k / \tilde m_k) = u - 1 + k v$, where $\tilde m_k$ is 1 or 2 according to the corresponding entries in the multinomial coefficient. Setting $s \stackrel{\rm def}{=} \exp(u - 1) > 0$ and $t \stackrel{\rm def}{=} \exp(v) > 0$, and eliminating $s$, we obtain the algebraic equation

$$\delta \sum_{k=0}^{\kappa} \tilde m_k t^k = \sum_{k=0}^{\kappa} k\, \tilde m_k t^k , \tag{18}$$

which has a unique positive solution $\hat t(\delta)$ for $0 < \delta < \kappa$ that is a monotonically increasing function of $\delta$. Thus

$$G_{\max}(\delta) = \log \sum_{k=0}^{\kappa} \tilde m_k\, \hat t(\delta)^k - \delta \log \hat t(\delta) \tag{19}$$

is a unimodal function which has its maximum $\bar\delta$ at the average distance of randomly chosen points on $C_\alpha$. Since $\varrho \le \bar\delta$ we finally obtain $\psi(\xi) = -G_{\max}(\varrho) / \log\alpha$. Note that this is an exact result because all error terms in the above derivation have been non-exponential.

Numerical estimates for the fraction $\mathcal{N}_n / \alpha^n$ of meta-stable spin configurations are straightforward but quite time consuming, at least for larger values of $\alpha$, because local optima become an exceedingly rare phenomenon. In fact, the total number of meta-stable states increases exponentially with $n$, but at a much smaller rate than the total number of configurations. Computer experiments hence are looking for the proverbial needle in the haystack. We have been able to obtain reasonably accurate estimates only for $\alpha = 3, 4, 5$, see Fig. 1. Data for $\alpha = 2$, i.e. the SK spin glass, are reported in [57]. For Hamming graphs we observe an increasing discrepancy between the numerical estimates and the predictions from the correlation length conjecture, which we attribute to the increasing deviations from isotropy for large values of $\alpha$. The $C_\alpha^n$ data agree very well with the statistical mechanics calculations in [59]. The quality of the estimate from the correlation length conjecture is not perfect, but much better than for the Hamming graphs. We find a relative error of less than 6% for $\psi$, which translates to about 10% relative error for $\mu$, see Table 2.
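The chain from Eq. (18) to $\psi$ can be followed numerically. The sketch below is hedged in two respects: the scaled correlation radius is computed from the Fourier form of the single-spin walk on the cycle, which is an assumption standing in for the Appendix A coefficients, and Eq. (18) is solved by plain bisection. All function names are hypothetical.

```python
import math


def scaled_radius_cycle(alpha):
    """Assumed form of the scaled correlation radius on C_alpha^n: mean cyclic distance of
    one spin after n*xi steps of the walk, with xi = 1 / (4 sin^2(pi / alpha))."""
    xi = 1.0 / (4.0 * math.sin(math.pi / alpha) ** 2)
    radius = 0.0
    for x in range(alpha):
        # Probability that a single spin has been rotated by x steps (Fourier sum).
        p = sum(math.cos(2 * math.pi * k * x / alpha)
                * math.exp(-xi * (1.0 - math.cos(2 * math.pi * k / alpha)))
                for k in range(alpha)) / alpha
        radius += min(x, alpha - x) * p
    return radius


def g_max(delta, alpha):
    """Eq. (19), with t from Eq. (18) found by bisection on (0, 1); needs delta below
    the average distance of random points on the cycle."""
    kappa = alpha // 2
    m = [1] + [2] * kappa          # number of cycle vertices at distance k
    if alpha % 2 == 0:
        m[kappa] = 1               # the antipodal vertex is unique for even alpha

    def mean_distance(t):
        z = sum(mk * t ** k for k, mk in enumerate(m))
        return sum(k * mk * t ** k for k, mk in enumerate(m)) / z

    lo, hi = 0.0, 1.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mean_distance(mid) < delta:
            lo = mid
        else:
            hi = mid
    t = 0.5 * (lo + hi)
    return math.log(sum(mk * t ** k for k, mk in enumerate(m))) - delta * math.log(t)


def psi_cycle(alpha):
    """psi = -G_max(scaled radius) / log(alpha)."""
    return -g_max(scaled_radius_cycle(alpha), alpha) / math.log(alpha)


for alpha in range(2, 7):
    print(alpha, round(psi_cycle(alpha), 6))
```

For $\alpha = 2, 3, 4$ the graphs $C_\alpha^n$ coincide with Hamming graphs, so these cases can be cross-checked against the Hamming-graph formula; for larger $\alpha$ the output depends on the assumed form of the radius and may differ from tabulated values in the later digits.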
Fig. 1. Numerical estimates for the number of meta-stable states of the XY-Hamiltonian, plotted against the number of spins $n$. The dotted lines indicate $2\sigma$ errors on the individual estimates, the solid line is a least squares fit. (a) Hamming graphs. Data for $\alpha = 4$ and $\alpha = 5$ are offset by $-4$ and $-8$, respectively. (b) Direct products of cycles. Data for $\alpha = 5$ are offset by $-4$.

Table 2
Local minima of the XY-Hamiltonian

| Graph | $\alpha$ | $\mu(\xi)$ | $\psi$ | Statistical mechanics | Numerical |
|---|---|---|---|---|---|
| $Q_\alpha^n$ | 2 | 0.609058 | -0.715349 | -0.71257 (a) [6,11,58] | -0.7144 ± 0.0011 [57] |
| $Q_\alpha^n$ | 3 | 0.468938 | -0.689309 | -0.69870 (b) [59] | -0.706 ± 0.002 |
| $Q_\alpha^n$ | 4 | 0.394215 | -0.671473 | | -0.721 ± 0.003 |
| $Q_\alpha^n$ | 5 | 0.346719 | -0.658143 | | -0.749 ± 0.008 |
| $Q_\alpha^n$ | 6 | 0.313362 | -0.647629 | | |
| $C_\alpha^n$ | 2 | 0.609058 | -0.715349 | -0.71257 (a) [6,11,58] | -0.7144 ± 0.0011 [57] |
| $C_\alpha^n$ | 3 | 0.468938 | -0.689309 | -0.69870 (b) [59] | -0.706 ± 0.002 |
| $C_\alpha^n$ | 4 | 0.370952 | -0.715349 | -0.71256 (b) [59] | -0.714 ± 0.004 |
| $C_\alpha^n$ | 5 | 0.299164 | -0.749805 | -0.72918 (b) [59] | -0.733 ± 0.007 |
| $C_\alpha^n$ | 6 | 0.249301 | -0.775269 | -0.74383 (b) [59] | |
| $C_\alpha^n$ | 7 | 0.212308 | -0.796398 | -0.75699 (b) [59] | |
| $C_\alpha^n$ | 8 | 0.185438 | -0.810330 | -0.76795 (b) [59] | |
| $C_\alpha^n$ | 9 | 0.164283 | -0.822021 | -0.77725 (b) [59] | |
| $C_\alpha^n$ | 10 | 0.147908 | -0.830008 | -0.78525 (b) [59] | |

a Exact result.
b From a cumulant expansion, not exact.

6. Discussion

Many important examples of landscapes are elementary, i.e. (up to an additive constant) they fulfill a discrete analogue of the Helmholtz equation $\Delta f + \Lambda f = 0$. Examples include Derrida's p-spin models and the landscapes of the best known combinatorial optimization problems such as the traveling salesman problem. Elementary landscapes exhibit a characteristic distribution of local optima on the configuration space which depends crucially on the corresponding eigenvalue $\Lambda$ of the Laplacian. In particular, the location of $\Lambda$ in the spectrum of the Laplace operator determines the maximum number of nodal domains, i.e. the maximum number of disconnected islands of configurations on which $f$ has a value below average [12]. The nearest neighbor correlation is directly linked to the eigenvalue $\Lambda$, which can therefore serve as a measure of the ruggedness.

In this contribution we have investigated the discrete XY-Hamiltonian on two different types of configuration spaces corresponding to the two natural definitions of neighborhood between spin configurations: the re-orientation of a single spin into an arbitrary direction gives rise to the Hamming graphs $Q_\alpha^n$, while changing the current orientation of a single spin by the smallest possible angle, $\pm 2\pi/\alpha$, yields a direct product of cycles $C_\alpha^n$. The XY-Hamiltonian is elementary on both $Q_\alpha^n$ and $C_\alpha^n$. In both cases it belongs to the third-smallest eigenvalue, i.e., the second "excited state", just like most of the combinatorial optimization problems studied so far [55].

The notion of (statistical) isotropy was introduced from a purely geometric point of view [54,60]. It is an important property from a practical point of view because it has become clear in computer simulations [21,22] that plateaus, ridges, and other anisotropies strongly influence adaptation on a landscape [18,44]. Isotropy translates to the notion of a "typical landscape" because of the following observation: given an orthonormal basis $\{\varphi_k\}$ of the eigenspace belonging to the eigenvalue $\Lambda$ of the graph Laplacian we can write an elementary model in the form $f = \bar f + \sum_k a_k \varphi_k$. This model is isotropic if and only if the coefficients $a_k$ are uncorrelated random variables. In general, it is possible to show that a Gaussian isotropic random field is exactly the maximum entropy model with a prescribed autocorrelation function.

The "correlation length conjecture" [50] suggests that the number of local optima of a "typical" landscape can be estimated from its correlation length $\ell$. More precisely, one expects on the order of one local optimum on a mountain with a radius that is determined by the correlation length $\ell$.
Recent numerical surveys provided good evidence that the "correlation length conjecture" yields a fairly accurate prediction of the number of local optima (meta-stable states) of isotropic elementary random field models [32,36,50,57]. The XY model is not quite isotropic on either type of configuration space. The deviations from isotropy increase with $\alpha$ on the Hamming graphs, and so do the discrepancies between the prediction of the correlation length conjecture and numerical estimates of the number of local optima. For the graphs $C_\alpha^n$ the relative deviation from isotropy (as measured by the fraction of independent coefficients) does not depend on $\alpha$. The quality of the predictions obtained by the correlation length conjecture is reasonable but not as good as for isotropic models.
Other spin glass like models that are known to deviate from the conjecture also have strongly constrained Fourier coefficients $a_k$. For instance, in short-range Ising spin glasses most of the coupling coefficients are zero [7], and the graph matching problem can be treated as a TSP with a severely constrained distance matrix [31]. The data presented in this contribution suggest that there should be a quantitative relation between the deviations from the "correlation length conjecture" and the "degree of anisotropy" as measured by the constraints on the Fourier coefficients $a_k$. For Ising models it has been found [7] that the deviation is proportional to $1/z$, where $z$ is the number of non-zero coupling constants per spin. Note that $1/z$ can be regarded as a measure for the deviations from isotropy. More work will be necessary in order to establish a detailed understanding of different aspects of ruggedness in landscapes and their random field models.
Acknowledgements
Work on this study began while PFS was a guest of UNAM in Mexico City in summer 1995. Thanks to the colleagues at the Instituto de Física and the Instituto de Ciencias Nucleares for stimulating discussions and for their hospitality. RG-P wants to thank the Departamento de Física Fundamental, UNED, Madrid, for their hospitality during the final stages of this work.
Appendix A
We outline here a rather general procedure for computing the expected distance $R(s)$ traveled by a simple random walk of $s$ steps on the graph product $\mathcal{X}^n$ of $n$ identical copies of a $K$-regular graph $\mathcal{X}$. The distance in $\mathcal{X}^n$ equals the sum of the distances in the individual factors. We start with a random walk on $\mathcal{X}$ that pauses with probability $1-\eta$ and moves to each of the $K$ neighbors in $\mathcal{X}$ with probability $\eta/K$. The transition matrix of this random walk is
$$T_\eta = (1-\eta)\, I + \frac{\eta}{K} A_{\mathcal{X}} = I - \frac{\eta}{K}\,(-\Delta_{\mathcal{X}}), \tag{A.1}$$
where $A_{\mathcal{X}}$ is the adjacency matrix of $\mathcal{X}$, $\Delta_{\mathcal{X}} \stackrel{\rm def}{=} A_{\mathcal{X}} - K I$, and $-\Delta_{\mathcal{X}}$ is the Laplacian of $\mathcal{X}$. Let $\delta(\eta, s)$ denote the average distance from the origin at time step $s$ of this walk. A simple random walk on $\mathcal{X}^n$ performs a step in a particular factor $\mathcal{X}$ with probability $1/n$ at each given time-step. Thus the expected distance after $s$ steps within a given factor $\mathcal{X}$ is exactly $\delta(1/n, s)$. Since there are $n$ identical factors we have $R(s) = n\,\delta(1/n, s)$. The next step is to derive an explicit expression for $\delta(\eta, s)$. Let $\{\varphi_i\}$ denote a complete set of eigenvectors of $A_{\mathcal{X}}$, with associated eigenvalues $\mu_i$. Furthermore let 0 denote the starting point of the random walk, and let $p_{xs}$ denote the probability that the walk is in position $x$ at time $s$. Then
$$p_{xs} = \sum_y [T^s]_{xy}\, \delta_{0,y} = \sum_y [T^s]_{xy} \sum_i \frac{\varphi_i(0)}{\|\varphi_i\|^2}\, \varphi_i(y) = \sum_i \frac{\varphi_i(0)}{\|\varphi_i\|^2}\, \tau_i^s\, \varphi_i(x), \tag{A.2}$$
where $\tau_i \stackrel{\rm def}{=} (1-\eta) + (\eta/K)\mu_i = 1 - \eta\lambda_i/K$ is the eigenvalue of the transition matrix $T$ belonging to the eigenvector $\varphi_i$ of the graph Laplacian $-\Delta_{\mathcal{X}}$. As usual, $\delta_{i,j}$ is Kronecker's symbol. The norm of the eigenvectors is of course $\|\varphi_i\|^2 = \sum_x \varphi_i(x)^2$. Eq. (A.2) was used in a different context in [52]. The expected distance $\delta(\eta, s)$ is then given by $\delta(\eta, s) = \sum_x p_{xs}\, d_{\mathcal{X}}(0, x)$.
In order to use this formalism for the graphs $C_\alpha^n$ we need to compute the series expansion of the Kronecker delta in terms of the eigenfunctions of the cycle graphs $C_\alpha$. Let us label the vertices by $0, \pm 1, \ldots$; depending on whether $\alpha$ is even or odd we have either one vertex labeled $\frac{1}{2}\alpha$ or two vertices labeled $\pm\frac{1}{2}(\alpha - 1)$ at the maximum distance from 0. We have encountered the eigenvalues and eigenvectors of the cycle graphs in Section 5. The eigenvalue $\mu_0$ with eigenvector $\mathbf{1}$ and the eigenvalue $\mu_{\alpha/2}$ (for even $\alpha$) are simple. All the other eigenvalues have multiplicity $m_k = 2$. Substituting eigenvalues, their multiplicities, and $K = 2$ (for $\alpha > 3$) yields
$$\tau_k = 1 - \eta\left[1 - \cos\left(\frac{2\pi k}{\alpha}\right)\right] \tag{A.3}$$
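The eigenvalues of the transition matrix on a single cycle can be checked numerically. The sketch below is a minimal illustration in which all function names are hypothetical: it applies one step of the lazy walk (stay with probability 1 - eta, otherwise move to one of the two neighbors) to the cosine vector with wave number k and measures how far the result is from tau_k times that vector, with tau_k = 1 - eta (1 - cos(2 pi k / alpha)).

```python
import math


def lazy_cycle_step(v, eta):
    """Apply the lazy-walk transition matrix on the cycle to the vector v.

    The matrix is symmetric, so the same routine propagates probability
    distributions and acts on eigenvectors.
    """
    alpha = len(v)
    return [
        (1.0 - eta) * v[x] + 0.5 * eta * (v[(x - 1) % alpha] + v[(x + 1) % alpha])
        for x in range(alpha)
    ]


def tau(k, alpha, eta):
    """Eigenvalue of the transition matrix for wave number k."""
    return 1.0 - eta * (1.0 - math.cos(2.0 * math.pi * k / alpha))


def eigen_residual(k, alpha, eta):
    """max_x |(T cos_k)(x) - tau_k cos_k(x)| for cos_k(x) = cos(2 pi k x / alpha)."""
    v = [math.cos(2.0 * math.pi * k * x / alpha) for x in range(alpha)]
    tv = lazy_cycle_step(v, eta)
    t = tau(k, alpha, eta)
    return max(abs(tv[x] - t * v[x]) for x in range(alpha))


if __name__ == "__main__":
    for alpha in (3, 4, 7):
        print(alpha, max(eigen_residual(k, alpha, 0.25) for k in range(alpha)))
```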
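for the eigenvalues of the transition matrix. Eq. (A.2) thus becomes
$$p_{xs} = \frac{1}{\alpha}\,\tau_0^s + \sum_{k=1}^{(\alpha-1)/2} \frac{\sin(0)}{\|\sin_k\|^2}\, \sin\left(\frac{2\pi k}{\alpha}x\right) \tau_k^s + \sum_{k=1}^{\alpha/2} \frac{\cos(0)}{\|\cos_k\|^2}\, \cos\left(\frac{2\pi k}{\alpha}x\right) \tau_k^s.$$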
The sine-terms vanish of course. The eigenvectors of a distance regular graph with $N$ vertices fulfill [4, p. 167]
$$\frac{\|\varphi_k\|^2}{\varphi_k(0)^2} = \frac{N}{m_k}. \tag{A.4}$$
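Thus we have
$$p_{xs} = \frac{1}{\alpha}\,\tau_0^s + \sum_{k=1}^{\alpha/2} \frac{m_k}{\alpha}\, \cos\left(\frac{2\pi k}{\alpha}x\right) \tau_k^s \tag{A.5}$$
and therefore
$$\delta(\eta, s) = \sum_x |x|\, p_{xs} = \frac{1}{\alpha}\sum_x |x| + \sum_{k=1}^{\alpha/2} \frac{m_k}{\alpha} \left[\sum_x |x| \cos\left(\frac{2\pi k}{\alpha}x\right)\right] \tau_k^s. \tag{A.6}$$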
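The cosine expansion of the occupation probabilities can be compared with a direct iteration of the walk. The following sketch is only an illustration with hypothetical function names: it evaluates the spectral sum with weights m_k / alpha (m_k = 1 for k = alpha/2 on even cycles, m_k = 2 otherwise, and tau_0 = 1 for the constant term) and checks it against the distribution obtained by applying the transition matrix s times to a walker started at vertex 0.

```python
import math


def tau(k, alpha, eta):
    """Transition-matrix eigenvalue for wave number k on the cycle."""
    return 1.0 - eta * (1.0 - math.cos(2.0 * math.pi * k / alpha))


def p_spectral(x, s, alpha, eta):
    """Probability of being at vertex x after s steps, from the cosine expansion."""
    total = 1.0 / alpha  # k = 0 term, since tau_0 = 1
    for k in range(1, alpha // 2 + 1):
        m_k = 1 if 2 * k == alpha else 2  # multiplicity of the eigenvalue
        total += (m_k / alpha) * math.cos(2.0 * math.pi * k * x / alpha) * tau(k, alpha, eta) ** s
    return total


def p_iterated(s, alpha, eta):
    """Distribution after s steps, obtained by iterating the lazy walk from vertex 0."""
    p = [0.0] * alpha
    p[0] = 1.0
    for _ in range(s):
        p = [
            (1.0 - eta) * p[x] + 0.5 * eta * (p[(x - 1) % alpha] + p[(x + 1) % alpha])
            for x in range(alpha)
        ]
    return p


def max_discrepancy(s, alpha, eta):
    """Largest difference between the two ways of computing the distribution."""
    p = p_iterated(s, alpha, eta)
    return max(abs(p[x] - p_spectral(x, s, alpha, eta)) for x in range(alpha))


if __name__ == "__main__":
    for alpha in (3, 6, 9):
        print(alpha, max_discrepancy(12, alpha, 0.4))
```

Vertices are indexed 0, ..., alpha - 1 here rather than by 0, ±1, ...; since the cosine is periodic in x with period alpha, both labelings give the same values.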
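The brackets can be evaluated using Eq. (1.314) in [26]. One obtains different expressions for odd and even values of $\alpha$, respectively:
$$\varpi_{\alpha;k} \stackrel{\rm def}{=} \sum_x |x| \cos\left(\frac{2\pi k}{\alpha}x\right) = -\frac{1}{1 - \cos(2k\pi/\alpha)} \times \begin{cases} 1 - (-1)^k \cos(k\pi/\alpha) & \text{for odd } \alpha, \\ 1 - (-1)^k & \text{for even } \alpha. \end{cases}$$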
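For odd values of $\alpha$ the evaluation of the bracket can be spelled out as follows. This is a sketch that uses the standard closed form for $\sum_j j\cos(j\theta)$; the shorthand $m = (\alpha-1)/2$ and $\theta = 2\pi k/\alpha$ is introduced here only for this purpose and is not notation from the paper. The result is negative for every $k \ge 1$, as it must be because the expected distance vanishes at $s = 0$.

```latex
% Worked evaluation for odd \alpha = 2m + 1, with \theta = 2\pi k/\alpha
% (m and \theta are shorthand introduced for this sketch).
\begin{align*}
\sum_{x=-m}^{m} |x| \cos(x\theta)
  &= 2 \sum_{j=1}^{m} j \cos(j\theta)
   = \frac{(m+1)\cos(m\theta) - m\cos\bigl((m+1)\theta\bigr) - 1}{1 - \cos\theta}. \\
\intertext{Since $m\theta = k\pi - k\pi/\alpha$ and $(m+1)\theta = k\pi + k\pi/\alpha$,
both cosines equal $(-1)^k \cos(k\pi/\alpha)$, so}
\sum_{x=-m}^{m} |x| \cos(x\theta)
  &= -\,\frac{1 - (-1)^k \cos(k\pi/\alpha)}{1 - \cos(2k\pi/\alpha)}.
\end{align*}
```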
For the constant term we find $\delta(\infty) = (\alpha^2 - 1)/(4\alpha)$ for odd $\alpha$ and $\delta(\infty) = \frac{1}{4}\alpha$ for even $\alpha$, respectively. Setting
$$w_{\alpha;k} = (m_k/\alpha)\, \varpi_{\alpha;k}$$
we obtain the desired expansion for the distance on the cycle graph
$$\delta(\eta, s) = \delta(\infty) + \sum_{k=1}^{\alpha/2} w_{\alpha;k}\, \tau_k^s. \tag{A.7}$$
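The expansion of the expected distance can be cross-checked against a brute-force computation. The sketch below uses hypothetical function names and assumes that the bracket sum over |x| cos(2 pi k x / alpha) carries a leading minus sign, which is what direct summation gives. It compares the closed form with the expected distance obtained by iterating the lazy walk on the cycle from vertex 0.

```python
import math


def tau(k, alpha, eta):
    """Transition-matrix eigenvalue for wave number k on the cycle."""
    return 1.0 - eta * (1.0 - math.cos(2.0 * math.pi * k / alpha))


def varpi(k, alpha):
    """Closed form of sum_x |x| cos(2 pi k x / alpha) over the cycle, for k >= 1."""
    sign = -1.0 if k % 2 else 1.0  # (-1)^k
    if alpha % 2:
        numerator = 1.0 - sign * math.cos(math.pi * k / alpha)
    else:
        numerator = 1.0 - sign
    return -numerator / (1.0 - math.cos(2.0 * math.pi * k / alpha))


def delta_inf(alpha):
    """Mean distance from vertex 0 under the uniform (stationary) distribution."""
    return (alpha * alpha - 1) / (4.0 * alpha) if alpha % 2 else alpha / 4.0


def delta_closed(eta, s, alpha):
    """Expected distance after s steps from the spectral expansion."""
    total = delta_inf(alpha)
    for k in range(1, alpha // 2 + 1):
        m_k = 1 if 2 * k == alpha else 2
        total += (m_k / alpha) * varpi(k, alpha) * tau(k, alpha, eta) ** s
    return total


def delta_direct(eta, s, alpha):
    """Expected distance after s steps by iterating the distribution."""
    p = [0.0] * alpha
    p[0] = 1.0
    for _ in range(s):
        p = [
            (1.0 - eta) * p[x] + 0.5 * eta * (p[(x - 1) % alpha] + p[(x + 1) % alpha])
            for x in range(alpha)
        ]
    # distance on the cycle between vertex 0 and vertex x
    return sum(min(x, alpha - x) * p[x] for x in range(alpha))


if __name__ == "__main__":
    for alpha in (3, 4, 5, 8):
        print(alpha, delta_closed(0.3, 10, alpha), delta_direct(0.3, 10, alpha))
```

For the product of n cycles, the quantity R(s) = n delta(1/n, s) would then follow from `delta_closed(1.0 / n, s, alpha)` multiplied by n.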
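The scaled correlation radius finally becomes
$$\lim_{n\to\infty} \delta(1/n, n\tilde{s}) = \delta(\infty) + \sum_{k=1}^{\alpha/2} w_{\alpha;k} \exp\left(-\left[1 - \cos\left(\frac{2\pi k}{\alpha}\right)\right]\tilde{s}\right). \tag{A.8}$$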
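The limit of many factors can be illustrated numerically. In the sketch below, which uses hypothetical function names and writes the scaled step variable as `s_tilde`, the finite-n expression with eta = 1/n and s = n * s_tilde steps is compared with the exponential form; the two should approach each other as n grows because (1 - a/n)^(n s_tilde) tends to exp(-a s_tilde).

```python
import math


def varpi(k, alpha):
    """Closed form of sum_x |x| cos(2 pi k x / alpha) over the cycle, for k >= 1."""
    sign = -1.0 if k % 2 else 1.0  # (-1)^k
    if alpha % 2:
        numerator = 1.0 - sign * math.cos(math.pi * k / alpha)
    else:
        numerator = 1.0 - sign
    return -numerator / (1.0 - math.cos(2.0 * math.pi * k / alpha))


def delta_inf(alpha):
    """Mean distance from vertex 0 under the uniform distribution on the cycle."""
    return (alpha * alpha - 1) / (4.0 * alpha) if alpha % 2 else alpha / 4.0


def weights(alpha):
    """Pairs (a_k, w_k) with a_k = 1 - cos(2 pi k / alpha) and w_k = (m_k / alpha) varpi_k."""
    result = []
    for k in range(1, alpha // 2 + 1):
        m_k = 1 if 2 * k == alpha else 2
        a_k = 1.0 - math.cos(2.0 * math.pi * k / alpha)
        result.append((a_k, (m_k / alpha) * varpi(k, alpha)))
    return result


def delta_finite(n, s_tilde, alpha):
    """delta(1/n, n * s_tilde): per-factor expected distance for n factors."""
    steps = int(round(n * s_tilde))
    return delta_inf(alpha) + sum(w * (1.0 - a / n) ** steps for a, w in weights(alpha))


def delta_scaled(s_tilde, alpha):
    """Limit n -> infinity of delta(1/n, n * s_tilde)."""
    return delta_inf(alpha) + sum(w * math.exp(-a * s_tilde) for a, w in weights(alpha))


if __name__ == "__main__":
    for n in (10, 100, 10000):
        print(n, delta_finite(n, 1.5, 5), delta_scaled(1.5, 5))
```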
References
[1] G. Arfken, Mathematical Methods for Physicists, 2nd Ed. (Academic Press, New York, 1979).
[2] S. Baskaran, P.F. Stadler and P. Schuster, Approximate scaling properties of RNA free energy landscapes, J. Theor. Biol. 181 (1996) 299-310.
[3] J. Besag, Spatial interactions and the statistical analysis of lattice systems, Am. Math. Monthly 81 (1974) 192-236.
[4] N.J. Biggs, Algebraic Graph Theory, 2nd Ed. (Cambridge University Press, Cambridge, 1993).
[5] N.J. Biggs, Potential Theory on Graphs, CDAM Research report, LSE-MPS-95, Centre for Discrete and Applicable Mathematics, London School of Economics (1995).
[6] A.J. Bray and M.A. Moore, Metastable states in spin glasses, J. Phys. C 13 (1980) L469-L476.
[7] A.J. Bray and M.A. Moore, Metastable states in spin glasses with short-ranged interactions, J. Phys. C 14 (1981) 1313-1327.
[8] A.E. Brouwer, A.M. Cohen and A. Neumaier, Distance-regular Graphs (Springer, Berlin, 1989).
[9] F. Buckley and F. Harary, Distance in Graphs (Addison-Wesley, Redwood City, CA, 1990).
[10] D.M. Cvetković, M. Doob and H. Sachs, Spectra of Graphs - Theory and Applications (Academic Press, New York, 1980).
[11] C. De Dominicis, M. Gabay, T. Garel and H. Orland, White and weighted averages over solutions of Thouless Anderson Palmer equations for the Sherrington-Kirkpatrick spin glass, J. Phys. 41 (1980) 923-930.
[12] Y.C. De Verdière, Multiplicités des valeurs propres. Laplaciens discrets et laplaciens continus, Rend. Mat. Appl. 13 (1993) 433-460.
[13] B. Derrida, Random energy model: Limit of a family of disordered models, Phys. Rev. Lett. 45 (1980) 79-82.
[14] B. Derrida, The random energy model, Phys. Rep. 67 (1980) 29-35.
[15] B. Derrida, Random-energy model: An exactly solvable model of disordered systems, Phys. Rev. B 24 (5) (1981) 2613-2626.
[16] B. Derrida and E. Gardner, Metastable states of a spin glass chain at 0 temperature, J. Phys. 47 (1986) 959-965.
[17] A.W.M. Dress and D.S. Rumschitzki, Evolution on sequence space and tensor products of representation spaces, Acta Appl. Math. 11 (1988) 103-111.
[18] M. Eigen, J. McCaskill and P. Schuster, The molecular quasispecies, Adv. Chem. Phys. 75 (1989) 149-263.
[19] M. Eigen and P. Schuster, The Hypercycle A: A principle of natural self-organization: Emergence of the hypercycle, Naturwissenschaften 64 (1977) 541-565.
[20] W. Fontana, T. Griesmacher, W. Schnabl, P.F. Stadler and P. Schuster, Statistics of landscapes based on free energies, replication and degradation rate constants of RNA secondary structures, Monatsh. Chemie 122 (1991) 795-819.
[21] W. Fontana, W. Schnabl and P. Schuster, Physical aspects of evolutionary optimization and adaptation, Phys. Rev. A 40 (1989) 3301-3321.
[22] W. Fontana and P. Schuster, A computer model of evolutionary optimization, Biophys. Chem. 26 (1987) 123-147.
[23] W. Fontana, P.F. Stadler, E.G. Bornberg-Bauer, T. Griesmacher, I.L. Hofacker, M. Tacker, P. Tarazona, E.D. Weinberger and P. Schuster, RNA folding and combinatory landscapes, Phys. Rev. E 47 (1993) 2083-2099.
[24] Y. Fu and P.W. Anderson, Application of statistical mechanics to NP-complete problems in combinatorial optimization, J. Phys. A 19 (1986) 1605-1620.
[25] L. Fuchs, Abelian Groups (Pergamon Press, Oxford, 1960).
[26] I. Gradshteyn and I.M. Ryshik, Tables of Series, Products, and Integrals, German/English translation, 2 volumes, 5th Ed. (Verlag Harri Deutsch, Thun, Germany, 1981).
[27] L.K. Grover, Local search and the local structure of NP-complete problems, Oper. Res. Lett. 12 (1992) 235-243.
[28] R. Happel and P.F. Stadler, Canonical approximation of fitness landscapes, Complexity 2 (1996) 53-58.
[29] S.A. Kauffman, The Origins of Order (Oxford University Press, Oxford, NY, 1993).
[30] S.A. Kauffman and S. Levin, Towards a general theory of adaptive walks on rugged landscapes, J. Theor. Biol. 128 (1987) 11.
[31] B. Krakhofer, Local optima in landscapes of combinatorial optimization problems, Dept. of Theoretical Chemistry, University of Vienna (1995).
[32] B. Krakhofer and P.F. Stadler, Local minima in the graph bipartitioning problem, Europhys. Lett. 34 (1996) 85-90.
[33] W. Kern, On the depth of combinatorial optimization problems, Discr. Appl. Math. 43 (1993) 115-129.
[34] L. Lovász, Spectra of graphs with transitive groups, Periodica Math. Hung. 6 (1975) 191-195.
[35] C.A. Macken, P.S. Hagan and A.S. Perelson, Evolutionary walks on rugged landscapes, SIAM J. Appl. Math. 51 (1991) 799-827.
[36] C.A. Macken and P.F. Stadler, Evolution on fitness landscapes, in: 1993 Lectures in Complex Systems, eds. L. Nadel and D.L. Stein, SFI Studies in the Sciences of Complexity, Vol. VI (Addison-Wesley, Reading, MA, 1995) pp. 43-86.
[37] B. Manderick, M. de Weger and P. Spiessen, The genetic algorithm and the structure of the fitness landscape, in: Proc. 4th Int. Conf. Genetic Algorithms, eds. R.K. Belew and L.B. Booker (Morgan Kaufmann, Los Altos, CA, 1991).
[38] B. Mohar, The Laplacian spectrum of graphs, in: Graph Theory, Combinatorics, and Applications, eds. Y. Alavi, G. Chartrand, O.R. Ollermann and A.J. Schwenk (Wiley, New York, 1991) pp. 871-897.
[39] B. Mohar and S. Poljak, Eigenvalues in combinatorial optimization, in: Combinatorial and Graph-Theoretical Problems in Linear Algebra, eds. R.A. Brualdi, S. Friedland and V. Klee, IMA Volumes in Mathematics and Its Applications, Vol. 50 (Springer, Berlin, 1993) pp. 107-151.
[40] R. Palmer, Optimization on rugged landscapes, in: Molecular Evolution on Rugged Landscapes: Proteins, RNA, and the Immune System, eds. A.S. Perelson and S.A. Kauffman (Addison-Wesley, Redwood City, CA, 1991) pp. 3-25.
[41] D.S. Rumschitzki, Spectral properties of Eigen's evolution matrices, J. Math. Biol. 24 (1987) 667-680.
[42] J. Ryan, The depth and width of local minima in discrete solution spaces, Discr. Appl. Math. 56 (1995) 75-82.
[43] G. Sabidussi, Vertex transitive graphs, Mh. Math. 68 (1994) 426-438.
[44] P. Schuster and P.F. Stadler, Landscapes: Complex optimization problems and biopolymer structures, Computers Chem. 18 (1994) 295-314.
[45] J.-P. Serre, Linear Representations of Finite Groups (Springer, New York, 1977).
[46] D. Sherrington and S. Kirkpatrick, Solvable model of a spin-glass, Phys. Rev. Lett. 35 (26) (1975) 1792-1795.
[47] G.B. Sorkin, Combinatorial optimization, simulated annealing, and fractals, IBM Research Report, RC13674 (No. 61253) (1988).
[48] F. Spitzer, Principles of Random Walks (Springer, New York, 1976).
[49] P.F. Stadler and R. Happel, Correlation structure of the landscape of the graph-bipartitioning-problem, J. Phys. A 25 (1992) 3103-3110.
[50] P.F. Stadler and W. Schnabl, The landscape of the traveling salesman problem, Phys. Lett. A 161 (1992) 337-344.
[51] P.F. Stadler, Correlation in landscapes of combinatorial optimization problems, Europhys. Lett. 20 (1992) 479-482.
[52] P.F. Stadler, Linear operators on correlated landscapes, J. Phys. I France 4 (1994) 681-696.
[53] P.F. Stadler, Random walks and orthogonal functions associated with highly symmetric graphs, Disc. Math. 145 (1995) 229-238.
[54] P.F. Stadler, Towards a theory of landscapes, in: Complex Systems and Binary Networks, eds. R. López-Peña, R. Capovilla, R. García-Pelayo, H. Waelbroeck and F. Zertuche (Springer, Berlin, 1995) pp. 77-163.
[55] P.F. Stadler, Landscapes and their correlation functions, J. Math. Chem. 20 (1996) 1-45.
[56] P.F. Stadler and R. Happel, Random field models for fitness landscapes, J. Math. Biol. (1997), SFI preprint 95-07-069, in press.
[57] P.F. Stadler and B. Krakhofer, Local minima of p-spin models, Rev. Mex. Fis. 42 (1996) 355-363.
[58] F. Tanaka and S.F. Edwards, Analytic theory of ground state properties of a spin glass: I. Ising spin glass, J. Phys. F 10 (1980) 2769-2778.
[59] F. Tanaka and S.F. Edwards, Analytic theory of ground state properties of a spin glass: II. XY spin glass, J. Phys. F 10 (1980) 2779-2792.
[60] E.D. Weinberger, Correlated and uncorrelated fitness landscapes and how to tell the difference, Biol. Cybern. 63 (1990) 325-336.
[61] E.D. Weinberger, Fourier and Taylor series on fitness landscapes, Biol. Cybern. 65 (1991) 321-330.
[62] E.D. Weinberger and P.F. Stadler, Why some fitness landscapes are fractal, J. Theoret. Biol. 163 (1993) 255-275.
[63] S. Wright, The roles of mutation, inbreeding, crossbreeding and selection in evolution, in: Proc. 6th Int. Cong. on Genetics, ed. D.F. Jones, Vol. 1 (1932) pp. 356-366.