An exact D-dimensional Tsallis random number generator for generalized simulated annealing


Computer Physics Communications 175 (2006) 708–712 www.elsevier.com/locate/cpc

Thomas Schanze ¹

Applied Physics—NeuroPhysics Group, Department of Physics, Philipps University, Renthof 7, D-35037 Marburg, Germany

Received 6 February 2006; received in revised form 26 July 2006; accepted 26 July 2006. Available online 7 September 2006.

Abstract

Generalized simulated annealing is an important method for finding the global minimum of a function. However, it requires the generation of Tsallis random numbers. Here we show that a D-dimensional Tsallis random variable is a multivariate t or Student distributed random variable. Using this finding we design an exact Tsallis random number generator and, finally, present some applications of this generator to generalized simulated annealing.

© 2006 Elsevier B.V. All rights reserved.

PACS: 02.70.-c; 02.50.-r; 05.10.-a

Keywords: Generalized simulated annealing; Optimization; Random number generator; Tsallis probability distribution

1. Introduction

Global multi-dimensional function minimization is an important problem in applied mathematics, physics, chemistry, biology, economics, engineering, neural networks, etc. [1,2]. Although many methods have been developed over the years to solve this non-trivial problem, simulated annealing algorithms have become very popular because they can be very efficient. The fundamental idea behind simulated annealing is a cooling process that minimizes the thermodynamical energy of a system in such a way that a global minimum is likely to be reached [3]. Hence we have a thermodynamical system that can be described by a D-dimensional state space variable x = (x_1, x_2, ..., x_D)ᵀ, energy E(x) and temperature T. In thermal equilibrium the system's probability of being in a possible state is

proportional to the Gibbs factor:

$$
P(x_t) \propto e^{-E(x_t)/T}. \qquad (1)
$$
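To illustrate how the Gibbs factor of Eq. (1) freezes out high-energy states as the temperature drops, the following sketch (plain Python; the function name is our own) compares the relative occupation probability of a high-energy state against a low-energy state at two temperatures:

```python
import math

def gibbs_ratio(e_high, e_low, temperature):
    """Relative probability of the high-energy state vs. the low-energy
    state under the Gibbs factor P(x) ~ exp(-E(x)/T)."""
    return math.exp(-(e_high - e_low) / temperature)

# At high temperature both states are nearly equally likely...
print(gibbs_ratio(1.0, 0.0, 100.0))  # close to 1
# ...but as T -> 0 the high-energy state is exponentially suppressed.
print(gibbs_ratio(1.0, 0.0, 0.1))    # close to 0
```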

Thus states with high energy can appear at high temperatures, but if T → 0, only states near the minimum of E(x) have a non-vanishing probability of appearing. To implement simulated annealing algorithmically we require three steps [3–6]. The first step is the generation of states x_t at discrete time steps t = 1, 2, .... This state space sampling can be obtained by means of random numbers obeying a visiting distribution. For the Tsallis sampling [6] the visiting distribution is

$$
g_{q_V}(\Delta x_t) = \left(\frac{q_V-1}{\pi}\right)^{D/2}
\frac{\Gamma\!\left(\frac{1}{q_V-1}+\frac{D-1}{2}\right)}{\Gamma\!\left(\frac{1}{q_V-1}-\frac{1}{2}\right)}
\frac{[T^V_{q_V}]^{-D/(3-q_V)}}{\left[1+(q_V-1)\dfrac{(\Delta x_t)^2}{[T^V_{q_V}]^{2/(3-q_V)}}\right]^{1/(q_V-1)+(D-1)/2}} \qquad (2)
$$

with state update Δx_t = x_{t+1} − x_t, visiting temperature T^V_{q_V} and visiting distribution shape parameter 1 < q_V < 3.

---
✩ Parts of the work were finished at the Institute for Technology and Development of Medical Products (ITEMP), RWTH Aachen University, Pauwelsstraße 30, D-52074 Aachen, Germany.
E-mail addresses: [email protected], [email protected] (T. Schanze).
¹ Present address: CORRSYS 3D Sensors AG, Charlotte-Bamberg-Straße 6, D-35578 Wetzlar, Germany.
doi:10.1016/j.cpc.2006.07.012
---

The second step is the acceptance of states: if the energy of the new state is lower than the energy of the old state, then the new state will


be accepted, but if not, then the new state will be accepted randomly. The last property is important, since it helps to avoid trapping in a suboptimal state. The Tsallis acceptance probability reads:

$$
P_{q_A}(x_t \to x_{t+1}) =
\begin{cases}
1, & \Delta E_t < 0, \\[4pt]
\dfrac{1}{\left[1+(q_A-1)\,\Delta E_t/T^A_{q_A}\right]^{1/(q_A-1)}}, & \Delta E_t \ge 0,
\end{cases} \qquad (3)
$$

with acceptance temperature T^A_{q_A} and acceptance shape parameter 1 < q_A < 3. The final step is the cooling schedule, i.e. the temperature reduction

$$
T_q(t) = T_q(1)\,\frac{2^{q-1}-1}{(1+t)^{q-1}-1}, \qquad (4)
$$

where T_q and q are the respective temperature and shape parameters for the visiting and acceptance procedures. These three steps constitute Tsallis' generalized simulated annealing (GSA). Note that for parameters q_V = q_A = 1 we recover classical simulated annealing (CSA, Boltzmann machine) [3] and for q_V/2 = q_A = 1 fast simulated annealing (FSA, Cauchy machine) [5]. In addition to the initial results of Tsallis and Stariolo [6], Xiang and Gong [7] showed that the more complex or higher-dimensional the system, the more efficient GSA is, provided q_V > 2, compared to CSA or FSA.

The most difficult part of GSA is the generation of random numbers obeying the Tsallis distribution given by (2). As a provisional solution, Tsallis and Stariolo [6] developed an approximation of the one-dimensional Tsallis distribution. Their approximation method relies on an idea proposed by Mantegna [8] for the simulation of Lévy stable stochastic processes: the fitting of a non-linear transformation Z = [(K(α) − 1)e^{−W/C(α)} + 1]W of W = X/|Y|^{1/α}, where X and Y are Gaussian random variables and α = α(q_V). A careful selection of α ensures a correct asymptotic approximation to the power-law behavior of the visiting Tsallis distribution for large values, but only a moderate fit in the vicinity of the origin. Since the performance of GSA depends on the fast as well as exact generation of Tsallis random numbers, Deng et al. [9] developed a Tsallis random number generator with improved properties. Unfortunately, this generator is still, like that of Tsallis and Stariolo, an approximate, one-dimensional generator. Recently, however, Deng et al. [10] published an exact one-dimensional Tsallis random number generator.

The goal of this paper is to present an exact D-dimensional Tsallis random number generator as required for correct D-dimensional GSA state space sampling. The paper proceeds as follows. In Section 2 we show that a D-dimensional Tsallis random variable obeys a multivariate t distribution and construct a D-dimensional Tsallis random number generator. In Section 3 we present simple numerical examples; we conclude in Section 4.

2. D-dimensional Tsallis random number generator

Before we construct an exact D-dimensional Tsallis random number generator we show that a Tsallis random variable is multivariate t or Student distributed. Let X = (X_1, ..., X_d)ᵀ be a d-dimensional normal random variable with zero mean and covariance matrix Σ and let Y obey a χ²_n distribution. If X and Y are independent, then the variable Z = X√(n/Y) + μ, μ ∈ R^d, is multivariate t or Student distributed [11] with probability density function

$$
f_Z(z) = \frac{\Gamma\!\left(\frac{n+d}{2}\right)}{\Gamma\!\left(\frac{n}{2}\right)(\pi n)^{d/2}\,|\Sigma|^{1/2}}
\left[1+\frac{1}{n}(z-\mu)^\top \Sigma^{-1}(z-\mu)\right]^{-(n+d)/2}. \qquad (5)
$$

By comparing the parameters of Eq. (5) with those of Eq. (2) we find n = (3 − q_V)/(q_V − 1), d = D, Σ^{−1} = n(q_V − 1)/[T^V_{q_V}]^{2/(3−q_V)} · 1, and μ = 0. Finally, to match the notation of Tsallis and Stariolo we set (Δx_t)² = zᵀz = z_1² + ... + z_D² and g_{q_V}(Δx_t) = f_Z(z). Thus a Tsallis random variable is a special multivariate t or Student distributed random variable.

Since the generation of independent Gaussian random numbers can, for example, be done by the well-known method of Box and Muller [12], see also [13–16], we focus here on the generation of random numbers obeying a χ²_n distribution. A random variable U with probability density function

$$
f_U(u) =
\begin{cases}
\dfrac{b^p}{\Gamma(p)}\,u^{p-1}e^{-bu}, & u > 0, \\[4pt]
0, & u \le 0,
\end{cases} \qquad (6)
$$

is called gamma distributed [14]; for short we write γ(b, p). Note that p is a form (shape) parameter and b an inverse scale parameter. Using this notation we find that a χ²_n distribution is a special gamma distribution: γ(b = 1/2, p = n/2). In accordance with Devroye [17] we do not know how to generate a gamma random number in one line of code. However, several gamma generators exist [14–16]; here one of three algorithms is used to generate the γ(1, p) numbers, depending on the value of p:

(1) If 0 < p < 1, Vaduva's generator, which produces gamma random numbers by rejection from the Weibull distribution, is used [18].
(2) If p = 1, the gamma distribution reduces to the exponential distribution and a logarithmic transformation of a uniform random variable is the method of choice (e.g., [13]).
(3) If p > 1, Best's generator is used [19,20]; it is based on rejection from a one-dimensional Student or t distribution with two degrees of freedom.

Thus, an algorithm for D-dimensional Tsallis random number generation reads as follows:

FUNCTION Tsallis_RNG(q_V, T^V_{q_V}, D)
    p ← (3 − q_V)/(2(q_V − 1))
    s ← √(2(q_V − 1)) / [T^V_{q_V}]^{1/(3−q_V)}
    [Generate iid normal random numbers x_i, i = 1, ..., D]
    x ← (x_1, ..., x_D)ᵀ
    [Generate a gamma random number u obeying γ(1, p)]
    y ← s·√u


    z ← x/y
    RETURN z
END

Note that the choice of normal random numbers with standard deviation σ_x = 1 and of gamma random numbers obeying γ(1, p) has been accounted for in the computation of the scaling factor s.

Before we present some simulation results we want to address an important issue of random number generation in simulated annealing that is sometimes ignored. A good random number generator produces random numbers that are hard to predict, provided that the generating algorithm is unknown [15,16]. Poor algorithms, however, often generate random numbers featuring low-order serial correlations or other unwanted structures that may lead to an insufficient or biased sampling of the state space in simulated annealing. To avoid this we decided to use Park and Miller's minimal standard random number generator [21,22] in combination with a Bays–Durham shuffle [22,23]. Note that this generator produces uniformly distributed random numbers that can be transformed into random numbers distributed as required.

Fig. 1. Distribution of Tsallis random variables. The thin black lines are empirical probability density functions and the thick grey lines are theoretical densities. Parameters are T^V_{q_V} = 1, D = 1, and q_V = 1.5, q_V = 2 (Cauchy distribution) and q_V = 2.5. For the computation of each empirical density one million Tsallis random numbers were generated; the bin size of the histograms used for the computation of the empirical densities was 0.5.

As a first application of the Tsallis random number generator presented above, Fig. 1 shows examples of estimated one-dimensional probability density functions together with their theoretical counterparts obeying (2). Note the very good match between the empirical and the theoretical densities.

3. A plain GSA application

To apply our Tsallis random number generator in a generalized simulated annealing application we choose the well-known Rastrigin function [24–26] to be minimized:

$$
E(x) = AN + \sum_{i=1}^{N}\left[x_i^2 - A\cos(2\pi x_i)\right], \qquad (7)
$$

where −5.12 ≤ x_i ≤ 5.12 and N is the dimension. For our simulations we set A = 10; the global minimum of this non-linear but separable multimodal function is at the origin and is surrounded by 11^N − 1 relative minima. A smart state space sampling can be important for finding the global minimum. For a two-dimensional minimization problem we have two obvious procedures. In the case of one-dimensional Tsallis random numbers (D = 1) we update only one randomly chosen component of the state vector x, either x_1 or x_2, at a given simulation step (procedure A). For the true two-dimensional GSA we use two-dimensional Tsallis random numbers for the state vector update (procedure B). The remaining start parameters for both procedures are x_t = (−1, −1)ᵀ, q_A = 1.05, q_V = 2.2, and T^A_{q_A}(t) = T^V_{q_V}(t) = T_q(t) = 1 at t = 1. In addition, if the magnitude of a component of the state vector exceeds the value 5.12, i.e. |x_i| > 5.12, then a new state vector is computed. At every temperature step T_q(t), t = 1, ..., 50, obeying Eq. (4), the state vector is updated 500 times in order to ensure a sufficient state space sampling. Two related state space samplings are shown in Fig. 2. As expected, the component-wise update of the state vector leads to a state space sampling oriented along the coordinate axes, whereas the true two-dimensional state vector

Fig. 2. GSA state space sampling of the two-dimensional Rastrigin function with one-dimensional (A) and two-dimensional (B) Tsallis random numbers. The initial value of the state vector was x = (−1, −1)ᵀ for both simulations. Note that both procedures show a stochastic convergence to the global minimum of the Rastrigin function at (0, 0)ᵀ. Each scatter diagram contains 25 000 sampling points.


update does not. Thus the true two-dimensional sampling is more homogeneous. In addition, it is plausible that true multi-dimensional state space sampling should be preferred for non-separable functions, which cannot be optimized for each variable separately.

Tsallis and Stariolo [6] have shown that the parameter q_V influences the performance of GSA significantly. To find out whether this is also the case for GSA with true D-dimensional Tsallis random numbers, we vary q_V and analyze the respective mean energies at every time step for a given optimization problem. From Fig. 3 we see that, for the three-dimensional Rastrigin function, the performance of GSA with true three-dimensional Tsallis numbers increases on average with q_V. Note that this need not be the case for single trials. The reason for this characteristic is the stochastic nature of GSA, i.e. the use of random numbers for state space sampling and state acceptance.
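The generator algorithm of Section 2 can be sketched in Python. This is a sketch under our own simplifications, not the paper's implementation: NumPy's built-in normal and gamma generators stand in for the Box–Muller transform, the Vaduva/Best rejection samplers and the shuffled Park–Miller uniform generator, and the function names are ours. A density evaluator for Eq. (2) is included as a sanity check; for q_V = 2, T = 1, D = 1 it reduces to the Cauchy density:

```python
import math
import numpy as np

def tsallis_visiting_density(dx2, q_v, t_v, dim):
    """Tsallis visiting distribution g_qV of Eq. (2); dx2 = (delta x_t)^2."""
    prefactor = ((q_v - 1.0) / math.pi) ** (dim / 2.0) * (
        math.gamma(1.0 / (q_v - 1.0) + (dim - 1.0) / 2.0)
        / math.gamma(1.0 / (q_v - 1.0) - 0.5)
    )
    scale = t_v ** (-dim / (3.0 - q_v))
    core = 1.0 + (q_v - 1.0) * dx2 / t_v ** (2.0 / (3.0 - q_v))
    return prefactor * scale / core ** (1.0 / (q_v - 1.0) + (dim - 1.0) / 2.0)

def tsallis_rng(q_v, t_v, dim, rng=None):
    """One D-dimensional Tsallis random vector via the multivariate-t
    construction: z = x / (s * sqrt(u)), x ~ N(0, I), u ~ gamma(1, p)."""
    rng = np.random.default_rng() if rng is None else rng
    p = (3.0 - q_v) / (2.0 * (q_v - 1.0))   # gamma form parameter, p = n/2
    s = math.sqrt(2.0 * (q_v - 1.0)) / t_v ** (1.0 / (3.0 - q_v))
    return rng.standard_normal(dim) / (s * math.sqrt(rng.gamma(p)))

# For q_V = 2, T = 1, D = 1 the density is Cauchy: g(0) = 1/pi.
print(tsallis_visiting_density(0.0, 2.0, 1.0, 1))  # ~0.3183

# A batch of two-dimensional Tsallis vectors, as used for GSA procedure B.
rng = np.random.default_rng(42)
sample = np.array([tsallis_rng(2.2, 1.0, 2, rng) for _ in range(1000)])
print(sample.shape)  # (1000, 2)
```

A histogram of one-dimensional samples drawn this way can be compared against the density evaluator, mirroring the empirical-versus-theoretical comparison of Fig. 1.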

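The complete GSA loop used in the experiments of this section can be sketched roughly as follows. This is a self-contained Python sketch under our own simplifications: NumPy's generators replace the paper's Park–Miller/Bays–Durham and Vaduva/Best samplers, a single cooling schedule with q = q_V drives both the visiting and the acceptance temperature, and out-of-range states are simply resampled:

```python
import numpy as np

def rastrigin(x, a=10.0):
    """Rastrigin's function, Eq. (7); global minimum 0 at the origin."""
    return a * x.size + np.sum(x**2 - a * np.cos(2.0 * np.pi * x))

def gsa(dim=2, q_v=2.2, q_a=1.05, t1=1.0, steps=50, updates=500, seed=1):
    """Minimal GSA sketch: D-dimensional Tsallis visiting moves, Tsallis
    acceptance (Eq. (3)) and the cooling schedule (Eq. (4))."""
    rng = np.random.default_rng(seed)
    x = np.full(dim, -1.0)                    # start state, as in the text
    energy = rastrigin(x)
    p = (3.0 - q_v) / (2.0 * (q_v - 1.0))     # gamma form parameter
    for t in range(1, steps + 1):
        # Cooling schedule, Eq. (4); one schedule drives both temperatures
        # here (a simplification of the paper's setup).
        temp = t1 * (2.0 ** (q_v - 1.0) - 1.0) / ((1.0 + t) ** (q_v - 1.0) - 1.0)
        s = np.sqrt(2.0 * (q_v - 1.0)) / temp ** (1.0 / (3.0 - q_v))
        for _ in range(updates):
            # Visiting step: D-dimensional Tsallis random vector (Section 2).
            dx = rng.standard_normal(dim) / (s * np.sqrt(rng.gamma(p)))
            x_new = x + dx
            if np.any(np.abs(x_new) > 5.12):  # stay inside the search box
                continue
            de = rastrigin(x_new) - energy
            # Tsallis acceptance probability, Eq. (3).
            if de < 0.0 or rng.random() < (1.0 + (q_a - 1.0) * de / temp) ** (-1.0 / (q_a - 1.0)):
                x, energy = x_new, energy + de
    return x, energy

x_best, e_best = gsa()
print(x_best, e_best)
```

With 50 temperature steps of 500 updates each this reproduces the scale of the two-dimensional experiment; whether a single run settles in the global minimum or a nearby relative minimum is, as the text notes, stochastic.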
Fig. 3. Performance of GSA with three-dimensional Tsallis random numbers for the three-dimensional Rastrigin function. The convergence, i.e. the decrease of the system's energy towards the global minimum at (0, 0, 0)ᵀ, depends on q_V: on average, the larger q_V, the better the convergence. E(t; q_V) denotes the system's energy. The black curves are the mean energies of 30 independent simulations performed for q_V = 1.5, 2, and 2.5, with x_t = (−3, −3, −3)ᵀ, q_A = 1.05, T^A_{q_A}(t) = T^V_{q_V}(t) = 1 at t = 1 and 20 000 state vector updates at every time step t. The standard deviation of the system's energy is given by the vertical extension of the grey area directly above each black mean energy curve.

4. Conclusion

It has been shown that a D-dimensional Tsallis random variable is a special multivariate t or Student distributed random variable. In addition, we have introduced a D-dimensional Tsallis random number generator that consists of a D-dimensional normal random number generator and a one-dimensional gamma random number generator in combination with a simple transformation, and we have shown some GSA applications. Due to its construction, our Tsallis random number generator neither depends on the ingenious definition of an approximation function nor is it restricted to generating one-dimensional random numbers only, as the hitherto existing generators are [6,9,10].

The proposed Tsallis random number generator requires the generation of D independent Gaussian or normal random numbers and one gamma random number to yield a D-dimensional Tsallis random number. The generation of normal random numbers can be achieved by a variety of simple and effective methods. Examples are the seminal and well-known two-dimensional transformation method of Box and Muller [12], which we used here, and Marsaglia and Tsang's ziggurat method [27], which may improve the efficiency and the quality of the randomness of our generator significantly. In agreement with Devroye [17] we do not know how to generate a gamma random number in one line of code, but there exist simple, exact, and reasonably efficient rejection-based gamma number generators, such as those of Vaduva and Best [14,18–20], which we selected and which truly allow the exact generation of Tsallis random numbers for 1 < q_V < 3. Fortunately, for high-dimensional applications the efficiency of our generator is dominated by the efficiency of the normal generator and not by that of the gamma generator. Despite this inherent, advantageous property there is still, despite much progress [16], a need for simpler and more efficient gamma generators.

Simulated annealing relies, like many other statistical methods, on computationally generated random numbers. For the generation of Tsallis random numbers we used Park and Miller's minimal standard linear congruential uniform random number generator, improved with a Bays–Durham shuffle, which produces sequences of uniformly distributed random numbers with only minor serial correlations. This is important, because serially correlated uniform random numbers may lead to serially correlated Tsallis random numbers, which may adversely affect the quality of the state space sampling and thus that of GSA. The relation between the randomness of a Tsallis random number generator and GSA phase space sampling is of general importance and should be analyzed in more detail elsewhere. In addition, it would be interesting in future research to analyze the GSA dynamics, in particular the dynamics of the state space sampling, and to determine whether a high-dimensional update of the GSA state variable is generally superior to a one-dimensional update and whether all GSA parameters, not only the annealing schedule [28], especially q_V, can be optimized for a given problem.

Acknowledgements

I am grateful to the anonymous referees for valuable comments and suggestions and, especially, for the hint to the recently published paper by Deng et al. [10] concerning exact one-dimensional Tsallis random number generation.

References

[1] P.J.M. van Laarhoven, E.H.L. Aarts, Simulated Annealing: Theory and Applications, Reidel, Dordrecht, 1987.
[2] J. Hertz, A. Krogh, R.G. Palmer, Introduction to the Theory of Neural Computation, Addison-Wesley, Reading, MA, 1991.
[3] S. Kirkpatrick, C.D. Gelatt Jr., M.P. Vecchi, Science 220 (1983) 671.
[4] S. Geman, D. Geman, IEEE Trans. Pattern Anal. Machine Intell. PAMI-6 (1984) 721.

[5] H. Szu, R. Hartley, Phys. Lett. A 122 (1987) 157.
[6] C. Tsallis, D.A. Stariolo, Physica A 233 (1996) 395.
[7] Y. Xiang, X.G. Gong, Phys. Rev. E 62 (2000) 4473.
[8] R.N. Mantegna, Phys. Rev. E 49 (1994) 4677–4683.
[9] J. Deng, H. Chen, C. Chan, Z. Yang, Int. J. Comput. Math. 81 (2004) 103.
[10] J. Deng, C. Chang, Z. Yang, Int. J. Simul.: System Sci. Technol. 6 (2005) 54.
[11] T.W. Anderson, An Introduction to Multivariate Statistical Analysis, J. Wiley & Sons, New York, 1984.
[12] G.E.P. Box, M.E. Muller, Ann. Math. Stat. 29 (1958) 610–611.
[13] T.G. Lewis, Distribution Sampling for Computer Simulation, Lexington Books, Massachusetts, 1975.
[14] L. Devroye, Non-Uniform Random Variate Generation, Springer-Verlag, New York, 1986.
[15] B.D. Ripley, Stochastic Simulation, J. Wiley & Sons, New York, 1987.
[16] J. Gentle, Random Number Generation and Monte Carlo Methods, Springer-Verlag, New York, 2005.

[17] L. Devroye, in: J.M. Charnes, D.J. Morrice, D.T. Brunner, J.J. Swain (Eds.), Winter Simulation Conference Proceedings, ACM, 1996, p. 265.
[18] I. Vaduva, Math. Operationsforschung Stat. Ser. Stat. 8 (1977) 545.
[19] D.J. Best, Appl. Stat. 29 (1978) 181.
[20] D.J. Best, in: Proc. Comput. Stat. (COMPSTAT), Physica Verlag, Vienna, 1978, p. 341.
[21] S.K. Park, K.W. Miller, Comm. ACM 31 (1988) 1192.
[22] W.H. Press, S.A. Teukolsky, W.T. Vetterling, B.P. Flannery, Numerical Recipes in C: The Art of Scientific Computing, second ed., Cambridge University Press, Cambridge, 2002.
[23] C. Bays, S.D. Durham, ACM Trans. Math. Software 2 (1976) 59.
[24] L.A. Rastrigin, Systems of Extremal Control, Nauka, Moscow, 1974.
[25] A. Törn, A. Zilinskas, Global Optimization, Lecture Notes in Computer Science, vol. 350, Springer-Verlag, Berlin, 1989, p. 255.
[26] H. Mühlenbein, D. Schomisch, J. Born, Parallel Comput. 17 (1991) 619.
[27] G. Marsaglia, W.W. Tsang, J. Stat. Software 5 (2000) 1.
[28] A. Franz, K.H. Hoffmann, J. Comput. Phys. 176 (2002) 196.