Long-term transient thermal analysis using compact models for data center applications



International Journal of Heat and Mass Transfer 71 (2014) 69–78

Contents lists available at ScienceDirect

International Journal of Heat and Mass Transfer journal homepage: www.elsevier.com/locate/ijhmt

Zhihang Song ⇑, Bruce T. Murray, Bahgat Sammakia

Department of Mechanical Engineering, Binghamton University, Binghamton, NY, USA

Article info

Article history: Received 30 August 2013; received in revised form 27 November 2013; accepted 2 December 2013

Keywords: POD; NLPCA; Compact model; Thermal modeling; Server room

Abstract

Reduced-order thermal models are necessary to enable real-time assessment of the optimum operating and control conditions to improve data center energy efficiency. A 3D computational fluid dynamics (CFD) and heat transfer model of a basic raised-floor, air-cooled, hot aisle/cold aisle data center configuration was developed, and simulation results were used to generate snapshots for initializing compact models based on the Proper Orthogonal Decomposition (POD) and the Nonlinear Principal Component Analysis (NLPCA) methods. The specific focus of the study was to numerically investigate how the thermal trends are affected by the long-term transient flow patterns associated with leakage and how the compact models (e.g., POD and NLPCA) can be used for much faster implementation and characterization of the transient flows. Using both the POD and the NLPCA methods, good agreement was achieved between the full simulation results and the compact models for predicting the dynamically developed local flow structure over a range of transient cooling air supply operating conditions. In addition, the NLPCA method was implemented to better characterize the nonlinear aspects of the CFD results. The benefits of using both the POD and NLPCA methods are discussed in relation to constructing compact models as real-time predictive tools. A systematic use of the compact models is also proposed to enable more robust thermal management and control of data centers. © 2013 Elsevier Ltd. All rights reserved.

1. Introduction

Due to the need to improve energy efficiency in the thermal management of data centers, the number of experimental and computational studies in this area is increasing rapidly. A significant portion of this research involves full-scale computational fluid dynamics simulations of the turbulent flow and temperature distributions in air-cooled, raised-floor data centers. In order for the full-scale CFD simulations to adequately capture the high-dimensional, complex flow physics and energy transport, substantial computational effort is required. An extremely large number of discrete equations and degrees of freedom result from approximation methods such as the finite volume or finite element approaches. The solution of the discrete systems, especially for transient behavior, generally requires long computation times, which make it difficult to achieve real-time assessment. Fast response in modeling the thermal environment is necessary in order to develop a feedback control system to minimize energy usage. Thus, reduced-order or compact models are required as a predictive tool for generating the response of the thermal environment to various operating conditions.

⇑ Corresponding author. Tel.: +1 315 380 8882. E-mail address: [email protected] (Z. Song).

0017-9310/$ - see front matter © 2013 Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.ijheatmasstransfer.2013.12.007

The POD method is one approach that can be used to develop a reduced-order model based on numerical or experimental data. However, the traditional POD method has some disadvantages, because the selection of higher-order modes and the number of modes retained affect the accuracy and reliability of the predictions. Difficulties occur because a nonlinear coupling exists between the high-order modes and the low-order modes in the nonlinear momentum and energy transport models, which are sensitive to the initial and boundary conditions. A long-term perturbation (such as unreasonably truncated higher-order POD modes) may change the topology of the system. It was found that the long-term vortex behavior described by a reduced-order, low-dimensional system built from 12 POD modes did not match the numerical simulation results once time increased beyond 25 s [1]. POD reduced-order modeling was applied to the problem of flow around a cylinder in [2], which concluded that the traditional POD cannot accurately describe the transient behavior of the original system or its long-term behavior with a varying Reynolds number. Ref. [3] shows that including a shift-mode significantly improves the resolution of the transient dynamics from the onset of vortex shedding to the periodic von Kármán vortex street. The inclusion of the shift-mode can precisely describe the dependence of the flow on Reynolds number. However, an unstable steady-state solution of the original system


is required for the approach, which increases both the difficulty and time of the computation. The Galerkin projection-based method is an alternative procedure to solve for the POD coefficients using a set of coupled nonlinear ordinary differential equations in time for the transient stage. Earlier investigations, such as [4], utilized this method to create reduced-order models of transient flow and temperature fields in a parametric study of channel flow with easily homogenized boundary conditions. The POD-Galerkin approach was employed in conjunction with the finite volume method (FVM) to generate a reduced-order model for a transient 1D heat transfer problem [5], and both POD-based interpolation and projection methods were applied to multivariate heat conduction for fast thermal predictions [6]. Nie and Joshi [7] combined the flux-matching POD and the Flow Network Method (FNM) to achieve a multiple-scale reduced model strategy for chip-level conductive through rack-level convective transport. The POD method has been widely used in the field of data center reduced-order modeling [8–11]. More recently, Ghosh and Joshi [12] developed a dynamic POD model to predict the transient temperature response for a transient start-up scenario (server heat load and cooling air flow rate). When time was used as the only variable, an interpolation technique was used to efficiently determine the POD coefficients instead of the more traditional Galerkin-based method. However, the methodology was found to be unsuitable for solving a multivariate problem (e.g., transient flow and temperature field with variable cooling load). In addition, in that study the temperature field was expressed by a set of linear basis functions alone.
The current study focuses on the development and implementation of an improved method to characterize the nonlinear behavior such that a reduced-order, dynamic predictive capability can be achieved for real-time control and optimization of the energy used to provide sufficient cooling of a data center.

2. Data center computational model

The model of the basic raised-floor data center configuration is shown in Fig. 1. For the representative study presented here, the server room is taken to be 8.75 m long and 6.4 m wide. The cooling air is supplied from a single CRAC unit located on one end through the raised floor plenum (0.914 m height) and returned through an overhead plenum mounted on the ceiling (1.524 m height), as seen in Fig. 2. The height between the raised floor and the drop ceiling plenum is 2.7 m. Under steady conditions, the CRAC flow rate (downward to the under-floor plenum) is specified as 5.3 m³/s (at 80% capacity) and the air temperature is set at 14 °C. The same amount of air returns to the CRAC via the ceiling plenum. In the data center model, two rows of five server racks (2.13 m high) are located on either side of the center cold aisle. This single cold

Fig. 1. Data center layout and specifications (top view).

Fig. 2. Basic hot aisle/cold aisle raised-floor data center.

aisle room configuration is similar to the one that was characterized experimentally and modeled using the POD approach by Ghosh and Joshi [13]. The racks are modeled as enclosures with a heat source and internal fans. The total airflow required by the server fans is equal to the CRAC supply. The cold aisle consists of two rows of perforated tiles (0.61 × 0.61 m²; 56% porosity). Fig. 3 provides more details about the model specifications; air resistance modeling was used to characterize the flow distribution through the perforated tiles [14]. In this approach, a local pressure drop is instantaneously produced at the foot of the server racks due to the air injected from the perforated tiles. In this study, the large-scale flow and temperature simulations were performed using the commercial CFD software package FloTHERM [15]. For the simulations performed using the FloTHERM software package, a finite-volume approximation of the Reynolds-averaged Navier–Stokes (RANS) and energy equations with a standard k–ε turbulence model is used. The continuity, momentum and energy equations of the 3D constant-density mean flow are expressed as follows.

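$$\nabla \cdot \vec{u} = 0$$

$$\frac{\partial \vec{u}}{\partial t} + \vec{u} \cdot \nabla \vec{u} = \nabla \cdot \left( \nu_{\mathrm{eff}} \nabla \vec{u} \right) - \frac{1}{\rho} \nabla p + \vec{g}$$

$$\rho c_p \left( \frac{\partial T}{\partial t} + (\vec{u} \cdot \nabla) T \right) = \nabla \cdot \left( k_{\mathrm{eff}} \nabla T \right) + S \qquad (1)$$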

where $\nu_{\mathrm{eff}}$ and $k_{\mathrm{eff}}$ are the effective fluid viscosity and thermal conductivity, respectively. The term $\vec{g}$ is the gravitational acceleration vector, and the term $S$ represents the heat source. The convective terms were discretized using a second-order upwind scheme, and the SIMPLEST algorithm [16] was used to evaluate the coupled velocity and pressure fields. A thermal/airflow solver similar to Patankar's formulation in [17] was

Fig. 3. Model specifications for server racks and perforations.


adopted to conduct the CFD simulations using a pressure-staggered grid. A mesh sensitivity study was performed using the computed temperature field to assess the accuracy of the calculation. An optimum computational mesh (a good trade-off between accuracy and computation time) for the room model consisted of more than three hundred thousand grid points in the 3D computational domain. The resulting minimum grid spacing was 0.013 m, and the maximum grid spacing was 0.026 m. Similarly, a time step sensitivity analysis was performed to determine the required time step to obtain sufficient accuracy (maximum temperature variation within five percent) for the transient calculations. A uniform time step of 10 s was used for all of the required calculations used in this study.

3. The long-term transient CFD results

In order to simulate the response to transient changes in room cooling conditions, the CRAC flow rate was varied from 80% to 100% of the cooling capacity. One half of a sine wave applied over 1500 s is used to introduce the transient variation starting at 80%. The maximum flow rate is obtained at 750 s and the flow rate drops back to 80% of the maximum value at 1500 s. This transient variation scenario was used to obtain the CFD simulation data required to develop the POD and NLPCA implementations. To observe the effect of the transient cooling air supply, the velocity field and temperature contours in front of rack 3 (refer to Fig. 1) are presented in Figs. 4 and 5 at three time values (i.e., 0 s, 250 s, and 500 s). The region shown in the plots extends from the tile surface to the top of the rack vertically and spans the width of a tile. The plane shown is located in the center of the rack. As illustrated in Figs. 4(a) and 5(a), when the CRAC flow is relatively low (at the initial condition), there is an apparent over-cabinet hot air recirculation near the top of the rack. Additionally, a leakage flow caused by the gap between the cabinet and raised floor was observed at the foot of the rack, which leads to the region of rising temperature underneath. For the scenario shown in Figs. 4(b) and 5(b), the tile flow was driven by a larger momentum source (higher CRAC supply), which gave rise to improved cooling behavior (less over-cabinet hot air recirculation). However, the under-cabinet hot air recirculation (leakage flow speed) occurs with magnitudes higher than in the first scenario. This result most likely arises from the fact that the impact of the pressure drop


tends to be more significant as the airflow velocity approaching the perforations increases. Following the same thermal trend, as the supplied cooling air is further increased, as shown in Figs. 4(c) and 5(c), a significant amount of the cooling air bypasses the servers because of the increased velocity. Although the thermal condition at the top of the rack remained healthy, without much over-cabinet recirculation observed, the region at the bottom becomes slightly hotter in association with a greater under-cabinet leakage flow caused by a larger pressure drop across the perforated tiles and an increased pressure gradient from the hot to the cold aisle. Considering the flow disturbance due to the leakage effect, the flow velocities, in response to the long-term nonlinear CRAC cooling variation, were chosen as the observations for implementing the reduced-order models (e.g., POD and NLPCA), such that the resulting local transient reduced-order flow approximations would enable more effective and efficient thermal control principles in energy management.
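The half-sine CRAC schedule described above can be written down in a few lines. The following is a minimal sketch, not the authors' implementation: it assumes the flow rate scales linearly with the capacity fraction (so that 5.3 m³/s at 80% corresponds to 6.625 m³/s at 100%), and it assumes 31 evenly spaced snapshot times (50 s apart), which matches the snapshot count M = 31 reported in Section 6.2 but is otherwise a guess. The names `crac_fraction`, `crac_flow`, and `snapshot_times` are hypothetical.

```python
import math

PERIOD_S = 1500.0       # duration of the half-sine variation (from the text)
BASE_FRACTION = 0.80    # CRAC capacity fraction at t = 0 and t = 1500 s (from the text)
PEAK_FRACTION = 1.00    # CRAC capacity fraction at t = 750 s (from the text)
BASE_FLOW_M3_S = 5.3    # steady flow rate at 80% capacity (Section 2)


def crac_fraction(t):
    """Capacity fraction at time t (s): one half of a sine wave over 1500 s."""
    if not 0.0 <= t <= PERIOD_S:
        raise ValueError("t must lie within the 0-1500 s transient window")
    amplitude = PEAK_FRACTION - BASE_FRACTION
    return BASE_FRACTION + amplitude * math.sin(math.pi * t / PERIOD_S)


def crac_flow(t):
    """Flow rate (m^3/s), ASSUMING flow is proportional to the capacity fraction."""
    return BASE_FLOW_M3_S * crac_fraction(t) / BASE_FRACTION


# ASSUMPTION: 31 evenly spaced snapshots, i.e. one every 50 s.
snapshot_times = [50.0 * m for m in range(31)]

if __name__ == "__main__":
    for t in (0.0, 250.0, 500.0, 750.0, 1500.0):
        print(f"t = {t:6.0f} s  fraction = {crac_fraction(t):.3f}  flow = {crac_flow(t):.3f} m^3/s")
```

The three plotted instants (0 s, 250 s, and 500 s) fall on the rising part of this profile, which is why each successive panel in Figs. 4 and 5 corresponds to a higher supply rate.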

4. POD methodology

Much of the basic theory of the proper orthogonal decomposition (POD) method [18,19] was developed for the purpose of extracting the primary components from experimental data for coherent structures in turbulent flows. For the present application, the idea is essentially to determine the optimal orthogonal basis functions driven by the data in a temporal domain and associated with the characteristic structure of the flow and temperature field (referred to as the POD modes). Because the modal representation depends strongly on the experimental or numerical observations, the reduced-order description can be used to effectively capture the essential characteristics of the data, i.e., the principal flow structure can be reproduced. Additionally, using the Galerkin method, the governing equations (e.g., Navier–Stokes equations) can be projected onto the space spanned by the POD modes to achieve a reduced-order flow model, which enables significant computational time savings, a faster description of the low-dimensional dynamic system, and ultimately integration with a real-time control system for energy-efficient data centers. Here, the method of snapshots is first used to initialize the POD procedure. A selected group of $M$ transient velocity fields is given at discrete times, $u(x, t_m)$, where $m = 1, \ldots, M$. Each snapshot of the velocity field can be decomposed as a combination of a mean flow and an instantaneous fluctuating component as follows:

Fig. 4. Velocity vectors and speed contours (m/s) for three sampled transient scenarios: (a) 0 s; (b) 250 s; (c) 500 s.


Fig. 5. Temperature contours (°C) for three sampled transient scenarios: (a) 0 s; (b) 250 s; (c) 500 s.

$$u(x, t_m) = \frac{1}{M} \sum_{m=1}^{M} u(x, t_m) + u'(x, t_m) \qquad (2)$$

where a set of optimum orthogonal basis functions $\{\varphi_i \mid i = 1, \ldots, \infty\}$ (POD modes) can be found in the temporal domain $u(x, t_m)$, which signifies that the error is minimized through the mapping $\phi$ between the original systemic solution $\Omega$ and the projected POD solution in the reduced-order domain $\Psi$. The error is calculated between the original velocity field observations $u(x, t_m)$ and the POD solution $\phi\, u(x, t_m)$, where $\phi$ represents the POD mapping. The error is defined by the following expression:



$$\frac{1}{M} \sum_{m=1}^{M} \left\| u(x, t_m) - \phi\, u(x, t_m) \right\| \qquad (3)$$

which is equivalent to satisfying

$$\max_{\varphi \in \Psi} \left( \left\langle \frac{\left| (u', \varphi) \right|^2}{\| \varphi \|^2} \right\rangle \right) \qquad (4)$$

Eq. (3) represents the most representative projection of the velocity field $u$ onto the space spanned by the POD modes, $\varphi_i$, and is constructed so that the variance of the temporal flow structure can be characterized to the maximum extent. In the above expression, $\langle \cdot \rangle$ represents the arithmetic average, $(f, g)$ is the inner product, and the norm is given by

$$\| \varphi \| = (\varphi, \varphi)^{1/2} \qquad (5)$$

In the spatial domain, the two-point POD based correlation (coherence) of the flow field is determined from the outer product as follows:

$$C(x, y) = \left\langle u(x, t) \otimes u(y, t) \right\rangle \qquad (6)$$

The eigenvalue problem obtained is represented by the following system:

$$\int_{\Omega} C(x, y)\, \varphi_i(y)\, dy = \lambda_i \varphi_i(x) \qquad (7)$$

Similar to the spatial decomposition, the correlation matrix in the temporal manner based on the snapshots method is given as follows:

$$C(t_m, t_n) = \frac{1}{M} \left( u(x, t_m), u(x, t_n) \right)_{\Omega}, \qquad m, n = 1, \ldots, M \qquad (8)$$

Every captured snapshot from the CFD data yields $N$ spatial points because of the grid spacing. Accordingly, the size of the correlation matrix in the spatial domain becomes $N \times N$ for each of the velocity components, which makes the computation difficult and can reduce accuracy. However, the POD-based correlation matrix in the temporal domain has a greatly reduced size ($M \times M$). The resulting eigenvalue problem, in which the eigenvalue $\lambda_i$ is associated with the eigenfunction $\varphi_i$ (the $i$th POD mode), is given as follows:

$$\frac{1}{T} \int_{0}^{T} C(t_m, t_n)\, A_i(t_n)\, dt_n = \lambda_i A_i(t_m) = \lambda_i A_i^{(m)} \qquad (9)$$

The obtained POD modes are arranged in descending order of the real positive eigenvalues (i.e., $\lambda_1 \ge \lambda_2 \ge \lambda_3 \ge \cdots > 0$). It should be noted that zero eigenvalues are not taken into account because they make no contribution to the velocity. The temporal coefficients (the eigenvectors of the correlation matrix) satisfy the following orthogonality condition:

$$\frac{1}{M} \sum_{m=1}^{M} a_i^{(m)} a_j^{(m)} = \lambda_i \delta_{ij} \qquad (10)$$

where $A_i = \left( a_i^{(1)}, \ldots, a_i^{(M)} \right)$. The POD modes are subsequently calculated as follows:

$$\varphi_i = \frac{1}{M \lambda_i} \sum_{m=1}^{M} a_i^{(m)} u(x, t_m) \qquad (11)$$
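The method of snapshots outlined in this section can be sketched with NumPy as follows. This is an illustrative sketch under stated assumptions rather than the authors' code: the inner product is taken to be the plain Euclidean dot product over grid points (no cell-volume weighting), the correlation matrix is built from the fluctuating part of the snapshots, and the function names `pod_snapshots` and `reconstruct` as well as the synthetic random data are hypothetical. The temporal coefficients are scaled to satisfy the orthogonality relation of Eq. (10), and the modes follow Eq. (11).

```python
import numpy as np


def pod_snapshots(U, rel_tol=1e-10):
    """Method-of-snapshots POD.

    U has shape (N, M): one column per snapshot u(x, t_m).
    Returns the mean flow (N,), eigenvalues (K,), temporal coefficients
    A (M, K), spatial modes Phi (N, K), and the energy fraction per mode (K,).
    """
    n_points, n_snap = U.shape
    u_mean = U.mean(axis=1)
    U_fluct = U - u_mean[:, None]                 # fluctuating part u'(x, t_m)

    # M x M temporal correlation matrix (ASSUMPTION: Euclidean inner product).
    C = (U_fluct.T @ U_fluct) / n_snap
    lam, V = np.linalg.eigh(C)                    # eigh returns ascending eigenvalues
    order = np.argsort(lam)[::-1]                 # reorder to descending
    lam, V = lam[order], V[:, order]

    # Discard (numerically) zero eigenvalues, which carry no velocity content.
    keep = lam > rel_tol * lam[0]
    lam, V = lam[keep], V[:, keep]

    # Scale eigenvectors so that (1/M) * sum_m a_i^(m) a_j^(m) = lam_i * delta_ij.
    A = V * np.sqrt(n_snap * lam)
    # Spatial modes: phi_i = (1 / (M lam_i)) * sum_m a_i^(m) u'(x, t_m).
    Phi = (U_fluct @ A) / (n_snap * lam)
    energy = lam / lam.sum()
    return u_mean, lam, A, Phi, energy


def reconstruct(u_mean, A, Phi, n_modes):
    """Truncated reconstruction: mean flow plus the first n_modes modes."""
    return u_mean[:, None] + Phi[:, :n_modes] @ A[:, :n_modes].T


# Synthetic stand-in for CFD snapshots (hypothetical data, 200 points x 31 snapshots).
rng = np.random.default_rng(0)
U_demo = rng.standard_normal((200, 31))
u_mean, lam, A, Phi, energy = pod_snapshots(U_demo)
U_full = reconstruct(u_mean, A, Phi, len(lam))
```

With this scaling the modes come out orthonormal, and keeping all non-zero modes recovers the snapshots exactly; truncating `n_modes` trades accuracy for a smaller model, with `energy` indicating how much variance each retained mode carries.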

5. NLPCA methodology

It was demonstrated that the POD method, applied in the specified framework, works quite well due to its absorptive bistability if the base states encompass the full range of behavior for the particular variable measured. However, inaccuracies arising from asymmetry and the extent of the variation in the primary variables are the fundamental limitations of the POD method. While it is able to capture some variation from the mean state through the specification of a linear space, both large departures from the mean and the nonlinear underlying structure of the dynamics (such as multiple stable states) present challenges to the POD method. Unfortunately, it is often this behavior that we wish to capture or control because many thermal phenomena in data centers behave in a nonlinear manner, indicating that the observed data describe a curve or curved subspace in the original data space. The identification of such nonlinear manifolds becomes significant in the physical fields (velocity, temperature, etc.) of data centers. Generally, the CFD data are of very high dimensionality. However, because the data are usually located within a low-dimensional subspace, they can be well described by a single component or a few components. Considering the above-mentioned limitations, the introduction of nonlinear functions is a natural extension of the typical POD method so that the curved dynamic response surface can be represented by its principal modes through nonlinear mapping. Like the POD method, the NLPCA methodology, which was proposed by Kramer [20], uncovers linear correlations between variables, but it also captures nonlinear correlations and allows for data dimensionality reduction and visualization without restriction on the character of the nonlinearities. Several investigations were made regarding the NLPCA method based on autoassociative neural networks [21–23]. Some applications of the NLPCA were in the area of ocean–atmosphere oscillation [24] as well as the water level and current fields in a tidal sea [25]. In [26], the NLPCA methodology was used as a diagnostic tool to analyze and control the periodic actions in valve processing. The broad range of applications suggests that the NLPCA technique may be used as an alternative to predict the flow distribution and spatial temperature variation for the relatively complex configurations seen in data centers. To the best of our knowledge, this study is the first attempt to utilize the NLPCA method to develop a compact tool for long-term dynamic thermal analysis in data centers. As shown in Fig. 6, a multi-layer perceptron (MLP) with an autoassociative topology and an architecture of 3-4-1-4-3 is used to perform the NLPCA identity mapping, where the inputs of the network are reproduced at the outputs through the activation of the internal bottleneck layer, which has fewer nodes (one node in the present study) than the input layer and represents the input data in a compact manner.
The second and fourth layers, with four nonlinear units each, enable the network to perform nonlinear mappings. The assessment of the dimensionality reduction can be referred to as two functions: the extraction function FE and the generation function FG. The extraction function maps each higher-dimensional sample vector (input layer nodes) onto a reduced-dimensional space (bottleneck layer nodes) as follows:

$$u_c = F_E(u) \qquad (12)$$

where FE is a nonlinear vector function consisting of individual nonlinear functions. Conversely, the generation function FG is used to inversely transform the reduced-order component values back into the original space (output layer nodes) as follows:

$$\hat{u} = F_G(u_c) \qquad (13)$$
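The extraction and generation functions can be illustrated with a forward pass through an untrained 3-4-1-4-3 autoassociative network. The sketch below only shows the data flow through the bottleneck; it does not perform the training step. The tanh activation in the two four-node layers, the linear bottleneck and output layers, the random initial weights, and all function names are assumptions made for illustration, since the text does not specify them.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Random weights and zero biases for one fully connected layer (illustrative)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, layer, activation=None):
    """Apply one layer: weights @ x + biases, followed by an optional activation."""
    weights, biases = layer
    out = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]
    return [activation(v) for v in out] if activation else out


def build_network(rng, sizes=(3, 4, 1, 4, 3)):
    """Four weight layers connecting the five node layers of the 3-4-1-4-3 topology."""
    return [make_layer(a, b, rng) for a, b in zip(sizes[:-1], sizes[1:])]


def extract(u, net):
    """F_E: input (3) -> nonlinear mapping layer (4, tanh ASSUMED) -> bottleneck (1)."""
    return dense(dense(u, net[0], math.tanh), net[1])


def generate(u_c, net):
    """F_G: bottleneck (1) -> nonlinear demapping layer (4, tanh ASSUMED) -> output (3)."""
    return dense(dense(u_c, net[2], math.tanh), net[3])


rng = random.Random(0)
net = build_network(rng)
u = [0.2, -0.1, 0.4]          # hypothetical three-component observation
u_c = extract(u, net)         # one nonlinear principal component
u_hat = generate(u_c, net)    # reconstruction in the original space
```

Until the weights are trained, `u_hat` will not resemble `u`; training adjusts the weights so that the three inputs are reproduced at the outputs despite passing through the single bottleneck node.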

Here, the Levenberg–Marquardt iterative algorithm is utilized to update the network weights (the connections between network nodes throughout individual layers) such that the model is trained using the standard backward propagation approach until convergence occurs. The sum of the squared error (SSE) is used to

Fig. 6. NLPCA model architecture (3-4-1-4-3).


specify the convergence of the procedure with respect to the network weights as follows:

$$\mathrm{SSE} = \sum_{m=1}^{M} \sum_{d=1}^{D} \left\| \hat{u}_d(x, t_m) - u_d(x, t_m) \right\|^2 \qquad (14)$$

where $D$ is the dimensionality of the data, given by the number of observed field velocities (indexed by $d$), and $M$ is the number of long-term transient snapshots.

6. Results and discussion

6.1. Mean flow and POD modes

As discussed in Section 4, the results from the CFD computations (a total of $M$ snapshots), including the velocity components in the y- and z-directions in the cold aisle region, were used to construct the POD model in order to characterize the complicated transient flow structure resulting from variation in the control parameter, which is the CRAC cooling supply air flow rate, $V_s$. Fig. 7 shows the normalized eigenspectrum of the correlation matrix $C$ for a specific case. The hierarchical POD modes with significant energy content were used to obtain the truncated expansion in order to reconstruct the temporal flow field. The modal expansion is written as follows:

$$\hat{u}(x, t_m) = u_0(x) + \sum_{i=0}^{I} a_i(t)\, \varphi_i(x) \qquad (15)$$

where $u_0(x)$ is the mean flow velocity structure (the temporal average of the snapshots) in the observed cold aisle region, $I$ is the number of included POD modes $\varphi_i(x)$, and $\hat{u}(x, t_m)$ represents the approximated airflow distribution from the observations. An example of the mean flow velocity vectors and superimposed speed contours is shown in Fig. 8, while the velocity vectors and speed contours of the first three modes (POD modes 1 through 3) are shown in Fig. 9. It should be noted that the interrelated pattern between the three modal representations was effectively eliminated with the appropriate variance of the temporal flow structure.

6.2. Sensor allocations

To appropriately make comparisons between the results from the CFD simulations and those from the compact models (POD and NLPCA), the CFD flow field results were monitored at three representative finite volume grid locations. Here, we refer to these as sensor points. As indicated in Fig. 10, the first sensor monitoring

Fig. 7. Energy percentage captured by POD modes.


Fig. 10. Sensor specifications.

Fig. 8. Temporally averaged flow structure.

point (circle point) was specifically located near the center of the cold aisle perforated tiles (0.013 m away from the raised floor and 0.6 m away from the rack inlets), which is assumed to be a good location for measuring the tile flow. The other two locations for the second and third monitor points (triangle and square point) were specifically chosen near (0.05 m away from the server inlets) the top region of the rack (1.75 m and 2 m high from the raised floor), which is where over-cabinet hot air recirculation typically occurs. Fig. 11 indicates the temperature profiles measured by the local sensors (sensor 1 through 3). The time derivative of temperature measured at sensor 1 yields a value of zero because the cold air emerging from the perforated tile is fixed at a temperature of 14 °C. As the CRAC supply velocity amplitude periodically increased and decreased (one half of a sine CRAC wave applied over 1500 s), the mixing of tile flow and under-cabinet leakage flow was

Fig. 11. Dynamic temperatures (°C) predicted by CFD at sensors.

correspondingly affected to a greater and lesser extent, which explains the variation of temperature at sensor 2. The variation of the temperature near the top of the rack (sensor 3) in Fig. 11 implies a more complex airflow distribution over time. While a sufficient amount of cooling flow was delivered through the perforations to effectively overcome the air recirculation from the top, the hot air finds its way underneath the rack and enters the server

Fig. 9. Velocity vectors and speed contours (m/s) for first three POD modes.


Fig. 12. Dynamic velocities (m/s) predicted by CFD and POD at sensors: (a) and (d) correspond to sensor 1; (b) and (e) correspond to sensor 2; (c) and (f) correspond to sensor 3.

Fig. 13. Dynamic velocities (m/s) predicted by CFD and NLPCA at sensors: (a) and (d) correspond to sensor 1; (b) and (e) correspond to sensor 2; (c) and (f) correspond to sensor 3.

inlets (illustrated by the temperature at sensor 3 from 250 s to 1250 s). One may note that these described thermal trends are reasonably explained and controlled through the observed transient flow patterns at the monitoring points. The compact models can be utilized for much faster characterization of the transient flows, which enables more robust control to the challenging thermal behavior in data centers.

For the compact model representations, the data set consisted of $M$ temporal snapshots ($M = 31$) in a 3D system ($D = 3$). Although no unique functional relationships can be established between the three observed variables, $u_d(x, t_m)$ ($d = 1, 2, 3$), their interactions can be comprehensively estimated using multivariate reduced-order methodologies (e.g., POD and NLPCA) and utilized to improve the techniques used for thermal control and optimization. For


example, a single-variable controller used in typical control systems is not suitable for capturing the interrelationship between multiple control quantities (e.g., multiple zonal temperatures). With such a controller, a sufficient approximation of the multivariate optimum control condition cannot be effectively realized to overcome unbalanced cooling performance (either overcooling or undercooling). In addition, some of the previous relevant studies [27–30] indicate that the independent POD or NLPCA modes defined in reduced-order spaces can be suitably applied to multivariate controllers to better handle and process the dynamic operating conditions in data centers. Fig. 12 presents the constructed POD model and the calculated CFD transient flow patterns for both velocity components in the y- and z-directions at the representative monitor points under the long-term dynamic CRAC conditions. Good agreement was achieved between the solutions of the POD and CFD methods. Using the described multivariate mapping between the input variables (CRAC flow variation and time) and the three sets of output variables (individual sensor-measured flow velocities in the turbulent field), comparisons of the POD and FloTHERM simulation results were performed. In comparing all the recorded predictions of the velocities (v and w) at the sensor locations, the trends in the calculated flow pattern versus time were determined. At the position of sensor 1, the flow velocities followed a steady profile, and the response to the CRAC flow, which was varied periodically using the harmonic transient function, was characterized. For example, the peaked velocity profile of w in the observed period (0 s through 1500 s) is nearly identical in shape to that in the prescribed CRAC

airflow motions, but one order of magnitude smaller. The variation in the tile flow (the airflow rates adjacent to the perforated tiles at the foot of the individual rack) is almost entirely attributed to the imposed CRAC oscillatory motion. In comparison with the flow patterns at sensor 1, a small amount of perturbation is observed in the velocity profiles at the location of sensor 2. This behavior may be mainly due to the interaction between the organized CRAC flow and the underlying turbulent responses in the room-level environment, such as variations of the tile flow, the airflow rates demanded by the internal fans in the server racks, and the hot/cold air mixing due to the top recirculation from the server outlets. As observed from the measurements at the top of the server rack (sensor 3), the perturbation is more detectable in the velocity components of the flow, which might be attributed to a more significant interactive correlation between the physical quantities that respond harmonically to the forced long-term unsteady flow (e.g., the top recirculation). As observed in Fig. 12, the solutions of the POD using three principal modes agree well with those of the CFD. For ease of comparison, the same data set generated by the three sensors was used to illustrate the performance of the NLPCA. Because a single bottleneck node was used by the approach to map the multivariate flow correlations, it was progressively rescaled during the network training procedure to model the nonlinear principal reduced-order basis vector (conceptually analogous to the POD modes). Fig. 13 shows the comparison between the CFD and NLPCA methods; furthermore, a good

Table 1
Temporally averaged relative error, E, at the sensors using POD and NLPCA.

Method          Y velocity   Y velocity   Y velocity   Z velocity   Z velocity   Z velocity
                sensor 1     sensor 2     sensor 3     sensor 1     sensor 2     sensor 3
POD, 1 mode     0.0077       0.0074       0.0347       0.0063       0.1103       0.3594
POD, 2 modes    0.0036       0.0055       0.0301       0.0050       0.0408       0.0630
POD, 3 modes    0.0024       0.0045       0.0134       0.0047       0.0277       0.0206
NLPCA           0.0055       0.0049       0.0067       0.0051       0.0197       0.0991
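The text describes E as a temporally averaged relative error (a temporal mean absolute error) but does not spell out the formula. One plausible reading, given here purely as an assumption, is the mean over the M snapshots of the pointwise relative deviation between the compact-model value and the CFD value at a sensor. The function name and the sample numbers below are hypothetical and are not taken from the paper's data.

```python
def temporally_averaged_relative_error(u_cfd, u_model):
    """Mean over time of |u_model - u_cfd| / |u_cfd| at one sensor.

    ASSUMPTION: this pointwise-relative definition is one possible reading of
    the error E; the source does not state the exact formula.
    """
    if len(u_cfd) != len(u_model) or not u_cfd:
        raise ValueError("both series must be non-empty and of equal length")
    total = 0.0
    for reference, approximation in zip(u_cfd, u_model):
        if reference == 0.0:
            raise ValueError("relative error is undefined where the CFD value is zero")
        total += abs(approximation - reference) / abs(reference)
    return total / len(u_cfd)


# Hypothetical velocity samples (m/s) at three instants.
cfd_series = [1.0, 2.0, 4.0]
model_series = [1.1, 1.8, 4.0]
error_demo = temporally_averaged_relative_error(cfd_series, model_series)
```

Under this definition a value such as 0.0206 would correspond to an average deviation of about 2% from the CFD result over the 1500 s window.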

Fig. 14. Reconstruction relative error percentage (%): (a) POD with one mode; (b) POD with two modes; (c) POD with three modes; (d) NLPCA.


Fig. 15. The use of compact models in energy efficiency of data centers.

agreement was achieved. A curved flow structure is common in data centers. The inputs may represent data center room-level airflow concentrations driven by a periodic inlet boundary condition (e.g., the CRAC flow), whereas the bottleneck node might represent a local index in the design/control space that summarizes the impact of the observed original variables (the multiple-scale transient response) in the thermal/fluid system. Both the representative node values and the nonlinear mappings can be suitably characterized by the NLPCA method. Additional comparisons of the performance of the NLPCA and POD methods were also carried out.

6.3. Error estimation in POD and NLPCA

In this section, the reconstruction errors of the NLPCA are compared with those of the POD method. Table 1 lists the temporally averaged reconstruction error, E, at the representative locations (sensors 1 through 3). In addition to the reconstruction errors for the NLPCA (3-4-1-4-3) and the POD method using three modes, the POD method was also applied with one and two modes to the same CFD data to investigate its effectiveness. Fig. 14 shows the errors as a function of time and location (sensors 1 through 3) for the models under investigation: POD with one, two, and three modes, and NLPCA. As shown in Table 1, the relative error, E (temporal mean absolute error), between the POD and CFD results varies significantly with the number of POD modes. The error for the one-mode POD is large and shows an apparent pattern, which implies that the inherent temporal flow structure was not adequately captured. In this case, more POD modes have to be retained to recover the lost information, at the cost of computational efficiency. As shown in Table 1 and Fig. 14, the temporal relative errors were effectively reduced for the POD with two modes.
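The dependence of the reconstruction error on the number of retained POD modes can be illustrated with a minimal sketch. It assumes the snapshot data are arranged as a matrix with one row per time step and one column per monitored signal, computes the modes by a singular value decomposition of the mean-subtracted data, and evaluates an error in the spirit of E. The function names, the synthetic signals, and the normalization of the error by the peak magnitude of each signal are assumptions made for illustration; they are not taken from the CFD data or from the exact definition of E used in this study.

```python
import numpy as np

def pod_reconstruct(snapshots, n_modes):
    """Rank-limited POD reconstruction of a (n_times, n_signals) snapshot matrix."""
    mean = snapshots.mean(axis=0)
    fluct = snapshots - mean                      # POD acts on fluctuations about the time mean
    u, s, vt = np.linalg.svd(fluct, full_matrices=False)
    coeffs = u[:, :n_modes] * s[:n_modes]         # time coefficients of the retained modes
    return mean + coeffs @ vt[:n_modes]           # rows of vt are the POD modes

def temporal_mean_relative_error(reference, approx):
    """Assumed metric: time-averaged |approx - reference|, scaled by each signal's peak magnitude."""
    scale = np.abs(reference).max(axis=0)
    return (np.abs(approx - reference) / scale).mean(axis=0)

# Synthetic stand-in for three harmonically forced sensor signals (not the paper's data).
t = np.linspace(0.0, 2.0 * np.pi, 200)
snapshots = np.column_stack([
    np.sin(t),
    0.5 * np.sin(t) + 0.2 * np.sin(2.0 * t),
    np.sin(t) ** 2 + 0.1 * np.cos(3.0 * t),
])

for k in (1, 2, 3):
    err = temporal_mean_relative_error(snapshots, pod_reconstruct(snapshots, k))
    print(f"POD with {k} mode(s): E per signal = {err}")
```

With only three signals, three modes reproduce the data to round-off, so the sketch mainly shows how the one- and two-mode truncations leave a structured residual on the signal that depends nonlinearly on the others.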
Even with two modes, a large error can still occur, particularly for the most nonlinear trajectories (e.g., the y velocity profile at sensor 3). The errors for the three-mode POD were much smaller, with improved stability. This set of observations suggests that the interrelationship between the measured velocities follows a nonlinear trend. On the other hand, the NLPCA results show good reconstruction accuracy using one nonlinear mode (Table 1 and Fig. 14), which indicates that the dynamic response of the flow field to the forced harmonic CRAC flow excitation has been successfully captured.

7. Conclusions

Compact models based on the POD and NLPCA methods have been developed and applied to analyze the design of a data center thermal management system. The models were initialized using simulation results obtained by finite volume CFD computations. A simple model of a data center was used to test the performance of the compact models. The constructed POD and NLPCA models were shown to be capable of relating the complicated high-order flow distribution to the reduced-order space and provide an


accurate predictive capability with good computational efficiency. Based on these results, the POD-based reduced-order model provides better capability to reconstruct and visualize the physical fields (e.g., velocity and temperature). However, the NLPCA method may prove to be a better technique in practice for efficiently identifying nonlinear behavior and integrating with real-time thermal control strategies. Moreover, although a well-configured POD model is capable of reconstructing the flow structure, additional principal reduced-order modes might be required to achieve better accuracy, which results in a higher computational cost. The nonlinear correlations in the magnitude of the local velocity components can be both captured and eliminated effectively and efficiently using the NLPCA method. The model implementation described here provides the capability to adaptively reduce the dimensionality of the multivariate flow distributions for a desired CRAC cooling air flow within the data center, which will enable faster optimization and control of the operating conditions.

The implementation of both the POD and NLPCA methods requires comprehensive experimental observations or well-validated CFD simulations to determine the flow structure in a given room configuration. Once the structure of the transport between the different zones (e.g., sensors 1 through 3) has been established, the compact models can be used to compute the principal reduced-order modes that account for the dominant percentage of the variance of the mass and energy transport and to predict the flow and temperature variation. Furthermore, the choice of model selection criteria for the NLPCA technique (e.g., training method, network architecture) is significant.
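To make the role of the network architecture concrete, the following is a minimal sketch of a 3-4-1-4-3 autoassociative network of the kind used for NLPCA: three inputs, a four-node mapping layer, a single bottleneck node, a four-node demapping layer, and three outputs. It assumes tanh activations in the mapping and demapping layers, linear bottleneck and output nodes, and plain batch gradient descent on the mean squared reconstruction error. The training method, learning rate, initialization, function names, and synthetic data are illustrative assumptions and do not reproduce the training procedure used in this study.

```python
import numpy as np

def init_params(sizes, rng):
    """Small random weights and zero biases for consecutive layer sizes."""
    return [(rng.normal(0.0, 0.5, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    """Return the activations of all layers; layers 0 and 2 use tanh, the others are linear."""
    acts = [x]
    for i, (w, b) in enumerate(params):
        z = acts[-1] @ w + b
        acts.append(np.tanh(z) if i in (0, 2) else z)
    return acts

def reconstruction_mse(params, x):
    """Mean over samples of the squared reconstruction error summed over signals."""
    return float(np.mean(np.sum((forward(params, x)[-1] - x) ** 2, axis=1)))

def train(params, x, lr=0.02, epochs=3000):
    """Batch gradient descent with hand-written backpropagation (assumed training method)."""
    n = x.shape[0]
    for _ in range(epochs):
        acts = forward(params, x)
        delta = 2.0 * (acts[-1] - x) / n                  # dLoss / d(output)
        for i in reversed(range(len(params))):
            w, b = params[i]
            if i in (0, 2):
                delta = delta * (1.0 - acts[i + 1] ** 2)  # tanh derivative
            grad_w = acts[i].T @ delta
            grad_b = delta.sum(axis=0)
            delta = delta @ w.T                           # propagate before updating w
            params[i] = (w - lr * grad_w, b - lr * grad_b)
    return params

# Synthetic stand-in: three standardized signals lying on a curve driven by one harmonic variable.
rng = np.random.default_rng(0)
s = np.sin(np.linspace(0.0, 4.0 * np.pi, 200))
raw = np.column_stack([s, s ** 2, s ** 3])
x = (raw - raw.mean(axis=0)) / raw.std(axis=0)

params = init_params([3, 4, 1, 4, 3], rng)
loss_before = reconstruction_mse(params, x)
params = train(params, x)
loss_after = reconstruction_mse(params, x)
score = forward(params, x)[2]                             # bottleneck value per time step
print(f"reconstruction MSE before: {loss_before:.4f}, after: {loss_after:.4f}")
```

The bottleneck activation `score` plays the role of the single nonlinear principal component. Changing the layer sizes or the optimizer changes what the bottleneck can represent, which is why the model selection criteria matter.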
Since a more complete characterization of the cooling performance in a data center (e.g., temperature field solutions) from the reduced-order flow approximations is still challenging, further efforts are needed to develop and implement an extended procedure, as shown in Fig. 15, which suggests combining reduced-order flow models (e.g., NLPCA) with temperature–velocity coupling techniques (e.g., the zonal method) for further enhanced optimal control studies. On the one hand, NLPCA could proactively feed complementary information into the zonal models to define systematic flows. On the other hand, the zonal methods suggested by the studies in [31,32] may serve as an efficient modeling approach capable of relating the airflow patterns to the temperature distributions to predict important thermal trends in data centers (e.g., the region of rising temperature due to the leakage effect). The utility of the approach could be further tested with available and dedicated experimental measurements, validated CFD simulations, and a wide variety of cooling and IT workload management strategies. Such developments should lead to more generalized and intelligent tools for real-time data center thermal analysis.

Acknowledgments

This research was funded by the Small Scale Systems Integration and Packaging Center (S3IP) at Binghamton University. S3IP is a New York State Center of Excellence and receives funding from the New York State Office of Science, Technology and Innovation (NYSTAR), the Empire State Development Corporation, and a consortium of industrial members.

References

[1] S. Ahuja, C. Rowley, I. Kevrekidis, M. Wei, Low-dimensional models for control of leading-edge vortices: equilibria and linearized models, in: AIAA Proceedings of 45th Aerospace Sciences Meeting and Exhibit, Reno, NV, USA, 2007, pp. 1–12.
[2] B.R. Noack, K. Afanasiev, M. Morzynski, G. Tadmor, F. Thiele, A hierarchy of low-dimensional models for the transient and post-transient cylinder wake, J. Fluid Mech. 497 (2003) 335–363.


[3] T.K. Sengupta, N. Singh, V. Suman, Dynamical system approach to instability of flow past a circular cylinder, J. Fluid Mech. 656 (2010) 82–115.
[4] P. Ding, X.H. Wu, Y.L. He, W.Q. Tao, A fast and efficient method for predicting fluid flow and heat transfer problems, J. Heat Transfer – Trans. ASME 130 (3) (2008).
[5] A.P. Raghupathy, U. Ghia, K. Ghia, W. Maltz, Boundary-condition-independent reduced-order modeling of heat transfer in complex objects by POD-Galerkin methodology: 1D case study, J. Heat Transfer – Trans. ASME 132 (6) (2010).
[6] Y. Wang, B. Yu, Z. Cao, W. Zou, G. Yu, A comparative study of POD interpolation and POD projection methods for fast and accurate prediction of heat transfer problems, Int. J. Heat Mass Transfer 55 (2012) 4827–4836.
[7] Q. Nie, Y. Joshi, Reduced order modeling and experimental validation of steady turbulent convection in connected domains, Int. J. Heat Mass Transfer 51 (2008) 6063–6076.
[8] Y. Joshi, Reduced order thermal models of multiscale microsystems, J. Heat Transfer – Trans. ASME 134 (3) (2012).
[9] E. Samadiani, Y. Joshi, Multi-parameter model reduction in multi-scale convective systems, Int. J. Heat Mass Transfer 53 (2010) 2193–2205.
[10] E. Samadiani, Y. Joshi, Proper orthogonal decomposition for reduced order thermal modeling of air cooled data centers, J. Heat Transfer – Trans. ASME 132 (7) (2010).
[11] E. Samadiani, Y. Joshi, H. Hamann, M.K. Iyengar, S. Kamalsy, J. Lacey, Reduced order thermal modeling of data centers via distributed sensor data, J. Heat Transfer – Trans. ASME 134 (4) (2012).
[12] R. Ghosh, Y. Joshi, Dynamic reduced order thermal modeling of data center air temperatures, in: Proc. ASME InterPACK, Portland, OR, USA, vol. 2, 2011, pp. 423–432.
[13] R. Ghosh, Y. Joshi, Error estimation in POD-based dynamic reduced-order thermal modeling of data centers, Int. J. Heat Mass Transfer 57 (2013) 698–707.
[14] S.V. Patankar, Airflow and cooling in a data center, ASME J. Heat Transfer 132 (2010).
[15] FloTHERM Users Manual, Mentor Graphics Inc., Wilsonville, OR, 2010.
[16] H.K. Versteeg, W. Malalasekera, An Introduction to Computational Fluid Dynamics, Prentice Hall, 1995.
[17] S.V. Patankar, Numerical Heat Transfer and Fluid Flow, Hemisphere Pub. Corp, Washington, 1980.
[18] A. Chatterjee, An introduction to the proper orthogonal decomposition, Curr. Sci. 78 (2000) 808–817.
[19] S. Sharma, Applied Multivariate Techniques, 1996.
[20] M.A. Kramer, Nonlinear principal component analysis using autoassociative neural networks, AIChE J. 37 (1991) 233–243.
[21] D. Dong, T.J. McAvoy, Nonlinear principal component analysis – based on principal curves and neural networks, Comput. Chem. Eng. 20 (1996) 65–78.
[22] B. Scholkopf, A. Smola, K.R. Muller, Nonlinear component analysis as a kernel eigenvalue problem, Neural Comput. 10 (1998) 1299–1319.
[23] N. Kambhatla, T.K. Leen, Dimension reduction by local principal component analysis, Neural Comput. 9 (1997) 1493–1516.
[24] W.W. Hsieh, A.M. Wu, A. Shabbar, Nonlinear atmospheric teleconnections, Geophys. Res. Lett. 33 (2006).
[25] A. Herman, Nonlinear principal component analysis of the tidal dynamics in a shallow sea, Geophys. Res. Lett. 34 (2007).
[26] H. Zabiri, M. Ramasamy, NLPCA as a diagnostic tool for control valve stiction, J. Process Control 19 (2009) 1368–1376.
[27] K. Li, H. Su, J. Chu, C. Xu, A fast-POD model for simulation and control of indoor thermal environment of buildings, Build. Environ. 60 (2013) 150–157.
[28] M. Bergmann, L. Cordier, Optimal control of the cylinder wake in the laminar regime by trust-region methods and POD reduced-order models, J. Comput. Phys. 227 (2008) 7813–7840.
[29] P. Guha, M. un Nabi, Optimal control of a nonlinear induction heating system using a proper orthogonal decomposition based reduced order model, J. Process Control 22 (2012) 1681–1687.
[30] X. Wang, U. Kruger, G.W. Irwin, G. McCullough, N. McDowell, Nonlinear PCA with the local approach for diesel engine fault detection and diagnosis, IEEE Trans. Control Syst. Technol. 16 (2008) 122–129.
[31] Z. Song, B.T. Murray, B. Sammakia, A dynamic compact thermal model for data center analysis and control using the zonal method and artificial neural networks, Appl. Therm. Eng. 62 (2014) 48–57.
[32] Z. Song, B.T. Murray, B. Sammakia, A compact thermal model for data center analysis using the zonal method, Numer. Heat Transfer Part A – Appl. 64 (2013) 361–377.