Automatic differentiation approach for property computations in nanoscale thermal transport

Journal Pre-proof Automatic differentiation approach for property computations in nanoscale thermal transport Prabhakar Marepalli, Sanjay R. Mathur, J...

Download PDF

3MB Sizes 0 Downloads 25 Views

Report

PDF Reader
Full Text

Journal Pre-proof Automatic differentiation approach for property computations in nanoscale thermal transport Prabhakar Marepalli, Sanjay R. Mathur, Jayathi Y. Murthy

PII: DOI: Reference:

S0010-4655(20)30001-1 https://doi.org/10.1016/j.cpc.2020.107138 COMPHY 107138

To appear in:

Computer Physics Communications

Received date : 15 March 2019 Revised date : 16 December 2019 Accepted date : 29 December 2019 Please cite this article as: P. Marepalli, S.R. Mathur and J.Y. Murthy, Automatic differentiation approach for property computations in nanoscale thermal transport, Computer Physics Communications (2020), doi: https://doi.org/10.1016/j.cpc.2020.107138. This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

© 2020 Elsevier B.V. All rights reserved.

Journal Pre-proof

Manuscript

pro of

AUTOMATIC DIFFERENTIATION APPROACH FOR PROPERTY COMPUTATIONS IN NANOSCALE THERMAL TRANSPORT Prabhakar Marepalli,1 Sanjay R. Mathur2 and Jayathi Y. Murthy3 1 The University of Texas at Austin 2 ESI Group 3 University of California, Los Angeles

re-

ABSTRACT We present the automatic code differentiation technique to perform derivative computations in nanoscale phonon transport simulations. This method exploits the concepts of templating and operator overloading in C++ and other similar programming languages to unintrusively convert existing codes into those yielding derivatives of arbitrary order. The idea is demonstrated through the computation of phonon properties such as second and third order force constants, the Gruneisen parameter, group velocities, and the temperature variation of specific heat for materials like graphene and graphene nanoribbons. Derivative values so computed are compared with those obtained using finite difference approaches or with analytical values. The method is found to yield derivative values to machine accuracy, with none of the round-off issues associated with finite difference approaches.

lP

1. Introduction First principles simulations have been the staple tool for predicting the thermal properties of materials for more than a decade [1]–[3]. Typical properties of interest include thermodynamic properties such as specific heat, the Gruneisen parameter, thermal expansion coefficient, macroscopic properties such as thermal conductivity, interface resistance, and phonon properties such as dominant mean free paths. Knowledge of these properties may be used to engineer novel materials/nanocomposites for applications such as thermal management and thermal energy storage.

urn a

While these predicted thermal properties provide important information about material behavior, in order to optimize material design, one needs information about the sensitivity of these properties to design and operating variables. Sensitivity is the variation of an output quantity of interest (QoI) with respect to a specific independent input variable and can be obtained by computing the derivative of the QoI with respect to this input variable. A common example where sensitivity is required is in the design of thermoelectric materials to harvest energy from waste heat [4]–[6] . The design of thermoelectrics is based on the Seebeck effect, and the performance metric, ZT [4], requires a good thermoelectric material to have low thermal conductivity and high electrical conductivity. Thermal transport simulations have been used to engineer materials that possess low thermal conductivity [7], [8]. A robust design of a thermoelectric energy harvester would lie in the region of the design space where the sensitivity of thermal conductivity to parameters like operating temperature and strain (caused by thermal expansion) is minimal. Another example is the design of thermal energy storage materials. A good thermal energy storage material must have large specific heat [9], [10]. For the material to be reliable across a wide range of temperatures, it is also important that the specific heat be insensitive to operating temperature fluctuations.

Jo

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65

Derivatives are also used in thermal transport simulations to describe important crystal properties. Examples include force constants and the Gruneisen parameter. A force constant is a derivative of the total potential energy of the system with respect to the displacement of a given atom. Force constants are extensively used in phonon computations. Force constants of second order are used to compute phonon dispersion which provides valuable information on spectral properties of a crystal like group velocity, thermal conductance, specific heat, etc. The force constants of third and fourth order are needed to compute 1

Journal Pre-proof

anharmonic phonon-phonon interactions [11], [12]. Anharmonic phonon interactions provide information to predict properties like thermal conductivity. The Gruneisen parameter, which describes the effect of crystal volume change on its vibrational properties, is the derivative of phonon frequency with respect to the volume of the crystal. This parameter is also used to describe the anharmonicity of a crystal lattice, which in turn can be used to estimate material properties like thermal conductivity and the coefficient of thermal expansion [13], [14].

pro of

Derivatives are typically computed using a finite difference method. Here, the derivative of the quantity of interest (QoI) with respect to independent variable 𝑥 is approximated as:

[QoI(𝑥+∆𝑥)−QoI(𝑥)] ∆𝑥

(1)

re-

Equation (1) is a first-order forward difference approximation. As ∆𝑥 approaches zero, the difference equation approaches the exact value of derivative. However, the choice of ∆𝑥 in a finite difference scheme depends on the type of function being differentiated and the order of derivative that is sought. In most cases, ∆𝑥 is taken to be around 1/100th to 1/1000th of the range of 𝑥 under consideration. The accuracy of the derivative may be improved by decreasing Δx; this is typically done by sweeping ∆𝑥 across a given range of values and looking for a converged value of the derivative as Δx0. Too large a value of Δx leads to inaccurate derivatives, while too small a value of ∆𝑥 may result in the value of the finite difference derivative blowing up.

urn a

lP

In order to compute the sensitivity of design variables with respect to input parameters, exact derivatives may be calculated analytically by differentiating the underlying partial differential equations. For example, Blackwell and coworkers [15] derived sensitivity equations for the dependence of temperature with respect to material properties, boundary conditions, etc. by analytically differentiating the heat conduction equation with respect to these independent variables, and then solving the resulting partial differential equation (PDE) numerically. Similarly, Henz et al. [16] derived the sensitivity equations for fluid flow models through a porous medium. Sensitivity equations for generalized PDE systems have been presented in [17]. Although the sensitivities obtained from these equations are exact to numerical accuracy, the process for deriving them can be tedious and cumbersome with complicated physical models. Also, computation of sensitivities of higher orders is even more complicated, making it hard to generalize the procedure for all applications. Furthermore, different PDEs and boundary conditions must be derived for each sensitivity of interest, coded into software and solved numerically. Recently, automatic code differentiation techniques have been developed to compute exact derivatives automatically in a problem-agnostic and non-intrusive manner. This class of techniques is based on the idea that any computational code, at a low-level, is a set of elementary arithmetic operations (add, subtract, divide, multiply, etc.). These operations can be overloaded to compute the derivative using the chain rule for differentiation to yield the derivative of the quantity of interest with respect to the input variable. The algorithmic and implementation details of this technique are well developed and have been used in several applications [18], [19]. Very recently, with the advent of machine learning, automatic code differentiation has gained significant attention for use in backpropagation procedures where derivatives are used to update the weights of a neural network [20]. Many standard machine learning frameworks like Tensorflow and Pytorch now offer this utility as an inbuilt feature [21], [22]. In the field of fluid dynamics, Mathur et al., [23] recently applied this technique to unintrusively perform sensitivity analysis in a computational fluid dynamics (CFD) solver.

Jo

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65

Code differentiation exploits the idea of templating and operator overloading in object-oriented languages and can yield exact derivatives, without the need for finite difference approximations. The method is 2

Journal Pre-proof

extensible to any code (such as Quantum Espresso [24] or LAMMPS [25]) written in modern computer languages such as C++ or F90/95. Derivatives of arbitrarily high order with respect to chosen input variables may be found using this idea. The most attractive feature of this idea is that it is unintrusive. Existing codes are easily converted, with virtually no intrusion, into those providing derivative information.

lP

(a)

re-

pro of

In this work, we propose an approach to unintrusively compute derivatives in phonon transport simulations. In the sections that follow, we discuss the main components of the idea. In Section 2, we discuss automatic code differentiation to compute derivatives of any order with respect to a given parameter. In Section 3, we present several examples illustrating the use of automatic code differentiation in nanoscale phonon transport considering materials like graphene and graphene nanoribbons. We first apply the idea to the computation of force constants, which are a staple of phonon spectral analysis. For this demonstration, we consider graphene as the material. We demonstrate that automatic code differentiation provides accurate and reliable second and third-order force constants without the issues of round-off encountered by widely used finite difference methods [26] with minimal change to the code computing the interatomic potential. We then apply the technique to the computation of the Gruneisen parameter, i.e., the derivative of the lattice vibrational frequency with respect to volume, for a graphene nanoribbon, and show that it does not suffer from the inaccuracies of finite difference methods near the  point. We also compute phonon group velocity, i.e., the derivative of the phonon frequency with wave vector, for the nanoribbon, and demonstrate that the automatic code differentiation technique obviates the necessity of tracing the large number of phonon branches, as finite difference methods require. Finally, using our approach, we demonstrate the computation of variation of the crystal specific heat of a graphene nanoribbon due to variation in temperature. In Section 4, we discuss possible extensions of this procedure and how it can be integrated into existing state-of-the-art first-principles simulations.

Original Functions p  3* x 2  sin( y ) q p/ y

urn a

Elemental Decomposition

Inputs : x, y Outputs : p, q Elemental Derivatives

𝑥 𝑥

𝑥 𝑥

( )

𝑥

𝑥

o

(

(b)

Jo

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65

3

)

Journal Pre-proof

Outputs :

Inputs : x  10, y   / 3, x '  1, y '  0

p '  60.0 

p x

x 10, y  /3

q x

x 10, y  /3

pro of

q '  57.2958 

Outputs :

Inputs : x  10, y   / 3, x '  0, y '  1

p '  0.5 

p y

x 10, y  /3

q '  273.8795 

q y

x 10, y  /3

re-

Figure 1. Code differentiation procedure. (a) The functions to be differentiated are decomposed into elemental operations, and elemental derivatives computed. (b) Inputs x and y are prescribed, and also the input derivatives x’ and y’. If one of the input derivatives is set to unity and the other to zero (x’=1, y’=0, 𝜕𝑝 𝜕𝑞 for example), the corresponding output derivatives 𝜕𝑥 , 𝜕𝑥 are computed. These derivates are exact to machine accuracy. 2. Code Differentiation Method

𝑥

( )

.

urn a

lP

In this section, we describe the details of the automatic code-differentiation method. The mode of codedifferentiation described here is referred to as forward mode. This idea is illustrated through a simple example in Figure 1. We are given two outputs p and q, which are functions of two independent inputs x and y. (2) (3)

The objective is to compute the output derivatives p’ and q’ with respect to the input variables x and y at the point x=x0 and y=y0. Here the prime symbol denotes differentiation with respect to the chosen input variable. The algebraic expressions for p and q are first decomposed into elemental operations involving the input variables x and y. Then, their derivatives are written in terms of the elemental derivatives, i.e., the derivatives of inputs x and y (i.e, x’ and y’). Using the above decomposition, we can differentiate each of the elemental functions with respect to an unknown input variable. Let us assume the input variable with respect to which we want to differentiate is 𝑧, where 𝑧 could be either 𝑥 or . Using the expressions from Figure 1,

Jo

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65

𝑥 𝑥 𝑑𝛼 𝑑𝑧

𝑥

𝑑𝑥 𝑑𝑧

𝑑𝑥 𝑑𝑧

(4)

𝑥.

(5)

Once we have all these derivatives computed, it is easy to obtain derivatives with respect to 𝑥 or by 𝑑𝑦 𝑑𝑥 𝑑𝛼 simply setting 𝑧 𝑥 or 𝑧 . For example, if 𝑧 𝑥, 𝑑𝑧 0 and 𝑑𝑧 𝑥 1 giving 𝑑𝑧 2𝑥. 4

Journal Pre-proof

𝑑𝑦

𝑑𝑥

𝑑𝛼

Similarly, if 𝑧 , 𝑑𝑧 1 and 𝑑𝑧 𝑥 0 giving 𝑑𝑧 0. In other words, by setting only one input derivative to unity and all other input derivatives to zero, the procedure computes the exact derivatives of all outputs with respect to the input corresponding to the nonzero derivative. For example, in Figure 1, setting the value of x’ to 1 and y’=0 computes the exact derivative of p and q with respect to x (top example). Similarly, setting the value of y’ to 1 and x’=0 computes the exact derivative of p and q with respect to y (bottom example).

to compute

𝜕𝑝 𝜕𝑞 , 𝜕𝑦 𝜕𝑦

pro of

It is useful, for purposes of coding (see next section), to think of all variables as consisting of a pair: the variable itself, and its derivative with respect to the chosen input. Thus, p and q would be represented by P=[p,p’] and Q=[q,q’]. The input variables x and y are also represented by such pairs, so that X=[x,x’] and Y=[y,y’]. The prime symbol means differentiation with respect to an arbitrary (user-defined) input. Thus 𝑓(𝑥, 𝑥 , , ) 𝑓(𝑋, 𝑌). When evaluating , we need to specify 𝑥, , 𝑥’, ’ as one would for any other function. If the variable we are differentiating with respect to is 𝑥, then 𝑑𝑥 𝑑𝑥 1 and 𝑑 𝑑𝑥 0, so 𝑥’ 1 and ’ 0. Thus, providing an input X=[x0, 1] and Y=[y0, 0] indicates that we wish to 𝜕𝑝 𝜕𝑞 compute 𝜕𝑥 , 𝜕𝑥 at (x0, y0). Alternatively, providing an input X=[x0, 0] and Y=[y0, 1] signals that we wish at (x0, y0). The specific algebraic operations are achieved in our solver through

re-

templating and operator overloading in C++, as described in the sections that follow. (a)

Original Functions

Inputs : x, y Outputs : p, q

Elemental Derivatives

urn a

Elemental Decomposition

lP

p  3* x 2  sin( y ) q p/ y

Jo

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65

5

Journal Pre-proof

(b)

Outputs : Inputs : x  x0 , x1 '  0, x2 '  0, x12 '  0 p p1'  0.5  y  y0 , y1 '  1, y2 '  1, y12 '  0 y p2'  0.5 

Outputs : q1'  273.879  x 10, y  /3

p y

Outputs :

x0  10, y0   / 3

p x

 p yy 2

q12'  522.244 

x 10, y  /3

q1'  57.2958 

x 10, y  /3

p y

x 10, y  /3

 q yy 2

q x

x 10, y  /3

 p xy

x 10, y  /3

q2'  273.879 

q y

q12'  54.7134 

 q xy

x 10, y  /3

re-

p2'  0.5  p12'  0 

q y

Outputs :

p1'  60 

y  y0 , y1 '  0, y2 '  1, y12 '  0

x 10, y  /3

x 10, y  /3

p12'  0.866025 

Inputs : x  x0 , x1 '  1, x2 '  0, x12 '  0

q y

q2'  273.879 

pro of

x0  10, y0   / 3

2

x 10, y  /3

x 10, y  /3 2

x 10, y  /3

1, 𝑥

unity and others zero (𝑥1

lP

Figure 2. Code differentiation procedure for derivatives of second order. (a) Here the elemental derivatives are computed with respect to 𝑥1 , 𝑥 , 1 , . (b) Input values of these parameters determine the type of corresponding output derivative. Since a second order derivative is sought, two inputs are always set to 0,

1

0,

1, to find

𝜕2 𝜕𝑥𝜕𝑦

for example).

The above idea can easily be extended to compute the derivatives of any chosen order. In Figure 2, we show the extension for evaluating second derivatives. We consider the same functions used in the above 𝜕2 𝑞 . 𝜕𝑥𝜕𝑦

and q”=

𝑥

( ) and

𝜕2 𝑝

. Our objective is to compute output second derivatives p”=𝜕𝑥𝜕𝑦

urn a

example, p

In order to accommodate second derivatives, we now consider each variable as consisting of four components rather than two. Thus, 𝑃 [ , 1 , , 1 ],and 𝑄 [ , 1 , , 1 ]. Here, the notation 1 denotes a first derivative with respect to input variable #1, denotes a first derivative with respect to input variable #2, and 1 denotes the second derivative with respect to input variables #1 and #2. The procedure is similar to the one discussed for first order derivatives earlier. But now, we are differentiating with respect to two unknown variables. Let us assume the input variables are 𝑧1 and 𝑧 . The corresponding variable

inputs 𝑥 and

when we need 𝑑2 𝑥 𝑑𝑧1 𝑑𝑧2

𝑑2 𝑝 ]. The variable is now a function of 1 2 1 𝑑𝑧2 𝑑𝑥 𝑑𝑥 𝑑2 𝑥 𝑑𝑦 𝑑𝑦 𝑑2 𝑦 and their corresponding derivatives, i.e. p= (𝑥, 𝑑𝑧 , 𝑑𝑧 , 𝑑𝑧 𝑑𝑧 , , 𝑑𝑧 , 𝑑𝑧 , 𝑑𝑧 𝑑𝑧 ). Thus, 1 2 1 2 1 2 1 2 𝑑𝑥 𝑑𝑥 the second derivative with respect to 𝑥, we choose 𝑧1 𝑥 and 𝑧 𝑥, so 𝑑𝑧 1, 𝑑𝑧 1, 1 2 𝑑𝑝

𝑑𝑝

has four values associated with them [ , 𝑑𝑧 , 𝑑𝑧 , 𝑑𝑧

Jo

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65

0 and all the y-derivatives are 0. In other words, the inputs x and y are represented as 𝑋

[𝑥, 𝑥1 , 𝑥 , 𝑥1 ], and 𝑌

[ ,

1,

,

1

]. If we wished to compute 6

𝜕2 𝑝 𝜕𝑥 2

and

𝜕2 𝑞 𝜕𝑥 2

at (𝑥0 ,

0 ),

we would

Journal Pre-proof

choose as inputs 𝑋 [𝑥0 , 1,1, 0], and 𝑌 [ 0 , 0,0,0]. This indicates the computation that we wish to produce second derivatives with respect to a single input variable, x. The computational process, outlined 𝜕𝑝

in Figure 2, would produce 𝑃 𝜕2 𝑝 𝜕𝑦 2

to compute

and

𝜕2 𝑞 𝜕𝑦 2

𝜕𝑝 𝜕2 𝑝

[ , 𝜕𝑥 , 𝜕𝑥 , 𝜕𝑥 2 ] at the point (𝑥0 ,

at (𝑥0 ,

0 ),

0 ).

we would choose as inputs 𝑋

If, on the other hand, we wished [𝑥0 , 0,0, 0], and 𝑌

[ 0 , 1,1,0].

This signals the computation that we seek to produces second derivatives with respect to a single input 𝜕𝑝

𝑄

𝜕𝑞

𝜕𝑞

𝜕2 𝑞

[ , 𝜕𝑦 , 𝜕𝑦 , 𝜕𝑦2 ] at the point (𝑥0 ,

0 ).

An input of 𝑋

we want a mixed second derivative, would produce 𝑃 the point (𝑥0 ,

0 ).

[𝑥, 1,0, 0], and 𝑌

[

𝜕𝑝 𝜕𝑝 𝜕2 𝑝 , 𝜕𝑥 , 𝜕𝑦 , 𝜕𝑥𝜕𝑦

𝜕𝑝 𝜕2 𝑝

[ , 𝜕𝑦 , 𝜕𝑦 , 𝜕𝑦2 ] and

pro of

variable, y. The computational process, outlined in Figure 2, would produce 𝑃

[ , 0,1,0], signaling that

] and 𝑄

𝜕𝑞

𝜕𝑞

𝜕2 𝑞

[ , 𝜕𝑥 , 𝜕𝑦 , 𝜕𝑥𝜕𝑦] at

We now turn to the example shown in Figure 2. As before, elemental decomposition is performed and elemental derivatives are computed using a set of intermediate variables, [ , 1, , 1 ] , [ , 1 , , 1 ] and [ , 1 , , 1 ] and exploiting the chain rule. Passing input variables 𝑋 [10, 1,0, 0], and 𝑌 [𝜋 ,0,1,0] through the concatenation of algebraic operations produces 𝑃 ] and 𝑄

[ ,

𝜕𝑞 𝜕𝑞 , 𝜕𝑥 𝜕𝑦

.

𝜕2 𝑞 ] 𝜕𝑥𝜕𝑦

at the point (𝑥0

lP

(a)

(b)

10,

re-

𝜕𝑝 𝜕𝑝 𝜕2 𝑝 , , 𝜕𝑥 𝜕𝑦 𝜕𝑥𝜕𝑦

urn a

[ ,

Jo

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65

7

0

𝜋

), as shown.

Journal Pre-proof

Examples for Template Usage (

Output  First Derivative

f x

Output  ( x0 , y0 , z0 )

x  [ x0 ,1] Inputs : y  [ y0 , 0]

z  [ z0 ,1]

2 f x 2

Output 

( x0 , y0 , z0 )

0

0)

z  [ z0 , 0, 0, 0]

z  [ z0 , 0, 0, 0]

Output 

( x0 , y0 , z0 )

re-

Third Derivative

0

Inputs : y  [ y0 , 0,1, 0]

Inputs : y  [ y0 , 0, 0, 0]

3 f x3

2 f xy ( x , y , z

x  [ x0 , 1, 0, 0]

x  [ x0 , 1, 1, 0]

Output 

pro of

Second Derivative

( x0 , y0 , z0 )

Inputs : y  [ y0 , 0]

z  [ z0 , 0] Output 

f z

x  [ x0 , 0]

x  [ x0 , 1, 1, 1, 0, 0, 0, 0]

3 f xyz

( x0 , y0 , z0 )

x  [ x0 , 1, 0, 0, 0, 0, 0, 0]

Inputs : y  [ y0 , 0, 0, 0, 0, 0, 0, 0]

Inputs : y  [ y0 , 0,1, 0, 0, 0, 0, 0]

z  [ z0 , 0, 0, 0, 0, 0, 0, 0]

z  [ z0 , 0, 0,1, 0, 0, 0, 0]

lP

Figure 3. Examples of template usage. a) Data structure for templates used for first, second and third derivatives; b) examples showing how to choose input values to obtain derivatives of different orders.

urn a

Templating To realize the above idea in our computational code, we employ templating and operator overloading. A template allows functions and classes in a programming language to be operated on with a generic datatype. In general applications, the datatype may be an integer, a floating point decimal, or a character. User-defined datatypes are what make templating powerful. For computing derivatives of nth order, we define a datatype as a variable with 𝑁 elements where 𝑁 2𝑛 . For example, considering a datatype for a 2nd derivative, we define the variable as 𝑉 [𝑣, 𝑑𝑣1, 𝑑𝑣2, 𝑑𝑣1𝑑𝑣2], where 𝑣 is the actual value of the variable, 𝑑𝑣1 is the first derivative with respect to variable 1, 𝑑𝑣2 is the first derivative with respect to variable 2 and 𝑑𝑣1𝑑𝑣2 is the mixed second derivative with respect to variables 1 a d 2. We call these types of variables Tangent variables. Input variables 1 and 2, and their corresponding derivatives are specified by the user at the start. As shown in the example in Figure 1, x and y are Input Tangent variables. The derivatives of these Input Tangent variables with respect to variables 1 and 2 are propagated to the outputs. x’ and y’ can take a value of 1 or 0. A value of 1 denotes that the derivative being computed is with respect to that variable. The data structures for templates of different orders of derivatives are shown in Figure 3 (a). In Figure 3 (b), we show some examples for template usage. These example demonstrate how a combination of input derivative variables can be used to obtain any desired mixed partial derivative. Additional examples are shown in Supplementary Information.

Jo

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65

Using the templated datatype discussed above, we create a class in C++ with the derivatives defined for all elemental mathematical expressions. We name this class the Tangent class.

8

Journal Pre-proof

Regular Operation Addition

Overloaded Operation 𝑑𝑥

double x=x0, y=y0; double z; z=x+y;

𝑑𝑦

Tangent x=[x0, 1], y=[y0, 𝑑𝑥 𝑑𝑥 Tangent z; z=x+y; z’ x’ y’

Output: z=x0 + y0

pro of z=[x0+y0,1+0]

Output: z=[x0+y0, 1]

Multiplication

0];

Overload: z=[x0,1]+[y0,0]

𝑑𝑥

double x=x0, y=y0; double z; z=x*x+y;

Tangent x=[x0, 𝑑𝑥 Tangent z; z=x*x+y;

Output: z=x0*x0+y0

re-

Overload: z1=x*x z=z1 y

𝑧 𝑥

𝑥 𝑥

1], y=[y0,

𝑑𝑦 𝑑𝑥

0];

z1’ x*x’ x’*x z’ z 1’ y

lP

z1=[x0,1]*[x0,1]

z1=[x0*x0, x0*1+1*x0] z=z1+y z=[x0*x0, x0*1+1*x0]+[y0,0] z=[x0*x0+y0, x0*1+1*x0] Output: z=[x0*x0+y0, 2*x0]

𝑧 𝑥

𝑥 𝑥

urn a

Figure 4. Pseudo code for operator overload for a first order derivative. The top example describes the overload procedure for the addition operation. Here the input variables x and y are added together to compute the output variable z. The left hand column shows regular operations addressing double variable types. The corresponding operations on Tangent variable types are shown in the right hand column. For Tangent variables, operations are performed on the variable itself, and also its derivative. The value of x’ is set to 1 (x=[x0,1]), meaning that the derivative of z is computed with respect to x; the value of y’ is set 𝜕𝑧 𝜕𝑧 to zero (y=[y0,0]). The derivative 𝜕𝑥 is stored in the second field of the Tangent variable z (z=[𝑧0, 𝜕𝑥]). The bottom example describes the overload operation for the multiplication operation. In the right column, the derivative of x*x is compute using the uv chain rule for derivatives. Operator Overloading With the dictionary of derivatives available from the Tangent class, the next step is to design a procedure to parse the expressions using the derivative rules. In other words, a procedure is needed to perform the elemental decomposition shown in Figure 1. For this we exploit the operator overlaoding feature of C++. This feature allows user defined definitions for an operator. For example, a multiply operator ( ) may be overloaded to perform the multiplication rule for derivatives. Considering the expression 𝑢 𝑣, the compiler may be programmed to perform the standard multiplication procedure or the multiplication rule for derivatives, (𝑢𝑣 𝑣𝑢 ). The choice of operation depends on the data type of the operands (𝑢, 𝑣). For example, for the multiplication of two variables of user-defined data type 𝑉 [𝑣, 𝑑𝑣], the multiplication

Jo

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65

9

Journal Pre-proof

operation may be overloaded to perform straightforward multiplication for the first element, 𝑣, and the chain rule for derivatives applied to the second element, 𝑑𝑣. Pseudo code is provided in Figure 4 to demonstrate examples of operator overloading. Similar code may be written for second and third derivatives.

pro of

It is important to understand that the code differentiation procedure does not yield symbolic expressions for derivatives. Instead, it yields the exact numerical values of the derivatives for a given set of numerical input values, as seen in Figure 1 and Figure 2. If derivatives are desired for multiple input values, the above procedure must be repeated for every set of input values. This may be time-consuming if the derivatives are needed at many such input values.

re-

We emphasize here that the conversion of an existing legacy code to one providing derivatives is unintrusive if the code is written in F90, C++ or any other language supporting templating and operator overloading. All that is required is including files to perform the templating operation, and to define the operator overloading functions. The code itself does not need to be changed at all, and two versions of the code, one regular and one computing derivatives, are not required. The same code may be “converted” at compile time simply by choosing suitable compiler options. Also, the computation of sensitivities and derivatives carries over to any new models, such as for specific heat or thermal conductivity, that are added to the code – no new code needs to be written to support derivative computations. Thus, the automatic code differentiation idea presented here is a powerful, unintrusive and convenient way to add derivative computation capabilities to existing codes. To demonstrate the ease of use and the process of “converting” a regular code to a code that can compute derivatives, we show the code snippets for computing second order force constants for graphene nanoribbons in supplementary information.

lP

3. Examples

urn a

3.1 Force Constants Force constants are an integral part of phonon spectral analysis of crystalline materials. A force constant is a derivative of the total potential energy of the system with respect to the displacement of a given atom. Force constants of second order are used to compute phonon dispersion and force constants of third order are needed to compute anharmonic phonon-phonon interactions. Several techniques exist for computation of force constants. The most commonly used techniques are density functional perturbation theory (DFPT), the method of finite differences or the direct method, and the exploitation crystal symmetries and invariances [29]–[31] . DFPT uses a quantum mechanical approach to analyze the crystal response to an external perturbation [27], [29], [30]. This response is then used to compute the second, third, and higherorder derivatives. Although DFPT is one of the most accurate methods for predicting force constants, it is computationally very expensive. The direct method is the most widely-used and easiest of methods. Here, atoms in a supercell of the crystal are manually displaced, and the change in forces on other atoms due to this displacement is recorded. The forces themselves are computed either from first-principle DFT (Density Functional Theory) calculations or from empirical interatomic potentials. These force changes are used to compute the force constant, which is the derivative of force with displacement, using a finite difference formula. A similar procedure is followed for derivatives of all orders. A limitation of this method is that the number of displacements to be performed increases with the order of the derivative, making the process cumbersome. The derivative calculation is also approximate in nature: the value of the derivative depends on the value of the displacement. Too large a displacement leads to an approximate derivative, whereas too-small a displacement can lead to round-off errors and the derivative blowing up because of a division by a nearlyzero number. The task of finding the appropriate displacement is itself a significant part of the problem in the finite difference method. This value depends on the potential and the order of the derivatives being computed. Nevertheless, several variations of this method exist and are widely used [33]–[36].

Jo

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65

10

Journal Pre-proof

A recently developed technique exploits the symmetries of force constants and the crystal lattice [28]. The force on an atom i in a crystal is computed as the derivative of the total potential energy of the system with 𝜕𝑈 respect to that atom (𝐹𝑖 ), where the potential energy is defined using Taylor’s expansion as 𝜕𝑢𝑖 1 1 1 𝑈 𝑈0 ∑ Π𝑖 𝑢𝑖 ∑ Φ𝑖𝑗 𝑢𝑖 𝑢𝑗 ∑ Ψ𝑖𝑗𝑘 𝑢𝑖 𝑢𝑗 𝑢𝑘 ∑ χ𝑖𝑗𝑘𝑙 𝑢𝑖 𝑢𝑗 𝑢𝑘 𝑢𝑙 ⋯ 2! ! 4! 𝑖𝑗

𝑖𝑗𝑘

𝑖𝑗𝑘𝑙

(6)

pro of

𝑖

Here Π𝑖 , Φ𝑖𝑗 , Ψ𝑖𝑗𝑘 , χ𝑖𝑗𝑘𝑙 are the force constants of first, second, third and fourth order respectively and 𝑈0 is the equilibrium crystal energy. A set of equations describing the relations between these force constants can be obtained by applying principles of force constant symmetry and invariance. With these relations between the force constants and a set of force (𝐹𝑖 ) versus displacement (𝑢𝑖 ) data for the crystal lattice, one can form a linear system of equations with the force constants being the unknowns. The overall accuracy of the force constants is determined by the accuracy of the force versus displacement data. A common practice is to obtain this data from molecular dynamics (MD) or density functional theory (DFT) simulations.

re-

Since a force constant is nothing but a derivative of total crystal potential energy, we can exploit the idea of code differentiation to automatically compute the exact force constants. The total potential energy can be obtained using interatomic potentials. Most commonly-used interatomic potentials have well-defined closed form expressions and are shown to capture crystal behavior reasonably well. For example, the Tersoff, Brenner potentials and EDIP (Environment dependent interatomic potential) are commonly used for silicon [37]–[39]. The Tersoff and Brenner potentials (with modified coefficients) can be used for graphene [38] as well.

lP

In this section, we demonstrate the application of the code differentiation method to compute force constants of graphene by using Tersoff interatomic potential [38]. The crystal structure of graphene used for this computation is shown in Figure 5. The atoms in the primitive unit cell are denoted as A and B, and the reference coordinate axes are also shown. Following are the force constants computed and the variable notations:

urn a

a) The first order force constant computed is for the displacement of atom A in the x-direction (Figure 𝜕𝑈 5). This displacement is denoted by 𝑢𝐴,𝑥 . The corresponding force constant is 𝜕𝑢 , which provides 𝐴,𝑥

the first order derivative of total crystal potential with respect to the displacement of atom A in the x-direction. b) The second order force constant computed is

𝜕2 𝑈 , 𝜕𝑢𝐴,𝑥 𝜕𝑢𝐴,𝑥

which provides the second order

derivative of total crystal potential with respect to the displacement of atom A in the x-direction. c) Finally, the third order force constant is computed as

𝜕3 𝑈 , 𝜕𝑢𝐴,𝑥 𝜕𝑢𝐴,𝑥 𝜕𝑢𝐴,𝑥

which provides the third

order derivative of total crystal potential with respect to the displacement of atom A in the xdirection.

Jo

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65

All three derivatives are shown in Figure 6. The derivatives computed using code differentiation are shown by the blue line. The corresponding derivatives computed using the method of finite difference are also shown. For finite differences, we use the second-order accurate central difference formulae shown below:

11

Journal Pre-proof

𝑈𝑢𝐴,𝑥 +∆𝑢𝐴,𝑥 −𝑈𝑢𝐴,𝑥−∆𝑢

𝑑𝑈

𝐴,𝑥

𝑑𝑢𝐴,𝑥 𝑑2𝑈

𝑈𝑢𝐴,𝑥 +∆𝑢𝐴,𝑥 −

𝑑𝑢𝐴,𝑥 𝑑𝑢𝐴,𝑥 ∆𝑢𝐴,𝑥

𝑈𝑢𝐴,𝑥 −

∆𝑢𝐴,𝑥

2 𝑈𝑢𝐴,𝑥 −∆𝑢𝐴,𝑥

2 𝑈𝑢𝐴,𝑥 +∆𝑢𝐴,𝑥

(1)

2 ∆𝑢𝐴,𝑥 ∆𝑢𝐴,𝑥 ∆𝑢𝐴,𝑥

urn a

lP

A

Unit Cell

(8)

∆𝑢𝐴,𝑥 ∆𝑢𝐴,𝑥

pro of

𝑈𝑢𝐴,𝑥 +

𝑈𝑢𝐴,𝑥 +𝑈𝑢𝐴,𝑥 −∆𝑢𝐴,𝑥

re-

𝑑3 𝑈 𝑑𝑢𝐴,𝑥 𝑑𝑢𝐴,𝑥 𝑑𝑢𝐴,𝑥

(7)

∆𝑢𝐴,𝑥

(9)

B

y

x

Figure 5. Crystal lattice of graphene used in the computation of first, second and third order force constants. The unit cell is highlighted showing atoms A and B. The coordinate axis are also shown. (a)

Jo

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65

12

reurn a

lP

(b)

Jo

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65

pro of

Journal Pre-proof

(c) 13

relP

Figure 6. Code differentiation versus finite differencing for first, second and third derivatives of the potential energy of a bulk graphene crystal. Sensitivities of finite difference derivatives with respect to the choice of ∆𝑢 are shown in inset plots for second and third derivatives.

urn a

Each finite difference derivative is computed for a range of Δ𝑢/a values, where a is the interatomic distance. It can be seen from the figure that the code differentiation procedure is independent of the choice of Δ𝑢, since the idea does not employ finite difference concepts at all. The value it computes is exact to machine precision. Additionally, as discussed in the introduction, force constants follow the symmetry and invariance relations of the underlying crystal [28]. We have tested for these relations with the force constants computed from code differentiation. We find that all the important symmetry relations (invariance to permutation of indices, translational invariance, rotational invariance, etc.) are satisfied at least to the degree obtained by the least-squares fit in [28]. Derivatives computed using finite differences converge quadratically to the exact value, consistent with a second-order differencing scheme. However, we see that for the second and third derivatives, as Δ𝑢 → 0, the finite difference calculation begins to diverge due to round-off. More importantly, in the case of finite difference, it is not guaranteed that the value of optimized Δ𝑢 is constant for all derivatives for a problem of interest. For instance, considering the set of all third order force constants needed to understand thermal transport in graphene, the computation of each unique force constant in that set may need an optimized value of displacement that gives a force constant that is close to exact value. Finding such displacements when there is no reference of what an exact derivative could be (which is the case with DFT calculations) is difficult. While the validity of force constants computed using this method can be checked by ensuring that they satisfy symmetries as discussed above [28], these checks cannot guarantee exact force constants, therefore leaving a degree of uncertainty regarding the force constants obtained. In cases where there is a

Jo

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65

pro of

Journal Pre-proof

14

Journal Pre-proof

high degree of sensitivity of material properties to force constants, the uncertainty in force constants is propagated to the computed material properties. Force constants computed using code differentiation do not suffer from any of these limitations as the derivatives are exact. They also satisfy all important symmetry relations to the degree obtained by the least-squares fit in [28] or better.

re-

pro of

Another major limitation of the finite difference method is the need for several DFT computations or potential energy solves. As discussed above, the method of finite differences involves displacing atoms and measuring the forces on lattice in the displaced configuration. In DFT techniques, this means performing a new DFT calculation for every atomic displacement. This makes force constant computation one of the most expensive steps in first-principles thermal transport modeling. Specifically, when force constants of higher order are required, the number of atomic displacements grows quickly. Computation of each first-order force constant needs two atomic displacements. A second-order force constant needs 4 atomic displacements. A third order force constant needs 8 and fourth order needs 16 atomic displacements. Given that the DFT computation itself is very expensive, having to perform tens of atomic displacements for each force computation can become very expensive. Furthermore, in cases where there are a large number of atoms per unit cell, a large number of force constants are needed, making the process prohibitively expensive. With recent findings on the importance of four-phonon scattering in the prediction of thermal conductivity [12], which in-turn needs fourth order force constants, force constant computations through finite difference is turning out to be the bottleneck in thermal transport models. Researchers employ techniques such as the exploitation of symmetries, elimination of force constant computations for those atomic combinations that are expected to provide infinitesimal force constants, etc. [28], [39]. While these techniques reduce the overall computation time, the fundamental limitation of having to perform several DFT or potential energy solves remains.

urn a

lP

Force constants computed using code differentiation do not have the problem of multiple DFT or potential energy computations for each force constant. Computing a force constant of any order with respect to a given atomic displacement needs only a single DFT computation in the framework of code differentiation. However, the code differentiation procedure does have a runtime overhead due to additional operator overloading and templating operations on each variable. Depending on the problem, the cost associated with these overhead operations may be comparable to the multiple function evaluations required by the finite difference procedure. Careful design of the code differentiation template can reduce runtime costs. For instance, as shown in Figure 3, the template for a third-order force constant for a specific atomic displacement automatically provides first- and second-order force constants. Additionally, the technique described in this work uses forward mode code differentiation to compute derivatives. It has been shown in the literature that using reverse mode computation for code differentiation can significantly speedup derivative computations by at least an order of magnitude [18].

Jo

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65

15

re-

y

urn a

lP

x

Jo

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65

pro of

Journal Pre-proof

16

Journal Pre-proof

urn a

lP

re-

pro of

Figure 7. Graphene nanoribbon of width 0.44 nm is used for demonstration. On the right is the phonon dispersion computed by solving lattice dynamics equation on the ribbon structure. Acoustic branches are highlighted in blue, green and red.

Jo

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65

17

relP urn a

Figure 8. Gruneisen parameter - code differentiation versus finite difference. The parameter is computed for the transverse (branch 2) and longitudinal branches (branch 3) of the phonon dispersion of the graphene nanoribbon. The Gruneisen parameter computed using the finite difference method is seen to vary significantly with δ𝑎 near the Γ point (shown in inset plots). This occurs due to the varying slopes of the linear region of these dispersion curves with respect to a. 3.2 Gruneisen Parameter The Gruneisen parameter is a non-dimensional quantity that describes the response of the vibrational properties of a crystal due to strain applied on it. This source of strain may be mechanical or thermal and can be compressive or tensile. Gruneisen parameters provide useful information about thermal expansion mechanism of a crystal, anharmonicity, etc. [14], [40], [41]. Several definitions exist for the computation of the Gruneisen parameter. The most commonly used definition computes the variation of phonon vibrational frequency due to change in lattice constant. The mode-wise Gruneisen parameter at wave vector k for a polarization p may be defined as

Jo

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65

pro of

Journal Pre-proof

18

Journal Pre-proof

𝑘(𝑝)

𝑉

𝜕𝜔𝑘(𝑝)

𝜔 ,𝑘(𝑝)

𝜕𝑉

(10)

where V is the volume of the crystal, 𝜔0,𝑘(𝑝) is the frequency of a phonon of polarization p at wave vector 𝜕𝜔𝑘(𝑝)

k at equilibrium. The derivative 𝜕𝑉 in the above equation may be computed exactly using code differentiation. Here the input derivative is the volume of the crystal or lattice constant 𝜕𝜔𝑘(𝑝)

pro of

( 𝜕𝑎 , e 𝑉 𝑓(𝑎)) in case of materials like graphene where only one lattice parameter is used to describe the crystal. In this example, we choose a graphene nanoribbon structure for demonstrating Gruneisen parameter computation using the code differentiation procedure. A narrow zig-zag nanoribbon of width 0.44 nm is picked for this purpose. The x-direction is confined and thermal transport occurs only in the y-direction (see Figure 7). The phonon dispersion for this structure is shown in Figure 7. The dispersion highlights the three acoustic phonon branches in blue, green and red. In Figure 8, we show the Gruneisen parameters computed for the transverse and longitudinal (green and red, respectively) branches. The corresponding Gruneisen parameters computed using the finite difference scheme are also shown for different values of 𝜕𝜔𝑘(𝑝) 𝜕𝑎

is computed using 𝜕𝜔𝑘(𝑝)

(

𝜕𝑎

𝜔𝑘(𝑝)𝑎+𝛿𝑎 −𝜔𝑘(𝑝)𝑎−𝛿𝑎

re-

𝛿𝑎, where the derivative

𝛿𝑎

)

(11)

lP

As we can see from the figure, the Gruneisen parameters computed using code differentiation match the numerical values well for most of the Γ 𝑀 region. The primary difference is close to the Γ point, where the finite difference method shows a strong sensitivity to the value of δ𝑎 (see inset plots in Figure 8). The 𝜕𝜔𝑘(𝑝)

slope in the linear region is very sensitive to the applied strain and the appropriate value of is 𝜕𝑎 therefore difficult to compute accurately. To demonstrate this, in the lower inset plot of Figure 8, we show 𝜕𝜔𝑘(𝑝) 𝜕𝑎

computed using the finite difference method with different values of δ𝑎 using the equation above.

In this plot, we can clearly see that

𝜕𝜔𝑘(𝑝) 𝜕𝑎

does not change monotonically with 𝑎 and only using an 𝜕𝑎

urn a

appropriate value 𝑎 can we find the exact derivative (In Figure 8, a value of 0.0068 is found to 𝑎 provide the closest match with exact derivative.) In situations like this, the method of code differentiation is very helpful as the derivative is exact and does not depend on perturbing the value of a to compute finite differences. 3.3 Group Velocities Phonon group velocities for the nanoribbon may also be computed using the code differentiation procedure discussed above. The group velocity vector of a phonon of wave vector k is given by k  where 𝜔 is the frequency and k is the wave vector; it may be computed from derivatives of the dispersion relation. For a 𝜕𝜔 1D Brillouin zone, such as that for a graphene nanoribbon, for example, 𝑣𝑔 . For this case, this 𝜕𝑘 quantity is typically computed by performing finite differencing on each branch of the dispersion curve (𝜔 𝑘). It is necessary to identify the branch of interest and to stay on it during the differencing procedure. The identification of the phonon branches, however, is not straightforward in the case of crystals with several atoms in a unit cell, where multiple intersections of polarization branches make this type of computation error-prone (see Figure 7). One of the other approaches used to compute group velocity without having to perform branch sorting involves directly differentiating the eigenvalue equation that is used to compute phonon dispersion spectrum [42]. The eigenvalue equation for phonon dispersion is given by

Jo

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65

19

Journal Pre-proof

𝜔 (𝑘)𝑒(𝑘)

𝐷𝑒(𝑘)

(12)

pro of

Here 𝑒(𝑘) is the eigenvector and 𝐷 is the dynamical matrix. Differentiating Equation (12), one can define the group velocity at a given wave vector and polarization as a function of the eigenvectors of the system. The method of code differentiation can also be used to bypass the branch identification step altogether. The variable ω2 in the eigen problem is identified as a Tangent variable. The corresponding wave vectors for the desired derivatives are then Input Tangent variables. For example, in order to compute 𝑣𝑔(𝑖,𝑝) , the '

'

urn a

lP

re-

group velocity for the ith point in k-space and the pth branch index, the Input Tangent variables k xi , p  , k y i , p  are given a value of 1. This results in the output variable 𝜔 with the following form: 𝜔 [𝜔, 𝑣𝑔,𝑥 , 𝑣𝑔,𝑦 , 𝑣𝑔,𝑥𝑦 ]. In Figure 9, we show the group velocity 𝑣𝑔,𝑦 for branch 2 of the graphene nanoribbon computed using the above procedure. Also shown in the figure is the group velocity computed using the finite difference method with a 𝛿𝑘 of 𝑘𝑚𝑎𝑥 160, where 𝑘𝑚𝑎𝑥 𝜋⁄ . √ 𝑎

Figure 9. Phonon group velocity component 𝒗𝒈,𝒚 for branch 2 (dark green in Figure 7) of the dispersion using the code differentiation and finite difference methods. Both cases use a k-space grid size of 160 points.

Jo

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65

3.4 Temperature Variation of Specific Heat of Graphene Nanoribbons As a final example, we compute the variation of the specific heat of the graphene nanoribbon with respect to small changes in temperature. Specific heat is an important parameter for applications such as thermal energy storage. A good thermal storage material does not exhibit large variations of specific heat with 20

Journal Pre-proof

temperature within the prescribed range. Information on the sensitivity of specific heat can provide useful insight into designing such materials. In this example, we consider graphene nanoribbons with the lattice 𝜕𝐶 structure shown in Figure 7, but with different widths, and compute the derivative of specific heat ( 𝜕𝑇𝑣 ) with temperature. This derivative can be used to compute the variation in specific heat using the expression

∆𝐶𝑣 ≈

𝜕𝐶𝑣 𝜕𝑇

∆𝑇

(13)

pro of

where ∆𝑇 is the variation in temperature. Three different widths of the graphene nanoribbon are considered – 0.44 nm, 0.88 nm and 2.66 nm -- with 4, 8 and 24 atoms per unit cell, respectively. The variation of specific heat with respect to temperature is computed at three different temperatures (T=40K, T=300K, T=900K). Similar to the above examples, we can use the framework of code differentiation to compute 𝜕𝐶 this derivative ( 𝜕𝑇𝑣 ). All that is needed to enable the derivative computation is add the Tangent library to an existing codebase that computes specific heat at a given temperature. Specifically, we set temperature as the Input Tangent variable, i.e. the T’=1. This derivative is propagated through all computations needed for specific heat evaluation to provide the derivative of specific heat with respect to temperature. In Figure 10, we show the specific heat of GNRs of different widths as a function of temperature. For this demonstration, using Equation (13), we compute the variation of specific heat for a 2% variation in temperature.

urn a

lP

re-

In Table 1, we show the specific heat and the variation in specific heat for a 2% variation in temperature for ribbons of different widths and at different temperatures. As we can see from the table, at low temperatures (T=40K), the variation in specific heat (∆𝐶𝑣 ) for a 2% variation in temperature ranges from ~1.6% for the narrowest ribbon to ~2.8% for the widest ribbon - an increase of 1.2%. As the temperature is increased to 300K and then to 900K, the range of variation in specific heat reduces. At 300K, the variation in specific heat (∆𝐶𝑣 ) only changes by 0.5% (from narrowest to widest ribbon). At 900K, this change is under 0.2%. Finally, at temperatures above Debye temperature, specific heat of ribbons of all widths converge therefore converging ∆𝐶𝑣 of all ribbons to zero. Information like this is quite valuable for selecting materials that provide the best performance at different operating conditions. Finally, we also 𝜕𝐶 verify the accuracy of the computed values of ( 𝜕𝑇𝑣 ) by comparing against analytical derivatives which, for this example, can be obtained easily by differentiating the analytical expression for the specific heat with respect to temperature. We find that the code differentiation derivatives match the analytical values to machine precision.

Jo

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65

21

re-

lP

Figure 10. Specific heat of graphene nanoribbons of different widths as a function of temperature. Here we show the specific heat for ribbons of 0.44nm, 0.88nm and 2.66nm. Also shown in the figure are the asymptotic powers at 40K computed using code differentiation showing the variation of specific heat of each ribbon with respect to temperature.

urn a

Table 1. Variation of the specific heat of graphene nanoribbons with temperature. Ribbons of three different widths are chosen for the demonstration. Analysis is performed at three different temperatures (T=40K, T=300K, T=900K). The derivatives computed using code differentiation match to machine precision the theoretical derivative.

Jo

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65

pro of

Journal Pre-proof

22

re-

lP

4. Discussion The method of code differentiation can be easily added to any existing codes that support templating and operator overloading. First principles DFT simulation codes such as Quantum Espresso [24] and Siesta use Fortran (F90) [43] which support such features. Adding a Tangent class to these codes can unintrusively enable computation of derivatives such as exact force constants of any order. This would be quicker and easier than the DFT or DFPT based approaches to computing derivatives of higher order.

urn a

Our method for the computation of derivatives may be generalized to other applications also. The Tangent class is generic in nature and is independent of the function for which the derivatives are being computed. Jacobian matrices (matrices of first-order derivatives of a function with respect to inputs) and Hessian matrices (matrix of second-order derivatives of a function with respect to inputs) are commonly used in many numerical methods. Code differentiation method may be used to compute these matrices to machine accuracy. 5. Closure In this paper, we presented a generalized method, automatic code differentiation, to perform unintrusive derivative computation in nanoscale thermal transport simulations. This method exploits the concepts of templating and operator overloading in the C++ programming language to easily convert existing codes into those providing derivatives. The idea was shown to be useful in computing phonon properties such as force constants, the Gruneisen parameter and group velocity. Examples of derivative computation using code differentiation were discussed and the results are compared with those obtained using finite difference approaches or with analytical values.

Jo

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65

pro of

Journal Pre-proof

Data Availability

The data supporting the results in this paper is available upon request from the corresponding author, PM. Acknowledgments Support of J. Murthy and P. Marepalli under NSF grant #1758004 is gratefully acknowledged.

23

Journal Pre-proof

References

[5] [6] [7]

[8]

[9] [10]

[11] [12]

[13] [14]

[15]

[16]

[17] [18] [19]

pro of

[4]

re-

[3]

lP

[2]

D. Singh, “Frequency and Polarization Resolved Phonon Transport in Carbon and Silicon Nanostructures,” Theses Diss. Available ProQuest, pp. 1–193, Jan. 2011. L. Lindsay, D. A. Broido, and N. Mingo, “Flexural Phonons and Thermal Transport in Graphene,” Phys. Rev. B, vol. 82, no. 11, p. 115427, Sep. 2010. A. Cepellotti, G. Fugallo, L. Paulatto, M. Lazzeri, F. Mauri, and N. Marzari, “Phonon Hydrodynamics in Two-Dimensional Materials,” Nat. Commun., vol. 6, Mar. 2015. G. Chen, M. S. Dresselhaus, G. Dresselhaus, J.-P. Fleurial, and T. Caillat, “Recent Developments in Thermoelectric Materials,” Int. Mater. Rev., vol. 48, no. 1, pp. 45–66, Feb. 2003. M. S. Dresselhaus et al., “New Directions for Low-Dimensional Thermoelectric Materials,” Adv. Mater., vol. 19, no. 8, pp. 1043–1053, Apr. 2007. S. Z. Butler et al., “Progress, Challenges, and Opportunities in Two-Dimensional Materials Beyond Graphene,” ACS Nano, vol. 7, no. 4, pp. 2898–2926, Apr. 2013. B. Qiu and X. Ruan, “Molecular Dynamics Simulations of Lattice Thermal Conductivity of Bismuth Telluride Using Two-Body Interatomic Potentials,” Phys. Rev. B, vol. 80, no. 16, p. 165203, Oct. 2009. B. Qiu et al., “First-Principles Simulation of Electron Mean-Free-Path Spectra and Thermoelectric Properties in Silicon,” EPL Europhys. Lett., vol. 109, no. 5, p. 57006, Mar. 2015. C. Liu, F. Li, L.-P. Ma, and H.-M. Cheng, “Advanced Materials for Energy Storage,” Adv. Mater., vol. 22, no. 8, pp. E28–E62, Feb. 2010. H. Ji et al., “Enhanced Thermal Conductivity of Phase Change Materials with UltrathinGraphite Foams for Thermal Energy Storage,” Energy Environ. Sci., vol. 7, no. 3, pp. 1185–1192, Feb. 2014. J. M. Ziman, Electrons and Phonons: The Theory of Transport Phenomena in Solids. Oxford, New York: Oxford University Press, 2001. T. Feng and X. Ruan, “Four-Phonon Scattering Reduces Intrinsic Thermal Conductivity of Graphene and the Contributions from Flexural Phonons,” Phys. Rev. B, vol. 97, no. 4, p. 045202, Jan. 2018. Y.-J. Han and P. G. Klemens, “Anharmonic Thermal Resistivity of Dielectric Crystals at Low Temperatures,” Phys. Rev. B, vol. 48, no. 9, pp. 6033–6042, Sep. 1993. J. Fabian and P. B. Allen, “Thermal Expansion and Gruneisen Parameters of Amorphous Silicon: A Realistic Model Calculation,” Phys. Rev. Lett., vol. 79, no. 10, pp. 1885–1888, Sep. 1997. R. J. C. Bennie F. Blackwell Kevin J. Dowding, “Development and Implementation of Sensitivity Coefficient Equations for Heat Conduction Problems,” Numer. Heat Transf. Part B Fundam., vol. 36, no. 1, pp. 15–32, Aug. 1999. B.J. Henz, K.K. Tamma, R. Kanapady, N.D. Ngo, and P.W. Chung, “Process Modeling of Composites by Resin Transfer Molding: Practical Applications of Sensitivity Analysis for Isothermal Considerations,” Int. J. Numer. Methods Heat Fluid Flow, vol. 13, no. 4, pp. 415–447, Jun. 2003. D. A. Tortorelli and P. Michaleris, “Design Sensitivity Analysis: Overview and Review,” Inverse Probl. Eng., vol. 1, no. 1, pp. 71–105, Oct. 1994. A. Griewank and A. Walther, Evaluating Derivatives: Principles and Techniques of Algorithmic Differentiation, Second Edition. SIAM, 2008. C. H. Bischof, H. M. Bücker, P. Hovland, U. Naumann, and J. Utke, Advances in Automatic Differentiation, 1st ed. Springer Publishing Company, Incorporated, 2008.

urn a

[1]

Jo

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65

24

Journal Pre-proof

[25] [26]

[27]

[28] [29]

[30] [31]

[32]

[33] [34] [35] [36]

[37] [38]

[39]

pro of

[24]

re-

[22] [23]

lP

[21]

A. G. Baydin, B. A. Pearlmutter, A. A. Radul, and J. M. Siskind, “Automatic Differentiation in Machine Learning: a Survey,” J. Mach. Learn. Res., vol. 18, no. 153, pp. 1–43, 2018. M. Abadi et al., “TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems,” ArXiv160304467 Cs, Mar. 2016. A. Paszke et al., “Automatic Differentiation in Pytorch,” Oct. 2017. J. Y. Murthy and S. R. Mathur, “Computational Heat Transfer in Complex Systems: A Review of Needs and Opportunities,” J. Heat Transf., vol. 134, no. 3, pp. 031016–031016, Jan. 2012. P. Giannozzi et al., “QUANTUM ESPRESSO: A Modular and Open-Source Software Project for Quantum Simulations of Materials,” J. Phys. Condens. Matter, vol. 21, no. 39, p. 395502, Sep. 2009. S. Plimpton, “Fast Parallel Algorithms for Short-Range Molecular Dynamics,” J. Comput. Phys., vol. 117, no. 1, pp. 1–19, Mar. 1995. W. Li, J. Carrete, N. A. Katcho, and N. Mingo, “Shengbte: A Solver of the Boltzmann Transport Equation for Phonons,” Comput. Phys. Commun., vol. 185, no. 6, pp. 1747–1758, Jun. 2014. A. Debernardi and S. Baroni, “Third-Order Density-Functional Perturbation Theory: A Practical Implementation with Applications to Anharmonic Couplings in Si,” Solid State Commun., vol. 91, no. 10, pp. 813–816, Sep. 1994. K. Esfarjani and H. T. Stokes, “Method to Extract Anharmonic Force Constants from First Principles Calculations,” Phys. Rev. B, vol. 77, no. 14, p. 144112, Apr. 2008. X. Gonze, J.-C. Charlier, D. C. Allan, and M. P. Teter, “Interatomic Force Constants from First Principles: The Case of $\alpha${}-Quartz,” Phys. Rev. B, vol. 50, no. 17, pp. 13035– 13038, Nov. 1994. G. Lang et al., “Anharmonic Line Shift and Linewidth of the Raman Mode in Covalent Semiconductors,” Phys. Rev. B, vol. 59, no. 9, pp. 6182–6188, Mar. 1999. K. Kunc and R. M. Martin, “\textit{Ab Initio} Force Constants of GaAs: A New Approach to Calculation of Phonons and Dielectric Properties,” Phys. Rev. Lett., vol. 48, no. 6, pp. 406–409, Feb. 1982. K. Parlinski, Z. Q. Li, and Y. Kawazoe, “First-Principles Determination of the Soft Mode in Cubic ${\mathrm{ZrO}}_{2}$,” Phys. Rev. Lett., vol. 78, no. 21, pp. 4063–4066, May 1997. M. H. F. Sluiter, M. Weinert, and Y. Kawazoe, “Force Constants for Substitutional Alloys,” Phys. Rev. B, vol. 59, no. 6, pp. 4100–4111, Feb. 1999. M. H. F. Sluiter, M. Weinert, and Y. Kawazoe, “Determination of the Elastic Tensor in Low-Symmetry Structures,” EPL Europhys. Lett., vol. 43, no. 2, p. 183, Jul. 1998. J. Tersoff, “New Empirical Approach for the Structure and Energy of Covalent Systems,” Phys. Rev. B, vol. 37, no. 12, pp. 6991–7000, Apr. 1988. D. W. Brenner, “Empirical Potential for Hydrocarbons for Use in Simulating the Chemical Vapor Deposition of Diamond Films,” Phys. Rev. B, vol. 42, no. 15, pp. 9458–9471, Nov. 1990. M. Z. Bazant, E. Kaxiras, and J. F. Justo, “Environment Dependent Interatomic Potential for Bulk Silicon,” Phys. Rev. B, vol. 56, no. 14, pp. 8542–8552, Oct. 1997. L. Lindsay and D. A. Broido, “Optimized Tersoff and Brenner Empirical Potential Parameters for Lattice Dynamics and Phonon Thermal Transport in Carbon Nanotubes and Graphene,” Phys. Rev. B, vol. 81, no. 20, p. 205441, May 2010. T. Feng and X. Ruan, “Quantum mechanical prediction of four-phonon scattering rates and reduced thermal conductivity of solids,” Phys. Rev. B, vol. 93, no. 4, p. 045202, Jan. 2016.

urn a

[20]

Jo

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65

25

Journal Pre-proof

pro of

[43]

re-

[42]

lP

[41]

N. Mounet and N. Marzari, “First-Principles Determination of the Structural, Vibrational and Thermodynamic Properties of Diamond, Graphite, and Derivatives,” Phys. Rev. B, vol. 71, no. 20, p. 205214, May 2005. D. Singh, J. Y. Murthy, and T. S. Fisher, “On the Accuracy of Classical and Long Wavelength Approximations for Phonon Transport in Graphene,” J. Appl. Phys., vol. 110, no. 11, p. 113510, Dec. 2011. J.-S. Wang, J. Wang, and J. T. Lü, “Quantum thermal transport in nanostructures,” Eur. Phys. J. B, vol. 62, no. 4, pp. 381–404, Apr. 2008. J. M. Soler et al., “The SIESTA Method for Ab Initio Order- N Materials Simulation,” J. Phys. Condens. Matter, vol. 14, no. 11, p. 2745, 2002.

urn a

[40]

Jo

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65

26

Journal Pre-proof

Declaration of Interest Statement

Jo

urn a

lP

re-

pro of

Declaration of interest:None

urn a

lP

re-

pro of

Journal Pre-proof

𝐶𝑣 (J/m3/K)

T=300K

T=900K

(∆𝑇 = 2% 𝑜𝑓 𝑇)

4 atoms (0.44 nm)

2.20E+05

3.47E+03

1.58

8 atoms (0.88 nm)

1.28E+05

2.77E+03

2.16

24 atoms (2.66 nm)

8.31E+04

2.29E+03

2.76

4 atoms (0.44 nm)

1.83E+06

3.64E+04

1.99

8 atoms (0.88 nm)

1.58E+06

3.62E+04

2.29

24 atoms (2.66 nm)

1.41E+06

3.59E+04

2.54

4 atoms (0.44 nm)

3.67E+06

2.11E+04

0.58

8 atoms (0.88 nm)

3.57E+06

2.42E+04

0.68

24 atoms (2.66 nm)

3.50E+06

2.62E+04

0.75

Jo

T=40K

𝜕𝐶𝑣 ∗ ∆𝑇 𝜕𝑇

Percentage Variation in Specific Heat

∆𝐶𝑣 ≈

Automatic differentiation approach for property computations in nanoscale thermal transport

Automatic differentiation approach for property computations in nanoscale thermal transport

Recommend Documents