Journal Pre-proof Automatic differentiation approach for property computations in nanoscale thermal transport Prabhakar Marepalli, Sanjay R. Mathur, Jayathi Y. Murthy
PII: DOI: Reference:
S0010-4655(20)30001-1 https://doi.org/10.1016/j.cpc.2020.107138 COMPHY 107138
To appear in:
Computer Physics Communications
Received date : 15 March 2019 Revised date : 16 December 2019 Accepted date : 29 December 2019 Please cite this article as: P. Marepalli, S.R. Mathur and J.Y. Murthy, Automatic differentiation approach for property computations in nanoscale thermal transport, Computer Physics Communications (2020), doi: https://doi.org/10.1016/j.cpc.2020.107138. This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
ยฉ 2020 Elsevier B.V. All rights reserved.
Journal Pre-proof
Manuscript
pro of
AUTOMATIC DIFFERENTIATION APPROACH FOR PROPERTY COMPUTATIONS IN NANOSCALE THERMAL TRANSPORT Prabhakar Marepalli,1 Sanjay R. Mathur2 and Jayathi Y. Murthy3 1 The University of Texas at Austin 2 ESI Group 3 University of California, Los Angeles
re-
ABSTRACT We present the automatic code differentiation technique to perform derivative computations in nanoscale phonon transport simulations. This method exploits the concepts of templating and operator overloading in C++ and other similar programming languages to unintrusively convert existing codes into those yielding derivatives of arbitrary order. The idea is demonstrated through the computation of phonon properties such as second and third order force constants, the Gruneisen parameter, group velocities, and the temperature variation of specific heat for materials like graphene and graphene nanoribbons. Derivative values so computed are compared with those obtained using finite difference approaches or with analytical values. The method is found to yield derivative values to machine accuracy, with none of the round-off issues associated with finite difference approaches.
lP
1. Introduction First principles simulations have been the staple tool for predicting the thermal properties of materials for more than a decade [1]โ[3]. Typical properties of interest include thermodynamic properties such as specific heat, the Gruneisen parameter, thermal expansion coefficient, macroscopic properties such as thermal conductivity, interface resistance, and phonon properties such as dominant mean free paths. Knowledge of these properties may be used to engineer novel materials/nanocomposites for applications such as thermal management and thermal energy storage.
urn a
While these predicted thermal properties provide important information about material behavior, in order to optimize material design, one needs information about the sensitivity of these properties to design and operating variables. Sensitivity is the variation of an output quantity of interest (QoI) with respect to a specific independent input variable and can be obtained by computing the derivative of the QoI with respect to this input variable. A common example where sensitivity is required is in the design of thermoelectric materials to harvest energy from waste heat [4]โ[6] . The design of thermoelectrics is based on the Seebeck effect, and the performance metric, ZT [4], requires a good thermoelectric material to have low thermal conductivity and high electrical conductivity. Thermal transport simulations have been used to engineer materials that possess low thermal conductivity [7], [8]. A robust design of a thermoelectric energy harvester would lie in the region of the design space where the sensitivity of thermal conductivity to parameters like operating temperature and strain (caused by thermal expansion) is minimal. Another example is the design of thermal energy storage materials. A good thermal energy storage material must have large specific heat [9], [10]. For the material to be reliable across a wide range of temperatures, it is also important that the specific heat be insensitive to operating temperature fluctuations.
Jo
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
Derivatives are also used in thermal transport simulations to describe important crystal properties. Examples include force constants and the Gruneisen parameter. A force constant is a derivative of the total potential energy of the system with respect to the displacement of a given atom. Force constants are extensively used in phonon computations. Force constants of second order are used to compute phonon dispersion which provides valuable information on spectral properties of a crystal like group velocity, thermal conductance, specific heat, etc. The force constants of third and fourth order are needed to compute 1
Journal Pre-proof
anharmonic phonon-phonon interactions [11], [12]. Anharmonic phonon interactions provide information to predict properties like thermal conductivity. The Gruneisen parameter, which describes the effect of crystal volume change on its vibrational properties, is the derivative of phonon frequency with respect to the volume of the crystal. This parameter is also used to describe the anharmonicity of a crystal lattice, which in turn can be used to estimate material properties like thermal conductivity and the coefficient of thermal expansion [13], [14].
pro of
Derivatives are typically computed using a finite difference method. Here, the derivative of the quantity of interest (QoI) with respect to independent variable ๐ฅ is approximated as:
[QoI(๐ฅ+โ๐ฅ)โQoI(๐ฅ)] โ๐ฅ
(1)
re-
Equation (1) is a first-order forward difference approximation. As โ๐ฅ approaches zero, the difference equation approaches the exact value of derivative. However, the choice of โ๐ฅ in a finite difference scheme depends on the type of function being differentiated and the order of derivative that is sought. In most cases, โ๐ฅ is taken to be around 1/100th to 1/1000th of the range of ๐ฅ under consideration. The accuracy of the derivative may be improved by decreasing ฮx; this is typically done by sweeping โ๐ฅ across a given range of values and looking for a converged value of the derivative as ฮx๏ฎ0. Too large a value of ฮx leads to inaccurate derivatives, while too small a value of โ๐ฅ may result in the value of the finite difference derivative blowing up.
urn a
lP
In order to compute the sensitivity of design variables with respect to input parameters, exact derivatives may be calculated analytically by differentiating the underlying partial differential equations. For example, Blackwell and coworkers [15] derived sensitivity equations for the dependence of temperature with respect to material properties, boundary conditions, etc. by analytically differentiating the heat conduction equation with respect to these independent variables, and then solving the resulting partial differential equation (PDE) numerically. Similarly, Henz et al. [16] derived the sensitivity equations for fluid flow models through a porous medium. Sensitivity equations for generalized PDE systems have been presented in [17]. Although the sensitivities obtained from these equations are exact to numerical accuracy, the process for deriving them can be tedious and cumbersome with complicated physical models. Also, computation of sensitivities of higher orders is even more complicated, making it hard to generalize the procedure for all applications. Furthermore, different PDEs and boundary conditions must be derived for each sensitivity of interest, coded into software and solved numerically. Recently, automatic code differentiation techniques have been developed to compute exact derivatives automatically in a problem-agnostic and non-intrusive manner. This class of techniques is based on the idea that any computational code, at a low-level, is a set of elementary arithmetic operations (add, subtract, divide, multiply, etc.). These operations can be overloaded to compute the derivative using the chain rule for differentiation to yield the derivative of the quantity of interest with respect to the input variable. The algorithmic and implementation details of this technique are well developed and have been used in several applications [18], [19]. Very recently, with the advent of machine learning, automatic code differentiation has gained significant attention for use in backpropagation procedures where derivatives are used to update the weights of a neural network [20]. Many standard machine learning frameworks like Tensorflow and Pytorch now offer this utility as an inbuilt feature [21], [22]. In the field of fluid dynamics, Mathur et al., [23] recently applied this technique to unintrusively perform sensitivity analysis in a computational fluid dynamics (CFD) solver.
Jo
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
Code differentiation exploits the idea of templating and operator overloading in object-oriented languages and can yield exact derivatives, without the need for finite difference approximations. The method is 2
Journal Pre-proof
extensible to any code (such as Quantum Espresso [24] or LAMMPS [25]) written in modern computer languages such as C++ or F90/95. Derivatives of arbitrarily high order with respect to chosen input variables may be found using this idea. The most attractive feature of this idea is that it is unintrusive. Existing codes are easily converted, with virtually no intrusion, into those providing derivative information.
lP
(a)
re-
pro of
In this work, we propose an approach to unintrusively compute derivatives in phonon transport simulations. In the sections that follow, we discuss the main components of the idea. In Section 2, we discuss automatic code differentiation to compute derivatives of any order with respect to a given parameter. In Section 3, we present several examples illustrating the use of automatic code differentiation in nanoscale phonon transport considering materials like graphene and graphene nanoribbons. We first apply the idea to the computation of force constants, which are a staple of phonon spectral analysis. For this demonstration, we consider graphene as the material. We demonstrate that automatic code differentiation provides accurate and reliable second and third-order force constants without the issues of round-off encountered by widely used finite difference methods [26] with minimal change to the code computing the interatomic potential. We then apply the technique to the computation of the Gruneisen parameter, i.e., the derivative of the lattice vibrational frequency with respect to volume, for a graphene nanoribbon, and show that it does not suffer from the inaccuracies of finite difference methods near the ๏ point. We also compute phonon group velocity, i.e., the derivative of the phonon frequency with wave vector, for the nanoribbon, and demonstrate that the automatic code differentiation technique obviates the necessity of tracing the large number of phonon branches, as finite difference methods require. Finally, using our approach, we demonstrate the computation of variation of the crystal specific heat of a graphene nanoribbon due to variation in temperature. In Section 4, we discuss possible extensions of this procedure and how it can be integrated into existing state-of-the-art first-principles simulations.
Original Functions p ๏ฝ 3* x 2 ๏ซ sin( y ) q๏ฝ p/ y
urn a
Elemental Decomposition
Inputs : x, y Outputs : p, q Elemental Derivatives
๐ฅ ๐ฅ
๐ฅ ๐ฅ
( )
๐ฅ
๐ฅ
o
(
(b)
Jo
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
3
)
Journal Pre-proof
Outputs :
Inputs : x ๏ฝ 10, y ๏ฝ ๏ฐ / 3, x ' ๏ฝ 1, y ' ๏ฝ 0
p ' ๏ฝ 60.0 ๏ฝ
๏ถp ๏ถx
x ๏ฝ10, y ๏ฝ๏ฐ /3
๏ถq ๏ถx
x ๏ฝ10, y ๏ฝ๏ฐ /3
pro of
q ' ๏ฝ 57.2958 ๏ฝ
Outputs :
Inputs : x ๏ฝ 10, y ๏ฝ ๏ฐ / 3, x ' ๏ฝ 0, y ' ๏ฝ 1
p ' ๏ฝ 0.5 ๏ฝ
๏ถp ๏ถy
x ๏ฝ10, y ๏ฝ๏ฐ /3
q ' ๏ฝ ๏ญ273.8795 ๏ฝ
๏ถq ๏ถy
x ๏ฝ10, y ๏ฝ๏ฐ /3
re-
Figure 1. Code differentiation procedure. (a) The functions to be differentiated are decomposed into elemental operations, and elemental derivatives computed. (b) Inputs x and y are prescribed, and also the input derivatives xโ and yโ. If one of the input derivatives is set to unity and the other to zero (xโ=1, yโ=0, ๐๐ ๐๐ for example), the corresponding output derivatives ๐๐ฅ , ๐๐ฅ are computed. These derivates are exact to machine accuracy. 2. Code Differentiation Method
๐ฅ
( )
.
urn a
lP
In this section, we describe the details of the automatic code-differentiation method. The mode of codedifferentiation described here is referred to as forward mode. This idea is illustrated through a simple example in Figure 1. We are given two outputs p and q, which are functions of two independent inputs x and y. (2) (3)
The objective is to compute the output derivatives pโ and qโ with respect to the input variables x and y at the point x=x0 and y=y0. Here the prime symbol denotes differentiation with respect to the chosen input variable. The algebraic expressions for p and q are first decomposed into elemental operations involving the input variables x and y. Then, their derivatives are written in terms of the elemental derivatives, i.e., the derivatives of inputs x and y (i.e, xโ and yโ). Using the above decomposition, we can differentiate each of the elemental functions with respect to an unknown input variable. Let us assume the input variable with respect to which we want to differentiate is ๐ง, where ๐ง could be either ๐ฅ or . Using the expressions from Figure 1,
Jo
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
๐ฅ ๐ฅ ๐๐ผ ๐๐ง
๐ฅ
๐๐ฅ ๐๐ง
๐๐ฅ ๐๐ง
(4)
๐ฅ.
(5)
Once we have all these derivatives computed, it is easy to obtain derivatives with respect to ๐ฅ or by ๐๐ฆ ๐๐ฅ ๐๐ผ simply setting ๐ง ๐ฅ or ๐ง . For example, if ๐ง ๐ฅ, ๐๐ง 0 and ๐๐ง ๐ฅ 1 giving ๐๐ง 2๐ฅ. 4
Journal Pre-proof
๐๐ฆ
๐๐ฅ
๐๐ผ
Similarly, if ๐ง , ๐๐ง 1 and ๐๐ง ๐ฅ 0 giving ๐๐ง 0. In other words, by setting only one input derivative to unity and all other input derivatives to zero, the procedure computes the exact derivatives of all outputs with respect to the input corresponding to the nonzero derivative. For example, in Figure 1, setting the value of xโ to 1 and yโ=0 computes the exact derivative of p and q with respect to x (top example). Similarly, setting the value of yโ to 1 and xโ=0 computes the exact derivative of p and q with respect to y (bottom example).
to compute
๐๐ ๐๐ , ๐๐ฆ ๐๐ฆ
pro of
It is useful, for purposes of coding (see next section), to think of all variables as consisting of a pair: the variable itself, and its derivative with respect to the chosen input. Thus, p and q would be represented by P=[p,pโ] and Q=[q,qโ]. The input variables x and y are also represented by such pairs, so that X=[x,xโ] and Y=[y,yโ]. The prime symbol means differentiation with respect to an arbitrary (user-defined) input. Thus ๐(๐ฅ, ๐ฅ , , ) ๐(๐, ๐). When evaluating , we need to specify ๐ฅ, , ๐ฅโ, โ as one would for any other function. If the variable we are differentiating with respect to is ๐ฅ, then ๐๐ฅ ๐๐ฅ 1 and ๐ ๐๐ฅ 0, so ๐ฅโ 1 and โ 0. Thus, providing an input X=[x0, 1] and Y=[y0, 0] indicates that we wish to ๐๐ ๐๐ compute ๐๐ฅ , ๐๐ฅ at (x0, y0). Alternatively, providing an input X=[x0, 0] and Y=[y0, 1] signals that we wish at (x0, y0). The specific algebraic operations are achieved in our solver through
re-
templating and operator overloading in C++, as described in the sections that follow. (a)
Original Functions
Inputs : x, y Outputs : p, q
Elemental Derivatives
urn a
Elemental Decomposition
lP
p ๏ฝ 3* x 2 ๏ซ sin( y ) q๏ฝ p/ y
Jo
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
5
Journal Pre-proof
(b)
Outputs : Inputs : x ๏ฝ x0 , x1 ' ๏ฝ 0, x2 ' ๏ฝ 0, x12 ' ๏ฝ 0 ๏ถp p1' ๏ฝ 0.5 ๏ฝ y ๏ฝ y0 , y1 ' ๏ฝ 1, y2 ' ๏ฝ 1, y12 ' ๏ฝ 0 ๏ถy p2' ๏ฝ 0.5 ๏ฝ
Outputs : q1' ๏ฝ ๏ญ273.879 ๏ฝ x ๏ฝ10, y ๏ฝ๏ฐ /3
๏ถp ๏ถy
Outputs :
x0 ๏ฝ 10, y0 ๏ฝ ๏ฐ / 3
๏ถp ๏ถx
๏ถ p ๏ถy๏ถy 2
q12' ๏ฝ 522.244 ๏ฝ
x ๏ฝ10, y ๏ฝ๏ฐ /3
q1' ๏ฝ 57.2958 ๏ฝ
x ๏ฝ10, y ๏ฝ๏ฐ /3
๏ถp ๏ถy
x ๏ฝ10, y ๏ฝ๏ฐ /3
๏ถ q ๏ถy๏ถy 2
๏ถq ๏ถx
x ๏ฝ10, y ๏ฝ๏ฐ /3
๏ถ p ๏ถx๏ถy
x ๏ฝ10, y ๏ฝ๏ฐ /3
q2' ๏ฝ ๏ญ273.879 ๏ฝ
๏ถq ๏ถy
q12' ๏ฝ ๏ญ54.7134 ๏ฝ
๏ถ q ๏ถx๏ถy
x ๏ฝ10, y ๏ฝ๏ฐ /3
re-
p2' ๏ฝ 0.5 ๏ฝ p12' ๏ฝ 0 ๏ฝ
๏ถq ๏ถy
Outputs :
p1' ๏ฝ 60 ๏ฝ
y ๏ฝ y0 , y1 ' ๏ฝ 0, y2 ' ๏ฝ 1, y12 ' ๏ฝ 0
x ๏ฝ10, y ๏ฝ๏ฐ /3
x ๏ฝ10, y ๏ฝ๏ฐ /3
p12' ๏ฝ ๏ญ0.866025 ๏ฝ
Inputs : x ๏ฝ x0 , x1 ' ๏ฝ 1, x2 ' ๏ฝ 0, x12 ' ๏ฝ 0
๏ถq ๏ถy
q2' ๏ฝ ๏ญ273.879 ๏ฝ
pro of
x0 ๏ฝ 10, y0 ๏ฝ ๏ฐ / 3
2
x ๏ฝ10, y ๏ฝ๏ฐ /3
x ๏ฝ10, y ๏ฝ๏ฐ /3 2
x ๏ฝ10, y ๏ฝ๏ฐ /3
1, ๐ฅ
unity and others zero (๐ฅ1
lP
Figure 2. Code differentiation procedure for derivatives of second order. (a) Here the elemental derivatives are computed with respect to ๐ฅ1 , ๐ฅ , 1 , . (b) Input values of these parameters determine the type of corresponding output derivative. Since a second order derivative is sought, two inputs are always set to 0,
1
0,
1, to find
๐2 ๐๐ฅ๐๐ฆ
for example).
The above idea can easily be extended to compute the derivatives of any chosen order. In Figure 2, we show the extension for evaluating second derivatives. We consider the same functions used in the above ๐2 ๐ . ๐๐ฅ๐๐ฆ
and qโ=
๐ฅ
( ) and
๐2 ๐
. Our objective is to compute output second derivatives pโ=๐๐ฅ๐๐ฆ
urn a
example, p
In order to accommodate second derivatives, we now consider each variable as consisting of four components rather than two. Thus, ๐ [ , 1 , , 1 ],and ๐ [ , 1 , , 1 ]. Here, the notation 1 denotes a first derivative with respect to input variable #1, denotes a first derivative with respect to input variable #2, and 1 denotes the second derivative with respect to input variables #1 and #2. The procedure is similar to the one discussed for first order derivatives earlier. But now, we are differentiating with respect to two unknown variables. Let us assume the input variables are ๐ง1 and ๐ง . The corresponding variable
inputs ๐ฅ and
when we need ๐2 ๐ฅ ๐๐ง1 ๐๐ง2
๐2 ๐ ]. The variable is now a function of 1 2 1 ๐๐ง2 ๐๐ฅ ๐๐ฅ ๐2 ๐ฅ ๐๐ฆ ๐๐ฆ ๐2 ๐ฆ and their corresponding derivatives, i.e. p= (๐ฅ, ๐๐ง , ๐๐ง , ๐๐ง ๐๐ง , , ๐๐ง , ๐๐ง , ๐๐ง ๐๐ง ). Thus, 1 2 1 2 1 2 1 2 ๐๐ฅ ๐๐ฅ the second derivative with respect to ๐ฅ, we choose ๐ง1 ๐ฅ and ๐ง ๐ฅ, so ๐๐ง 1, ๐๐ง 1, 1 2 ๐๐
๐๐
has four values associated with them [ , ๐๐ง , ๐๐ง , ๐๐ง
Jo
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
0 and all the y-derivatives are 0. In other words, the inputs x and y are represented as ๐
[๐ฅ, ๐ฅ1 , ๐ฅ , ๐ฅ1 ], and ๐
[ ,
1,
,
1
]. If we wished to compute 6
๐2 ๐ ๐๐ฅ 2
and
๐2 ๐ ๐๐ฅ 2
at (๐ฅ0 ,
0 ),
we would
Journal Pre-proof
choose as inputs ๐ [๐ฅ0 , 1,1, 0], and ๐ [ 0 , 0,0,0]. This indicates the computation that we wish to produce second derivatives with respect to a single input variable, x. The computational process, outlined ๐๐
in Figure 2, would produce ๐ ๐2 ๐ ๐๐ฆ 2
to compute
and
๐2 ๐ ๐๐ฆ 2
๐๐ ๐2 ๐
[ , ๐๐ฅ , ๐๐ฅ , ๐๐ฅ 2 ] at the point (๐ฅ0 ,
at (๐ฅ0 ,
0 ),
0 ).
we would choose as inputs ๐
If, on the other hand, we wished [๐ฅ0 , 0,0, 0], and ๐
[ 0 , 1,1,0].
This signals the computation that we seek to produces second derivatives with respect to a single input ๐๐
๐
๐๐
๐๐
๐2 ๐
[ , ๐๐ฆ , ๐๐ฆ , ๐๐ฆ2 ] at the point (๐ฅ0 ,
0 ).
An input of ๐
we want a mixed second derivative, would produce ๐ the point (๐ฅ0 ,
0 ).
[๐ฅ, 1,0, 0], and ๐
[
๐๐ ๐๐ ๐2 ๐ , ๐๐ฅ , ๐๐ฆ , ๐๐ฅ๐๐ฆ
๐๐ ๐2 ๐
[ , ๐๐ฆ , ๐๐ฆ , ๐๐ฆ2 ] and
pro of
variable, y. The computational process, outlined in Figure 2, would produce ๐
[ , 0,1,0], signaling that
] and ๐
๐๐
๐๐
๐2 ๐
[ , ๐๐ฅ , ๐๐ฆ , ๐๐ฅ๐๐ฆ] at
We now turn to the example shown in Figure 2. As before, elemental decomposition is performed and elemental derivatives are computed using a set of intermediate variables, [ , 1, , 1 ] , [ , 1 , , 1 ] and [ , 1 , , 1 ] and exploiting the chain rule. Passing input variables ๐ [10, 1,0, 0], and ๐ [๐ ,0,1,0] through the concatenation of algebraic operations produces ๐ ] and ๐
[ ,
๐๐ ๐๐ , ๐๐ฅ ๐๐ฆ
.
๐2 ๐ ] ๐๐ฅ๐๐ฆ
at the point (๐ฅ0
lP
(a)
(b)
10,
re-
๐๐ ๐๐ ๐2 ๐ , , ๐๐ฅ ๐๐ฆ ๐๐ฅ๐๐ฆ
urn a
[ ,
Jo
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
7
0
๐
), as shown.
Journal Pre-proof
Examples for Template Usage (
Output ๏ฝ First Derivative
๏ถf ๏ถx
Output ๏ฝ ( x0 , y0 , z0 )
x ๏ฝ [ x0 ,1] Inputs : y ๏ฝ [ y0 , 0]
z ๏ฝ [ z0 ,1]
๏ถ2 f ๏ถx 2
Output ๏ฝ
( x0 , y0 , z0 )
0
0)
z ๏ฝ [ z0 , 0, 0, 0]
z ๏ฝ [ z0 , 0, 0, 0]
Output ๏ฝ
( x0 , y0 , z0 )
re-
Third Derivative
0
Inputs : y ๏ฝ [ y0 , 0,1, 0]
Inputs : y ๏ฝ [ y0 , 0, 0, 0]
๏ถ3 f ๏ถx3
๏ถ2 f ๏ถx๏ถy ( x , y , z
x ๏ฝ [ x0 , 1, 0, 0]
x ๏ฝ [ x0 , 1, 1, 0]
Output ๏ฝ
pro of
Second Derivative
( x0 , y0 , z0 )
Inputs : y ๏ฝ [ y0 , 0]
z ๏ฝ [ z0 , 0] Output ๏ฝ
๏ถf ๏ถz
x ๏ฝ [ x0 , 0]
x ๏ฝ [ x0 , 1, 1, 1, 0, 0, 0, 0]
๏ถ3 f ๏ถx๏ถy๏ถz
( x0 , y0 , z0 )
x ๏ฝ [ x0 , 1, 0, 0, 0, 0, 0, 0]
Inputs : y ๏ฝ [ y0 , 0, 0, 0, 0, 0, 0, 0]
Inputs : y ๏ฝ [ y0 , 0,1, 0, 0, 0, 0, 0]
z ๏ฝ [ z0 , 0, 0, 0, 0, 0, 0, 0]
z ๏ฝ [ z0 , 0, 0,1, 0, 0, 0, 0]
lP
Figure 3. Examples of template usage. a) Data structure for templates used for first, second and third derivatives; b) examples showing how to choose input values to obtain derivatives of different orders.
urn a
Templating To realize the above idea in our computational code, we employ templating and operator overloading. A template allows functions and classes in a programming language to be operated on with a generic datatype. In general applications, the datatype may be an integer, a floating point decimal, or a character. User-defined datatypes are what make templating powerful. For computing derivatives of nth order, we define a datatype as a variable with ๐ elements where ๐ 2๐ . For example, considering a datatype for a 2nd derivative, we define the variable as ๐ [๐ฃ, ๐๐ฃ1, ๐๐ฃ2, ๐๐ฃ1๐๐ฃ2], where ๐ฃ is the actual value of the variable, ๐๐ฃ1 is the first derivative with respect to variable 1, ๐๐ฃ2 is the first derivative with respect to variable 2 and ๐๐ฃ1๐๐ฃ2 is the mixed second derivative with respect to variables 1 a d 2. We call these types of variables Tangent variables. Input variables 1 and 2, and their corresponding derivatives are specified by the user at the start. As shown in the example in Figure 1, x and y are Input Tangent variables. The derivatives of these Input Tangent variables with respect to variables 1 and 2 are propagated to the outputs. xโ and yโ can take a value of 1 or 0. A value of 1 denotes that the derivative being computed is with respect to that variable. The data structures for templates of different orders of derivatives are shown in Figure 3 (a). In Figure 3 (b), we show some examples for template usage. These example demonstrate how a combination of input derivative variables can be used to obtain any desired mixed partial derivative. Additional examples are shown in Supplementary Information.
Jo
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
Using the templated datatype discussed above, we create a class in C++ with the derivatives defined for all elemental mathematical expressions. We name this class the Tangent class.
8
Journal Pre-proof
Regular Operation Addition
Overloaded Operation ๐๐ฅ
double x=x0, y=y0; double z; z=x+y;
๐๐ฆ
Tangent x=[x0, 1], y=[y0, ๐๐ฅ ๐๐ฅ Tangent z; z=x+y; zโ xโ yโ
Output: z=x0 + y0
pro of z=[x0+y0,1+0]
Output: z=[x0+y0, 1]
Multiplication
0];
Overload: z=[x0,1]+[y0,0]
๐๐ฅ
double x=x0, y=y0; double z; z=x*x+y;
Tangent x=[x0, ๐๐ฅ Tangent z; z=x*x+y;
Output: z=x0*x0+y0
re-
Overload: z1=x*x z=z1 y
๐ง ๐ฅ
๐ฅ ๐ฅ
1], y=[y0,
๐๐ฆ ๐๐ฅ
0];
z1โ x*xโ xโ*x zโ z 1โ y
lP
z1=[x0,1]*[x0,1]
z1=[x0*x0, x0*1+1*x0] z=z1+y z=[x0*x0, x0*1+1*x0]+[y0,0] z=[x0*x0+y0, x0*1+1*x0] Output: z=[x0*x0+y0, 2*x0]
๐ง ๐ฅ
๐ฅ ๐ฅ
urn a
Figure 4. Pseudo code for operator overload for a first order derivative. The top example describes the overload procedure for the addition operation. Here the input variables x and y are added together to compute the output variable z. The left hand column shows regular operations addressing double variable types. The corresponding operations on Tangent variable types are shown in the right hand column. For Tangent variables, operations are performed on the variable itself, and also its derivative. The value of xโ is set to 1 (x=[x0,1]), meaning that the derivative of z is computed with respect to x; the value of yโ is set ๐๐ง ๐๐ง to zero (y=[y0,0]). The derivative ๐๐ฅ is stored in the second field of the Tangent variable z (z=[๐ง0, ๐๐ฅ]). The bottom example describes the overload operation for the multiplication operation. In the right column, the derivative of x*x is compute using the uv chain rule for derivatives. Operator Overloading With the dictionary of derivatives available from the Tangent class, the next step is to design a procedure to parse the expressions using the derivative rules. In other words, a procedure is needed to perform the elemental decomposition shown in Figure 1. For this we exploit the operator overlaoding feature of C++. This feature allows user defined definitions for an operator. For example, a multiply operator ( ) may be overloaded to perform the multiplication rule for derivatives. Considering the expression ๐ข ๐ฃ, the compiler may be programmed to perform the standard multiplication procedure or the multiplication rule for derivatives, (๐ข๐ฃ ๐ฃ๐ข ). The choice of operation depends on the data type of the operands (๐ข, ๐ฃ). For example, for the multiplication of two variables of user-defined data type ๐ [๐ฃ, ๐๐ฃ], the multiplication
Jo
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
9
Journal Pre-proof
operation may be overloaded to perform straightforward multiplication for the first element, ๐ฃ, and the chain rule for derivatives applied to the second element, ๐๐ฃ. Pseudo code is provided in Figure 4 to demonstrate examples of operator overloading. Similar code may be written for second and third derivatives.
pro of
It is important to understand that the code differentiation procedure does not yield symbolic expressions for derivatives. Instead, it yields the exact numerical values of the derivatives for a given set of numerical input values, as seen in Figure 1 and Figure 2. If derivatives are desired for multiple input values, the above procedure must be repeated for every set of input values. This may be time-consuming if the derivatives are needed at many such input values.
re-
We emphasize here that the conversion of an existing legacy code to one providing derivatives is unintrusive if the code is written in F90, C++ or any other language supporting templating and operator overloading. All that is required is including files to perform the templating operation, and to define the operator overloading functions. The code itself does not need to be changed at all, and two versions of the code, one regular and one computing derivatives, are not required. The same code may be โconvertedโ at compile time simply by choosing suitable compiler options. Also, the computation of sensitivities and derivatives carries over to any new models, such as for specific heat or thermal conductivity, that are added to the code โ no new code needs to be written to support derivative computations. Thus, the automatic code differentiation idea presented here is a powerful, unintrusive and convenient way to add derivative computation capabilities to existing codes. To demonstrate the ease of use and the process of โconvertingโ a regular code to a code that can compute derivatives, we show the code snippets for computing second order force constants for graphene nanoribbons in supplementary information.
lP
3. Examples
urn a
3.1 Force Constants Force constants are an integral part of phonon spectral analysis of crystalline materials. A force constant is a derivative of the total potential energy of the system with respect to the displacement of a given atom. Force constants of second order are used to compute phonon dispersion and force constants of third order are needed to compute anharmonic phonon-phonon interactions. Several techniques exist for computation of force constants. The most commonly used techniques are density functional perturbation theory (DFPT), the method of finite differences or the direct method, and the exploitation crystal symmetries and invariances [29]โ[31] . DFPT uses a quantum mechanical approach to analyze the crystal response to an external perturbation [27], [29], [30]. This response is then used to compute the second, third, and higherorder derivatives. Although DFPT is one of the most accurate methods for predicting force constants, it is computationally very expensive. The direct method is the most widely-used and easiest of methods. Here, atoms in a supercell of the crystal are manually displaced, and the change in forces on other atoms due to this displacement is recorded. The forces themselves are computed either from first-principle DFT (Density Functional Theory) calculations or from empirical interatomic potentials. These force changes are used to compute the force constant, which is the derivative of force with displacement, using a finite difference formula. A similar procedure is followed for derivatives of all orders. A limitation of this method is that the number of displacements to be performed increases with the order of the derivative, making the process cumbersome. The derivative calculation is also approximate in nature: the value of the derivative depends on the value of the displacement. Too large a displacement leads to an approximate derivative, whereas too-small a displacement can lead to round-off errors and the derivative blowing up because of a division by a nearlyzero number. The task of finding the appropriate displacement is itself a significant part of the problem in the finite difference method. This value depends on the potential and the order of the derivatives being computed. Nevertheless, several variations of this method exist and are widely used [33]โ[36].
Jo
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
10
Journal Pre-proof
A recently developed technique exploits the symmetries of force constants and the crystal lattice [28]. The force on an atom i in a crystal is computed as the derivative of the total potential energy of the system with ๐๐ respect to that atom (๐น๐ ), where the potential energy is defined using Taylorโs expansion as ๐๐ข๐ 1 1 1 ๐ ๐0 โ ฮ ๐ ๐ข๐ โ ฮฆ๐๐ ๐ข๐ ๐ข๐ โ ฮจ๐๐๐ ๐ข๐ ๐ข๐ ๐ข๐ โ ฯ๐๐๐๐ ๐ข๐ ๐ข๐ ๐ข๐ ๐ข๐ โฏ 2! ! 4! ๐๐
๐๐๐
๐๐๐๐
(6)
pro of
๐
Here ฮ ๐ , ฮฆ๐๐ , ฮจ๐๐๐ , ฯ๐๐๐๐ are the force constants of first, second, third and fourth order respectively and ๐0 is the equilibrium crystal energy. A set of equations describing the relations between these force constants can be obtained by applying principles of force constant symmetry and invariance. With these relations between the force constants and a set of force (๐น๐ ) versus displacement (๐ข๐ ) data for the crystal lattice, one can form a linear system of equations with the force constants being the unknowns. The overall accuracy of the force constants is determined by the accuracy of the force versus displacement data. A common practice is to obtain this data from molecular dynamics (MD) or density functional theory (DFT) simulations.
re-
Since a force constant is nothing but a derivative of total crystal potential energy, we can exploit the idea of code differentiation to automatically compute the exact force constants. The total potential energy can be obtained using interatomic potentials. Most commonly-used interatomic potentials have well-defined closed form expressions and are shown to capture crystal behavior reasonably well. For example, the Tersoff, Brenner potentials and EDIP (Environment dependent interatomic potential) are commonly used for silicon [37]โ[39]. The Tersoff and Brenner potentials (with modified coefficients) can be used for graphene [38] as well.
lP
In this section, we demonstrate the application of the code differentiation method to compute force constants of graphene by using Tersoff interatomic potential [38]. The crystal structure of graphene used for this computation is shown in Figure 5. The atoms in the primitive unit cell are denoted as A and B, and the reference coordinate axes are also shown. Following are the force constants computed and the variable notations:
urn a
a) The first order force constant computed is for the displacement of atom A in the x-direction (Figure ๐๐ 5). This displacement is denoted by ๐ข๐ด,๐ฅ . The corresponding force constant is ๐๐ข , which provides ๐ด,๐ฅ
the first order derivative of total crystal potential with respect to the displacement of atom A in the x-direction. b) The second order force constant computed is
๐2 ๐ , ๐๐ข๐ด,๐ฅ ๐๐ข๐ด,๐ฅ
which provides the second order
derivative of total crystal potential with respect to the displacement of atom A in the x-direction. c) Finally, the third order force constant is computed as
๐3 ๐ , ๐๐ข๐ด,๐ฅ ๐๐ข๐ด,๐ฅ ๐๐ข๐ด,๐ฅ
which provides the third
order derivative of total crystal potential with respect to the displacement of atom A in the xdirection.
Jo
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
All three derivatives are shown in Figure 6. The derivatives computed using code differentiation are shown by the blue line. The corresponding derivatives computed using the method of finite difference are also shown. For finite differences, we use the second-order accurate central difference formulae shown below:
11
Journal Pre-proof
๐๐ข๐ด,๐ฅ +โ๐ข๐ด,๐ฅ โ๐๐ข๐ด,๐ฅโโ๐ข
๐๐
๐ด,๐ฅ
๐๐ข๐ด,๐ฅ ๐2๐
๐๐ข๐ด,๐ฅ +โ๐ข๐ด,๐ฅ โ
๐๐ข๐ด,๐ฅ ๐๐ข๐ด,๐ฅ โ๐ข๐ด,๐ฅ
๐๐ข๐ด,๐ฅ โ
โ๐ข๐ด,๐ฅ
2 ๐๐ข๐ด,๐ฅ โโ๐ข๐ด,๐ฅ
2 ๐๐ข๐ด,๐ฅ +โ๐ข๐ด,๐ฅ
(1)
2 โ๐ข๐ด,๐ฅ โ๐ข๐ด,๐ฅ โ๐ข๐ด,๐ฅ
urn a
lP
A
Unit Cell
(8)
โ๐ข๐ด,๐ฅ โ๐ข๐ด,๐ฅ
pro of
๐๐ข๐ด,๐ฅ +
๐๐ข๐ด,๐ฅ +๐๐ข๐ด,๐ฅ โโ๐ข๐ด,๐ฅ
re-
๐3 ๐ ๐๐ข๐ด,๐ฅ ๐๐ข๐ด,๐ฅ ๐๐ข๐ด,๐ฅ
(7)
โ๐ข๐ด,๐ฅ
(9)
B
y
x
Figure 5. Crystal lattice of graphene used in the computation of first, second and third order force constants. The unit cell is highlighted showing atoms A and B. The coordinate axis are also shown. (a)
Jo
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
12
reurn a
lP
(b)
Jo
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
pro of
Journal Pre-proof
(c) 13
relP
Figure 6. Code differentiation versus finite differencing for first, second and third derivatives of the potential energy of a bulk graphene crystal. Sensitivities of finite difference derivatives with respect to the choice of โ๐ข are shown in inset plots for second and third derivatives.
urn a
Each finite difference derivative is computed for a range of ฮ๐ข/a values, where a is the interatomic distance. It can be seen from the figure that the code differentiation procedure is independent of the choice of ฮ๐ข, since the idea does not employ finite difference concepts at all. The value it computes is exact to machine precision. Additionally, as discussed in the introduction, force constants follow the symmetry and invariance relations of the underlying crystal [28]. We have tested for these relations with the force constants computed from code differentiation. We find that all the important symmetry relations (invariance to permutation of indices, translational invariance, rotational invariance, etc.) are satisfied at least to the degree obtained by the least-squares fit in [28]. Derivatives computed using finite differences converge quadratically to the exact value, consistent with a second-order differencing scheme. However, we see that for the second and third derivatives, as ฮ๐ข โ 0, the finite difference calculation begins to diverge due to round-off. More importantly, in the case of finite difference, it is not guaranteed that the value of optimized ฮ๐ข is constant for all derivatives for a problem of interest. For instance, considering the set of all third order force constants needed to understand thermal transport in graphene, the computation of each unique force constant in that set may need an optimized value of displacement that gives a force constant that is close to exact value. Finding such displacements when there is no reference of what an exact derivative could be (which is the case with DFT calculations) is difficult. While the validity of force constants computed using this method can be checked by ensuring that they satisfy symmetries as discussed above [28], these checks cannot guarantee exact force constants, therefore leaving a degree of uncertainty regarding the force constants obtained. In cases where there is a
Jo
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
pro of
Journal Pre-proof
14
Journal Pre-proof
high degree of sensitivity of material properties to force constants, the uncertainty in force constants is propagated to the computed material properties. Force constants computed using code differentiation do not suffer from any of these limitations as the derivatives are exact. They also satisfy all important symmetry relations to the degree obtained by the least-squares fit in [28] or better.
re-
pro of
Another major limitation of the finite difference method is the need for several DFT computations or potential energy solves. As discussed above, the method of finite differences involves displacing atoms and measuring the forces on lattice in the displaced configuration. In DFT techniques, this means performing a new DFT calculation for every atomic displacement. This makes force constant computation one of the most expensive steps in first-principles thermal transport modeling. Specifically, when force constants of higher order are required, the number of atomic displacements grows quickly. Computation of each first-order force constant needs two atomic displacements. A second-order force constant needs 4 atomic displacements. A third order force constant needs 8 and fourth order needs 16 atomic displacements. Given that the DFT computation itself is very expensive, having to perform tens of atomic displacements for each force computation can become very expensive. Furthermore, in cases where there are a large number of atoms per unit cell, a large number of force constants are needed, making the process prohibitively expensive. With recent findings on the importance of four-phonon scattering in the prediction of thermal conductivity [12], which in-turn needs fourth order force constants, force constant computations through finite difference is turning out to be the bottleneck in thermal transport models. Researchers employ techniques such as the exploitation of symmetries, elimination of force constant computations for those atomic combinations that are expected to provide infinitesimal force constants, etc. [28], [39]. While these techniques reduce the overall computation time, the fundamental limitation of having to perform several DFT or potential energy solves remains.
urn a
lP
Force constants computed using code differentiation do not have the problem of multiple DFT or potential energy computations for each force constant. Computing a force constant of any order with respect to a given atomic displacement needs only a single DFT computation in the framework of code differentiation. However, the code differentiation procedure does have a runtime overhead due to additional operator overloading and templating operations on each variable. Depending on the problem, the cost associated with these overhead operations may be comparable to the multiple function evaluations required by the finite difference procedure. Careful design of the code differentiation template can reduce runtime costs. For instance, as shown in Figure 3, the template for a third-order force constant for a specific atomic displacement automatically provides first- and second-order force constants. Additionally, the technique described in this work uses forward mode code differentiation to compute derivatives. It has been shown in the literature that using reverse mode computation for code differentiation can significantly speedup derivative computations by at least an order of magnitude [18].
Jo
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
15
re-
y
urn a
lP
x
Jo
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
pro of
Journal Pre-proof
16
Journal Pre-proof
urn a
lP
re-
pro of
Figure 7. Graphene nanoribbon of width 0.44 nm is used for demonstration. On the right is the phonon dispersion computed by solving lattice dynamics equation on the ribbon structure. Acoustic branches are highlighted in blue, green and red.
Jo
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
17
relP urn a
Figure 8. Gruneisen parameter - code differentiation versus finite difference. The parameter is computed for the transverse (branch 2) and longitudinal branches (branch 3) of the phonon dispersion of the graphene nanoribbon. The Gruneisen parameter computed using the finite difference method is seen to vary significantly with ฮด๐ near the ฮ point (shown in inset plots). This occurs due to the varying slopes of the linear region of these dispersion curves with respect to a. 3.2 Gruneisen Parameter The Gruneisen parameter is a non-dimensional quantity that describes the response of the vibrational properties of a crystal due to strain applied on it. This source of strain may be mechanical or thermal and can be compressive or tensile. Gruneisen parameters provide useful information about thermal expansion mechanism of a crystal, anharmonicity, etc. [14], [40], [41]. Several definitions exist for the computation of the Gruneisen parameter. The most commonly used definition computes the variation of phonon vibrational frequency due to change in lattice constant. The mode-wise Gruneisen parameter at wave vector k for a polarization p may be defined as
Jo
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
pro of
Journal Pre-proof
18
Journal Pre-proof
๐(๐)
๐
๐๐๐(๐)
๐ ,๐(๐)
๐๐
(10)
where V is the volume of the crystal, ๐0,๐(๐) is the frequency of a phonon of polarization p at wave vector ๐๐๐(๐)
k at equilibrium. The derivative ๐๐ in the above equation may be computed exactly using code differentiation. Here the input derivative is the volume of the crystal or lattice constant ๐๐๐(๐)
pro of
( ๐๐ , e ๐ ๐(๐)) in case of materials like graphene where only one lattice parameter is used to describe the crystal. In this example, we choose a graphene nanoribbon structure for demonstrating Gruneisen parameter computation using the code differentiation procedure. A narrow zig-zag nanoribbon of width 0.44 nm is picked for this purpose. The x-direction is confined and thermal transport occurs only in the y-direction (see Figure 7). The phonon dispersion for this structure is shown in Figure 7. The dispersion highlights the three acoustic phonon branches in blue, green and red. In Figure 8, we show the Gruneisen parameters computed for the transverse and longitudinal (green and red, respectively) branches. The corresponding Gruneisen parameters computed using the finite difference scheme are also shown for different values of ๐๐๐(๐) ๐๐
is computed using ๐๐๐(๐)
(
๐๐
๐๐(๐)๐+๐ฟ๐ โ๐๐(๐)๐โ๐ฟ๐
re-
๐ฟ๐, where the derivative
๐ฟ๐
)
(11)
lP
As we can see from the figure, the Gruneisen parameters computed using code differentiation match the numerical values well for most of the ฮ ๐ region. The primary difference is close to the ฮ point, where the finite difference method shows a strong sensitivity to the value of ฮด๐ (see inset plots in Figure 8). The ๐๐๐(๐)
slope in the linear region is very sensitive to the applied strain and the appropriate value of is ๐๐ therefore difficult to compute accurately. To demonstrate this, in the lower inset plot of Figure 8, we show ๐๐๐(๐) ๐๐
computed using the finite difference method with different values of ฮด๐ using the equation above.
In this plot, we can clearly see that
๐๐๐(๐) ๐๐
does not change monotonically with ๐ and only using an ๐๐
urn a
appropriate value ๐ can we find the exact derivative (In Figure 8, a value of 0.0068 is found to ๐ provide the closest match with exact derivative.) In situations like this, the method of code differentiation is very helpful as the derivative is exact and does not depend on perturbing the value of a to compute finite differences. 3.3 Group Velocities Phonon group velocities for the nanoribbon may also be computed using the code differentiation procedure discussed above. The group velocity vector of a phonon of wave vector k is given by ๏k ๏ท where ๐ is the frequency and k is the wave vector; it may be computed from derivatives of the dispersion relation. For a ๐๐ 1D Brillouin zone, such as that for a graphene nanoribbon, for example, ๐ฃ๐ . For this case, this ๐๐ quantity is typically computed by performing finite differencing on each branch of the dispersion curve (๐ ๐). It is necessary to identify the branch of interest and to stay on it during the differencing procedure. The identification of the phonon branches, however, is not straightforward in the case of crystals with several atoms in a unit cell, where multiple intersections of polarization branches make this type of computation error-prone (see Figure 7). One of the other approaches used to compute group velocity without having to perform branch sorting involves directly differentiating the eigenvalue equation that is used to compute phonon dispersion spectrum [42]. The eigenvalue equation for phonon dispersion is given by
Jo
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
19
Journal Pre-proof
๐ (๐)๐(๐)
๐ท๐(๐)
(12)
pro of
Here ๐(๐) is the eigenvector and ๐ท is the dynamical matrix. Differentiating Equation (12), one can define the group velocity at a given wave vector and polarization as a function of the eigenvectors of the system. The method of code differentiation can also be used to bypass the branch identification step altogether. The variable ฯ2 in the eigen problem is identified as a Tangent variable. The corresponding wave vectors for the desired derivatives are then Input Tangent variables. For example, in order to compute ๐ฃ๐(๐,๐) , the '
'
urn a
lP
re-
group velocity for the ith point in k-space and the pth branch index, the Input Tangent variables k x๏จi , p ๏ฉ , k y ๏จi , p ๏ฉ are given a value of 1. This results in the output variable ๐ with the following form: ๐ [๐, ๐ฃ๐,๐ฅ , ๐ฃ๐,๐ฆ , ๐ฃ๐,๐ฅ๐ฆ ]. In Figure 9, we show the group velocity ๐ฃ๐,๐ฆ for branch 2 of the graphene nanoribbon computed using the above procedure. Also shown in the figure is the group velocity computed using the finite difference method with a ๐ฟ๐ of ๐๐๐๐ฅ 160, where ๐๐๐๐ฅ ๐โ . โ ๐
Figure 9. Phonon group velocity component ๐๐,๐ for branch 2 (dark green in Figure 7) of the dispersion using the code differentiation and finite difference methods. Both cases use a k-space grid size of 160 points.
Jo
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
3.4 Temperature Variation of Specific Heat of Graphene Nanoribbons As a final example, we compute the variation of the specific heat of the graphene nanoribbon with respect to small changes in temperature. Specific heat is an important parameter for applications such as thermal energy storage. A good thermal storage material does not exhibit large variations of specific heat with 20
Journal Pre-proof
temperature within the prescribed range. Information on the sensitivity of specific heat can provide useful insight into designing such materials. In this example, we consider graphene nanoribbons with the lattice ๐๐ถ structure shown in Figure 7, but with different widths, and compute the derivative of specific heat ( ๐๐๐ฃ ) with temperature. This derivative can be used to compute the variation in specific heat using the expression
โ๐ถ๐ฃ โ
๐๐ถ๐ฃ ๐๐
โ๐
(13)
pro of
where โ๐ is the variation in temperature. Three different widths of the graphene nanoribbon are considered โ 0.44 nm, 0.88 nm and 2.66 nm -- with 4, 8 and 24 atoms per unit cell, respectively. The variation of specific heat with respect to temperature is computed at three different temperatures (T=40K, T=300K, T=900K). Similar to the above examples, we can use the framework of code differentiation to compute ๐๐ถ this derivative ( ๐๐๐ฃ ). All that is needed to enable the derivative computation is add the Tangent library to an existing codebase that computes specific heat at a given temperature. Specifically, we set temperature as the Input Tangent variable, i.e. the Tโ=1. This derivative is propagated through all computations needed for specific heat evaluation to provide the derivative of specific heat with respect to temperature. In Figure 10, we show the specific heat of GNRs of different widths as a function of temperature. For this demonstration, using Equation (13), we compute the variation of specific heat for a 2% variation in temperature.
urn a
lP
re-
In Table 1, we show the specific heat and the variation in specific heat for a 2% variation in temperature for ribbons of different widths and at different temperatures. As we can see from the table, at low temperatures (T=40K), the variation in specific heat (โ๐ถ๐ฃ ) for a 2% variation in temperature ranges from ~1.6% for the narrowest ribbon to ~2.8% for the widest ribbon - an increase of 1.2%. As the temperature is increased to 300K and then to 900K, the range of variation in specific heat reduces. At 300K, the variation in specific heat (โ๐ถ๐ฃ ) only changes by 0.5% (from narrowest to widest ribbon). At 900K, this change is under 0.2%. Finally, at temperatures above Debye temperature, specific heat of ribbons of all widths converge therefore converging โ๐ถ๐ฃ of all ribbons to zero. Information like this is quite valuable for selecting materials that provide the best performance at different operating conditions. Finally, we also ๐๐ถ verify the accuracy of the computed values of ( ๐๐๐ฃ ) by comparing against analytical derivatives which, for this example, can be obtained easily by differentiating the analytical expression for the specific heat with respect to temperature. We find that the code differentiation derivatives match the analytical values to machine precision.
Jo
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
21
re-
lP
Figure 10. Specific heat of graphene nanoribbons of different widths as a function of temperature. Here we show the specific heat for ribbons of 0.44nm, 0.88nm and 2.66nm. Also shown in the figure are the asymptotic powers at 40K computed using code differentiation showing the variation of specific heat of each ribbon with respect to temperature.
urn a
Table 1. Variation of the specific heat of graphene nanoribbons with temperature. Ribbons of three different widths are chosen for the demonstration. Analysis is performed at three different temperatures (T=40K, T=300K, T=900K). The derivatives computed using code differentiation match to machine precision the theoretical derivative.
Jo
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
pro of
Journal Pre-proof
22
re-
lP
4. Discussion The method of code differentiation can be easily added to any existing codes that support templating and operator overloading. First principles DFT simulation codes such as Quantum Espresso [24] and Siesta use Fortran (F90) [43] which support such features. Adding a Tangent class to these codes can unintrusively enable computation of derivatives such as exact force constants of any order. This would be quicker and easier than the DFT or DFPT based approaches to computing derivatives of higher order.
urn a
Our method for the computation of derivatives may be generalized to other applications also. The Tangent class is generic in nature and is independent of the function for which the derivatives are being computed. Jacobian matrices (matrices of first-order derivatives of a function with respect to inputs) and Hessian matrices (matrix of second-order derivatives of a function with respect to inputs) are commonly used in many numerical methods. Code differentiation method may be used to compute these matrices to machine accuracy. 5. Closure In this paper, we presented a generalized method, automatic code differentiation, to perform unintrusive derivative computation in nanoscale thermal transport simulations. This method exploits the concepts of templating and operator overloading in the C++ programming language to easily convert existing codes into those providing derivatives. The idea was shown to be useful in computing phonon properties such as force constants, the Gruneisen parameter and group velocity. Examples of derivative computation using code differentiation were discussed and the results are compared with those obtained using finite difference approaches or with analytical values.
Jo
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
pro of
Journal Pre-proof
Data Availability
The data supporting the results in this paper is available upon request from the corresponding author, PM. Acknowledgments Support of J. Murthy and P. Marepalli under NSF grant #1758004 is gratefully acknowledged.
23
Journal Pre-proof
References
[5] [6] [7]
[8]
[9] [10]
[11] [12]
[13] [14]
[15]
[16]
[17] [18] [19]
pro of
[4]
re-
[3]
lP
[2]
D. Singh, โFrequency and Polarization Resolved Phonon Transport in Carbon and Silicon Nanostructures,โ Theses Diss. Available ProQuest, pp. 1โ193, Jan. 2011. L. Lindsay, D. A. Broido, and N. Mingo, โFlexural Phonons and Thermal Transport in Graphene,โ Phys. Rev. B, vol. 82, no. 11, p. 115427, Sep. 2010. A. Cepellotti, G. Fugallo, L. Paulatto, M. Lazzeri, F. Mauri, and N. Marzari, โPhonon Hydrodynamics in Two-Dimensional Materials,โ Nat. Commun., vol. 6, Mar. 2015. G. Chen, M. S. Dresselhaus, G. Dresselhaus, J.-P. Fleurial, and T. Caillat, โRecent Developments in Thermoelectric Materials,โ Int. Mater. Rev., vol. 48, no. 1, pp. 45โ66, Feb. 2003. M. S. Dresselhaus et al., โNew Directions for Low-Dimensional Thermoelectric Materials,โ Adv. Mater., vol. 19, no. 8, pp. 1043โ1053, Apr. 2007. S. Z. Butler et al., โProgress, Challenges, and Opportunities in Two-Dimensional Materials Beyond Graphene,โ ACS Nano, vol. 7, no. 4, pp. 2898โ2926, Apr. 2013. B. Qiu and X. Ruan, โMolecular Dynamics Simulations of Lattice Thermal Conductivity of Bismuth Telluride Using Two-Body Interatomic Potentials,โ Phys. Rev. B, vol. 80, no. 16, p. 165203, Oct. 2009. B. Qiu et al., โFirst-Principles Simulation of Electron Mean-Free-Path Spectra and Thermoelectric Properties in Silicon,โ EPL Europhys. Lett., vol. 109, no. 5, p. 57006, Mar. 2015. C. Liu, F. Li, L.-P. Ma, and H.-M. Cheng, โAdvanced Materials for Energy Storage,โ Adv. Mater., vol. 22, no. 8, pp. E28โE62, Feb. 2010. H. Ji et al., โEnhanced Thermal Conductivity of Phase Change Materials with UltrathinGraphite Foams for Thermal Energy Storage,โ Energy Environ. Sci., vol. 7, no. 3, pp. 1185โ1192, Feb. 2014. J. M. Ziman, Electrons and Phonons: The Theory of Transport Phenomena in Solids. Oxford, New York: Oxford University Press, 2001. T. Feng and X. Ruan, โFour-Phonon Scattering Reduces Intrinsic Thermal Conductivity of Graphene and the Contributions from Flexural Phonons,โ Phys. Rev. B, vol. 97, no. 4, p. 045202, Jan. 2018. Y.-J. Han and P. G. Klemens, โAnharmonic Thermal Resistivity of Dielectric Crystals at Low Temperatures,โ Phys. Rev. B, vol. 48, no. 9, pp. 6033โ6042, Sep. 1993. J. Fabian and P. B. Allen, โThermal Expansion and Gruneisen Parameters of Amorphous Silicon: A Realistic Model Calculation,โ Phys. Rev. Lett., vol. 79, no. 10, pp. 1885โ1888, Sep. 1997. R. J. C. Bennie F. Blackwell Kevin J. Dowding, โDevelopment and Implementation of Sensitivity Coefficient Equations for Heat Conduction Problems,โ Numer. Heat Transf. Part B Fundam., vol. 36, no. 1, pp. 15โ32, Aug. 1999. B.J. Henz, K.K. Tamma, R. Kanapady, N.D. Ngo, and P.W. Chung, โProcess Modeling of Composites by Resin Transfer Molding: Practical Applications of Sensitivity Analysis for Isothermal Considerations,โ Int. J. Numer. Methods Heat Fluid Flow, vol. 13, no. 4, pp. 415โ447, Jun. 2003. D. A. Tortorelli and P. Michaleris, โDesign Sensitivity Analysis: Overview and Review,โ Inverse Probl. Eng., vol. 1, no. 1, pp. 71โ105, Oct. 1994. A. Griewank and A. Walther, Evaluating Derivatives: Principles and Techniques of Algorithmic Differentiation, Second Edition. SIAM, 2008. C. H. Bischof, H. M. Bรผcker, P. Hovland, U. Naumann, and J. Utke, Advances in Automatic Differentiation, 1st ed. Springer Publishing Company, Incorporated, 2008.
urn a
[1]
Jo
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
24
Journal Pre-proof
[25] [26]
[27]
[28] [29]
[30] [31]
[32]
[33] [34] [35] [36]
[37] [38]
[39]
pro of
[24]
re-
[22] [23]
lP
[21]
A. G. Baydin, B. A. Pearlmutter, A. A. Radul, and J. M. Siskind, โAutomatic Differentiation in Machine Learning: a Survey,โ J. Mach. Learn. Res., vol. 18, no. 153, pp. 1โ43, 2018. M. Abadi et al., โTensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems,โ ArXiv160304467 Cs, Mar. 2016. A. Paszke et al., โAutomatic Differentiation in Pytorch,โ Oct. 2017. J. Y. Murthy and S. R. Mathur, โComputational Heat Transfer in Complex Systems: A Review of Needs and Opportunities,โ J. Heat Transf., vol. 134, no. 3, pp. 031016โ031016, Jan. 2012. P. Giannozzi et al., โQUANTUM ESPRESSO: A Modular and Open-Source Software Project for Quantum Simulations of Materials,โ J. Phys. Condens. Matter, vol. 21, no. 39, p. 395502, Sep. 2009. S. Plimpton, โFast Parallel Algorithms for Short-Range Molecular Dynamics,โ J. Comput. Phys., vol. 117, no. 1, pp. 1โ19, Mar. 1995. W. Li, J. Carrete, N. A. Katcho, and N. Mingo, โShengbte: A Solver of the Boltzmann Transport Equation for Phonons,โ Comput. Phys. Commun., vol. 185, no. 6, pp. 1747โ1758, Jun. 2014. A. Debernardi and S. Baroni, โThird-Order Density-Functional Perturbation Theory: A Practical Implementation with Applications to Anharmonic Couplings in Si,โ Solid State Commun., vol. 91, no. 10, pp. 813โ816, Sep. 1994. K. Esfarjani and H. T. Stokes, โMethod to Extract Anharmonic Force Constants from First Principles Calculations,โ Phys. Rev. B, vol. 77, no. 14, p. 144112, Apr. 2008. X. Gonze, J.-C. Charlier, D. C. Allan, and M. P. Teter, โInteratomic Force Constants from First Principles: The Case of $\alpha${}-Quartz,โ Phys. Rev. B, vol. 50, no. 17, pp. 13035โ 13038, Nov. 1994. G. Lang et al., โAnharmonic Line Shift and Linewidth of the Raman Mode in Covalent Semiconductors,โ Phys. Rev. B, vol. 59, no. 9, pp. 6182โ6188, Mar. 1999. K. Kunc and R. M. Martin, โ\textit{Ab Initio} Force Constants of GaAs: A New Approach to Calculation of Phonons and Dielectric Properties,โ Phys. Rev. Lett., vol. 48, no. 6, pp. 406โ409, Feb. 1982. K. Parlinski, Z. Q. Li, and Y. Kawazoe, โFirst-Principles Determination of the Soft Mode in Cubic ${\mathrm{ZrO}}_{2}$,โ Phys. Rev. Lett., vol. 78, no. 21, pp. 4063โ4066, May 1997. M. H. F. Sluiter, M. Weinert, and Y. Kawazoe, โForce Constants for Substitutional Alloys,โ Phys. Rev. B, vol. 59, no. 6, pp. 4100โ4111, Feb. 1999. M. H. F. Sluiter, M. Weinert, and Y. Kawazoe, โDetermination of the Elastic Tensor in Low-Symmetry Structures,โ EPL Europhys. Lett., vol. 43, no. 2, p. 183, Jul. 1998. J. Tersoff, โNew Empirical Approach for the Structure and Energy of Covalent Systems,โ Phys. Rev. B, vol. 37, no. 12, pp. 6991โ7000, Apr. 1988. D. W. Brenner, โEmpirical Potential for Hydrocarbons for Use in Simulating the Chemical Vapor Deposition of Diamond Films,โ Phys. Rev. B, vol. 42, no. 15, pp. 9458โ9471, Nov. 1990. M. Z. Bazant, E. Kaxiras, and J. F. Justo, โEnvironment Dependent Interatomic Potential for Bulk Silicon,โ Phys. Rev. B, vol. 56, no. 14, pp. 8542โ8552, Oct. 1997. L. Lindsay and D. A. Broido, โOptimized Tersoff and Brenner Empirical Potential Parameters for Lattice Dynamics and Phonon Thermal Transport in Carbon Nanotubes and Graphene,โ Phys. Rev. B, vol. 81, no. 20, p. 205441, May 2010. T. Feng and X. Ruan, โQuantum mechanical prediction of four-phonon scattering rates and reduced thermal conductivity of solids,โ Phys. Rev. B, vol. 93, no. 4, p. 045202, Jan. 2016.
urn a
[20]
Jo
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
25
Journal Pre-proof
pro of
[43]
re-
[42]
lP
[41]
N. Mounet and N. Marzari, โFirst-Principles Determination of the Structural, Vibrational and Thermodynamic Properties of Diamond, Graphite, and Derivatives,โ Phys. Rev. B, vol. 71, no. 20, p. 205214, May 2005. D. Singh, J. Y. Murthy, and T. S. Fisher, โOn the Accuracy of Classical and Long Wavelength Approximations for Phonon Transport in Graphene,โ J. Appl. Phys., vol. 110, no. 11, p. 113510, Dec. 2011. J.-S. Wang, J. Wang, and J. T. Lรผ, โQuantum thermal transport in nanostructures,โ Eur. Phys. J. B, vol. 62, no. 4, pp. 381โ404, Apr. 2008. J. M. Soler et al., โThe SIESTA Method for Ab Initio Order- N Materials Simulation,โ J. Phys. Condens. Matter, vol. 14, no. 11, p. 2745, 2002.
urn a
[40]
Jo
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
26
Journal Pre-proof
Declaration of Interest Statement
Jo
urn a
lP
re-
pro of
Declaration of interest:None
urn a
lP
re-
pro of
Journal Pre-proof
๐ถ๐ฃ (J/m3/K)
T=300K
T=900K
(โ๐ = 2% ๐๐ ๐)
4 atoms (0.44 nm)
2.20E+05
3.47E+03
1.58
8 atoms (0.88 nm)
1.28E+05
2.77E+03
2.16
24 atoms (2.66 nm)
8.31E+04
2.29E+03
2.76
4 atoms (0.44 nm)
1.83E+06
3.64E+04
1.99
8 atoms (0.88 nm)
1.58E+06
3.62E+04
2.29
24 atoms (2.66 nm)
1.41E+06
3.59E+04
2.54
4 atoms (0.44 nm)
3.67E+06
2.11E+04
0.58
8 atoms (0.88 nm)
3.57E+06
2.42E+04
0.68
24 atoms (2.66 nm)
3.50E+06
2.62E+04
0.75
Jo
T=40K
๐๐ถ๐ฃ โ โ๐ ๐๐
Percentage Variation in Specific Heat
โ๐ถ๐ฃ โ