Constrained DFT-based magnetic machine-learning potentials for magnetic alloys: a case study of FeAl | Scientific … – Nature.com

Magnetic multi-component moment tensor potential (mMTP)

The concept of the magnetic multi-component Moment Tensor Potential (mMTP) presented in the current research is based on the previously developed non-magnetic MTP for multi-component systems [41, 42] and the magnetic MTP for single-component systems [35].

The mMTP potential is local, i.e., the energy of the atomistic system is a sum of energies of individual atoms:

$$\begin{aligned} E = \sum _{i=1}^{N_a}E_i, \end{aligned}$$

(1)

where \(i\) indexes the individual atoms in an \(N_a\)-atom system. We note that any configuration includes lattice vectors \({\boldsymbol{L}} = \{{\boldsymbol{l}}_1,{\boldsymbol{l}}_2,{\boldsymbol{l}}_3\}\), atomic positions \({\boldsymbol{R}} = \{{\boldsymbol{r}}_1, \ldots , {\boldsymbol{r}}_{N_a}\}\), types \(Z = \{z_1,\ldots ,z_{N_a}\}\) (we also denote by \(N_{\rm types}\) the total number of atomic types in the system), and magnetic moments \(M = \{m_1,\ldots ,m_{N_a}\}\). The energy of atom \(i\), \(E_i\), in turn has the form:

$$\begin{aligned} E_i = \sum _{\alpha =1}^{\alpha _{\rm max}} \xi _{\alpha }B_{\alpha }({\mathfrak n}_i), \end{aligned}$$

(2)

where \({\boldsymbol{\xi }} = \{\xi _{\alpha }\}\) are the linear parameters to be optimized and \(B_\alpha\) are the so-called basis functions, which are contractions of the descriptors [25] of the atomistic environment \({\mathfrak n}_i\), yielding a scalar. The \(\alpha _{\rm max}\) parameter can be changed to provide potentials with different numbers of parameters [35].
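Equations (1) and (2) amount to a linear model in the basis functions. The following is a minimal sketch of that structure only; the basis-function values are made-up numbers rather than real contractions of descriptors, and the function names are hypothetical.

```python
import numpy as np

def atomic_energy(xi, basis_values):
    """E_i = sum_alpha xi_alpha * B_alpha(n_i), as in Eq. (2)."""
    return float(np.dot(xi, basis_values))

def total_energy(xi, basis_per_atom):
    """E = sum_i E_i, as in Eq. (1).

    basis_per_atom has shape (N_a, alpha_max): one row of B_alpha values per atom.
    """
    return float(np.sum(np.asarray(basis_per_atom) @ np.asarray(xi)))

# Illustrative numbers only: 3 atoms, alpha_max = 2.
xi = np.array([0.5, -1.0])
B = np.array([[1.0, 0.2],
              [0.8, 0.1],
              [1.2, 0.3]])
E = total_energy(xi, B)
```

Because the energy is linear in \(\xi\), fitting these parameters for fixed radial parameters reduces to a linear least-squares problem.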

The descriptors are composed of a radial part, i.e., a scalar function depending on the interatomic distances and atomic magnetic moments, and an angular part, which is a tensor of rank \(\nu\):

$$\begin{aligned} M_{\mu ,\nu }({\mathfrak n}_i)=\sum _{j} f_{\mu }(|{\boldsymbol{r}}_{ij}|,z_i,z_j,m_i,m_j)\underbrace{{\boldsymbol{r}}_{ij}\otimes \cdots \otimes {\boldsymbol{r}}_{ij}}_{\nu \text { times}}, \end{aligned}$$

(3)

where \({\mathfrak n}_i\) stands for the atomic environment, including all the atoms within the distance \(R_{\rm cut}\) of the central atom \(i\), \(\mu\) is the index of the radial function, \(\nu\) is the rank of the angular part tensor, \(|{\boldsymbol{r}}_{ij}|\) is the distance between the atoms \(i\) and \(j\), \(z_i\) and \(z_j\) are the atomic types, and \(m_i\) and \(m_j\) are the magnetic moments of the atoms.
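The descriptor in Eq. (3) is a weighted sum of repeated outer products of the neighbor vectors. A minimal NumPy sketch follows, assuming the radial values \(f_\mu\) have already been evaluated for each neighbor; the function name and the sample numbers are hypothetical.

```python
from functools import reduce
import numpy as np

def moment_tensor(radial_values, r_ij, nu):
    """M_{mu,nu} = sum_j f_mu(...) * (r_ij x ... x r_ij), with nu outer-product factors.

    radial_values: shape (n_neighbors,), precomputed f_mu for each neighbor j
    r_ij:          shape (n_neighbors, 3), vectors from atom i to each neighbor j
    """
    result = np.zeros((3,) * nu)
    for f, r in zip(radial_values, r_ij):
        # Outer product of nu copies of r; for nu = 0 this is the scalar 1.
        outer = reduce(np.multiply.outer, [r] * nu, np.array(1.0))
        result = result + f * outer
    return result

# Two neighbors with illustrative radial values.
f_vals = np.array([1.0, 2.0])
vecs = np.array([[1.0, 0.0, 0.0],
                 [0.0, 2.0, 0.0]])
M0 = moment_tensor(f_vals, vecs, 0)   # scalar
M1 = moment_tensor(f_vals, vecs, 1)   # vector
M2 = moment_tensor(f_vals, vecs, 2)   # 3x3 matrix
```

Contracting such tensors to scalars (for example, taking the trace of the rank-2 tensor) is what produces rotationally invariant basis functions.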

The radial functions are expanded in a basis of Chebyshev polynomials:

$$\begin{aligned} f_{\mu }(|{\boldsymbol{r}}_{ij}|,z_i,z_j,m_i,m_j) = \sum _{\zeta =1}^{N_{\phi }} \sum _{\beta =1}^{N_{\psi }}\sum _{\gamma =1}^{N_{\psi }}c_{\mu ,z_i,z_j}^{\zeta ,\beta ,\gamma } \phi _{\zeta }(|{\boldsymbol{r}}_{ij}|) \psi _{\beta }(m_i)\psi _{\gamma }(m_j) (R_{\rm cut} - |{\boldsymbol{r}}_{ij}|)^2. \end{aligned}$$

(4)

Here \({\boldsymbol{c}} = \{c_{\mu ,z_i,z_j}^{\zeta ,\beta ,\gamma }\}\) are the radial parameters to be optimized, and each of the functions \(\phi _{\zeta }(|{\boldsymbol{r}}_{ij}|)\), \(\psi _{\beta }(m_i)\), \(\psi _{\gamma }(m_j)\) is a Chebyshev polynomial of order \(\zeta\), \(\beta\) and \(\gamma\), respectively, taking values from \(-1\) to 1. The function \(\phi _{\zeta }(|{\boldsymbol{r}}_{ij}|)\) yields the dependency on the distance between the atoms \(i\) and \(j\), while the functions \(\psi _{\beta }(m_i)\) and \(\psi _{\gamma }(m_j)\) yield the dependency on the magnetic moments of the atoms \(i\) and \(j\), respectively. The arguments of the functions \(\phi _{\zeta }(|{\boldsymbol{r}}_{ij}|)\) are on the interval \((R_{\rm min},R_{\rm cut})\), where \(R_{\rm min}\) and \(R_{\rm cut}\) are the minimum and maximum distance, respectively, between the interacting atoms. The functions \(\psi _{\beta }(m_i)\) and \(\psi _{\gamma }(m_j)\) have the same structure, which we explain for the former. The argument of the function \(\psi _{\beta }(m_i)\) is the magnetic moment of the atom \(i\), taking values on the interval \((-M_{\rm max}^{z_i},M_{\rm max}^{z_i})\). The value \(M_{\rm max}^{z_i}\) itself depends on the type of atom \(z_i\), and is determined as the maximal absolute value of the magnetic moment for atom type \(z_i\) in the training set. Similar to the conventional MTP, the term \((R_{\rm cut} - |{\boldsymbol{r}}_{ij}|)^2\) provides smooth decay to 0 when approaching the \(R_{\rm cut}\) distance, in accordance with the locality principle (1).
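To make the structure of Eq. (4) concrete, here is a minimal sketch of one radial function for a fixed \(\mu\) and a fixed pair of types. Several choices are assumptions that the text does not specify: the arguments are mapped affinely onto \([-1, 1]\), the first summation index is taken to correspond to the zeroth-order polynomial \(T_0\), and all function names are hypothetical. The numerical values of \(R_{\rm min}\), \(R_{\rm cut}\) and \(M_{\rm max}\) are used purely for illustration.

```python
import numpy as np

def chebyshev(n, x):
    """Chebyshev polynomial of the first kind T_n(x) for x in [-1, 1]."""
    return np.cos(n * np.arccos(np.clip(x, -1.0, 1.0)))

def to_unit_interval(x, lo, hi):
    """Affine map of [lo, hi] onto [-1, 1] (assumed scaling)."""
    return (2.0 * x - (hi + lo)) / (hi - lo)

def radial_function(r, m_i, m_j, c, r_min, r_cut, m_max_i, m_max_j):
    """Eq. (4) for one fixed (mu, z_i, z_j); c has shape (N_phi, N_psi, N_psi)."""
    if r >= r_cut:
        return 0.0
    x_r = to_unit_interval(r, r_min, r_cut)
    x_i = to_unit_interval(m_i, -m_max_i, m_max_i)
    x_j = to_unit_interval(m_j, -m_max_j, m_max_j)
    n_phi, n_psi, _ = c.shape
    # Assumption: summation index 1 corresponds to T_0, index 2 to T_1, etc.
    phi = np.array([chebyshev(k, x_r) for k in range(n_phi)])
    psi_i = np.array([chebyshev(k, x_i) for k in range(n_psi)])
    psi_j = np.array([chebyshev(k, x_j) for k in range(n_psi)])
    total = np.einsum("zbg,z,b,g->", c, phi, psi_i, psi_j)
    return float(total * (r_cut - r) ** 2)

# Only the term linear in m_i is switched on, so the value is odd in m_i.
c = np.zeros((2, 2, 2))
c[0, 1, 0] = 1.0
value_plus = radial_function(3.5, 1.5, 1.0, c, 2.1, 4.5, 3.0, 3.0)
value_minus = radial_function(3.5, -1.5, 1.0, c, 2.1, 4.5, 3.0, 3.0)
```

With this choice of coefficients the two values differ in sign, which illustrates that terms of odd order in a magnetic moment change sign when that moment is flipped.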

We note that the magnetic degrees of freedom \(m_i\) in (4) are collinear, i.e., they can take negative or positive values as projections onto the Z axis (though the choice of the axis is arbitrary). Thus, in comparison to non-magnetic atomistic systems with \(N\) atoms, in which the number of degrees of freedom equals \(4N\) (namely \(3N\) for coordinates and \(N\) for types), the description of magnetic systems introduces \(N\) additional degrees of freedom, one magnetic moment \(m_i\) per atom. The number of parameters entering the radial functions (Eq. (4)) also increases in mMTP compared to the conventional MTP [41, 42]. Namely, in MTP this number equals \(N_{\mu } \cdot N_{\phi } \cdot N_{\rm types}^2\), while in mMTP it is \(N_{\mu } \cdot N_{\phi } \cdot N_{\rm types}^2 \cdot N_{\psi }^2\). Thus, if we take \(N_{\psi } = 2\) (which is used in the current research), the number of parameters entering the radial functions is four times larger in mMTP than in MTP.
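The parameter count above can be checked with a few lines of arithmetic. In this sketch the values of \(N_\mu\) and \(N_\phi\) are hypothetical placeholders (the text does not state them); only \(N_\psi = 2\) and the two atomic types of FeAl come from the text.

```python
def n_radial_params(n_mu, n_phi, n_types, n_psi=1):
    """Number of radial parameters: N_mu * N_phi * N_types^2 * N_psi^2.

    n_psi = 1 reproduces the non-magnetic MTP count.
    """
    return n_mu * n_phi * n_types**2 * n_psi**2

# Hypothetical n_mu and n_phi; n_types = 2 (Fe, Al) and n_psi = 2 as in the text.
n_mtp = n_radial_params(n_mu=2, n_phi=8, n_types=2)
n_mmtp = n_radial_params(n_mu=2, n_phi=8, n_types=2, n_psi=2)
ratio = n_mmtp / n_mtp
```

The ratio equals \(N_\psi^2\) regardless of the placeholder values, since every other factor cancels.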

We denote all the mMTP parameters by \({\boldsymbol{\theta }}= \{{\boldsymbol{\xi }}, {\boldsymbol{c}}\}\) and the total energy (1) of the atomic system by \(E=E({\boldsymbol{\theta }})=E({\boldsymbol{\theta }};M)=E({\boldsymbol{\theta }};{\boldsymbol{L}},{\boldsymbol{R}},Z,M)\).

The functional form (4) includes collinear magnetic moments. However, the resulting energy is not invariant with respect to the inversion of magnetic moments, i.e., \(E({\boldsymbol{\theta }};M) \ne E({\boldsymbol{\theta }};-M)\), while both the original and the spin-inverted configurations should yield the same energy due to the arbitrary orientation of the projection axis. We refer to this property as the magnetic symmetry.

We use data augmentation followed by explicit symmetrization with respect to magnetic moments to train a symmetric mMTP, as we discuss below. Assume we have \(K\) configurations in the training set with DFT energies \(E_k^{\rm DFT}\), forces \({\boldsymbol{f}}^{\rm DFT}_{i,k}\), and stresses \(\sigma ^{\rm DFT}_{ab,k}\) (\(a,b=1,2,3\)) calculated. We find the optimal parameters \(\bar{{\boldsymbol{\theta }}}\) (i.e., fit the mMTP) by minimizing the objective function:

$$\begin{aligned} &\sum _{k=1}^{K} \Biggl [ w_{\rm e} \Biggl | \frac{E_k ({\boldsymbol{\theta }}; M) + E_{k}({\boldsymbol{\theta }}; -M)}{2} - E_{k}^{\rm DFT}\Biggr |^2 \\&\quad + w_{\rm f} \sum _{i=1}^{N_a} \Biggl | \frac{{\boldsymbol{f}}_{i,k}({\boldsymbol{\theta }};M) + {\boldsymbol{f}}_{i,k}({\boldsymbol{\theta }};-M)}{2} - {\boldsymbol{f}}^{\rm DFT}_{i,k}\Biggr |^2 \\&\quad +w_{\rm s} \sum _{a,b=1}^{3} \Biggl | \frac{\sigma _{ab,k}({\boldsymbol{\theta }};M)+\sigma _{ab,k}({\boldsymbol{\theta }};-M)}{2} -\sigma ^{\rm DFT}_{ab,k}\Biggr |^2 \Biggr ], \end{aligned}$$

(5)

where \(w_{\rm e}\), \(w_{\rm f}\), and \(w_{\rm s}\) are non-negative weights. By minimizing (5) we find optimal parameters \(\bar{{\boldsymbol{\theta }}}\) that yield \(E_k (\bar{{\boldsymbol{\theta }}}; M) \approx E_k (\bar{{\boldsymbol{\theta }}}; -M)\), \(k = 1, \ldots , K\) (the same holds for the mMTP forces and stresses), i.e., we symmetrize the training set to make the mMTP learn the required symmetry from the data itself; this is called data augmentation.
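The contribution of a single configuration to the objective function (5) can be sketched as follows. This is an illustration of the formula only, not the fitting code used in the study: the dictionary layout and the function name are hypothetical, and the default weights are the values \(w_{\rm e} = 1\), \(w_{\rm f} = 0.01\), \(w_{\rm s} = 0.001\) quoted in the text.

```python
import numpy as np

def symmetrized_loss(pred_plus, pred_minus, dft, w_e=1.0, w_f=0.01, w_s=0.001):
    """One configuration's term in Eq. (5).

    pred_plus / pred_minus: model predictions for moments M and -M.
    dft: reference values.
    Each is a dict with keys "energy" (float), "forces" (N_a, 3), "stress" (3, 3).
    """
    # Average the predictions for M and -M before comparing with DFT.
    e_avg = 0.5 * (pred_plus["energy"] + pred_minus["energy"])
    f_avg = 0.5 * (np.asarray(pred_plus["forces"]) + np.asarray(pred_minus["forces"]))
    s_avg = 0.5 * (np.asarray(pred_plus["stress"]) + np.asarray(pred_minus["stress"]))
    loss = w_e * (e_avg - dft["energy"]) ** 2
    loss += w_f * np.sum((f_avg - np.asarray(dft["forces"])) ** 2)
    loss += w_s * np.sum((s_avg - np.asarray(dft["stress"])) ** 2)
    return float(loss)

# Illustrative two-atom configuration with a 1.0 error in one force component.
reference = {"energy": -5.0, "forces": np.zeros((2, 3)), "stress": np.zeros((3, 3))}
prediction = {"energy": -5.0, "forces": np.zeros((2, 3)), "stress": np.zeros((3, 3))}
prediction["forces"][0, 0] = 1.0
loss_value = symmetrized_loss(prediction, prediction, reference)
```

The full objective is the sum of such terms over all \(K\) training configurations.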

Next, we modify the mMTP so that the energy used in the simulations (e.g., relaxation of configurations) satisfies the symmetry exactly:

$$\begin{aligned} E^{\rm symm}(\bar{{\boldsymbol{\theta }}};M) = \dfrac{E(\bar{{\boldsymbol{\theta }}};M)+E(\bar{{\boldsymbol{\theta }}};-M)}{2}. \end{aligned}$$

(6)

That is, we substitute the mMTP energy (1) into (6) and get a functional form which satisfies the exact identity \(E^{\rm symm}(\bar{{\boldsymbol{\theta }}};M) = E^{\rm symm}(\bar{{\boldsymbol{\theta }}};-M)\) for any configuration. We also note that \(E (\bar{{\boldsymbol{\theta }}}) \approx E^{\rm symm}(\bar{{\boldsymbol{\theta }}})\).
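The effect of Eq. (6) can be illustrated with a toy energy function. In the sketch below, `toy_energy` is a hypothetical stand-in for \(E(\bar{\boldsymbol{\theta}}; M)\) that is deliberately not symmetric in \(M\); it is not an mMTP.

```python
import numpy as np

def toy_energy(m):
    """Hypothetical stand-in for E(theta; M); the linear term breaks the symmetry."""
    m = np.asarray(m, dtype=float)
    return float(np.sum(m**2) + 0.1 * np.sum(m))

def symmetrized(energy_fn):
    """Wrap an energy function so that it returns (E(M) + E(-M)) / 2, Eq. (6)."""
    def energy_symm(m):
        m = np.asarray(m, dtype=float)
        return 0.5 * (energy_fn(m) + energy_fn(-m))
    return energy_symm

M = np.array([2.0, -1.0, 0.5])
E_symm = symmetrized(toy_energy)
asymmetry_before = toy_energy(M) - toy_energy(-M)   # nonzero
asymmetry_after = E_symm(M) - E_symm(-M)            # zero
```

Averaging cancels every contribution that is odd under \(M \rightarrow -M\), which is why the identity holds for any configuration and not only for those in the training set. The price is two energy evaluations per configuration.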

We use the cDFT approach with hard constraints (i.e., Lagrange multipliers) as proposed by Gonze et al. in Ref. [19]. One way to formulate it is to first note that in a single-point DFT calculation we minimize the Kohn-Sham total energy functional \(E[\rho ; {\boldsymbol{R}}]\) with respect to the electronic density \(\rho =\rho (r)\) (here \(\rho\) combines the spin-up and spin-down electron densities), keeping the nuclear positions \({\boldsymbol{R}}\) fixed. In other words, we solve the following minimization problem:

$$\begin{aligned} E_{\rm DFT}({\boldsymbol{R}}) = \min _\rho E[\rho ; {\boldsymbol{R}}], \end{aligned}$$

and from the optimal \(\rho ^* = \mathrm{arg\,min}_\rho \, E[\rho ; {\boldsymbol{R}}]\) we can, e.g., find the magnetization \(m(r) = \rho ^*_+(r) - \rho ^*_-(r)\), where the subscripts denote the spin-up (\(+\)) and spin-down (\(-\)) densities. The magnetic moment of the \(i\)th atom can be found by integrating \(m(r)\) over some region around the atom (the region depends on the partitioning scheme):

$$\begin{aligned} m_i = \int _{\Omega _i} m(r)\, \textrm{d}r. \end{aligned}$$

(7)

Since the minimizer \(\rho ^*\) depends on \({\boldsymbol{R}}\), the moments \(m_i\) are also functions of \({\boldsymbol{R}}\).

According to the cDFT approach [19], we now formulate the problem of minimizing \(E[\rho ; {\boldsymbol{R}}]\) in which not only is \({\boldsymbol{R}}\) kept fixed, but \(\rho\) is also allowed to vary only subject to the constraints (7):

$$\begin{aligned} \begin{array}{rcl} E_{\rm cDFT}({\boldsymbol{R}}, M) = & \min _\rho & E[\rho ; {\boldsymbol{R}}] \\ & \text {subject to} & m_i = \int _{\Omega _i} \big (\rho _{+}(r)-\rho _-(r)\big )\, \textrm{d}r. \end{array} \end{aligned}$$

The algorithmic details of how this minimization problem is solved, and how the energy derivatives (forces, stresses, torques) are computed, are given in Ref. [19].
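The idea of a hard constraint enforced by Lagrange multipliers can be illustrated on a finite-dimensional toy problem. The sketch below is not the cDFT algorithm of Ref. [19]: it minimizes a hypothetical quadratic "energy" \(\tfrac{1}{2}x^{\rm T}Ax - b^{\rm T}x\) subject to a linear constraint \(Cx = m\) by solving the stationarity conditions of the Lagrangian directly. All names (`A`, `b`, `C`, `m`) are illustrative.

```python
import numpy as np

def constrained_minimum(A, b, C, m):
    """Minimize 0.5 x^T A x - b^T x subject to C x = m.

    Stationarity of the Lagrangian gives the linear system
        A x + C^T lam = b,   C x = m,
    which is solved for x and the multipliers lam.
    """
    n, k = A.shape[0], C.shape[0]
    kkt = np.block([[A, C.T],
                    [C, np.zeros((k, k))]])
    rhs = np.concatenate([b, m])
    sol = np.linalg.solve(kkt, rhs)
    return sol[:n], sol[n:]

# Two variables playing the role of "spin-up" and "spin-down" populations;
# the constraint fixes their difference, in analogy with a target moment.
A = np.eye(2)
b = np.zeros(2)
C = np.array([[1.0, -1.0]])
m_target = np.array([1.0])
x_opt, lam = constrained_minimum(A, b, C, m_target)
```

In this toy problem the unconstrained minimum is \(x = 0\), whereas the constrained minimum satisfies the imposed difference exactly. The multiplier measures how strongly the constraint has to pull the solution away from the unconstrained minimum.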

We used the ABINIT code [43, 44] for the DFT and cDFT calculations (the cDFT method was recently developed and is described in Ref. [19]), with a \(6\times 6\times 6\) k-point mesh and a cutoff energy of 25 Hartree (about 680 eV). We utilized the PAW method with the PBE generalized gradient approximation. We applied constraints on the magnetic moments of all atoms during the cDFT calculations.

We fitted an ensemble of five mMTPs with 415 parameters in order to quantify the uncertainty of the mMTP predictions. For each mMTP we took \(R_{\rm min} = 2.1\) Å, \(R_{\rm cut} = 4.5\) Å, \(M_{\rm max}^{\rm Al} = 0.1~\mu _B\), and \(M_{\rm max}^{\rm Fe} = 3.0~\mu _B\). The weights in the objective function (5) were \(w_{\rm e} = 1\), \(w_{\rm f} = 0.01\) Å\(^2\), and \(w_{\rm s} = 0.001\).
