Difference between revisions of "PEXSI"

From ESL
Jump to navigation Jump to search
(Re-do box for new standard)
Line 2: Line 2:
 
authors =  
 
authors =  
 
[http://math.berkeley.edu/~linlin/ Lin Lin], University of California Berkeley and Lawrence Berkeley National Laboratory, CA, US<br/>
 
[http://math.berkeley.edu/~linlin/ Lin Lin], University of California Berkeley and Lawrence Berkeley National Laboratory, CA, US<br/>
[http://crd.lbl.gov/about/staff/amsc/scientific-computing-group-scg/chao-yang/ Chao Yang], Lawrence Berkeley National Laboratory, CA, US
+
[http://crd.lbl.gov/about/staff/amsc/scientific-computing-group-scg/chao-yang/ Chao Yang], Lawrence Berkeley National Laboratory, CA, US |
|
 
 
esl-entry = [[User:ghuhs|Georg Huhs]] |
 
esl-entry = [[User:ghuhs|Georg Huhs]] |
 
license = BSD modified by LBNL |
 
license = BSD modified by LBNL |
 
typedel = External |
 
typedel = External |
download = [http://math.berkeley.edu/~linlin/pexsi/page_download.html Project page] |}}
+
download = [http://math.berkeley.edu/~linlin/pexsi/page_download.html Project page] |
 +
functionalities = <li>[[Resolvent-based solvers]]</li> |
 +
algorithms = <li>[[Pole expansion]]</li> <li>[[Selected inversion]]</li>}}
  
 
The '''P'''ole '''EX'''pansion and '''S'''elected '''I'''nversion method '''PEXSI'''  
 
The '''P'''ole '''EX'''pansion and '''S'''elected '''I'''nversion method '''PEXSI'''  

Revision as of 16:57, 6 August 2014



The Pole EXpansion and Selected Inversion method PEXSI is a fast method for evaluating certain selected elements of a matrix function and can be used for solving the KS-equation without explicit diagonalization. PEXSI is particularly advantageous for large systems since it reduces the computational effort to

  • for 3D systems
  • for (quasi-) 2D systems
  • for (quasi-) 1D systems

Furthermore it is highly scalable on distributed memory parallel machines.

The Selected Inversion by itself can be used for calculating specific elements of the inverse of a sparse matrix.

Most of the following documentation originates from the PEXSI-homepage.

Theory

Basics

Given a sparse square matrix and a certain function , the basic idea of PEXSI is to expand using a small number of rational functions (pole expansion)

and to efficiently evaluate by evaluating selected elements (selected inversion).

The currently supported form of includes:

  • : Matrix inversion. Since the matrix inversion is already represented as a single term of rational function (pole), no pole expansion is needed. The selected inversion method can be significantly faster than directly inverting the matrix and then extract the selected elements of the inverse.
  • : Fermi-Dirac function. This can be used as a "smeared" matrix sign function at , without assuming a spectral gap near . This can be used for evaluating the electron density for electronic structure calculation.
Red: Fermi-Dirac function. Black: Matrix sign function

For sparse matrices, the PEXSI method can be more efficient than the widely used diagonalization method for evaluating matrix functions, especially when a relatively large number of eigenpairs are needed to be computed in the diagonalization method.

Application to electronic structure calculations

PEXSI can be used to solve the generalized eigenvalue problem that has to be solved in each SCF-iteration of the KS-DFT formalism without explicit diagonalization. Instead the density matrix is computed directly. Other magnitudes (free energy, atomic forces, local density of states) are given by the same formalism with negligible effort. The PEXSI framework can also be used for calculating the total density of states very efficiently.

With the pole expansion of the Fermi function

the density matrix (in case of a non-orthogonal basis) becomes


The chemical potential has to fulfill
with the total number of electrons and the electron density in real space.

For finding PEXSI implements a hybrid scheme, combining a bisection and a Newton method. This allows determining in a few iterations, even when starting with a wide guess and for difficult cases. Systems with a gap are naturally even easier to solve. If there is a good guess for available, for example from a previous SCF iteration, the bisection can be turned off.

The PEXSI library provides two interfaces. A simple one, finding many of the parameters automatically, and another one offering extensive control of the behavior of the solver.

An extensive description of the implementation of PEXSI in the numerical atomic orbital code SIESTA can be found in [1]

Parallelization

Besides the favorable weak scaling, PEXSI can also scale to large numbers of processors due to its two-level parallelism:

  1. For accurate evaluation, the pole expansion usually takes 40-60 poles. The calculations on each of these poles are completely independent of each other, enabling an efficient parallelization on this level.
  2. The calculations on each pole can also be parallelized, with the number of processors that can be used efficiently strongly depending on the problem size, the matrix sparsity, and the sparsity pattern. Typically tens up to thousands of processors per pole are used. An extensive description of the selected inversion and its parallelization can be found in [2]

Large examples need a minimum amount of processors to have enough memory available. Typically this number is much smaller than with dense direct methods like ScaLAPACK.

Publications

  • Implementation of PEXSI into SIESTA [1]
  • PEXSI [3]
  • (Parallel) Selected Inversion [2] [4] [5] [6]
  • Pole Expansion [7]

License

PEXSI is distributed under BSD license (modified by Lawrence Berkeley National Laboratory).

License text

Download

A source tarball can be downloaded directly from here.

Installation

Prerequisites

PEXSI requires an external parallel factorization or factorization routine, and an external parallel matrix reordering routine to reduce the fill-in of the factorization routine.
Currently we use SuperLU_DIST for the parallel factorization, and ParMETIS for the parallel fill-in reducing reordering. It is also possible to use PT-Scotch for the reordering. But we recommend to first download ParMETIS.

Instructions

Edit make.inc

Configuration of PEXSI is controlled by a single make.inc file. Examples of the make.inc file are given under the config/ directory.

Find make.inc with the most similar architecture, and copy to the main PEXSI directory (using Edison for example, the latest Intel computer at NERSC). ${PEXSI_DIR} stands for the main directory of PEXSI.

 cd ${PEXSI_DIR}
 cp config/make.inc.edison.intel make.inc

Edit the variables in make.inc

 PEXSI_DIR     = Main directory for PEXSI
 DSUPERLU_DIR  = Main directory for SuperLU_DIST
 METIS_DIR     = Main directory for METIS
 PARMETIS_DIR  = Main directory for ParMETIS 
 PTSCOTCH_DIR  = Main directory for PT-Scotch

Note:
PEXSI can be compiled using debug or release mode in by the variable COMPILE_MODE in make.inc. This variable mainly controls the compiling flag -DRELEASE. The debug mode introduces tracing of call stacks at all levels of functions, and may significantly slow down the code. For production runs, use release mode. The *.profile configuration files are for debugging purpose and can be ignored.

Build the PEXSI library

If make.inc is configured correctly,

 cd ${PEXSI_DIR}
 cd src
 make

should produce libpexsi_(suffix).a under src/.

Tests

Build examples

After libpexsi_(suffix).a is built, all driver routines are readily to be compiled. For example, the selected inversion for a complex matrix has the test routine

 cd ${PEXSI_DIR}
 cd examples
 make driver_pselinv_complex

should produce driver_pselinv_complex, which can be executed with MPI.

For more information on the examples, see Tutorial.

Run tests

After driver_pselinv_complex is compiled,

 examples$ mpirun -n 1 ./driver_pselinv_complex

should return the diagonal of the matrix

saved on the 0-th processor, where A is the five-point discretization of a Laplacian operator on a 2D domain. The result can be compared with examples/driver_pselinv_complex.out to check the correctness of the result.


Programming interface

The full documentation of the C/C++ and Fortran interfaces can be found on the project's page:

Further information

The Homepage contains:

Future developments

  • Generalization for k-points
  • MPI-OpenMP hybrid version
  • Port to Intel MIC

References

  1. 1.0 1.1 L. Lin, A. Garcia, G. Huhs and C. Yang, SIESTA-PEXSI: Massively parallel method for efficient and accurate ab initio materials simulation without matrix diagonalization, arXiv
  2. 2.0 2.1 M. Jacquelin, L. Lin and C. Yang, PSelInv -- A Distributed Memory Parallel Algorithm for Selected Inversion : the Symmetric Case arXiv
  3. L. Lin, M. Chen, C. Yang and L. He, Accelerating atomic orbital-based electronic structure calculation via pole expansion and selected inversion, J. Phys. Condens. Matter 25, 295501, 2013 journal
  4. L. Lin, C. Yang, J. Meza, J. Lu, L. Ying and W. E, SelInv -- An algorithm for selected inversion of a sparse symmetric matrix, ACM Trans. Math. Software 37, 40, 2011 journal
  5. L. Lin, C. Yang, J. Lu, L. Ying and W. E, A Fast Parallel algorithm for selected inversion of structured sparse matrices with application to 2D electronic structure calculations, SIAM J. Sci. Comput. 33, 1329, 2011 journal
  6. L. Lin, J. Lu, L. Ying, R. Car and W. E, Fast algorithm for extracting the diagonal of the inverse matrix with application to the electronic structure analysis of metallic systems, Commun. Math. Sci. 7, 755, 2009 journal
  7. L. Lin, J. Lu, L. Ying and W. E, Pole-based approximation of the Fermi-Dirac function, Chin. Ann. Math. 30B, 729, 2009 journal