PEXSI

From ESL
Jump to navigation Jump to search

{{#if: Lin Lin, University of California Berkeley and Lawrence Berkeley National Laboratory, CA, US

Chao Yang, Lawrence Berkeley National Laboratory, CA, US |

Source authors:
Lin Lin, University of California Berkeley and Lawrence Berkeley National Laboratory, CA, US
Chao Yang, Lawrence Berkeley National Laboratory, CA, US

}}{{#if: BSD modified by LBNL |

License: BSD modified by LBNL

}}{{#if: Project page |

Download: Project page

}}{{#if: |

Documentation: {{{documentation}}}

}}{{#if:
  • Resolvent-based solver
  • |

    Links to other ESL entries

    | {{#if: |

    Links to other ESL entries

    | {{#if: |

    Links to other ESL entries

    | {{#if: |

    Links to other ESL entries

    |{{#if: |

    Links to other ESL entries

    | {{#if:
  • ELSI
  • |

    Links to other ESL entries

    |}}}}}}}}}}}}{{#if:
  • Resolvent-based solver
  • |

    Functionalities:

    }}{{#if: |

    Algorithms:

      {{{algorithms}}}

    }}{{#if: |

    Generic interfaces:

      {{{generic interfaces}}}

    }}{{#if: |

    APIs:

      {{{apis}}}

    }}{{#if: |

    Data standards:

      {{{data standards}}}

    }}{{#if:
  • ELSI
  • |

    Software:

    }}

    The Pole EXpansion and Selected Inversion method PEXSI is a fast method for evaluating certain selected elements of a matrix function and can be used for solving the KS-equation without explicit diagonalization. PEXSI is particularly advantageous for large systems since it reduces the computational effort to

    • for 3D systems
    • for (quasi-) 2D systems
    • for (quasi-) 1D systems

    Furthermore it is highly scalable on distributed memory parallel machines.

    The Selected Inversion by itself can be used for calculating specific elements of the inverse of a sparse matrix.

    Most of the following documentation originates from the PEXSI-homepage.

    Theory

    Basics

    Given a sparse square matrix and a certain function , the basic idea of PEXSI is to expand using a small number of rational functions (pole expansion)

    and to efficiently evaluate by evaluating selected elements (selected inversion).

    The currently supported form of includes:

    • : Matrix inversion. Since the matrix inversion is already represented as a single term of rational function (pole), no pole expansion is needed. The selected inversion method can be significantly faster than directly inverting the matrix and then extract the selected elements of the inverse.
    • : Fermi-Dirac function. This can be used as a "smeared" matrix sign function at , without assuming a spectral gap near . This can be used for evaluating the electron density for electronic structure calculation.
    Red: Fermi-Dirac function. Black: Matrix sign function

    For sparse matrices, the PEXSI method can be more efficient than the widely used diagonalization method for evaluating matrix functions, especially when a relatively large number of eigenpairs are needed to be computed in the diagonalization method.

    Application to electronic structure calculations

    PEXSI can be used to solve the generalized eigenvalue problem that has to be solved in each SCF-iteration of the KS-DFT formalism without explicit diagonalization. Instead the density matrix is computed directly. Other magnitudes (free energy, atomic forces, local density of states) are given by the same formalism with negligible effort. The PEXSI framework can also be used for calculating the total density of states very efficiently.

    With the pole expansion of the Fermi function

    the density matrix (in case of a non-orthogonal basis) becomes


    The chemical potential has to fulfill
    with the total number of electrons and the electron density in real space.

    For finding PEXSI implements a hybrid scheme, combining a bisection and a Newton method. This allows determining in a few iterations, even when starting with a wide guess and for difficult cases. Systems with a gap are naturally even easier to solve. If there is a good guess for available, for example from a previous SCF iteration, the bisection can be turned off.

    The PEXSI library provides two interfaces. A simple one, finding many of the parameters automatically, and another one offering extensive control of the behavior of the solver.

    An extensive description of the implementation of PEXSI in the numerical atomic orbital code SIESTA can be found in [1], and more usage information under Further_information.

    Parallelization

    Besides the favorable weak scaling, PEXSI can also scale to large numbers of processors due to its two-level parallelism:

    1. For accurate evaluation, the pole expansion usually takes 40-60 poles. The calculations on each of these poles are completely independent of each other, enabling an efficient parallelization on this level.
    2. The calculations on each pole can also be parallelized, with the number of processors that can be used efficiently strongly depending on the problem size, the matrix sparsity, and the sparsity pattern. Typically tens up to thousands of processors per pole are used. An extensive description of the selected inversion and its parallelization can be found in [2]

    Large examples need a minimum amount of processors to have enough memory available. Typically this number is much smaller than with dense direct methods like ScaLAPACK.

    Publications

    Algorithms:

    Implementations in electronic structure codes:

    • Implementation in SIESTA [1]

    Applications to physical systems:

    • 1D, 2D, and 3D model systems in [1]
    • Graphene Nanoflakes [8]

    License

    PEXSI is distributed under BSD license (modified by Lawrence Berkeley National Laboratory).

    Download

    A source tarball can be downloaded directly from here.

    Installation

    Prerequisites

    PEXSI requires an external parallel factorization or factorization routine, and an external parallel matrix reordering routine to reduce the fill-in of the factorization routine.
    Currently we use SuperLU_DIST for the parallel factorization, and ParMETIS for the parallel fill-in reducing reordering. It is also possible to use PT-Scotch for the reordering. But we recommend to first download ParMETIS.

    Instructions

    Edit make.inc

    Configuration of PEXSI is controlled by a single make.inc file. Examples of the make.inc file are given under the config/ directory.

    Find make.inc with the most similar architecture, and copy to the main PEXSI directory (using Edison for example, the latest Intel computer at NERSC). ${PEXSI_DIR} stands for the main directory of PEXSI.

     cd ${PEXSI_DIR}
     cp config/make.inc.edison.intel make.inc
    

    Edit the variables in make.inc

     PEXSI_DIR     = Main directory for PEXSI
     DSUPERLU_DIR  = Main directory for SuperLU_DIST
     METIS_DIR     = Main directory for METIS
     PARMETIS_DIR  = Main directory for ParMETIS 
     PTSCOTCH_DIR  = Main directory for PT-Scotch
    

    Note:
    PEXSI can be compiled using debug or release mode in by the variable COMPILE_MODE in make.inc. This variable mainly controls the compiling flag -DRELEASE. The debug mode introduces tracing of call stacks at all levels of functions, and may significantly slow down the code. For production runs, use release mode. The *.profile configuration files are for debugging purpose and can be ignored.

    Build the PEXSI library

    If make.inc is configured correctly,

     cd ${PEXSI_DIR}
     cd src
     make
    

    should produce libpexsi_(suffix).a under src/.

    Tests

    Build examples

    After libpexsi_(suffix).a is built, all driver routines are readily to be compiled. For example, the selected inversion for a complex matrix has the test routine

     cd ${PEXSI_DIR}
     cd examples
     make driver_pselinv_complex
    

    should produce driver_pselinv_complex, which can be executed with MPI.

    For more information on the examples, see Tutorial.

    Run tests

    After driver_pselinv_complex_(suffix) is compiled,

     examples$ mpirun -n 1 ./driver_pselinv_complex_(suffix)
    

    should return the diagonal of the matrix

    saved on the 0-th processor, where A is the five-point discretization of a Laplacian operator on a 2D domain. The result can be compared with examples/driver_pselinv_complex.out to check the correctness of the result.

    Programming interface

    The full documentation of the C/C++ and Fortran interfaces can be found on the project's page:

    • C/C++ interface
    • Fortran interface
    • On top of the fine grained interface there is also a simplified driver routine for DFT calculations available. This function contains both the inertia counting step for estimating the chemical potential, and the Newton's iteration for updating the chemical potential. Heuristics are built into this routine. A detailed description can be found on the project page.

    Further information

    The Homepage contains:

    A version of SIESTA with PEXSI 0.7.3 implemented is available as tarball from the SIESTA page under Other versions of Siesta (after selecting the applicable user agreement). The documentation delivered with this version also includes a description of the configuration of the PEXSI solver, which gives a general overview about how to use PEXSI for electronic structure calculations.
    For faster access there is an excerpt of the SIESTA-manual.

    Future developments

    • Generalization for k-points
    • Hybrid version MPI + multithreading, preferably with a task-oriented approach like OMPSS
    • Application of Selected Inversion in NEGF based transport code TranSIESTA
    • Port to Intel MIC

    References

    1. 1.0 1.1 1.2 L. Lin, A. Garcia, G. Huhs and C. Yang, SIESTA-PEXSI: Massively parallel method for efficient and accurate ab initio materials simulation without matrix diagonalization, Journal of Physics: Condensed Matter 26, 305503 (2014) journal
    2. 2.0 2.1 M. Jacquelin, L. Lin and C. Yang, PSelInv -- A Distributed Memory Parallel Algorithm for Selected Inversion : the Symmetric Case arXiv
    3. L. Lin, M. Chen, C. Yang and L. He, Accelerating atomic orbital-based electronic structure calculation via pole expansion and selected inversion, J. Phys. Condens. Matter 25, 295501, 2013 journal
    4. L. Lin, C. Yang, J. Meza, J. Lu, L. Ying and W. E, SelInv -- An algorithm for selected inversion of a sparse symmetric matrix, ACM Trans. Math. Software 37, 40, 2011 journal
    5. L. Lin, C. Yang, J. Lu, L. Ying and W. E, A Fast Parallel algorithm for selected inversion of structured sparse matrices with application to 2D electronic structure calculations, SIAM J. Sci. Comput. 33, 1329, 2011 journal
    6. L. Lin, J. Lu, L. Ying, R. Car and W. E, Fast algorithm for extracting the diagonal of the inverse matrix with application to the electronic structure analysis of metallic systems, Commun. Math. Sci. 7, 755, 2009 journal
    7. L. Lin, J. Lu, L. Ying and W. E, Pole-based approximation of the Fermi-Dirac function, Chin. Ann. Math. 30B, 729, 2009 journal
    8. W. Hu, L. Lin, C. Yang, J Yang, Electronic Structure of Large-Scale Graphene Nanoflakes, arXiv