# PEXSI

{{#if: Lin Lin, University of California Berkeley and Lawrence Berkeley National Laboratory, CA, US

**Source authors:**

Lin Lin, University of California Berkeley and Lawrence Berkeley National Laboratory, CA, US

Chao Yang, Lawrence Berkeley National Laboratory, CA, US

**License:** BSD modified by LBNL

**Download:** Project page

**Documentation:** {{{documentation}}}

__Links to other ESL entries__

__Links to other ESL entries__

__Links to other ESL entries__

__Links to other ESL entries__

__Links to other ESL entries__

__Links to other ESL entries__

**Functionalities:**

**Algorithms:**

- {{{algorithms}}}

**Generic interfaces:**

- {{{generic interfaces}}}

**APIs:**

- {{{apis}}}

**Data standards:**

- {{{data standards}}}

**Software:**

The **P**ole **EX**pansion and **S**elected **I**nversion method **PEXSI**
is a fast method for evaluating certain selected elements
of a matrix function and can be used for solving the KS-equation without explicit diagonalization.
PEXSI is particularly advantageous for large systems since it reduces the computational effort to

- for 3D systems
- for (quasi-) 2D systems
- for (quasi-) 1D systems

Furthermore it is highly scalable on distributed memory parallel machines.

The **Selected Inversion** by itself can be used for calculating specific elements of the inverse of a sparse matrix.

Most of the following documentation originates from the PEXSI-homepage.

## Contents

## Theory

### Basics

Given a sparse square matrix and a certain function , the basic idea of PEXSI is to expand using a small number of rational functions (pole expansion)

and to efficiently evaluate by evaluating selected elements (selected inversion).

The currently supported form of includes:

- : Matrix inversion. Since the matrix inversion is already represented as a single term of rational function (pole), no pole expansion is needed. The selected inversion method can be significantly faster than directly inverting the matrix and then extract the selected elements of the inverse.

- : Fermi-Dirac function. This can be used as a "smeared" matrix sign function at , without assuming a spectral gap near . This can be used for evaluating the electron density for electronic structure calculation.

For sparse matrices, the PEXSI method can be more efficient than the widely used diagonalization method for evaluating matrix functions, especially when a relatively large number of eigenpairs are needed to be computed in the diagonalization method.

### Application to electronic structure calculations

PEXSI can be used to solve the generalized eigenvalue problem that has to be solved in each SCF-iteration of the KS-DFT formalism without explicit diagonalization. Instead the density matrix is computed directly. Other magnitudes (free energy, atomic forces, local density of states) are given by the same formalism with negligible effort. The PEXSI framework can also be used for calculating the total density of states very efficiently.

With the pole expansion of the Fermi function

the density matrix (in case of a non-orthogonal basis) becomes

The chemical potential has to fulfill

with the total number of electrons and the electron density in real space.

For finding PEXSI implements a hybrid scheme, combining a bisection and a Newton method. This allows determining in a few iterations, even when starting with a wide guess and for difficult cases. Systems with a gap are naturally even easier to solve. If there is a good guess for available, for example from a previous SCF iteration, the bisection can be turned off.

The PEXSI library provides two interfaces. A simple one, finding many of the parameters automatically, and another one offering extensive control of the behavior of the solver.

An extensive description of the implementation of PEXSI in the numerical atomic orbital code SIESTA can be found in ^{[1]}, and more usage information under Further_information.

### Parallelization

Besides the favorable weak scaling, PEXSI can also scale to large numbers of processors due to its two-level parallelism:

- For accurate evaluation, the pole expansion usually takes 40-60 poles. The calculations on each of these poles are completely independent of each other, enabling an efficient parallelization on this level.
- The calculations on each pole can also be parallelized, with the number of processors that can be used efficiently strongly depending on the problem size, the matrix sparsity, and the sparsity pattern. Typically tens up to thousands of processors per pole are used. An extensive description of the selected inversion and its parallelization can be found in
^{[2]}

Large examples need a minimum amount of processors to have enough memory available. Typically this number is much smaller than with dense direct methods like ScaLAPACK.

## Publications

**Algorithms:**

- PEXSI
^{[3]} - (Parallel) Selected Inversion
^{[2]}^{[4]}^{[5]}^{[6]} - Pole Expansion
^{[7]}

**Implementations in electronic structure codes:**

- Implementation in SIESTA
^{[1]}

**Applications to physical systems:**

- 1D, 2D, and 3D model systems in
^{[1]} - Graphene Nanoflakes
^{[8]}

## License

PEXSI is distributed under BSD license (modified by Lawrence Berkeley National Laboratory).

## Download

A source tarball can be downloaded directly from here.

## Installation

### Prerequisites

- LAPACK for basic linear algebra operations
- MPI
- SuperLU-Dist (download)
- ParMetis (download) or PT-Scotch (download)

PEXSI requires an external parallel factorization or factorization routine, and an external parallel matrix reordering routine to reduce the fill-in of the factorization routine.

Currently we use SuperLU_DIST for the parallel factorization, and ParMETIS for the parallel fill-in reducing reordering. It is also possible to use PT-Scotch for the reordering. But we recommend to first download ParMETIS.

### Instructions

**Edit make.inc**

Configuration of PEXSI is controlled by a single `make.inc` file. Examples of the `make.inc` file are given under the `config/` directory.

Find `make.inc` with the most similar architecture, and copy to the main PEXSI directory (using Edison for example, the latest Intel computer at NERSC). `${PEXSI_DIR}` stands for the main directory of PEXSI.

cd ${PEXSI_DIR} cp config/make.inc.edison.intel make.inc

Edit the variables in `make.inc`

PEXSI_DIR = Main directory for PEXSI DSUPERLU_DIR = Main directory for SuperLU_DIST METIS_DIR = Main directory for METIS PARMETIS_DIR = Main directory for ParMETIS PTSCOTCH_DIR = Main directory for PT-Scotch

Note:

PEXSI can be compiled using debug or release mode in by the variable `COMPILE_MODE` in `make.inc`. This variable mainly controls the compiling flag `-DRELEASE`. The debug mode introduces tracing of call stacks at all levels of functions, and may significantly slow down the code. For production runs, use release mode.
The `*.profile` configuration files are for debugging purpose and can be ignored.

**Build the PEXSI library**

If `make.inc` is configured correctly,

cd ${PEXSI_DIR} cd src make

should produce `libpexsi_(suffix).a` under `src/`.

### Tests

**Build examples**

After `libpexsi_(suffix).a` is built, all driver routines are readily to be compiled. For example, the selected inversion for a complex matrix has the test routine

cd ${PEXSI_DIR} cd examples make driver_pselinv_complex

should produce `driver_pselinv_complex`, which can be executed with MPI.

For more information on the examples, see Tutorial.

**Run tests**

After `driver_pselinv_complex_(suffix)` is compiled,

examples$ mpirun -n 1 ./driver_pselinv_complex_(suffix)

should return the diagonal of the matrix

saved on the 0-th processor, where A is the five-point discretization of a Laplacian operator on a 2D domain. The result can be compared with `examples/driver_pselinv_complex.out` to check the correctness of the result.

## Programming interface

The full documentation of the C/C++ and Fortran interfaces can be found on the project's page:

- C/C++ interface
- Fortran interface
- On top of the fine grained interface there is also a simplified driver routine for DFT calculations available. This function contains both the inertia counting step for estimating the chemical potential, and the Newton's iteration for updating the chemical potential. Heuristics are built into this routine. A detailed description can be found on the project page.

## Further information

The **Homepage** contains:

- Introduction
- Download
- Installation
- Tutorial
- Core Functionality
- Frequently asked questions
- Troubleshooting

A version of SIESTA with PEXSI 0.7.3 implemented is available as tarball from the SIESTA page under *Other versions of Siesta* (after selecting the applicable user agreement). The documentation delivered with this version also includes a description of the configuration of the PEXSI solver, which gives a general overview about how to use PEXSI for electronic structure calculations.

For faster access there is an excerpt of the SIESTA-manual.

## Future developments

- Generalization for k-points
- Hybrid version MPI + multithreading, preferably with a task-oriented approach like OMPSS
- Application of Selected Inversion in NEGF based transport code TranSIESTA
- Port to Intel MIC

## References

- ↑
^{1.0}^{1.1}^{1.2}L. Lin, A. Garcia, G. Huhs and C. Yang, SIESTA-PEXSI: Massively parallel method for efficient and accurate ab initio materials simulation without matrix diagonalization, Journal of Physics: Condensed Matter 26, 305503 (2014) journal - ↑
^{2.0}^{2.1}M. Jacquelin, L. Lin and C. Yang, PSelInv -- A Distributed Memory Parallel Algorithm for Selected Inversion : the Symmetric Case arXiv - ↑ L. Lin, M. Chen, C. Yang and L. He, Accelerating atomic orbital-based electronic structure calculation via pole expansion and selected inversion, J. Phys. Condens. Matter 25, 295501, 2013 journal
- ↑ L. Lin, C. Yang, J. Meza, J. Lu, L. Ying and W. E, SelInv -- An algorithm for selected inversion of a sparse symmetric matrix, ACM Trans. Math. Software 37, 40, 2011 journal
- ↑ L. Lin, C. Yang, J. Lu, L. Ying and W. E, A Fast Parallel algorithm for selected inversion of structured sparse matrices with application to 2D electronic structure calculations, SIAM J. Sci. Comput. 33, 1329, 2011 journal
- ↑ L. Lin, J. Lu, L. Ying, R. Car and W. E, Fast algorithm for extracting the diagonal of the inverse matrix with application to the electronic structure analysis of metallic systems, Commun. Math. Sci. 7, 755, 2009 journal
- ↑ L. Lin, J. Lu, L. Ying and W. E, Pole-based approximation of the Fermi-Dirac function, Chin. Ann. Math. 30B, 729, 2009 journal
- ↑ W. Hu, L. Lin, C. Yang, J Yang, Electronic Structure of Large-Scale Graphene Nanoflakes, arXiv