# MatrixSwitch

Simplified BSD F. Corsetti Gitlab

MatrixSwitch is a module which acts as an intermediary interface layer between high-level routines for physics-related algorithms and low-level routines dealing with matrix storage and manipulation. This allows the high-level routines to be written in a way which is physically transparent, and enables them to switch seamlessly between different software implementations of the matrix operations.

## Introduction

Many computational physics algorithms (e.g., iterative Kohn-Sham eigensolvers) are based on sequences of matrix operations. These are typically described using standard mathematical notation, which does not depend on the specifics of the computational implementation, i.e., how the matrices are stored and manipulated in the code. Many different storage formats exist, depending also on the architecture (serial/parallel) and the type of matrix (dense/sparse), as well as many libraries that can perform matrix operations for particular storage formats. Libraries can be more or less transparent in the way the matrices are handled: some hide the details of the storage scheme in a derived type, while others require auxiliary data to be carried around by the user. Generally, the matrix operations themselves are contained within subroutines that are simple to call. However, the interface is specific to each library.

The aim of MatrixSwitch is to provide a simple, unified interface to allow users to code physics-related algorithms with a minimal amount of knowledge of the underlying implementation of the matrix algebra, and, crucially, to be able to switch between different implementations without modifying their code. Therefore, if a new matrix algebra library is released which is particularly suited to a new architecture, this simply has to be interfaced within MatrixSwitch to start being used.

The emphasis for this project is on implementing physically relevant operations in as simple a way as possible. Therefore, the focus will be on the core set of functionalities typically needed for physics (particularly electronic structure), and on streamlining the interface to make programs easy to read, understand, and code in terms of the mathematical formulation of the algorithm.

## Installation

### Prerequisites

The basic routines can be installed with only a Fortran compiler. This will allow you to use the s?den format and ref operations.

Optional requirements are:

• BLAS + LAPACK for lap operations with the s?den format
• MPI + BLAS + LAPACK + ScaLAPACK for the p?dbc format
• MPI + BLAS + LAPACK + DBCSR for the pdcsr format

### Instructions

1. Enter the src directory.
2. Copy make.inc.example to make.inc and modify it to suit your needs. Available options for FPPFLAGS are:
• -DHAVE_MPI: enable MPI parallel routines
• -DHAVE_LAPACK: enable LAPACK routines
• -DHAVE_SCALAPACK: enable ScaLAPACK routines (requires -DHAVE_MPI)
• -DHAVE_PSPBLAS: enable PSPBLAS routines
• -DHAVE_DBCSR: enable DBCSR routines (requires -DHAVE_MPI)
• -DCONV: enable automatic conversion of scalar types (real/complex) to agree with matrix definitions (real/complex). Note that conversions from complex to real will simply discard the imaginary part.
3. Type make.

### Tests

The examples directory contains a number of small programs that make use of MatrixSwitch. These can be useful both for testing the installation and for learning how to use the library. To compile them:

1. Enter the examples directory.
2. Copy make.inc.example to make.inc and modify it to suit your needs. Be aware that make.inc in the src directory will also be used.
3. Type make.

Each example contains a header explaining what the program does and providing sample output to compare against.

## Usage

MatrixSwitch is a module that you can use in Fortran routines. Note that both the .a and .mod files need to be available. An example compilation command for a code using MatrixSwitch is: gfortran MyCode.f90 /path/to/MatrixSwitch-x.y.z/src/MatrixSwitch.a -I/path/to/MatrixSwitch-x.y.z/src/ -llapack -lblas

## Documentation

The best way of learning how to use MatrixSwitch is by example. See the examples in the examples directory for this. In a typical code, there are four steps that are followed:

1. Setup the matrices:
Matrices need to first be declared with the MatrixSwitch public type matrix. There are then two roots to initialising a matrix. The easiest is to do so from scratch, by calling m_allocate. However, if the matrix data already exists (e.g., if it comes from a different section of the code) and is in the correct format, it can simply be registered into the TYPE(MATRIX) variable, by calling the appropriate subroutine; for example, two-dimensional arrays can be registered as s?den matrices by calling m_register_sden. In this case, the data is not copied; rather, elements of the TYPE(MATRIX) variable are set to point to the existing array(s). Note that some storage formats may require additional setup operations (detailed below).
2. Fill the matrices:
Matrix element values can be set by calling m_set and m_set_element.
3. Perform some matrix operations:
See the list of available matrix operations.
4. Destroy the matrices:
Matrices can be deallocated by calling m_deallocate.

### Storage formats

The storage formats that can currently be used with MatrixSwitch are listed below. A ? in a format name stands for either d (real matrix) or z (complex matrix).

#### s?den: simple dense (serial distribution)

This is the most basic type of storage: a two-dimensional array storing the matrix elements on a single core. It can be used to perform operations with ref or lap.

Requirements:

• External libraries: none
• Usage: no special routines need to be called to use this format

Storage details within type matrix:

• dval/zval, dimension (dim1,dim2): stores the matrix elements (real/complex matrix)

#### p?dbc: dense block cyclic (parallel distribution)

This format follows the standard used by ScaLAPACK for parallel distribution of a dense matrix (see this page for some introduction). This makes it is extremely easy to use MatrixSwitch in a small portion of a larger code which already uses ScaLAPACK, as it allows for matrices to be passed in and out of the MatrixSwitch section (see ms_lap_icontxt, ms_scalapack_setup, m_register_pdbc).

This format can be used to perform operations with lap.

Requirements:

• External libraries: MPI + BLAS + LAPACK + ScaLAPACK
• Usage: ms_scalapack_setup needs to be called at the start of the code

Storage details within type matrix:

• iaux1, dimension (9): stores the BLACS array descriptor
• iaux2, dimension (2): stores the size of the local portion of the matrix
• dval/zval, dimension (iaux2(1),iaux2(2)): stores the local matrix elements (real/complex matrix)

#### s?coo: sparse coordinate list (serial distribution)

Documentation coming soon.

#### p?coo: sparse coordinate list (parallel distribution)

Documentation coming soon.

#### s?csc: compressed sparse column (serial distribution)

Documentation coming soon.

#### p?csc: compressed sparse column (parallel distribution)

Documentation coming soon.

#### s?csr: compressed sparse row (serial distribution)

Documentation coming soon.

#### pdcsr: compressed sparse row (parallel distribution)

This format follows the distributed block-compressed sparse row format as implemented in the DBCSR library. The distribution of the blocks over the processors follows a block-cycling distribution a la ScaLAPACK (see this page for some introduction). A 2D grid (MPI cartesian grid) is automatically created by DBCSR (by means of mpi_dims_create and mpi_cart_create functions). Note that blocks are monolithic, i.e. it is impossible to read/write single elements inside a block.

Requirements:

• External libraries: MPI + BLAS + LAPACK + DBCSR. Download and install DBCSR somewhere (use make install PREFIX=)
• ms_dbcsr_setup(global MPI communicator) needs to be called at the start of the code
• Define the number of blocks per rows and columns
• Define two Integer arrays for the definition of the block sizes per row and columns
• ms_dbcsr_finalize needs to be called at the end of the code

### Implementations of the matrix operations

A general overview of the different computational implementations of the MatrixSwitch matrix operations is given below. These implementations need not be tied to specific storage formats, and vice versa. See the next section for a more detailed description of which storage formats can be used with which implementations for a particular operation.

#### ref: reference

The reference implementation is coded within MatrixSwitch. It can be used with s?den matrices. It is not fast, but is useful for checking results and does not require any external libraries.

Requirements:

• External libraries: none

#### lap: LAPACK/ScaLAPACK

This implementation makes use of BLAS + LAPACK to operate on s?den matrices, and additionally ScaLAPACK to operate on p?dbc matrices. It should be considerably faster than ref, but the performance will depend on the external libraries provided by the user.

Requirements:

• External libraries:
• Serial: BLAS + LAPACK
• Parallel: MPI + BLAS + LAPACK + ScaLAPACK

#### psp: pspBLAS

Documentation coming soon.

### Operation tables

This section contains a comprehensive list of the allowed combinations of storage formats and implementations of the matrix operations. There is a separate table for each matrix operation subroutine. The table lists the input and output matrices required by the subroutine. Each row gives a possible combination of storage formats that can be used when calling it. The last column then lists the possible implementations of the operation for the particular combination of storage formats; usually only one implementation is available, but sometimes more than one is. The three-character code for the implementation should be passed to the subroutine in the label variable; if label is absent, the default implementation for the storage formats provided will be called.

#### mm_multiply

A B C label
s?den s?den s?den ref (default) or lap
p?dbc p?dbc p?dbc lap (default)
pdcsr pdcsr pdcsr (ignored)

#### m_add

A C label
s?den s?den ref (default) or lap (redirects to ref)
p?dbc p?dbc lap (default)
pdcsr pdcsr (ignored)

#### m_trace

A label
s?den ref (default) or lap (redirects to ref)
p?dbc lap (default)
pdcsr (ignored)

#### mm_trace

A B label
sdden sdden ref (default) or lap

|szden |szden |ref (default) or lap (redirects to ref) | |pddbc |pddbc |lap (default) | |pzdbc |pzdbc |ref (default) [1] or lap (redirects to ref) | |pdcsr |pdcsr |(ignored) |

[1] Note that identical parallel distributions for A and B are required.

#### m_scale

C label
s?den ref (default) [1] lap (redirects to ref)
p?dbc lap (default - redirects to [1])
pdcsr (ignored)

#### m_set

C label
s?den ref (default) or lap (redirects to ref)
p?dbc lap (default)

#### m_set_element

C label
s?den ref (default) or lap (redirects to ref)
p?dbc lap (default)
pdcsr (ignored)

#### m_get_element

C label
s?den ref (default) or lap (redirects to ref)
p?dbc lap (default)
pdcsr (ignored)

### Future developments

• Sparse matrix formats: distributed compressed column, block sparse
• Hermitian matrices

## Programming interface

Note that some entries are specifically of use for a particular storage format or implementation. This is marked in [red] at the beginning of the description.

### Public variables

ms_lap_icontxt INTEGER
[p?dbc] BLACS context handle used by MatrixSwitch. This is made public to allow allocated and registered p?dbc matrices to be placed in the same context. This can be done in two ways:

• If BLACS has already been initialised, the existing context handle can be passed to MatrixSwitch via ms_scalapack_setup, which will then set ms_lap_icontxt to the same value. Note that in this case the other variables passed to ms_scalapack_setup need to be consistent with the process grid enclosed in the existing context.
• If BLACS is first initialised through MatrixSwitch with ms_scalapack_setup, ms_lap_icontxt can then be used as the context handle for BLACS operations outside of MatrixSwitch.

### Public types

type matrix

This is the derived type that encapsulates all matrix storage possibilities and hides the details from the user. Typically, the elements below will never need to be accessed directly.

• str_type CHARACTER*3
Label identifying the storage format.
• is_initialized LOGICAL
T: Matrix has been initialized (with m_allocate or one of the m_register routines).
F: Matrix has not been initialized.
• is_serial LOGICAL
T: Matrix is serial distributed.
F: Matrix is parallel distributed.
• is_real LOGICAL
T: Matrix is real (DOUBLE PRECISION default).
F: Matrix is complex (COMPLEX*16 default).
• is_square LOGICAL
T: Matrix is square.
F: Matrix is non-square.
• is_sparse LOGICAL
T: Matrix is sparse.
F: Matrix is dense.
• iaux1_is_allocated LOGICAL
T: iaux1 is directly allocated.
F: iaux1 is a pointer.
• iaux2_is_allocated LOGICAL
T: iaux2 is directly allocated.
F: iaux2 is a pointer.
• iaux3_is_allocated LOGICAL
T: iaux3 is directly allocated.
F: iaux3 is a pointer.
• iaux4_is_allocated LOGICAL
T: iaux4 is directly allocated.
F: iaux4 is a pointer.
• dval_is_allocated LOGICAL
T: dval is directly allocated.
F: dval is a pointer.
• zval_is_allocated LOGICAL
T: zval is directly allocated.
F: zval is a pointer.
• dim1 INTEGER
Row dimension size of the matrix.
• dim2 INTEGER
Column dimension size of the matrix.
• iaux1 INTEGER pointer, dimension (:)
Auxiliary information for certain storage formats.
• iaux2 INTEGER pointer, dimension (:)
Auxiliary information for certain storage formats.
• iaux3 INTEGER pointer, dimension (:)
Auxiliary information for certain storage formats.
• iaux4 INTEGER pointer, dimension (:)
Auxiliary information for certain storage formats.
• dval DOUBLE PRECISION pointer, dimension (:,:)
Matrix elements for a real matrix.
• zval COMPLEX*16 pointer, dimension (:,:)
Matrix elements for a complex matrix.
• spm TYPE(PSP_MATRIX_SPM)
pspBLAS matrix type.
• dbcsr_dist TYPE(DBCSR_DISTRIBUTION_TYPE)
DBCSR distribution.
• dbcsr_mat TYPE(DBCSR_TYPE)
DBCSR matrix.

### Public subroutines

#### Matrix setup/creation/destruction

subroutine m_allocate( m_name, i, j, label )

Initializes a TYPE(MATRIX) variable by saving some basic information about the matrix, and allocating the necessary arrays for the requested storage format. Matrix elements are set to zero.

• m_name (input/output) TYPE(MATRIX)
The matrix to be allocated.
• i (input) INTEGER
Row dimension size of the matrix.
• j (input) INTEGER
Column dimension size of the matrix.
• label (input, optional) CHARACTER*5
Storage format to use. See the list of available formats. Default is sdden.

subroutine m_allocate( m_name, row_sizes, col_sizes, label )

Initializes a Blocked TYPE(MATRIX) variable by saving some basic information about the matrix, and allocating the necessary distribution and storage. The matrix is initialized to be empty. The total size of the matrix is extracted by the sum of the row_sizes and col_sizes values for the row and column dimensions, respectively.

• m_name (input/output) TYPE(MATRIX)
The matrix to be allocated.
• row_sizes (input) INTEGER, dimension (:)
Row block sizes.
• col_sizes (input) INTEGER, dimension (:)
Column block sizes.
• label (input, optional) CHARACTER*5
Storage format to use. See the list of available formats. Default is pdcsr.

subroutine m_deallocate( m_name )

Deallocates any allocated arrays in a TYPE(MATRIX) variable. For a registered matrix, the pointers are nullified.

• m_name (input/output) TYPE(MATRIX)
The matrix to be deallocated.

subroutine m_register_sden( m_name, A )

[s?den] Registers pre-existing matrix data into a TYPE(MATRIX) variable with s?den format.

• m_name (input/output) TYPE(MATRIX)
The matrix to be allocated.
• A (input) DOUBLE PRECISION/COMPLEX*16 array, dimension (:,:)
The values of the matrix elements, stored as a two-dimensional array.

subroutine ms_scalapack_setup( mpi_comm, nprow, order, bs_def, bs_list, icontxt )

[p?dbc] Sets up everything needed to use p?dbc matrices with ScaLAPACK. Has to be called once at the start of the code.

• mpi_comm (input) INTEGER
MPI communicator to use.
• nprow (input) INTEGER
Row dimension of the process grid (has to be a divisor of the size of the group defined by mpi_comm).
• order (input) CHARACTER*1
Ordering of the process grid:
c/C: column-major ordering
r/R/other: row-major ordering
• bs_def (input) INTEGER
Default block size to use when allocating p?dbc matrices.
• bs_list (input, optional) INTEGER array, dimension (:)
List of exceptions to bs_def to use for specific matrix dimension sizes. Has to be formatted as (dim_1,bs_1,dim_2,bs_2,etc.), where dim_x is the matrix dimension size, and bs_x is the corresponding block size to use for it.
• icontxt (input, optional) INTEGER
BLACS context handle, if already initialized (see ms_lap_icontxt).

subroutine m_register_pdbc( m_name, A, desc )

[p?dbc] Registers pre-existing matrix data into a TYPE(MATRIX) variable with p?dbc format.

• m_name (input/output) TYPE(MATRIX)
The matrix to be allocated.
• A (input) DOUBLE PRECISION/COMPLEX*16 array, dimension (:,:)
The values of the local matrix elements, stored as a two-dimensional array.
• desc (input) INTEGER array, dimension (9)
BLACS array descriptor.

subroutine ms_dbcsr_setup( mpi_comm )

[pdcsr] Sets up everything needed to use pdcsr matrices with DBCSR. Has to be called once at the start of the code.

• mpi_comm (input) INTEGER
MPI communicator to use.

subroutine ms_dbcsr_finalize( )

[pdcsr] Finalize DBCSR for the pdcsr matrices. Has to be called once at the end of the code.

#### Matrix operations

subroutine mm_multiply( A, opA, B, opB, C, alpha, beta, label )

Performs the operation:

$\mathbf{C} \leftarrow \alpha \tilde{\mathbf{A}} \tilde{\mathbf{B}} + \beta \mathbf{C}$, where $\tilde{\mathbf{M}} = \begin{cases} \mathbf{M} \\ \mathbf{M}^\mathrm{T} \\ \mathbf{M}^\mathrm{H} \end{cases}$

• A (input) TYPE(MATRIX)
Matrix $\mathbf{A}$. Note that the definition of the matrix (real/complex) needs to be the same as for the other matrices.
• opA (input) CHARACTER*1
Form of $\tilde{\mathbf{A}}$:
n/N: $\mathbf{A}$
t/T: $\mathbf{A}^\mathrm{T}$
c/C: $\mathbf{A}^\mathrm{H}$ (equivalent to $\mathbf{A}^\mathrm{T}$ for a real matrix)
• B (input) TYPE(MATRIX)
Matrix $\mathbf{B}$. Note that the definition of the matrix (real/complex) needs to be the same as for the other matrices.
• opB (input) CHARACTER*1
Form of $\tilde{\mathbf{B}}$:
n/N: $\mathbf{B}$
t/T: $\mathbf{B}^\mathrm{T}$
c/C: $\mathbf{B}^\mathrm{H}$ (equivalent to $\mathbf{B}^\mathrm{T}$ for a real matrix)
• C (input/output) TYPE(MATRIX)
Matrix $\mathbf{C}$. Note that the definition of the matrix (real/complex) needs to be the same as for the other matrices.
• alpha (input) DOUBLE PRECISION/COMPLEX*16
Scalar $\alpha$. If the library is compiler without the -DCONV flag, the type has to match the definition of the matrices (real/complex); otherwise, it only has to match the type of beta, and will be automatically converted to match the matrices.
• beta (input) DOUBLE PRECISION/COMPLEX*16
Scalar $\beta$. If the library is compiler without the -DCONV flag, the type has to match the definition of the matrices (real/complex); otherwise, it only has to match the type of alpha, and will be automatically converted to match the matrices.
• label (input, optional) CHARACTER*3
Implementation of the operation to use. See the list of available implementations.

subroutine m_add ( A, opA, C, alpha, beta, label )

Performs the operation:

$\mathbf{C} \leftarrow \alpha \tilde{\mathbf{A}} + \beta \mathbf{C}$, where $\tilde{\mathbf{M}} = \begin{cases} \mathbf{M} \\ \mathbf{M}^\mathrm{T} \\ \mathbf{M}^\mathrm{H} \end{cases}$

• A (input) TYPE(MATRIX)
Matrix $\mathbf{A}$. Note that the definition of the matrix (real/complex) needs to be the same as for the other matrix.
• opA (input) CHARACTER*1
Form of $\tilde{\mathbf{A}}$:
n/N: $\mathbf{A}$
t/T: $\mathbf{A}^\mathrm{T}$
c/C: $\mathbf{A}^\mathrm{H}$ (equivalent to $\mathbf{A}^\mathrm{T}$ for a real matrix)
• C (input/output) TYPE(MATRIX)
Matrix $\mathbf{C}$. Note that the definition of the matrix (real/complex) needs to be the same as for the other matrix.
• alpha (input) DOUBLE PRECISION/COMPLEX*16
Scalar $\alpha$. If the library is compiler without the -DCONV flag, the type has to match the definition of the matrices (real/complex); otherwise, it only has to match the type of beta, and will be automatically converted to match the matrices.
• beta (input) DOUBLE PRECISION/COMPLEX*16
Scalar $\beta$. If the library is compiler without the -DCONV flag, the type has to match the definition of the matrices (real/complex); otherwise, it only has to match the type of alpha, and will be automatically converted to match the matrices.
• label (input, optional) CHARACTER*3
Implementation of the operation to use. See the list of available implementations.

subroutine m_trace( A, alpha, label )

Performs the operation:

$\alpha \leftarrow \operatorname{tr} \left ( \mathbf{A} \right )$

• A (input) TYPE(MATRIX)
Matrix $\mathbf{A}$.
• alpha (output) DOUBLE PRECISION/COMPLEX*16
Scalar $\alpha$. If the library is compiler without the -DCONV flag, the type has to match the definition of the matrix (real/complex); otherwise, it will be automatically converted to match it.
• label (input, optional) CHARACTER*3
Implementation of the operation to use. See the list of available implementations.

subroutine mm_trace( A, B, alpha, label )

Performs the operation:

$\alpha \leftarrow \operatorname{tr} \left ( \mathbf{A}^\mathrm{H} \mathbf{B} \right ) \equiv \operatorname{tr} \left ( \mathbf{B} \mathbf{A}^\mathrm{H} \right )$

• A (input) TYPE(MATRIX)
Matrix $\mathbf{A}$. Note that the definition of the matrix (real/complex) needs to be the same as for the other matrix.
• B (input) TYPE(MATRIX)
Matrix $\mathbf{B}$. Note that the definition of the matrix (real/complex) needs to be the same as for the other matrix.
• alpha (output) DOUBLE PRECISION/COMPLEX*16
Scalar $\alpha$. If the library is compiler without the -DCONV flag, the type has to match the definition of the matrices (real/complex); otherwise, it will be automatically converted to match them.
• label (input, optional) CHARACTER*3
Implementation of the operation to use. See the list of available implementations.

subroutine m_scale ( C, beta, label )

Performs the operation:

$\mathbf{C} \leftarrow \beta \mathbf{C}$

• C (input/output) TYPE(MATRIX)
Matrix $\mathbf{C}$.
• beta (input) DOUBLE PRECISION/COMPLEX*16
Scalar $\beta$. If the library is compiler without the -DCONV flag, the type has to match the definition of the matrix (real/complex); otherwise, it will be automatically converted to match it.
• label (input, optional) CHARACTER*3
Implementation of the operation to use. See the list of available implementations.

subroutine m_set( C, seC, alpha, beta, label )

Performs the operation: $\left [ \mathbf{C} \right ]_{i,j} \leftarrow \begin{cases} \alpha, & i \ne j \\ \beta, & i = j \end{cases}$ for either all matrix elements, or only the lower/upper triangle (generalised to elements below/above the diagonal for rectangular matrices)

• C (input/output) TYPE(MATRIX)
Matrix $\mathbf{C}$.
• seC (input) CHARACTER*1
Form of the operation:
l/L: lower triangle
u/U: upper triangle
other: complete matrix
• alpha (input) DOUBLE PRECISION/COMPLEX*16
Scalar $\alpha$. If the library is compiler without the -DCONV flag, the type has to match the definition of the matrix (real/complex); otherwise, it only has to match the type of beta, and will be automatically converted to match the matrix.
• beta (input) DOUBLE PRECISION/COMPLEX*16
Scalar $\beta$. If the library is compiler without the -DCONV flag, the type has to match the definition of the matrix (real/complex); otherwise, it only has to match the type of alpha, and will be automatically converted to match the matrix.
• label (input, optional) CHARACTER*3
Implementation of the operation to use. See the list of available implementations.

subroutine m_set_element( C, i, j, alpha, beta, label )

Performs the operation:

$\left [ \mathbf{C} \right ]_{i,j} \leftarrow \alpha + \beta \left [ \mathbf{C} \right ]_{i,j}$

• C (input/output) TYPE(MATRIX)
Matrix $\mathbf{C}$.
• i (input) INTEGER
Row index of the element.
• j (input) INTEGER
Column index of the element.
• alpha (input) DOUBLE PRECISION/COMPLEX*16
Scalar $\alpha$. If the library is compiler without the -DCONV flag, the type has to match the definition of the matrix (real/complex); otherwise, it only has to match the type of beta, and will be automatically converted to match the matrix.
• beta (input) DOUBLE PRECISION/COMPLEX*16
Scalar $\beta$. If the library is compiler without the -DCONV flag, the type has to match the definition of the matrix (real/complex); otherwise, it only has to match the type of alpha, and will be automatically converted to match the matrix.
• label (input, optional) CHARACTER*3
Implementation of the operation to use. See the list of available implementations.

subroutine m_set_element( C, iblock, jblock, block_data, beta )

Performs the operation:

$\left [ \mathbf{C} \right ]_{iblock,jblock} \leftarrow \mathbf{block\_data} + \beta \left [ \mathbf{C} \right ]_{iblock,jblock}$

It is only available for the PDCSR format.

• C (input/output) TYPE(MATRIX)
Matrix $\mathbf{C}$.
• i (input) INTEGER
Row index of the block.
• j (input) INTEGER
Column index of the block.
• block_data (input) DOUBLE PRECISION, dimension(:, :)
block values.
• beta (input) DOUBLE PRECISION
Scalar $\beta$.

subroutine m_get_element( C, i, j, alpha, label )

Performs the operation:

$\alpha \leftarrow \left [ \mathbf{C} \right ]_{i,j}$

• C (input) TYPE(MATRIX)
Matrix $\mathbf{C}$.
• i (input) INTEGER
Row index of the element.
• j (input) INTEGER
Column index of the element.
• alpha (output) DOUBLE PRECISION/COMPLEX*16
Scalar $\alpha$. If the library is compiler without the -DCONV flag, the type has to match the definition of the matrix (real/complex); otherwise, it will be automatically converted to match it.
• label (input, optional) CHARACTER*3
Implementation of the operation to use. See the list of available implementations.

subroutine m_get_element( C, iblock, jblock, block_data, found )

First, it checks if the block exists. If so, performs the operation:

$\mathbf{block\_data} \leftarrow \left [ \mathbf{C} \right ]_{iblock,jblock}$

and return found = .True., otherwise it only returns found = .False..

It is only available for the PDCSR format.

• C (input) TYPE(MATRIX)
Matrix $\mathbf{C}$.
• iblock (input) INTEGER
Row block index.
• jblock (input) INTEGER
Column block index.
• block_data (output) DOUBLE PRECISION, dimension(:, :)
If the block doesn't exist in the matrix, the values of block_data are left unchanged.
• found LOGICAL
Returns .True. if the block was found and the values are retrieved, otherwise it is .False..