License  Authors  Download 

Simplified BSD  F. Corsetti  Gitlab 
MatrixSwitch is a module which acts as an intermediary interface layer between highlevel routines for physicsrelated algorithms and lowlevel routines dealing with matrix storage and manipulation. This allows the highlevel routines to be written in a way which is physically transparent, and enables them to switch seamlessly between different software implementations of the matrix operations.
Many computational physics algorithms (e.g., iterative KohnSham eigensolvers) are based on sequences of matrix operations. These are typically described using standard mathematical notation, which does not depend on the specifics of the computational implementation, i.e., how the matrices are stored and manipulated in the code. Many different storage formats exist, depending also on the architecture (serial/parallel) and the type of matrix (dense/sparse), as well as many libraries that can perform matrix operations for particular storage formats. Libraries can be more or less transparent in the way the matrices are handled: some hide the details of the storage scheme in a derived type, while others require auxiliary data to be carried around by the user. Generally, the matrix operations themselves are contained within subroutines that are simple to call. However, the interface is specific to each library.
The aim of MatrixSwitch is to provide a simple, unified interface to allow users to code physicsrelated algorithms with a minimal amount of knowledge of the underlying implementation of the matrix algebra, and, crucially, to be able to switch between different implementations without modifying their code. Therefore, if a new matrix algebra library is released which is particularly suited to a new architecture, this simply has to be interfaced within MatrixSwitch to start being used.
The emphasis for this project is on implementing physically relevant operations in as simple a way as possible. Therefore, the focus will be on the core set of functionalities typically needed for physics (particularly electronic structure), and on streamlining the interface to make programs easy to read, understand, and code in terms of the mathematical formulation of the algorithm.
The basic routines can be installed with only a Fortran compiler. This
will allow you to use the s?den
format and ref
operations.
Optional requirements are:
lap
operations with the s?den
formatp?dbc
formatpdcsr
formatsrc
directory.make.inc.example
to make.inc
and modify it to suit your
needs. Available options for FPPFLAGS
are:
DHAVE_MPI
: enable MPI parallel routinesDHAVE_LAPACK
: enable LAPACK routinesDHAVE_SCALAPACK
: enable ScaLAPACK routines (requires
DHAVE_MPI
)DHAVE_PSPBLAS
: enable PSPBLAS routinesDHAVE_DBCSR
: enable DBCSR
routines (requires DHAVE_MPI
)DCONV
: enable automatic conversion of scalar types
(real/complex) to agree with matrix definitions (real/complex).
Note that conversions from complex to real will simply discard
the imaginary part.make
.The examples
directory contains a number of small programs that make
use of MatrixSwitch. These can be useful both for testing the
installation and for learning how to use the library. To compile them:
examples
directory.make.inc.example
to make.inc
and modify it to suit your
needs. Be aware that make.inc
in the src
directory will also be
used.make
.Each example contains a header explaining what the program does and providing sample output to compare against.
MatrixSwitch
is a module that you can use
in Fortran routines. Note
that both the .a
and .mod
files need to be available. An example
compilation command for a code using MatrixSwitch is:
gfortran MyCode.f90 /path/to/MatrixSwitchx.y.z/src/MatrixSwitch.a I/path/to/MatrixSwitchx.y.z/src/ llapack lblas
The best way of learning how to use MatrixSwitch is by example. See the
examples in the examples
directory for this. In a typical code, there
are four steps that are followed:
matrix
. There are then two roots to initialising a matrix. The
easiest is to do so from scratch, by calling m_allocate
. However,
if the matrix data already exists (e.g., if it comes from a
different section of the code) and is in the correct format, it can
simply be registered into the TYPE(MATRIX) variable, by calling the
appropriate subroutine; for example, twodimensional arrays can be
registered as s?den
matrices by calling m_register_sden
. In this
case, the data is not copied; rather, elements of the TYPE(MATRIX)
variable are set to point to the existing array(s). Note that some
storage formats may require additional setup operations (detailed
below).m_set
and
m_set_element
.m_deallocate
.The storage formats that can currently be used with MatrixSwitch are
listed below. A ?
in a format name stands for either d
(real matrix)
or z
(complex matrix).
s?den
: simple dense (serial distribution)This is the most basic type of storage: a twodimensional array storing
the matrix elements on a single core. It can be used to perform
operations with ref
or lap
.
Requirements:
Storage details within type matrix
:
dval
/zval
, dimension (dim1
,dim2
): stores the matrix elements
(real/complex matrix)p?dbc
: dense block cyclic (parallel distribution)This format follows the standard used by ScaLAPACK for parallel
distribution of a dense matrix (see this
page
for some introduction). This makes it is extremely easy to use
MatrixSwitch in a small portion of a larger code which already uses
ScaLAPACK, as it allows for matrices to be passed in and out of the
MatrixSwitch section (see ms_lap_icontxt
, ms_scalapack_setup
,
m_register_pdbc
).
This format can be used to perform operations with lap
.
Requirements:
ms_scalapack_setup
needs to be called at the start of the
codeStorage details within type matrix
:
iaux1
, dimension (9
): stores the BLACS array descriptoriaux2
, dimension (2
): stores the size of the local portion of
the matrixdval
/zval
, dimension (iaux2(1)
,iaux2(2)
): stores the local
matrix elements (real/complex matrix)s?coo
: sparse coordinate list (serial distribution)Documentation coming soon.
p?coo
: sparse coordinate list (parallel distribution)Documentation coming soon.
s?csc
: compressed sparse column (serial distribution)Documentation coming soon.
p?csc
: compressed sparse column (parallel distribution)Documentation coming soon.
s?csr
: compressed sparse row (serial distribution)Documentation coming soon.
pdcsr
: compressed sparse row (parallel distribution)This format follows the distributed blockcompressed sparse row format
as implemented in the DBCSR library.
The distribution of the blocks over the processors follows a
blockcycling distribution a la ScaLAPACK (see this
page
for some introduction). A 2D grid (MPI cartesian grid) is automatically
created by DBCSR (by means of mpi_dims_create
and mpi_cart_create
functions). Note that blocks are monolithic, i.e. it is impossible to
read/write single elements inside a block.
Requirements:
make install PREFIX=
)ms_dbcsr_setup(global MPI communicator)
needs to be called at the
start of the codems_dbcsr_finalize
needs to be called at the end of the codeA general overview of the different computational implementations of the MatrixSwitch matrix operations is given below. These implementations need not be tied to specific storage formats, and vice versa. See the next section for a more detailed description of which storage formats can be used with which implementations for a particular operation.
ref
: referenceThe reference implementation is coded within MatrixSwitch. It can be
used with s?den
matrices. It is not fast, but is useful for checking
results and does not require any external libraries.
Requirements:
lap
: LAPACK/ScaLAPACKThis implementation makes use of BLAS + LAPACK to operate on s?den
matrices, and additionally ScaLAPACK to operate on p?dbc
matrices. It
should be considerably faster than ref
, but the performance will
depend on the external libraries provided by the user.
Requirements:
psp
: pspBLASDocumentation coming soon.
This section contains a comprehensive list of the allowed combinations
of storage formats and implementations of the matrix operations. There
is a separate table for each matrix operation
subroutine. The table lists
the input and output matrices required by the subroutine. Each row gives
a possible combination of storage formats that can be used when calling
it. The last column then lists the possible implementations of the
operation for the particular combination of storage formats; usually
only one implementation is available, but sometimes more than one is.
The threecharacter code for the implementation should be passed to the
subroutine in the label
variable; if label
is absent, the default
implementation for the storage formats provided will be called.
mm_multiply
A 
B 
C 
label 

s?den 
s?den 
s?den 
ref (default) or lap 
p?dbc 
p?dbc 
p?dbc 
lap (default) 
pdcsr 
pdcsr 
pdcsr 
(ignored) 
m_add
A 
C 
label 

s?den 
s?den 
ref (default) or lap (redirects to ref ) 
p?dbc 
p?dbc 
lap (default) 
pdcsr 
pdcsr 
(ignored) 
m_trace
A 
label 

s?den 
ref (default) or lap (redirects to ref ) 
p?dbc 
lap (default) 
pdcsr 
(ignored) 
mm_trace
A 
B 
label 

sdden 
sdden 
ref (default) or lap 
szden
szden
ref
(default) or lap
(redirects to ref
) 
pddbc
pddbc
lap
(default) 
pzdbc
pzdbc
ref
(default) [1] or lap
(redirects to ref
) 
pdcsr
pdcsr
(ignored) 
[1] Note that identical parallel distributions for A
and B
are
required.
m_scale
C 
label 

s?den 
ref (default) [1] lap (redirects to ref ) 
p?dbc 
lap (default  redirects to [1]) 
pdcsr 
(ignored) 
m_set
C 
label 

s?den ref (default) or lap (redirects to ref ) 

p?dbc lap (default) 
m_set_element
C 
label 

s?den 
ref (default) or lap (redirects to ref ) 
p?dbc 
lap (default) 
pdcsr 
(ignored) 
m_get_element
C 
label 

s?den 
ref (default) or lap (redirects to ref ) 
p?dbc 
lap (default) 
pdcsr 
(ignored) 
Note that some entries are specifically of use for a particular storage format or implementation. This is marked in [red] at the beginning of the description.
ms_lap_icontxt
INTEGER
[p?dbc
] BLACS context handle used by
MatrixSwitch. This is made public to allow allocated and registered
p?dbc
matrices to be placed in the same context. This can be done in
two ways:
ms_scalapack_setup
, which will
then set ms_lap_icontxt
to the same value. Note that in this case
the other variables passed to ms_scalapack_setup
need to be
consistent with the process grid enclosed in the existing context.ms_scalapack_setup
, ms_lap_icontxt
can then be used as the
context handle for BLACS operations outside of MatrixSwitch.type matrix
This is the derived type that encapsulates all matrix storage possibilities and hides the details from the user. Typically, the elements below will never need to be accessed directly.
str_type
CHARACTER*3is_initialized
LOGICALT
: Matrix has been initialized (with m_allocate
or one of the
m_register
routines).F
: Matrix has not been initialized.is_serial
LOGICALT
: Matrix is serial distributed.F
: Matrix is parallel distributed.is_real
LOGICALT
: Matrix is real (DOUBLE PRECISION default).F
: Matrix is complex (COMPLEX*16 default).is_square
LOGICALT
: Matrix is square.F
: Matrix is nonsquare.is_sparse
LOGICALT
: Matrix is sparse.F
: Matrix is dense.iaux1_is_allocated
LOGICALT
: iaux1
is directly allocated.F
: iaux1
is a pointer.iaux2_is_allocated
LOGICALT
: iaux2
is directly allocated.F
: iaux2
is a pointer.iaux3_is_allocated
LOGICALT
: iaux3
is directly allocated.F
: iaux3
is a pointer.iaux4_is_allocated
LOGICALT
: iaux4
is directly allocated.F
: iaux4
is a pointer.dval_is_allocated
LOGICALT
: dval
is directly allocated.F
: dval
is a pointer.zval_is_allocated
LOGICALT
: zval
is directly allocated.F
: zval
is a pointer.dim1
INTEGERdim2
INTEGERiaux1
INTEGER pointer, dimension (:
)iaux2
INTEGER pointer, dimension (:
)iaux3
INTEGER pointer, dimension (:
)iaux4
INTEGER pointer, dimension (:
)dval
DOUBLE PRECISION pointer, dimension (:
,:
)zval
COMPLEX*16 pointer, dimension (:
,:
)spm
TYPE(PSP_MATRIX_SPM)dbcsr_dist
TYPE(DBCSR_DISTRIBUTION_TYPE)dbcsr_mat
TYPE(DBCSR_TYPE)subroutine m_allocate( m_name, i, j, label )
Initializes a TYPE(MATRIX) variable by saving some basic information about the matrix, and allocating the necessary arrays for the requested storage format. Matrix elements are set to zero.
m_name
(input/output) TYPE(MATRIX)i
(input) INTEGERj
(input) INTEGERlabel
(input, optional) CHARACTER*5sdden
.subroutine m_allocate( m_name, row_sizes, col_sizes, label )
Initializes a Blocked TYPE(MATRIX) variable by saving some basic information about the matrix, and allocating the necessary distribution and storage. The matrix is initialized to be empty. The total size of the matrix is extracted by the sum of the row_sizes and col_sizes values for the row and column dimensions, respectively.
m_name
(input/output) TYPE(MATRIX)row_sizes
(input) INTEGER, dimension (:)col_sizes
(input) INTEGER, dimension (:)label
(input, optional) CHARACTER*5pdcsr
.subroutine m_deallocate( m_name )
Deallocates any allocated arrays in a TYPE(MATRIX) variable. For a registered matrix, the pointers are nullified.
m_name
(input/output) TYPE(MATRIX)subroutine m_register_sden( m_name, A )
[s?den
] Registers preexisting
matrix data into a TYPE(MATRIX) variable with s?den
format.
m_name
(input/output) TYPE(MATRIX)A
(input) DOUBLE PRECISION/COMPLEX*16 array, dimension (:
,:
)subroutine ms_scalapack_setup( mpi_comm, nprow, order, bs_def, bs_list, icontxt )
[p?dbc
] Sets up everything needed to
use p?dbc
matrices with ScaLAPACK. Has to be called once at the start
of the code.
mpi_comm
(input) INTEGERnprow
(input) INTEGERmpi_comm
).order
(input) CHARACTER*1c
/C
: columnmajor orderingr
/R
/other: rowmajor orderingbs_def
(input) INTEGERp?dbc
matrices.bs_list
(input, optional) INTEGER array, dimension (:
)bs_def
to use for specific matrix dimension
sizes. Has to be formatted as (dim_1
,bs_1
,dim_2
,bs_2
,etc.),
where dim_x
is the matrix dimension size, and bs_x
is the
corresponding block size to use for it.icontxt
(input, optional) INTEGERms_lap_icontxt
).subroutine m_register_pdbc( m_name, A, desc )
[p?dbc
] Registers preexisting
matrix data into a TYPE(MATRIX) variable with p?dbc
format.
m_name
(input/output) TYPE(MATRIX)A
(input) DOUBLE PRECISION/COMPLEX*16 array, dimension (:
,:
)desc
(input) INTEGER array, dimension (9
)subroutine ms_dbcsr_setup( mpi_comm )
[pdcsr
] Sets up everything needed to
use pdcsr
matrices with DBCSR. Has to be called once at the start of
the code.
mpi_comm
(input) INTEGERsubroutine ms_dbcsr_finalize( )
[pdcsr
] Finalize DBCSR for the
pdcsr
matrices. Has to be called once at the end of the code.
subroutine mm_multiply( A, opA, B, opB, C, alpha, beta, label )
Performs the operation:
$\mathbf{C} \leftarrow \alpha \tilde{\mathbf{A}} \tilde{\mathbf{B}} + \beta \mathbf{C}$, where $\tilde{\mathbf{M}} = \begin{cases} \mathbf{M} \\ \mathbf{M}^\mathrm{T} \\ \mathbf{M}^\mathrm{H} \end{cases}$
A
(input) TYPE(MATRIX)opA
(input) CHARACTER*1n
/N
: $\mathbf{A}$t
/T
: $\mathbf{A}^\mathrm{T}$c
/C
: $\mathbf{A}^\mathrm{H}$ (equivalent to
$\mathbf{A}^\mathrm{T}$ for a real matrix)B
(input) TYPE(MATRIX)opB
(input) CHARACTER*1n
/N
: $\mathbf{B}$t
/T
: $\mathbf{B}^\mathrm{T}$c
/C
: $\mathbf{B}^\mathrm{H}$ (equivalent to
$\mathbf{B}^\mathrm{T}$ for a real matrix)C
(input/output) TYPE(MATRIX)alpha
(input) DOUBLE PRECISION/COMPLEX*16DCONV
flag, the type has to match the definition of the matrices
(real/complex); otherwise, it only has to match the type of beta
,
and will be automatically converted to match the matrices.beta
(input) DOUBLE PRECISION/COMPLEX*16DCONV
flag, the type has to match the definition of the matrices
(real/complex); otherwise, it only has to match the type of alpha
,
and will be automatically converted to match the matrices.label
(input, optional) CHARACTER*3subroutine m_add ( A, opA, C, alpha, beta, label )
Performs the operation:
$\mathbf{C} \leftarrow \alpha \tilde{\mathbf{A}} + \beta \mathbf{C}$, where $\tilde{\mathbf{M}} = \begin{cases} \mathbf{M} \\ \mathbf{M}^\mathrm{T} \\ \mathbf{M}^\mathrm{H} \end{cases}$
A
(input) TYPE(MATRIX)opA
(input) CHARACTER*1n
/N
: $\mathbf{A}$t
/T
: $\mathbf{A}^\mathrm{T}$c
/C
: $\mathbf{A}^\mathrm{H}$ (equivalent to
$\mathbf{A}^\mathrm{T}$ for a real matrix)C
(input/output) TYPE(MATRIX)alpha
(input) DOUBLE PRECISION/COMPLEX*16DCONV
flag, the type has to match the definition of the matrices
(real/complex); otherwise, it only has to match the type of beta
,
and will be automatically converted to match the matrices.beta
(input) DOUBLE PRECISION/COMPLEX*16DCONV
flag, the type has to match the definition of the matrices
(real/complex); otherwise, it only has to match the type of alpha
,
and will be automatically converted to match the matrices.label
(input, optional) CHARACTER*3subroutine m_trace( A, alpha, label )
Performs the operation:
$\alpha \leftarrow \operatorname{tr} \left ( \mathbf{A} \right )$
A
(input) TYPE(MATRIX)alpha
(output) DOUBLE PRECISION/COMPLEX*16DCONV
flag, the type has to match the definition of the matrix
(real/complex); otherwise, it will be automatically converted to
match it.label
(input, optional) CHARACTER*3subroutine mm_trace( A, B, alpha, label )
Performs the operation:
$\alpha \leftarrow \operatorname{tr} \left ( \mathbf{A}^\mathrm{H} \mathbf{B} \right ) \equiv \operatorname{tr} \left ( \mathbf{B} \mathbf{A}^\mathrm{H} \right )$
A
(input) TYPE(MATRIX)B
(input) TYPE(MATRIX)alpha
(output) DOUBLE PRECISION/COMPLEX*16DCONV
flag, the type has to match the definition of the matrices
(real/complex); otherwise, it will be automatically converted to
match them.label
(input, optional) CHARACTER*3subroutine m_scale ( C, beta, label )
Performs the operation:
$\mathbf{C} \leftarrow \beta \mathbf{C}$
C
(input/output) TYPE(MATRIX)beta
(input) DOUBLE PRECISION/COMPLEX*16DCONV
flag, the type has to match the definition of the matrix
(real/complex); otherwise, it will be automatically converted to
match it.label
(input, optional) CHARACTER*3subroutine m_set( C, seC, alpha, beta, label )
Performs the operation: $\left [ \mathbf{C} \right ]_{i,j} \leftarrow \begin{cases} \alpha, & i \ne j \\ \beta, & i = j \end{cases}$ for either all matrix elements, or only the lower/upper triangle (generalised to elements below/above the diagonal for rectangular matrices)
C
(input/output) TYPE(MATRIX)seC
(input) CHARACTER*1l
/L
: lower triangleu
/U
: upper trianglealpha
(input) DOUBLE PRECISION/COMPLEX*16DCONV
flag, the type has to match the definition of the matrix
(real/complex); otherwise, it only has to match the type of beta
,
and will be automatically converted to match the matrix.beta
(input) DOUBLE PRECISION/COMPLEX*16DCONV
flag, the type has to match the definition of the matrix
(real/complex); otherwise, it only has to match the type of alpha
,
and will be automatically converted to match the matrix.label
(input, optional) CHARACTER*3subroutine m_set_element( C, i, j, alpha, beta, label )
Performs the operation:
$\left [ \mathbf{C} \right ]_{i,j} \leftarrow \alpha + \beta \left [ \mathbf{C} \right ]_{i,j}$
C
(input/output) TYPE(MATRIX)i
(input) INTEGERj
(input) INTEGERalpha
(input) DOUBLE PRECISION/COMPLEX*16DCONV
flag, the type has to match the definition of the matrix
(real/complex); otherwise, it only has to match the type of beta
,
and will be automatically converted to match the matrix.beta
(input) DOUBLE PRECISION/COMPLEX*16DCONV
flag, the type has to match the definition of the matrix
(real/complex); otherwise, it only has to match the type of alpha
,
and will be automatically converted to match the matrix.label
(input, optional) CHARACTER*3subroutine m_set_element( C, iblock, jblock, block_data, beta )
Performs the operation:
$\left [ \mathbf{C} \right ]_{iblock,jblock} \leftarrow \mathbf{block\_data} + \beta \left [ \mathbf{C} \right ]_{iblock,jblock}$
It is only available for the PDCSR format.
C
(input/output) TYPE(MATRIX)i
(input) INTEGERj
(input) INTEGERblock_data
(input) DOUBLE PRECISION, dimension(:, :)beta
(input) DOUBLE PRECISIONsubroutine m_get_element( C, i, j, alpha, label )
Performs the operation:
$\alpha \leftarrow \left [ \mathbf{C} \right ]_{i,j}$
C
(input) TYPE(MATRIX)i
(input) INTEGERj
(input) INTEGERalpha
(output) DOUBLE PRECISION/COMPLEX*16DCONV
flag, the type has to match the definition of the matrix
(real/complex); otherwise, it will be automatically converted to
match it.label
(input, optional) CHARACTER*3subroutine m_get_element( C, iblock, jblock, block_data, found )
First, it checks if the block exists. If so, performs the operation:
$\mathbf{block\_data} \leftarrow \left [ \mathbf{C} \right ]_{iblock,jblock}$
and return found = .True.
, otherwise it only returns
found = .False.
.
It is only available for the PDCSR format.
C
(input) TYPE(MATRIX)iblock
(input) INTEGERjblock
(input) INTEGERblock_data
(output) DOUBLE PRECISION, dimension(:, :)found
LOGICAL.True.
if the block was found and the values are
retrieved, otherwise it is .False.
.