One of the objectives of the European Theoretical Spectroscopy Facility is to specify file formats for the contents that are relevant to the scientific activity of its constituting nodes. The present document describes detailed NetCDF specifications for selected contents (crystallographic data, density, wavefunctions). It is hoped that these specifications will be implemented in many different software packages, or (at least) will be the basis of even better file format specifications.
This document has the goal of informing the electronic structure community of the agreed ETSF specifications, hereafter referred to as SpecFF_ETSF3, in view of further discussions and implementations. It is expected that the file format specifications present in this document be subject to revision and improvements. The first version of this document, named SpecFFNQ1 (NQ for Nanoquanta, the precursor of the ETSF), was frozen around June 2006. The current version of the file format, and associated information, can be found at http://www.etsf.eu/fileformats.
The document is organized in sections:
The set of data to be included in each type of file has to be considered separately from its representation. Concerning the latter, one encounters simple text files, binary files, XML-structured files, NetCDF files, etc. The ETSF decided to evolve towards formats that deal appropriately with the self-description issue, i.e. XML and NetCDF. The inherent flexibility of these representations also allows specific versions of each type of file to evolve progressively, and earlier working proposals to be refined. The same direction has been adopted by several groups of code developers that we know of.
Information on NetCDF and XML can be obtained from the official Web
sites:
http://www.unidata.ucar.edu/software/netcdf/ and http://www.w3.org/XML/
There are numerous other presentations of these formats on the Web, or in books.
Concerning XML
(A) The XML format is best suited to the structured representation of
relatively small quantities of data, as it is not compressed.
(B) It is a very flexible format.
Concerning NetCDF
(A) Several groups of developers inside the ETSF already have good
experience of using it for the representation of binary data (large
files).
(B) Although there is no clear advantage of NetCDF compared to HDF
(another possibility for large binary files), this experience inside the
ETSF network is the main reason for preferring it.
(C) File size limitations of NetCDF exist, see Appendix A,
but should be limited to old architectures.
Thanks to the flexibility of NetCDF, the content of a NetCDF file
suitable for use by ETSF software might be of four different types:
(1) The actual numerical data (that defines a file for wavefunctions, or
a density file, etc ...), whose NetCDF description would have been
agreed.
(2) The auxiliary data that are mandatory to make proper usage of the
actual numerical data. The NetCDF description of these auxiliary data
should also be agreed.
(3) The auxiliary data that are not mandatory, but whose NetCDF
description has been agreed, in a larger context.
(4) Other data, typically code-dependent, whose existence might help the
use of the file for a specific code. The name of these variables should
be different from the names chosen for agreed variables (1)-(3). Such
other data might even be redundant with (1)-(3).
Such content is compatible with a file format being complete for use by many codes, though adapted for the specific usage by one code. The ETSF file descriptions to be provided later (sections General specifications to Dielectric function) are based on this generic classification of data that can be integrated in such a NetCDF file.
In order to address the 2 GB limit (see Appendix A), as well as the use of NetCDF files for parallel calculations, one file can actually be split into several partial files. Selected variables should describe the differing content of each of them. As an example, in Specification for files containing a density or a potential, a file containing a set of wavefunctions can be split into different files containing selected bands and/or k-points, while being identical in every other respect.
Some technical details concerning the use of NetCDF files apply to all formats specified in the ETSF framework:
Global attributes are used for a general description of the file, mainly the file format convention. Important data is not contained in attributes, but rather in variables.
The length of character attributes is the maximum length this attribute may take. This is relevant for reading, where sufficient space must be provided. In writing, the defined length may be reduced to the real length of the attribute.
This table gathers the specifications for the attributes required in any ETSF NetCDF file.
Attributes | Type (length) | Notes |
---|---|---|
file_format | char (80) | "ETSF" |
file_format_version | real | 1.1 or 1.2 or 2.0 ... |
Conventions | char (80) | "http://www.etsf.eu/fileformats" |
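The required attributes above lend themselves to a simple check when a file is opened. The following is a minimal sketch, assuming the global attributes have already been read into a plain Python dictionary by whatever NetCDF library is in use; the function name `check_global_attributes` is hypothetical and not part of the specification.

```python
# Names of the required global attributes, taken from the table above.
REQUIRED_GLOBAL_ATTRIBUTES = ("file_format", "file_format_version", "Conventions")


def check_global_attributes(attrs):
    """Return a list of problems found in a dict of global attributes.

    `attrs` is assumed to map attribute names to already-decoded values.
    """
    problems = []
    for name in REQUIRED_GLOBAL_ATTRIBUTES:
        if name not in attrs:
            problems.append("missing required attribute: " + name)
    if "file_format" in attrs:
        # Character attributes may be padded up to their maximum length,
        # so padding (blanks or NUL characters) is removed before comparing.
        file_format = str(attrs["file_format"]).strip(" \x00")
        if file_format != "ETSF":
            problems.append("unexpected file_format: " + repr(file_format))
    return problems


example = {
    "file_format": "ETSF",
    "file_format_version": 2.0,
    "Conventions": "http://www.etsf.eu/fileformats",
}
print(check_global_attributes(example))
```

An empty list means that all three required attributes are present and that file_format identifies an ETSF file.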
This table presents optional attributes for ETSF NetCDF files.
Attributes | Type (length) | Notes |
---|---|---|
history | char (1024) | |
title | char (80) | |
A few attributes might apply to a large number of variables. The following table presents the generic attributes that might be mandatory for selected variables in ETSF NetCDF files.
Attributes | Type (length) | Notes |
---|---|---|
units | char (80) | required for variables that carry units |
scale_to_atomic_units | double | required for units other than "atomic units" |
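The two rules in the table above can be checked mechanically. This is a minimal sketch, assuming the attributes of one variable are available as a dictionary; `check_unit_attributes` is a hypothetical helper, and the sketch deliberately does not interpret the numerical value of scale_to_atomic_units.

```python
def check_unit_attributes(attrs):
    """Check the generic unit attributes of one variable.

    `attrs` is assumed to be a dict of the variable's attributes.
    Returns a list of human-readable problems (empty if none).
    """
    problems = []
    units = attrs.get("units")
    if units is None:
        problems.append("missing 'units' attribute")
    elif units.strip() != "atomic units" and "scale_to_atomic_units" not in attrs:
        # Units other than "atomic units" require the scaling attribute.
        problems.append("missing 'scale_to_atomic_units' for units " + repr(units))
    return problems


print(check_unit_attributes({"units": "atomic units"}))
print(check_unit_attributes({"units": "eV"}))
```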
"Flag-like" attributes can take the values "yes" and "no". When such attributes are written, they should be written in full length and small letters. When they are read, only the first character needs to be checked (i.e. "y" or "n" – this simplifies life a lot).
Dimensions are used for one- or multidimensional variables. It is very important to remember that the NetCDF interface adapts the dimension ordering to the programming language used. The notation here is C-like, i.e. the last index varies fastest. In Fortran, the order is reversed. When implementing new reading interfaces, the dimension names can be used to check the dimension ordering. The dimension names help also to identify the meaning of certain dimensions in cases where the number alone is not sufficient.
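The effect of the two index conventions can be illustrated without any NetCDF library. The sketch below is only an illustration: the helper names `fortran_shape` and `c_flat_index` are hypothetical, and the shape used is that of a `[number_of_kpoints][number_of_reduced_dimensions]` array with 10 k-points.

```python
def fortran_shape(c_shape):
    """Return the shape as it is declared in Fortran for a given C-order shape."""
    return tuple(reversed(c_shape))


def c_flat_index(index, shape):
    """Offset of an element in C order, where the last index varies fastest."""
    offset = 0
    for i, n in zip(index, shape):
        if not 0 <= i < n:
            raise IndexError("index out of range")
        offset = offset * n + i
    return offset


c_shape = (10, 3)  # [number_of_kpoints][number_of_reduced_dimensions]
print(fortran_shape(c_shape))         # (3, 10)
print(c_flat_index((2, 1), c_shape))  # 7: third k-point, second reduced coordinate
```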
The variables that specify dimensions in ETSF files are divided into two lists: one for the dimensions that are not supposed to lead to a splitting, and another for the dimensions that might be used to define a splitting (e.g. in case of parallelism).
This table lists the dimensions that are not supposed to lead to a splitting.
Dimensions | Type (index order as in C) | Notes |
---|---|---|
character_string_length | integer | Always ==80 |
real_or_complex_coefficients | integer | Either ==1 or 2 |
real_or_complex_density | integer | Either ==1 or 2 |
real_or_complex_gw_corrections | integer | Either ==1 or 2 |
real_or_complex_potential | integer | Either ==1 or 2 |
real_or_complex_wavefunctions | integer | Either ==1 or 2 |
number_of_cartesian_directions | integer | Always ==3 |
number_of_reduced_dimensions | integer | Always ==3 |
number_of_vectors | integer | Always ==3 |
number_of_symmetry_operations | integer | |
number_of_atoms | integer | |
number_of_atom_species | integer | |
symbol_length | integer | Always ==2 |
This table lists the dimensions that might be used to define a splitting (e.g. in case of parallelism). For the auxiliary variables needed in case of splitting, see Splitting.
Dimensions | Type (index order as in C) | Notes |
---|---|---|
max_number_of_states | integer | |
number_of_kpoints | integer | |
number_of_spins | integer | Either ==1 or 2 |
number_of_spinor_components | integer | Either ==1 or 2 |
number_of_components | integer | Either ==1, 2 or 4 |
max_number_of_coefficients | integer | |
number_of_grid_points_vector1 | integer | |
number_of_grid_points_vector2 | integer | |
number_of_grid_points_vector3 | integer | |
max_number_of_basis_grid_points | integer | For wavelets. Ranges from 1 to number_of_grid_points_vector1 * number_of_grid_points_vector2 * number_of_grid_points_vector3 |
number_of_localization_regions | integer | Always ==1 |
To clarify the interplay between number_of_spins,
number_of_components, and number_of_spinor_components, note the
different following magnetic or non-magnetic cases:
Non-spin-polarized:
number_of_spins=1 , number_of_spinor_components=1,
number_of_components=1
Collinear spin-polarized:
number_of_spins=2, number_of_spinor_components=1,
number_of_components=2
Non-collinear spin-polarized:
number_of_spins=1, number_of_spinor_components=2,
number_of_components=4
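The three cases listed above can be encoded as a small lookup, which a reader might use to reject inconsistent combinations. This is a minimal sketch; the function name `classify_spin_case` is hypothetical.

```python
# (number_of_spins, number_of_spinor_components, number_of_components)
SPIN_CASES = {
    "non-spin-polarized": (1, 1, 1),
    "collinear spin-polarized": (2, 1, 2),
    "non-collinear spin-polarized": (1, 2, 4),
}


def classify_spin_case(number_of_spins, number_of_spinor_components, number_of_components):
    """Return the name of the magnetic case, or raise if the triple is inconsistent."""
    triple = (number_of_spins, number_of_spinor_components, number_of_components)
    for name, expected in SPIN_CASES.items():
        if triple == expected:
            return name
    raise ValueError("inconsistent spin dimensions: " + repr(triple))


print(classify_spin_case(2, 1, 2))  # collinear spin-polarized
```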
We now turn to the specification of the (optional) splitting of files into partial files. Such splitting might be done in many different ways. In order to allow for very general, flexible splittings, but still rely on a simple system, we set up different pairs of variables, one for each possible splitting. These pairs of variables are described in Auxiliary dimensions for splitting and Auxiliary variables for splitting. If a program cannot cope with file splitting, it should simply check that no splitting has been done, and stop otherwise.
Let us work out an example.
Suppose we split the file according to the kpoints. The full set might
have 10 kpoints, of which 3 kpoints (number 1, 2 and 5) might be
contained in a first file, 3 other kpoints (number 3, 6 and 9) might be
contained in a second file, and the 4 remaining kpoints (number 4, 7, 8
and 10) might be contained in the third file.
Then, the first file will contain:
number_of_kpoints = 10 , my_number_of_kpoints = 3 ,
my_kpoints=(1,2,5)
The second file will contain:
number_of_kpoints = 10 , my_number_of_kpoints = 3 ,
my_kpoints=(3,6,9)
The third file will contain:
number_of_kpoints = 10 , my_number_of_kpoints = 4 ,
my_kpoints=(4,7,8,10)
If more than one splitting is done, the file will contain the
intersection of the split data. As an example, suppose we split the file
according to the kpoints and the spins. The full set of kpoints might
have 4 kpoints, and there would be two spins. We perform two splittings,
one separating kpoints 1 and 2 from kpoints 3 and 4, and one separating
the spins.
The first file might contain:
number_of_kpoints = 4 , my_number_of_kpoints = 2 ,
my_kpoints=(1,2)
number_of_spins = 2 , my_number_of_spins = 1 , my_spins=(1)
The second file might contain:
number_of_kpoints = 4 , my_number_of_kpoints = 2 ,
my_kpoints=(3,4)
number_of_spins = 2 , my_number_of_spins = 1 , my_spins=(1)
The third file might contain:
number_of_kpoints = 4 , my_number_of_kpoints = 2 ,
my_kpoints=(1,2)
number_of_spins = 2 , my_number_of_spins = 1 , my_spins=(2)
The fourth file might contain:
number_of_kpoints = 4 , my_number_of_kpoints = 2 ,
my_kpoints=(3,4)
number_of_spins = 2 , my_number_of_spins = 1 , my_spins=(2)
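The "intersection" rule of the four-file example can be made concrete as follows. This is a sketch under one assumption that the text implies but does not state explicitly: the partial files together should cover every (k-point, spin) pair exactly once. The function names are hypothetical.

```python
from itertools import product


def held_pairs(my_kpoints, my_spins):
    """(k-point, spin) pairs contained in one partial file: the intersection
    of the k-point splitting and the spin splitting."""
    return set(product(my_kpoints, my_spins))


def is_exact_cover(files, number_of_kpoints, number_of_spins):
    """True if the partial files hold every (k-point, spin) pair exactly once."""
    full = set(product(range(1, number_of_kpoints + 1), range(1, number_of_spins + 1)))
    seen = set()
    for my_kpoints, my_spins in files:
        pairs = held_pairs(my_kpoints, my_spins)
        if pairs & seen:
            return False  # two files hold the same pair
        seen |= pairs
    return seen == full


# (my_kpoints, my_spins) of the four files in the example above.
files = [((1, 2), (1,)), ((3, 4), (1,)), ((1, 2), (2,)), ((3, 4), (2,))]
print(sorted(held_pairs(*files[0])))  # [(1, 1), (2, 1)]
print(is_exact_cover(files, 4, 2))    # True
```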
Different variables might change their sizes when splitting is used. The list of variables whose size might change compared to non-split files will have to be specified.
Dimensions of variables to specify the (optional) splitting of one file into different partial files. These dimensions and the associated variables (see Auxiliary variables for splitting) are defined in pairs (one integer and one integer array). Any one of these pairs can be used to split the files, and several of these pairs can be used as well. In case several pairs are used, the content of the file is defined by the intersection of the different integer arrays. The detailed description of these dimensions follows from that of the corresponding dimensions in Dimensions that can be split.
Dimensions | Type (index order as in C) | Notes |
---|---|---|
my_max_number_of_states | integer | At least 1, at most max_number_of_states |
my_number_of_kpoints | integer | At least 1, at most number_of_kpoints |
my_number_of_spins | integer | At least 1, at most number_of_spins |
my_number_of_spinor_components | integer | At least 1, at most number_of_spinor_components |
my_number_of_components | integer | At least 1, at most number_of_components |
my_number_of_grid_points_vector1 | integer | At least 1, at most number_of_grid_points_vector1 |
my_number_of_grid_points_vector2 | integer | At least 1, at most number_of_grid_points_vector2 |
my_number_of_grid_points_vector3 | integer | At least 1, at most number_of_grid_points_vector3 |
my_max_number_of_coefficients | integer | At least 1, at most max_number_of_coefficients |
Variables to specify the (optional) splitting of one file into different partial files. See the explanation in Auxiliary dimensions for splitting. The detailed description of these variables follows from that of the corresponding dimensions in Dimensions that can be split.
Variables | Type (index order as in C) | Notes |
---|---|---|
my_states | integer [my_max_number_of_states] | |
my_kpoints | integer [my_number_of_kpoints] | |
my_spins | integer [my_number_of_spins] | |
my_spinor_components | integer [my_number_of_spinor_components] | |
my_components | integer [my_number_of_components] | |
my_grid_points_vector1 | integer [my_number_of_grid_points_vector1] | |
my_grid_points_vector2 | integer [my_number_of_grid_points_vector2] | |
my_grid_points_vector3 | integer [my_number_of_grid_points_vector3] | |
my_coefficients | integer [my_max_number_of_coefficients] | |
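A program that does not support split files needs a way to detect them. The sketch below assumes that a splitting is signalled by the presence of the corresponding `my_...` dimension in the file, which is one plausible reading of the tables above rather than a stated rule; the dimensions are assumed to be available as a name-to-size dictionary, and the function names are hypothetical.

```python
# Each auxiliary dimension paired with the full dimension that bounds it.
SPLIT_DIMENSIONS = {
    "my_max_number_of_states": "max_number_of_states",
    "my_number_of_kpoints": "number_of_kpoints",
    "my_number_of_spins": "number_of_spins",
    "my_number_of_spinor_components": "number_of_spinor_components",
    "my_number_of_components": "number_of_components",
    "my_number_of_grid_points_vector1": "number_of_grid_points_vector1",
    "my_number_of_grid_points_vector2": "number_of_grid_points_vector2",
    "my_number_of_grid_points_vector3": "number_of_grid_points_vector3",
    "my_max_number_of_coefficients": "max_number_of_coefficients",
}


def active_splittings(dimensions):
    """Names of the auxiliary splitting dimensions present in the file."""
    return sorted(name for name in SPLIT_DIMENSIONS if name in dimensions)


def check_split_sizes(dimensions):
    """Verify 1 <= my_dimension <= full dimension for every active splitting."""
    for name in active_splittings(dimensions):
        full = dimensions[SPLIT_DIMENSIONS[name]]
        if not 1 <= dimensions[name] <= full:
            raise ValueError(name + " out of range")


def require_unsplit(dimensions):
    """Stop (by raising) if the file uses any splitting."""
    active = active_splittings(dimensions)
    if active:
        raise NotImplementedError("split files are not supported: " + ", ".join(active))


dims = {"number_of_kpoints": 10, "my_number_of_kpoints": 3}
print(active_splittings(dims))  # ['my_number_of_kpoints']
check_split_sizes(dims)
```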
In order to avoid the “divergence of the formats in the additional data”, we propose names and formats for some information that is likely to be written to the files. This section will grow in future format versions. Please report any variable you miss here, so we can add it to the list. None of these data is mandatory for the file formats to be described later. Some of the proposed variables contain redundant information.
All optional variables must be defined BEFORE the largest size array of the file, otherwise this array will be restricted to 4GB. Examples of such arrays are coefficients_of_wavefunctions or real_space_wavefunctions (see later).
These optional variables are grouped with respect to their physical relevance: atomic information, electronic structure, and reciprocal space.
Variables | Type (index order as in C) | Notes |
---|---|---|
valence_charges | double [number_of_atom_species] | |
pseudopotential_types | char [number_of_atom_species][character_string_length] | |
Variables | Type (index order as in C) | Notes |
---|---|---|
number_of_electrons | integer | |
exchange_functional | char [character_string_length] | |
correlation_functional | char [character_string_length] | |
fermi_energy | double | The "units" attribute is required. The attribute "scale_to_atomic_units" might also be mandatory, see Generic attributes of variables. |
smearing_scheme | char [character_string_length] | |
smearing_width | double | The "units" attribute is required. The attribute "scale_to_atomic_units" might also be mandatory, see Generic attributes of variables. |
Variables | Type (index order as in C) | Notes |
---|---|---|
kinetic_energy_cutoff | double | The "units" attribute is required. The attribute "scale_to_atomic_units" might also be mandatory, see Generic attributes of variables. |
kpoint_grid_shift | double [number_of_reduced_dimensions] | |
kpoint_grid_vectors | double [number_of_vectors] [number_of_reduced_dimensions] | |
monkhorst_pack_folding | integer [number_of_vectors] | |
NetCDF files that respect the ETSF specifications described in the
present document should be easily recognized. We suggest appending the
string "-etsf.nc" to their names. The extension ".nc" is a standard
convention for naming NetCDF files, see:
http://www.unidata.ucar.edu/software/netcdf/docs/faq.html#filename .
Some filesystems are case-insensitive, and this motivates the
lower-case choice. Finally, a dash is to be preferred to an underscore
to allow the files to be referenced by Web search engines.
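The naming suggestion above amounts to a one-line rule. The following minimal sketch uses hypothetical helper names, and the case-insensitive comparison in the recognition function is an assumption motivated by the remark on case-insensitive filesystems.

```python
ETSF_SUFFIX = "-etsf.nc"


def etsf_file_name(stem):
    """Append the suggested "-etsf.nc" suffix unless it is already present."""
    if stem.endswith(ETSF_SUFFIX):
        return stem
    return stem + ETSF_SUFFIX


def looks_like_etsf_file(name):
    """Recognize the suffix, ignoring case as a case-insensitive filesystem would."""
    return name.lower().endswith(ETSF_SUFFIX)


print(etsf_file_name("silicon_density"))        # silicon_density-etsf.nc
print(looks_like_etsf_file("SILICON-ETSF.NC"))  # True
```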
An ETSF NetCDF file for crystallographic data should contain the following set of mandatory information:
The use of atomic_numbers is preferred. If atomic_numbers is not available, atom_species_names is preferred over chemical_symbols. If more than one of these variables is present in a file, the reading program should follow the same order of preference.
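The order of preference described above can be applied by a reading program as in the following minimal sketch, which assumes that the names of the variables present in the file are available as a collection of strings; `pick_species_variable` is a hypothetical name.

```python
# Order of preference stated in the text: most preferred first.
SPECIES_VARIABLES = ("atomic_numbers", "atom_species_names", "chemical_symbols")


def pick_species_variable(variable_names):
    """Return the most preferred species-identifying variable present in the file."""
    for name in SPECIES_VARIABLES:
        if name in variable_names:
            return name
    raise KeyError("no variable identifying the atomic species was found")


print(pick_species_variable({"chemical_symbols", "atom_species_names"}))  # atom_species_names
```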
As mentioned in General considerations and General specifications, such a file might contain additional information agreed within ETSF, such as any of the variables specified in General specifications. It might even contain enough information to be declared an ETSF NetCDF file "containing the density" or "containing the wavefunctions", or both. Such a file might also contain additional information specific to the software that generated the file. This software-specific information is not expected to be used by other software.
It is not expected that the above-mentioned information be distributed among different files (unlike for density/potential/wavefunction files, see later).
Variables and attributes to specify the atomic structure and symmetry operations.
Variables | Type (index order as in C) | Notes |
---|---|---|
primitive_vectors | double [number_of_vectors][number_of_cartesian_directions] | By default, given in Bohr. |
reduced_symmetry_matrices | integer [number_of_symmetry_operations][number_of_reduced_dimensions][number_of_reduced_dimensions] | The "symmorphic" attribute is needed. |
reduced_symmetry_translations | double [number_of_symmetry_operations][number_of_reduced_dimensions] | The "symmorphic" attribute is needed. |
space_group | integer | Between 1 and 232. |
atom_species | integer [number_of_atoms] | Between 1 and "number_of_atom_species". |
reduced_atom_positions | double [number_of_atoms][number_of_reduced_dimensions] | |
atomic_numbers | double [number_of_atom_species] | |
atom_species_names | char [number_of_atom_species][character_string_length] | |
chemical_symbols | char [number_of_atom_species][symbol_length] | |
Attributes | Type | Notes |
---|---|---|
symmorphic | char(80) | flag-type attribute, see Flag-like attributes. |
$r'^{red}_{\alpha} = \sum_{\beta} S^{red}_{\alpha \beta} r^{red}_{\beta} + t^{red}_{\alpha}$
The array reduced_symmetry_matrices contains the matrices S, in reduced coordinates, while the vector t, in reduced coordinates, is contained in the array reduced_symmetry_translations of the same Table. There might be a confusion between the two dimensions number_of_reduced_dimensions of this variable. In the C ordering, the last one corresponds to the beta index in the above-mentioned formula.
The first symmetry operation must always be unity with translation vector (0,0,0). If all translations are zero, the attribute symmorphic for reduced_symmetry_matrices should be set to "yes".
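The rules on symmetry operations can be exercised with plain lists. The sketch below applies one operation in reduced coordinates, r' = S r + t, with the last index of the matrix being the summed one (as in the C ordering), and checks the two constraints stated above. The helper names and the example rotation are illustrative assumptions, not part of the specification.

```python
def apply_symmetry(matrix, translation, position):
    """Return S r + t in reduced coordinates; matrix[a][b] multiplies position[b]."""
    return [
        sum(matrix[a][b] * position[b] for b in range(3)) + translation[a]
        for a in range(3)
    ]


def first_operation_is_identity(matrices, translations):
    """Check that the first operation is unity with translation (0,0,0)."""
    identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    return matrices[0] == identity and all(t == 0 for t in translations[0])


def symmorphic_flag(translations):
    """Value of the "symmorphic" attribute: "yes" if all translations vanish."""
    return "yes" if all(t == 0 for row in translations for t in row) else "no"


# Identity followed by a four-fold screw operation (hypothetical example data).
matrices = [[[1, 0, 0], [0, 1, 0], [0, 0, 1]], [[0, -1, 0], [1, 0, 0], [0, 0, 1]]]
translations = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.5]]

print(apply_symmetry(matrices[1], translations[1], [0.25, 0.0, 0.0]))  # [0.0, 0.25, 0.5]
print(first_operation_is_identity(matrices, translations))             # True
print(symmorphic_flag(translations))                                    # no
```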
An ETSF NetCDF file for a density should contain the following set of mandatory information:
As mentioned in General considerations and General specifications, such a file might contain additional information agreed within ETSF, such as any of the variables specified in General specifications. It might even contain enough information to be declared an ETSF NetCDF file "containing crystallographic data" or "containing the wavefunctions", or both. Such a file might also contain additional information specific to the software that generated the file. This software-specific information is not expected to be used by other software.
An ETSF NetCDF exchange, correlation, or exchange-correlation potential file should contain at least one variable among the three presented in Exchange and correlation, in replacement of the specification of the density. The type and size of such variables are similar to those of the density. The other variables required for a density are also required for a potential file. Additional ETSF or software-specific information might be added, as described previously.
The information might be distributed among different files, thanks to the use of splitting of data for variables:
In case the splitting related to one of these variables is activated, then the corresponding variables in Auxiliary variables for splitting must be defined. Accordingly, the dimensions of the variables in Density and/or Exchange and correlation will be changed, to accommodate only the segment of data effectively contained in the file.
Variables | Type (index order as in C) | Notes |
---|---|---|
density | double[number_of_components][number_of_grid_points_vector3][number_of_grid_points_vector2][number_of_grid_points_vector1][real_or_complex_density] | This is a pseudo-density. Note in case of PAW, the augmentation contribution is missing. By default, the density is given in atomic units, that is, number of electrons per Bohr^3^. The “units” attribute is required. The attribute “scale_to_atomic_units” might also be mandatory, see Generic attributes of variables. |
A density in such a format (represented on a 3D homogeneous grid) is suited for the representation of smooth densities, as obtained naturally from pseudopotential calculations using plane waves.
This specification for a density can also accommodate the response densities of Density-Functional Perturbation Theory.
Variables | Type (index order as in C) | Notes |
---|---|---|
correlation_potential | double[number_of_components][number_of_grid_points_vector3][number_of_grid_points_vector2][number_of_grid_points_vector1][real_or_complex_potential] | Note in case of PAW, the augmentation contribution is missing. The "units" attribute is required. The attribute "scale_to_atomic_units" might also be mandatory, see Generic attributes of variables. |
exchange_potential | double[number_of_components][number_of_grid_points_vector3][number_of_grid_points_vector2][number_of_grid_points_vector1][real_or_complex_potential] | Note in case of PAW, the augmentation contribution is missing. The "units" attribute is required. The attribute "scale_to_atomic_units" might also be mandatory, see Generic attributes of variables. |
exchange_correlation_potential | double[number_of_components][number_of_grid_points_vector3][number_of_grid_points_vector2][number_of_grid_points_vector1][real_or_complex_potential] | Note in case of PAW, the augmentation contribution is missing. The "units" attribute is required. The attribute "scale_to_atomic_units" might also be mandatory, see Generic attributes of variables. |
An ETSF NetCDF file "containing the wavefunctions" should contain at least the information needed to build the density from this file. Also, since the eigenvalues are intimately linked to the eigenfunctions, such a file is expected to contain eigenvalues. Of course, files might contain less information than required, but still follow the naming convention of ETSF. They might also contain more information, of the kind specified in other tables of the present document.
An ETSF NetCDF file "containing the wavefunctions" should contain the following set of mandatory information:
As mentioned in General considerations and General specifications, such a file might contain additional information agreed on within ETSF, such as any of the variables specified in General specifications. It might even contain enough information to be declared an ETSF NetCDF file "containing crystallographic data" or "containing the density", or both. Such a file might also contain additional information specific to the software that generated the file. This software-specific information is not expected to be used by other software.
The information might be distributed among different files, thanks to the use of splitting of data for variables:
And, either
or
In case the splitting related to one of these variables is activated, then the corresponding variables in Split wavefunctions must be defined. Accordingly, the dimensions of the variables in K-points, States, Wavefunctions, and BSE/GW might have to be changed, to accommodate only the segment of data effectively contained in the file.
Variables | Type (index order as in C) | Notes |
---|---|---|
reduced_coordinates_of_kpoints | double[number_of_kpoints] [number_of_reduced_dimensions] | See possible changes for split files in Split wavefunctions. |
kpoint_weights | double[number_of_kpoints] | See Construction of the density. See also possible changes for split files in Split wavefunctions. |
Variables | Type (index order as in C) | Notes |
---|---|---|
number_of_states | integer[number_of_spins][number_of_kpoints] | The attribute "k_dependent" must be defined. |
eigenvalues | double[number_of_spins][number_of_kpoints][max_number_of_states] | The "units" attribute is required. The attribute "scale_to_atomic_units" might also be mandatory, see Generic attributes of variables. See also possible changes for split files in Split wavefunctions. |
occupations | double[number_of_spins][number_of_kpoints][max_number_of_states] | See also possible changes for split files in Split wavefunctions. |
Attributes | Type | Notes |
---|---|---|
k_dependent | char(80) | Attribute of number_of_states, flag-type, see Flag-like attributes. |
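Flag-type attributes such as k_dependent follow the rule given in Flag-like attributes: written in full and in lower case, read by checking the first character only. A minimal sketch of both directions is given below; the function names are hypothetical, and accepting an upper-case first letter on reading is an extra tolerance that the specification does not require.

```python
def write_flag(value):
    """Full-length, lower-case form to be written to the file."""
    return "yes" if value else "no"


def read_flag(text):
    """Decode a flag by looking at its first character only."""
    first = text[:1].lower()  # tolerating "Y"/"N" is an assumption of this sketch
    if first == "y":
        return True
    if first == "n":
        return False
    raise ValueError("not a flag value: " + repr(text))


print(write_flag(True))  # yes
print(read_flag("no"))   # False
```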
Variables | Type (index order as in C) | Notes |
---|---|---|
basis_set | char[character_string_length] | "plane_waves" if a plane-wave basis set is used. "daubechies_wavelets" if a Daubechies wavelet basis set is used. |
number_of_coefficients | integer[number_of_kpoints] | The attribute "k_dependent" must be defined (see States). Possible splitting, see Split wavefunctions. |
coefficients_of_wavefunctions | double [number_of_spins][number_of_kpoints][max_number_of_states][number_of_spinor_components][max_number_of_coefficients][real_or_complex_coefficients] | For both plane-wave basis set and Daubechies wavelet basis set. Normalization for plane waves: 1 per unit cell. See also possible modifications for split files in Split wavefunctions. The attribute used_time_reversal_at_gamma might be defined. |
reduced_coordinates_of_plane_waves | integer[number_of_kpoints][max_number_of_coefficients][number_of_reduced_dimensions] | The attribute "k_dependent" must be defined (see States). See possible modifications for split files in Split wavefunctions. The attribute used_time_reversal_at_gamma might be defined. |
coordinates_of_basis_grid_points | integer[number_of_localization_regions][max_number_of_basis_grid_points][number_of_reduced_dimensions] | For wavelets. |
number_of_coefficients_per_grid_point | integer[number_of_localization_regions][max_number_of_basis_grid_points] | For wavelets. |
order_of_Daubechies_wavelets | integer | For wavelets. |
real_space_wavefunctions | double[number_of_spins][number_of_kpoints][max_number_of_states][number_of_spinor_components][number_of_grid_points_vector3][number_of_grid_points_vector2][number_of_grid_points_vector1][real_or_complex_wavefunctions] | Normalization: 1 per unit cell. See possible modifications for split files in Split wavefunctions. |
Attributes | Type | Notes |
---|---|---|
used_time_reversal_at_gamma | char(80) | Attribute of reduced_coordinates_of_plane_waves and coefficients_of_wavefunctions; flag-type, see Flag-like attributes. |
When the variable basis_set is set to "daubechies_wavelets", the basis set consists of a reduced set of grid points, each of which can host one or several coefficients. The following explanation assumes a two-level resolution, but it can be used for other values. In the two-resolution case, all quantities other than the wavefunctions (such as the density) are usually expressed on the finest grid, i.e. the grid for the density is twice as fine as the grid for the wavefunctions. Since the dimensions number_of_grid_points_vector<i> are used to define the scalar variables, the coordinates_of_basis_grid_points must be even numbers in the two-resolution case. The wavefunctions are expanded in real space on a non-complete uniform grid. The grid points used for the basis set are listed in the variable coordinates_of_basis_grid_points. Each basis grid point can host one or eight coefficients, as stored in the variable number_of_coefficients_per_grid_point. In that case, the dimension max_number_of_coefficients is the sum over the basis grid points of the values of number_of_coefficients_per_grid_point. To build the wavefunctions from the values stored in coefficients_of_wavefunctions, one must read the required number of coefficients for each basis grid point. When one coefficient is given, it is the coefficient of a product of one-dimensional Daubechies scaling functions centered on the basis grid point. When eight values are given, they are the coefficients of products of both scaling functions and wavelet functions ($\phi$ denotes Daubechies scaling functions and $\psi$ Daubechies wavelet functions):
For a review on wavelets, including the description of Daubechies wavelets, see, e.g., Wavelets and Their Application for the Solution of Partial Differential Equations in Physics, Presses Polytechniques et Universitaires Romandes, Lausanne, (1998) by S. Goedecker.
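The coefficient counting described above for the wavelet basis can be sketched as follows. The sum giving max_number_of_coefficients comes from the text; the assumption that the coefficients of successive basis grid points are stored contiguously, in grid-point order, is an interpretation of "one must read the required number of coefficients for each basis grid point". The function name `coefficient_offsets` is hypothetical.

```python
from itertools import accumulate


def coefficient_offsets(number_of_coefficients_per_grid_point):
    """Return (start offset of each basis grid point, total number of coefficients).

    Each grid point is expected to host either 1 or 8 coefficients.
    """
    for n in number_of_coefficients_per_grid_point:
        if n not in (1, 8):
            raise ValueError("a basis grid point hosts 1 or 8 coefficients, got " + str(n))
    # Running sum; the last entry is the total, the others are start offsets.
    ends = [0] + list(accumulate(number_of_coefficients_per_grid_point))
    return ends[:-1], ends[-1]


offsets, total = coefficient_offsets([1, 8, 1])
print(offsets)  # [0, 1, 9]
print(total)    # 10, the value of max_number_of_coefficients in this toy case
```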
Note that this specification for the wavefunctions can accommodate the response wavefunctions of Density-Functional Perturbation Theory. On the contrary, the response eigenenergies (actually a Hermitian matrix of Lagrange multipliers) cannot be accommodated by the "eigenvalues" array of States.
Different variables see their dimensions modified, in case the file is split, as described in Splitting (see Dimensions that can be split and Auxiliary variables for splitting). In the following table we have gathered the variables whose dimensions will change. We have also dimensioned them as if the splitting was done on all the possible dimensions. This will rarely be the case, but intermediate situations can easily be deduced from the data gathered in the table.
Variables | Type (index order as in C) | Notes |
---|---|---|
reduced_coordinates_of_kpoints | double[my_number_of_kpoints][number_of_reduced_dimensions] | |
number_of_coefficients | integer[my_number_of_kpoints] | |
kpoint_weights | double[my_number_of_kpoints] | |
occupations | double[my_number_of_spins][my_number_of_kpoints][my_max_number_of_states] | |
eigenvalues | double[my_number_of_spins][my_number_of_kpoints][my_max_number_of_states] | The "units" attribute is required. The attribute "scale_to_atomic_units" might also be mandatory, see Generic attributes of variables. |
real_space_wavefunctions | double [my_number_of_spins][my_number_of_kpoints][my_max_number_of_states][my_number_of_spinor_components][my_number_of_grid_points_vector3][my_number_of_grid_points_vector2][my_number_of_grid_points_vector1][real_or_complex_wavefunctions] | |
coefficients_of_wavefunctions | double [my_number_of_spins][my_number_of_kpoints][my_max_number_of_states][my_number_of_spinor_components][my_max_number_of_coefficients][real_or_complex_coefficients] | |
reduced_coordinates_of_plane_waves | integer[my_number_of_kpoints][my_max_number_of_coefficients][number_of_reduced_dimensions] | |
The variables mentioned in this table are optional. They have been introduced in the present specification in anticipation of their use by some GW/BSE software, and might be subject to (heavy?) revisions in future versions of the specification.
Dimensions | Type | Notes |
---|---|---|
max_number_of_angular_momenta | integer | |
max_number_of_projectors | integer | |
Variables | Type (index order as in C) | Notes |
---|---|---|
gw_corrections | double [number_of_spins][number_of_kpoints][max_number_of_states][real_or_complex_gw_corrections] | The "units" attribute is required. The attribute "scale_to_atomic_units" might also be mandatory, see Generic attributes of variables. See also possible changes for split files, as in Split wavefunctions. |
kb_formfactor_sign | integer[number_of_atom_species][max_number_of_angular_momenta][max_number_of_projectors] | |
kb_formfactors | double[number_of_atom_species][max_number_of_angular_momenta][max_number_of_projectors][number_of_kpoints][max_number_of_coefficients] | Possible changes for split files, as in Split wavefunctions. |
kb_formfactor_derivative | double[number_of_atom_species][max_number_of_angular_momenta][max_number_of_projectors][number_of_kpoints][max_number_of_coefficients] | Possible changes for split files, as in Split wavefunctions. |
Concerning the Kleinman-Bylander form factors, we note that the non-local part of the Kleinman-Bylander pseudopotential can always be written, in reciprocal space, in the following way:
$v^{KB}_{nonloc} (\vec{K},\vec{K'}) = \sum_s \left[ \sum_{a(s)} e^{-i(\vec{K}-\vec{K'}) \cdot \vec{\tau_a}}\right] \left[ \sum_{lp} P_l(\hat{K} \cdot \hat{K'}) F^{\star}_{slp}(K) S_{slp} F_{slp}(K') \right]$
with $\vec{K} = \vec{k} + \vec{G}$, where $\vec{k}$ is one of the k-points (see Exchange and correlation) and $\vec{G}$ is a vector of the reciprocal lattice; the list of reduced coordinates of these vectors can be found in the variable reduced_coordinates_of_plane_waves of Wavefunctions. $K$ is the modulus of $\vec{K}$ and $\hat{K}$ its direction. $\vec{\tau_a}$ is the position of atom $a$ belonging to species $s$. $P_l (x)$ is the Legendre polynomial of order $l$. $F_{slp} (K)$ is the Kleinman-Bylander form factor for species $s$, angular momentum $l$, and projector number $p$. $S_{slp}$ is the sign of the dyadic product $F_{slp}^{\star}(K) F_{slp}(K')$. The sum over $a(s)$ runs over all atoms of atomic species $s$, $l$ runs over all the pseudopotential angular momentum components of the atomic species $s$, and $p$ runs over the projectors allowed for a specific angular momentum channel of atomic species $s$. The additional variable kb_formfactor_derivative is equal to $d F_{slp}(K) / dK$.
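As an illustration only, the expression above can be evaluated with a short Python sketch. The function names `legendre` and `kb_nonlocal`, the nested-list layout of the arguments, and the use of Cartesian vectors are assumptions made for this example; they are not part of the specification. The form factors are assumed to have been evaluated already at $|\vec{K}|$ and $|\vec{K'}|$ (for instance by reading kb_formfactors and kb_formfactor_sign), and no prefactor beyond those written in the formula is applied.

```python
import cmath
import math


def legendre(l, x):
    """Legendre polynomial P_l(x), computed with the Bonnet recurrence."""
    if l == 0:
        return 1.0
    p_prev, p = 1.0, x
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p


def kb_nonlocal(K, Kp, positions, F, Fp, sign):
    """Evaluate the non-local KB term for Cartesian vectors K and Kp.

    positions[s]  : Cartesian positions tau_a of the atoms of species s
    F[s][l][p]    : form factor F_slp evaluated at |K|  (hypothetical input)
    Fp[s][l][p]   : form factor F_slp evaluated at |K'| (hypothetical input)
    sign[s][l][p] : S_slp, either +1 or -1
    """
    norm_K = math.sqrt(sum(c * c for c in K))
    norm_Kp = math.sqrt(sum(c * c for c in Kp))
    # The angle is undefined when K or K' vanishes; this sketch then uses
    # cos = 1, which is an assumption and only harmless for l = 0.
    if norm_K == 0.0 or norm_Kp == 0.0:
        cos_angle = 1.0
    else:
        cos_angle = sum(a * b for a, b in zip(K, Kp)) / (norm_K * norm_Kp)
    dK = [a - b for a, b in zip(K, Kp)]

    total = 0j
    for s, atoms in enumerate(positions):
        # Structure factor: sum over the atoms a(s) of species s.
        structure = sum(
            cmath.exp(-1j * sum(d * t for d, t in zip(dK, tau)))
            for tau in atoms
        )
        # Sum over angular momenta l and projectors p.
        radial = 0j
        for l, projectors in enumerate(F[s]):
            for p, f in enumerate(projectors):
                radial += (legendre(l, cos_angle) * complex(f).conjugate()
                           * sign[s][l][p] * Fp[s][l][p])
        total += structure * radial
    return total
```

The loops follow the order of the sums in the formula (species, then atoms, then $l$ and $p$) rather than aiming at efficiency.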
Supposing $\rho_{n, k} (r)$ to be the partial density at point r (in real space, using reduced coordinates) due to band n at k-point k (in reciprocal space, using reduced coordinates), the full density at point r is obtained from:
$\rho(r^{red}_\alpha) = \sum_{s \in sym} \sum_k w_k \sum_n f_{n,k} \rho_{n, k} \left( S^{red}_{s,\alpha \beta} (r^{red}_\beta-t^{red}_{s,\beta}) \right)$,
where $w_k$ is contained in the array "kpoint_weights" of K-points, and $f_{n, k}$ is contained in the array "occupations" of States. This relation generalizes to the collinear spin-polarized case, as well as the non-collinear case by taking into account the "number_of_components" defined in Dimensions that can be split, and the direction of the magnetization vector.
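The sum above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function name `full_density`, the callable `partial_density`, and the plain-list data layout are hypothetical, the spin index is dropped for brevity, and no normalization is applied beyond what the formula itself contains.

```python
def full_density(r, symmetries, kpoint_weights, occupations, partial_density):
    """Accumulate the full density at the reduced point r.

    r               : reduced coordinates of the point (3 floats)
    symmetries      : list of (S, t) pairs, with S a 3x3 nested list and
                      t a list of 3 floats, both in reduced coordinates
    kpoint_weights  : list of weights w_k
    occupations     : occupations[k][n] = f_{n,k}
    partial_density : hypothetical callable (n, k, r_reduced) -> float
    """
    total = 0.0
    for S, t in symmetries:
        # Argument of rho_{n,k}: S_{alpha beta} (r_beta - t_beta).
        shifted = [r[b] - t[b] for b in range(3)]
        mapped = [sum(S[a][b] * shifted[b] for b in range(3))
                  for a in range(3)]
        for k, w in enumerate(kpoint_weights):
            for n, f in enumerate(occupations[k]):
                total += w * f * partial_density(n, k, mapped)
    return total
```

In a real reader, `partial_density` would typically interpolate or index values stored on the real-space grid; here it is left abstract so that the structure of the triple sum stays visible.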
To summarize:
Although the 64-bit offset format allows the creation of much larger NetCDF files than was possible with the classic format, there are still some restrictions on the size of variables. It is important to note that without Large File Support (LFS) in the operating system, it is impossible to create any file larger than 2 GBytes. Assuming an operating system with LFS, the following restrictions apply to the NetCDF 64-bit offset format:
Note also that all NetCDF variables and records are padded to 4-byte boundaries.
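To make the padding rule and the size limit concrete, the following Python sketch estimates the padded size of a fixed-size variable from its shape. The helper names are hypothetical, the element size of 8 bytes corresponds to the "double" type used in the tables above, and interpreting "2 GBytes" as 2**31 bytes is an assumption of this example.

```python
TWO_GBYTES = 2 ** 31  # assumption: "2 GBytes" read as 2**31 bytes


def padded_size(nbytes, boundary=4):
    """Round a byte count up to the next multiple of `boundary`."""
    return (nbytes + boundary - 1) // boundary * boundary


def variable_size_bytes(shape, element_size=8):
    """Padded size in bytes of a variable with the given shape."""
    count = 1
    for extent in shape:
        count *= extent
    return padded_size(count * element_size)


def needs_large_file_support(shape, element_size=8):
    """True if the variable alone already exceeds the 2 GBytes limit."""
    return variable_size_bytes(shape, element_size) > TWO_GBYTES
```

For example, a coefficients_of_wavefunctions array with a hypothetical shape of (1, 10, 8, 1, 1000, 2) holds 160000 doubles, that is 1280000 bytes, far below the limit, whereas a shape of (2, 100, 200, 1, 50000, 2) gives 32000000000 bytes and cannot be written without Large File Support.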
Here is a list of the names of all agreed variables, attributes, and dimensions, in alphabetical order.
Note: all the variables/dimensions beginning with "my_" refer to split files, and are explained in Auxiliary dimensions for splitting and Auxiliary variables for splitting.
Name | Type | Table |
---|---|---|
atom_species | Variable | Atomic structure and symmetry operations |
atom_species_names | Variable | Atomic structure and symmetry operations |
atomic_numbers | Variable | Atomic structure and symmetry operations |
basis_set | Variable | Wavefunctions |
character_string_length | Dimension | Dimensions that cannot be split |
chemical_symbols | Variable | Atomic structure and symmetry operations |
coefficients_of_wavefunctions | Variable | Wavefunctions |
Conventions | Global attribute | Mandatory attributes |
coordinates_of_basis_grid_points | Variable | Wavefunctions |
correlation_functional | Variable | Electronic structure |
correlation_potential | Variable | Exchange and correlation |
density | Variable | Density |
eigenvalues | Variable | States |
exchange_correlation_potential | Variable | Exchange and correlation |
exchange_functional | Variable | Electronic structure |
exchange_potential | Variable | Exchange and correlation |
fermi_energy | Variable | Electronic structure |
file_format | Global attribute | Mandatory attributes |
file_format_version | Global attribute | Mandatory attributes |
gw_corrections | Variable | BSE/GW |
history | Global attribute | Optional attributes |
k_dependent | Attribute | States |
kb_formfactor_sign | Variable | BSE/GW |
kb_formfactors | Variable | BSE/GW |
kb_formfactor_derivative | Variable | BSE/GW |
kinetic_energy_cutoff | Variable | Reciprocal space |
kpoint_grid_shift | Variable | Reciprocal space |
kpoint_grid_vectors | Variable | Reciprocal space |
kpoints_weights | Variable | K-points |
max_number_of_angular_momenta | Dimension | BSE/GW |
max_number_of_basis_grid_points | Dimension | Dimensions that can be split |
max_number_of_coefficients | Dimension | Dimensions that can be split |
max_number_of_projectors | Dimension | BSE/GW |
max_number_of_states | Dimension | Dimensions that can be split |
monkhorst_pack_folding | Variable | Reciprocal space |
number_of_atoms | Dimension | Dimensions that cannot be split |
number_of_atom_species | Dimension | Dimensions that cannot be split |
number_of_cartesian_directions | Dimension | Dimensions that cannot be split |
number_of_coefficients | Variable | Wavefunctions |
number_of_coefficients_per_grid_point | Variable | Wavefunctions |
number_of_components | Dimension | Dimensions that can be split |
number_of_electrons | Variable | Electronic structure |
number_of_grid_points_vector1 | Dimension | Dimensions that can be split |
number_of_grid_points_vector2 | Dimension | Dimensions that can be split |
number_of_grid_points_vector3 | Dimension | Dimensions that can be split |
number_of_kpoints | Dimension | Dimensions that can be split |
number_of_localization_regions | Dimension | Dimensions that can be split |
number_of_reduced_dimensions | Dimension | Dimensions that cannot be split |
number_of_spinor_components | Dimension | Dimensions that can be split |
number_of_spins | Dimension | Dimensions that can be split |
number_of_states | Variable | States |
number_of_symmetry_operations | Dimension | Dimensions that cannot be split |
number_of_vectors | Dimension | Dimensions that cannot be split |
occupations | Variable | States |
order_of_Daubechies_wavelets | Variable | Wavefunctions |
primitive_vectors | Variable | Atomic structure and symmetry operations |
pseudopotential_types | Variable | Atomic information |
real_or_complex_coefficients | Dimension | Dimensions that cannot be split |
real_or_complex_density | Dimension | Dimensions that cannot be split |
real_or_complex_gw_corrections | Dimension | Dimensions that cannot be split |
real_or_complex_potential | Dimension | Dimensions that cannot be split |
real_or_complex_wavefunctions | Dimension | Dimensions that cannot be split |
real_space_wavefunctions | Variable | Wavefunctions |
reduced_atom_positions | Variable | Atomic structure and symmetry operations |
reduced_coordinates_of_kpoints | Variable | K-points |
reduced_coordinates_of_plane_waves | Variable | Wavefunctions |
reduced_symmetry_matrices | Variable | Atomic structure and symmetry operations |
reduced_symmetry_translations | Variable | Atomic structure and symmetry operations |
scale_to_atomic_units | Attribute | Generic attributes of variables |
smearing_scheme | Variable | Electronic structure |
smearing_width | Variable | Electronic structure |
space_group | Variable | Atomic structure and symmetry operations |
symbol_length | Dimension | Dimensions that cannot be split |
symmorphic | Attribute | Atomic structure and symmetry operations |
title | Global attribute | Optional attributes |
units | Attribute | Generic attributes of variables |
used_time_reversal_at_gamma | Attribute | Wavefunctions |
valence_charges | Variable | Atomic information |