Difference between revisions of "The XIOS Library"
Revision as of 11:56, 6 February 2024
The XIOS (Xml I/O Server) library is based on client-server principles where the server manages the outputs asynchronously from the client (the climate model) so that the bottleneck of writing data is alleviated.
Installing the XIOS library
Prerequisites
There are a couple of prerequisites to installing and using the XIOS library:
- An MPI library must be available
- A NetCDF4-HDF5 library, preferably compiled with MPI enabled, must be available (see, e.g. dedicated section on The_netCDF_library)
The rest of this page assumes all prerequisites are met. Those who need to build an appropriate NetCDF library on their Linux machine may find the following installation script useful: https://web.lmd.jussieu.fr/~lmdz/pub/script_install/install_netcdf4_hdf5.bash (it might need some adaptations to work in your specific case).
Downloading and compiling the XIOS library
The XIOS source code is available for download using svn (subversion). To download it, go to your trunk repository and run a command such as:
svn co http://forge.ipsl.jussieu.fr/ioserver/svn/XIOS/trunk XIOS
- To compile the library, one must first have adequate architecture "arch" files at hand, just like for the GCM (see The Target Architecture ("arch") Files). In principle both the arch.env and arch.path files could be the same as for the GCM; arch.fcm will of course differ, as the XIOS source code is in C++ (along with a Fortran interface). If using a "known" machine (e.g. Occigen, Irene-Rome, Ciclad) then ready-to-use, up-to-date arch files for that machine should be present in the arch directory. If not, you will have to create your own (it is advised to use the existing ones as templates!)
- Assuming some_machine arch files (i.e. files arch-some_machine.env, arch-some_machine.path, arch-some_machine.fcm) are present in the arch subdirectory, compiling XIOS is done using the dedicated make_xios script, e.g.:
./make_xios --prod --arch some_machine --job 8
If the compilation steps went well then the lib directory should contain file libxios.a and the bin directory should contain
fcm_env.ksh generic_testcase.exe xios_server.exe
XIOS documentation
Note that the downloaded XIOS distribution includes some documentation in the doc subdirectory:
reference_xml.pdf XIOS_reference_guide.pdf XIOS_user_guide.pdf
Definitely worth checking out!
Compiling the GCM with the XIOS library
To compile with XIOS enabled, one must specify the option
-io xios
to the makelmdz_fcm script.
XIOS output controls
All aspects of the outputs (name, units, file, post-processing operations, etc.) are controlled by dedicated XML files which are read at run-time. Samples of xml files are provided in the "deftank" directory.
In a nutshell
- the master file read by XIOS is iodef.xml; it contains XIOS-specific parameters such as using_server, which dictates whether XIOS is run in client-server mode (true) or attached mode (false); info_level, which sets the verbosity of XIOS messages (0: none, 100: very verbose); and print_file, which sets whether XIOS messages are sent to standard output (false) or to dedicated xios_*.out and xios_*.err files (true).
<variable id="using_server" type="bool">false</variable>
<variable id="info_level" type="int">0</variable>
<variable id="print_file" type="bool"> false </variable>
- It is common practice to put LMDZ-related definitions and outputs in a separate XML file, e.g. context_lmdz.xml, which is included in iodef.xml via the src attribute, e.g.
<context id="LMDZ" src="./context_lmdz_physics.xml"/>
The context_lmdz_physics.xml file must then contain all field, grid and output file definitions, which may themselves be split into multiple XML files. For instance the definition of model variables (i.e. all fields that may be output) is often put in a separate file field_def_physics.xml, which is referenced within context_lmdz_physics.xml as:
<field_definition src="./field_def_physics.xml" />
Concerning output files, the current recommended practice is to use separate file_def_histsomething_lmdz.xml files, one for each histsomething.nc file to generate, and include these in context_lmdz.xml using the file_definition key, e.g.:
<!-- Define output files
Each file contains the list of variables and their output levels -->
<file_definition src="./file_def_histins.xml"/>
<file_definition src="./file_def_specIR.xml"/>
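To show how the snippets above fit together, here is a minimal sketch of a complete iodef.xml. The enclosing simulation element, the context named "xios" and the variable_group named "parameters" follow the layout commonly found in distributed sample files and are assumptions here; check them against the samples in the "deftank" directory.

```xml
<?xml version="1.0"?>
<simulation>
    <!-- XIOS own settings (layout assumed from commonly distributed samples) -->
    <context id="xios">
        <variable_definition>
            <variable_group id="parameters">
                <!-- false: attached mode; true: client-server mode -->
                <variable id="using_server" type="bool">false</variable>
                <!-- verbosity of XIOS messages, from 0 (none) to 100 (very verbose) -->
                <variable id="info_level" type="int">0</variable>
                <!-- false: messages go to standard output; true: to xios_*.out and xios_*.err -->
                <variable id="print_file" type="bool">false</variable>
            </variable_group>
        </variable_definition>
    </context>
    <!-- model-related definitions are read from a separate file -->
    <context id="LMDZ" src="./context_lmdz_physics.xml"/>
</simulation>
```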
Some XIOS key concepts
calendars
The calendar is set via the Fortran source code (see xios_output_mod.F90 in the physics). Without going into details here, note that it is flexible enough so that day length, year length, etc. may be defined by the user. However a strong limitation is that the calendar time step should be an integer number of seconds.
TODO: refer to specific stuff/settings for Mars, Generic, Venus cases...
axes, domains and grids
First a bit of XIOS nomenclature:
- an axis is 1D; e.g. pseudo-altitude or pseudo-pressure or sub-surface depth or wavelength or ...
- a domain is a horizontal 2D surface; e.g. the globe or some portion of it
- a grid is the combination of a domain and one axis (or more); e.g. the atmosphere or the sub-surface of a planet
Most axes and domains are defined in the code (since all the information is known there) and only referred to in the XML via dedicated id values, e.g.:
<axis_definition>
<axis id="presnivs"
standard_name="Pseudo-pressure of model vertical levels"
unit="Pa">
</axis>
<axis id="altitude"
standard_name="Pseudo-altitude of model vertical levels"
unit="km">
</axis>
</axis_definition>
Likewise the global computational domain is defined in the code and known in the XML via its id(="dom_glo"):
<domain_definition>
<domain id="dom_glo" data_dim="2" />
</domain_definition>
From there one may generate a grid, e.g.:
<grid_definition>
<!-- toggle axis id below to change output vertical axis -->
<grid id="grid_3d">
<domain id="dom_glo" />
<!-- <axis id="presnivs" /> -->
<axis id="altitude" />
</grid>
</grid_definition>
Note that grid_3d is defined in the XML file and thus may be changed by the user without having to modify the PCM source code.
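As a rough sketch of how these definitions and the included files mentioned earlier might sit together in a single context file (here context_lmdz_physics.xml), consider the outline below. The ordering of the blocks is an illustrative choice, and the enclosing context element with the same id as in iodef.xml is an assumption based on common sample files.

```xml
<!-- Outline of a context file; a sketch, not a complete working file -->
<context id="LMDZ">
    <!-- axes and domains: defined in the code, referred to here by their id -->
    <axis_definition>
        <axis id="altitude"
              standard_name="Pseudo-altitude of model vertical levels"
              unit="km">
        </axis>
    </axis_definition>
    <domain_definition>
        <domain id="dom_glo" data_dim="2" />
    </domain_definition>
    <!-- grids: built in the XML from a domain and one or more axes -->
    <grid_definition>
        <grid id="grid_3d">
            <domain id="dom_glo" />
            <axis id="altitude" />
        </grid>
    </grid_definition>
    <!-- fields and output files: kept in separate files -->
    <field_definition src="./field_def_physics.xml" />
    <file_definition src="./file_def_histins.xml"/>
</context>
```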
field definitions
For XIOS a field is defined with an id and must be assigned to a reference grid (this is how XIOS knows whether a field is a simple scalar, a surface or a volume, and thus to which computational grid it is related), e.g.:
<field_definition prec="4">
<field_group id="fields_2D" domain_ref="dom_glo">
<field id="aire"
long_name="Mesh area"
unit="m2" />
<field id="phis"
long_name="Surface geopotential (gz)"
unit="m2/s2" />
<field id="tsol"
long_name="Surface Temperature"
unit="K" />
...
</field_group>
<field_group id="fields_3D" grid_ref="grid_3d">
<field id="temp"
long_name="Atmospheric temperature"
unit="K" />
<field id="pres"
long_name="Atmospheric pressure"
unit="Pa" />
...
</field_group>
</field_definition>
It is vital that all the fields which are sent to XIOS via the code are declared in the XML file; otherwise there will be a run-time error message such as:
In file "object_factory_impl.hpp", function "static std::shared_ptr<U> xios::CObjectFactory::GetObject(const std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> &) [with U = xios::CAxis]", line 78 -> [ id = weirdvar, U = field ] object was not found.
In the message above XIOS received from the code a variable called "weirdvar" which is not defined in the XML... One must update the XML file with the proper definition (<field id="weirdvar" ... />).
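A minimal sketch of such a fix is given below, assuming "weirdvar" is a 2D field living on the global domain. The choice of group, the long_name and the unit are placeholders and must be adapted to the actual variable.

```xml
<field_group id="fields_2D" domain_ref="dom_glo">
    <!-- existing 2D field declarations stay as they are -->
    <!-- added declaration; long_name and unit below are placeholders -->
    <field id="weirdvar"
           long_name="Placeholder description of weirdvar"
           unit="-" />
</field_group>
```

A 3D variable would instead go in a group with the appropriate grid_ref, such as the fields_3D group shown earlier.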
output file definitions
It is by defining a file that the user specifies what the output file will be, which variables it will contain, etc. as illustrated with this simple Venusian example:
<file_definition>
<!-- Instantaneous outputs; every physics time steps -->
<file id="Xins"
output_freq="1ts"
type="one_file"
enabled=".true.">
<!-- VARS 2D -->
<field_group operation="instant"
freq_op="1ts">
<field field_ref="phis" operation="once" />
<field field_ref="aire" operation="once" />
<field field_ref="tsol" />
</field_group>
<!-- VARS 3D -->
<field_group operation="instant"
freq_op="1ts">
<field field_ref="temp" />
<field field_ref="pres" />
</field_group>
</file>
</file_definition>
More thorough description with illustrative examples
TODO: PUT SOME SIMPLE ILLUSTRATIVE EXAMPLES HERE
See for example the following page: controlling outputs in the dynamics with DYNAMICO
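As one simple illustration, the sketch below defines a file of daily means rather than instantaneous values, reusing the fields declared earlier. The file id "Xday" is hypothetical; operation="average" and output_freq="1d" are standard XIOS settings, but the resulting file should be checked against your own setup.

```xml
<file_definition>
    <!-- Daily averages; "Xday" is a hypothetical file id -->
    <file id="Xday"
          output_freq="1d"
          type="one_file"
          enabled=".true.">
        <!-- fields are sampled every time step (freq_op)
             and averaged over each output period (output_freq) -->
        <field_group operation="average"
                     freq_op="1ts">
            <field field_ref="tsol" />
            <field field_ref="temp" />
        </field_group>
    </file>
</file_definition>
```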
Specifying that the time axis should be labeled in days rather than seconds
The default for XIOS is to label temporal axes ("time_instant" and "time_counter") in seconds. But one may ask that they be labeled in days by setting the optional time_units attribute of a file to days, e.g.
<file id="my_output_file"
output_freq="1ts"
time_units="days">
...
</file>
Force flushing and writing files every ### time steps
XIOS manages its own buffers and only writes to output files when needed. This is efficient, but it means that if the model crashes, some data may not yet have been written to the output files. One may use the optional sync_freq attribute of a file to force XIOS to write to the file at some predefined frequency, e.g.
<file id="my_output_file"
output_freq="1ts"
sync_freq="1ts">
...
</file>
This is very useful when debugging.
Specifying an offset (in time) for the outputs
One may use the attribute record_offset of a file to impose that the outputs in the file begin after a certain number of time steps of the simulation (useful for instance when debugging). For instance if there are 192 time steps per day and the run is 10 days long but one only wants outputs for the last day and at every time step of that day then one should have a record_offset of -9*192=-1728 (note the -; the value to specify is negative), e.g.:
<file id="my_output_file"
output_freq="1ts"
record_offset="-1728ts"
time_units="days">
<field_group operation="instant"
freq_op="1ts">
<field field_ref="my_variable" />
</field_group>
</file>
The time_counter values in the file will be from 9.0052 (=9.+1./192.) to 10. (since here the time axis unit is requested to be in days)
An alternative way to have the first n time steps of a time series excluded from the output is to specify a freq_offset attribute for the field. For instance, to follow up on the example above, to extract every time step of the final (10th) day of simulation with 192 time steps per day, one should specify a freq_offset of 9*192=1728, e.g.:
<file id="my_output_file"
output_freq="1ts"
time_units="days">
<field_group operation="instant"
freq_offset="1728ts"
freq_op="1ts">
<field field_ref="my_variable" />
</field_group>
</file>
The main difference, compared to the previous example using the record_offset file attribute, is that the time_counter values in the file will this time be from 0.0052 (=1./192) to 1.0.
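As a further worked case, here is a sketch of both approaches applied to the same 10-day run with 192 time steps per day, this time keeping the last two days. The first 8 days must be skipped, i.e. 8*192=1536 time steps. The file ids are hypothetical, and only one of the two files would normally be used.

```xml
<!-- Variant 1: record_offset on the file (note the negative value);
     following the rule above, time_counter would run from
     8.0052 (=8.+1./192.) to 10. -->
<file id="my_last2days_record_offset"
      output_freq="1ts"
      record_offset="-1536ts"
      time_units="days">
    <field_group operation="instant"
                 freq_op="1ts">
        <field field_ref="my_variable" />
    </field_group>
</file>

<!-- Variant 2: freq_offset on the fields (positive value);
     following the rule above, time_counter would run from
     0.0052 (=1./192.) to 2. -->
<file id="my_last2days_freq_offset"
      output_freq="1ts"
      time_units="days">
    <field_group operation="instant"
                 freq_offset="1536ts"
                 freq_op="1ts">
        <field field_ref="my_variable" />
    </field_group>
</file>
```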
Saving or loading interpolation weights
With the XIOS library one can define output domains (grids) which are different from the input domains (grids), and XIOS does the necessary interpolation.
Once the source and destination grids are known, this requires computing interpolation weights (during the initialization step). For large grids, this can take some time. One can however tell XIOS to save the interpolation weights in a file and to use that file (if it is present) rather than recompute them when a new simulation is run.
In practice one must add extra keys to the "interpolate_domain" tag, e.g.:
<domain id="dom_256_192" type="rectilinear" ni_glo="256" nj_glo="192" >
<generate_rectilinear_domain/>
<interpolate_domain order="1" write_weight="true" mode="read_or_compute" />
</domain>
This will automatically generate a NetCDF file containing the weights. Default file name will be something like xios_interpolation_weights_CONTEXT_INPUTDOMAIN_OUTPUTDOMAIN.nc , where CONTEXT, INPUTDOMAIN and OUTPUTDOMAIN are inherited from the context (i.e. definitions of these in the xml files).
One can specify the name of the file with the key "weight_filename", e.g.
<domain id="dom_256_192" type="rectilinear" ni_glo="256" nj_glo="192" >
<generate_rectilinear_domain/>
<interpolate_domain order="1" write_weight="true" mode="read_or_compute" weight_filename="xios_weights" />
</domain>
It can also happen that for a given variable we want the interpolation not to be conservative. For example, a variable like the area of a mesh cell should not be interpolated between different domains. Since the interpolation is specific to a domain (and defined in the "domain id"), we have to create a new domain for all the variables that should be interpolated in another way. For the variable "Area" for example, the syntax is as follows:
- Create the new domain:
<domain id="dom_64_48_quantity_T" type="rectilinear" ni_glo="64" nj_glo="48" >
<generate_rectilinear_domain/>
<interpolate_domain quantity="true"/>
</domain>
- Assign the variable to this domain:
Later in the context file, the variable should be output using this new domain (note that it can still be output in the same file as the other variables):
<field_group domain_ref="dom_64_48_quantity_T">
<field_group operation="instant"
freq_op="1ts">
<field field_ref="area" operation="once" />
</field_group>
</field_group>
Using the XIOS library in client-server mode
Running with XIOS in client-server mode requires the following:
- The client-server mode should be activated (in file iodef.xml):
<variable id="using_server" type="bool">true</variable>
- The xios_server.exe executable should be present alongside the GCM executable gcm_***.e, and they should be run together in MPMD (Multiple Programs, Multiple Data) mode, with some of the MPI processes allocated to the GCM and the others to XIOS. In practice far fewer processes are needed by XIOS than by the GCM, although this also depends on the amount of output and post-processing computations (e.g. temporal averaging and grid interpolations) that XIOS will have to do. For example, if the MPI execution wrapper is mpirun and 26 processes are to be used by the GCM gcm_64x52x20_phystd_para.e and 2 by XIOS (i.e. 28 processes overall):
mpirun -np 26 gcm_64x52x20_phystd_para.e > gcm.out 2>&1 : -np 2 xios_server.exe