WhatIs: The install_lmdz.sh script
Current revision as of 11 July 2024, 10:31
The install_lmdz.sh script is a Bash script that aims to be an "installer" for LMDZ on Linux machines.
In practice it runs a succession of mandatory steps to install and run the model from scratch, namely:
- Download and install required libraries (NetCDF, IOIPSL and possibly XIOS)
- Download the source code (LMDZ and ORCHIDEE), compile it and run a test (bench) simulation
It has many options and features; a good starting point (short of reading and digesting the Bash script itself) is to run install_lmdz.sh -h to learn about these, which should yield something like:
 install_lmdz.sh
 ./install_lmdz.sh [ -v version ] [ -r svn_release ]
     [ -parallel PARA ] [ -d GRID_RESOLUTION ] [ -bench 0/1 ]
     [-name LOCAL_MODEL_NAME] [-gprof] [-opt_makelmdz] [-rad RADIATIF]

     -v "version" like 20150828.trunk
        see http://www.lmd.jussieu.fr/~lmdz/Distrib/LISMOI.trunk
     -r "svn_release" : either the svn release number or "last"
     -compiler gfortran|ifort|pgf90 (default: gfortran)
     -parallel PARA : can be mpi_omp (mpi with openMP) or none (for sequential)
     -d GRID_RESOLUTION should be among the available benchs if -bench 1
        among which : 48x36x19, 48x36x39
        if wanting to run a bench simulation in addition to compilation
        default : 48x36x19
     -bench activating the bench or not (0/1). Default 1
     -name LOCAL_MODEL_NAME : default = LMDZversion.release
     -netcdf PATH : full path to an existing installed NetCDF library
        (without -netcdf: also download and install the NetCDF library)
     -xios also download and compile the XIOS library
        (requires the NetCDF4-HDF5 library, also installed by default)
        (requires to also have -parallel mpi_omp)
     -gprof to compile with -pg to enable profiling with gprof
     -cosp to run without our with cospv1 or cospv2 [none/v1/v2]
     -rad RADIATIF can be old, rrtm or ecrad radiatif code
     -nofcm to compile without fcm
     -SCM install 1D version automatically
     -debug compile everything in debug mode
     -opt_makelmdz to call makelmdz or makelmdz_fcm with additional options
     -physiq to choose which physics package to use
     -env_file specify an arch.env file to overwrite the existing one
     -veget surface model to run [NONE/CMIP6/xxxx]
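To make the option list above more concrete, here is a minimal sketch that assembles a command line from a few of the documented flags and prints it rather than running it. The helper name build_install_cmd is hypothetical and not part of install_lmdz.sh. The only constraint it encodes is the one stated in the help text: -xios requires -parallel mpi_omp.

```shell
#!/usr/bin/env bash
# Sketch only: build (and print) an install_lmdz.sh command line using
# flags taken from the help text. Nothing is downloaded or compiled here.

# Hypothetical helper, not provided by install_lmdz.sh.
#   $1: bench grid resolution, e.g. 48x36x19 (the documented default)
#   $2: 1 to request XIOS, 0 otherwise
build_install_cmd() {
    resolution="$1"
    with_xios="$2"

    cmd="./install_lmdz.sh -d $resolution -bench 1"
    if [ "$with_xios" = "1" ]; then
        # Per the help text, -xios also requires -parallel mpi_omp.
        cmd="$cmd -parallel mpi_omp -xios"
    fi
    printf '%s\n' "$cmd"
}

build_install_cmd 48x36x19 1
```

Printing the command first makes it easy to check the flag combination before launching a full download, compilation and bench run.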
Note that the install_lmdz.sh script is not the only way to install LMDZ; one can manually do the various key steps (quite possibly adapted to the specificities of the machine that is used).
02/12/2021
New install_lmdz version (Amaury - 06/2024)
Known bugs / issues
ORCHIDEE bench
Right now (06/24), only the CMIP6/orch2.0 ORCHIDEE version has a proper bench. As a result, other versions (orch2.2, orch4/trunk) run in a bench that doesn't activate ORCHIDEE.
Todo: make a proper orch2.2 and orch4 bench (with/without xios)
Debug mode crashes
When activating -debug with newer versions of NetCDF, a segfault is raised on lines such as CALL err(NF90_OPEN(var,NF90_NOWRITE,fID),"open",var). It seems that the segfault originates from the NF90_OPEN function itself.
Todo: check with even newer versions of netcdf - but then we face the other documented netcdf bug...
ORCHIDEE 4/TRUNK compilation fails
ORCHIDEE 4/trunk can't be compiled with gfortran for now. This issue has been raised to the ORCHIDEE team (06/24).
Todo: update when orch fixes their code
Error -105 in closing file
With -parallel mpi_omp -veget orch2.0, at the end of a simulation, the following appears in listing:
 WARNING FROM ROUTINE restclo
  --> Error -105 in closing file : ***
  --> sechiba_rest_out.nc
  -->
This "error" doesn't seem to actually affect the outputs, and should be ignored; yet, its cause hasn't been investigated.
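When scanning a run for real problems, it can help to count this known warning separately. The sketch below is only an illustration: the helper name count_restclo_105 is hypothetical, and the file name in the usage comment assumes the output file is literally called listing, as the text above suggests.

```shell
#!/usr/bin/env bash
# Sketch only: count occurrences of the known restclo "Error -105" warning
# in a model output file.

# Hypothetical helper.
#   $1: path to the output file to scan
count_restclo_105() {
    # grep -c prints the number of matching lines; it exits non-zero when
    # nothing matches, so "|| true" keeps the function's status at 0.
    grep -c 'Error -105 in closing file' "$1" || true
}

# Usage (assumed file name):
#   count_restclo_105 listing
```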
COSP + orch + xios segfault
With -veget orch2.0 -xios -cosp v1 (or v2), we encounter a segfault in the bench.
Unfortunately, because of the netcdf debug bug (see above), we can't get to the actual trace.
XIOS compilation failure
XIOS 2.6 r2568 compilation fails randomly when compiled in parallel (make_xios --job N). The error output:
 mpif90 -o generic_testcase.exe /home/abarral/PycharmProjects/installLMDZ/script_install/LMDZ/modipsl/modeles/XIOS/obj/generic_testcase.o -L/home/abarral/PycharmProjects/installLMDZ/script_install/LMDZ/modipsl/modeles/XIOS/lib -l__fcm__generic_testcase -lxios -lstdc++ -L/home/abarral/Programs/zlib_mpi_hdf5_ncdf_cdo/netcdf-c-4.9.2-fortran-4.5.4-cxx-4.3.1/lib -lnetcdff -L//home/abarral/Programs/zlib_mpi_hdf5_ncdf_cdo/netcdf-c-4.9.2-fortran-4.5.4-cxx-4.3.1/lib -lnetcdf -lnetcdf -lm -Wl,-rpath=/home/abarral/Programs/zlib_mpi_hdf5_ncdf_cdo/openmpi-4.1.6/bin/../lib:/home/abarral/Programs/zlib_mpi_hdf5_ncdf_cdo/netcdf-c-4.9.2-fortran-4.5.4-cxx-4.3.1/lib -lstdc++
 /usr/bin/ld: cannot find -lxios: No such file or directory
 /usr/bin/ld: cannot find -lxios: No such file or directory
 collect2: error: ld returned 1 exit status
 fcm_internal load failed (256)
seems to indicate a race condition in the fcm process.
I'm not sure if this problem has been reported to the XIOS team. We also haven't tested if it's present in newer XIOS versions.
Solution: Restart the compilation.
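Since the failure is intermittent, the restart can be automated. The following is a minimal sketch of a generic retry wrapper. The retry function is hypothetical and not part of install_lmdz.sh or XIOS, and the number of attempts and the job count in the usage comment are arbitrary placeholders.

```shell
#!/usr/bin/env bash
# Sketch only: rerun a command until it succeeds or a maximum number of
# attempts is reached.

# Hypothetical helper.
#   $1: maximum number of attempts
#   remaining arguments: the command to run
retry() {
    max_tries="$1"
    shift
    attempt=1
    while [ "$attempt" -le "$max_tries" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $attempt/$max_tries failed: $*" >&2
        attempt=$((attempt + 1))
    done
    return 1
}

# Usage (placeholder values; other make_xios arguments omitted):
#   retry 3 ./make_xios --job 8
```

Wrap whichever command actually triggers the XIOS build in your setup. A bounded number of attempts avoids looping forever if the failure turns out not to be the race condition described above.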