Tool Box
Contents
Pre-processing Tools
newstart: a Fortran program to modify start files
Newstart is an interactive tool to modify the start files (start.nc and startfi.nc).
To use it, newstart must first be compiled in the LMDZ.COMMON directory with the following command line:
./makelmdz_fcm -arch my_arch_file -p std -d 64x48x30 newstart
In this example, my_arch_file is the name of the arch file (see arch) and 64x48x30 is the resolution of the physical grid. Then copy the executable from the LMDZ.COMMON/bin directory to your bench directory.
When you execute newstart, you can use either a start2archive file or the start files (start.nc and startfi.nc) as input. The interactive interface will then propose to modify several physical quantities, such as the gravity, the surface pressure or the rotation of the planet. At the end of the procedure, two files are created: restart.nc and restartfi.nc. They can be renamed and used as start files to initialize a new simulation.
start2archive
The start2archive tool is similar to newstart in the sense that it can also be used to modify the start files. But start2archive can modify the resolution of the physical grid, the topography and the surface thermal inertia, while newstart cannot. It is also useful to create an archive of different starting states, which can later be extracted as start files. The command line to compile start2archive is similar to the one used for newstart:
./makelmdz_fcm -arch my_arch_file -p std -d 64x48x30 start2archive
To modify the resolution, you should first create a start_archive file (using start2archive) at the current resolution, then run newstart compiled at the new resolution. Newstart will interpolate all the physical quantities onto the new grid.
other third party scripts and tools
TO BE COMPLETED
Post-processing tools
zrecast
With this program you can recast atmospheric data (i.e. 4-dimensional longitude-latitude-altitude-time fields) from GCM outputs (e.g. as given in diagfi.nc files) onto either pressure or altitude-above-areoid vertical coordinates. Since integrating the hydrostatic equation is required to recast the data, the input file must contain the surface pressure and atmospheric temperature, as well as the ground geopotential. If recasting data onto pressure coordinates, the output file name is the input file name with _P.nc appended. If recasting data onto altitude-above-areoid coordinates, _A.nc is appended instead.
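Zrecast itself is a Fortran program, but the core operation it performs per atmospheric column can be illustrated in a few lines of Python. The sketch below is not the actual zrecast code; the function name and all numerical values are made up for illustration. It interpolates one column of data from model levels onto fixed pressure levels:

```python
import numpy as np

def recast_to_pressure(field, p_col, p_target):
    """Interpolate one atmospheric column from model levels onto fixed
    pressure levels (the idea behind zrecast's pressure-coordinate mode).

    field    : 1-D array of values on model levels
    p_col    : 1-D array of pressures on the same levels (same order)
    p_target : 1-D array of target pressure levels
    """
    # np.interp needs an increasing x coordinate, so sort by pressure first
    order = np.argsort(p_col)
    return np.interp(p_target, p_col[order], field[order],
                     left=np.nan, right=np.nan)

# Hypothetical example: a 5-level temperature column
p_col = np.array([900.0, 700.0, 500.0, 300.0, 100.0])  # pressure, decreasing with height
temp  = np.array([280.0, 270.0, 255.0, 230.0, 210.0])  # temperature on the same levels
p_new = np.array([800.0, 400.0, 200.0])
print(recast_to_pressure(temp, p_col, p_new))  # linear interpolation gives 275.0, 242.5, 220.0
```

The real tool also integrates the hydrostatic equation to build the pressure field from surface pressure and temperature, which is why those variables must be present in the input file.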
mass stream function
The mass stream function (and the total angular momentum) can be computed from a diagfi.nc or a stats.nc, using the streamfunction.F90 script. The script is located at
trunk/LMDZ.GENERIC/utilities
To compile the script, open the compile file in the same directory and do the following:
- Replace "pgf90" with your favorite Fortran compiler.
- Replace "/distrib/local/netcdf/pgi_7.1-6_32/lib" with the path of the directory that contains your NetCDF library (file libnetcdf.a).
- Replace "/distrib/local/netcdf/pgi_7.1-6_32/include" with the path of the directory that contains the NetCDF include file (netcdf.inc).
- You may adjust the compilation options, but this is not mandatory.
Once the script is compiled, copy it in the same directory as your .nc file and run
./streamfunction.e
The script will ask you for the name of your .nc file, then run and produce a new nameofyourfile_stream.nc file.
Be careful: in this new file, all fields are temporally and zonally averaged.
If you want to use Python instead of Fortran, you can take a look at this repo. It hosts a tool to perform dynamical analysis of GCM simulations (and therefore computes the mass stream function, among many other diagnostics), but it is tailored for Dynamico only. This repo also takes care of recasting (it does the job of both zrecast.F90 and streamfunction.F90).
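For reference, the quantity computed here is the mass stream function ψ(φ, p) = (2πa cos φ / g) ∫ [v] dp, i.e. the pressure integral of the zonal-mean meridional wind. The Python sketch below is not the streamfunction.F90 code itself: the grid, wind field, and the Mars-like values of radius and gravity are all made up for illustration.

```python
import numpy as np

# Made-up grid and constants (Mars-like radius and gravity, purely as an example)
a, g = 3.3895e6, 3.72                            # planetary radius (m), gravity (m/s^2)
lat = np.linspace(-87.0, 87.0, 48)               # latitudes (degrees)
p = np.array([600.0, 400.0, 200.0, 50.0, 10.0])  # pressure levels (Pa), bottom to top
vm = np.ones((p.size, lat.size))                 # dummy zonal/time-mean meridional wind (m/s)

# psi(p, lat) = 2*pi*a*cos(lat)/g * integral of [v] dp from the model top down to level p
psi = np.zeros_like(vm)
for k in range(p.size - 2, -1, -1):              # trapezoidal integration, top -> bottom
    psi[k] = psi[k + 1] + 0.5 * (vm[k] + vm[k + 1]) * (p[k] - p[k + 1])
psi *= 2.0 * np.pi * a * np.cos(np.deg2rad(lat)) / g
```

With real data, vm would be the temporally and zonally averaged meridional wind from a diagfi.nc or stats.nc file, and psi would be in kg/s.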
Continuing Simulations
manually
At the end of a simulation, the model generates restart files (files 'restart.nc' and 'restartfi.nc') which contain the final state of the model. The 'restart.nc' and 'restartfi.nc' files have the same format as the 'start.nc' and 'startfi.nc' files, respectively.
These files can in fact be used as initial states to continue the simulation, using the following renaming command lines:
mv restart.nc start.nc
mv restartfi.nc startfi.nc
Running a simulation with these start files will in fact resume the simulation from where the previous run ended.
with bash scripts
We have set up very simple bash scripts to automate the launching of chained simulations. Here is an example of a bash script that does the job:
#!/bin/bash
###########################################################################
# Script to perform several chained LMD Mars GCM simulations
# SET HERE the maximum total number of simulations
nummax=100
###########################################################################
echo "---------------------------------------------------------"
echo "STARTING LOOP RUN"
echo "---------------------------------------------------------"
dir=`pwd`
machine=`hostname`
address=`whoami`

# Look for file "num_run" which should contain
# the value of the previously computed season
# (defaults to 0 if file "num_run" does not exist)
if [[ -r num_run ]] ; then
  echo "found file num_run"
  numold=`cat num_run`
else
  numold=0
fi
echo "numold is set to" ${numold}

# Set value of current season
(( numnew = ${numold} + 1 ))
echo "numnew is set to" ${numnew}

# Look for initialization data files (exit if none found)
if [[ ( -r start${numold}.nc && -r startfi${numold}.nc ) ]] ; then
  \cp -f start${numold}.nc start.nc
  \cp -f startfi${numold}.nc startfi.nc
else
  if (( ${numold} == 99999 )) ; then
    echo "No run because previous run crashed ! (99999 in num_run)"
    exit
  else
    echo "Where is file start"${numold}".nc??"
    exit
  fi
fi

# Run GCM -- THIS LINE NEEDS TO BE MODIFIED WITH THE CORRECT GCM EXECUTION COMMAND
mpirun -np 8 gcm_64x48x26_phystd_para.e < diagfi.def > lrun${numnew}

# Check if run ended normally and copy datafiles
if [[ ( -r restartfi.nc && -r restart.nc ) ]] ; then
  echo "Run seems to have ended normally"
  \mv -f restart.nc start${numnew}.nc
  \mv -f restartfi.nc startfi${numnew}.nc
else
  if [[ -r num_run ]] ; then
    \mv -f num_run num_run.crash
  else
    echo "No file num_run to build num_run.crash from !!"
    # Impose a default value of 0 for num_run
    echo 0 > num_run.crash
  fi
  echo 99999 > num_run
  exit
fi

# Copy other datafiles that may have been generated
if [[ -r diagfi.nc ]] ; then
  \mv -f diagfi.nc diagfi${numnew}.nc
fi
if [[ -r diagsoil.nc ]] ; then
  \mv -f diagsoil.nc diagsoil${numnew}.nc
fi
if [[ -r stats.nc ]] ; then
  \mv -f stats.nc stats${numnew}.nc
fi
if [[ -f profiles.dat ]] ; then
  \mv -f profiles.dat profiles${numnew}.dat
  \mv -f profiles.hdr profiles${numnew}.hdr
fi

# Prepare things for upcoming runs by writing
# value of computed season in file num_run
echo ${numnew} > num_run

# If we are over nummax : stop
if (( $numnew + 1 > $nummax )) ; then
  exit
else
  \cp -f run_gnome exe_mars
  ./exe_mars
fi
Summary of what this bash script does:
- It reads the file 'num_run', which contains the step number of the simulation.
If num_run contains
5
then the script expects to read start5.nc and startfi5.nc.
- It copies start5.nc and startfi5.nc to start.nc and startfi.nc, respectively.
- It runs the GCM.
- It renames restart.nc and restartfi.nc into start6.nc and startfi6.nc.
- It rewrites num_run as follows:
6
- It restarts the loop until num_run reaches the value defined in nummax:
100
Processing Output Files with NCOs
NCO (netCDF Operators) is a set of powerful command-line utilities, available on Linux, macOS and Windows, that allow you to perform useful (and very fast!) post-processing operations on netCDF GCM output files. Full documentation can be found on http://research.jisao.washington.edu/data_sets/nco/, but we provide below a few examples of command lines.
- How to calculate a time mean of a netCDF 'diagfi.nc' file
ncra -F -d Time,1,,1 diagfi.nc diagfi_MEAN.nc # format is "-d dimension,minimum,maximum,stride"
- Subsetting time in a netCDF 'diagfi.nc' file
ncea -F -d Time,first,last diagfi.nc diagfi_subset.nc # format is "-d dimension,minimum,maximum" ; we recall that you can type "ncdump -v Time diagfi.nc" to see the Time values in the netCDF file.
- Decimating a netCDF 'diagfi.nc' file in time
ncks -F -d Time,1,,8 diagfi.nc diagfi_decimated.nc # format is "-d dimension,minimum,maximum,stride" ; in this example, data is extracted once every 8 time steps, starting from the first time step (number 1) and ending at the last one.
- Extract a variable from a netCDF 'diagfi.nc' file
ncks -v tsurf,temp,p diagfi.nc diagfi_out.nc # Here we created a new file named 'diagfi_out.nc' in which we only kept variables named 'tsurf' (surface temperatures), 'temp' (atmospheric temperatures) and p (atmospheric pressures).
Again, more examples can be found on http://research.jisao.washington.edu/data_sets/nco/ .
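The same kinds of operations can also be reproduced in Python with the xarray library presented further down this page. Below is a sketch on a small synthetic dataset standing in for diagfi.nc (the variable names and values are made up for the example):

```python
import numpy as np
import xarray as xr

# Small synthetic dataset standing in for diagfi.nc (hypothetical values)
ds = xr.Dataset(
    {"tsurf": (("Time", "latitude"), np.arange(8.0).reshape(4, 2))},
    coords={"Time": np.arange(4.0), "latitude": [-45.0, 45.0]},
)

# Equivalent of the 'ncra' time mean
ds_mean = ds.mean(dim="Time")

# Equivalent of the 'ncks' time decimation (here, every 2nd record)
ds_decim = ds.isel(Time=slice(0, None, 2))

# Equivalent of the 'ncks -v' variable extraction
ds_sub = ds[["tsurf"]]
```

With real data you would replace the synthetic Dataset by xr.open_dataset('diagfi.nc', decode_times=False) and save any result with .to_netcdf().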
Data Handling and Visualization Software
There are several data handling and visualization tools that can be used to analyse and plot the results from GCM simulations (using the diagfi.nc NetCDF files). We provide below an overview of the most widely used solutions.
panoply
Panoply is a user-friendly tool for viewing raw NetCDF data, available here: https://www.giss.nasa.gov/tools/panoply/ . It is very convenient to make pretty visuals (see an example for the exoplanet TRAPPIST-1e). There are many options that can be used (map projections, masks, colorbars, shadows, etc.) to make your plots fancy. However, the tool is not very well suited for manipulating data (compute averages, statistics, etc.).
ncview
ncview is another useful, user-friendly tool for viewing raw NetCDF data. It is a rather archaic alternative to Panoply, but it is convenient because it allows you to have a very quick first look at netCDF data files.
Command line tool to visualize NetCDF data:
- Installation on Linux:
sudo apt install ncview
- Run on Linux:
ncview diagfi.nc
python scripts
Python scripts provide a very useful means to analyse and visualize netCDF files.
You can use the netCDF4 Python library to open a netCDF file and put the data in arrays that can then be manipulated and plotted.
Here is an example of how to open and read a netCDF file with Python:
import numpy
from netCDF4 import Dataset

# HERE WE OPEN THE NETCDF FILE
nc = Dataset('diagfi.nc')

# HERE WE READ THE VARIABLES (1D OUTPUT)
Time=nc.variables['Time'][:]
lat=nc.variables['latitude'][:]
lon=nc.variables['longitude'][:]
al=nc.variables['altitude'][:]

# HERE WE READ THE AREA (2D OUTPUT)
aire_GCM=nc.variables['aire'][:]

# HERE WE READ 3D OUTPUTS
tsurf=nc.variables['tsurf'][:] # this is the surface temperature 3D field (time, latitude, longitude)

# HERE WE READ 4D OUTPUTS
temp=nc.variables['temp'][:] # this is the atmospheric temperature 4D field (time, latitude, longitude, altitude)
And here is an example of how to manipulate the netCDF data (here, to compute the time-averaged surface temperature):
import numpy as np

mean_tsurf = np.zeros((len(lat), len(lon)), dtype='f')

for i in range(len(Time)):
    for j in range(len(lat)):
        for k in range(len(lon)):
            mean_tsurf[j, k] = mean_tsurf[j, k] + tsurf[i, j, k] * (1. / len(Time))
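The triple loop above is easy to read but slow on large files; NumPy computes the same average in a single call over the time axis. A small self-contained check (using a synthetic tsurf array, not data read from a diagfi.nc file):

```python
import numpy as np

# Synthetic (time, lat, lon) array standing in for tsurf
tsurf = np.arange(24.0).reshape(4, 3, 2)
ntime, nlat, nlon = tsurf.shape

# Loop version, as in the snippet above
mean_loop = np.zeros((nlat, nlon))
for i in range(ntime):
    mean_loop += tsurf[i] / ntime

# Vectorized equivalent: average over the time axis
mean_vec = np.mean(tsurf, axis=0)

print(np.allclose(mean_loop, mean_vec))  # True
```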
And here is a last example of how to plot the data (using matplotlib):
import matplotlib.pyplot as plt

plt.figure(1)
plt.contourf(lon, lat, mean_tsurf)
plt.colorbar(label='Surface Temperature (K)')
plt.xlabel('Longitude ($^{\circ}$)')
plt.ylabel('Latitude ($^{\circ}$)')
plt.show()
Another useful library for dealing with netCDF files is xarray. We provide a code snippet below doing the same thing as the snippets above.
import numpy as np
import xarray as xr
import matplotlib.pyplot as plt

# HERE WE OPEN THE NETCDF FILE
data = xr.open_dataset('diagfi.nc', decode_times=False)

# HERE WE READ THE VARIABLES (1D OUTPUT)
Time=data['Time']
lat=data['latitude']
lon=data['longitude']
al=data['altitude']

# HERE WE READ THE AREA (2D OUTPUT)
aire_GCM=data['aire']

# HERE WE READ 3D OUTPUTS
tsurf=data['tsurf'] # this is the surface temperature 3D field (time, latitude, longitude)

# HERE WE READ 4D OUTPUTS
temp=data['temp'] # this is the atmospheric temperature 4D field (time, latitude, longitude, altitude)

## Let's take the time-averaged surface temperature
mean_tsurf = np.mean(tsurf, axis=0)

## Let's plot a lon-lat map
fig = plt.figure()
plt.contourf(lon, lat, mean_tsurf)
plt.colorbar(label='Surface Temperature (K)')
plt.xlabel('Longitude ($^{\circ}$)')
plt.ylabel('Latitude ($^{\circ}$)')
plt.show()
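One refinement worth knowing: when averaging over the sphere, grid cells should be weighted by their area (the 'aire' variable read above), which xarray's .weighted() method handles directly. A sketch with made-up values standing in for data read from diagfi.nc:

```python
import numpy as np
import xarray as xr

# Synthetic stand-ins for fields read from diagfi.nc (hypothetical values)
lat = xr.DataArray([-45.0, 45.0], dims="latitude")
tsurf = xr.DataArray([[200.0, 300.0]], dims=("Time", "latitude"))
aire = xr.DataArray([1.0, 3.0], dims="latitude")  # cell areas (m^2), hypothetical

# Global mean surface temperature, weighted by cell area
t_global = tsurf.weighted(aire).mean(dim=("Time", "latitude"))
print(float(t_global))  # (200*1 + 300*3) / 4 = 275.0
```

An unweighted mean would give 250.0 here, which illustrates why area weighting matters on a non-uniform grid.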
For more examples of how to use xarray, take a look at the documentation. Here is another example of how one can use xarray with multiple netCDF files.
import xarray as xr
import os

# your folder where output files are stored
FOLDER = './your_folder_with_output_files/'

# list the files in your FOLDER
list_files_folder = os.listdir(FOLDER)

# If there are several files,
# sort your simulation files by date,
# so that the beginning of the simulation is at the top of the list
# and the end of the simulation is at the end of the list.
list_files_folder.sort()

files = [FOLDER + str(f) for f in list_files_folder]
# if you want to keep only files of special_year you can add this option:
# files = [FOLDER + str(f) for f in list_files_folder if f.startswith("special_year")]

# xarray will magically concatenate your output files by 'Time' (or any other 'concat_dim' you want)
nc = xr.open_mfdataset(files, decode_times=False, concat_dim='Time', combine='nested')

# to check your keys
for key in nc.keys():
    print(key)

# to load keys (example here with keys for a mesoscale simulation)
Times = nc['Times'][:]
PTOT = nc['PTOT'][:]
T = nc['T'][:]
W = nc['W'][:]

# you can use some functions to make averages etc.
T_moy = T.mean(dim=['Time', 'south_north', 'west_east'])

# other functions:
# .cumsum (cumulative sum)
# .rename (change the name of the object)
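The two functions mentioned at the end of the snippet can be illustrated on a tiny synthetic DataArray (hypothetical values, not mesoscale output):

```python
import numpy as np
import xarray as xr

# Tiny synthetic DataArray standing in for a loaded variable
T = xr.DataArray(np.array([[1.0, 2.0], [3.0, 4.0]]),
                 dims=("Time", "latitude"), name="T")

# .mean over a chosen dimension, as in the snippet above
T_moy = T.mean(dim="Time")

# .cumsum: cumulative sum along a dimension
T_cum = T.cumsum(dim="Time")

# .rename: change the name of the object
T2 = T.rename("temperature")
```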
We provide a tutorial on how to make pretty visuals using Generic PCM 3-D simulations here.
Planetoplot
Planetoplot is an in-house, Python-based library developed to visualize Generic PCM data.
The code and documentation can be found at: https://nbviewer.org/github/aymeric-spiga/planetoplot/blob/master/tutorial/planetoplot_tutorial.ipynb