Tool Box

Revision as of 13:16, 20 July 2022 by Martin.turbet (talk | contribs) (python scripts)

Pre-processing Tools

newstart: a fortran program to modify start files

Newstart is an interactive tool to modify the start files (start.nc and startfi.nc).

Before it can be used, newstart must be compiled in the LMDZ.COMMON directory with the following command line:

./makelmdz_fcm -arch my_arch_file -p std -d 64x48x30 newstart

In this example, my_arch_file is the name of the arch file (see arch) and 64x48x30 is the resolution of the physical grid. Then copy the executable from the LMDZ.COMMON/bin directory to your bench directory.

When you execute newstart, you can use either a file produced by start2archive or the start files (start.nc and startfi.nc). The interactive interface then offers to modify several physical quantities, such as the gravity, the surface pressure or the rotation of the planet. At the end of the procedure, two files are created: restart.nc and restartfi.nc. They can be renamed and used as start files to initialize a new simulation.

start2archive

The start2archive tool is similar to newstart in the sense that it can be used to modify the start files. However, start2archive makes it possible to change the resolution of the physical grid, the topography and the surface thermal inertia, which newstart alone cannot do. It is also useful for building an archive of different starting states, which can later be extracted as start files. The command line to compile start2archive is similar to the one used for newstart:

./makelmdz_fcm -arch my_arch_file -p std -d 64x48x30 start2archive

To modify the resolution, first use start2archive to create a start_archive file at the current resolution, then compile newstart at the new resolution and run it on that archive. Newstart will interpolate all the physical quantities onto the new grid.

other third party scripts and tools

TO BE COMPLETED

Post-processing tools

zrecast

With this program you can recast atmospheric (i.e. 4-dimensional longitude-latitude-altitude-time) data from GCM outputs (e.g. as given in diagfi.nc files) onto either pressure or altitude above areoid vertical coordinates. Since integrating the hydrostatic equation is required to recast the data, the input file must contain surface pressure and atmospheric temperature, as well as the ground geopotential. If recasting data onto pressure coordinates, the output file name is the input file name with _P.nc appended. If recasting data onto altitude above areoid coordinates, _A.nc is appended instead.
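To illustrate why these three input fields are needed, here is a minimal sketch of a hydrostatic integration for a single atmospheric column. It is not the algorithm implemented in zrecast: it assumes an ideal gas with a constant specific gas constant and a constant gravity, and the function name, argument names and numerical values are all hypothetical.

```python
import math

def altitudes_from_pressure(ps, phis, pressures, temperatures, r_gas, g):
    """Integrate dz = -(R T / g) d(ln p) upward from the ground.

    ps           : surface pressure (Pa)
    phis         : ground geopotential (m2 s-2)
    pressures    : level pressures, decreasing upward (Pa)
    temperatures : temperatures at the same levels (K)
    r_gas, g     : specific gas constant and gravity (assumed constant)
    Returns the altitude (m) of each level above the reference surface.
    """
    z = phis / g                 # altitude of the ground itself
    p_below = ps
    t_below = temperatures[0]    # assumption: near-surface air is at the first-level temperature
    altitudes = []
    for p, t in zip(pressures, temperatures):
        t_mean = 0.5 * (t + t_below)            # mean layer temperature
        z += (r_gas * t_mean / g) * math.log(p_below / p)
        altitudes.append(z)
        p_below, t_below = p, t
    return altitudes

# Illustrative isothermal column with Mars-like constants (assumed values)
R_GAS = 190.0   # J kg-1 K-1
G = 3.72        # m s-2
levels = [610.0 * math.exp(-k) for k in (1, 2, 3)]
z = altitudes_from_pressure(610.0, 0.0, levels, [200.0] * 3, R_GAS, G)
```

In this isothermal case, each level sits one scale height (R T / g) above the previous one, which gives a quick sanity check of the integration.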

mass stream function

The mass stream function (and the total angular momentum) can be computed from a diagfi.nc or a stats.nc, using the streamfunction.F90 script. The script is located at

trunk/LMDZ.GENERIC/utilities

To compile the script, open the compile file in the same directory and do the following:

  • Replace "pgf90" with your favorite Fortran compiler.
  • Replace "/distrib/local/netcdf/pgi_7.1-6_32/lib" with the path of the directory that contains your NetCDF library (file libnetcdf.a).
  • Replace "/distrib/local/netcdf/pgi_7.1-6_32/include" with the path of the directory that contains the NetCDF include file (netcdf.inc).
  • You can adjust the compiling options, but it is not mandatory.

Once the script is compiled, copy the executable into the same directory as your .nc file and run

./streamfunction.e

The script will ask you for the name of your .nc file, and will run and produce a new nameofyourfile_stream.nc file.

Be careful: in this new file, all fields are temporally and zonally averaged.
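For reference, the mass stream function is commonly defined as psi(lat, p) = (2 pi a cos(lat) / g) times the integral of the time- and zonal-mean meridional wind from the top of the atmosphere down to pressure p. The sketch below evaluates that textbook definition for one latitude with a trapezoidal rule. It is only an illustration: the sign convention, the integration direction and the function name are assumptions, and streamfunction.F90 may proceed differently.

```python
import math

def mass_stream_function(v_zonal_mean, pressures, lat_deg, radius, g):
    """Mass stream function (kg/s) at each pressure level for one latitude.

    v_zonal_mean : time- and zonal-mean meridional wind (m/s), ordered from
                   the top of the atmosphere downward
    pressures    : matching pressures (Pa), increasing downward
    lat_deg      : latitude (degrees)
    radius, g    : planetary radius (m) and gravity (m s-2)
    """
    factor = 2.0 * math.pi * radius * math.cos(math.radians(lat_deg)) / g
    psi = []
    integral = 0.0
    # Assumption: the wind between p = 0 and the first level equals the first-level wind
    p_prev, v_prev = 0.0, v_zonal_mean[0]
    for p, v in zip(pressures, v_zonal_mean):
        integral += 0.5 * (v + v_prev) * (p - p_prev)   # trapezoidal rule
        psi.append(factor * integral)
        p_prev, v_prev = p, v
    return psi

# Illustrative check: uniform 1 m/s wind at the equator of a Mars-sized planet
psi = mass_stream_function([1.0, 1.0, 1.0], [100.0, 200.0, 300.0],
                           lat_deg=0.0, radius=3.3895e6, g=3.72)
```

With real model output, the input wind profile would first be averaged over time and longitude, consistent with the averaging noted above.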

If you want to use Python instead of Fortran, you can take a look at this repo. It hosts a tool to perform dynamical analysis of GCM simulations (it computes the mass stream function and many other diagnostics), but it is tailored for Dynamico only. This repo also takes care of recasting (it does the job of both zrecast.F90 and streamfunction.F90).

Continuing Simulations

manually

At the end of a simulation, the model generates restart files (files 'restart.nc' and 'restartfi.nc') which contain the final state of the model. The 'restart.nc' and 'restartfi.nc' files have the same format as the 'start.nc' and 'startfi.nc' files, respectively.

These files can in fact be used as initial states to continue the simulation, using the following renaming command lines:

mv restart.nc start.nc
mv restartfi.nc startfi.nc

Running a simulation with these start files will in fact resume the simulation from where the previous run ended.
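If you prefer to do this step from Python, the renaming can be sketched as follows. The function name and the run_dir argument are hypothetical; the only facts taken from this page are the four file names.

```python
import os

def promote_restart_files(run_dir="."):
    """Rename restart.nc/restartfi.nc to start.nc/startfi.nc in run_dir."""
    pairs = [("restart.nc", "start.nc"), ("restartfi.nc", "startfi.nc")]
    # Check both files first so that a failed run is never half-renamed
    missing = [src for src, _ in pairs
               if not os.path.isfile(os.path.join(run_dir, src))]
    if missing:
        raise FileNotFoundError("missing restart file(s): " + ", ".join(missing))
    for src, dst in pairs:
        # os.replace overwrites an existing destination, like 'mv -f'
        os.replace(os.path.join(run_dir, src), os.path.join(run_dir, dst))
```

Checking for both restart files before renaming anything avoids mixing a new start.nc with an old startfi.nc when the previous run did not finish properly.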

with bash scripts

We have set up very simple bash scripts to automate the launching of chained simulations. Here is an example of a bash script that does the job:

#!/bin/bash
###########################################################################
# Script to perform several chained LMD Mars GCM simulations
# SET HERE the maximum total number of simulations

nummax=100

###########################################################################


echo "---------------------------------------------------------"
echo "STARTING LOOP RUN"
echo "---------------------------------------------------------"

dir=`pwd`
machine=`hostname`
address=`whoami`

# Look for file "num_run" which should contain 
# the value of the previously computed season
# (defaults to 0 if file "num_run" does not exist)
if [[ -r num_run ]] ; then
  echo "found file num_run"
  numold=`cat num_run`
else
  numold=0
fi
echo "numold is set to" ${numold}


# Set value of current season 
(( numnew = ${numold} + 1 ))
echo "numnew is set to" ${numnew}

# Look for initialization data files (exit if none found)
if [[ ( -r start${numold}.nc  &&  -r startfi${numold}.nc ) ]] ; then
   \cp -f start${numold}.nc start.nc
   \cp -f startfi${numold}.nc startfi.nc
else
   if (( ${numold} == 99999 )) ; then
    echo "No run because previous run crashed ! (99999 in num_run)"
    exit
   else
   echo "Where is file start"${numold}".nc??"
   exit
   fi
fi

# Run GCM -- THIS LINE NEEDS TO BE MODIFIED WITH THE CORRECT GCM EXECUTION COMMAND
mpirun -np 8 gcm_64x48x26_phystd_para.e < diagfi.def > lrun${numnew}


# Check if run ended normally and copy datafiles
if [[ ( -r restartfi.nc  &&  -r restart.nc ) ]] ; then
  echo "Run seems to have ended normally"


  \mv -f restart.nc start${numnew}.nc
  \mv -f restartfi.nc startfi${numnew}.nc  
    
else
  if [[ -r num_run ]] ; then
    \mv -f num_run num_run.crash
  else
    echo "No file num_run to build num_run.crash from !!"
    # Impose a default value of 0 for num_run
    echo 0 > num_run.crash
  fi
 echo 99999 > num_run
 exit
fi

# Copy other datafiles that may have been generated
if [[ -r diagfi.nc ]] ; then
  \mv -f diagfi.nc diagfi${numnew}.nc
fi
if [[ -r diagsoil.nc ]] ; then
  \mv -f diagsoil.nc diagsoil${numnew}.nc
fi
if [[ -r stats.nc ]] ; then
  \mv -f stats.nc stats${numnew}.nc
fi
if [[ -f profiles.dat ]] ; then
  \mv -f profiles.dat profiles${numnew}.dat
  \mv -f profiles.hdr profiles${numnew}.hdr
fi

# Prepare things for upcoming runs by writing
# value of computed season in file num_run
echo ${numnew} > num_run

# If we are over nummax : stop
if (( $numnew + 1 > $nummax )) ; then
   exit
else
   \cp -f run_gnome exe_mars
   ./exe_mars
fi

Summary of what this bash script does:

  • It reads the file 'num_run', which contains the number of the last completed run (0 if the file does not exist). For example, if num_run contains 5, the script expects to find start5.nc and startfi5.nc.
  • It copies start5.nc and startfi5.nc to start.nc and startfi.nc, respectively.
  • It runs the GCM.
  • It renames restart.nc and restartfi.nc to start6.nc and startfi6.nc, respectively.
  • It rewrites num_run so that it contains 6.
  • It relaunches itself, and the loop continues until num_run reaches the value set in nummax (100 in this example).
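The bookkeeping performed by the bash script can also be sketched in Python. This is only an illustration of the file-naming logic, not a replacement for the script: it does not launch the GCM, the function names are hypothetical, and it checks the crash flag up front, whereas the bash script only checks it when the start files are missing.

```python
import os

CRASH_FLAG = 99999  # value the bash script writes to num_run after a crash

def read_run_number(run_dir="."):
    """Return the number stored in num_run, or 0 if the file does not exist."""
    path = os.path.join(run_dir, "num_run")
    if not os.path.isfile(path):
        return 0
    with open(path) as f:
        return int(f.read().strip())

def plan_next_run(run_dir=".", nummax=100):
    """Describe the next step of the chain without running anything."""
    numold = read_run_number(run_dir)
    if numold == CRASH_FLAG:
        raise RuntimeError("previous run crashed (99999 in num_run)")
    numnew = numold + 1
    return {
        # files copied to start.nc and startfi.nc before the run
        "inputs": (f"start{numold}.nc", f"startfi{numold}.nc"),
        # names given to restart.nc and restartfi.nc after the run
        "outputs": (f"start{numnew}.nc", f"startfi{numnew}.nc"),
        # file receiving the standard output of the GCM
        "log": f"lrun{numnew}",
        # same stopping test as the bash script: stop if numnew + 1 > nummax
        "relaunch": numnew + 1 <= nummax,
    }
```

For instance, with a num_run file containing 5, plan_next_run() reports start5.nc and startfi5.nc as inputs and start6.nc and startfi6.nc as outputs.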

Pre-processing Output Files

for MT: take the best examples from http://research.jisao.washington.edu/data_sets/nco/

Data Handling and Visualization Software

There are several data handling and visualization tools that can be used to analyse and plot the results from GCM simulations (using the diagfi.nc NetCDF files). We provide below an overview of the most widely used solutions.

Panoply

Panoply is a user-friendly tool for viewing raw NetCDF data, available here: https://www.giss.nasa.gov/tools/panoply/ . It is very convenient for making attractive visuals (see an example for the exoplanet TRAPPIST-1e). Many options (map projections, masks, colorbars, shadows, etc.) are available to customize your plots. However, the tool is not well suited for manipulating data (computing averages, statistics, etc.).

Screenshot of panoply showing here Generic PCM results for the exoplanet TRAPPIST-1e (surface temperatures)

ncview

ncview is another user-friendly tool for viewing raw NetCDF data. It is much more basic than Panoply, but it is convenient because it allows a very quick first look at NetCDF data files.

Command line tool to visualize NetCDF data:

  • Installation on Linux:
sudo apt install ncview
  • Run on Linux:
ncview diagfi.nc
Screenshot of ncview showing here Generic PCM results for the exoplanet Proxima b (OLR - Thermal Emission)

paraview

Can Ehouarn or anyone else help here? I have never used paraview.

python scripts

Python scripts provide a very useful means to analyse and visualize NetCDF files.

You can use the netCDF4 Python library to open a NetCDF file and load its data into arrays that can then be manipulated and plotted.

Here is an example of how to open and read a NetCDF file with Python:

import numpy as np
from netCDF4 import Dataset

# Open the NetCDF file (read-only by default)
nc = Dataset('diagfi.nc')

# Read the coordinate variables (1D outputs)
Time = nc.variables['Time'][:]
lat = nc.variables['latitude'][:]
lon = nc.variables['longitude'][:]
al = nc.variables['altitude'][:]

# Read the grid cell area (2D output: latitude, longitude)
aire_GCM = nc.variables['aire'][:]

# Read a 3D output: surface temperature (time, latitude, longitude)
tsurf = nc.variables['tsurf'][:]

# Read a 4D output: atmospheric temperature (time, altitude, latitude, longitude)
temp = nc.variables['temp'][:]

And here is an example of how to manipulate the NetCDF data (here to compute the time-averaged surface temperature):

import numpy as np

# Time-averaged surface temperature on the (latitude, longitude) grid
mean_tsurf = np.zeros((len(lat), len(lon)), dtype='f')

for i in range(len(Time)):
    for j in range(len(lat)):
        for k in range(len(lon)):
            mean_tsurf[j, k] += tsurf[i, j, k] / len(Time)

# Equivalent vectorized form: mean_tsurf = np.mean(tsurf, axis=0)

And here is a last example of how to plot the data (using matplotlib):

import matplotlib.pyplot as plt

# Filled contour map of the time-averaged surface temperature
plt.figure(1)
plt.contourf(lon, lat, mean_tsurf)
plt.colorbar(label='Surface Temperature (K)')
plt.xlabel(r'Longitude ($^{\circ}$)')
plt.ylabel(r'Latitude ($^{\circ}$)')
plt.show()


The xarray Python library is a very good tool to easily load and plot NetCDF data.

TBD

Planetoplot

Planetoplot is an in-house, Python-based library developed to visualize Generic PCM data.

The code and documentation can be found at: https://nbviewer.org/github/aymeric-spiga/planetoplot/blob/master/tutorial/planetoplot_tutorial.ipynb