HowTo: LMDZ code guidelines
/!\ This page is a WORK IN PROGRESS. Nothing in this page should be considered correct or consensual for now. /!\
Note: this documents draws from various sources, including the OPA guidelines, Modernizing Old Fortran, F syntax rules, the stdlib style guide, CODEE Fortran recommendations.
Sommaire
Example code
Note: in this example, we added "meta-comments" indicated by !!
Note: of course, there's always exceptions to "Always", but they should systematically documented by a comment, with a proper explanation.
1 !! File: lmdz_mymodule.f90
2 !! ^ all module files should start with their component name, (e.g. "lmdz")
3 !! ^ use ".f90" for all free-form files, unless they contain preprocessor directives (then use ".F90").
4
5 !! At the top of the file, we provide a comment to explain what this modules does
6 ! implementation of my_useful_stuff.
7 ! ----------------------------------
8
9 !! Unless specified otherwise, all files should contain a single module with the same name which encapsulates all their content
10 MODULE lmdz_mymodule
11 !! ^ all Fortran keywords are in CAPS, everything else is in lowercase.
12 USE my_other_module, ONLY: function_a, function_b
13 !! ^ Always specify explicitely which functions you use from a module.
14 !!^ use indentation (2 spaces) to provide visual clarity
15 IMPLICIT NONE; PRIVATE
16 !! ^ Always make your module private, and expose explicitely which functions are public
17 !! ^ Always use IMPLICIT NONE at the module-level
18 PUBLIC my_var, my_subroutine
19
20 REAL, PARAMETER :: my_var
21 REAL :: a, b, c, d, e, f
22 !! ^ use "::" even if it's not strictly necessary
23 !! ^ Alignment of "::" is not necessary, but should be attempted whenever it provides readability and is relevant (e.g. there's some link between the aligned variables)
24
25 CONTAINS
26
27 !! Put at least a blank line before/after each function/subroutine2
28 !! A routine should have a clear, unambiguous name, describing what the routine does.
29 SUBROUTINE my_subroutine(arg1, arg2, arg3) ! handle event xxxx
30 !! ^ comments should always be right before, or on the same line as what they're commenting
31 !! ^ as for modules, all functions should be documented in a comment
32 INTEGER, INTENT(IN) :: arg1
33 !! ^ Always provide intent
34 REAL, INTENT(IN) :: arg2
35
36 INTEGER, INTENT(INOUT) :: arg3
37
38 REAl, INTENT(OUT) :: out1
39 !!^ Group together IN, then INOUT, then OUT
40
41 IF (arg2 > arg1) THEN
42 !! ^ .ge., .le., etc are deprecated. Use >, <, ==, etc instead
43 ! ...
44 END IF
45 !! ^ Use END IF, END DO rather than ENDIF, ENDDO (coherence with fortran-lang.org)
46 END SUBROUTINE my_subroutine
47 !! ^ always use named blocks for END
48
49
50 END MODULE my_module
51 !! ^ always use named blocks for END
Comments and Documentation
Documentation consists of putting information both inside and outside the source code. On large, formal projects, most of the documentation is outside the source code. However because the internal documentation is most closely associated with the code, it is the part most likely to remain correct as the code is modified. The thing to be really careful with is that misleading comments are even worse than having no comments at all. So, please report any misleading comments you found.
Try to avoid having superfluous comments such as: A = A + 1 ! Increment A by one.
This is not adding anything that is not immediately obvious. A person looking through your code should only have to scan your comments to be able to get a good idea of what the code does and where to look for any special activity.
Good comments do not just repeat verbally what is happening in the code or just explain it. They clarify its intent. They should explain at a higher level of abstraction what the code or what you are trying to do. In addition, you should comment:
- Everything that gets around an error or an undocumented feature in a language or environment.
- Any violations of good programming style should be justified. This will ensure that any well-intentioned programmer does not break your code by changing the source to implement a “better” style.
- The lines before a control structure, e.g. IF, CASE, loop, etc. are a natural spot to comment and explain what these constructs are about to do.
- The start of any subroutine or function whose usage is not downright obvious.
Misc
Fortran standard
Fortran 2008 is now widely supported on all major compilers, and should be taken as a reference.
Allocatable arrays
Allocatable arrays can be relevant, but should be substituted by automatic arrays (= passing the array size as an argument) whenever possible, for performance and debugging reasons.
Compiler warnings
Compiler warnings are useful - unless there's too many of them. Effort should be made such that during the compiling phase, no warning appears, so that user can be easily alerted of potential bugs. when some appear in their configuration.
In particular, all component must be able to run when a compile-time and/or run-time array bounds checking option is enabled. Use of the (*) construct in array dimensioning to circumvent this problem is forbidden because it effectively disables array bounds checking.
Side effects
Whenever possible, write functions without side-effects, and label them accordingly as PURE
or ELEMENTAL
.
SAVE
Since all files contain a MODULE
, attribute SAVE
is banned. Variables whose value have to be preserved between two calls should be declared at the module level.
Deprecated features
In terms of keeping code up to date and easier to maintain the code should always follow the current standards of FORTRAN and ANSI C. We decide to restrict the languages use to the elements, which are not obsolete or deleted -- even if they are still available with almost all compilers.
Note: just because a feature is deprecated doesn't mean it must be converted in old, working code - this can be tricky and bug-prone in complex instances.
Some notable deprecated features:
-
COMMON
blocks. Instead, useMODULE
initialisation. -
EQUIVALENCE
should be replaced by pointers or derived data types. - Any kind of
GOTO
has absolutely no place in modern programming. UseCASE
whenever relevant rather thanIF
. - Arithmetic
IF
statements. Use blockIF
orCASE
instead. -
CONTINUE
statement: useIF, CASE, DO WHILE, EXIT, CYCLE
statements or a containedSUBROUTINE
instead. - I/O routines
END
andERR
- useIOSTAT
instead -
FORMAT
statements: use character parameters or explicit format- specifiers inside theREAD
orWRITE
statement instead. -
ENTRY
statements: a subprogram must only have one entry point - Alternate
RETURN
is obsolescent, andRETURN
at the end of a function is useless clutter. - Statement functions have been replaced by corresponding functions or subroutines in the
CONTAIN
block of the current scope. -
DATA
andBLOCK DATA
should be replaced by initializers.
Preprocessor
Limit to the bare minimum the use of preprocessor #ifdef
keys. Those decrease lisibility, disable automatic code analysis, and generally make code a pain to manage and read. Ideally, a given preprocessor key should be used only once, or if not possible, in a single module for the entire codebase.
INCLUDE
Limit the use of include
headers to the bare minimum. Whenever not possible, wrap those include in a module, and import that module instead. This provides much more flexibility, e.g. when using ONLY: ...
.
Recent examples of "bad"/perfectible LMDZ code
Note: The purpose of this section is obviously not to blame anyone, but to stress that "bad" code in LMDZ isn't limited to old legacy code, and that even current code is being written in a way that can be greatly improved.
07/24: Unhelpful comment
1 IF (iflag_ice_thermo == 0) THEN
2 !pas necessaire a priori
This comment fails to convey a clear indication. What is necessary or not ? In which condition ? "A priori" further conveys a sense of uncertainty that may make sense to the writer, but certainly not to an external reader.
07/24: Unclear conventions, no PURE, useless RETURN
1 function grid_index(lon_deg,lat_deg)
2 !--------------------------------------------------------------------------------
3 ! Get local index of grid point of longitude,latitude lon_deg,lat_deg
4 ! Author : XXX 2024/07/XX
5 ! Please do not put this function in a m*odule not to complexify the replay script
6 !--------------------------------------------------------------------------------
7 USE dimphy, only : klon
8 USE geometry_mod, ONLY: latitude_deg, longitude_deg
9 implicit none
10 real, intent(in) :: lon_deg,lat_deg
11 integer :: grid_index
12 integer i
13 grid_index=0
14 do i=1,klon
15 if ( abs(lon_deg-longitude_deg(i)) < 0.02 .and. abs(lat_deg-latitude_deg(i)) < 0.02 ) then
16 grid_index=i
17 exit
18 endif
19 enddo
20 return
21 end
This function is actually mostly great ! It is documented, the author put a clear statement to indicate a deviation from standards (no module), USE
is used in conjuction with ONLY
, INTENT
is declared.
However, it is also lacking in some areas, notably:
- This function should be declared as
PURE
- This function has inconsistent syntax: mix of uppercase and lowercase for Fortran keywords, sometimes uses
::
in declarations and sometimes not - The function ends with a useless
RETURN
07/24: Duplicated, uncommented code
1 zx_tmp_fi3d=0.
2 IF (vars_defined) THEN
3 DO nsrf=1,nbsrf
4 DO k=1,klev
5 zx_tmp_fi3d(:,k)=zx_tmp_fi3d(:,k) &
6 +pctsrf(:,nsrf)*tke_shear(:,k,nsrf)
7 ENDDO
8 ENDDO
9 ENDIF
10
11 CALL histwrite_phy(o_tke_shear, zx_tmp_fi3d)
12
13 zx_tmp_fi3d=0.
14 IF (vars_defined) THEN
15 DO nsrf=1,nbsrf
16 DO k=1,klev
17 zx_tmp_fi3d(:,k)=zx_tmp_fi3d(:,k) &
18 +pctsrf(:,nsrf)*tke_buoy(:,k,nsrf)
19 ENDDO
20 ENDDO
21 ENDIF
22
23 CALL histwrite_phy(o_tke_buoy, zx_tmp_fi3d)
24
25
26 zx_tmp_fi3d=0.
27 IF (vars_defined) THEN
28 DO nsrf=1,nbsrf
29 DO k=1,klev
30 zx_tmp_fi3d(:,k)=zx_tmp_fi3d(:,k) &
31 +pctsrf(:,nsrf)*tke_trans(:,k,nsrf)
32 ENDDO
33 ENDDO
34 ENDIF
35
36 CALL histwrite_phy(o_tke_trans, zx_tmp_fi3d)
This code:
- Duplicates three times essentially the same code. Refactoring into a routine would be much more readable and reduce the potential of typos.
- Contains no comment justifying such behavior, or what is even being done here, which to an external reader is fairly obscure.
06/24: Undocumented variable
1 INTEGER :: nsrf, k, iq, iff, i, j, ilev, itr, itrb
2 !! ^ Added this
3 !! (...)
4 itrb = 0
5 !! (...)
6 if(tracers(iq)%name(1:3)=='BIN') then
7 itrb = itrb + 1
Just because a variable is a "dumb" counter doesn't mean it shouldn't be documented. In this particular example, the variable name is devoid of any obvious meaning. Properly documenting this variable would also make it clear to others whether it would make sense to reuse it later on in the code.
06/24: Comments as code storage
1 !CALL writefield_phy('latitude',latitude_deg,1)
2 !CALL writefield_phy('pressure_hl',pressure_hl,klev+1)
3 !CALL writefield_phy('Ldecorel',PDECORR_LEN_EDGES_M,klev)
Comments are NOT for code storage. Unfortunately, the LMDZ code is littered with bits of commented code without any indication of when or why the code was commented. As a result, there are in a lot of places such comments, many dating from 20+ years ago, for which we have no clue whether we can remove them or not. This greatly hinders readability and maintainability.
Of course, in some cases, it is warranted to keep some code at hand, e.g. for diagnostic purpose. In such case, please make sure to always clearly document why the code was commented out, so that it can be cleaned up at a later date when it becomes irrelevant.
Questions à élucider
- Prescrit-on une langue (Français / Anglais) ?
- Longueur des lignes: dans la limite du raisonable, pourquoi se limiter à 100-140 chars ? Le wrapping fonctionne très bien.