Compilation

Language standard

Most programming languages are defined through norms or standards. For example, the C language has several standards:

  • K&R C
  • ANSI C (C89) – ISO C (C90)
  • C99
  • C11
  • Embedded C

These standards define the syntax of the language and the meaning of its syntactic constructs, that is, how the resulting machine code must behave. Most of them are revisions of the previous standard, incorporating changes and new features.

The Intel compiler suite and the GNU Compiler Collection support most revisions of the C, C++ and Fortran standards. However, some compilation options and optimizations, as well as support for specific features of the most recent iterations of these standards, are compiler-specific. Please refer to their respective documentation for more information.

Available compilers

Several well-known compilers are available for the C, C++ and Fortran languages. The most common are:

  • Intel Compiler suite (icc, icpc, ifort)
  • GNU compiler suite (gcc, g++, gfortran)

To get a complete list of available compilers and versions, use the search option of the module command:

$ module search compiler

We recommend using the Intel compiler suite for better performance.

A serial code is typically compiled as follows:

$ icc [options] -o serial_prog.exe serial_prog.c
$ icpc [options] -o serial_prog.exe serial_prog.cpp
$ ifort [options] -o serial_prog.exe serial_prog.f90

Intel

Compiler flags

The following sections give an overview of the most frequently used options for each compiler.

C/C++

The Intel compilers icc and icpc use mostly the same options. Their behaviors differ slightly: icpc assumes that all source files are C++, whereas icc distinguishes between .c and .cpp filenames.

Basic flags:

  • -o exe_file: names the executable exe_file
  • -c: generates the corresponding object file, without creating an executable.
  • -g: compiles with the debug symbols.
  • -I dir_name: adds dir_name to the list of directories searched for include files.
  • -L dir_name: adds dir_name to the list of directories searched for libraries.
  • -l bib: links against the libbib library (for example libbib.a)

Preprocessor:

  • -E: preprocesses the files and sends the result to the standard output
  • -P: preprocesses the files and sends the result to file.i
  • -Dname=value: defines the macro name with the value value
  • -M: creates a list of dependencies

Practical:

  • -p: profiling with gprof (needed at compilation time)
  • -mp, -mp1: IEEE arithmetic, mp1 is a compromise between time and accuracy

To tell the compiler to conform to a specific language standard:

  • -std=val: where val can take the following values (cf. man icc or man icpc)
    • c++14: enables support for the 2014 ISO C++ standard features.
    • c++11: enables support for many C++11 (formerly known as C++0x) features.
    • c99: conforms to the ISO/IEC 9899:1999 International Standard for the C language.

Note

If the desired standard is not fully supported by the current version of the Intel compiler (some features of the standard are not yet implemented), it might be supported by the latest version of the GNU compilers. Since the Intel compilers are built against the system gcc (the one provided by the OS), loading a recent version of the GNU compilers might solve this issue.

Fortran

The Intel compiler for Fortran is ifort.

Basic flags:

  • -o exe_file: name the executable exe_file
  • -c: generate the object file without creating an executable.
  • -g: compile with the debug symbols.
  • -I dir_name: add dir_name to the list of directories where include files are looked for.
  • -L dir_name: add dir_name to the list of directories where libraries are looked for.
  • -l bib: link the libbib.a library

Run-time check

  • -C or -check: generates code that performs run-time checks and stops with a run-time error when a check fails (for example on an access that would otherwise cause a segmentation fault)

Preprocessor:

  • -E: pre-process the files and send the result to the standard output
  • -P: pre-process the files and send the result to file.i
  • -Dname=value: assign the value value to the macro name
  • -M: create a list of dependencies
  • -fpp: pre-process the files and compile

Practical:

  • -p: compile for profiling with gprof. You will not be able to use gprof otherwise.
  • -mp, -mp1: IEEE arithmetic (mp1 is a compromise between time and accuracy)
  • -i8: promote integers to 64 bits by default
  • -r8: promote reals to 64 bits by default
  • -module dir: send/read the files *.mod in the dir directory
  • -fp-model strict: strictly adhere to value-safe optimizations when implementing floating-point calculations, and enable floating-point exception semantics. This may slow down your program.

To tell the compiler to conform to a specific language standard:

  • -stand=val: where val can take the following values (cf. man ifort)
    • f15: Issues messages for language elements that are not standard in draft Fortran 2015.
    • f08: Tells the compiler to issue messages for language elements that are not standard in Fortran 2008
    • f03: Tells the compiler to issue messages for language elements that are not standard in Fortran 2003

Note

Please refer to the man pages for more information about the compilers.

Optimization flags

Compilers provide many optimization options: this section describes them.

Basic optimization options:

  • -O0, -O1, -O2, -O3: optimization levels - default: -O2
  • -opt_report: writes an optimization report to stderr (-O3 required)
  • -ip, -ipo: inter-procedural optimizations (within a single file and across multiple files, respectively). The command xiar must be used instead of ar to generate a static library file from objects compiled with the -ipo option.
  • -fast: default high optimization level (-O3 -ipo -static).
  • -ftz: flushes denormalized (subnormal) numbers, i.e. nonzero values too small to be represented with full precision, to zero at runtime. INF and NAN values are not affected.
  • -fp-relaxed: uses optimized mathematical functions. Leads to a small loss of accuracy.
  • -pad: allows the compiler to change the memory layout of variables and arrays (ifort only)

Warning

The -fast option is not allowed with MPI because the MPI context needs some libraries which only exist as dynamic libraries. This is incompatible with the -static option. You need to replace -fast by -O3 -ipo.

Vectorization flags

Some options enable the use of specific vector instruction sets of Intel processors to optimize the code. They are compatible with most Intel processors. The compiler will try to generate these instructions if the processor allows them.

  • -xcode : Tells the compiler which processor features it may target, including which instruction sets and optimization it may generate. “code” is one of the following:
    • CORE-AVX2
    • AVX
    • SSE4.2
    • SSE2
  • -xHost: applies the highest level of vectorization supported by the processor on which the compilation is performed. The login nodes may not have the same level of support as the compute nodes, so use this option only if the compilation is done on the targeted compute nodes.
  • -axcode: tells the compiler to generate a single executable with multiple levels of vectorization. “code” is a comma-separated list of instruction sets.

The default level of vectorization is sse2. However, it is only activated at optimization level -O2 and above.

  • -vec-report[=n] : depending on the value of n, the option -vec-report enables information reports by the vectorizer.

Warning

Code compiled for a given instruction set will not run on a processor that only supports a lower instruction set.

Default compilation flags

By default, each Intel compiler is run with the -sox option, which saves all the options given at compile time in the comment section of the ELF binary file. To display the comment section:

$ icc -g -O3 hello.c -o helloworld
$ readelf -p .comment ./helloworld
 String dump of section '.comment':
 [     0]  GCC: (GNU) <x.y.z> (Red Hat <x.y.z>)
 [    2c]  -?comment:Intel(R) C Intel(R) 64 Compiler for applications running on Intel(R) 64, Version <x.y.z> Build <XXXXXX>  : hello.c : -sox -g -O3 -o helloworld

GNU

Compiler flags

Basic flags:

  • -o exe_file: names the executable exe_file
  • -c: generates the corresponding object file, without creating an executable.
  • -g: compiles with the debug symbols.
  • -I dir_name: specifies the path where the include files are located.
  • -L dir_name: specifies the path where the libraries are located.
  • -l bib: asks to link the libbib.a library

To tell the compiler to conform to a specific language standard (gcc/g++/gfortran):

  • -std=val: where val can take the following values (cf. man gcc/g++/gfortran)
    • c++14: enables support for the 2014 ISO C++ standard features (g++).
    • c99: conforms to the ISO/IEC 9899:1999 International Standard for the C language (gcc).
    • f2003: conforms to the Fortran 2003 standard; language elements that are not standard are reported (gfortran).
    • f2008: same for the Fortran 2008 standard (gfortran).

Below are some specific flags for the gfortran command.

Debugging:

  • -Wall: short for “warn about all”, warns about usual causes of bugs, such as having a subroutine or function named like a built-in one, or passing the same variable as an intent(in) and an intent(out) argument of the same subroutine
  • -Wextra: used with -Wall, warns about even more potential problems, like unused subroutine arguments
  • -w: inhibits all warning messages (not recommended)
  • -Werror: considers any warning as an error

Optimization flags

Compilers provide many optimization options: this section describes them.

Basic optimization options :

  • -O0, -O1, -O2, -O3: optimization levels - default: -O0

Some options enable the use of specific instruction sets of Intel processors to optimize the code. They are compatible with most Intel processors. The compiler will try to use them if the processor allows them.

  • -mavx2 / -mno-avx2: switches the use of the AVX2 instruction set on or off.
  • -mavx / -mno-avx: same for AVX.
  • -msse4.2 / -mno-sse4.2: same for SSE4.2.

Available numerical libraries

MKL library

The Intel MKL library is integrated in the Intel package and contains:

  • BLAS, SparseBLAS;
  • LAPACK, ScaLAPACK;
  • Sparse Solver, CBLAS;
  • Discrete Fourier and Fast Fourier transforms.

If you don’t need ScaLAPACK:

$ module load mkl
$ ifort -o myexe myobject.o ${MKL_LDFLAGS}

If you need ScaLAPACK:

$ module load scalapack
$ mpif90 -o myexe myobject.o ${SCALAPACK_LDFLAGS}

We provide multi-threaded versions for compiling with MKL. Without ScaLAPACK:

$ module load feature/mkl/multi-threaded
$ module load mkl
$ ifort -o myexe myobject.o ${MKL_LDFLAGS}

With ScaLAPACK:

$ module load feature/mkl/multi-threaded
$ module load scalapack
$ mpif90 -o myexe myobject.o ${SCALAPACK_LDFLAGS}

To use multi-threaded MKL, you have to set the OpenMP environment variable OMP_NUM_THREADS.

We strongly recommend using the MKL_XXX and SCALAPACK_XXX environment variables made available by the mkl and scalapack modules.

FFTW

FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions, with an arbitrary input size, and with both real and complex data. It is provided by the fftw3/gnu module. The variables FFTW3_CFLAGS (C), FFTW3_FFLAGS (Fortran) and FFTW3_LDFLAGS should be used to compile a code using FFTW routines.

$ module load fftw3/gnu
$ icc ${FFTW3_CFLAGS} -o test example_fftw3.c ${FFTW3_LDFLAGS}
$ ifort ${FFTW3_FFLAGS} -o test example_fftw3.f90 ${FFTW3_LDFLAGS}

Intel MKL also provides Fourier transform functions. FFTW3 wrappers make it possible to link programs so that they use the Intel MKL Fourier transforms instead of the FFTW3 library, without changing the source code. The correct compiling options are provided by the fftw3/mkl module.

$ module load fftw3/mkl
$ icc ${FFTW3_CFLAGS} -o test example_fftw3.c ${FFTW3_LDFLAGS}
$ ifort ${FFTW3_FFLAGS} -o test example_fftw3.f90 ${FFTW3_LDFLAGS}

Compiling for Skylake

With the -ax option, icc and ifort can generate code for several architectures.

For example, from a Broadwell login node, you can generate an executable with both the AVX2 (Broadwell) and AVX512 (Skylake) instruction sets. To do so, add the -axCORE-AVX2,CORE-AVX512 option to icc or ifort.

An executable compiled with -axCORE-AVX2,CORE-AVX512 can be run on both Broadwell and Skylake as the best instruction set available on the architecture will be chosen.

Compiling for Rome/Milan

With the -m option, icc and ifort can generate specific instruction sets for Intel and non-Intel processors.

AMD Rome and Milan processors are able to run AVX2 instructions. To generate AVX2 instructions for AMD processors, add the -mavx2 option to icc or ifort. An executable compiled with -mavx2 can run on both AMD and Intel processors that support AVX2.

Note

  • The -mavx2 option is compatible with gcc.
  • Both -mavx2 and -axCORE-AVX2,CORE-AVX512 options can be used simultaneously with icc and ifort to generate both specific instructions for Intel processors and more generic instructions for AMD processors.