Environment management

The traditional Unix way of managing your environment usually involves editing your ~/.bashrc and/or sourcing software-specific files. This approach is error-prone because definitions can become inconsistent, and it makes it hard to dynamically enable or change the software you want to use. To manage your environment dynamically and pick the software you need from the large catalog provided, the computing center offers the module command from the Environment Modules project.

What is module

module is a user interface that provides dynamic modification of your environment via modulefiles. It makes it easy to change the shell environment by initializing, modifying or unsetting environment variables.

Each modulefile contains the information needed to configure the shell for an application. Once the module is initialized, the environment can be modified on a per-module basis using the module command which interprets modulefiles. Typically modulefiles instruct the module command to alter or set shell environment variables such as PATH, MANPATH, etc. The modulefiles are added to and removed from the current environment by the user.

At the computing center, it is typically used to:

  • define your user environment ($HOME, $CCCSCRATCHDIR, etc.);
  • easily get access to third-party software in different versions (e.g. Intel compilers, GCC, MPI libraries);
  • handle potential conflicts and requirements between software.

Modulefiles are basically provided by the computing center staff to get access to the installed software and to the system properties. In addition you may have your own modulefiles to supplement the already provided modulefiles.

Module actions

The main kinds of actions of the module command are described below. For a full reference of the available module actions, you can:

  • display the command usage message (module -h)
  • look at the man page (man module)
  • or even try the command auto-completion

Listing / Searching modulefiles

Knowing current environment state:

  • module list shows the current state of your environment, that is, all the modules currently loaded

Querying modulefiles catalog:

  • module avail displays all available modules (modules suffixed by @ are aliases)
  • module avail --default reduces the regular avail output by only displaying the available default versions
  • module avail --latest in the same way displays only the available latest versions
  • module avail mpi shows the available mpi modulefiles

Printing modulefile information:

  • module help netcdf shows software description for default netcdf modulefile
  • module help hdf5/x.y.z shows software description for a specific hdf5 modulefile
  • module show mkl displays software description plus environment definition for the default mkl
  • module show python/x.y.z displays the same description and environment definition, but for a specific version

Searching for modulefiles:

  • module whatis papi prints for each version of papi product a one-liner description and its associated keywords
  • module help|show products/keywords prints products/keywords modulefile description which lists all keywords in-use by available modulefiles
  • module search profiler searches for all products whose name, one-liner description or keywords match the profiler search string

Loading / Unloading modulefiles

Adding modulefile(s) to the list of currently loaded modules:

  • module load fftw3 loads the default version of fftw3 product or its latest version if no default version is explicitly set
  • module load visit/x.y.z loads specific version x.y.z of visit
  • module load intel hdf5 netcdf loads multiple modulefiles in one command

Note

On interactive shells, module auto-completion is enabled and can help you to find the name of modulefiles you want to load, unload or switch

Removing modulefile(s) from your current environment:

  • module unload visit unloads loaded version of visit modulefile
  • module unload netcdf hdf5 unloads multiple products in one command
  • module purge unloads all loaded modulefiles

Note

The load, unload and switch commands all return 0 on success and 1 otherwise.
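
Because these commands report failure through their exit status, a script can stop early when a modulefile cannot be loaded. The following is a minimal sketch, assuming a Bourne-compatible shell in which the module command is already defined; load_all is a hypothetical helper name, not part of module.

```shell
# Hypothetical helper: load each modulefile in turn and stop at the first failure.
load_all() {
    for m in "$@"; do
        if ! module load "$m"; then
            echo "failed to load $m" >&2
            return 1
        fi
    done
}

# Typical use at the top of a job script (module names are placeholders):
#   load_all intel hdf5 netcdf || exit 1
```

Loading the modulefiles one by one makes it possible to report which one failed, at the cost of one module call per modulefile.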

Switching from one version of a modulefile to another:

  • module switch intel intel/x.y.z unloads currently loaded intel modulefile then loads version x.y.z

Note

The module command will automatically satisfy modulefile prerequisites. When loading a modulefile, all the modulefiles it declares as prerequisite are loaded prior to its own load. When unloading a modulefile, all the modulefiles it declares as prerequisite that have been automatically loaded as dependency are automatically unloaded after the initial modulefile unload.

Saving and restoring modulefile collections

A modulefile collection is a saved state of your module environment that you can restore whenever you want. A collection is an ordered set of modulefiles: those that were loaded at the time the collection was saved. When a collection is restored, the currently loaded modulefiles are unloaded, then the modulefiles defined in the collection are loaded in the same order.

You can own any number of collections you want, which gives you the ability to easily switch between a production environment and a development environment or between a visualization environment and a debugging one, for instance.

Saving modulefile collections:

  • module save development saves the current list of loaded modules in the collection named development
  • module save saves the current list of loaded modules in the default collection

Listing saved modulefile collections:

  • module savelist lists all previously saved modulefile collections

Restoring saved modulefile collections:

  • module restore development restores the collection named development, by unloading currently loaded modulefiles then loading the modulefiles defined in the collection to restore the same ordered list of loaded modulefiles
  • module restore restores the default modulefile collection
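
One way to build a named collection from a fixed list of modulefiles is to load them and save the result only if the load succeeded. This is a sketch under that assumption; save_env is a hypothetical helper name and the module names in the example are placeholders.

```shell
# Hypothetical helper: load the given modulefiles, then save the resulting
# environment under a collection name. Nothing is saved if the load fails.
# Usage: save_env <collection> <modulefile>...
save_env() {
    name=$1
    shift
    module load "$@" && module save "$name"
}

# Example (module names are placeholders):
#   save_env development intel hdf5 netcdf
#   module restore development
```

Note that the collection records every modulefile loaded at the time of saving, including those that were already loaded before the helper was called.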

Initialization and scope

Depending on the shell mode, the module environment is initialized and propagated in different ways. In all cases, the module command is defined and a minimal environment is set. This minimal environment consists of the default paths to the modulefiles provided by the computing center staff, with the mandatory modulefiles ccc and dfldatadir loaded. Then, in an interactive shell:

  1. all module output messages are set to be redirected to stdout
  2. module command auto-completion is enabled
  3. your module collection named default is restored if it exists

In a non-interactive shell, the following initialization is done after the minimal environment setup:

  1. all module output messages are left on stderr
  2. the message printed at module load or unload is disabled
  3. your module collection named non-interactive is restored if it exists

Interactive or non-interactive?

Interactive shell initialization is obtained when:

  • you connect to the supercomputer or to a given node within the supercomputer to get a login shell
  • you run an interactive job

Non-interactive shell initialization is obtained in all other cases, that is, when:

  • a batch job starts
  • you remotely execute a command (with SSH) on the supercomputer or on a node within the supercomputer
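
To see which kind of shell a script or session is running in, the shell's own option flags can be inspected. The sketch below relies only on the standard $- parameter, which contains the letter i in an interactive shell. How the computing center's initialization scripts make their decision is not described here, so treat this as an approximation; shell_mode is a hypothetical helper name.

```shell
# Classify a shell from its option flags (the value of $-).
# An "i" among the flags means the shell itself is interactive.
shell_mode() {
    case "$1" in
        *i*) echo interactive ;;
        *)   echo non-interactive ;;
    esac
}

# Report the mode of the current shell.
shell_mode "$-"
```

Passing the flags as an argument keeps the function easy to check by hand, for instance with shell_mode himBH or shell_mode hB.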

Scope of your environment

Your environment is initialized or re-initialized under the conditions previously described, which means that each time you connect to the supercomputer, or from one node to another within the supercomputer, your environment is reset to its default. This is also the case when the batch scheduler starts one of your batch jobs: by default, the environment at the time of job submission is not restored, so the environment of the batch job is initialized as a non-interactive environment.

Once initialized, each load or unload of a modulefile modifies the environment of the current shell and of its subsequent sub-shells and jobs, so each sub-shell and script launched inherits the environment of its parent shell. However, to guarantee that the module function is still defined in sub-shells and launched scripts, make sure that the /etc/bashrc configuration is loaded in your local ~/.bashrc configuration file, for instance with:

# Source global definitions
if [ -f /etc/bashrc ]; then
    . /etc/bashrc
fi

For interactive jobs, the environment obtained at the start of the job is the one set when you request the job. This means that all the modulefiles loaded before launching an interactive job are still loaded once the interactive job session is established.

Major modulefiles

Modulefiles provided by the computing center staff are spread across categories to sort them by product function. These categories are applications, compilers, environment, libraries, tools, parallel and graphics. They represent the type of software, except for the parallel and graphics categories, which cut across the other categories and are meant to promote these kinds of software.

The environment category is a bit special, as it does not contain modulefile definitions for products like the other categories. A modulefile from the environment category defines aspects of the system or configuration properties for a group of users who can access and use these system functions.

Some modulefiles from the environment category shape environment usages and possibilities. These major modulefiles are described below.

ccc

ccc modulefile defines global system variables and aliases needed to get a functional user environment. This modulefile is the first to be loaded in user environment and its load is done within default environment load. It is mandatory and thus cannot be unloaded.

datadir

datadir modulefile helps users locate their own or shared/application space-specific data end-points. Personal or shared spaces provide multiple data end-points, each for different needs. Environment variables point to these end-points so that they can be referenced easily. Data end-point variables are all prefixed with the upper-cased name of the module version, which corresponds to the name of the personal or shared space. For instance, data end-points for project prj are all prefixed with PRJ_ and end-points for the personal space are all prefixed with OWN_.

A version for the datadir modulefile exists for each shared/application space known in the computing center, named in accordance to the shared space name, and for the user personal space, named own. You can only view and access versions of the datadir modulefile that correspond to spaces you can access. Multiple versions of the datadir module can be loaded at the same time.

The following example displays the variables set for the own version of the datadir modulefile:

$ module show datadir/own
-------------------------------------------------------------------
/opt/Modules/default/modulefiles/environment/datadir/own:

setenv      OWN_STOREDIR    /ccc/store/contxxx/group/username
setenv      OWN_CCCSTOREDIR /ccc/store/contxxx/group/username
setenv      OWN_WORKDIR /ccc/work/contxxx/group/username
setenv      OWN_CCCWORKDIR  /ccc/work/contxxx/group/username
setenv      OWN_SCRATCHDIR  /ccc/scratch/contxxx/group/username
setenv      OWN_HOME    /ccc/contxxx/home/group/username
module-whatis   Data Directory
-------------------------------------------------------------------
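
The naming rule for these variables (upper-cased space name, an underscore, then the end-point name) can be sketched in a few lines of shell. datadir_var is a hypothetical helper, and the sketch assumes space names contain only characters that are valid in variable names.

```shell
# Build the variable name for a space and an end-point,
# e.g. space "prj" and end-point "CCCWORKDIR" give PRJ_CCCWORKDIR.
datadir_var() {
    printf '%s_%s\n' "$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')" "$2"
}

datadir_var own CCCWORKDIR    # prints OWN_CCCWORKDIR
datadir_var prj CCCSTOREDIR   # prints PRJ_CCCSTOREDIR
```

Once the matching datadir modulefile is loaded, the value itself can be read with, for instance, printenv "$(datadir_var own CCCWORKDIR)".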

dfldatadir

dfldatadir modulefile sets the personal or shared/application space data end-points targeted by a datadir modulefile as default end-points. Data end-point variables set by a datadir modulefile are all prefixed with the upper-cased name of the module version; the dfldatadir module sets the default data end-point variable (without prefix) for each of these prefixed variables. An exception is made for the HOME variable, which always refers to your personal home directory and does not change if a dfldatadir modulefile other than the default is loaded.

A version for dfldatadir modulefile exists for each shared/application space known in the computing center, named in accordance to the shared space name, and for the user personal space, named own. You can only view and access versions of the datadir modulefile that correspond to spaces you can access. Since dfldatadir module represents default data end-points, only one version of the module can be loaded at the same time.

dfldatadir modulefile requires the datadir modulefile with same version name. This datadir modulefile is thus automatically loaded when loading the dfldatadir modulefile. Default version of the dfldatadir module is the own modulefile, which is loaded by default when module environment is initialized.

To guarantee a coherent user environment with default datadir locations set (CCCSCRATCHDIR, CCCSTOREDIR, etc.), dfldatadir is mandatory and thus cannot be unloaded. As a consequence, to change the loaded dfldatadir module, the module switch command must be used, since module unload will fail:

$ module unload dfldatadir
module dfldatadir/own (Default Data Directory) cannot be unloaded
$ module switch dfldatadir/own dfldatadir/prj
unload module dfldatadir/own (Default Data Directory)
unload module datadir/own (Data Directory)
load module datadir/prj (Data Directory)
load module dfldatadir/prj (Default Data Directory)
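
After such a switch, the unprefixed variables should point to the same locations as the prefixed variables of the new space. The following sketch checks this for one end-point. It is only an illustration: is_default_space is a hypothetical helper, and it compares CCCSTOREDIR alone, assuming that variable is set for the space in question.

```shell
# Hypothetical check: succeed if the unprefixed CCCSTOREDIR matches the
# prefixed variable of the given space (e.g. PRJ_CCCSTOREDIR for "prj").
is_default_space() {
    prefix=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
    default=$(printenv CCCSTOREDIR) || return 1
    [ -n "$default" ] && [ "$default" = "$(printenv "${prefix}_CCCSTOREDIR")" ]
}

# Example:
#   is_default_space prj && echo "prj is the default data space"
```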

extenv

extenv modulefile enables users to extend the environment provided by the computing center staff with an extra environment managed within their own home directory or within a shared/application space home directory. extenv introduces a standard layout to manage your software products through the module environment provided by the computing center. Details on this modulefile are given in the next section.

feature

feature modulefile enables users to adjust the settings of a product through environment variables. feature introduces a standard layout to manage your software product settings through the module environment provided by the computing center. Typical usages are:

  • module whatis feature to list products with multiple settings
  • module whatis feature/openmpi to list and describe the available settings for the software openmpi

feature adjusts the behaviour of a product without software recompilation. When a behaviour change requires a dedicated build, it is handled by the flavor modulefiles.

The following example illustrates how the feature modules for MKL set environment variables that change how the product behaves. You can list those features with module av feature/mkl. Here is what they do:

$ module load feature/mkl/sequential
$ module load mkl
$ module show mkl
...
module-whatis Intel MKL LP64 Sequential
setenv        MKL_LDFLAGS ... -lmkl_sequential ...   # Here, link options will change
...
$ module switch feature/mkl/{sequential,multi-threaded}
$ module show mkl
...
module-whatis Intel MKL LP64 Multi-threaded
setenv        MKL_LDFLAGS ... -lmkl_intel_thread ... #... here we see the change
...

flavor

flavor modulefile enables users to select a specific build of a product. flavor introduces a standard layout to manage multiple builds of the same software product through the module environment provided by the computing center. Typical usages are:

  • module whatis flavor to list the products with multiple compilations/builds
  • module whatis flavor/hdf5 to list and describe the available compilations/builds for the software hdf5

flavor adjusts the behaviour of a product through a dedicated software build. When a behaviour change does not require one, it is handled by the feature modulefiles.

Here is a more detailed illustration of what flavor modules do:

  • flavor/%product%/%wish% expresses a wish about the installed version of a product you want to use, but it does not load anything by itself;
  • %product%/%version% points to the installation of the product, which depends on the flavor modules you may have loaded previously.

Here is a real-life example: module av flavor/hdf5 mentions parallel and serial. Let’s try both:

$ module load flavor/hdf5/serial
$ module load hdf5
$ module show hdf5
...
prepend-path  PATH    /.../serial/bin # Installation paths change with the flavor
...
$ module load mpi  # parallel HDF5 requires an MPI implementation
$ module switch flavor/hdf5/serial flavor/hdf5/parallel
$ module show hdf5
...
prepend-path  PATH    /.../parallel/bin # a new flavor changes the installation path

licsrv

licsrv modulefile defines in the user environment the variables that software needs to query its license server. Each version of this modulefile represents an existing license server. licsrv modulefile is automatically loaded when loading a modulefile that requires the corresponding license server.

products

products modulefiles provide functions to query the product catalog. These modulefiles can only be displayed; they are not intended to be loaded. They provide different kinds of information on the installed software.

  • module help|show products/keywords displays all the existing product keywords
  • module help|show products/newinstall lists all the software versions installed less than 8 weeks ago
  • module help|show products/restrict lists all the software whose usage is restricted and your current grant status for this software

Extend your environment with modulefiles

The computing center staff provides standard HPC software that you can access through the module environment. You may need to extend this standard environment with your own product installations or various setups. This section describes how to enable your environment extensions within the module environment.

Using the extenv modulefile

extenv modulefile enables users to extend the environment provided by the computing center staff with an extra environment managed within their own home directory or within a shared/application space home directory. extenv introduces a standard layout to manage your software products through the module environment provided by the computing center. This modulefile makes it possible to define a common environment for all users of a given shared space.

A version for extenv modulefile exists for each shared space known in the computing center, named in accordance to the shared space name, and for the user personal space, named own. You can only view and access versions of the extenv modulefile that correspond to spaces you can access. Multiple versions of the extenv module can be loaded at the same time.

Loading the extenv modulefile will:

  • Set environment variables defining the path to shared products and modulefiles (SHSPACE_PRODUCTSHOME, SHSPACE_MODULEFILES, SHSPACE_MODULESHOME)
  • Execute a module initialization script
  • Add shared modulefiles to the list of available modulefiles

The environment extension mechanism of extenv requires the use of specific paths. Products for the shared space named shspace should be installed in $SHSPACE_PRODUCTSHOME and the corresponding modulefiles should be placed in $SHSPACE_MODULEFILES.

Initialization file

If a file named init is found in the path defined by $SHSPACE_MODULESHOME, then each time the extenv/shspace module is loaded, this initialization file will be executed as TCL code. This may be useful if you want to define other common environment variables or add prerequisites on modules to be used by the community.

For instance, with the following example, you will define two environment variables, one defining the path to a directory containing input files (SHSPACE_INPUTDIR), and the other defining the result directory (SHSPACE_RESULTDIR). It will also add $SHSPACE_PRODUCTSHOME/tools/bin to the PATH so that the tools installed in this directory are easily available.

setenv SHSPACE_INPUTDIR "$env(SHSPACE_CCCWORKDIR)/in"
setenv SHSPACE_RESULTDIR "$env(SHSPACE_CCCSCRATCHDIR)/res"
append-path PATH "$env(SHSPACE_PRODUCTSHOME)/tools/bin"

Expose modulefiles

Once the extenv/shspace modulefile is loaded, all the modulefiles located in $SHSPACE_MODULEFILES are visible to the module command. For each product, there should be one modulefile per version. You can also define modulefiles for configuration or environment changes; a modulefile does not have to relate to a product.

For example, if you create specific modules in the shared environment in the following paths:

$ find $SHSPACE_MODULEFILES
/ccc/contxxx/home/shspace/shspace/products/modules/modulefiles
/ccc/contxxx/home/shspace/shspace/products/modules/modulefiles/thecode
/ccc/contxxx/home/shspace/shspace/products/modules/modulefiles/thecode/1.2.3
/ccc/contxxx/home/shspace/shspace/products/modules/modulefiles/thetool
/ccc/contxxx/home/shspace/shspace/products/modules/modulefiles/thetool/2
/ccc/contxxx/home/shspace/shspace/products/modules/modulefiles/conf
/ccc/contxxx/home/shspace/shspace/products/modules/modulefiles/conf/thecode
/ccc/contxxx/home/shspace/shspace/products/modules/modulefiles/conf/thecode/production
/ccc/contxxx/home/shspace/shspace/products/modules/modulefiles/conf/thecode/tuningtest
/ccc/contxxx/home/shspace/shspace/products/modules/modulefiles/libprod
/ccc/contxxx/home/shspace/shspace/products/modules/modulefiles/libprod/1.0
/ccc/contxxx/home/shspace/shspace/products/modules/modulefiles/libprod/2.0

Then, those modules will be visible and accessible once the extenv/<shspace> module is loaded.

$ module load extenv/shspace
load module extenv/shspace (Extra Environment)
$ module avail
---------------- /opt/Modules/default/modulefiles/applications -----------------
abinit/x.y.z        gaussian/x.y.z        openfoam/x.y.z
[...]
-------------------- /opt/Modules/default/modulefiles/tools --------------------
advisor/x.y.z       ipython/x.y.z         r/x.y.z
[...]
-------- /ccc/contxxx/home/shspace/shspace/products/modules/modulefiles --------
thecode/1.2.3   conf/thecode/production   conf/thecode/tuningtest
libprod/1.0     libprod/2.0               thetool/2

$ module load thetool
$ module list
Currently Loaded Modulefiles:
1) ccc          3) dfldatadir/own   5) extenv/shspace
2) datadir/own  4) datadir/shspace  6) thetool/2

The next section gives some hints for writing your own modulefiles.

Write your own modulefile

This section describes how to write a modulefile through the example of a product installed in the shared space named shspace. This product is called TheCode and it is installed in version 1.2.3. This product depends on the library libprod and on configuration enabled via the conf/thecode modulefile. It is advised to install this product in the $SHSPACE_PRODUCTSHOME/thecode-1.2.3 directory.

Turning to modulefiles, we suggest having one modulefile for each version of the product and one common modulefile, referenced by all version-specific modulefiles, which contains all the definitions related to the product. In our example for the TheCode product, this means first having a version-specific modulefile at $SHSPACE_MODULEFILES/thecode/1.2.3 containing:

#%Module1.0

# Software description
set version "1.2.3"

# load common functions and behavior
source $env(SHSPACE_MODULEFILES)/thecode/.common

In this version-specific modulefile, we just set the version number, then load the common definitions for the product. This way, the product definition is easily shared between the different available versions of the product. Moving on to the common modulefile at $SHSPACE_MODULEFILES/thecode/.common:

#%Module1.0

# Software description
set whatis      "TheCode"
set software    "thecode"
set description "One sentence to describe what TheCode is done for"

# Conflict
conflict $software
# Prerequisite
prereq   conf/thecode
prereq   libprod

# load head common functions and behavior
source $env(MODULEFILES)/.scripts/.headcommon

# Loads software's environment
# application-specific variables
set prefix       "$env(SHSPACE_PRODUCTSHOME)/$software-$version"
# bindir is used by append-path below; $prefix/bin is an assumed location
set bindir       "$prefix/bin"
set libdir       "$prefix/lib"
set incdir       "$prefix/include"
# compilerwrappers-specific variables
set ldflags      "<ldflags>"

append-path PATH            "$bindir"
append-path LD_LIBRARY_PATH "$libdir"

setenv VARNAME "VALUE"

# load common functions and behavior
source $env(MODULEFILES)/.scripts/.common

This common modulefile first defines the identity card of the product by setting the local variables software, description, etc. Conflicts and prerequisites are then set up with the conflict and prereq keywords. Some computing center global definitions are loaded next, and more are loaded at the end of the file. These definitions give your modulefiles the same behavior as the computing center's regular modulefiles, for instance the message printed at module load and the structured description returned when calling module help on a product.

Then special local variables are set to define the environment of the software. When set, the variables prefix, libdir, incdir, cflags and ldflags automatically create the environment variables THECODE_ROOT, THECODE_LIBDIR, THECODE_INCDIR, THECODE_CFLAGS and THECODE_LDFLAGS when the modulefile is loaded. These environment variables are derived and set by the global common file $MODULEFILES/.scripts/.common, sourced at the end of the application-specific common file $SHSPACE_MODULEFILES/thecode/.common. Environment variables needed by the software can also be defined or adjusted, as is done in the example for the VARNAME, PATH and LD_LIBRARY_PATH variables.

Note

Modulefile inter-dependencies are managed automatically by module. Any time the modulefile is loaded, its dependencies, defined with the prereq keyword, are automatically loaded.

We recommend using the previous example as a template for all your modulefiles. If you use the recommended path for all product installations, you can keep most of this template as is. Just specify the right software name and version, the correct dependencies and the product-specific environment variables, like VARNAME in the example. Using this template ensures that your modulefile behaves the same way as the default modules and that all the available module commands work with it.
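
Since a version-specific modulefile only sets the version and sources the common file, adding a new version can be scripted. The sketch below writes such a file from the shell. new_version_modulefile is a hypothetical helper, version 1.2.4 is a made-up example, and the generated file hard-codes the SHSPACE_MODULEFILES variable of the shared space named shspace used throughout this section.

```shell
# Hypothetical helper: write a version-specific modulefile that only sets
# the version and sources the product's .common file.
# Usage: new_version_modulefile <modulefiles dir> <product> <version>
new_version_modulefile() {
    mkdir -p "$1/$2" || return 1
    cat > "$1/$2/$3" <<EOF
#%Module1.0

# Software description
set version "$3"

# load common functions and behavior
source \$env(SHSPACE_MODULEFILES)/$2/.common
EOF
}

# Example (assumes extenv/shspace is loaded so SHSPACE_MODULEFILES is set):
#   new_version_modulefile "$SHSPACE_MODULEFILES" thecode 1.2.4
```

The helper does not create the .common file or install the product; it only stamps out the per-version file shown earlier.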

To further learn the syntax of a modulefile, please refer to the modulefile and module man pages. Moreover a modulefile is interpreted by the module command as a TCL script, so you can use the TCL code and functions. For more information on the TCL language, please refer to http://tcl-lang.org/doc/.

Improve Modules performance with a cache (Advanced Usage)

The cachebuild sub-command of the module system creates a cache file in module paths (.modulecache). Without arguments, it attempts to create cache in every enabled modulepath where the running user has write access. If arguments are provided, the cache is built in the directories specified by these arguments.

When dealing with environments that utilize a large number of module files, filesystem performance issues can arise. Invoking a module command scans the modulepath for available modules, a process that can significantly delay operations in systems with extensive module collections. This scanning puts a strain on the filesystem with a high volume of read operations, particularly in shared or high-performance computing environments. The cumulative effect of multiple users executing such operations can further exacerbate the issue.

The module cachebuild command mitigates these challenges by generating a cache file for the module paths. This cache acts as a snapshot of available modules, enabling the module command to quickly identify the available modules without scanning the entire modulepath. Reducing the need for extensive directory reads significantly decreases the time required to initialize module environments and execute module-related operations, thus improving overall system performance and user experience.

Additionally, by lessening the impact on the filesystem, module cachebuild aids in maintaining system responsiveness and stability, especially important in multi-user environments where filesystem performance is crucial. Users are advised to regularly update the module cache, particularly after adding or updating modules, to ensure that the cache accurately represents the current state of the module paths.

  1. To run without any arguments, simply execute:

    module cachebuild
    

    This command attempts to create a cache file in every enabled module path where the running user has write permissions.

  2. To build the cache in specific directories, provide the paths as arguments:

    module cachebuild /path/to/custom/modulepath1 /path/to/custom/modulepath2
    

    This builds the cache only in the specified directories, offering more control over the cache generation process.

  3. A good practice with extenv:

    module cachebuild $SHSPACE_MODULEFILES
    

    This builds the cache of $SHSPACE_MODULEFILES.

Regularly using module cachebuild can significantly enhance the responsiveness of module operations, particularly in environments with a large assortment of modules. It presents a straightforward yet effective strategy to mitigate filesystem-related performance issues, making it a strongly recommended practice for users.

Warning

After any modification to modules (such as adding, removing, or updating a module), it is crucial to rerun module cachebuild to update the cache accordingly. Failing to do so may result in discrepancies between the cache and the actual available modules.
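
To make the rebuild a routine step after each change, it can be wrapped in a small function that first checks the target directory. This is a sketch only; refresh_cache is a hypothetical helper name, and it assumes the module command is defined in the current shell.

```shell
# Hypothetical helper: rebuild the cache of one modulepath, but only if
# the directory exists and the current user can write to it.
refresh_cache() {
    if [ -d "$1" ] && [ -w "$1" ]; then
        module cachebuild "$1"
    else
        echo "skipping $1: not a writable directory" >&2
        return 1
    fi
}

# Example, after adding or updating a modulefile in a shared space:
#   refresh_cache "$SHSPACE_MODULEFILES"
```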