Parallel IO

MPI-IO

MPI-IO provides a low-level interface for parallel I/O. It was introduced as a standard in MPI-2.

More information on how to use MPI-IO can be found in the I/O section of the official MPI documentation.

Parallel compression and decompression with pigz

Instead of using classical compression tools like gzip (called by default by tar), we recommend its parallel counterpart pigz for faster processing.

With this tool, we advise limiting the number of compression threads to between 4 and 8 for a good performance/resource ratio. Increasing the number of threads further should not dramatically improve performance and could even slow your compression down. To speed up the process you may also lower the compression level, at the cost of a lower compression ratio. Decompression is always done by a single thread, with three more threads used for various purposes (read, write, check).
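
For instance, the thread count is set with -p and the compression level with -1 (fastest) to -9 (best); file.dat below is a placeholder name:

```shell
# Create a 10 MB dummy file to compress (placeholder name):
dd if=/dev/zero of=file.dat bs=1M count=10 status=none
# Compress with 4 threads at the default level (-6); -k keeps the input file.
pigz -p 4 -k file.dat
# A lower level is faster but compresses less; -S stores this copy under
# another suffix so it does not overwrite the first archive.
pigz -p 4 -1 -k -S .fast.gz file.dat
# Decompression is single-threaded whatever -p says; -f overwrites file.dat.
pigz -d -f -p 4 file.dat.gz
```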

Please do not use this tool on the login nodes; prefer an interactive submission with ccc_mprun -s or a batch script.

Compression and decompression example using 6 threads:

#!/bin/bash
#MSUB -n 1
#MSUB -c 6
#MSUB -q <partition>
#MSUB -A <Project>
module load pigz

# compression:
# a wrapper around pigz is needed to pass options that change
# its default behaviour.
# Note that "$@" is important because tar passes arguments to pigz
cat <<EOF > pigz.sh
#!/bin/bash
pigz -p 6 "\$@"
EOF
chmod +x ./pigz.sh

tar -I ./pigz.sh -cf folder.tar.gz folder

# decompression:
tar -I pigz -xf folder.tar.gz

For additional information, please refer to the software's man page:

$ man pigz

MpiFileUtils

MpiFileUtils is a suite of utilities for handling file trees and large files. It is optimised for HPC and uses MPI parallelisation. It offers tools for basic tasks like copy, remove, and compare on such datasets, delivering better performance than their single-process counterparts.

dcp - Copy files. Using 64 processes and 16 cores, dcp provides a data rate more than 6 times greater than a regular cp using a full node, when copying 80GB/1800 files on the scratch file system.

dtar - Create and extract tape archive files. Using 64 processes and 16 cores, dtar provides a data rate more than 6 times greater than a regular tar using a full node, when archiving 80GB/1800 files on the scratch file system.

dbcast - Broadcast a file to each compute node.

dbz2 - Compress and decompress a file with bz2.

dchmod - Change owner, group, and permissions on files.

dcmp - Compare contents between directories or files.

ddup - Find duplicate files.

dfind - Filter files.

dreln - Update symlinks to point to a new path.

drm - Remove files.

dstripe - Restripe files (Lustre).

dsync - Synchronize source and destination directories or files.

dwalk - List, sort, and profile files.
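
Each of these tools is an MPI program and must be launched through ccc_mprun inside a job. A hypothetical sketch for three of them (the paths and the "*.h5" pattern are placeholders):

```shell
ccc_mprun dwalk --sort name /path/src      # list the entries of a tree, sorted by name
ccc_mprun dfind /path/src --name "*.h5"    # keep only files matching a pattern
ccc_mprun dsync /path/src /path/dst        # make dst an exact mirror of src
```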

Sample script for dcp, dtar and drm

#!/bin/bash
#MSUB -r dcp_dtar
#MSUB -q <partition>
#MSUB -T 3600
#MSUB -n 64
#MSUB -c 16
#MSUB -x
#MSUB -A <Project>
#MSUB -m work,scratch
ml pu
ml mpi
ml mpifileutils
ccc_mprun dtar -cf to_copy.tar to_tar/ # Create an archive
ccc_mprun dcp to_copy.tar copy_dcp.tar # Copy the archive
ccc_mprun drm copy_dcp.tar # Remove the copy of the archive
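
Before removing the copy, its content could be checked against the original; a hypothetical extra step using dcmp (file names as in the script above):

```shell
ccc_mprun dcmp to_copy.tar copy_dcp.tar # Report any difference between the two archives
```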

Sample script for dstripe

#!/bin/bash
#MSUB -r dstripe
#MSUB -q rome
#MSUB -T 1500
#MSUB -Q test
#MSUB -n 16
#MSUB -c 8
#MSUB -x
#MSUB -A <Project>
#MSUB -m work,scratch
ml gnu/8
ml mpifileutils
ccc_mprun dstripe -c <number of desired stripes> to_copy.tar

For more information, see: https://mpifileutils.readthedocs.io/en/v0.11.1/