Software

Python

Anaconda and Miniconda modules

We recommend loading Anaconda or Miniconda to use Python on Ruche. Several versions of Anaconda and Miniconda are available:

[username@ruche01~]$ module avail anaconda
------------------ /gpfs/softs/modules/modulefiles/languages -------------------
   anaconda3/2020.02/gcc-9.2.0

[username@ruche01~]$ module avail miniconda
------------------ /gpfs/softs/modules/modulefiles/languages -------------------
   miniconda2/4.7.12.1/intel-19.0.3.199
   miniconda3/4.7.12.1/intel-19.0.3.199

Note

The anaconda2 and miniconda2 modules provide the Python 2 series, while anaconda3 and miniconda3 provide the Python 3 series.
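
For instance, you can check which Python series a module provides by loading it and querying the interpreter (a minimal check using one of the modules listed above):

[username@ruche01 ~]$ module load miniconda3/4.7.12.1/intel-19.0.3.199
[username@ruche01 ~]$ python --version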

Conda base directory

Anaconda installs all packages into the directory $HOME/.conda. These files are therefore counted against your $HOME quota (10 GB). Since large conda environments (such as the AI frameworks pytorch/keras/tensorflow) can take several GB of data, you may need to move this directory to your workdir space and create a symlink.

[username@ruche ~]$ mv $HOME/.conda $WORKDIR/.conda
[username@ruche ~]$ ln -s $WORKDIR/.conda ~/.conda
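
You can then check that the symlink is in place and monitor the size of your environments (a quick check, assuming the paths used above):

[username@ruche ~]$ ls -ld ~/.conda
[username@ruche ~]$ du -sh $WORKDIR/.conda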

Activate conda environment

[username@ruche ~]$ module load anaconda3/2020.02/gcc-9.2.0
[username@ruche ~]$ source activate myenv
(myenv) [username@ruche ~]$

Warning

Please do not use conda activate myenv followed by conda init bash to activate an environment: this would modify your /gpfs/users/username/.bashrc file and may result in a conflict with other environments. Use source activate myenv instead.

Note that environments and packages are stored in the .conda directory located in your home directory. In order to avoid exceeding the quota on your home directory, we suggest you create a .conda directory in your workdir (or move the existing .conda from your home directory to your workdir) and create a link in your home directory, as in the example above.

Create conda environments

Loading an Anaconda module provides you with a large range of packages, including some useful Python packages:

[username@ruche ~]$ module load anaconda3/2020.02/gcc-9.2.0
[username@ruche ~]$ conda list
# packages in environment at /gpfs/softs/languages/anaconda3/2020.02:
#
# Name                    Version                   Build  Channel
(...)
numpy                     1.18.1           py37h4f9e942_0 
(...)
pycurl                    7.43.0.5         py37h1ba5d50_0  
pydocstyle                4.0.1                      py_0  
pyflakes                  2.1.1                    py37_0  
pygments                  2.5.2                      py_0  
pylint                    2.4.4                    py37_0  
pyodbc                    4.0.30           py37he6710b0_0  
pyopenssl                 19.1.0                   py37_0  
pyparsing                 2.4.6                      py_0  
pyqt                      5.9.2            py37h05f1152_2  
pyrsistent                0.15.7           py37h7b6447c_0  
pysocks                   1.7.1                    py37_0  
pytables                  3.6.1            py37h71ec239_0  
pytest                    5.3.5                    py37_0  
(...)
python                    3.7.6                h0371630_2
(...)

If you need a Python package that is not available in the default environment (base), you can easily create your own environment to install it (see the conda user guide for more details: https://conda.io/docs/user-guide/index.html). Here is an example for mpi4py:

[username@ruche ~]$ module load anaconda3/2020.02/gcc-9.2.0
[username@ruche ~]$ conda create -n myenv
Solving environment: done
## Package Plan ##
  environment location: /home/username/.conda/envs/myenv
Proceed ([y]/n)? y
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use:
# > source activate myenv
#
# To deactivate an active environment, use:
# > source deactivate
#

[username@ruche ~]$ source activate myenv

(myenv) [username@ruche ~]$ conda install mpi4py
Solving environment: done
## Package Plan ##
  environment location: /home/username/.conda/envs/myenv
  added / updated specs: 
    - mpi4py

The following packages will be downloaded:
    package                    |            build
    ---------------------------|-----------------
(...)
    mpi4py-3.0.3               |   py38h028fd6f_0         572 KB
    mpich-3.3.2                |       hc856adb_0         3.8 MB
    python-3.8.3               |       hcff3b4d_0        49.1 MB
(...)
    ------------------------------------------------------------
                                           Total:        37.9 MB
The following NEW packages will be INSTALLED:
(...)
  mpi4py             pkgs/main/linux-64::mpi4py-3.0.3-py38h028fd6f_0
  mpich              pkgs/main/linux-64::mpich-3.3.2-hc856adb_0
  python             pkgs/main/linux-64::python-3.8.3-hcff3b4d_0
(...)
Proceed ([y]/n)? y
Downloading and Extracting Packages
(...)
mpi4py-3.0.3         | 572 KB    | ##################################### | 100%
(...)
Preparing transaction: done
Verifying transaction: done
Executing transaction: done

(myenv) [username@ruche ~]$ conda list
# packages in environment at /home/username/.conda/envs/myenv:
#
# Name                    Version                   Build  Channel
(...)
mpi4py                    3.0.3            py38h028fd6f_0  
mpich                     3.3.2                hc856adb_0  
(...) 
python                    3.8.3                hcff3b4d_0  
(...)
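
Once created, the environment can be used in a batch job. Here is a minimal sketch of a Slurm script running an mpi4py program with this environment (the script name my_mpi4py_script.py is illustrative, the launcher comes from the mpich package installed above, and the resources and partition should be adapted to your needs):

#!/bin/bash

#SBATCH --job-name=job-mpi4py
#SBATCH --output=%x.o%j
#SBATCH --time=00:20:00
#SBATCH --nodes=1
#SBATCH --ntasks=4
#SBATCH --partition=cpu_short

# Module load
module purge
module load anaconda3/2020.02/gcc-9.2.0

# Activate the conda environment containing mpi4py
source activate myenv

# Run the MPI program with the mpirun provided by the mpich package of the environment
mpirun -n $SLURM_NTASKS python my_mpi4py_script.py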

JupyterHub

No JupyterHub server is available on ruche.

If you want to develop a code using jupyter, a JupyterHub server is open to the users of CentraleSupélec, ENS Paris-Saclay and Université Paris Saclay: https://jupyterhub.ijclab.in2p3.fr/

You can connect to this server through Renater/eduGAIN using the credentials provided by your institution (your email credentials).

Jupyter notebooks can be converted to Python and used as Python scripts, which is the recommended way to run a computation on Ruche.

$ jupyter nbconvert testnotebook.ipynb --to python

Note

The jupyter binary is not available on ruche but can be installed within a conda environment.
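
For example, jupyter can be installed in a dedicated conda environment and used to convert a notebook (a minimal sketch; the environment name jupyterenv is arbitrary):

[username@ruche ~]$ module load anaconda3/2020.02/gcc-9.2.0
[username@ruche ~]$ conda create -n jupyterenv jupyter
[username@ruche ~]$ source activate jupyterenv
(jupyterenv) [username@ruche ~]$ jupyter nbconvert testnotebook.ipynb --to python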

Matlab

Note

The visualization interface of ruche is not suited for computations: it is aimed at postprocessing large data with tools such as Paraview. Please do not use it for Matlab: Matlab computations can be easily performed using a Slurm script as explained below.

Matlab modules

Several versions of Matlab are available on Ruche:

[username@ruche01 ~]$ module avail matlab
-------------------------------- /gpfs/softs/modules/modulefiles/softwares ---------------------------------
   matlab/R2020a/intel-19.0.3.199    matlab/R2020b/intel-19.0.3.199

Load the version that you need:

[username@ruche01 ~]$ module load matlab/R2020a/intel-19.0.3.199

Managing figures when using Matlab without Desktop

The -nodisplay option starts Matlab without Desktop (see the Slurm script below).

In this case, you may need to export figures. To do so, you can use print in your Matlab script:

[username@ruche01 ~]$ cat test.m
fig = figure;
x = 0:pi/100:2*pi;
y = sin(x);
plot(x,y)
print(fig,'MySavedPlot','-dpng')

Built-in multithreading and Slurm script

As indicated on the Mathworks website, linear algebra and numerical functions such as fft, \ (mldivide), eig, svd, and sort are multithreaded in Matlab. Multithreaded computations have been enabled by default in Matlab since Release 2008a. These functions automatically execute on multiple computational threads in a single Matlab session, allowing them to run faster on multicore machines. Additionally, many functions in the Image Processing Toolbox are multithreaded.

By default, the maximum number of computational threads used by Matlab is equal to the number of physical cores on the node. Therefore, we recommend that you ensure that your Matlab script only uses the number of cores that you requested in your Slurm script (--cpus-per-task). The Matlab maxNumCompThreads(N) function or the -singleCompThread startup option can be used to set the maximum number of computational threads or to limit Matlab to a single computational thread, respectively (see the Mathworks documentation for more details).
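
For example, to limit Matlab to a single computational thread, the option can be passed directly on the command line (a minimal illustration; myscript refers to a hypothetical myscript.m file):

[username@ruche01 ~]$ matlab -singleCompThread -nodisplay -r myscript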

Note

If the number of threads used by Matlab does not match the number of CPUs requested in your Slurm script, your computation may stop with an error.

Here is an example of a Matlab script (test.m) running on 2 threads and the corresponding Slurm script:

[username@ruche01 ~]$ cat test.m
LASTN = maxNumCompThreads(2)
a = rand(20000);
tic;rcond(a);toc

#!/bin/bash

#SBATCH --job-name=job-matlab
#SBATCH --output=%x.o%j 
#SBATCH --time=00:20:00 
#SBATCH --nodes=1 
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=2
#SBATCH --partition=cpu_short

# Module load
module purge
module load matlab/R2020a/intel-19.0.3.199

matlab -nodisplay -r test
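
The job can then be submitted with sbatch (the script file name job_matlab.sh below is illustrative):

[username@ruche01 ~]$ sbatch job_matlab.sh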

Abaqus

The current versions of Abaqus on ruche are restricted to LMPS users due to licensing policies. Any user outside of LMPS willing to use Abaqus must contact ruche.support@universite-paris-saclay.fr for more information and to obtain a Slurm script example for this software.

Comsol

The current versions of Comsol on ruche are restricted to GEEPS users due to licensing policies. Any user outside of GEEPS willing to use Comsol must contact ruche.support@universite-paris-saclay.fr for more information.

Docker/Singularity

Containers

Containers are a common solution for running an application within a separate, customized system environment. A container is similar to a virtual machine, but does not rely on full emulation and is lighter to execute.

Containers are used to ensure the portability of an application between different platforms, since you can run the same container on several hosts. The code running in the container always sees the same OS environment, which you can customize to fit the application's needs (OS distribution, configuration, packages, libraries, etc.).

Using containers on ruche

The recommended way to run containers on ruche is to use Singularity.

For security reasons (Docker requires superuser rights), Docker cannot be installed on ruche. As an alternative, you can execute Docker images with Singularity (see https://sylabs.io/guides/3.5/user-guide/singularity_and_docker.html for further details).

Singularity binaries are available with the module singularity/3.5.2/gcc-9.2.0.

[user@ruche01 ~]$ module load singularity/3.5.2/gcc-9.2.0 
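
For example, a Docker image can be pulled from a registry and executed with Singularity (a minimal sketch; the ubuntu image is only an illustration):

[user@ruche01 ~]$ singularity pull docker://ubuntu:20.04
[user@ruche01 ~]$ singularity exec ubuntu_20.04.sif cat /etc/os-release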

Nvidia docker

Nvidia provides a collection of preconfigured containers for common GPU workloads such as deep learning frameworks or Nvidia-supported libraries. Nvidia Docker is a common solution to ensure the compatibility of a deep learning code running in several environments (a personal workstation and an HPC cluster, for instance).

https://ngc.nvidia.com/catalog/containers/

Warning

Since the Nvidia libraries within the container will call the Nvidia driver of the host OS, you must check the compatibility between the two. The current Nvidia driver version on ruche is specified in the "Cluster Overview" section of the documentation.
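
As an illustration, here is a sketch of pulling an NGC container and running it with Singularity's --nv option, which exposes the host Nvidia driver inside the container (the image name and the tag XX.YY are placeholders to be chosen from the catalog according to the driver compatibility note above):

[user@ruche01 ~]$ module load singularity/3.5.2/gcc-9.2.0
[user@ruche01 ~]$ singularity pull docker://nvcr.io/nvidia/pytorch:XX.YY-py3
[user@ruche01 ~]$ singularity exec --nv pytorch_XX.YY-py3.sif python -c "import torch; print(torch.cuda.is_available())"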