Software
Python
Anaconda and Miniconda modules
We recommend loading Anaconda or Miniconda to use Python on Ruche. Several versions of Anaconda and Miniconda are available:
[username@ruche01~]$ module avail anaconda
------------------ /gpfs/softs/modules/modulefiles/languages -------------------
anaconda3/2020.02/gcc-9.2.0
[username@ruche01~]$ module avail miniconda
------------------ /gpfs/softs/modules/modulefiles/languages -------------------
miniconda2/4.7.12.1/intel-19.0.3.199
miniconda3/4.7.12.1/intel-19.0.3.199
Note
The modules anaconda2 and miniconda2 provide the Python 2 series, while anaconda3 and miniconda3 provide the Python 3 series.
Conda base directory
Anaconda installs all packages into the directory $HOME/.conda. These files are therefore counted against your $HOME quota (10 GB). Since large conda environments (like the AI frameworks pytorch/keras/tensorflow) can take several GB of data, you may need to move this directory to your workdir space and create a symlink:
[username@ruche ~]$ mv $HOME/.conda $WORKDIR/.conda
[username@ruche ~]$ ln -s $WORKDIR/.conda ~/.conda
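You can then check that the link is in place (the reported target depends on your workdir path):
[username@ruche ~]$ ls -ld ~/.conda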
Activate conda environment
[username@ruche ~]$ module load anaconda3/2020.02/gcc-9.2.0
[username@ruche ~]$ source activate myenv
(myenv) [username@ruche ~]$
Warning
Please do not use conda activate myenv (which requires running conda init bash) to activate an environment: this would modify your /gpfs/users/username/.bashrc file and may result in conflicts with other environments. Use source activate myenv instead.
Note that environments and packages are stored in the .conda directory located in your home directory. To avoid exceeding the quota of your home directory, we suggest that you create a .conda directory in your workdir (or move the existing .conda from your home directory to your workdir) and create a link in your home directory, as in the example above.
Create conda environments
Loading an Anaconda module provides you with a large range of packages, including some useful Python packages:
[username@ruche ~]$ module load anaconda3/2020.02/gcc-9.2.0
[username@ruche ~]$ conda list
# packages in environment at /gpfs/softs/languages/anaconda3/2020.02:
#
# Name Version Build Channel
(...)
numpy 1.18.1 py37h4f9e942_0
(...)
pycurl 7.43.0.5 py37h1ba5d50_0
pydocstyle 4.0.1 py_0
pyflakes 2.1.1 py37_0
pygments 2.5.2 py_0
pylint 2.4.4 py37_0
pyodbc 4.0.30 py37he6710b0_0
pyopenssl 19.1.0 py37_0
pyparsing 2.4.6 py_0
pyqt 5.9.2 py37h05f1152_2
pyrsistent 0.15.7 py37h7b6447c_0
pysocks 1.7.1 py37_0
pytables 3.6.1 py37h71ec239_0
pytest 5.3.5 py37_0
(...)
python 3.7.6 h0371630_2
(...)
If you need a Python package that is not available in the default environment (base), you can easily create your own environment and install it there (see the conda user guide for more details: https://conda.io/docs/user-guide/index.html). Here is an example for mpi4py:
[username@ruche ~]$ module load anaconda3/2020.02/gcc-9.2.0
[username@ruche ~]$ conda create -n myenv
Solving environment: done
## Package Plan ##
environment location: /home/username/.conda/envs/myenv
Proceed ([y]/n)? y
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use:
# > source activate myenv
#
# To deactivate an active environment, use:
# > source deactivate
#
[username@ruche ~]$ source activate myenv
(myenv) [username@ruche ~]$ conda install mpi4py
Solving environment: done
## Package Plan ##
environment location: /home/username/.conda/envs/myenv
added / updated specs:
- mpi4py
The following packages will be downloaded:
package | build
---------------------------|-----------------
(...)
mpi4py-3.0.3 | py38h028fd6f_0 572 KB
mpich-3.3.2 | hc856adb_0 3.8 MB
python-3.8.3 | hcff3b4d_0 49.1 MB
(...)
------------------------------------------------------------
Total: 37.9 MB
The following NEW packages will be INSTALLED:
(...)
mpi4py pkgs/main/linux-64::mpi4py-3.0.3-py38h028fd6f_0
mpich pkgs/main/linux-64::mpich-3.3.2-hc856adb_0
python pkgs/main/linux-64::python-3.8.3-hcff3b4d_0
(...)
Proceed ([y]/n)? y
Downloading and Extracting Packages
(...)
mpi4py-3.0.3 | 572 KB | ##################################### | 100%
(...)
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
(myenv) [username@ruche ~]$ conda list
# packages in environment at /home/username/.conda/envs/myenv:
#
# Name Version Build Channel
(...)
mpi4py 3.0.3 py38h028fd6f_0
mpich 3.3.2 hc856adb_0
(...)
python 3.8.3 hcff3b4d_0
(...)
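Once created, the environment can be used in a batch job. Below is a minimal sketch of a Slurm script that activates the myenv environment and runs a hypothetical script my_script.py with the mpirun installed in the environment; the job name, resource values and script name are only illustrative:
#!/bin/bash
#SBATCH --job-name=job-mpi4py
#SBATCH --output=%x.o%j
#SBATCH --time=00:20:00
#SBATCH --nodes=1
#SBATCH --ntasks=2
#SBATCH --partition=cpu_short
# Module load and environment activation
module purge
module load anaconda3/2020.02/gcc-9.2.0
source activate myenv
# Run the Python script with the MPI runtime installed in the environment
mpirun -np $SLURM_NTASKS python my_script.py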
JupyterHub
No JupyterHub server is available on ruche.
If you want to develop code using jupyter, a JupyterHub server is open to the users of CentraleSupélec, ENS Paris-Saclay and Université Paris Saclay: https://jupyterhub.ijclab.in2p3.fr/
You can connect to this server using Renater/EduGain and the identifiers provided by your institution (email identifiers).
Jupyter notebooks can be converted to Python and used as Python scripts, which is the recommended way to run a computation on Ruche:
$ jupyter nbconvert testnotebook.ipynb --to python
Note
The jupyter binary is not available on ruche but can be installed within a conda environment.
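For example, jupyter can be installed in a dedicated conda environment (the environment name and the conda-forge channel below are just one possible choice):
[username@ruche ~]$ module load anaconda3/2020.02/gcc-9.2.0
[username@ruche ~]$ conda create -n jupyterenv
[username@ruche ~]$ source activate jupyterenv
(jupyterenv) [username@ruche ~]$ conda install -c conda-forge jupyter
(jupyterenv) [username@ruche ~]$ jupyter nbconvert testnotebook.ipynb --to python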
Matlab
Note
The visualization interface of ruche is not suited for computations: it is aimed at postprocessing large data with tools such as Paraview. Please do not use it for Matlab: Matlab computations can be easily performed using a Slurm script as explained below.
Matlab modules
Several versions of Matlab are available on Ruche:
[username@ruche01 ~]$ module avail matlab
-------------------------------- /gpfs/softs/modules/modulefiles/softwares ---------------------------------
matlab/R2020a/intel-19.0.3.199 matlab/R2020b/intel-19.0.3.199
Load the version that you need:
[username@ruche01 ~]$ module load matlab/R2020a/intel-19.0.3.199
Managing figures when using Matlab without Desktop
The -nodisplay option starts Matlab without the desktop (see the Slurm script below). In this case, you may need to export figures. To do so, you can use print in your Matlab script:
[username@ruche01 ~]$ cat test.m
fig = figure;
x = 0:pi/100:2*pi;
y = sin(x);
plot(x,y)
print(fig,'MySavedPlot','-dpng')
Built-in multithreading and Slurm script
As indicated on the Mathworks website, linear algebra and numerical functions such as fft, \ (mldivide), eig, svd, and sort are multithreaded in Matlab. Multithreaded computations have been on by default in Matlab since Release 2008a. These functions automatically execute on multiple computational threads in a single Matlab session, allowing them to execute faster on multicore machines. Additionally, many functions in the Image Processing Toolbox are multithreaded.
The maximum number of computational threads used by Matlab is equal to the number of physical cores on the node. Therefore, we recommend that you ensure that your Matlab script only uses the number of cores (ncpus) that you requested in your Slurm script. The Matlab maxNumCompThreads(N) function or the -singleCompThread option can be used to set the maximum number of computational threads or to limit Matlab to a single computational thread, respectively (see this page for more details).
Note
If the number of threads used by Matlab does not match the number of CPUs requested in your Slurm script, your computation may stop with an error.
Here is an example of a Matlab script (test.m) running on 2 threads and the corresponding Slurm script:
[username@ruche01 ~]$ cat test.m
LASTN = maxNumCompThreads(2)
a = rand(20000);
tic;rcond(a);toc
#!/bin/bash
#SBATCH --job-name=job-matlab
#SBATCH --output=%x.o%j
#SBATCH --time=00:20:00
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=2
#SBATCH --partition=cpu_short
# Module load
module purge
module load matlab/R2020a/intel-19.0.3.199
matlab -nodisplay -r test
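The job can then be submitted from a login node with sbatch (the script file name below is only an example):
[username@ruche01 ~]$ sbatch job_matlab.sh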
Abaqus
The current versions of Abaqus on ruche are restricted to LMPS users due to licensing policies. Any user outside of LMPS willing to use Abaqus must contact ruche.support@universite-paris-saclay.fr for more information. Please also contact this address to obtain a Slurm script example for this software.
Comsol
The current versions of Comsol on ruche are restricted to GEEPS users due to licensing policies. Any user outside of GEEPS willing to use Comsol must contact ruche.support@universite-paris-saclay.fr for more information.
Docker/Singularity
Containers
The usage of containers is a common solution to run an application within a separate and customized host system. A container is similar to a virtual machine, but is not built on a complete emulation and is lighter to execute.
Containers are used to ensure the portability of an application between different platforms, since you can run the same container on several docker hosts. The code that runs in the container will always see the same OS environment, which you can customize to fit the application's needs (OS distribution, configuration, packages, libraries, etc.).
Using containers on ruche
The advised solution to run containers on ruche is to use Singularity.
Due to security reasons (required access to superuser rights), Docker cannot be installed on ruche. As an alternative, you can execute docker images with singularity (see https://sylabs.io/guides/3.5/user-guide/singularity_and_docker.html for further details).
Singularity binaries are available with the module singularity/3.5.3/gcc-11.2.0.
[user@ruche01 ~]$ module load singularity/3.5.3/gcc-11.2.0
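As an illustration, a Docker image can be pulled from Docker Hub and executed with singularity; the image and command below are only an example:
[user@ruche01 ~]$ singularity pull docker://ubuntu:20.04
[user@ruche01 ~]$ singularity exec ubuntu_20.04.sif cat /etc/os-release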
Nvidia docker
Nvidia provides a collection of preconfigured containers for common GPU usages such as deep learning frameworks or Nvidia-supported libraries. Nvidia Docker is a common solution to ensure the compatibility of a deep learning code running in several environments (a personal workstation and an HPC cluster, for instance).
https://ngc.nvidia.com/catalog/containers/
Warning
Since the Nvidia driver within the container will call the Nvidia driver of the OS, you must take care of the compatibility of the two drivers. The current Nvidia driver version on ruche is specified in the "Cluster Overview" section of the documentation.
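For reference, here is a sketch of pulling an NGC image and running it with GPU support through the --nv option of singularity. The image tag is only an example and must be chosen so that it is compatible with the driver installed on ruche; the execution itself should be done on a GPU node requested through Slurm:
[user@ruche01 ~]$ module load singularity/3.5.3/gcc-11.2.0
[user@ruche01 ~]$ singularity pull docker://nvcr.io/nvidia/tensorflow:21.02-tf2-py3
[user@ruche01 ~]$ singularity exec --nv tensorflow_21.02-tf2-py3.sif python -c "import tensorflow"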