Anaconda and Miniconda modules
We recommend loading Anaconda or Miniconda to use Python on Ruche. Several versions of Anaconda and Miniconda are available:
[username@ruche01~]$ module avail anaconda
------------------ /gpfs/softs/modules/modulefiles/languages -------------------
anaconda3/2020.02/gcc-9.2.0
[username@ruche01~]$ module avail miniconda
------------------ /gpfs/softs/modules/modulefiles/languages -------------------
miniconda2/184.108.40.206/intel-220.127.116.11 miniconda3/18.104.22.168/intel-22.214.171.124
The miniconda2 modules provide the Python 2 series and the miniconda3 modules provide the Python 3 series.
Conda base directory
Anaconda installs all packages into the directory $HOME/.conda. These files therefore count against your $HOME quota (10 GB). Since large conda environments (such as the AI frameworks pytorch/keras/tensorflow) can take several GB of data, you may need to move this directory to your workdir space and create a symlink.
[username@ruche ~]$ mv $HOME/.conda $WORKDIR/.conda
[username@ruche ~]$ ln -s $WORKDIR/.conda ~/.conda
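The two commands above fail if $HOME/.conda does not exist yet, and they should not be run twice. A minimal sketch of a more defensive version follows; relocate_conda_dir is a hypothetical helper name, not a command provided on Ruche, and the two paths are passed as arguments.

```shell
#!/bin/bash
# Sketch: move a conda directory to another filesystem and leave a symlink behind.
# relocate_conda_dir is a hypothetical helper, not a Ruche-provided command.
relocate_conda_dir() {
    local src="$1"    # current location, e.g. $HOME/.conda
    local dest="$2"   # new location, e.g. $WORKDIR/.conda

    # Nothing to do if the source is already a symbolic link.
    if [ -L "$src" ]; then
        echo "$src is already a symbolic link, nothing to do"
        return 0
    fi

    if [ -d "$src" ]; then
        # Refuse to overwrite an existing destination.
        if [ -e "$dest" ]; then
            echo "$dest already exists, aborting" >&2
            return 1
        fi
        mv "$src" "$dest"     # move existing environments and packages
    else
        mkdir -p "$dest"      # no conda data yet: start with an empty directory
    fi

    ln -s "$dest" "$src"      # conda keeps using the original path
}
```

On Ruche the call would be relocate_conda_dir "$HOME/.conda" "$WORKDIR/.conda". Run it while no conda environment is active.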
Activate conda environment
[username@ruche ~]$ module load anaconda3/2020.02/gcc-9.2.0
[username@ruche ~]$ source activate myenv
(myenv) [username@ruche ~]$
Please do not run conda init bash in order to use conda activate myenv: conda init modifies your /gpfs/users/username/.bashrc file, which may result in a conflict with other environments. Use source activate myenv instead.
Note that environments and packages are stored in the .conda directory located in your home directory. To avoid exceeding your home quota, we suggest that you create a .conda directory in your workdir (or move the existing .conda directory from your home directory to your workdir) and create a link to it in your home directory, as in the example above.
Create conda environments
Loading an Anaconda module provides you with a large range of packages, including some useful Python packages:
[username@ruche ~]$ module load anaconda3/2020.02/gcc-9.2.0
[username@ruche ~]$ conda list
# packages in environment at /gpfs/softs/languages/anaconda3/2020.02:
#
# Name                    Version                   Build  Channel
(...)
numpy                     1.18.1           py37h4f9e942_0
(...)
pycurl                    126.96.36.199    py37h1ba5d50_0
pydocstyle                4.0.1                      py_0
pyflakes                  2.1.1                    py37_0
pygments                  2.5.2                      py_0
pylint                    2.4.4                    py37_0
pyodbc                    4.0.30           py37he6710b0_0
pyopenssl                 19.1.0                   py37_0
pyparsing                 2.4.6                      py_0
pyqt                      5.9.2            py37h05f1152_2
pyrsistent                0.15.7           py37h7b6447c_0
pysocks                   1.7.1                    py37_0
pytables                  3.6.1            py37h71ec239_0
pytest                    5.3.5                    py37_0
(...)
python                    3.7.6                h0371630_2
(...)
If you need a Python package that is not available in the default environment (base), you can create your own environment and install the package there (see the conda user guide for more details, https://conda.io/docs/user-guide/index.html). Here is an example for mpi4py:
[username@ruche ~]$ module load anaconda3/2020.02/gcc-9.2.0
[username@ruche ~]$ conda create -n myenv
Solving environment: done
## Package Plan ##
  environment location: /home/username/.conda/envs/myenv
Proceed ([y]/n)? y
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use:
# > source activate myenv
#
# To deactivate an active environment, use:
# > source deactivate
#
[username@ruche ~]$ source activate myenv
(myenv) [username@ruche ~]$ conda install mpi4py
Solving environment: done
## Package Plan ##
  environment location: /home/username/.conda/envs/myenv
  added / updated specs:
    - mpi4py
The following packages will be downloaded:
    package                    |            build
    ---------------------------|-----------------
    (...)
    mpi4py-3.0.3               |   py38h028fd6f_0         572 KB
    mpich-3.3.2                |       hc856adb_0         3.8 MB
    python-3.8.3               |       hcff3b4d_0        49.1 MB
    (...)
    ------------------------------------------------------------
                                           Total:        37.9 MB
The following NEW packages will be INSTALLED:
  (...)
  mpi4py             pkgs/main/linux-64::mpi4py-3.0.3-py38h028fd6f_0
  mpich              pkgs/main/linux-64::mpich-3.3.2-hc856adb_0
  python             pkgs/main/linux-64::python-3.8.3-hcff3b4d_0
  (...)
Proceed ([y]/n)? y
Downloading and Extracting Packages
(...)
mpi4py-3.0.3         | 572 KB    | ##################################### | 100%
(...)
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
(myenv) [username@ruche ~]$ conda list
# packages in environment at /home/username/.conda/envs/myenv:
#
# Name                    Version                   Build  Channel
(...)
mpi4py                    3.0.3            py38h028fd6f_0
mpich                     3.3.2                hc856adb_0
(...)
python                    3.8.3                hcff3b4d_0
(...)
No JupyterHub server is available on ruche.
If you want to develop code using Jupyter, a JupyterHub server is open to the users of CentraleSupélec, ENS Paris-Saclay and Université Paris-Saclay: https://jupyterhub.ijclab.in2p3.fr/
You can connect to this server using Renater/EduGain and the identifiers provided by your institution (mail identifiers).
Jupyter notebooks can be converted to Python scripts, which is the recommended way to run a computation on Ruche.
$ jupyter nbconvert testnotebook.ipynb --to python
The jupyter binary is not available on Ruche by default, but it can be installed within a conda environment.
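Once converted, the script can be submitted as a batch job. The following is a minimal sketch that writes such a job script. It assumes that a conda environment named myenv already exists and contains the packages the notebook needs; the file name job-python.sh, the job name, and the time limit are arbitrary choices, and the partition is the one used in the Matlab example elsewhere on this page.

```shell
# Write a Slurm batch script that runs the converted notebook.
# Assumptions: the conda environment "myenv" exists and contains the packages
# the notebook needs; job-python.sh and the job name are arbitrary names.
cat > job-python.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=job-python
#SBATCH --output=%x.o%j
#SBATCH --time=00:20:00
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --partition=cpu_short

# Start from a clean module environment, then load Anaconda
module purge
module load anaconda3/2020.02/gcc-9.2.0

# Activate the environment with "source activate", not "conda activate"
source activate myenv

# testnotebook.py is the file produced by "jupyter nbconvert --to python"
python testnotebook.py
EOF
```

The job is then submitted with sbatch job-python.sh.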
The visualization interface of ruche is not suited for computations: it is aimed at postprocessing large data with tools such as Paraview. Please do not use it for Matlab: Matlab computations can be easily performed using a Slurm script as explained below.
Several versions of Matlab are available on Ruche:
[username@ruche01 ~]$ module avail matlab
-------------------------------- /gpfs/softs/modules/modulefiles/softwares ---------------------------------
matlab/R2020a/intel-188.8.131.52 matlab/R2020b/intel-184.108.40.206
Load the version that you need:
[username@ruche01 ~]$ module load matlab/R2020a/intel-220.127.116.11
Managing figures when using Matlab without Desktop
The -nodisplay option starts Matlab without the Desktop (see the Slurm script below).
In this case, figures are not displayed, so you may need to export them to files. To do so, you can use the print function:
[username@ruche01 ~]$ cat test.m
fig = figure;
x = 0:pi/100:2*pi;
y = sin(x);
plot(x,y)
print(fig,'MySavedPlot','-dpng')
Built-in multithreading and Slurm script
As indicated on the MathWorks website, linear algebra and numerical functions such as fft, \ (mldivide), eig, svd, and sort are multithreaded in Matlab. Multithreaded computations have been on by default in Matlab since Release 2008a. These functions automatically execute on multiple computational threads in a single Matlab session, allowing them to execute faster on multicore-enabled machines. Additionally, many functions in Image Processing Toolbox are multithreaded.
By default, the maximum number of computational threads used by Matlab is equal to the number of physical cores on the node. Therefore, we recommend that you ensure that your Matlab script only uses the number of cores (ncpus) that you requested in your Slurm script. The Matlab maxNumCompThreads(N) function sets the maximum number of computational threads, and the -singleCompThread option limits Matlab to a single computational thread (see this page for more details).
If the number of threads used by Matlab does not match the number of CPUs requested in your Slurm script, your computation may stop with an error.
Here is an example of a Matlab script (test.m) running on 2 threads and the corresponding Slurm script:
[username@ruche01 ~]$ cat test.m
LASTN = maxNumCompThreads(2)
a = rand(20000);
tic;rcond(a);toc
#!/bin/bash
#SBATCH --job-name=job-matlab
#SBATCH --output=%x.o%j
#SBATCH --time=00:20:00
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=2
#SBATCH --partition=cpu_short
# Module load
module purge
module load matlab/R2020a/intel-18.104.22.168
matlab -nodisplay -r test
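In the example above, the thread count appears twice: once in test.m and once in --cpus-per-task. One possible way to keep them in sync is to build the Matlab statement from the SLURM_CPUS_PER_TASK variable, which Slurm sets inside the job when --cpus-per-task is given. The sketch below is an assumption about how this could be done, not the method prescribed for Ruche. matlab_statement is a hypothetical helper name, and the trailing exit is an addition that makes Matlab quit once the script has finished.

```shell
# Print the Matlab statement that caps the thread count to the Slurm allocation.
# matlab_statement is a hypothetical helper name.
matlab_statement() {
    local script="$1"                            # Matlab script name without .m
    local nthreads="${SLURM_CPUS_PER_TASK:-1}"   # falls back to 1 outside a Slurm job
    echo "maxNumCompThreads(${nthreads}); ${script}; exit"
}
```

In the Slurm script, the last line would then become matlab -nodisplay -r "$(matlab_statement test)", and the hard-coded maxNumCompThreads(2) call would be removed from test.m so that it does not override the value.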
The current versions of Abaqus on Ruche are restricted to LMPS users due to licensing policies. Any user outside of LMPS wishing to use Abaqus must contact firstname.lastname@example.org for more information. Please contact email@example.com to obtain an example Slurm script for this software.
The current versions of Comsol on Ruche are restricted to GEEPS users due to licensing policies. Any user outside of GEEPS wishing to use Comsol must contact firstname.lastname@example.org for more information.
Using a container is a common solution to run an application within a separate and customized host system. A container is similar to a virtual machine, but it is not built on a complete emulation and is lighter to execute.
Containers are used to ensure the portability of an application between different platforms, since you can run the same container on several Docker hosts. The code that runs in the container always sees the same OS environment, which you can customize to fit the needs of the application (OS distribution, configuration, packages, libraries, etc.).
Using containers on ruche
The advised solution to run containers on ruche is to use Singularity.
For security reasons (Docker requires access to superuser rights), Docker cannot be installed on Ruche. As an alternative, you can execute Docker images with Singularity (see https://sylabs.io/guides/3.5/user-guide/singularity_and_docker.html for further details).
Singularity binaries are available with the module
[user@ruche01 ~]$ module load singularity/3.5.2/gcc-9.2.0
Nvidia provides a collection of preconfigured containers for common GPU uses, such as deep learning frameworks or Nvidia-supported libraries. Nvidia Docker is a common solution to ensure that a deep learning code runs consistently in several environments (a personal workstation and an HPC cluster, for instance).
Since the Nvidia driver within the container will call the Nvidia driver of the OS, you must take care of the compatibility of the two drivers. Current Nvidia driver version on ruche is specified in section "Cluster Overview" of the documentation.
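As an illustration, a Docker-based GPU image can be converted and run with Singularity along the following lines. This is a sketch under assumptions: run_gpu_container is a hypothetical helper name, and the image URI and file name are chosen by the caller. The singularity pull and singularity exec --nv commands are standard Singularity commands; --nv makes the host Nvidia driver and GPUs visible inside the container.

```shell
# Sketch: run a command inside a Docker-based image with Singularity.
# run_gpu_container is a hypothetical helper, not a Ruche-provided command.
run_gpu_container() {
    local uri="$1"   # Docker image URI, e.g. docker://nvcr.io/nvidia/pytorch:<tag>
    local sif="$2"   # local Singularity image file to create, e.g. pytorch.sif
    shift 2          # remaining arguments: the command to run in the container

    # Convert the Docker image to a Singularity image file once.
    if [ ! -f "$sif" ]; then
        singularity pull "$sif" "$uri" || return 1
    fi

    # --nv exposes the host Nvidia driver and GPUs inside the container.
    singularity exec --nv "$sif" "$@"
}
```

After loading the singularity module on a GPU node, a call such as run_gpu_container docker://nvcr.io/nvidia/pytorch:<tag> pytorch.sif nvidia-smi (with <tag> replaced by a tag compatible with the cluster driver) prints the GPUs seen from inside the container, which is a quick way to check driver compatibility.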