PyTorch on MOGON NHR/KI

May 2, 2025 in Guide, Example by Jens Rutten (2-minute read)

This short example walks through using PyTorch with GPU acceleration in an interactive job on MOGON NHR/KI.

PyTorch in an (interactive) Nutshell

For MOGON NHR and MOGON KI users with GPU access, this example demonstrates how to use PyTorch with GPU acceleration in interactive jobs via an available container.

Start an Interactive Job

Run the following command to start an interactive job on MOGON NHR with the requested resources:

 salloc -t 10 -p a40 --gres=gpu:2

Here, you are requesting:

  • 10 minutes of runtime (-t 10)
  • on the a40 partition (-p a40)
  • with two GPUs (--gres=gpu:2)
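Once the allocation is granted, it can help to confirm that the shell really is inside a Slurm job. The following is a minimal, stdlib-only sketch. It assumes that Slurm exports SLURM_JOB_ID, SLURM_JOB_PARTITION, and SLURM_JOB_NODELIST in the allocation shell; these are standard Slurm variables, but site configuration may differ. The function names describe_allocation and in_allocation are illustrative and not part of any MOGON tooling.

```python
import os
from typing import Mapping

# Variables Slurm usually exports inside an allocation (assumption: your site sets them).
SLURM_VARS = ("SLURM_JOB_ID", "SLURM_JOB_PARTITION", "SLURM_JOB_NODELIST")


def describe_allocation(env: Mapping[str, str] = os.environ) -> dict:
    """Return the selected Slurm variables, with None for any that are unset."""
    return {name: env.get(name) for name in SLURM_VARS}


def in_allocation(env: Mapping[str, str] = os.environ) -> bool:
    """True if a Slurm job ID is present, i.e. the shell belongs to a job."""
    return bool(env.get("SLURM_JOB_ID"))


if __name__ == "__main__":
    for name, value in describe_allocation().items():
        print(f"{name}={value if value is not None else '<unset>'}")
    print("inside a Slurm job:", in_allocation())
```

If SLURM_JOB_ID shows up as unset, the allocation has probably not started yet or the shell is not the one opened by salloc.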

Environment Variables and Modules

module purge
module use /apps/easybuild/ood/modules/all/

  • module purge: Removes all loaded modules to avoid conflicts.
  • module use /apps/easybuild/ood/modules/all/: Adds the path for the required modules to the module search path.

Load the JupyterLab Module

module load tools/JupyterLab/4.2.5_gpu_dev

This module provides the environment variables and the container for JupyterLab, which includes PyTorch.

Verify PyTorch Installation

Run the following command to verify the PyTorch installation and GPU selection:

srun apptainer exec --nv $JUPYTERLAB_IMAGE python3 -c "import torch; print(f'PyTorch is installed: {torch.__version__}'); print(f'CUDA available: {torch.cuda.is_available()}'); print([(i, torch.cuda.get_device_properties(i)) for i in range(torch.cuda.device_count())])"

  • srun apptainer exec --nv: Runs the command in the specified container with NVIDIA GPU support.
  • $JUPYTERLAB_IMAGE: The environment variable containing the path to the JupyterLab container (see below).
  • python3 -c "...": Runs the Python code directly to display PyTorch and GPU properties.

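The one-liner is compact but hard to read or extend. As a sketch, the same checks can be written as a small script. The function name gpu_report and the output wording for each GPU are illustrative choices, not taken from the original command. PyTorch is imported lazily so that a missing installation is reported instead of causing a traceback.

```python
def gpu_report() -> list:
    """Collect the same facts as the one-liner, one line of text per fact."""
    try:
        import torch  # imported lazily so a missing install is reported, not fatal
    except ImportError:
        return ["PyTorch is not installed in this environment"]

    lines = [
        f"PyTorch is installed: {torch.__version__}",
        f"CUDA available: {torch.cuda.is_available()}",
    ]
    for i in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(i)
        gib = props.total_memory / 1024**3  # total_memory is reported in bytes
        lines.append(
            f"GPU {i}: {props.name}, {gib:.1f} GiB, "
            f"compute capability {props.major}.{props.minor}"
        )
    return lines


if __name__ == "__main__":
    print("\n".join(gpu_report()))
```

Saved under a hypothetical name such as check_gpu.py in a directory that is visible inside the container, it could be run with srun apptainer exec --nv $JUPYTERLAB_IMAGE python3 check_gpu.py.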
JupyterLab Container Path

The module tools/JupyterLab/4.2.5_gpu_dev sets $JUPYTERLAB_IMAGE to:

JUPYTERLAB_IMAGE=/apps/easybuild/ood/software/jupyterlab/4.2.5/mod_JupyterLab-4.2.5_gpu.sif

This container is based on the official Jupyter PyTorch Notebook image and is used in MOD's JupyterApp.
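If the module has not been loaded, $JUPYTERLAB_IMAGE is empty and the apptainer command fails with a less obvious error. A minimal sketch of a pre-flight check follows; the function name resolve_image and the error messages are hypothetical, and it only assumes that the variable should point at an existing file.

```python
import os
from pathlib import Path
from typing import Mapping


def resolve_image(env: Mapping[str, str] = os.environ) -> Path:
    """Return the container path from JUPYTERLAB_IMAGE, or raise a clear error."""
    value = env.get("JUPYTERLAB_IMAGE")
    if not value:
        raise RuntimeError("JUPYTERLAB_IMAGE is not set; load the JupyterLab module first")
    image = Path(value)
    if not image.is_file():
        raise FileNotFoundError(f"container image not found: {image}")
    return image


if __name__ == "__main__":
    try:
        print(f"Using container: {resolve_image()}")
    except (RuntimeError, FileNotFoundError) as err:
        print(f"Problem: {err}")
```

Passing the environment as a parameter keeps the check easy to try with a fake mapping, without touching the real shell environment.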

Output

You should get an output similar to the following:

PyTorch is installed: 2.5.1+cu121
CUDA available: True
[
(0, _CudaDeviceProperties(name='NVIDIA A40', major=8, minor=6, total_memory=45499MB, multi_processor_count=84, uuid=9ae31388-f709-b016-6fd3-3c9d613749d6, L2_cache_size=6MB)), 
(1, _CudaDeviceProperties(name='NVIDIA A40', major=8, minor=6, total_memory=45499MB, multi_processor_count=84, uuid=923ad8b2-b993-af05-24aa-21885c053243, L2_cache_size=6MB))
]

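With both GPUs visible, a natural next step is to run a small computation on each of them. The sketch below is an illustration rather than part of the original walkthrough: the function name matmul_per_device and the matrix size are arbitrary choices. It falls back to the CPU when no GPU is visible and returns an empty result when PyTorch is missing.

```python
def matmul_per_device(n: int = 1024) -> dict:
    """Multiply two random n x n matrices on every visible device.

    Returns {device_name: result_shape}. Falls back to the CPU when no GPU is
    visible, and returns an empty dict when PyTorch itself is missing.
    """
    try:
        import torch
    except ImportError:
        return {}

    count = torch.cuda.device_count() if torch.cuda.is_available() else 0
    devices = [f"cuda:{i}" for i in range(count)] or ["cpu"]
    shapes = {}
    for name in devices:
        a = torch.rand(n, n, device=name)
        b = torch.rand(n, n, device=name)
        c = a @ b  # the product is computed on the device that holds a and b
        shapes[name] = tuple(c.shape)
    return shapes


if __name__ == "__main__":
    for device, shape in matmul_per_device().items():
        print(f"{device}: result shape {shape}")
```

In the two-GPU allocation from this example, the script would be expected to report one line each for cuda:0 and cuda:1 when run inside the container with the --nv option.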
Feedback Welcome

This container can be further customized or updated to meet specific needs. Please share suggestions or issues to help us enhance this resource.