Memory Reservation
How to reserve Memory on MOGON
SLURM imposes a memory limit on each job. By default, it is deliberately small: 300 MB per CPU (or task), or 115500 MB per node for full-node jobs. If your job uses more than that without having requested it, it will fail with an error stating that it exceeded the job memory limit. To set a larger limit, add --mem=X to your job submission, where X is the maximum amount of memory your job will use per node, in MB. Different units can be specified using the suffix [K|M|G|T]. The larger your working data set, the larger this needs to be, but the smaller the number, the easier it is for the scheduler to find a place to run your job.
To determine an appropriate value, start relatively large and then use sacct to look at how much memory your job is actually using or has used. Select the job with -j JOBID, where JOBID is the ID of the job you’re interested in. Such a query returns output for all job steps; including the used CPU time and the actual elapsed time lets you compare the two, which sometimes gives useful performance hints. If your job completed long in the past, you may have to tell sacct to look further back in time by adding a start time with -S YYYY-MM-DD. Note that for parallel jobs spanning multiple nodes, the reported value is the maximum memory used on any one node; if you’re not setting an even distribution of tasks per node (e.g. with --ntasks-per-node), the same job could show very different values when run at different times.
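As a rough sketch of such a query, the shell function below wraps sacct. The function name job_usage and the selected field list are assumptions made for illustration and are not necessarily the command used on the original page; the options -j, -o and -S and the listed field names are standard sacct ones.

```shell
#!/bin/bash
# Sketch: print memory and timing fields for every step of one job.
# job_usage is a hypothetical helper name; the field list is an assumption.
job_usage() {
    local jobid="$1"      # ID of the job to inspect
    local start="${2:-}"  # optional start date (YYYY-MM-DD) for old jobs
    local args=(-j "$jobid" -o JobID,JobName,NTasks,ReqMem,MaxRSS,TotalCPU,Elapsed)
    if [ -n "$start" ]; then
        # Widen the search window for jobs that finished long ago.
        args+=(-S "$start")
    fi
    sacct "${args[@]}"
}
```

Calling job_usage with a job ID alone covers recent jobs; a second argument such as 2024-01-01 (an arbitrary example date) is passed on to -S.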
The ReqMem value indicates the memory requested at submission, in MB, followed by either c (per CPU) or n (per node).
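To illustrate how the suffix can be read, here is a small sketch. The function name reqmem_scope and the sample strings 300Mc and 115500Mn are assumptions derived from the default limits mentioned above; depending on the SLURM version, the suffix may be absent, which the sketch reports as "unspecified".

```shell
#!/bin/bash
# Sketch: classify a ReqMem string by its trailing scope letter.
# reqmem_scope is a hypothetical helper name.
reqmem_scope() {
    case "$1" in
        *c) echo "per CPU" ;;      # e.g. 300Mc
        *n) echo "per node" ;;     # e.g. 115500Mn
        *)  echo "unspecified" ;;  # no scope suffix present
    esac
}
```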
The reported memory usage (for example the MaxRSS field) is in KB, so divide by 1024 to get a rough idea of what to use with --mem (set it to something a little larger than that, since you’re defining a hard upper limit).
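The conversion can be sketched as a small shell function. The name suggest_mem_mb and the default headroom of 20 percent are assumptions chosen for illustration, not a site recommendation.

```shell
#!/bin/bash
# Sketch: turn a memory usage value in KB into a --mem suggestion in MB.
# suggest_mem_mb is a hypothetical helper; 20 % headroom is an arbitrary default.
suggest_mem_mb() {
    local used_kb="$1"             # measured usage in KB, without unit suffix
    local headroom_pct="${2:-20}"  # extra margin in percent
    # MB = KB / 1024 * (100 + headroom) / 100, rounded up with integer arithmetic
    # so the suggestion never falls below the scaled value.
    echo $(( (used_kb * (100 + headroom_pct) + 102399) / 102400 ))
}
```

For a measured usage of 2097152 KB (2 GB), suggest_mem_mb 2097152 prints 2458, which would correspond to requesting --mem=2458M. If sacct prints the value with a trailing K, strip it first, for example with ${value%K}.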
For very short jobs, the sampled values may not match the memory actually used. Jobs that were aborted quickly are likewise unsuitable for gathering the statistics you need.
Run sacct -e to get a full list of the available fields and man sacct for more detailed information.
The CPU time divided by the number of CPUs used should roughly equal the elapsed run time. If it is substantially lower, this indicates poor parallelisation.
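This rule of thumb can be expressed as a percentage. The sketch below assumes all times have already been converted to seconds; the function name cpu_efficiency_pct is hypothetical, and converting sacct's time strings to seconds is left out.

```shell
#!/bin/bash
# Sketch: CPU time per CPU as a percentage of the elapsed run time.
# cpu_efficiency_pct is a hypothetical helper; all times are in seconds.
cpu_efficiency_pct() {
    local cpu_seconds="$1"      # CPU time actually consumed
    local ncpus="$2"            # number of CPUs used by the job
    local elapsed_seconds="$3"  # wall-clock run time
    local available=$(( ncpus * elapsed_seconds ))
    if [ "$available" -le 0 ]; then
        # Avoid division by zero for jobs with no measurable run time.
        echo "n/a"
        return 1
    fi
    echo $(( 100 * cpu_seconds / available ))
}
```

A job that ran for 7200 s on 4 CPUs but consumed only 7200 s of CPU time yields 25, meaning that on average only one of the four CPUs was busy. Values close to 100 match the expectation stated above.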