FAQs

Answers to frequently asked questions.

How can I apply for a project to get access to HPC resources? If you are a principal investigator or postdoc, you can apply using the state-wide online form (English) or online form (German). Please do not hesitate to contact us beforehand: we can discuss any open questions in person or by phone.
How can I unsubscribe from the MOGON mailing list? Unsubscribing from the mailing list is not possible. The recipient list is generated automatically from the group memberships of the corresponding MOGON projects. You therefore remain on the mailing list as long as your user account is a member of a MOGON project group (authorized users at account.uni-mainz.de) or a MOGON manager group (administrator of authorized users).
How do I get an account? This question is answered here.
How much may I work on login nodes? This is explained in detail here.
Is it possible to extend the walltime limit of 5/12 days? Unfortunately, it is not possible to extend the walltime limit, not even temporarily.
May I contribute to this documentation?

Yes, please! We welcome any contribution, but we prefer to curate the content ourselves. We would therefore appreciate it if you sent your contribution, with a detailed explanation, as a .txt file to . Alternatively, you can create a page yourself by following the procedure below:

  1. If the content is new, please create a new page that we can integrate into the existing layout and point us to the URL.
  2. If you correct existing content, please edit the page and (if the change is substantial).

More detailed instructions, including a screencast, can be found in the GoHugo Wiki.

Why do my jobs not start? A job might be pending for a number of reasons. Sometimes the job’s priority is simply low and you just have to wait.
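To see which reasons apply to your own pending jobs, you can ask squeue for the job ID and the reason code and tally the result. The following is a minimal sketch, assuming the output was produced with `squeue -u $USER -t PENDING -h -o "%i %r"`. The helper name `count_pending_reasons` and the sample job IDs are made up for illustration.

```python
from collections import Counter


def count_pending_reasons(squeue_output: str) -> Counter:
    """Tally the reason codes from lines of the form '<jobid> <reason>'."""
    reasons = Counter()
    for line in squeue_output.splitlines():
        # Split only once so a reason that contains spaces stays in one piece.
        parts = line.split(maxsplit=1)
        if len(parts) != 2:
            continue  # skip empty or malformed lines
        _job_id, reason = parts
        reasons[reason.strip()] += 1
    return reasons


# Hypothetical sample output; real job IDs and reasons will differ.
sample = """\
1234567 Priority
1234568 Resources
1234569 Priority
"""

for reason, count in count_pending_reasons(sample).most_common():
    print(f"{reason}: {count} job(s)")
```

Reasons such as Priority or Resources usually mean the job is simply waiting its turn. Other reason codes are explained in the Slurm documentation.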
When does my job start? With squeue -u $USER --start you can display the expected start time and the resources to be allocated for pending jobs, in order of increasing start time. Please refer to the Slurm documentation for more information about the --start option of squeue.
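If you want to call this query from a script, it can be wrapped as follows. This is only a sketch: the function names `build_start_query` and `expected_start_times` are hypothetical, and the call only works on a machine where squeue is installed, for example a login node.

```python
import getpass
import shutil
import subprocess
from typing import List, Optional


def build_start_query(user: str) -> List[str]:
    """Return the argument list for `squeue -u <user> --start`."""
    return ["squeue", "-u", user, "--start"]


def expected_start_times(user: Optional[str] = None) -> str:
    """Run squeue and return its raw text output for the given user."""
    user = user or getpass.getuser()
    if shutil.which("squeue") is None:
        # Assumption: outside the cluster there is no squeue binary.
        raise RuntimeError("squeue not found; run this on a cluster login node")
    result = subprocess.run(
        build_start_query(user),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling `print(expected_start_times())` on a login node prints the same table as the command line. Note that the reported start times are estimates and can change as other jobs finish or are submitted.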
How is the Fairshare value calculated? The calculation of the Fairshare value is complex and cannot easily be replicated by users. If you are interested in how it works in principle, please refer to SLURM’s documentation on the Fair Tree Fairshare algorithm.
How many CPU cores are equal to one GPU? There is no direct equivalence between CPU cores and a GPU because their architectures and use cases differ so much. GPUs, with their thousands of smaller cores, excel at highly parallel tasks and can deliver large speedups for certain workloads (e.g. AI training or graphics rendering). However, for tasks that require complex, sequential processing, even a few CPU cores can outperform a GPU. The comparison therefore depends entirely on the specific workload.
Do different GPU partitions have different costs associated with them? Yes, different GPU partitions have different costs, represented by billing weights. For example, the m2_gpu partition has a billing weight of 6, while the a100ai partition, at the high end, has a billing weight of 17. These weights approximately reflect the underlying hardware costs.
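To get a feeling for what these weights mean in practice, the following sketch compares the same request on both partitions. It assumes a simple linear model in which the weight is applied per GPU and per reserved hour. That model, and the function name `billed_gpu_units`, are assumptions made for illustration; the actual accounting on MOGON may combine further resources such as CPU cores and memory.

```python
# Billing weights quoted in the answer above; other partitions are not listed.
GPU_BILLING_WEIGHTS = {"m2_gpu": 6, "a100ai": 17}


def billed_gpu_units(partition: str, gpus: int, hours: float) -> float:
    """Assumed linear model: weight * number of GPUs * reserved hours."""
    if gpus < 0 or hours < 0:
        raise ValueError("gpus and hours must be non-negative")
    return GPU_BILLING_WEIGHTS[partition] * gpus * hours


# Same request (2 GPUs for 10 hours) on both partitions.
print(billed_gpu_units("m2_gpu", 2, 10))  # 120
print(billed_gpu_units("a100ai", 2, 10))  # 340
```

Under this assumed model, the a100ai job costs 17/6, or roughly 2.8 times, as much share as the same request on m2_gpu.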
How is RAM usage calculated in relation to CPU cores? Memory usage is billed in the same way as CPU cores and GPUs, with different billing weights assigned to reflect the costs. For instance, memory on a standard node is significantly cheaper than memory on a high-capacity node with up to 2 TB of RAM.
Do different partitions use a different share per core hour? Yes, share usage depends heavily on the partition. Partitions differ in their hardware and software configurations, which affects their capabilities and results in different cost structures.
Is Fairshare usage reduced or stopped in case the simulation is finished earlier than specified in the script? No, you are charged for the full amount of resources you reserved for your job, even if it finishes early. It’s in your best interest to estimate your job’s duration accurately and avoid unnecessary overhead.
Is share being used in case the node crashes while using it? This depends on the cause of the crash. If the crash is due to hardware failure, SLURM is usually capable of detecting it, and you won’t be charged for the resources. However, if the crash is caused by a user error, the used resources will be charged against your share.
Do short tests on login nodes use share? No, share usage is not accounted for on login nodes. However, only short tests are allowed on these nodes, so please be considerate of other users.
Is there a discount when using entire nodes/higher share usage for single-node jobs? No, there are no discounts available. However, it’s important to note that on the parallel partitions you are always allocated an entire node. This means that even if you only use a few cores, you will still be charged for the entire node’s resources in terms of share. To maximize efficiency, make sure that you fully utilize the node you are paying for.
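The effect of whole-node allocation can be illustrated with a little arithmetic. The sketch below assumes a hypothetical node with 64 cores and counts the charge in plain core hours. Both are assumptions for illustration only: the real core count depends on the partition, and the actual share accounting uses billing weights.

```python
from typing import Tuple


def whole_node_charge(cores_used: int, cores_per_node: int, hours: float) -> Tuple[float, float]:
    """Return (charged core hours, fraction of the charged node actually used)."""
    if not 0 < cores_used <= cores_per_node:
        raise ValueError("cores_used must be between 1 and cores_per_node")
    charged = cores_per_node * hours  # the entire node is charged
    utilization = cores_used / cores_per_node
    return charged, utilization


# Hypothetical job: 4 cores in use on a 64-core node for 10 hours.
charged, utilization = whole_node_charge(4, 64, 10)
print(charged)      # 640
print(utilization)  # 0.0625
```

In this example the job is charged for 640 core hours although it keeps only 6.25% of the node busy, which is why filling the node pays off.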
Is there, besides Fairshare, another limit regarding the number of concurrently running jobs/used nodes? Yes, there are additional limits in place. For details on how SLURM enforces limits hierarchically, please refer to the official documentation on resource limits.
Is there a separate resource allocation system within a workgroup on MOGON, where less active users are given higher priority for access to resources compared to heavy users? Yes, within a workgroup there is additional accounting to ensure fair usage of resources among users. While less active users may receive higher priority, the dominating factor for job scheduling will most likely still be the workgroup’s Fairshare score. This balances resource allocation while taking both individual and group usage patterns into account.
Ideas for FAQs to be included here? Thank you!