Anaconda
Create your own work environments with Anaconda
Anaconda is a packaging system for software and eases the deployment of (statically) compiled software. It can be used to build your own development and research environment. This gives you the opportunity to download and use the software you need without needing an admin to install it on the cluster.
Over time, several build systems have been derived directly from (Ana-)Conda. Conda is the most prominent one and is particularly well established in bioinformatics. Conda is written entirely in Python and runs rather slowly. Mamba is a C++ implementation that is considerably faster, as it searches its repositories in parallel. All these solutions come with a serious caveat: they create many files.
Micromamba
To address this issue, we recommend using Micromamba on our HPC facilities. The benefit: no more package files that need manual cleanup. The drawback: whereas Mamba is a drop-in replacement for Conda (that is, instead of typing conda you can type mamba), Micromamba needs its full name micromamba whenever you use it (see below). In the following we give some hints on using Micromamba. You may still refer to the conda cheat sheet, as it is very useful for beginners; just remember that micromamba works slightly differently.
Setting up Micromamba for the first time
First of all, we need to install Micromamba. We advise installing it in your home directory. To do so, execute:
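A minimal sketch, assuming the installer script documented upstream by the Mamba project is the one meant here (the address micro.mamba.pm/install.sh comes from the upstream documentation, not from this page):

```shell
# Download the upstream installer script and run it with your current shell.
# The script asks where to put the binary and whether to initialize your shell.
# Process substitution "<( ... )" requires bash or zsh.
"${SHELL}" <(curl -L micro.mamba.pm/install.sh)
```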
During this install process you are asked several questions. We recommend confirming them all.
Separating Conda/Micromamba Environments from the Module Environments
After installing Conda or Micromamba, you will find an entry like this in your .bashrc:
When you wrap this code in a (bash) function, like this (pay attention to the first and last line):
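A minimal sketch of such a wrapper, assuming the entry was written by micromamba shell init. The two paths are assumptions (common installer defaults); keep whatever values your own .bashrc contains and only add the first and last line:

```shell
# First line added by hand: turn the installer's entry into a function.
conda_initialize() {
    # >>> mamba initialize >>>
    # Paths below are assumptions; use the ones from your own ~/.bashrc.
    export MAMBA_EXE="$HOME/.local/bin/micromamba"
    export MAMBA_ROOT_PREFIX="$HOME/micromamba"
    # Ask micromamba for its shell integration code.
    __mamba_setup="$("$MAMBA_EXE" shell hook --shell bash --root-prefix "$MAMBA_ROOT_PREFIX" 2> /dev/null)"
    if [ $? -eq 0 ]; then
        eval "$__mamba_setup"
    else
        alias micromamba="$MAMBA_EXE"  # fallback if the hook could not be generated
    fi
    unset __mamba_setup
    # <<< mamba initialize <<<
}
# Last line added by hand: the closing brace above ends the function.
```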
You need to call the function conda_initialize every time you want to use Micromamba. This avoids the (potentially slow) initialization at every login. It also lets you initialize Micromamba separately in each job script, so that one job can use Micromamba environment(s) and another can use module environments.
Note that mixing Conda/Micromamba environments with module environments is prone to errors!
Configuring Conda/Micromamba for better User Experience
Conda can be configured with a file called ~/.condarc to avoid too many questions, warnings or even error messages during installation. The configuration also speeds up the search for software packages. Micromamba honors this configuration. To configure your environment, open the configuration file with an editor, e.g.:
and insert this file content:
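The original file content is not reproduced here. As a sketch only, the settings described in the list below could look like this. The keys are standard conda configuration keys, but the channel names, the proxy address (a placeholder) and the draft file name are assumptions, so ask your site for the real values:

```shell
# Write a draft into the current directory; review it before using it.
cat > condarc.draft <<'EOF'
# install the Python setup tools into every new environment
create_default_packages:
  - setuptools
# search only these channels (assumed selection)
channels:
  - conda-forge
  - bioconda
# placeholder address, NOT a real proxy
proxy_servers:
  http: http://proxy.example.org:3128
  https: http://proxy.example.org:3128
ssl_verify: false
channel_priority: strict
always_yes: true
env_prompt: '({name}) '
EOF
```

After checking the values, mv condarc.draft ~/.condarc puts the file where Conda and Micromamba look for it.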
Using this configuration file you achieve:
- to install the Python setup tools by default (so that internal pip installs work)
- to restrict searching to the most prominent repository channels (add further channels, if desired)
- to set the proxy server (without it, conda will not find any software)
- to skip checking SSL certificates, as the local setup redirects https to http anyway
- to check the channels in the order they are listed
- not to ask for confirmation upon install requests
- and finally, the last line ensures that the display of unnamed environments (see below) does not overextend your terminal.
You may use different resource file settings; see the documentation for further hints on the configuration.
Using Micromamba
First-Time Users
If you are still working in the login shell in which you ran the curl command, you might need to source your .bashrc file. In this case, run source ~/.bashrc.
Whenever you open a new shell or log in again, this happens automatically and is no longer needed.
If you added the function above you need to run it to initialize micromamba:
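Assuming the wrapper function is named conda_initialize as described above and is defined in your .bashrc, this amounts to:

```shell
# Set up micromamba for the current shell only.
conda_initialize

# Optional check: prints the version if the initialization worked.
micromamba --version
```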
Whereas Conda would display a so-called (base) environment, Micromamba does not create such a base environment, which avoids duplicating files from base into other environments.
Using Environments
We recommend installing software in bundles, so-called environments. It is best to have one environment per workflow.
In order to set up an environment you may choose between
- installing a (named) environment in your HOME directory. This might work for single users, but carries the risk of hitting the file quota limit in your HOME
- or installing it in your project directory. This has the drawback of being an unnamed environment, but may serve entire groups.
To create a named environment in your HOME run:
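A sketch with a hypothetical environment name (my_env) and an example package pin; replace both with what your workflow needs:

```shell
# Create a named environment below the micromamba root prefix in your HOME.
# "my_env" and "python=3.11" are placeholders for illustration.
micromamba create --name my_env python=3.11
```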
If you choose to create an environment in your project file space run:
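A sketch, assuming a placeholder project path (the real location of your project file space is site-specific and not given here):

```shell
# Placeholder path: replace with your actual project directory.
project_dir="/path/to/your/project"

# Create an unnamed environment at an explicit location (prefix).
micromamba create --prefix "$project_dir/envs/my_env" python=3.11
```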
Other useful flags for creating environments are:
- --pyc to automatically create Python byte code upon installation (avoiding this step at later times and potentially doubling the file count)
- -f to indicate a yaml or txt file specifying a software list.
After creating one or several environments micromamba is able to list them:
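Assuming Micromamba has been initialized in the current shell, the listing can be requested like this:

```shell
# Show all environments micromamba knows about, named and unnamed.
micromamba env list
```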
Activating Environments
In order to activate an environment, that is to make the installed software available to you, run either:
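A sketch of both variants, using the hypothetical name my_env and a placeholder project path:

```shell
# Variant 1: activate a named environment from your HOME.
micromamba activate my_env

# Variant 2: activate an unnamed environment by its path (placeholder path).
micromamba activate /path/to/your/project/envs/my_env
```

Use one of the two commands, depending on where the environment was created.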
You will see the name of your environment (or, for an unnamed environment, the base name of its path) in your prompt.
Likewise, to leave your environment run:
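With an environment active in the current shell, this is a single command:

```shell
# Leave the currently active environment.
micromamba deactivate
```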
Adding Software to Environments (Installing additional Packages)
Make sure to activate an environment before installing packages.
You can search for a particular software name with:
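A sketch using samtools purely as an example package name; the channels searched are the ones from your configuration:

```shell
# Search the configured channels for a package ("samtools" is only an example).
micromamba search samtools
```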
To install a single software package you may run:
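Again with samtools as a stand-in for the package you actually need, and assuming an environment is already active:

```shell
# Install one package into the currently active environment.
micromamba install samtools
```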
To install the packages of a software workflow, developers usually provide text or yaml files. With a given file, e.g.
you can install this bundle with the -f flag upon creating your environment or like this:
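A sketch with a hypothetical file environment.yaml; the file name, the channels and the package list are assumptions for illustration, not the content a real workflow would ship:

```shell
# Example specification file (hypothetical content).
cat > environment.yaml <<'EOF'
channels:
  - conda-forge
  - bioconda
dependencies:
  - python=3.11
  - samtools
EOF

# Install everything listed in the file into the active environment.
micromamba install --file environment.yaml

# Alternatively, pass the file when creating a new environment:
# micromamba create --name my_env --file environment.yaml
```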