add nextflow d30e48d

2026-04-29 23:01:54 +02:00
parent d0b12d668d
commit 97cc9058d3
2840 changed files with 730250 additions and 0 deletions

nextflow/docs/conda.md Normal file

@@ -0,0 +1,221 @@
(conda-page)=
# Conda environments
[Conda](https://conda.io/) is an open-source package and environment management system that simplifies the installation and configuration of complex software packages in a platform-agnostic manner.
Nextflow has built-in support for Conda that allows the configuration of workflow dependencies using Conda recipes and environment files.
This allows Nextflow applications to use popular tool collections such as [Bioconda](https://bioconda.github.io) and the [Python Package Index](https://pypi.org/), while taking advantage of the configuration flexibility provided by Nextflow.
## Prerequisites
This feature requires the Conda or [Miniconda](https://conda.io/miniconda.html) package manager to be installed on your system.
## How it works
Nextflow automatically creates and activates the Conda environment(s) given the dependencies specified by each process.
Dependencies are specified using the {ref}`process-conda` directive, by providing either the names of the required Conda packages, the path of a Conda environment YAML file, or the path of an existing Conda environment directory.
:::{note}
Conda environments are stored on the file system. By default, Nextflow instructs Conda to save the required environments in the pipeline work directory. The same environment may be created/saved multiple times across multiple executions when using different work directories.
:::
You can specify the directory where the Conda environments are stored using the `conda.cacheDir` configuration property. When using a computing cluster, make sure to use a shared file system path accessible from all compute nodes. See the {ref}`configuration page <config-conda>` for details about Conda configuration.
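As a minimal sketch, the cache location could be set in `nextflow.config` as shown here. The path `/shared/conda-envs` is a hypothetical placeholder; substitute a directory that every compute node can access.

```groovy
// Hypothetical shared path; replace with a directory visible to all compute nodes
conda.cacheDir = '/shared/conda-envs'
```

With a fixed cache directory, environments should be reused across runs instead of being recreated for each new work directory.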
:::{warning}
The Conda environment feature is not supported by executors that use remote object storage as a work directory, such as AWS Batch.
:::
### Enabling Conda environments
:::{versionadded} 22.08.0-edge
:::
The use of Conda recipes specified using the {ref}`process-conda` directive needs to be enabled explicitly in the pipeline configuration file (i.e. `nextflow.config`):
```groovy
conda.enabled = true
```
Alternatively, it can be specified by setting the variable `NXF_CONDA_ENABLED=true` in your environment or by using the `-with-conda` command line option.
### Use Conda package names
Conda package names can be specified using the `conda` directive. Multiple package names can be specified by separating them with a blank space. For example:
```nextflow
process hello {
    conda 'bwa samtools multiqc'

    script:
    """
    your_command --here
    """
}
```
Using the above definition, a Conda environment that includes BWA, Samtools and MultiQC tools is created and activated when the process is executed.
The usual Conda package syntax and naming conventions can be used. The version of a package can be specified after the package name, for example `bwa=0.7.15`.
The channel where a package is located can be specified by prefixing the package with the channel name, for example `bioconda::bwa=0.7.15`.
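The version and channel syntax can be combined within a single directive value. The sketch below sets it in the configuration file for every process; the packages and versions are taken from other examples on this page and are illustrative only.

```groovy
// Pin a version and name the source channel for each package (illustrative values)
process.conda = 'bioconda::bwa=0.7.15 bioconda::multiqc=1.4'
```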
:::{versionchanged} 26.04.0
By default, Nextflow uses the `conda-forge` and `bioconda` channels to resolve Conda packages. You can override this using the `conda.channels` configuration option.
:::
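A sketch of overriding the channel list in `nextflow.config`, assuming `conda.channels` accepts a list of channel names (see the configuration page for the accepted forms). The channels and their order here are illustrative, not a recommendation.

```groovy
// Illustrative channel list; adjust to the channels your packages need
conda.channels = ['conda-forge', 'bioconda', 'defaults']
```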
(conda-env-files)=
### Use Conda environment files
Conda environments can also be defined using one or more Conda environment files. An environment file lists the required packages and channels in YAML format. For example:
```yaml
name: my-env
channels:
  - conda-forge
  - bioconda
dependencies:
  - star=2.5.4a
  - bwa=0.7.15
```
Read the Conda documentation for more details about how to create [environment files](https://conda.io/docs/user-guide/tasks/manage-environments.html#creating-an-environment-file-manually).
The path of an environment file can be specified using the `conda` directive:
```nextflow
process hello {
    conda '/some/path/my-env.yaml'

    script:
    """
    your_command --here
    """
}
```
:::{warning}
The environment file name **must** have a `.yml` or `.yaml` extension or else it won't be properly recognized.
:::
(conda-pypi)=
### Python Packages from PyPI
Conda environment files can also be used to install Python packages from the [PyPI repository](https://pypi.org/), through the `pip` package manager (which must also be explicitly listed as a required package):
```yaml
name: my-env-2
channels:
  - conda-forge
  - bioconda
dependencies:
  - pip
  - pip:
    - numpy
    - pandas
    - matplotlib
```
### Conda text files
It is possible to provide dependencies by listing each package name on a separate line in a plain text file. For example:
```
bioconda::star=2.5.4a
bioconda::bwa=0.7.15
bioconda::multiqc=1.4
```
:::{note}
Dependency files must be text files with the `.txt` extension.
:::
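A text file is passed to the `conda` directive by its path, like an environment file. The sketch below does this from the configuration file using a process selector; the process name `hello` follows the other examples on this page, and the file path is a hypothetical placeholder.

```groovy
process {
    // Apply the setting only to the process named 'hello'
    withName: 'hello' {
        // Hypothetical path; the file name must end in .txt
        conda = '/some/path/dependencies.txt'
    }
}
```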
### Conda lock files
The final method for providing packages to Conda is by using [Conda lock files](https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#identical-conda-envs).
To generate a lock file from an existing Conda environment, run the following command:
```bash
conda list --explicit > spec-file.txt
```
If you're using Mamba or Micromamba, use this command instead:
```bash
micromamba env export --explicit > spec-file.txt
```
You can also download Conda lock files from [Wave](https://seqera.io/wave/) container build pages.
These files list every package and its dependencies, so Conda doesn't need to perform dependency resolution. This makes environment setup faster and more reproducible.
Each file includes package URLs and, optionally, an MD5 hash for verifying file integrity:
```
# micromamba env export --explicit
# This file may be used to create an environment using:
# $ conda create --name <env> --file <this file>
# platform: linux-64
@EXPLICIT
https://conda.anaconda.org/conda-forge/linux-64/_libgcc_mutex-0.1-conda_forge.tar.bz2#d7c89558ba9fa0495403155b64376d81
https://conda.anaconda.org/conda-forge/linux-64/libgomp-13.2.0-h77fa898_7.conda#abf3fec87c2563697defa759dec3d639
https://conda.anaconda.org/conda-forge/linux-64/_openmp_mutex-4.5-2_gnu.tar.bz2#73aaf86a425cc6e73fcf236a5a46396d
https://conda.anaconda.org/conda-forge/linux-64/libgcc-ng-13.2.0-h77fa898_7.conda#72ec1b1b04c4d15d4204ece1ecea5978
# .. and so on
```
To use a Conda lock file with Nextflow, set the `conda` directive to the path of the lock file.
:::{note}
Conda lock files must be text files with the `.txt` extension.
:::
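For instance, a `spec-file.txt` generated with one of the commands above could be applied to all processes from the configuration file. This is a sketch only; the path is a hypothetical placeholder.

```groovy
// Hypothetical location of a lock file produced by `conda list --explicit`
process.conda = '/some/path/spec-file.txt'
```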
### Use existing Conda environments
If you already have a local Conda environment, you can use it in your workflow by setting the `conda` directive to the environment's installation directory:
```nextflow
process hello {
    conda '/path/to/an/existing/env/directory'

    script:
    """
    your_command --here
    """
}
```
### Use Mamba to resolve packages
:::{warning} *Experimental: may change in a future release.*
:::
It is also possible to use [Mamba](https://github.com/mamba-org/mamba) to speed up the creation of Conda environments. For more information on how to enable this feature, refer to {ref}`Conda <config-conda>`.
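As a sketch, assuming the `conda.useMamba` option described on the configuration page is available in your Nextflow version, Mamba could be enabled alongside Conda support like this:

```groovy
// Conda support must be enabled for the `conda` directive to take effect
conda.enabled = true
// Assumption: this option makes Nextflow create environments with the `mamba` command
conda.useMamba = true
```

Because the feature is marked experimental, check the configuration page for the current option name before relying on it.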
## Best practices
When a `conda` directive is used in any `process` definition within the workflow script, the Conda tool is required to run the workflow.
Specifying the Conda environments in a separate configuration {ref}`profile <config-profiles>` is therefore recommended: it lets users select the execution environment with a command line option and makes the workflow more portable. For example:
```groovy
profiles {
    conda {
        process.conda = 'samtools'
    }
    docker {
        process.container = 'biocontainers/samtools'
        docker.enabled = true
    }
}
```
The above configuration snippet allows the workflow to be run with either Conda or Docker by specifying `-profile conda` or `-profile docker` on the command line.
## Advanced settings
Conda advanced configuration settings are described in the {ref}`Conda <config-conda>` section on the Nextflow configuration page.