Feature Specification: Nextflow Module System Client
Feature Branch: 251117-module-system
Created: 2026-01-15
Status: Draft
Input: User description: "Implement Nextflow module system client based on ADR 20251114-module-system.md. Focus on client-side implementation only - CLI commands, DSL parser extensions, dependency resolution, and local storage. Registry backend is assumed to be already implemented."
Overview
This specification covers the Nextflow client-side implementation of the module system, enabling pipeline developers to:
- Include remote modules from the Nextflow registry using @scope/name syntax
- Manage module versions through nextflow.config
- Use CLI commands to install, search, list, remove, publish, and run modules
- Configure module parameters through structured meta.yaml definitions
Out of Scope: Registry backend implementation (assumed already available at registry.nextflow.io)
User Scenarios & Testing
User Story 1 - Install and Use Registry Module (Priority: P1)
A pipeline developer wants to use a pre-built module from the Nextflow registry in their workflow without manually downloading or managing module files.
Why this priority: This is the core value proposition - enabling code reuse from the ecosystem. Without this, the module system provides no benefit.
Independent Test: Can be fully tested by running nextflow module install nf-core/fastqc and then executing a workflow that includes the module. Delivers immediate value by enabling module consumption.
Acceptance Scenarios:
- Given a new Nextflow project with no modules installed, When user runs nextflow module install nf-core/fastqc, Then the module is downloaded to modules/@nf-core/fastqc/, a .checksum file is created, and nextflow_spec.json is updated with the version
- Given a workflow file with include { FASTQC } from '@nf-core/fastqc', When user runs nextflow run main.nf, Then Nextflow resolves the module from local storage and executes the process
- Given a module version declared in nextflow.config, When user includes the module, Then the declared version is used (not latest)
User Story 2 - Run Module Directly (Priority: P1)
A user wants to run a module directly from the command line without writing a wrapper workflow.
Why this priority: Enables immediate productivity - users can test and execute modules without boilerplate code, essential for AI agents and quick experimentation.
Independent Test: Can be tested by running nextflow module run nf-core/fastqc --input 'data/*.fq' and verifying the process executes.
Acceptance Scenarios:
- Given a module is available (locally or in registry), When user runs nextflow module run nf-core/fastqc --input 'data/*.fastq', Then the module is executed with the provided inputs mapped to process parameters
- Given a module with parameters defined in meta.yaml, When user runs nextflow module run nf-core/bwa-align --batch_size 100000, Then the parameter is validated and passed to the process
- Given a module is not installed locally, When user runs nextflow module run nf-core/salmon, Then the module is automatically downloaded before execution
User Story 3 - Module Parameters (Priority: P1)
A module author wants to define typed, documented parameters that provide a clear interface for module customization.
Why this priority: Critical for module usability - provides type-safe, documented parameters that enable IDE autocompletion and validation, replacing the opaque ext.args pattern.
Independent Test: Can be tested by configuring params.batch_size = 100000 in config and verifying the parameter is applied in the script.
Acceptance Scenarios:
- Given a module with params defined in meta.yaml, When user configures params.batch_size = 100000 in config, Then the parameter is accessible in scripts via params.batch_size
- Given a parameter with type validation, When user provides an invalid value type, Then a validation error is displayed
- Given a module with documented parameters, When user runs nextflow module run --help, Then available parameters with descriptions are listed
User Story 4 - Module Version Management (Priority: P2)
A pipeline developer wants to pin and manage module versions to ensure reproducible workflow executions.
Why this priority: Reproducibility is important for scientific workflows - version pinning ensures consistent results.
Independent Test: Can be tested by modifying nextflow.config module versions and verifying the correct version is used on workflow run.
Acceptance Scenarios:
- Given a module is installed at version 1.0.0, When user changes nextflow_spec.json to specify version 1.1.0 and runs the workflow, Then version 1.1.0 is automatically downloaded and replaces the local copy
- Given modules installed locally, When user runs nextflow module list, Then configured version, installed version, latest available version, and status are displayed for each module
User Story 5 - Module Integrity Protection (Priority: P2)
A pipeline developer who has locally modified a module (for debugging or customization) wants to be protected from accidentally losing those changes.
Why this priority: Protects user work - important for developer experience but not blocking core functionality.
Independent Test: Can be tested by modifying a module's main.nf locally, then attempting to install a different version and verifying the warning appears.
Acceptance Scenarios:
- Given a locally modified module (checksum mismatch with .checksum), When user tries to install a different version, Then Nextflow warns about local modifications and does NOT override
- Given a locally modified module, When user runs nextflow module install -force, Then the local module is replaced with the registry version
- Given a locally modified module, When user runs the workflow, Then a warning is displayed about checksum mismatch but execution continues
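The protection rule in these scenarios can be sketched as follows. This is a minimal Python illustration, not the Nextflow implementation: directory_checksum, is_locally_modified, and install_action are hypothetical names, and the hashing scheme (SHA-256 over sorted relative paths and file contents) is an assumption, since the registry defines how the real checksum is computed.

```python
import hashlib
from pathlib import Path


def directory_checksum(module_dir: Path) -> str:
    """Hash every file in the module directory except .checksum itself.

    Assumption: this scheme only illustrates how local edits could be
    detected; the actual checksum format comes from the registry.
    """
    digest = hashlib.sha256()
    for path in sorted(p for p in module_dir.rglob("*") if p.is_file()):
        if path.name == ".checksum":
            continue
        digest.update(path.relative_to(module_dir).as_posix().encode())
        digest.update(path.read_bytes())
    return digest.hexdigest()


def is_locally_modified(module_dir: Path) -> bool:
    """True when the current contents no longer match the recorded checksum."""
    recorded = (module_dir / ".checksum").read_text().strip()
    return directory_checksum(module_dir) != recorded


def install_action(modified: bool, force: bool) -> str:
    """Return 'warn-and-keep' for unforced installs over local edits, else 'replace'."""
    if modified and not force:
        return "warn-and-keep"
    return "replace"
```

With this shape, a workflow run would call only is_locally_modified and print a warning, while an install would additionally consult install_action before touching the directory.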
User Story 6 - Remove Module (Priority: P3)
A pipeline developer wants to remove a module they no longer need.
Why this priority: Housekeeping feature - useful but not blocking core workflows.
Independent Test: Can be tested by running nextflow module remove nf-core/fastqc and verifying files are deleted and config is updated.
Acceptance Scenarios:
- Given a module is installed, When user runs nextflow module remove nf-core/fastqc, Then the module directory is deleted and the entry is removed from nextflow_spec.json
- Given a module is referenced in workflow files, When user runs nextflow module remove, Then a warning is displayed about the reference but removal proceeds
User Story 7 - Search and Discover Modules (Priority: P3)
A pipeline developer wants to find available modules in the registry that match their analysis needs.
Why this priority: Discovery feature - useful but users can find modules through documentation or registry web UI.
Independent Test: Can be tested by running nextflow module search bwa and verifying results are displayed with name, version, and description.
Acceptance Scenarios:
- Given modules exist in the registry, When user runs nextflow module search alignment, Then matching modules are displayed with name, latest version, description, and download count
- Given user wants JSON output for scripting, When user runs nextflow module search fastqc -json, Then results are returned in parseable JSON format
- Given many results exist, When user runs nextflow module search quality -limit 5, Then only 5 results are returned
User Story 8 - Publish Module to Registry (Priority: P3)
A module author wants to publish their module to the Nextflow registry for others to use.
Why this priority: Ecosystem contribution feature - important for growth but users can consume modules without publishing capability.
Independent Test: Can be tested by creating a valid module structure and running nextflow module publish -dry-run to validate.
Acceptance Scenarios:
- Given a valid module with main.nf, meta.yaml, and README.md, When user runs nextflow module publish myorg/my-module, Then the module is uploaded to the registry and becomes available for installation
- Given an invalid module (missing required fields), When user runs nextflow module publish, Then validation errors are displayed listing the missing requirements
- Given no authentication configured, When user runs nextflow module publish, Then a clear error message indicates authentication is required
Edge Cases
- What happens when the registry is unreachable during module resolution?
- Nextflow uses locally cached modules if available, otherwise fails with a clear network error
- How does the system handle circular module dependencies?
- Dependency resolver detects cycles and fails with an error listing the cycle
- What happens when two modules require incompatible versions of the same dependency?
- System automatically selects the highest compatible version; if no compatible version exists, fails with error listing conflicting requirements
- How are modules resolved when multiple registries are configured?
- Registries are tried in order; first match wins
- What happens when meta.yaml is missing from a module?
- Module is treated as having no dependencies; basic functionality works
- What happens when local module directory is corrupted or incomplete?
- Checksum mismatch triggers warning; -force allows re-download
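As an illustration of the circular-dependency case above, the following Python sketch finds one cycle in a dependency graph with a depth-first search. The function name find_cycle and the dictionary representation of dependencies are assumptions made for this example; the specification does not prescribe an algorithm.

```python
from typing import Dict, List, Optional


def find_cycle(deps: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one dependency cycle as a list of module names, or None.

    `deps` maps a module reference to the modules it depends on.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    state: Dict[str, int] = {}
    stack: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        state[node] = GREY
        stack.append(node)
        for child in deps.get(node, []):
            if state.get(child, WHITE) == GREY:
                # child is already on the current path: the cycle is closed
                return stack[stack.index(child):] + [child]
            if state.get(child, WHITE) == WHITE:
                cycle = visit(child)
                if cycle:
                    return cycle
        stack.pop()
        state[node] = BLACK
        return None

    for node in deps:
        if state.get(node, WHITE) == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None
```

The returned list repeats the first module at the end, so it can be joined directly into an error message that lists the cycle.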
Requirements
Functional Requirements
DSL Parser Extension
- FR-001: System MUST recognize @scope/name syntax in include statements as registry module references
- FR-002: System MUST distinguish between local file paths (starting with . or /) and registry modules (starting with @)
- FR-003: System MUST resolve module versions from nextflow_spec.json before downloading
- FR-004: System MUST parse and validate meta.yaml files for module metadata and dependencies
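The distinction drawn in FR-001 and FR-002 can be sketched in a few lines of Python. The function name classify_include and the character set allowed in scope and name are assumptions; the specification only fixes the leading characters (., /, or @).

```python
import re

# Assumption: scopes and names use lowercase letters, digits, '-' and '_'.
REGISTRY_REF = re.compile(
    r"^@(?P<scope>[a-z0-9][a-z0-9_-]*)/(?P<name>[a-z0-9][a-z0-9_-]*)$"
)


def classify_include(source: str) -> tuple:
    """Classify the source string of an include statement.

    Returns ('local', path), ('registry', scope, name), or ('other', source)
    for anything this sketch does not recognize.
    """
    if source.startswith((".", "/")):
        return ("local", source)
    match = REGISTRY_REF.match(source)
    if match:
        return ("registry", match["scope"], match["name"])
    return ("other", source)
```

For example, classify_include("@nf-core/fastqc") yields a registry reference with scope nf-core and name fastqc, while "./modules/local/foo.nf" stays a local path.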
Module Resolution
- FR-005: System MUST resolve modules at workflow parse time (after plugin resolution)
- FR-006: System MUST check local modules/@scope/name/ directory before querying registry
- FR-007: System MUST verify module integrity using .checksum file on every run
- FR-008: System MUST download modules from registry when not present locally or when version differs
- FR-009: System MUST NOT override locally modified modules (checksum mismatch) unless -force is used
- FR-010: System MUST resolve version conflicts by selecting the highest compatible version; if no compatible version exists, MUST fail with error listing conflicting requirements
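A possible reading of FR-010 is sketched below in Python. It assumes, purely for illustration, that each requirement is an inclusive (minimum, maximum) range over dotted numeric versions; the real constraint syntax follows Nextflow plugin version constraints and is not reproduced here. The names select_version and VersionConflict are hypothetical.

```python
from typing import Dict, List, Tuple


class VersionConflict(Exception):
    """Raised when no available version satisfies every requirement."""


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '1.10.0' into (1, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def select_version(
    available: List[str],
    requirements: Dict[str, Tuple[str, str]],
) -> str:
    """Pick the highest version accepted by all requesters.

    `requirements` maps the requesting module to an inclusive (min, max)
    range. This range form is an assumption made for the sketch.
    """
    candidates = [
        version
        for version in available
        if all(
            parse_version(low) <= parse_version(version) <= parse_version(high)
            for low, high in requirements.values()
        )
    ]
    if not candidates:
        listing = ", ".join(
            f"{who} requires {low}..{high}"
            for who, (low, high) in sorted(requirements.items())
        )
        raise VersionConflict(f"no compatible version: {listing}")
    return max(candidates, key=parse_version)
```

Comparing parsed tuples rather than strings matters here: as strings, "1.9.0" would sort above "1.10.0".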
Local Storage
- FR-011: System MUST store modules in modules/@scope/name/ directory structure (single version per module)
- FR-012: System MUST create .checksum file from registry's X-Checksum header on download
- FR-013: System MUST store module's main.nf, meta.yaml, and supporting files in the module directory
CLI Commands
- FR-014: System MUST provide nextflow module install [scope/name] command to download modules
- FR-015: System MUST provide nextflow module search <query> command to search the registry
- FR-016: System MUST provide nextflow module list command to show installed vs configured modules
- FR-017: System MUST provide nextflow module remove scope/name command to delete modules
- FR-018: System MUST provide nextflow module publish scope/name command to upload modules to registry
- FR-019: System MUST provide nextflow module run scope/name command to execute modules directly
- FR-019b: System MUST provide nextflow module info scope/name command to display module metadata and a usage template
Configuration
- FR-020: System MUST persist module versions in nextflow_spec.json; MUST also read versions from modules {} block in nextflow.config as an alternative
- FR-021: System MUST support registry {} block with url and apiKey fields for configuring registry URL and authentication
- FR-022: System MUST support NXF_REGISTRY_TOKEN environment variable as fallback for registry.apiKey
- FR-023: System MUST support multiple registry URLs with fallback ordering
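The fallback rules in FR-022 and FR-023 might look like the Python sketch below. It assumes the registry {} block has already been parsed into a dictionary, that url may be either a single string or an ordered list, and that the default registry is reached over HTTPS; none of these details are fixed by the specification. All function names are hypothetical, and lookup stands in for whatever call queries a registry.

```python
import os
from typing import Callable, List, Mapping, Optional, Tuple

# Assumption: the default registry host from this spec, reached over HTTPS.
DEFAULT_REGISTRY = "https://registry.nextflow.io"


def resolve_api_key(
    registry_config: Mapping[str, object],
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Prefer registry.apiKey; fall back to NXF_REGISTRY_TOKEN."""
    env = os.environ if env is None else env
    return registry_config.get("apiKey") or env.get("NXF_REGISTRY_TOKEN")


def registry_urls(registry_config: Mapping[str, object]) -> List[str]:
    """Return registry URLs in the order they should be tried."""
    url = registry_config.get("url", DEFAULT_REGISTRY)
    return [url] if isinstance(url, str) else list(url)


def first_match(
    urls: List[str],
    lookup: Callable[[str], Optional[object]],
) -> Optional[Tuple[str, object]]:
    """Try each registry in order; the first one that answers wins."""
    for url in urls:
        result = lookup(url)
        if result is not None:
            return url, result
    return None
```

Passing env explicitly keeps the token lookup easy to test without touching the process environment.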
Module Parameters
- FR-024: System MUST parse module parameters from params section in meta.yaml
- FR-025: System MUST validate module parameters against meta.yaml schema (type) at workflow parse time
- FR-026: System MUST support boolean, integer, float, string, file, and path parameter types
- FR-027: System MUST make module parameters accessible via standard params variable in scripts
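Type validation as described in FR-025 and FR-026 could be sketched as below. The example assumes the params section of meta.yaml has already been parsed into a dictionary of the form name -> {type, description}; that layout, the function name validate_params, and the choice to treat file and path values as strings are all assumptions for illustration.

```python
from typing import Any, Callable, Dict, List

# One predicate per parameter type listed in FR-026.
# bool is excluded from the numeric checks because it subclasses int in Python.
TYPE_CHECKS: Dict[str, Callable[[Any], bool]] = {
    "boolean": lambda v: isinstance(v, bool),
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "float": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "string": lambda v: isinstance(v, str),
    "file": lambda v: isinstance(v, str),
    "path": lambda v: isinstance(v, str),
}


def validate_params(
    schema: Dict[str, Dict[str, Any]],
    values: Dict[str, Any],
) -> List[str]:
    """Return a list of validation error messages (empty when all pass)."""
    errors = []
    for name, value in values.items():
        spec = schema.get(name)
        if spec is None or spec.get("type") is None:
            continue  # undeclared or untyped parameters are not checked here
        expected = spec["type"]
        check = TYPE_CHECKS.get(expected)
        if check is None:
            errors.append(f"{name}: unsupported type '{expected}'")
        elif not check(value):
            errors.append(
                f"{name}: expected {expected}, got {type(value).__name__}"
            )
    return errors
```

Collecting all errors before reporting, rather than stopping at the first, lets a user fix every invalid parameter in one pass at parse time.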
Registry Communication
- FR-028: System MUST communicate with registry via documented Module API endpoints
- FR-029: System MUST handle authentication using Bearer token in Authorization header
- FR-030: System MUST verify SHA-256 checksum on module download
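FR-029 and FR-030 can be illustrated with the Python standard library. This sketch assumes the expected checksum arrives as a plain hexadecimal SHA-256 digest; the exact format of the registry's X-Checksum header is not stated here, and the function names are hypothetical. No request is actually sent.

```python
import hashlib
import urllib.request
from typing import Optional


def build_request(url: str, token: Optional[str] = None) -> urllib.request.Request:
    """Create a request, adding a Bearer token when one is configured."""
    request = urllib.request.Request(url)
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    return request


def verify_download(payload: bytes, expected_checksum: str) -> None:
    """Raise ValueError if the SHA-256 of the payload does not match.

    Assumption: `expected_checksum` is a hex digest with no algorithm prefix.
    """
    actual = hashlib.sha256(payload).hexdigest()
    if actual != expected_checksum.strip().lower():
        raise ValueError(
            f"checksum mismatch: expected {expected_checksum}, got {actual}"
        )
```

Verifying the payload before it is unpacked into the modules/ directory means a corrupted download never replaces a working local copy.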
Key Entities
- Module: A reusable Nextflow process definition with main.nf entry point, optional meta.yaml manifest, and README documentation
- Module Reference: A scoped identifier (@scope/name) pointing to a registry module
- Module Manifest (meta.yaml): YAML file containing module metadata, version, dependencies, and parameter definitions
- Module Parameter: A configurable parameter defined in meta.yaml with name, optional type, description, and example
- Checksum File (.checksum): Local cache of registry checksum for integrity verification
- Registry Configuration: Settings for registry URL, authentication, and fallback ordering
Success Criteria
Measurable Outcomes
- SC-001: Pipeline developers can install and use a registry module within 5 minutes of starting a new project
- SC-002: Module resolution adds less than 2 seconds to workflow startup time when modules are cached locally
- SC-003: Users can successfully search, install, and run any module from the registry without reading documentation
- SC-004: 100% of module version changes in nextflow.config result in automatic module updates without manual intervention
- SC-005: Users receive clear, actionable error messages for all failure scenarios (network, validation, authentication)
- SC-006: Module authors can publish a new module version within 3 minutes using the CLI
- SC-007: Locally modified modules are never accidentally overwritten during normal operations
Assumptions
- Registry backend is fully implemented and available at registry.nextflow.io with the Module API as documented in the ADR
- Existing plugin authentication system can be reused for module registry authentication
- Module bundle size limit of 1MB (uncompressed) is enforced by the registry
- Network connectivity is available for initial module downloads; offline operation uses local cache only
- The modules/ directory is intended to be committed to the pipeline's git repository
- Version constraints in meta.yaml follow the same syntax as existing Nextflow plugin version constraints
- SHA-256 is used for all checksum operations
- Module parameters use standard --<param_name> CLI syntax
Dependencies
- Registry backend API (Module API endpoints as specified in ADR)
- Existing Nextflow plugin system (for authentication reuse)
- Existing DSL parser infrastructure (for include statement extension)
- Existing config parser (for modules {} and registry {} blocks)
Clarifications
Session 2026-01-19
- Q: What should happen when incompatible dependency versions are detected? → A: Use highest compatible version automatically; if none exists, fail with an error listing the conflicting requirements
- Q: When should module parameter validation occur? → A: At workflow parse time (early, before any execution)