Research: Nextflow Module System Client
Date: 2026-01-19 Feature: 251117-module-system
Overview
This document captures technical research and decisions for implementing the Nextflow module system client. All NEEDS CLARIFICATION items from Technical Context have been resolved through codebase exploration.
1. CLI Command Structure
Research Question: How should nextflow module CLI commands be implemented?
Decision: JCommander native subcommands — each subcommand extends CmdBase directly; no trait needed
Rationale:
- JCommander's subcommand support handles parameter parsing automatically per subcommand
- Each subcommand (install, run, list, remove, search, info, publish) is a separate class extending CmdBase
- ModuleRun extends CmdRun to reuse pipeline execution logic (PR #6381)
- No custom ModuleSubCmd trait needed; cleaner architecture
- CmdModule is registered in Launcher alongside all other top-level commands
Implemented Pattern:
@Parameters(commandDescription = "Manage Nextflow modules")
class CmdModule extends CmdBase implements UsageAware {
    static final List<CmdBase> commands = []
    static {
        commands << new ModuleInstall()   // extends CmdBase
        commands << new ModuleRun()       // extends CmdRun
        commands << new ModuleList()      // extends CmdBase
        commands << new ModuleRemove()    // extends CmdBase
        commands << new ModuleSearch()    // extends CmdBase
        commands << new ModuleInfo()      // extends CmdBase
        commands << new ModulePublish()   // extends CmdBase
    }
    void run() {
        final jc = commander()   // JCommander with all subcommands registered
        jc.parse(args as String[])
        final subcommand = jc.getCommands().get(jc.getParsedCommand()).getObjects()[0]
        subcommand.run()
    }
}
Alternatives Considered:
- CmdFs trait pattern: Considered initially; replaced by JCommander native subcommands — simpler and avoids custom parsing
- Separate top-level Cmd classes (CmdModuleInstall, etc.): Rejected — too many entry points
- Plugin-based CLI extension: Rejected — module system is core functionality, not optional
2. DSL Parser Extension for @scope/name
Research Question: How to extend include statement parsing for registry modules?
Decision: Extend ResolveIncludeVisitor to detect @ prefix and delegate to a RemoteModuleResolver SPI loaded via Java ServiceLoader
Rationale:
- Keeps nf-lang decoupled from runtime module resolution (nf-lang has no dependency on nextflow module)
- SPI pattern allows plugins or custom implementations to override the default resolver
- Detection: source.startsWith('@') distinguishes registry vs local paths, preserving existing include behavior
- Resolution at parse time (after plugin resolution) per ADR
Implemented Architecture:
include { X } from '@scope/name'
↓
ResolveIncludeVisitor.visitInclude() [nf-lang]
source.startsWith("@") → RemoteModuleResolverProvider.getInstance().resolve(source, baseDir)
↓
RemoteModuleResolverProvider [nf-lang]
Java ServiceLoader discovers implementations; picks highest priority
↓
DefaultRemoteModuleResolver [nextflow module]
Calls ModuleResolver.installModule(reference, version, autoInstall=true)
Returns Path to modules/@scope/name/main.nf
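The flow above can be sketched in plain Java. This is a minimal illustration, not the actual nf-lang code: the method names on RemoteModuleResolver (getPriority, resolve) and the helpers pick and resolveInclude are assumptions made for the example. In a real provider the candidates would presumably come from java.util.ServiceLoader, which is itself an Iterable and could be passed to pick directly.

```java
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.StreamSupport;

class IncludeResolutionSketch {

    // Hypothetical shape of the SPI interface; the real method names may differ.
    interface RemoteModuleResolver {
        int getPriority();
        Path resolve(String source, Path baseDir);
    }

    // Picks the highest-priority resolver among the discovered ones,
    // or the fallback when none was discovered.
    static RemoteModuleResolver pick(Iterable<RemoteModuleResolver> discovered,
                                     RemoteModuleResolver fallback) {
        return StreamSupport.stream(discovered.spliterator(), false)
                .max(Comparator.comparingInt(RemoteModuleResolver::getPriority))
                .orElse(fallback);
    }

    // Sources starting with '@' go to the resolver; anything else is
    // treated as a local path relative to baseDir.
    static Path resolveInclude(String source, Path baseDir, RemoteModuleResolver resolver) {
        if (source.startsWith("@")) {
            return resolver.resolve(source, baseDir);
        }
        return baseDir.resolve(source).normalize();
    }
}
```

Keeping the '@' check separate from resolver selection mirrors the split described above: the visitor only decides whether a source is a registry reference, and everything about installation stays behind the SPI.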
Key Files:
- modules/nf-lang/src/main/java/nextflow/module/spi/RemoteModuleResolver.java: SPI interface
- modules/nf-lang/src/main/java/nextflow/module/spi/RemoteModuleResolverProvider.java: ServiceLoader singleton
- modules/nf-lang/src/main/java/nextflow/module/spi/FallbackRemoteModuleResolver.java: error fallback
- modules/nf-lang/src/main/java/nextflow/script/control/ResolveIncludeVisitor.java: MODIFIED
- modules/nextflow/src/main/groovy/nextflow/module/DefaultRemoteModuleResolver.groovy: default impl
Alternatives Considered:
- New ANTLR grammar token for @: Rejected; unnecessary parser complexity
- Direct dependency from nf-lang to nextflow module: Rejected; circular dependency risk, SPI decouples cleanly
- Dot file marker for local modules: Deferred in ADR; current impl uses @ for registry, . or / for local
3. Config Parsing for modules{} and registry{} Blocks
Research Question: How to add new config DSL blocks?
Decision: Create ModulesConfig and RegistryConfig classes implementing ConfigScope interface
Rationale:
- ConfigScope is an ExtensionPoint (pf4j) that ConfigBuilder automatically discovers
- Classes implementing ConfigScope and annotated with @ScopeName are automatically parsed
- No need to modify ConfigBuilder or create custom DSL parsers
- Pattern used throughout Nextflow: FusionConfig, CondaConfig, DockerConfig, etc.
- Provides type safety via @CompileStatic and validation via @ConfigOption
Reference Implementation:
Location: modules/nextflow/src/main/groovy/nextflow/fusion/FusionConfig.groovy
Pattern:
@ScopeName("modules")
@Description("Module version declarations")
@CompileStatic
class ModulesConfig implements ConfigScope {
    @ConfigOption
    @Description("Module version mappings")
    final Map<String, String> modules = [:]
    ModulesConfig() {}
    ModulesConfig(Map opts) {
        // Parse from config map
    }
}
ConfigScope Interface:
Location: modules/nf-lang/src/main/java/nextflow/config/spec/ConfigScope.java
public interface ConfigScope extends ExtensionPoint {}
RegistryConfig Pattern:
@ScopeName("registry")
@Description("Module registry configuration")
@CompileStatic
class RegistryConfig implements ConfigScope {
    static final String DEFAULT_REGISTRY_URL = 'https://registry.nextflow.io/api'
    @ConfigOption
    final Collection<String> url   // One or more URLs in priority order
    @ConfigOption
    final String apiKey            // API key; falls back to NXF_REGISTRY_TOKEN env var
    RegistryConfig() {
        url = [DEFAULT_REGISTRY_URL]
        apiKey = null
    }
    RegistryConfig(Map opts) {
        url = opts.url ?: [DEFAULT_REGISTRY_URL]
        apiKey = opts.apiKey as String
    }
    String getUrl() { url ? url[0] : DEFAULT_REGISTRY_URL }
    Collection<String> getAllUrls() { url ?: [DEFAULT_REGISTRY_URL] }
    String getApiKey() { apiKey ?: SysEnv.get('NXF_REGISTRY_TOKEN') }
}
Integration Point: ConfigBuilder automatically discovers and parses ConfigScope implementations via ExtensionPoint mechanism
Alternatives Considered:
- Custom DSL parsers (ModulesDsl/RegistryDsl): Rejected - unnecessary complexity, ConfigScope pattern handles this automatically
- JSON/YAML config file: Rejected - inconsistent with Nextflow config style
- Dedicated pipeline.yaml: Deferred per ADR Open Questions
4. Registry HTTP Communication
Research Question: How to communicate with module registry API?
Decision: Create HttpModuleRepository following HttpPluginRepository pattern
Rationale:
- HttpPluginRepository provides robust HTTP client with retry logic
- Uses HxClient from io.seqera.http (already a dependency)
- Handles authentication headers consistently
- Supports connection pooling and timeout configuration
Reference Implementation:
Location: modules/nf-commons/src/main/nextflow/plugin/HttpPluginRepository.groovy
Pattern:
class HttpModuleRepository {
    private final URI url
    private final HxClient httpClient
    private final String authToken
    ModuleInfo getModule(String name, String version)
    List<ModuleInfo> search(String query, int limit)
    Path download(String name, String version, Path target)
    void publish(String name, Path bundle)
}
API Endpoints (from ADR):
GET /api/modules?query=<text> # Search
GET /api/modules/{name} # Get module + latest release
GET /api/modules/{name}/releases # List all releases
GET /api/modules/{name}/{version} # Get specific release
GET /api/modules/{name}/{version}/download # Download bundle
POST /api/modules/{name} # Publish (authenticated)
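As a rough illustration of how a client might map these endpoints to request URIs, the sketch below builds them from a base URL. It assumes the base URL already ends in /api (as DEFAULT_REGISTRY_URL does) and that the module name is placed in the path as-is; how the registry expects names containing @ or / to be encoded is not specified here. The class RegistryEndpoints and its method names are hypothetical.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: maps the documented endpoints to URIs.
class RegistryEndpoints {
    private final String base; // assumed to end in "/api", stored without trailing slash

    RegistryEndpoints(String baseUrl) {
        this.base = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
    }

    // GET /api/modules?query=<text>
    URI search(String query) {
        return URI.create(base + "/modules?query="
                + URLEncoder.encode(query, StandardCharsets.UTF_8));
    }

    // GET /api/modules/{name}
    URI module(String name) {
        return URI.create(base + "/modules/" + name);
    }

    // GET /api/modules/{name}/releases
    URI releases(String name) {
        return URI.create(base + "/modules/" + name + "/releases");
    }

    // GET /api/modules/{name}/{version}
    URI release(String name, String version) {
        return URI.create(base + "/modules/" + name + "/" + version);
    }

    // GET /api/modules/{name}/{version}/download
    URI download(String name, String version) {
        return URI.create(base + "/modules/" + name + "/" + version + "/download");
    }
}
```

Centralizing path construction like this keeps endpoint changes in one place, independent of which HTTP client performs the request.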
Alternatives Considered:
- Direct HttpClient usage: Rejected - loses retry, pooling benefits
- gRPC protocol: Rejected - registry already uses REST
5. Authentication Patterns
Research Question: How to handle registry authentication?
Decision: Support NXF_REGISTRY_TOKEN env var + registry.apiKey config field
Rationale:
- Environment variable provides CI/CD compatibility
- apiKey config field allows explicit token configuration
- Authentication is only applied to the primary (first) registry URL
- Bearer token in Authorization header (standard HTTP auth)
Implementation:
RegistryConfig.getApiKey() returns:
1. registry.apiKey config value if set
2. NXF_REGISTRY_TOKEN environment variable as fallback
3. null if neither is set (unauthenticated requests)
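The fallback order and the primary-registry rule can be sketched as follows. This is an illustration under assumptions, not the actual implementation: the environment is passed in as a map so the logic is easy to test, and "primary registry" is approximated by checking whether the request URL starts with the first configured URL. The class and method names are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper mirroring the described token resolution.
class RegistryAuthSketch {

    // Config value wins; otherwise fall back to NXF_REGISTRY_TOKEN; otherwise null.
    static String resolveApiKey(String configValue, Map<String, String> env) {
        if (configValue != null && !configValue.isEmpty()) {
            return configValue;
        }
        return env.get("NXF_REGISTRY_TOKEN");
    }

    // Returns a Bearer header value only for requests to the first (primary) registry URL.
    static Optional<String> authorizationHeader(String requestUrl,
                                                List<String> registryUrls,
                                                String apiKey) {
        if (apiKey == null || registryUrls.isEmpty()) {
            return Optional.empty();
        }
        String primary = registryUrls.get(0);
        return requestUrl.startsWith(primary)
                ? Optional.of("Bearer " + apiKey)
                : Optional.empty();
    }
}
```

In real use the second argument of resolveApiKey would be System.getenv(), and the returned header value would be set as the Authorization header of the outgoing request.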
Config Syntax:
registry {
    apiKey = '${NXF_REGISTRY_TOKEN}'
}
Alternatives Considered:
- Per-registry token map (auth {} block): Was in initial design; simplified to single apiKey since only the primary registry uses authentication
- Secrets file (~/.nextflow/secrets.json): Possible future enhancement
- OAuth flow: Rejected for CLI — token-based simpler
6. Checksum Verification
Research Question: How to implement module integrity verification?
Decision: SHA-256 checksum stored in .checksum file, verified on every run
Rationale:
- SHA-256 is industry standard, already used for plugin verification
- .checksum file stores registry-provided checksum (from X-Checksum header)
- Local checksum computed on-demand and compared
- Mismatch indicates local modification (warn, don't override)
Implementation Pattern:
class ModuleChecksum {
    static final String ALGORITHM = 'SHA-256'
    static String compute(Path moduleDir) {
        // Hash all files in module directory
        // Exclude .checksum itself
        // Return hex-encoded SHA-256
    }
    static boolean verify(Path moduleDir) {
        def expected = moduleDir.resolve('.checksum').text.trim()
        def actual = compute(moduleDir)
        return expected == actual
    }
    static void save(Path moduleDir, String checksum) {
        moduleDir.resolve('.checksum').text = checksum
    }
}
Checksum Scope: Covers all files in module directory (main.nf, meta.yaml, README.md, etc.)
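One possible way to fill in the compute step is sketched below in Java. How the files are combined into a single digest is not specified above, so the choices here are assumptions: files are visited in sorted relative-path order, and both the path and the content feed the hash, with a zero byte between them. The class ModuleChecksumSketch and its methods are hypothetical; the hashing core takes an in-memory map so it can be exercised without touching disk.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ModuleChecksumSketch {

    // Hashes relative path + content of every file, skipping ".checksum".
    static String compute(Map<String, byte[]> files) {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
        // TreeMap sorts by path so the result does not depend on listing order
        for (Map.Entry<String, byte[]> e : new TreeMap<>(files).entrySet()) {
            if (e.getKey().equals(".checksum")) {
                continue;
            }
            digest.update(e.getKey().getBytes(StandardCharsets.UTF_8));
            digest.update((byte) 0); // separator between path and content
            digest.update(e.getValue());
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // Reads all regular files under moduleDir, keyed by '/'-separated relative path.
    static Map<String, byte[]> readModule(Path moduleDir) throws IOException {
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(moduleDir)) {
            paths = walk.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        Map<String, byte[]> files = new TreeMap<>();
        for (Path p : paths) {
            String rel = moduleDir.relativize(p).toString().replace('\\', '/');
            files.put(rel, Files.readAllBytes(p));
        }
        return files;
    }
}
```

A directory-level check would then be compute(readModule(dir)) compared against the trimmed contents of the .checksum file. Note that a digest computed this way only matches a registry-provided value if the registry hashes the module the same way.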
Alternatives Considered:
- Per-file checksums: Rejected - adds complexity, single checksum sufficient
- MD5: Rejected - SHA-256 more secure
7. Version Constraint Syntax
Research Question: What version constraint syntax to use for module dependencies?
Decision: Reuse existing Nextflow plugin version constraint syntax
Rationale:
- Already implemented and tested in plugin system
- Users familiar with existing nextflowVersion syntax
- Supports ranges, comparisons, exact versions
- No new parser code needed
Supported Syntax:
| Notation | Meaning | Example |
|---|---|---|
| 1.2.3 | Exact version | @nf-core/fastqc@1.0.0 |
| >=1.2.3 | Greater or equal | @nf-core/fastqc@>=1.0.0 |
| <=1.2.3 | Less or equal | @nf-core/fastqc@<=2.0.0 |
| >=1.2.0,<2.0.0 | Range | @nf-core/samtools@>=1.0.0,<2.0.0 |
Reference: Version parsing code exists in plugin system; reuse VersionNumber class
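To make the semantics of the table concrete, here is a small Java sketch of constraint matching. It is not the VersionNumber class and makes simplifying assumptions: versions are purely numeric dot-separated strings (no qualifiers such as -edge), only the operators shown in the table (>=, <=, < and exact match) are handled, and comma-separated clauses must all hold. The helper splitReference, which separates a module reference from its constraint at the last @, is likewise hypothetical.

```java
class VersionConstraintSketch {

    // Compares numeric dot-separated versions; missing parts count as 0.
    static int compare(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        for (int i = 0; i < Math.max(x.length, y.length); i++) {
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    // True when the version satisfies every comma-separated clause.
    static boolean matches(String version, String constraint) {
        for (String clause : constraint.split(",")) {
            String c = clause.trim();
            boolean ok;
            if (c.startsWith(">=")) {
                ok = compare(version, c.substring(2)) >= 0;
            } else if (c.startsWith("<=")) {
                ok = compare(version, c.substring(2)) <= 0;
            } else if (c.startsWith("<")) {
                ok = compare(version, c.substring(1)) < 0;
            } else {
                ok = compare(version, c) == 0;
            }
            if (!ok) {
                return false;
            }
        }
        return true;
    }

    // Splits "@scope/name@constraint" at the last '@'; the constraint is null when absent.
    static String[] splitReference(String ref) {
        int at = ref.lastIndexOf('@');
        return at > 0
                ? new String[] { ref.substring(0, at), ref.substring(at + 1) }
                : new String[] { ref, null };
    }
}
```

With these definitions, a release 1.5.0 satisfies >=1.2.0,<2.0.0 while 2.0.0 does not, which is the range behavior the last table row describes.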
Alternatives Considered:
- NPM-style ^ and ~: Rejected - inconsistent with existing Nextflow patterns
- Always latest: Rejected - breaks reproducibility
8. Tool Arguments Implementation
⚠️ REMOVED FROM ADR: The tool arguments feature (tools.<name>.args in meta.yaml and process config) was removed from the module system ADR. It is not implemented and not planned in the current scope. The meta.yaml format used in the actual implementation (ModuleSpec) does not include tool/argument definitions.
Summary of Key Decisions
| Area | Decision | Key Reference |
|---|---|---|
| CLI | JCommander subcommands; each extends CmdBase (ModuleRun extends CmdRun) | CmdModule.groovy |
| DSL Parser | SPI pattern — ResolveIncludeVisitor delegates to RemoteModuleResolver; DefaultRemoteModuleResolver bridges to ModuleResolver | ResolveIncludeVisitor.java, RemoteModuleResolver.java |
| Config | ModulesConfig + RegistryConfig (ConfigScope) | FusionConfig.groovy, ConfigScope.java |
| Registry HTTP | ModuleRegistryClient using HxClient + npr-api models | HttpPluginRepository.groovy |
| Authentication | NXF_REGISTRY_TOKEN env var or registry.apiKey config field (primary registry only) | RegistryConfig.groovy |
| Checksums | SHA-256/SHA-512, .checksum file, download integrity via X-Checksum header | ModuleChecksum.groovy |
| Version Storage | nextflow_spec.json (auto-managed); modules {} in nextflow.config (manual alternative) | PipelineSpec.groovy |
| Version Syntax | Plugin-compatible constraints | VersionNumber class |
| Tool Args | N/A |
Open Items (Deferred)
- Local vs managed module distinction: Resolved. @ prefix for registry modules only; local paths start with . or /
- Tool arguments: Removed from ADR; not in scope
- Module version location: Resolved. nextflow_spec.json (auto-managed by module install); modules {} block in nextflow.config supported as alternative
- DSL parser @scope/name include: ✅ Resolved. SPI pattern implemented (T017a-d)