add nextflow d30e48d

2026-04-29 23:01:54 +02:00
parent d0b12d668d
commit 97cc9058d3
2840 changed files with 730250 additions and 0 deletions


@@ -0,0 +1,39 @@
# Specification Quality Checklist: Nextflow Module System Client
**Purpose**: Validate specification completeness and quality before proceeding to planning
**Created**: 2026-01-15
**Feature**: [spec.md](../spec.md)
## Content Quality
- [x] No implementation details (languages, frameworks, APIs)
- [x] Focused on user value and business needs
- [x] Written for non-technical stakeholders
- [x] All mandatory sections completed
## Requirement Completeness
- [x] No [NEEDS CLARIFICATION] markers remain
- [x] Requirements are testable and unambiguous
- [x] Success criteria are measurable
- [x] Success criteria are technology-agnostic (no implementation details)
- [x] All acceptance scenarios are defined
- [x] Edge cases are identified
- [x] Scope is clearly bounded
- [x] Dependencies and assumptions identified
## Feature Readiness
- [x] All functional requirements have clear acceptance criteria
- [x] User scenarios cover primary flows
- [x] Feature meets measurable outcomes defined in Success Criteria
- [x] No implementation details leak into specification
## Notes
- Specification is complete and ready for `/speckit.plan`
- All 8 user stories have clear acceptance scenarios
- 29 functional requirements defined across 6 categories
- 8 success criteria defined with measurable outcomes
- Edge cases documented for error handling scenarios
- Registry backend is explicitly out of scope (assumed implemented)


@@ -0,0 +1,388 @@
openapi: 3.0.3
info:
  title: Nextflow Module Registry API
  description: |
    API specification for the Nextflow Module Registry.
    This documents the endpoints that the Nextflow module system client consumes.
    The registry backend is assumed to be already implemented at registry.nextflow.io.
  version: 1.0.0
  contact:
    name: Nextflow Team
    url: https://nextflow.io
servers:
  - url: https://registry.nextflow.io/api
    description: Production registry
security:
  - BearerAuth: []
paths:
  /modules:
    get:
      operationId: searchModules
      summary: Search modules
      description: Search for modules by query text (semantic search across name, description, keywords)
      tags:
        - Modules
      parameters:
        - name: query
          in: query
          required: true
          description: Search query text
          schema:
            type: string
            example: "alignment"
        - name: limit
          in: query
          required: false
          description: Maximum number of results
          schema:
            type: integer
            default: 10
            maximum: 100
      responses:
        '200':
          description: Search results
          content:
            application/json:
              schema:
                type: object
                properties:
                  modules:
                    type: array
                    items:
                      $ref: '#/components/schemas/ModuleSummary'
                  total:
                    type: integer
                    description: Total matching modules
        '400':
          description: Invalid query parameters
  /modules/{name}:
    get:
      operationId: getModule
      summary: Get module details
      description: Get module metadata including latest release information
      tags:
        - Modules
      parameters:
        - name: name
          in: path
          required: true
          description: Module name including scope (e.g., nf-core/fastqc)
          schema:
            type: string
            example: "nf-core/fastqc"
      responses:
        '200':
          description: Module details
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ModuleDetails'
        '404':
          description: Module not found
    post:
      operationId: publishModule
      summary: Publish module version
      description: Upload and publish a new module version (requires authentication)
      tags:
        - Modules
      security:
        - BearerAuth: []
      parameters:
        - name: name
          in: path
          required: true
          description: Module name including scope
          schema:
            type: string
      requestBody:
        required: true
        content:
          multipart/form-data:
            schema:
              type: object
              required:
                - bundle
              properties:
                bundle:
                  type: string
                  format: binary
                  description: Module bundle (tar.gz archive)
                tags:
                  type: array
                  items:
                    type: string
                  description: Additional tags for discoverability
      responses:
        '201':
          description: Module published successfully
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/PublishResult'
        '400':
          description: Invalid module bundle or manifest
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ValidationError'
        '401':
          description: Authentication required
        '403':
          description: Insufficient permissions for scope
  /modules/{name}/releases:
    get:
      operationId: listReleases
      summary: List module releases
      description: Get all available versions of a module
      tags:
        - Modules
      parameters:
        - name: name
          in: path
          required: true
          schema:
            type: string
      responses:
        '200':
          description: List of releases
          content:
            application/json:
              schema:
                type: object
                properties:
                  releases:
                    type: array
                    items:
                      $ref: '#/components/schemas/ReleaseInfo'
        '404':
          description: Module not found
  /modules/{name}/{version}:
    get:
      operationId: getRelease
      summary: Get specific release
      description: Get metadata for a specific module version
      tags:
        - Modules
      parameters:
        - name: name
          in: path
          required: true
          schema:
            type: string
        - name: version
          in: path
          required: true
          description: Semantic version (e.g., 1.0.0)
          schema:
            type: string
            example: "1.0.0"
      responses:
        '200':
          description: Release details
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ReleaseInfo'
        '404':
          description: Module or version not found
  /modules/{name}/{version}/download:
    get:
      operationId: downloadModule
      summary: Download module bundle
      description: Download the module source archive for a specific version
      tags:
        - Modules
      parameters:
        - name: name
          in: path
          required: true
          schema:
            type: string
        - name: version
          in: path
          required: true
          schema:
            type: string
      responses:
        '200':
          description: Module bundle
          headers:
            X-Checksum:
              description: SHA-256 checksum of the bundle
              schema:
                type: string
                example: "sha256:abc123..."
            Content-Disposition:
              description: Suggested filename
              schema:
                type: string
                example: "attachment; filename=nf-core-fastqc-1.0.0.tar.gz"
          content:
            application/gzip:
              schema:
                type: string
                format: binary
        '404':
          description: Module or version not found
components:
  securitySchemes:
    BearerAuth:
      type: http
      scheme: bearer
      description: |
        Authentication token. Can be provided via:
        - NXF_REGISTRY_TOKEN environment variable
        - registry.auth config block
  schemas:
    ModuleSummary:
      type: object
      required:
        - name
        - latestVersion
        - description
      properties:
        name:
          type: string
          description: Full module name with scope
          example: "nf-core/fastqc"
        latestVersion:
          type: string
          description: Latest available version
          example: "1.2.0"
        description:
          type: string
          description: Short module description
          example: "Run FastQC on sequenced reads"
        downloadCount:
          type: integer
          description: Total download count
          example: 15420
        keywords:
          type: array
          items:
            type: string
          example: ["quality control", "fastq"]
    ModuleDetails:
      type: object
      required:
        - name
        - latestVersion
        - description
      properties:
        name:
          type: string
          example: "nf-core/fastqc"
        latestVersion:
          type: string
          example: "1.2.0"
        description:
          type: string
        authors:
          type: array
          items:
            type: string
          example: ["@drpatelh"]
        maintainers:
          type: array
          items:
            type: string
        license:
          type: string
          example: "MIT"
        keywords:
          type: array
          items:
            type: string
        downloadCount:
          type: integer
        createdAt:
          type: string
          format: date-time
        updatedAt:
          type: string
          format: date-time
        latestRelease:
          $ref: '#/components/schemas/ReleaseInfo'
    ReleaseInfo:
      type: object
      required:
        - version
        - checksum
        - publishedAt
      properties:
        version:
          type: string
          description: Semantic version
          example: "1.2.0"
        checksum:
          type: string
          description: SHA-256 checksum of bundle
          example: "sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
        publishedAt:
          type: string
          format: date-time
        size:
          type: integer
          description: Bundle size in bytes
          example: 45678
        requires:
          type: object
          properties:
            nextflow:
              type: string
              example: ">=24.04.0"
            plugins:
              type: array
              items:
                type: string
            modules:
              type: array
              items:
                type: string
    PublishResult:
      type: object
      properties:
        name:
          type: string
        version:
          type: string
        checksum:
          type: string
        publishedAt:
          type: string
          format: date-time
        downloadUrl:
          type: string
          format: uri
    ValidationError:
      type: object
      properties:
        code:
          type: string
          enum:
            - INVALID_MANIFEST
            - MISSING_MAIN_NF
            - MISSING_README
            - INVALID_VERSION
            - BUNDLE_TOO_LARGE
            - DUPLICATE_VERSION
        message:
          type: string
        details:
          type: array
          items:
            type: string


@@ -0,0 +1,272 @@
# Data Model: Nextflow Module System Client
**Date**: 2026-01-19
**Feature**: 251117-module-system
**Last Updated**: 2026-02-27 (reflects final implementation)
## Overview
This document defines the data entities, their attributes, relationships, and state transitions for the Nextflow module system client implementation.
---
## Entity Definitions
### 1. ModuleReference
Represents a reference to a module in DSL `include` statements.
```groovy
@CompileStatic
class ModuleReference {
    String scope       // e.g., "nf-core"
    String name        // e.g., "fastqc"
    String fullName    // e.g., "@nf-core/fastqc"

    static ModuleReference parse(String source) {
        // Parses "@scope/name" format
    }

    boolean isRegistryModule() {
        return fullName.startsWith('@')
    }
}
```
**Validation Rules**:
- `scope`: lowercase alphanumeric with hyphens, pattern `[a-z0-9][a-z0-9-]*`
- `name`: lowercase alphanumeric with underscores/hyphens, pattern `[a-z][a-z0-9_-]*`
- `fullName`: must match `^@[a-z0-9][a-z0-9-]*/[a-z][a-z0-9_-]*$`
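The validation rules above can be sketched as a small parser. This is a minimal Java illustration, not the actual `ModuleReference` class: the class name `ModuleRefSketch` and its error message are hypothetical, and only the regular expression is taken from the `fullName` rule.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative stand-in for ModuleReference; not the actual Nextflow class. */
final class ModuleRefSketch {
    // Same pattern as the fullName rule, with capture groups for scope and name.
    private static final Pattern FULL_NAME =
            Pattern.compile("^@([a-z0-9][a-z0-9-]*)/([a-z][a-z0-9_-]*)$");

    final String scope;
    final String name;

    private ModuleRefSketch(String scope, String name) {
        this.scope = scope;
        this.name = name;
    }

    /** Parses "@scope/name"; throws IllegalArgumentException for anything else. */
    static ModuleRefSketch parse(String source) {
        Matcher m = FULL_NAME.matcher(source == null ? "" : source);
        if (!m.matches()) {
            throw new IllegalArgumentException("Invalid module reference: " + source);
        }
        return new ModuleRefSketch(m.group(1), m.group(2));
    }

    /** Rebuilds the "@scope/name" form from the parsed parts. */
    String fullName() {
        return "@" + scope + "/" + name;
    }

    /** Mirrors the '@' prefix check that separates registry modules from local paths. */
    static boolean isRegistryModule(String source) {
        return source != null && source.startsWith("@");
    }
}
```

Under these assumptions, `ModuleRefSketch.parse("@nf-core/fastqc")` yields scope `nf-core` and name `fastqc`, while `nf-core/fastqc` (no `@`) and `@NF-Core/fastqc` (uppercase) are rejected.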
---
### 2. ModuleSpec
Parsed representation of `meta.yaml` file. Class: `nextflow.module.ModuleSpec`.
```groovy
@CompileStatic
class ModuleSpec {
    String name                     // e.g., "nf-core/fastqc" (without @)
    String version                  // e.g., "1.0.0"
    String description              // Module description
    List<String> keywords           // Discovery keywords
    List<String> authors            // GitHub handles
    String license                  // SPDX identifier
    Map<String, String> requires    // dependency -> version constraint

    static ModuleSpec load(Path metaYamlPath) { ... }
    List<String> validate() { ... } // Returns list of validation errors
    boolean isValid() { ... }
}
```
**Validation Rules**:
- `name`: Must match `scope/name` or `scope/path/to/name` pattern
- `version`: Must be valid SemVer (`MAJOR.MINOR.PATCH[-prerelease]`)
- `description`, `license`: Required fields (validate() reports missing)
**Note**: Tool/argument definitions were removed from the ADR and are not part of `ModuleSpec`.
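A rough Java sketch of how `validate()` could collect errors under the rules above. The class name, the flat `Map` input, and the error strings are assumptions made for illustration. The name pattern is one possible reading of "`scope/name` or `scope/path/to/name`", and build metadata (`+build`) is left out because the rules only mention a prerelease suffix.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

/** Illustrative validation helper; not the actual ModuleSpec implementation. */
final class ModuleSpecValidationSketch {
    // MAJOR.MINOR.PATCH with an optional -prerelease suffix.
    private static final Pattern SEMVER = Pattern.compile(
            "^(0|[1-9]\\d*)\\.(0|[1-9]\\d*)\\.(0|[1-9]\\d*)(-[0-9A-Za-z-]+(\\.[0-9A-Za-z-]+)*)?$");

    // Assumed reading of the name rule: a scope followed by one or more path segments.
    private static final Pattern NAME = Pattern.compile(
            "^[a-z0-9][a-z0-9-]*(/[a-z][a-z0-9_-]*)+$");

    /** Returns one message per violated rule; an empty list means the manifest is valid. */
    static List<String> validate(Map<String, String> meta) {
        List<String> errors = new ArrayList<>();
        String name = meta.get("name");
        if (name == null || !NAME.matcher(name).matches()) {
            errors.add("name must look like scope/name or scope/path/to/name");
        }
        String version = meta.get("version");
        if (version == null || !SEMVER.matcher(version).matches()) {
            errors.add("version must be MAJOR.MINOR.PATCH[-prerelease]");
        }
        for (String field : List.of("description", "license")) {
            String value = meta.get(field);
            if (value == null || value.isBlank()) {
                errors.add(field + " is required");
            }
        }
        return errors;
    }
}
```

Returning a list rather than throwing on the first problem matches the `List<String> validate()` signature shown above and lets a caller report all issues at once.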
---
### 3. InstalledModule
Represents a module in the local `modules/` directory.
```groovy
@CompileStatic
class InstalledModule {
    ModuleReference reference
    Path directory          // e.g., /project/modules/@nf-core/fastqc
    Path mainFile           // e.g., /project/modules/@nf-core/fastqc/main.nf
    Path manifestFile       // e.g., /project/modules/@nf-core/fastqc/meta.yaml
    Path checksumFile       // e.g., /project/modules/@nf-core/fastqc/.checksum
    String installedVersion
    String expectedChecksum

    ModuleIntegrity getIntegrity() {
        // Compute and compare checksum
    }
}

enum ModuleIntegrity {
    VALID,              // Checksum matches
    MODIFIED,           // Checksum mismatch (local changes)
    MISSING_CHECKSUM,   // No .checksum file
    CORRUPTED           // Missing required files
}
```
**State Transitions**:
```
[NOT_INSTALLED] --install--> [VALID]
[VALID] --user edits--> [MODIFIED]
[MODIFIED] --install -force--> [VALID]
[VALID] --version change in config--> [VALID] (replaced)
[MODIFIED] --version change in config--> [MODIFIED] (blocked, warn)
```
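The integrity states can be illustrated with a short Java sketch. It is not the actual `getIntegrity()` implementation. It assumes the stored checksum uses the `sha256:<hex>` form shown in the registry API examples, that a missing `main.nf` is what makes a module `CORRUPTED`, and that the bytes to hash are supplied by the caller, since this document does not say how a module directory is hashed.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Illustrative integrity check; names and check ordering are assumptions. */
final class IntegritySketch {
    enum ModuleIntegrity { VALID, MODIFIED, MISSING_CHECKSUM, CORRUPTED }

    /** Returns "sha256:<lowercase hex>" for the given bytes. */
    static String sha256(byte[] content) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder("sha256:");
            for (byte b : digest.digest(content)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is a mandatory JDK algorithm, so this should not happen.
            throw new IllegalStateException("SHA-256 is not available", e);
        }
    }

    /** Maps the observed facts about an installed module onto an integrity state. */
    static ModuleIntegrity classify(boolean hasMainNf, String expectedChecksum, String actualChecksum) {
        if (!hasMainNf) {
            return ModuleIntegrity.CORRUPTED;          // missing required files
        }
        if (expectedChecksum == null || expectedChecksum.isBlank()) {
            return ModuleIntegrity.MISSING_CHECKSUM;   // no .checksum content
        }
        return expectedChecksum.trim().equals(actualChecksum)
                ? ModuleIntegrity.VALID                // checksum matches
                : ModuleIntegrity.MODIFIED;            // local changes
    }
}
```

The `ReleaseInfo` checksum example in the API contract, `sha256:e3b0c442...b855`, is the SHA-256 of empty input, so `sha256(new byte[0])` reproduces it.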
---
### 4. ModulesConfig and RegistryConfig
Modules configuration is loaded from `nextflow_spec.json` (or, as an alternative, from the `modules {}` block in `nextflow.config`). Registry settings come from the `registry {}` block in `nextflow.config`.
```groovy
@ScopeName("modules")
@CompileStatic
class ModulesConfig implements ConfigScope {
    Map<String, String> modules = [:]  // module fullName -> version

    String getVersion(String moduleName) { ... }
    boolean hasVersion(String moduleName) { ... }
}

@ScopeName("registry")
@CompileStatic
class RegistryConfig implements ConfigScope {
    static final String DEFAULT_REGISTRY_URL = 'https://registry.nextflow.io/api'

    Collection<String> url   // Registry URL(s) in priority order
    String apiKey            // API key (falls back to NXF_REGISTRY_TOKEN env var)

    String getUrl()          // Returns primary (first) URL
    Collection<String> getAllUrls()
    String getApiKey()       // Returns apiKey or NXF_REGISTRY_TOKEN
}
```
**Config Syntax**:
```nextflow
// nextflow_spec.json (current approach)
{
  "modules": {
    "@nf-core/fastqc": "1.0.0",
    "@nf-core/bwa-align": "1.2.0"
  }
}

// nextflow.config: modules {} block (alternative, not currently used)
modules {
    '@nf-core/fastqc' = '1.0.0'
    '@nf-core/bwa-align' = '1.2.0'
}

// nextflow.config: registry {} block
registry {
    url = [
        'https://private.registry.myorg.com',
        'https://registry.nextflow.io/api'
    ]
    apiKey = '${MYORG_TOKEN}'  // Only applied to the primary registry
}
```
---
### 5. DefaultRemoteModuleResolver (SPI)
Bridges the DSL parser to the module resolution runtime. Class: `nextflow.module.DefaultRemoteModuleResolver`.
```groovy
// Implements: nextflow.module.spi.RemoteModuleResolver (nf-lang)
class DefaultRemoteModuleResolver implements RemoteModuleResolver {
    int getPriority() { return 0 }  // Can be overridden by plugins with higher priority

    Path resolve(String moduleName, Path baseDir) {
        // 1. Parse ModuleReference from "@scope/name"
        // 2. Read version constraints from nextflow_spec.json / ModulesConfig
        // 3. Call ModuleResolver.installModule(reference, version, autoInstall=true)
        // 4. Return path to modules/@scope/name/main.nf
    }
}
```
The SPI is loaded via Java `ServiceLoader` by `RemoteModuleResolverProvider` (in `nf-lang`), which selects the highest-priority implementation available.
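The "highest priority wins" selection can be sketched in Java as follows. The `Resolver` interface here is a hypothetical mirror of the `RemoteModuleResolver` shape shown above, and `pickHighestPriority` is an illustrative helper, not the actual `RemoteModuleResolverProvider` code.

```java
import java.nio.file.Path;
import java.util.Comparator;
import java.util.Optional;
import java.util.stream.StreamSupport;

/** Illustrative priority-based selection among SPI implementations. */
final class ResolverSelectionSketch {
    /** Hypothetical mirror of the SPI shape: a priority plus a resolve method. */
    interface Resolver {
        int getPriority();
        Path resolve(String moduleName, Path baseDir);
    }

    /** Returns the candidate with the largest priority, or empty if there are none. */
    static Optional<Resolver> pickHighestPriority(Iterable<Resolver> candidates) {
        return StreamSupport.stream(candidates.spliterator(), false)
                .max(Comparator.comparingInt(Resolver::getPriority));
    }
}
```

Because `java.util.ServiceLoader<T>` implements `Iterable<T>`, the result of `ServiceLoader.load(Resolver.class)` could be passed straight to `pickHighestPriority`. An empty result corresponds to the case where a fallback resolver reports that no implementation was found.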
---
### 6. PipelineSpec
Reads and writes `nextflow_spec.json` in the project root. Class: `nextflow.pipeline.PipelineSpec`.
```groovy
class PipelineSpec {
    PipelineSpec(Path baseDir)

    Map<String, String> getModules()
    void addModuleEntry(String name, String version)
    boolean removeModuleEntry(String name)
}
```
---
### 7. ModuleResolutionResult
Result of module resolution process.
```groovy
@CompileStatic
class ModuleResolutionResult {
    ModuleReference reference
    Path resolvedPath           // Absolute path to main.nf
    ResolutionAction action
    String message              // Warning/info message if any
}

enum ResolutionAction {
    USE_LOCAL,          // Used existing local module
    DOWNLOADED,         // Downloaded from registry
    REPLACED,           // Replaced local with different version
    BLOCKED_MODIFIED,   // Local modified, not replaced (warning issued)
    FAILED              // Resolution failed (error)
}
```
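Combining these actions with the state transitions listed for `InstalledModule`, the decision could look roughly like the Java sketch below. The method `decide` and its parameters are hypothetical. Two behaviours are assumptions not stated in this document: a `MODIFIED` module at the requested version is simply used as-is, and `MISSING_CHECKSUM` / `CORRUPTED` are treated like `VALID`. `FAILED` covers errors and is not modelled.

```java
/** Illustrative mapping from local state to a resolution action. */
final class ResolutionDecisionSketch {
    enum ModuleIntegrity { VALID, MODIFIED, MISSING_CHECKSUM, CORRUPTED }
    enum ResolutionAction { USE_LOCAL, DOWNLOADED, REPLACED, BLOCKED_MODIFIED, FAILED }

    /**
     * @param installedVersion version found in modules/, or null if not installed
     * @param requestedVersion version pinned in the project configuration
     * @param integrity        integrity state of the local copy (ignored when not installed)
     * @param force            true when the user passed -force
     */
    static ResolutionAction decide(String installedVersion, String requestedVersion,
                                   ModuleIntegrity integrity, boolean force) {
        if (installedVersion == null) {
            return ResolutionAction.DOWNLOADED;       // [NOT_INSTALLED] --install--> [VALID]
        }
        boolean sameVersion = installedVersion.equals(requestedVersion);
        if (integrity == ModuleIntegrity.MODIFIED) {
            if (force) {
                return ResolutionAction.REPLACED;     // [MODIFIED] --install -force--> [VALID]
            }
            // Local edits are never overwritten silently.
            return sameVersion ? ResolutionAction.USE_LOCAL : ResolutionAction.BLOCKED_MODIFIED;
        }
        return sameVersion ? ResolutionAction.USE_LOCAL : ResolutionAction.REPLACED;
    }
}
```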
---
## Relationships
```
PipelineSpec (1) -----> (*) ModuleReference (nextflow_spec.json)
ModulesConfig (1) -----> (*) ModuleReference (nextflow.config alternative)
RegistryConfig (1) -----> (*) Registry URLs
ModuleReference (1) -----> (0..1) InstalledModule
|
v (via registry)
ModuleSpec (1) <----- InstalledModule (from meta.yaml)
```
---
## Storage Layout
```
project-root/
├── nextflow.config # registry{} block; optional modules{} block
├── nextflow_spec.json # auto-managed module version pins
├── main.nf # include { X } from '@scope/name'
└── modules/
    └── @scope/
        └── name/
            ├── .checksum       # SHA-256 from registry (download integrity)
            ├── main.nf         # Entry point (required)
            ├── meta.yaml       # Manifest (required for publishing)
            ├── README.md       # Documentation (required for publishing)
            └── [other files]   # Supporting files
```
---
## Validation Summary
| Entity | Field | Validation |
|--------|-------|------------|
| ModuleReference | fullName | Pattern: `^@[a-z0-9][a-z0-9-]*/[a-z][a-z0-9_-]*$` |
| ModuleSpec | name | Pattern: `scope/name` or `scope/path/to/name` |
| ModuleSpec | version | SemVer: `MAJOR.MINOR.PATCH[-prerelease]` |
| ModuleSpec | description, license | Required (non-empty) |
| InstalledModule | directory | Must contain main.nf |
| ModulesConfig | modules keys | Must be valid module fullName |
| RegistryConfig | url | Valid HTTPS URL |


@@ -0,0 +1,137 @@
# Implementation Plan: Nextflow Module System Client
**Branch**: `251117-module-system` | **Date**: 2026-01-19 | **Spec**: [spec.md](spec.md)
**Input**: Feature specification from `/specs/251117-module-system/spec.md`
## Summary
Implement a client-side module system for Nextflow that lets pipeline developers include remote modules from the Nextflow registry using the `@scope/name` syntax, manage versions via `nextflow_spec.json` (with the `modules {}` block in `nextflow.config` as an alternative), configure module parameters via `meta.yaml`, and use CLI commands (install, search, list, info, remove, publish, run). The implementation extends the existing DSL parser and config parser, and follows plugin system patterns for registry communication and authentication.
## Technical Context
**Language/Version**: Groovy 4.0.29 (targeting Java 17 runtime, Java 21 toolchain for development)
**Primary Dependencies**:
- Existing Nextflow DSL parser (nf-lang module, ANTLR)
- Existing config parser (ConfigBuilder, ConfigParser)
- Existing HTTP client (HxClient from io.seqera.http)
- Existing plugin authentication infrastructure
- Existing npr-api (registry data models and schema validation)
**Storage**: Local filesystem (`modules/@scope/name/` per-project, `.checksum` files)
**Testing**: Spock Framework for unit tests, integration tests in `tests/` directory
**Target Platform**: JVM 17+ (same as Nextflow core)
**Project Type**: Multi-module Gradle project extension (core modules + CLI)
**Performance Goals**: Module resolution adds <2 seconds to workflow startup when cached locally (SC-002)
**Constraints**:
- Module bundle size limit: 1MB uncompressed (enforced by registry)
- Backward compatibility: Must not break existing `include` statements
- Offline operation: Must work with locally cached modules
**Scale/Scope**: Ecosystem-wide module distribution; typical project: 5-20 modules
## Constitution Check
*GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.*
| Principle | Status | Evidence |
|-----------|--------|----------|
| I. Modular Architecture | PASS | Module system client belongs in `modules/nextflow` (core CLI) with potential shared utilities in `nf-commons` |
| II. Test-Driven Quality | PASS | Unit tests (Spock), integration tests planned, smoke test support |
| III. Dataflow Programming Model | PASS | Modules are process definitions; include resolution at parse time preserves dataflow semantics |
| IV. Apache 2.0 License | PASS | All new code will include Apache 2.0 headers |
| V. DCO Sign-off | PASS | All commits will use `git commit -s` |
| VI. Semantic Versioning | PASS | Modules use SemVer; plugin-compatible version constraint syntax |
| VII. Groovy Idioms | PASS | Follow existing patterns from CmdPlugin, ConfigBuilder, HttpPluginRepository |
**Gate Status**: PASS - No violations requiring justification
## Project Structure
### Documentation (this feature)
```text
specs/251117-module-system/
├── plan.md # This file
├── spec.md # Feature specification
├── research.md # Phase 0 output
├── data-model.md # Phase 1 output
├── quickstart.md # Phase 1 output
├── contracts/ # Phase 1 output (API contracts)
└── tasks.md # Phase 2 output (from /speckit.tasks)
```
### Source Code (repository root)
```text
modules/nextflow/src/main/groovy/nextflow/
├── cli/
│ ├── CmdModule.groovy # Main module command (uses JCommander)
│ └── module/
│ ├── ModuleInstall.groovy # Install subcommand (extends CmdBase)
│ ├── ModuleRun.groovy # Run subcommand (extends CmdRun)
│ ├── ModuleList.groovy # List subcommand (extends CmdBase)
│ ├── ModuleRemove.groovy # Remove subcommand (extends CmdBase)
│ ├── ModuleSearch.groovy # Search subcommand (extends CmdBase)
│ ├── ModuleInfo.groovy # Info subcommand (extends CmdBase)
│ └── ModulePublish.groovy # Publish subcommand (extends CmdBase)
├── config/
│ ├── ModulesConfig.groovy # modules{} config scope
│ └── RegistryConfig.groovy # registry{} config scope (fields: url, apiKey)
├── module/
│ ├── ModuleReference.groovy # @scope/name parser
│ ├── ModuleResolver.groovy # Core resolution logic (version/integrity/install)
│ ├── ModuleStorage.groovy # Local filesystem operations
│ ├── ModuleRegistryClient.groovy # HTTP registry client
│ ├── ModuleChecksum.groovy # SHA-256 integrity verification
│ ├── ModuleSpec.groovy # Module manifest (meta.yaml) entity
│ ├── InstalledModule.groovy # Installed module entity
│ └── DefaultRemoteModuleResolver.groovy # SPI impl: bridges DSL parser → ModuleResolver
└── pipeline/
    └── PipelineSpec.groovy # nextflow_spec.json read/write
modules/nf-lang/src/main/java/nextflow/script/
└── control/ResolveIncludeVisitor.java # MODIFIED: Delegates @scope/name to SPI resolver
modules/nf-lang/src/main/java/nextflow/module/spi/
├── RemoteModuleResolver.java # SPI interface (extensible by plugins)
├── RemoteModuleResolverProvider.java # ServiceLoader wrapper (singleton)
└── FallbackRemoteModuleResolver.java # Error fallback when no impl found
modules/nextflow/src/test/groovy/nextflow/
├── cli/module/
│ ├── ModuleInstallTest.groovy
│ ├── ModuleRunTest.groovy
│ └── [other subcommand tests]
└── module/
    ├── ModuleResolverTest.groovy
    ├── ModuleStorageTest.groovy
    └── [other module tests]
tests/modules/
├── install-module.nf # Integration tests
├── run-module.nf
└── [other integration tests]
```
**Structure Decision**: Implementation extends existing Nextflow core modules following modular architecture. New code in `modules/nextflow` for CLI and core logic. DSL parser extension in `modules/nf-lang` via SPI. No new plugins required.
## Architecture Notes
### Remote Module Inclusion — SPI Pattern
The DSL parser (`ResolveIncludeVisitor`) detects the `@` prefix in `include` statements and delegates resolution to a `RemoteModuleResolver` SPI loaded via Java `ServiceLoader`. This keeps `nf-lang` decoupled from the runtime module resolution logic:
```
include { X } from '@nf-core/fastqc'
    ResolveIncludeVisitor (nf-lang)
        source.startsWith("@") → RemoteModuleResolverProvider.getInstance().resolve(...)
    DefaultRemoteModuleResolver (nextflow module)
        auto-installs via ModuleResolver if missing → returns Path to main.nf
```
The `RemoteModuleResolver` interface in `nf-lang` can be overridden by plugins with a higher priority value.
## Complexity Tracking
No constitution violations requiring justification.


@@ -0,0 +1,274 @@
# Quickstart: Nextflow Module System
This guide covers the essential workflows for using the Nextflow module system.
## Prerequisites
- Nextflow 25.x or later (with module system support)
- Network connectivity for initial module downloads
- Optional: `NXF_REGISTRY_TOKEN` for publishing
---
## 1. Install and Use a Module
### Install a module
```bash
# Install latest version
nextflow module install nf-core/fastqc
# Install specific version
nextflow module install nf-core/fastqc -version 1.0.0
```
This downloads the module to `modules/@nf-core/fastqc/` and updates `nextflow_spec.json` with the installed version.
### Use in your workflow
```groovy
// main.nf
include { FASTQC } from '@nf-core/fastqc'

workflow {
    reads = Channel.fromFilePairs('data/*_{1,2}.fastq.gz')
    FASTQC(reads)
}
```
### Run your workflow
```bash
nextflow run main.nf
```
---
## 2. Run a Module Directly
Execute a module without writing a wrapper workflow:
```bash
# Basic usage
nextflow module run nf-core/fastqc --input 'data/*.fastq.gz'
# Run specific version
nextflow module run nf-core/fastqc --input 'data/*.fastq.gz' -version 1.0.0
# With Nextflow options
nextflow module run nf-core/salmon \
    --reads reads.fq \
    --index salmon_index \
    -profile docker \
    -resume
```
## 3. View Module Information
```bash
# Show module metadata and a generated usage template
nextflow module info nf-core/fastqc
# Show a specific version
nextflow module info nf-core/fastqc -version 1.0.0
# JSON output for scripting
nextflow module info nf-core/fastqc -json
```
---
## 4. Manage Module Versions
### Version tracking
Module versions are automatically recorded in `nextflow_spec.json` by `nextflow module install`. You can also pin versions manually by editing `nextflow_spec.json`:
```json
{
  "modules": {
    "@nf-core/fastqc": "1.0.0",
    "@nf-core/bwa-align": "1.2.0"
  }
}
```
Alternatively, declare versions in `nextflow.config` (not currently used):
```nextflow
modules {
    '@nf-core/fastqc' = '1.0.0'
    '@nf-core/bwa-align' = '1.2.0'
}
```
### Check module status
```bash
# List all modules
nextflow module list
# Output:
# MODULE CONFIGURED INSTALLED LATEST STATUS
# @nf-core/fastqc 1.0.0 1.0.0 1.2.0 outdated
# @nf-core/bwa-align 1.2.0 1.2.0 1.2.0 up-to-date
# @nf-core/samtools 2.1.0 - 2.1.0 missing
```
### Update a module
Change the version in `nextflow_spec.json` (or `nextflow.config`), then run your workflow. Nextflow automatically downloads the new version.
---
## 5. Search for Modules
```bash
# Search by keyword
nextflow module search alignment
# Limit results
nextflow module search "quality control" -limit 5
# JSON output for scripting
nextflow module search bwa -json
```
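For orientation, the search command maps onto the `GET /modules` endpoint of the registry API contract (`query` required, `limit` optional with a maximum of 100). The Java sketch below only builds such a request. It uses the JDK's `java.net.http` types as a stand-in for the project's actual HTTP client, and the class and method names are hypothetical. The lower bound of 1 on `limit` is an assumption; the contract only states the maximum.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

/** Illustrative request builder for the module search endpoint. */
final class RegistrySearchSketch {
    static final String DEFAULT_REGISTRY_URL = "https://registry.nextflow.io/api";

    /** Builds GET {registryUrl}/modules?query=...&limit=... with an optional bearer token. */
    static HttpRequest searchRequest(String registryUrl, String query, int limit, String token) {
        if (limit < 1 || limit > 100) {
            throw new IllegalArgumentException("limit must be between 1 and 100");
        }
        // Form-style encoding: spaces become '+', reserved characters are percent-encoded.
        String encoded = URLEncoder.encode(query, StandardCharsets.UTF_8);
        URI uri = URI.create(registryUrl + "/modules?query=" + encoded + "&limit=" + limit);
        HttpRequest.Builder builder = HttpRequest.newBuilder(uri)
                .header("Accept", "application/json")
                .GET();
        if (token != null && !token.isBlank()) {
            // Bearer scheme, as declared under securitySchemes in the API contract.
            builder.header("Authorization", "Bearer " + token);
        }
        return builder.build();
    }
}
```

With these assumptions, `nextflow module search "quality control" -limit 5` would correspond to a request for `https://registry.nextflow.io/api/modules?query=quality+control&limit=5`.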
---
## 6. Work with Private Registries
### Configure authentication
```nextflow
// nextflow.config
registry {
    // Multiple registries (tried in order)
    url = [
        'https://private.registry.myorg.com',
        'https://registry.nextflow.io/api'
    ]
    apiKey = 'MYORG_TOKEN'  // Applied to the primary (first) registry only
}
```
### Or use environment variable
```bash
export NXF_REGISTRY_TOKEN=your-token-here
nextflow module install nf-core/fastqc
```
---
## 7. Publish a Module
### Prepare your module
```
my-module/
├── main.nf # Required: entry point
├── meta.yaml # Required for registry
├── README.md # Required for registry
└── tests/ # Recommended
```
### Validate before publishing
```bash
nextflow module publish myorg/my-module -dry-run
```
### Publish to registry
```bash
export NXF_REGISTRY_TOKEN=your-token
nextflow module publish myorg/my-module
```
---
## 8. Handle Local Modifications
If you modify a module locally (for debugging), Nextflow protects your changes:
```bash
# This warns and does NOT override your changes
nextflow module install nf-core/fastqc -version 1.1.0
# Warning: Module @nf-core/fastqc has local modifications. Use -force to override.
# Force replacement if needed
nextflow module install nf-core/fastqc -version 1.1.0 -force
```
---
## 9. Remove a Module
```bash
# Remove module and config entry
nextflow module remove nf-core/fastqc
# Keep config entry (just delete local files)
nextflow module remove nf-core/fastqc -keep-config
# Keep local files (just remove from config)
nextflow module remove nf-core/fastqc -keep-files
```
---
## Common Patterns
### Offline operation
Modules are cached locally in `modules/`. Once installed, workflows run without network access.
### Git integration
The `modules/` directory is intended to be committed to your git repository:
```bash
git add modules/
git commit -m "Add module dependencies"
```
---
## Troubleshooting
### Module not found
```bash
# Check if module exists in registry
nextflow module search exact-module-name
# Verify spelling and scope in include statements (CLI commands take the name without @)
# Correct: @nf-core/fastqc
# Wrong: @nfcore/fastqc, nf-core/fastqc (without @)
```
### Authentication errors
```bash
# Verify token is set
echo $NXF_REGISTRY_TOKEN
# Check registry config
grep -A5 'registry' nextflow.config
```
### Version conflicts
If two modules require incompatible versions of a dependency:
- Nextflow selects the highest compatible version automatically
- If no compatible version exists, an error lists the conflicts
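To make "highest compatible version" concrete, here is a minimal Java sketch. It is not the resolver's actual algorithm. The constraint forms it understands (`>=X`, `<=X`, or a bare `X` meaning exactly `X`) are assumptions, since this guide does not define the constraint syntax, and prerelease suffixes are ignored when comparing.

```java
import java.util.List;
import java.util.Optional;

/** Illustrative selection of the highest version that satisfies every constraint. */
final class VersionSelectionSketch {
    /** Compares MAJOR.MINOR.PATCH numerically; any "-prerelease" suffix is ignored. */
    static int compare(String a, String b) {
        int[] x = parse(a);
        int[] y = parse(b);
        for (int i = 0; i < 3; i++) {
            if (x[i] != y[i]) {
                return Integer.compare(x[i], y[i]);
            }
        }
        return 0;
    }

    private static int[] parse(String version) {
        String[] parts = version.split("-", 2)[0].split("\\.");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Not MAJOR.MINOR.PATCH: " + version);
        }
        return new int[] {
            Integer.parseInt(parts[0]), Integer.parseInt(parts[1]), Integer.parseInt(parts[2])
        };
    }

    /** Assumed constraint forms: ">=X", "<=X", or a bare "X" meaning exactly X. */
    static boolean satisfies(String version, String constraint) {
        String c = constraint.trim();
        if (c.startsWith(">=")) {
            return compare(version, c.substring(2).trim()) >= 0;
        }
        if (c.startsWith("<=")) {
            return compare(version, c.substring(2).trim()) <= 0;
        }
        return compare(version, c) == 0;
    }

    /** Returns the highest available version accepted by all constraints, if any. */
    static Optional<String> highestCompatible(List<String> available, List<String> constraints) {
        return available.stream()
                .filter(v -> constraints.stream().allMatch(c -> satisfies(v, c)))
                .max(VersionSelectionSketch::compare);
    }
}
```

An empty result corresponds to the "no compatible version exists" case, where the caller would report the conflicting constraints.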
### Checksum warnings
```
Warning: Module @nf-core/fastqc has local modifications
```
This means the local module content differs from the registry version. Your changes are preserved. Use `-force` only if you want to discard local changes.


@@ -0,0 +1,352 @@
# Research: Nextflow Module System Client
**Date**: 2026-01-19
**Feature**: 251117-module-system
## Overview
This document captures technical research and decisions for implementing the Nextflow module system client. All NEEDS CLARIFICATION items from Technical Context have been resolved through codebase exploration.
---
## 1. CLI Command Structure
**Research Question**: How should `nextflow module` CLI commands be implemented?
**Decision**: JCommander native subcommands — each subcommand extends `CmdBase` directly; no trait needed
**Rationale**:
- JCommander's subcommand support handles parameter parsing automatically per subcommand
- Each subcommand (install, run, list, remove, search, info, publish) is a separate class extending CmdBase
- `ModuleRun` extends `CmdRun` to reuse pipeline execution logic (PR #6381)
- No custom `ModuleSubCmd` trait needed; cleaner architecture
- `CmdModule` is registered in `Launcher` alongside all other top-level commands
**Implemented Pattern**:
```groovy
@Parameters(commandDescription = "Manage Nextflow modules")
class CmdModule extends CmdBase implements UsageAware {
    static final List<CmdBase> commands = []

    static {
        commands << new ModuleInstall()   // extends CmdBase
        commands << new ModuleRun()       // extends CmdRun
        commands << new ModuleList()      // extends CmdBase
        commands << new ModuleRemove()    // extends CmdBase
        commands << new ModuleSearch()    // extends CmdBase
        commands << new ModuleInfo()      // extends CmdBase
        commands << new ModulePublish()   // extends CmdBase
    }

    void run() {
        final jc = commander()  // JCommander with all subcommands registered
        jc.parse(args as String[])
        final subcommand = jc.getCommands().get(jc.getParsedCommand()).getObjects()[0]
        subcommand.run()
    }
}
```
**Alternatives Considered**:
- CmdFs trait pattern: Considered initially; replaced by JCommander native subcommands — simpler and avoids custom parsing
- Separate top-level Cmd classes (CmdModuleInstall, etc.): Rejected — too many entry points
- Plugin-based CLI extension: Rejected — module system is core functionality, not optional
---
## 2. DSL Parser Extension for @scope/name
**Research Question**: How to extend `include` statement parsing for registry modules?
**Decision**: Extend `ResolveIncludeVisitor` to detect `@` prefix and delegate to a `RemoteModuleResolver` SPI loaded via Java `ServiceLoader`
**Rationale**:
- Keeps `nf-lang` decoupled from runtime module resolution (`nf-lang` has no dependency on `nextflow` module)
- SPI pattern allows plugins or custom implementations to override the default resolver
- Detection: `source.startsWith('@')` distinguishes registry vs local paths — preserves existing include behavior
- Resolution at parse time (after plugin resolution) per ADR
**Implemented Architecture**:
```
include { X } from '@scope/name'
    ResolveIncludeVisitor.visitInclude() [nf-lang]
        source.startsWith("@") → RemoteModuleResolverProvider.getInstance().resolve(source, baseDir)
    RemoteModuleResolverProvider [nf-lang]
        Java ServiceLoader discovers implementations; picks highest priority
    DefaultRemoteModuleResolver [nextflow module]
        Calls ModuleResolver.installModule(reference, version, autoInstall=true)
        Returns Path to modules/@scope/name/main.nf
```
**Key Files**:
- `modules/nf-lang/src/main/java/nextflow/module/spi/RemoteModuleResolver.java` — SPI interface
- `modules/nf-lang/src/main/java/nextflow/module/spi/RemoteModuleResolverProvider.java` — ServiceLoader singleton
- `modules/nf-lang/src/main/java/nextflow/module/spi/FallbackRemoteModuleResolver.java` — error fallback
- `modules/nf-lang/src/main/java/nextflow/script/control/ResolveIncludeVisitor.java` — MODIFIED
- `modules/nextflow/src/main/groovy/nextflow/module/DefaultRemoteModuleResolver.groovy` — default impl
**Alternatives Considered**:
- New ANTLR grammar token for `@`: Rejected — unnecessary parser complexity
- Direct dependency from nf-lang to nextflow module: Rejected — circular dependency risk; SPI decouples cleanly
- Dot file marker for local modules: Deferred in ADR; current impl uses `@` for registry, `.`/`/` for local
---
## 3. Config Parsing for modules{} and registry{} Blocks
**Research Question**: How to add new config DSL blocks?
**Decision**: Create ModulesConfig and RegistryConfig classes implementing ConfigScope interface
**Rationale**:
- ConfigScope is an ExtensionPoint (pf4j) that ConfigBuilder automatically discovers
- Classes implementing ConfigScope and annotated with @ScopeName are automatically parsed
- No need to modify ConfigBuilder or create custom DSL parsers
- Pattern used throughout Nextflow: FusionConfig, CondaConfig, DockerConfig, etc.
- Provides type safety via @CompileStatic and validation via @ConfigOption
**Reference Implementation**:
```
Location: modules/nextflow/src/main/groovy/nextflow/fusion/FusionConfig.groovy
Pattern:
    @ScopeName("modules")
    @Description("Module version declarations")
    @CompileStatic
    class ModulesConfig implements ConfigScope {
        @ConfigOption
        @Description("Module version mappings")
        final Map<String, String> modules = [:]

        ModulesConfig() {}

        ModulesConfig(Map opts) {
            // Parse from config map
        }
    }
```
**ConfigScope Interface**:
```
Location: modules/nf-lang/src/main/java/nextflow/config/spec/ConfigScope.java
public interface ConfigScope extends ExtensionPoint {}
```
**RegistryConfig Pattern**:
```groovy
@ScopeName("registry")
@Description("Module registry configuration")
@CompileStatic
class RegistryConfig implements ConfigScope {
static final String DEFAULT_REGISTRY_URL = 'https://registry.nextflow.io/api'
@ConfigOption
final Collection<String> url // One or more URLs in priority order
@ConfigOption
final String apiKey // API key; falls back to NXF_REGISTRY_TOKEN env var
RegistryConfig() {
url = [DEFAULT_REGISTRY_URL]
apiKey = null
}
RegistryConfig(Map opts) {
url = opts.url ?: [DEFAULT_REGISTRY_URL]
apiKey = opts.apiKey as String
}
String getUrl() { url ? url[0] : DEFAULT_REGISTRY_URL }
Collection<String> getAllUrls() { url ?: [DEFAULT_REGISTRY_URL] }
String getApiKey() { apiKey ?: SysEnv.get('NXF_REGISTRY_TOKEN') }
}
```
**Integration Point**: ConfigBuilder automatically discovers and parses ConfigScope implementations via ExtensionPoint mechanism
**Alternatives Considered**:
- Custom DSL parsers (ModulesDsl/RegistryDsl): Rejected - unnecessary complexity, ConfigScope pattern handles this automatically
- JSON/YAML config file: Rejected - inconsistent with Nextflow config style
- Dedicated pipeline.yaml: Deferred per ADR Open Questions
---
## 4. Registry HTTP Communication
**Research Question**: How to communicate with module registry API?
**Decision**: Create HttpModuleRepository following HttpPluginRepository pattern
**Rationale**:
- HttpPluginRepository provides robust HTTP client with retry logic
- Uses HxClient from io.seqera.http (already a dependency)
- Handles authentication headers consistently
- Supports connection pooling and timeout configuration
**Reference Implementation**:
```
Location: modules/nf-commons/src/main/nextflow/plugin/HttpPluginRepository.groovy
Pattern:
class HttpModuleRepository {
private final URI url
private final HxClient httpClient
private final String authToken
ModuleInfo getModule(String name, String version)
List<ModuleInfo> search(String query, int limit)
Path download(String name, String version, Path target)
void publish(String name, Path bundle)
}
```
**API Endpoints** (from ADR):
```
GET /api/modules?query=<text> # Search
GET /api/modules/{name} # Get module + latest release
GET /api/modules/{name}/releases # List all releases
GET /api/modules/{name}/{version} # Get specific release
GET /api/modules/{name}/{version}/download # Download bundle
POST /api/modules/{name} # Publish (authenticated)
```
**Alternatives Considered**:
- Direct HttpClient usage: Rejected - loses retry, pooling benefits
- gRPC protocol: Rejected - registry already uses REST
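To illustrate the request shape for the download endpoint, here is a minimal sketch that uses the JDK's `java.net.http.HttpRequest` builder in place of HxClient. The class and method names are hypothetical, and the sketch assumes the base URL already ends in `/api` (as `DEFAULT_REGISTRY_URL` does). How scoped names are encoded in the path is not specified here, so the name is passed through unchanged.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper; the actual client is built on HxClient.
final class RegistryRequestSketch {

    // Build <base>/modules/<name>/<version>/download, tolerating a trailing slash.
    static URI downloadUri(String baseUrl, String name, String version) {
        String base = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
        return URI.create(base + "/modules/" + name + "/" + version + "/download");
    }

    // Build a GET request; the Bearer header is added only when a token is available.
    static HttpRequest downloadRequest(String baseUrl, String name, String version, String token) {
        HttpRequest.Builder builder =
                HttpRequest.newBuilder(downloadUri(baseUrl, name, version)).GET();
        if (token != null && !token.isEmpty()) {
            builder.header("Authorization", "Bearer " + token);
        }
        return builder.build();
    }
}
```

Unauthenticated requests simply omit the header, which matches the behavior described for a missing token.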
---
## 5. Authentication Patterns
**Research Question**: How to handle registry authentication?
**Decision**: Support `NXF_REGISTRY_TOKEN` env var + `registry.apiKey` config field
**Rationale**:
- Environment variable provides CI/CD compatibility
- `apiKey` config field allows explicit token configuration
- Authentication is only applied to the primary (first) registry URL
- Bearer token in Authorization header (standard HTTP auth)
**Implementation**:
```
RegistryConfig.getApiKey() returns:
1. registry.apiKey config value if set
2. NXF_REGISTRY_TOKEN environment variable as fallback
3. null if neither is set (unauthenticated requests)
```
**Config Syntax**:
```nextflow
registry {
apiKey = '${NXF_REGISTRY_TOKEN}'
}
```
**Alternatives Considered**:
- Per-registry token map (`auth {}` block): Was in initial design; simplified to single `apiKey` since only the primary registry uses authentication
- Secrets file (~/.nextflow/secrets.json): Possible future enhancement
- OAuth flow: Rejected for CLI — token-based simpler
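The precedence rules above can be sketched as two small functions. This is an illustrative sketch: `ApiKeySketch`, `resolveApiKey`, and `tokenFor` are hypothetical names, the environment is passed in as a map so the logic can be tested, and empty strings are treated as unset (an assumption).

```java
import java.util.List;
import java.util.Map;

// Hypothetical helper mirroring the documented precedence, not the real RegistryConfig.
final class ApiKeySketch {
    static final String TOKEN_ENV = "NXF_REGISTRY_TOKEN";

    // Precedence: explicit config value, then the environment variable, then null.
    static String resolveApiKey(String configured, Map<String, String> env) {
        if (configured != null && !configured.isEmpty()) {
            return configured;
        }
        String fromEnv = env.get(TOKEN_ENV);
        return (fromEnv != null && !fromEnv.isEmpty()) ? fromEnv : null;
    }

    // Authentication applies to the primary (first) registry URL only.
    static String tokenFor(String requestUrl, List<String> registryUrls, String apiKey) {
        boolean primary = !registryUrls.isEmpty() && registryUrls.get(0).equals(requestUrl);
        return primary ? apiKey : null;
    }
}
```

In real code the environment would come from `System.getenv()` or an equivalent wrapper; a `null` result means the request is sent without an `Authorization` header.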
---
## 6. Checksum Verification
**Research Question**: How to implement module integrity verification?
**Decision**: SHA-256 checksum stored in `.checksum` file, verified on every run
**Rationale**:
- SHA-256 is industry standard, already used for plugin verification
- `.checksum` file stores registry-provided checksum (from X-Checksum header)
- Local checksum computed on-demand and compared
- Mismatch indicates local modification (warn, don't overwrite)
**Implementation Pattern**:
```groovy
class ModuleChecksum {
static final String ALGORITHM = 'SHA-256'
static String compute(Path moduleDir) {
// Hash all files in module directory
// Exclude .checksum itself
// Return hex-encoded SHA-256
}
static boolean verify(Path moduleDir) {
def expected = moduleDir.resolve('.checksum').text.trim()
def actual = compute(moduleDir)
return expected == actual
}
static void save(Path moduleDir, String checksum) {
moduleDir.resolve('.checksum').text = checksum
}
}
```
**Checksum Scope**: Covers all files in module directory (main.nf, meta.yaml, README.md, etc.)
**Alternatives Considered**:
- Per-file checksums: Rejected - adds complexity, single checksum sufficient
- MD5: Rejected - SHA-256 more secure
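One way the directory hash could be computed is sketched below. The hashing scheme is an assumption: files are visited in sorted order and each file contributes its relative path and contents, with `.checksum` excluded. The scheme the registry uses to produce the `X-Checksum` value is not described here and may differ. `DirectoryChecksumSketch` is a hypothetical name.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch; not the actual ModuleChecksum implementation.
final class DirectoryChecksumSketch {

    // Lower-case hex encoding of a byte array.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    static MessageDigest sha256() {
        try {
            return MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is mandatory on every JDK
        }
    }

    // Assumed scheme: hash relative path + contents of every regular file,
    // in sorted order, skipping the .checksum file itself.
    static String compute(Path moduleDir) {
        MessageDigest digest = sha256();
        try (Stream<Path> walk = Files.walk(moduleDir)) {
            List<Path> files = walk
                    .filter(Files::isRegularFile)
                    .filter(p -> !p.getFileName().toString().equals(".checksum"))
                    .sorted()
                    .collect(Collectors.toList());
            for (Path file : files) {
                String relative = moduleDir.relativize(file).toString().replace('\\', '/');
                digest.update(relative.getBytes(StandardCharsets.UTF_8));
                digest.update(Files.readAllBytes(file));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return hex(digest.digest());
    }
}
```

Sorting the file list keeps the result stable across runs on the same platform, and excluding `.checksum` means saving the checksum does not change the next computed value.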
---
## 7. Version Constraint Syntax
**Research Question**: What version constraint syntax to use for module dependencies?
**Decision**: Reuse existing Nextflow plugin version constraint syntax
**Rationale**:
- Already implemented and tested in plugin system
- Users familiar with existing `nextflowVersion` syntax
- Supports ranges, comparisons, exact versions
- No new parser code needed
**Supported Syntax**:
| Notation | Meaning | Example |
|----------|---------|---------|
| `1.2.3` | Exact version | `@nf-core/fastqc@1.0.0` |
| `>=1.2.3` | Greater or equal | `@nf-core/fastqc@>=1.0.0` |
| `<=1.2.3` | Less or equal | `@nf-core/fastqc@<=2.0.0` |
| `>=1.2.0,<2.0.0` | Range | `@nf-core/samtools@>=1.0.0,<2.0.0` |
**Reference**: Version parsing code exists in plugin system; reuse VersionNumber class
**Alternatives Considered**:
- NPM-style `^` and `~`: Rejected - inconsistent with existing Nextflow patterns
- Always latest: Rejected - breaks reproducibility
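The semantics of the constraint notations in the table can be illustrated with a small evaluator. This is a sketch only and does not reproduce the `VersionNumber` class: `VersionConstraintSketch` is a hypothetical name, it handles purely numeric dotted versions, and it supports just the operators shown above (`>=`, `<=`, `<`, and a bare version for an exact match).

```java
import java.util.Arrays;

// Hypothetical evaluator for illustration; the plugin system's VersionNumber is the real parser.
final class VersionConstraintSketch {

    // Compare dotted numeric versions such as 1.2.3; missing parts count as 0.
    static int compare(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        for (int i = 0; i < Math.max(x.length, y.length); i++) {
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    // Evaluate one clause: >=, <=, < or a bare version meaning an exact match.
    static boolean matchesClause(String version, String clause) {
        String c = clause.trim();
        if (c.startsWith(">=")) return compare(version, c.substring(2).trim()) >= 0;
        if (c.startsWith("<=")) return compare(version, c.substring(2).trim()) <= 0;
        if (c.startsWith("<"))  return compare(version, c.substring(1).trim()) < 0;
        return compare(version, c) == 0;
    }

    // A comma-separated constraint is satisfied only when every clause matches.
    static boolean matches(String version, String constraint) {
        return Arrays.stream(constraint.split(","))
                .allMatch(clause -> matchesClause(version, clause));
    }
}
```

Under these rules a range such as `>=1.0.0,<2.0.0` is a conjunction: version `1.5.0` satisfies both clauses, while `2.0.0` fails the upper bound.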
---
## 8. Tool Arguments Implementation
> **⚠️ REMOVED FROM ADR** — The tool arguments feature (`tools.<name>.args` in meta.yaml and process config) was removed from the module system ADR. It is not implemented and not planned in the current scope. The `meta.yaml` format used in the actual implementation (`ModuleSpec`) does not include tool/argument definitions.
---
## Summary of Key Decisions
| Area | Decision | Key Reference |
|------|----------|---------------|
| CLI | JCommander subcommands; each extends CmdBase (ModuleRun extends CmdRun) | CmdModule.groovy |
| DSL Parser | SPI pattern — ResolveIncludeVisitor delegates to RemoteModuleResolver; DefaultRemoteModuleResolver bridges to ModuleResolver | ResolveIncludeVisitor.java, RemoteModuleResolver.java |
| Config | ModulesConfig + RegistryConfig (ConfigScope) | FusionConfig.groovy, ConfigScope.java |
| Registry HTTP | ModuleRegistryClient using HxClient + npr-api models | HttpPluginRepository.groovy |
| Authentication | `NXF_REGISTRY_TOKEN` env var or `registry.apiKey` config field (primary registry only) | RegistryConfig.groovy |
| Checksums | SHA-256/SHA-512, `.checksum` file, download integrity via X-Checksum header | ModuleChecksum.groovy |
| Version Storage | `nextflow_spec.json` (auto-managed); `modules {}` in nextflow.config (manual alternative) | PipelineSpec.groovy |
| Version Syntax | Plugin-compatible constraints | VersionNumber class |
| Tool Args | ~~Implicit variable, parse-time validation~~ **Removed from ADR** | N/A |
---
## Open Items (Deferred)
1. **Local vs managed module distinction**: Resolved — `@` prefix for registry modules only; local paths start with `.` or `/`
2. **Tool arguments**: Removed from ADR — not in scope
3. **Module version location**: Resolved — `nextflow_spec.json` (auto-managed by `module install`); `modules {}` block in `nextflow.config` supported as alternative
4. **DSL parser `@scope/name` include**: ✅ Resolved — SPI pattern implemented (T017a-d)

@@ -0,0 +1,261 @@
# Feature Specification: Nextflow Module System Client
**Feature Branch**: `251117-module-system`
**Created**: 2026-01-15
**Status**: Draft
**Input**: User description: "Implement Nextflow module system client based on ADR 20251114-module-system.md. Focus on client-side implementation only - CLI commands, DSL parser extensions, dependency resolution, and local storage. Registry backend is assumed to be already implemented."
## Overview
This specification covers the **Nextflow client-side implementation** of the module system, enabling pipeline developers to:
- Include remote modules from the Nextflow registry using `@scope/name` syntax
- Manage module versions through `nextflow_spec.json` (auto-managed) or the `modules {}` block in `nextflow.config`
- Use CLI commands to install, search, list, remove, publish, and run modules
- Configure module parameters through structured `meta.yaml` definitions
**Out of Scope**: Registry backend implementation (assumed already available at `registry.nextflow.io`)
## User Scenarios & Testing
### User Story 1 - Install and Use Registry Module (Priority: P1)
A pipeline developer wants to use a pre-built module from the Nextflow registry in their workflow without manually downloading or managing module files.
**Why this priority**: This is the core value proposition - enabling code reuse from the ecosystem. Without this, the module system provides no benefit.
**Independent Test**: Can be fully tested by running `nextflow module install nf-core/fastqc` and then executing a workflow that includes the module. Delivers immediate value by enabling module consumption.
**Acceptance Scenarios**:
1. **Given** a new Nextflow project with no modules installed, **When** user runs `nextflow module install nf-core/fastqc`, **Then** the module is downloaded to `modules/@nf-core/fastqc/`, a `.checksum` file is created, and `nextflow_spec.json` is updated with the version
2. **Given** a workflow file with `include { FASTQC } from '@nf-core/fastqc'`, **When** user runs `nextflow run main.nf`, **Then** Nextflow resolves the module from local storage and executes the process
3. **Given** a module version declared in `nextflow.config`, **When** user includes the module, **Then** the declared version is used (not latest)
---
### User Story 2 - Run Module Directly (Priority: P1)
A user wants to run a module directly from the command line without writing a wrapper workflow.
**Why this priority**: Enables immediate productivity - users can test and execute modules without boilerplate code, essential for AI agents and quick experimentation.
**Independent Test**: Can be tested by running `nextflow module run nf-core/fastqc --input 'data/*.fq'` and verifying the process executes.
**Acceptance Scenarios**:
1. **Given** a module is available (locally or in registry), **When** user runs `nextflow module run nf-core/fastqc --input 'data/*.fastq'`, **Then** the module is executed with the provided inputs mapped to process parameters
2. **Given** a module with parameters defined in `meta.yaml`, **When** user runs `nextflow module run nf-core/bwa-align --batch_size 100000`, **Then** the parameter is validated and passed to the process
3. **Given** a module is not installed locally, **When** user runs `nextflow module run nf-core/salmon`, **Then** the module is automatically downloaded before execution
---
### User Story 3 - Module Parameters (Priority: P1)
A module author wants to define typed, documented parameters that provide a clear interface for module customization.
**Why this priority**: Critical for module usability - provides type-safe, documented parameters that enable IDE autocompletion and validation, replacing the opaque `ext.args` pattern.
**Independent Test**: Can be tested by configuring `params.batch_size = 100000` in config and verifying the parameter is applied in the script.
**Acceptance Scenarios**:
1. **Given** a module with `params` defined in `meta.yaml`, **When** user configures `params.batch_size = 100000` in config, **Then** the parameter is accessible in scripts via `params.batch_size`
2. **Given** a parameter with type validation, **When** user provides an invalid value type, **Then** a validation error is displayed
3. **Given** a module with documented parameters, **When** user runs `nextflow module run --help`, **Then** available parameters with descriptions are listed
---
### User Story 4 - Module Version Management (Priority: P2)
A pipeline developer wants to pin and manage module versions to ensure reproducible workflow executions.
**Why this priority**: Reproducibility is important for scientific workflows - version pinning ensures consistent results.
**Independent Test**: Can be tested by modifying `nextflow.config` module versions and verifying the correct version is used on workflow run.
**Acceptance Scenarios**:
1. **Given** a module is installed at version 1.0.0, **When** user changes `nextflow_spec.json` to specify version 1.1.0 and runs the workflow, **Then** version 1.1.0 is automatically downloaded and replaces the local copy
2. **Given** modules installed locally, **When** user runs `nextflow module list`, **Then** configured version, installed version, latest available version, and status are displayed for each module
---
### User Story 5 - Module Integrity Protection (Priority: P2)
A pipeline developer who has locally modified a module (for debugging or customization) wants to be protected from accidentally losing those changes.
**Why this priority**: Protects user work - important for developer experience but not blocking core functionality.
**Independent Test**: Can be tested by modifying a module's `main.nf` locally, then attempting to install a different version and verifying the warning appears.
**Acceptance Scenarios**:
1. **Given** a locally modified module (checksum mismatch with `.checksum`), **When** user tries to install a different version, **Then** Nextflow warns about local modifications and does NOT overwrite the local copy
2. **Given** a locally modified module, **When** user runs `nextflow module install -force`, **Then** the local module is replaced with the registry version
3. **Given** a locally modified module, **When** user runs the workflow, **Then** a warning is displayed about checksum mismatch but execution continues
---
### User Story 6 - Remove Module (Priority: P3)
A pipeline developer wants to remove a module they no longer need.
**Why this priority**: Housekeeping feature - useful but not blocking core workflows.
**Independent Test**: Can be tested by running `nextflow module remove nf-core/fastqc` and verifying files are deleted and config is updated.
**Acceptance Scenarios**:
1. **Given** a module is installed, **When** user runs `nextflow module remove nf-core/fastqc`, **Then** the module directory is deleted and the entry is removed from `nextflow_spec.json`
2. **Given** a module is referenced in workflow files, **When** user runs `nextflow module remove`, **Then** a warning is displayed about the reference but removal proceeds
---
### User Story 7 - Search and Discover Modules (Priority: P3)
A pipeline developer wants to find available modules in the registry that match their analysis needs.
**Why this priority**: Discovery feature - useful but users can find modules through documentation or registry web UI.
**Independent Test**: Can be tested by running `nextflow module search bwa` and verifying results are displayed with name, version, and description.
**Acceptance Scenarios**:
1. **Given** modules exist in the registry, **When** user runs `nextflow module search alignment`, **Then** matching modules are displayed with name, latest version, description, and download count
2. **Given** user wants JSON output for scripting, **When** user runs `nextflow module search fastqc -json`, **Then** results are returned in parseable JSON format
3. **Given** many results exist, **When** user runs `nextflow module search quality -limit 5`, **Then** only 5 results are returned
---
### User Story 8 - Publish Module to Registry (Priority: P3)
A module author wants to publish their module to the Nextflow registry for others to use.
**Why this priority**: Ecosystem contribution feature - important for growth but users can consume modules without publishing capability.
**Independent Test**: Can be tested by creating a valid module structure and running `nextflow module publish -dry-run` to validate.
**Acceptance Scenarios**:
1. **Given** a valid module with `main.nf`, `meta.yaml`, and `README.md`, **When** user runs `nextflow module publish myorg/my-module`, **Then** the module is uploaded to the registry and becomes available for installation
2. **Given** an invalid module (missing required fields), **When** user runs `nextflow module publish`, **Then** validation errors are displayed listing the missing requirements
3. **Given** no authentication configured, **When** user runs `nextflow module publish`, **Then** a clear error message indicates authentication is required
---
### Edge Cases
- What happens when the registry is unreachable during module resolution?
  - Nextflow uses locally cached modules if available, otherwise fails with a clear network error
- How does the system handle circular module dependencies?
  - Dependency resolver detects cycles and fails with an error listing the cycle
- What happens when two modules require incompatible versions of the same dependency?
  - System automatically selects the highest compatible version; if no compatible version exists, fails with error listing conflicting requirements
- How are modules resolved when multiple registries are configured?
  - Registries are tried in order; first match wins
- What happens when `meta.yaml` is missing from a module?
  - Module is treated as having no dependencies; basic functionality works
- What happens when local module directory is corrupted or incomplete?
  - Checksum mismatch triggers warning; `-force` allows re-download
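The circular-dependency case can be illustrated with a depth-first search that reports the offending path. This is a minimal sketch with hypothetical names (`DependencyCycleSketch`, `findCycle`); it assumes dependencies are available as a map from module name to the names it depends on, which is not something this specification prescribes.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of cycle detection; the real resolver may work differently.
final class DependencyCycleSketch {

    // Returns the first cycle found as a path such as [a, b, a], or an empty list if none.
    static List<String> findCycle(Map<String, List<String>> deps) {
        Set<String> done = new HashSet<>();
        for (String start : deps.keySet()) {
            List<String> cycle = visit(start, deps, new ArrayList<>(), done);
            if (!cycle.isEmpty()) {
                return cycle;
            }
        }
        return List.of();
    }

    private static List<String> visit(String node, Map<String, List<String>> deps,
                                      List<String> path, Set<String> done) {
        int seenAt = path.indexOf(node);
        if (seenAt >= 0) {
            // The node is already on the current path: close the loop and report it.
            List<String> cycle = new ArrayList<>(path.subList(seenAt, path.size()));
            cycle.add(node);
            return cycle;
        }
        if (done.contains(node)) {
            return List.of();   // fully explored earlier, no cycle through this node
        }
        path.add(node);
        for (String dep : deps.getOrDefault(node, List.of())) {
            List<String> cycle = visit(dep, deps, path, done);
            if (!cycle.isEmpty()) {
                return cycle;
            }
        }
        path.remove(path.size() - 1);
        done.add(node);
        return List.of();
    }
}
```

Returning the path rather than a boolean lets the error message list the cycle, as the edge case requires.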
## Requirements
### Functional Requirements
#### DSL Parser Extension
- **FR-001**: System MUST recognize `@scope/name` syntax in `include` statements as registry module references
- **FR-002**: System MUST distinguish between local file paths (starting with `.` or `/`) and registry modules (starting with `@`)
- **FR-003**: System MUST resolve module versions from `nextflow_spec.json` before downloading
- **FR-004**: System MUST parse and validate `meta.yaml` files for module metadata and dependencies
#### Module Resolution
- **FR-005**: System MUST resolve modules at workflow parse time (after plugin resolution)
- **FR-006**: System MUST check local `modules/@scope/name/` directory before querying registry
- **FR-007**: System MUST verify module integrity using `.checksum` file on every run
- **FR-008**: System MUST download modules from registry when not present locally or when version differs
- **FR-009**: System MUST NOT overwrite locally modified modules (checksum mismatch) unless `-force` is used
- **FR-010**: System MUST resolve version conflicts by selecting the highest compatible version; if no compatible version exists, MUST fail with error listing conflicting requirements
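As an illustration of how FR-006 through FR-009 interact, the resolution outcome can be expressed as a small decision function. This is a sketch under stated assumptions: the `Action` names and the `decide` signature are hypothetical, and the four boolean inputs stand in for checks the real resolver would perform.

```java
// Hypothetical decision table for illustration only.
final class ResolutionDecisionSketch {

    // Hypothetical outcome names; the real resolver may model this differently.
    enum Action { USE_LOCAL, DOWNLOAD, KEEP_MODIFIED_AND_WARN }

    static Action decide(boolean installed, boolean versionMatches,
                         boolean checksumMatches, boolean force) {
        if (!installed) {
            return Action.DOWNLOAD;                      // FR-008: not present locally
        }
        if (!checksumMatches) {
            // FR-009: local edits are preserved unless -force is given
            return force ? Action.DOWNLOAD : Action.KEEP_MODIFIED_AND_WARN;
        }
        if (!versionMatches) {
            return Action.DOWNLOAD;                      // FR-008: version differs
        }
        return Action.USE_LOCAL;                         // FR-006: local copy is valid
    }
}
```

The checksum test comes before the version test so that a locally modified module is never replaced as a side effect of a version change.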
#### Local Storage
- **FR-011**: System MUST store modules in `modules/@scope/name/` directory structure (single version per module)
- **FR-012**: System MUST create `.checksum` file from registry's X-Checksum header on download
- **FR-013**: System MUST store module's `main.nf`, `meta.yaml`, and supporting files in the module directory
#### CLI Commands
- **FR-014**: System MUST provide `nextflow module install [scope/name]` command to download modules
- **FR-015**: System MUST provide `nextflow module search <query>` command to search the registry
- **FR-016**: System MUST provide `nextflow module list` command to show installed vs configured modules
- **FR-017**: System MUST provide `nextflow module remove scope/name` command to delete modules
- **FR-018**: System MUST provide `nextflow module publish scope/name` command to upload modules to registry
- **FR-019**: System MUST provide `nextflow module run scope/name` command to execute modules directly
- **FR-019b**: System MUST provide `nextflow module info scope/name` command to display module metadata and a usage template
#### Configuration
- **FR-020**: System MUST persist module versions in `nextflow_spec.json`; MUST also read versions from `modules {}` block in `nextflow.config` as an alternative
- **FR-021**: System MUST support `registry {}` block with `url` and `apiKey` fields for configuring registry URL and authentication
- **FR-022**: System MUST support `NXF_REGISTRY_TOKEN` environment variable as fallback for `registry.apiKey`
- **FR-023**: System MUST support multiple registry URLs with fallback ordering
#### Module Parameters
- **FR-024**: System MUST parse module parameters from `params` section in `meta.yaml`
- **FR-025**: System MUST validate module parameters against `meta.yaml` schema (type) at workflow parse time
- **FR-026**: System MUST support boolean, integer, float, string, file, and path parameter types
- **FR-027**: System MUST make module parameters accessible via standard `params` variable in scripts
#### Registry Communication
- **FR-028**: System MUST communicate with registry via documented Module API endpoints
- **FR-029**: System MUST handle authentication using Bearer token in Authorization header
- **FR-030**: System MUST verify SHA-256 checksum on module download
### Key Entities
- **Module**: A reusable Nextflow process definition with `main.nf` entry point, optional `meta.yaml` manifest, and README documentation
- **Module Reference**: A scoped identifier (`@scope/name`) pointing to a registry module
- **Module Manifest (meta.yaml)**: YAML file containing module metadata, version, dependencies, and parameter definitions
- **Module Parameter**: A configurable parameter defined in `meta.yaml` with name, optional type, description, and example
- **Checksum File (.checksum)**: Local cache of registry checksum for integrity verification
- **Registry Configuration**: Settings for registry URL, authentication, and fallback ordering
## Success Criteria
### Measurable Outcomes
- **SC-001**: Pipeline developers can install and use a registry module within 5 minutes of starting a new project
- **SC-002**: Module resolution adds less than 2 seconds to workflow startup time when modules are cached locally
- **SC-003**: Users can successfully search, install, and run any module from the registry without reading documentation
- **SC-004**: 100% of module version changes in `nextflow.config` result in automatic module updates without manual intervention
- **SC-005**: Users receive clear, actionable error messages for all failure scenarios (network, validation, authentication)
- **SC-006**: Module authors can publish a new module version within 3 minutes using the CLI
- **SC-007**: Locally modified modules are never accidentally overwritten during normal operations
## Assumptions
- Registry backend is fully implemented and available at `registry.nextflow.io` with the Module API as documented in the ADR
- Existing plugin authentication system can be reused for module registry authentication
- Module bundle size limit of 1MB (uncompressed) is enforced by the registry
- Network connectivity is available for initial module downloads; offline operation uses local cache only
- The `modules/` directory is intended to be committed to the pipeline's git repository
- Version constraints in `meta.yaml` follow the same syntax as existing Nextflow plugin version constraints
- SHA-256 is used for all checksum operations
- Module parameters use standard `--<param_name>` CLI syntax
## Dependencies
- Registry backend API (Module API endpoints as specified in ADR)
- Existing Nextflow plugin system (for authentication reuse)
- Existing DSL parser infrastructure (for `include` statement extension)
- Existing config parser (for `modules {}` and `registry {}` blocks)
## Clarifications
### Session 2026-01-19
- Q: What should happen when incompatible dependency versions are detected? → A: Use the highest compatible version automatically; fail with an error listing the conflicting requirements if none exists
- Q: When should module parameter validation occur? → A: At workflow parse time (early, before any execution)

@@ -0,0 +1,35 @@
# Specification Quality Checklist: Fusion GPU Metrics Collection
**Purpose**: Validate specification completeness and quality before proceeding to planning
**Created**: 2026-04-10
**Feature**: [spec.md](../spec.md)
## Content Quality
- [x] No implementation details (languages, frameworks, APIs)
- [x] Focused on user value and business needs
- [x] Written for non-technical stakeholders
- [x] All mandatory sections completed
## Requirement Completeness
- [x] No [NEEDS CLARIFICATION] markers remain
- [x] Requirements are testable and unambiguous
- [x] Success criteria are measurable
- [x] Success criteria are technology-agnostic (no implementation details)
- [x] All acceptance scenarios are defined
- [x] Edge cases are identified
- [x] Scope is clearly bounded
- [x] Dependencies and assumptions identified
## Feature Readiness
- [x] All functional requirements have clear acceptance criteria
- [x] User scenarios cover primary flows
- [x] Feature meets measurable outcomes defined in Success Criteria
- [x] No implementation details leak into specification
## Notes
- All items pass. Spec references internal Nextflow concepts (TraceRecord, TowerObserver) by necessity since this is an internal infrastructure feature, but avoids prescribing implementation approach.
- The `resourceAllocation` pattern reference in FR-003 is a design constraint from the user, not an implementation detail leak.

@@ -0,0 +1,179 @@
# Implementation Plan: Fusion GPU Metrics Collection
**Branch**: `260410-fusion-gpu-metrics-v2` | **Date**: 2026-04-10 | **Spec**: [spec.md](spec.md)
**Input**: Feature specification from `/specs/260410-fusion-gpu-metrics/spec.md`
## Summary
Collect GPU metrics from Fusion's `.fusion/trace.json` file on task completion and send them to Seqera Platform. The GPU block is carried as a transient `Map<String,Object>` field on `TraceRecord` (same pattern as `resourceAllocation`) and included in the task payload via `TowerObserver.makeTaskMap0()`.
## Technical Context
**Language/Version**: Groovy 4.0.29 / Java 17 target (Java 21 toolchain)
**Primary Dependencies**: Nextflow core (`modules/nextflow`), nf-tower plugin (`plugins/nf-tower`)
**Storage**: N/A (read-only file access to `.fusion/trace.json`)
**Testing**: Spock Framework (unit tests in both modules)
**Target Platform**: All Fusion-enabled executors (AWS Batch, Google Batch, Azure Batch, K8s, Seqera, SLURM)
**Project Type**: Multi-module Gradle project
**Performance Goals**: Negligible overhead — one small JSON file read per task completion
**Constraints**: Must not break existing trace pipeline; must be forward-compatible with evolving GPU block schema
**Scale/Scope**: 5 files modified (3 production, 2 test), ~80 lines of production code, ~120 lines of test code
## Constitution Check
*GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.*
| Principle | Status | Notes |
|-----------|--------|-------|
| I. Modular Architecture | PASS | Core trace logic in `modules/nextflow`, Platform integration in `plugins/nf-tower` — correct placement |
| II. Test-Driven Quality | PASS | Unit tests planned for both TraceRecord and TowerClient |
| III. Dataflow Programming | N/A | No changes to dataflow model |
| IV. Apache 2.0 License | PASS | All modified files already have headers |
| V. DCO Sign-off | PASS | Will use `git commit -s` |
| VI. Semantic Versioning | PASS | No version bump needed — feature addition within existing release cycle |
| VII. Groovy Idioms | PASS | Uses JsonSlurper, follows existing getter/setter patterns |
## Project Structure
### Files to Modify
```text
modules/nextflow/
├── src/main/groovy/nextflow/trace/TraceRecord.groovy # Add transient field + parsing method
├── src/main/groovy/nextflow/processor/TaskHandler.groovy # Read .fusion/trace.json on completion
└── src/test/groovy/nextflow/trace/TraceRecordTest.groovy # Test transient field + parsing
plugins/nf-tower/
├── src/main/io/seqera/tower/plugin/TowerObserver.groovy # Include gpuMetrics in task map
└── src/test/io/seqera/tower/plugin/TowerClientTest.groovy # Test gpuMetrics in task map
```
## Implementation Tasks
### Task 1: Add transient `gpuMetrics` field to TraceRecord
**File**: `modules/nextflow/src/main/groovy/nextflow/trace/TraceRecord.groovy`
**Changes**:
1. Add field after `resourceAllocation` (line 128):
```groovy
transient private Map<String,Object> gpuMetrics
```
2. Add getter/setter after existing `resourceAllocation` getter/setter (after line 649):
```groovy
Map<String,Object> getGpuMetrics() {
return gpuMetrics
}
void setGpuMetrics(Map<String,Object> value) {
this.gpuMetrics = value
}
```
### Task 2: Add Fusion trace file parsing method to TraceRecord
**File**: `modules/nextflow/src/main/groovy/nextflow/trace/TraceRecord.groovy`
**Changes**:
Add a static method to parse `.fusion/trace.json` and extract the `gpu` block:
```groovy
static Map<String,Object> parseFusionTraceFile(Path file) {
final text = file.text
final json = (Map) new JsonSlurper().parseText(text)
return (Map<String,Object>) json.get('gpu')
}
```
This keeps parsing logic on TraceRecord (consistent with `parseTraceFile()` for `.command.trace`).
### Task 3: Read `.fusion/trace.json` in TaskHandler.getTraceRecord()
**File**: `modules/nextflow/src/main/groovy/nextflow/processor/TaskHandler.groovy`
**Changes**:
After the existing `.command.trace` parsing block (after line 253), add:
```groovy
// collect Fusion GPU metrics
if( task.processor.executor.isFusionEnabled() ) {
final fusionTrace = task.workDir?.resolve('.fusion/trace.json')
try {
if( fusionTrace ) {
final gpu = TraceRecord.parseFusionTraceFile(fusionTrace)
if( gpu )
record.gpuMetrics = gpu
}
}
catch( NoSuchFileException e ) {
// ignore - Fusion trace may not exist
}
catch( Exception e ) {
log.debug "[WARN] Cannot read Fusion trace file: $fusionTrace -- Cause: ${e.message}"
}
}
```
**Key design decisions**:
- Gated by `task.processor.executor.isFusionEnabled()` — no file access when Fusion is not enabled (FR-007)
- Placed inside `isCompleted()` block but NOT gated by task status — runs for both success and failure (FR-005)
- Same error handling pattern as `.command.trace` parsing above it (FR-006)
### Task 4: Include `gpuMetrics` in TowerObserver task payload
**File**: `plugins/nf-tower/src/main/io/seqera/tower/plugin/TowerObserver.groovy`
**Changes**:
In `makeTaskMap0()` method, add after `record.resourceAllocation = trace.getResourceAllocation()` (after line 476):
```groovy
record.gpuMetrics = trace.getGpuMetrics()
```
### Task 5: Unit tests for TraceRecord
**File**: `modules/nextflow/src/test/groovy/nextflow/trace/TraceRecordTest.groovy`
**Tests to add**:
1. **Transient field serialization test** (follows `numSpotInterruptions` pattern):
- Set `gpuMetrics` on a TraceRecord
- Serialize and deserialize
- Verify deserialized record has `null` for `gpuMetrics`
2. **parseFusionTraceFile with GPU block**:
- Create a temp file with valid trace.json content including a `gpu` block
- Verify the returned map contains all GPU fields with correct values
3. **parseFusionTraceFile without GPU block**:
- Create a temp file with valid trace.json content without a `gpu` key
- Verify `null` is returned
4. **parseFusionTraceFile with malformed JSON**:
- Create a temp file with invalid JSON
- Verify an exception is thrown (caller handles it)
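The behavior that the first test relies on can be shown with plain Java serialization. This is a sketch, not the Spock test itself: `TraceRecordSketch` is a hypothetical stand-in for `TraceRecord`, the map contents are made-up placeholder values, and it assumes the round trip uses standard Java serialization.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Map;

// Hypothetical stand-in for TraceRecord with one persisted and one transient field.
final class TraceRecordSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    String name;                                 // written to the stream
    transient Map<String, Object> gpuMetrics;    // skipped by Java serialization

    // Serialize to bytes and read back, wrapping checked exceptions for brevity.
    static TraceRecordSketch roundTrip(TraceRecordSketch record) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(record);
            }
            try (ObjectInputStream in =
                         new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
                return (TraceRecordSketch) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

After `roundTrip`, the regular field keeps its value while the transient field comes back as `null`, which is the condition the serialization test verifies.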
### Task 6: Unit tests for TowerClient/TowerObserver
**File**: `plugins/nf-tower/src/test/io/seqera/tower/plugin/TowerClientTest.groovy`
**Test to add** (follows `resourceAllocation` test at lines 684-711):
- Create a TraceRecord with `gpuMetrics` set to a GPU metrics map
- Call `makeTasksReq([trace])`
- Verify `req.tasks[0].gpuMetrics` contains the GPU data
## Implementation Order
1. **Task 1 + Task 2** (TraceRecord changes) — no dependencies
2. **Task 3** (TaskHandler) — depends on Task 1+2
3. **Task 4** (TowerObserver) — depends on Task 1
4. **Task 5** (TraceRecord tests) — depends on Task 1+2
5. **Task 6** (TowerClient tests) — depends on Task 4
Once Task 1 is done, the core chain (Task 2, then Tasks 3 and 5) can proceed in parallel with the Tower chain (Task 4, then Task 6).
## Verification
After implementation, run:
```bash
./gradlew :nextflow:test --tests "TraceRecordTest"
./gradlew :plugins:nf-tower:test --tests "TowerClientTest"
make smoke # verify no regressions
```

@@ -0,0 +1,49 @@
# Research: Fusion GPU Metrics Collection
## R1: How to detect Fusion at trace collection time
**Decision**: Use `task.processor.executor.isFusionEnabled()` in `TaskHandler.getTraceRecord()`.
**Rationale**: TaskHandler already accesses the executor at line 222 (`task.processor.executor.getName()`), so this is a proven access path. The base `Executor.isFusionEnabled()` returns `false` by default, and Fusion-capable executors override it via `FusionHelper.isFusionEnabled(session)`. This works for all handler subclasses without requiring `instanceof` checks.
**Alternatives considered**:
- Checking `this instanceof FusionAwareTask`: Would miss custom executors that support Fusion but don't implement the trait. Also, `FusionAwareTask` is a trait on handler subclasses, not on the base `TaskHandler` where `getTraceRecord()` lives.
- Adding a Fusion flag to TaskRun/TaskConfig: Unnecessary complexity — Fusion is an executor-level property, not a per-task property.
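The override-based detection described in R1 can be sketched as follows. This is a minimal illustration only: `SketchExecutor`, `SketchFusionExecutor`, `shouldReadFusionTrace`, and the `fusion.enabled` config lookup are hypothetical stand-ins, not the actual Nextflow classes or the `FusionHelper` logic.

```groovy
// Hypothetical stand-in for the base executor: Fusion is off by default.
class SketchExecutor {
    boolean isFusionEnabled() { return false }
}

// Hypothetical Fusion-capable executor: overrides the default.
// The config lookup is an assumption made for this sketch.
class SketchFusionExecutor extends SketchExecutor {
    Map config = [:]

    @Override
    boolean isFusionEnabled() { return config.fusion?.enabled == true }
}

// The caller only asks the executor; no instanceof checks are needed.
boolean shouldReadFusionTrace(SketchExecutor executor) {
    return executor.isFusionEnabled()
}

assert !shouldReadFusionTrace(new SketchExecutor())
assert !shouldReadFusionTrace(new SketchFusionExecutor())
assert shouldReadFusionTrace(new SketchFusionExecutor(config: [fusion: [enabled: true]]))
```

Because the decision is a virtual method call, any custom executor that overrides the method is covered without changes to the caller.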
## R2: Where to read `.fusion/trace.json`
**Decision**: Read it in `TaskHandler.getTraceRecord()`, right after the existing `.command.trace` parsing block (lines 244-253), gated by `task.processor.executor.isFusionEnabled()`.
**Rationale**: This is the single place where all task trace data is assembled, regardless of executor type. The existing `.command.trace` parsing already demonstrates the pattern: resolve a file in the work dir, parse it, handle `NoSuchFileException` and `IOException` gracefully.
**Alternatives considered**:
- Reading in each TaskHandler subclass: Would require changes across 7 handler subclasses in both core and plugins. Much higher blast radius.
- Reading in `TowerObserver`: Would couple Platform-specific code with file I/O. The observer should only transform data, not collect it.
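The "resolve, parse, tolerate failure" pattern from R2 can be sketched as a standalone script. The helper name `readFusionMetrics`, its signature, and the parser closure are assumptions made for illustration; the real code lives inside `TaskHandler.getTraceRecord()` and logs through the class logger rather than `System.err`.

```groovy
import java.nio.file.Files
import java.nio.file.NoSuchFileException
import java.nio.file.Path

// Hypothetical helper: name and signature are assumptions, not Nextflow API.
Map<String,Object> readFusionMetrics(Path workDir, boolean fusionEnabled, Closure<Map<String,Object>> parser) {
    if( !fusionEnabled )
        return null                       // no file access when Fusion is off
    final file = workDir.resolve('.fusion').resolve('trace.json')
    try {
        return parser.call(new String(Files.readAllBytes(file), 'UTF-8'))
    }
    catch( NoSuchFileException e ) {
        return null                       // the file is optional
    }
    catch( Exception e ) {
        System.err.println("Cannot read Fusion trace file: $file -- Cause: ${e.message}")
        return null                       // never let metrics break reporting
    }
}

final workDir = Files.createTempDirectory('fusion-sketch')

// Missing file: no metrics, no error
final missing = readFusionMetrics(workDir, true) { String text -> [length: text.length()] }
assert missing == null

// Present file: the parser result is returned
Files.createDirectories(workDir.resolve('.fusion'))
Files.write(workDir.resolve('.fusion').resolve('trace.json'), '{}'.getBytes('UTF-8'))
final present = readFusionMetrics(workDir, true) { String text -> [length: text.length()] }
assert present == [length: 2]

// Parser failure: reported and treated as "no metrics"
final broken = readFusionMetrics(workDir, true) { String text -> throw new IllegalStateException('boom') }
assert broken == null

// Fusion disabled: the parser is never invoked
final disabled = readFusionMetrics(workDir, false) { String text -> throw new IllegalStateException('should not run') }
assert disabled == null
```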
## R3: Transient field pattern on TraceRecord
**Decision**: Add `transient private Map<String,Object> gpuMetrics` with getter/setter, following the exact `resourceAllocation` pattern.
**Rationale**: Transient fields on TraceRecord are the established mechanism for carrying executor-specific data to TowerObserver without persisting it in serialization (Kryo). The `resourceAllocation` field is the closest precedent — it's also a `Map<String,Object>` set during trace collection and consumed in `TowerObserver.makeTaskMap0()`.
**Implementation details**:
- Field: `transient private Map<String,Object> gpuMetrics`
- Getter: `Map<String,Object> getGpuMetrics()`
- Setter: `void setGpuMetrics(Map<String,Object> value)`
- In `makeTaskMap0()`: `record.gpuMetrics = trace.getGpuMetrics()`
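The effect of the `transient` modifier can be illustrated with a small round-trip. This sketch uses Java's built-in serialization as a stand-in for Kryo, and `SketchRecord` is a hypothetical class, not `TraceRecord`; it only shows that a transient field carries data in memory but comes back `null` after serialization.

```groovy
// Hypothetical stand-in for TraceRecord (assumption for this sketch).
class SketchRecord implements Serializable {
    private static final long serialVersionUID = 1L

    String name

    // Carried in memory only; skipped when the object is serialized
    transient private Map<String,Object> gpuMetrics

    Map<String,Object> getGpuMetrics() { return gpuMetrics }
    void setGpuMetrics(Map<String,Object> value) { this.gpuMetrics = value }
}

final record = new SketchRecord(name: 'task-1')
record.gpuMetrics = [name: 'Tesla T4', pct: 75]
assert record.gpuMetrics.pct == 75

// Round-trip through Java serialization (a swapped-in technique; Nextflow uses Kryo)
final buffer = new ByteArrayOutputStream()
buffer.withObjectOutputStream { out -> out.writeObject(record) }
final copy = new ByteArrayInputStream(buffer.toByteArray())
        .withObjectInputStream(SketchRecord.class.classLoader) { input -> input.readObject() }

assert copy.name == 'task-1'       // regular property survives
assert copy.gpuMetrics == null     // transient field is not persisted
```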
## R4: JSON parsing approach
**Decision**: Use Groovy's `JsonSlurper` to parse `.fusion/trace.json` and extract the `gpu` key.
**Rationale**: `JsonSlurper` is already used throughout the Nextflow codebase (e.g., in tests and utilities). It parses JSON into native Groovy maps/lists, which is exactly what we need for the `Map<String,Object>` transient field. No additional dependencies required.
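A minimal sketch of the extraction step, assuming a hypothetical helper `extractGpuBlock` that works on the JSON text; the actual `parseFusionTraceFile(Path)` signature and body may differ. The sample JSON is abbreviated, and `future_field` is an invented key used only to show that unknown fields pass through.

```groovy
import groovy.json.JsonSlurper

// Hypothetical helper: name and signature are assumptions for illustration.
Map<String,Object> extractGpuBlock(String jsonText) {
    final parsed = new JsonSlurper().parseText(jsonText)
    // A non-object document cannot contain a gpu block
    if( !(parsed instanceof Map) )
        return null
    final block = parsed.get('gpu')
    return block instanceof Map ? block : null
}

final withGpu = '{"proc": {"pct_cpu": 1045}, "gpu": {"name": "Tesla T4", "pct": 75, "future_field": 1}}'
final cpuOnly = '{"proc": {"pct_cpu": 1045}}'

final metrics = extractGpuBlock(withGpu)
assert metrics.name == 'Tesla T4'
assert metrics.pct == 75
assert metrics.future_field == 1             // unknown keys are forwarded as-is
assert extractGpuBlock(cpuOnly) == null      // CPU-only task: no gpu block

// Malformed input raises an exception, which the caller is expected to handle
boolean failed = false
try {
    extractGpuBlock('{ not json')
}
catch( Exception e ) {
    failed = true
}
assert failed
```

Returning the block as a plain map, rather than mapping it to fixed fields, is what keeps the sketch tolerant of schema changes.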
## R5: Test strategy
**Decision**: Three groups of tests across two test classes, following existing patterns.
1. **TraceRecordTest**: Verify `gpuMetrics` transient field is not persisted across serialization (follows `numSpotInterruptions` test pattern).
2. **TraceRecordTest**: Verify `parseFusionTraceFile()` correctly extracts GPU block from valid JSON, handles missing file, handles malformed JSON, handles missing GPU block.
3. **TowerClientTest**: Verify `gpuMetrics` is included in task map output (follows `resourceAllocation` test at lines 684-711).
**Rationale**: These three test groups mirror how `resourceAllocation` and `numSpotInterruptions` are tested, ensuring consistency with project conventions.

@@ -0,0 +1,147 @@
# Feature Specification: Fusion GPU Metrics Collection
**Feature Branch**: `260410-fusion-gpu-metrics`
**Created**: 2026-04-10
**Status**: Draft
**Input**: User description: "Collect GPU metrics from Fusion trace.json and send to Seqera Platform via TowerClient"
## User Scenarios & Testing *(mandatory)*
### User Story 1 - GPU metrics sent to Platform on task completion (Priority: P1)
A user runs a Nextflow pipeline with Fusion enabled on a GPU-equipped executor (e.g., AWS Batch, Google Batch, Kubernetes). When each task completes, Nextflow reads the Fusion-generated `.fusion/trace.json` file from the task work directory, extracts the `gpu` block, and includes it in the task trace data sent to Seqera Platform. The user can then view GPU utilization metrics (compute %, memory %, active time, etc.) for each task in the Platform UI.
**Why this priority**: This is the core feature. Without it, GPU usage is invisible to Platform users running Fusion-enabled pipelines.
**Independent Test**: Can be tested by running a Fusion-enabled task that produces a `.fusion/trace.json` with a `gpu` block, then verifying the GPU data appears in the task payload sent to Platform.
**Acceptance Scenarios**:
1. **Given** a completed task with Fusion enabled and a valid `.fusion/trace.json` containing a `gpu` block, **When** the task trace is collected, **Then** all GPU metrics from the `gpu` block are included in the task data sent to Platform.
2. **Given** a completed task with Fusion enabled and a valid `.fusion/trace.json` without a `gpu` block (CPU-only task), **When** the task trace is collected, **Then** no GPU metrics are sent and no error occurs.
3. **Given** a failed task with Fusion enabled and a valid `.fusion/trace.json` containing a `gpu` block, **When** the task trace is collected, **Then** GPU metrics are still sent (metrics are collected irrespective of task status).
---
### User Story 2 - Graceful handling when trace.json is missing or malformed (Priority: P2)
When Fusion's `.fusion/trace.json` file is missing (e.g., task was killed before Fusion wrote it) or contains invalid JSON, the system logs a debug-level warning and proceeds without GPU metrics. The task trace is still sent to Platform with all other fields intact.
**Why this priority**: Robustness is essential — GPU metrics are supplementary data and must never cause task reporting to fail.
**Independent Test**: Can be tested by simulating a completed task where `.fusion/trace.json` is absent or contains malformed JSON, and verifying the task trace is still sent successfully without GPU data.
**Acceptance Scenarios**:
1. **Given** a completed Fusion-enabled task where `.fusion/trace.json` does not exist, **When** the task trace is collected, **Then** no GPU metrics are included and no error is raised.
2. **Given** a completed Fusion-enabled task where `.fusion/trace.json` contains invalid JSON, **When** the task trace is collected, **Then** the file is skipped with a debug log message and the task trace is sent without GPU data.
3. **Given** a completed Fusion-enabled task where `.fusion/trace.json` exists but the `gpu` block is null/absent, **When** the task trace is collected, **Then** no GPU metrics are included and no error is raised.
---
### Edge Cases
- What happens when the `gpu` block contains unexpected or extra fields not in the known schema? They are included as-is (forward compatibility).
- What happens when Fusion is not enabled for a task? No attempt is made to read `.fusion/trace.json`.
- What happens when the task work directory is inaccessible at trace collection time (e.g., remote storage timeout)? The same error handling as existing `.command.trace` parsing applies — log and continue.
## Requirements *(mandatory)*
### Functional Requirements
- **FR-001**: System MUST read the file `.fusion/trace.json` from the task work directory on task completion when the executor has Fusion enabled.
- **FR-002**: System MUST extract the entire `gpu` block from the parsed `trace.json` as a map.
- **FR-003**: System MUST store the GPU metrics as a transient field on `TraceRecord` (following the same pattern as `resourceAllocation`).
- **FR-004**: System MUST include the GPU metrics map in the task payload sent to Seqera Platform via the Tower observer.
- **FR-005**: System MUST collect GPU metrics irrespective of task completion status (success or failure).
- **FR-006**: System MUST NOT fail or disrupt task trace reporting if `.fusion/trace.json` is missing, unreadable, or malformed.
- **FR-007**: System MUST only attempt to read `.fusion/trace.json` when Fusion is enabled for the executor.
### Key Entities
- **Fusion Trace File**: JSON file at `.fusion/trace.json` in the task work directory, produced by the Fusion client. Contains `proc`, `gpu`, and `cgroup` blocks with runtime metrics.
- **GPU Metrics Block**: The `gpu` object within `trace.json`, containing fields: `name`, `mem`, `driver`, `active_time`, `pct`, `peak`, `pct_mem`, `peak_mem`, `avg_mem`, `peak_mem_used`, `avg_mem_bw_util`, `peak_mem_bw_util`.
#### Example `.fusion/trace.json`
```json
{
"proc": {
"realtime": 660541,
"pct_cpu": 1045,
"cpu_name": "Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz",
"arch": "linux/amd64",
"rchar": 14112539262,
"wchar": 12668821375,
"syscr": 1823378,
"syscw": 169293,
"read_bytes": 8011776,
"write_bytes": 102400,
"pct_mem": 56,
"vmem": 39015152,
"rss": 14826068,
"peak_vmem": 39047920,
"peak_rss": 15775480,
"vol_ctxt": 413015,
"inv_ctxt": 1540
},
"gpu": {
"name": "Tesla T4",
"mem": 15360,
"driver": "580.126.09",
"active_time": 651030,
"pct": 75,
"peak": 100,
"pct_mem": 40.11115345483025,
"peak_mem": 74.140625,
"avg_mem": 6161,
"peak_mem_used": 11388,
"avg_mem_bw_util": 43,
"peak_mem_bw_util": 83
},
"cgroup": {
"version": "v2",
"memory_current": 25469927424,
"memory_peak": 41178980352,
"memory_rss": 67919872,
"memory_peak_rss": 14783070208,
"cpu_usage_usec": 785302059,
"cpu_user_usec": 549732867,
"cpu_system_usec": 235569192,
"io_read_bytes": 8503296,
"io_write_bytes": 12671918080,
"io_read_ops": 98,
"io_write_ops": 97975,
"memory_limit": 77309411328,
"cpu_quota": 0,
"cpu_period": 0,
"memory_oom_kills": 0,
"cpu_nr_throttled": 0,
"cpu_throttled_usec": 0,
"cpu_psi_some": 582969,
"cpu_psi_full": 582860,
"memory_psi_some": 0,
"memory_psi_full": 0,
"io_psi_some": 1038270,
"io_psi_full": 1037514
}
}
```
- **TraceRecord GPU field**: New transient field on `TraceRecord` that carries the GPU metrics map through the existing trace pipeline to the Tower observer, following the `resourceAllocation` pattern.
## Success Criteria *(mandatory)*
### Measurable Outcomes
- **SC-001**: GPU metrics from Fusion trace files are visible in Seqera Platform for all Fusion-enabled tasks that ran on GPU hardware.
- **SC-002**: Tasks without GPU usage or without Fusion enabled report successfully with no GPU data and no errors.
- **SC-003**: A missing or malformed `.fusion/trace.json` does not cause any task to fail reporting — 100% of tasks still have their standard metrics delivered.
- **SC-004**: GPU metrics collection adds negligible overhead — reading and parsing a single small JSON file per task completion.
## Assumptions
- The Fusion client is responsible for creating `.fusion/trace.json` in the task work directory. Nextflow only reads it.
- The `gpu` block schema may evolve over time. The implementation forwards the entire block as a map rather than mapping to fixed fields, ensuring forward compatibility.
- Seqera Platform API already accepts or will be updated to accept the GPU metrics payload alongside existing task trace data.
- The file path `.fusion/trace.json` is stable and defined by the Fusion client contract.
- All executors that support Fusion (AWS Batch, Google Batch, Azure Batch, Kubernetes, Seqera, SLURM) benefit from this feature without executor-specific code — the detection is based on whether Fusion is enabled, not on the executor type.

@@ -0,0 +1,131 @@
# Tasks: Fusion GPU Metrics Collection
**Input**: Design documents from `/specs/260410-fusion-gpu-metrics/`
**Prerequisites**: plan.md, spec.md, research.md
**Tests**: Included — the spec requires unit tests for both TraceRecord and TowerClient.
**Organization**: Tasks grouped by user story for independent implementation and testing.
## Format: `[ID] [P?] [Story] Description`
- **[P]**: Can run in parallel (different files, no dependencies)
- **[Story]**: Which user story this task belongs to (e.g., US1, US2)
- Exact file paths included in descriptions
## Phase 1: Foundational (TraceRecord transient field)
**Purpose**: Add the `gpuMetrics` transient field to TraceRecord — all subsequent tasks depend on this.
- [ ] T001 Add transient `gpuMetrics` field with getter/setter to `modules/nextflow/src/main/groovy/nextflow/trace/TraceRecord.groovy` (after `resourceAllocation` field at line 128, getter/setter after line 649)
- [ ] T002 Add static `parseFusionTraceFile(Path)` method to `modules/nextflow/src/main/groovy/nextflow/trace/TraceRecord.groovy` — parse `.fusion/trace.json` and return the `gpu` block as `Map<String,Object>`
**Checkpoint**: TraceRecord can hold and parse GPU metrics. No behavior change yet.
---
## Phase 2: User Story 1 - GPU metrics sent to Platform (Priority: P1)
**Goal**: Read `.fusion/trace.json` on task completion, extract GPU block, send to Platform via TowerObserver.
**Independent Test**: Run a Fusion-enabled task with `.fusion/trace.json` containing a `gpu` block, verify GPU data appears in the Platform task payload.
### Implementation
- [ ] T003 [US1] Read `.fusion/trace.json` in `TaskHandler.getTraceRecord()` at `modules/nextflow/src/main/groovy/nextflow/processor/TaskHandler.groovy` — add after `.command.trace` parsing block (after line 253), gated by `task.processor.executor.isFusionEnabled()`
- [ ] T004 [US1] Include `gpuMetrics` in task payload in `TowerObserver.makeTaskMap0()` at `plugins/nf-tower/src/main/io/seqera/tower/plugin/TowerObserver.groovy` — add `record.gpuMetrics = trace.getGpuMetrics()` after `resourceAllocation` line (line 476)
### Tests
- [ ] T005 [P] [US1] Test `parseFusionTraceFile` with valid GPU block in `modules/nextflow/src/test/groovy/nextflow/trace/TraceRecordTest.groovy` — create temp file with full trace.json, verify returned map has all GPU fields
- [ ] T006 [P] [US1] Test `gpuMetrics` transient field is not persisted across serialization in `modules/nextflow/src/test/groovy/nextflow/trace/TraceRecordTest.groovy` — set field, serialize/deserialize, verify null
- [ ] T007 [US1] Test `gpuMetrics` included in task map in `plugins/nf-tower/src/test/io/seqera/tower/plugin/TowerClientTest.groovy` — create TraceRecord with gpuMetrics set, call `makeTasksReq()`, verify output contains GPU data
**Checkpoint**: GPU metrics flow end-to-end from `.fusion/trace.json` to Platform payload. Run:
```bash
./gradlew :nextflow:test --tests "TraceRecordTest"
./gradlew :plugins:nf-tower:test --tests "TowerClientTest"
```
---
## Phase 3: User Story 2 - Graceful error handling (Priority: P2)
**Goal**: Ensure missing, malformed, or GPU-less trace files don't break task reporting.
**Independent Test**: Simulate tasks with missing/malformed `.fusion/trace.json`, verify task trace is sent without GPU data and no errors.
### Tests
- [ ] T008 [P] [US2] Test `parseFusionTraceFile` without GPU block in `modules/nextflow/src/test/groovy/nextflow/trace/TraceRecordTest.groovy` — create temp file with valid JSON but no `gpu` key, verify null returned
- [ ] T009 [P] [US2] Test `parseFusionTraceFile` with malformed JSON in `modules/nextflow/src/test/groovy/nextflow/trace/TraceRecordTest.groovy` — create temp file with invalid JSON, verify exception is thrown
**Checkpoint**: Error handling verified. The implementation in T003 already handles these cases via try/catch — these tests confirm the behavior.
---
## Phase 4: Verification
**Purpose**: End-to-end validation across both modules.
- [ ] T010 Run smoke tests to verify no regressions: `make smoke`
---
## Dependencies & Execution Order
### Phase Dependencies
- **Phase 1** (T001, T002): No dependencies — start immediately
- **Phase 2** (T003-T007): Depends on Phase 1 completion
- **Phase 3** (T008-T009): Depends on Phase 1 (T002 specifically)
- **Phase 4** (T010): Depends on all previous phases
### Parallel Opportunities
- T001 and T002 modify the same file but different sections — execute sequentially
- T005, T006 are [P] — can run in parallel (same file but independent test methods)
- T008, T009 are [P] — can run in parallel
- T004 and T005/T006 are in different modules, so they can run in parallel once Phase 1 is complete (T004 and T006 need only T001; T005 also needs T002)
### Within Each Phase
```
Phase 1: T001 → T002
Phase 2: T003 → T004 (sequential: different modules but T004 depends on field from T001)
T005, T006 (parallel, after T002)
T007 (after T004)
Phase 3: T008, T009 (parallel, after T002)
Phase 4: T010 (after all)
```
---
## Implementation Strategy
### MVP (User Story 1 Only)
1. Complete Phase 1: TraceRecord field + parser (T001-T002)
2. Complete Phase 2: TaskHandler + TowerObserver + tests (T003-T007)
3. **STOP and VALIDATE**: Run unit tests for both modules
4. GPU metrics flow to Platform
### Full Feature
1. MVP above
2. Add Phase 3: Error handling tests (T008-T009)
3. Phase 4: Smoke tests (T010)
---
## Summary
| Metric | Value |
|--------|-------|
| Total tasks | 10 |
| US1 tasks | 5 (T003-T007) |
| US2 tasks | 2 (T008-T009) |
| Foundational | 2 (T001-T002) |
| Verification | 1 (T010) |
| Files modified | 5 |
| Parallel opportunities | T005+T006, T008+T009 |

@@ -0,0 +1,667 @@
# Seqera executor `process.resourceLabels` Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Make the `nf-seqera` executor honour `process.resourceLabels` by sending the config-level baseline as run labels and the per-task delta as Sched task labels.
**Architecture:** Cumulative Nextflow labels split into two scheduler scopes — config-level `process.resourceLabels` becomes `CreateRunRequest.labels`; the difference between `task.config.getResourceLabels()` and that baseline becomes `Task.labels`. The redundant `seqera.executor.labels` config option is removed.
**Tech Stack:** Groovy 4 / Java 21 toolchain, Gradle, Spock, `io.seqera:sched-client` (≥ 0.51.0 — must expose `Task.labels`), Nextflow extension-point plugin model.
**Spec:** `docs/superpowers/specs/2026-04-17-seqera-resource-labels-design.md`
**File map:**
- Modify `settings.gradle` — add `includeBuild '../sched'` for local development against the unpublished `sched-client`
- Modify `plugins/nf-seqera/build.gradle` — bump `sched-client` version
- Modify `plugins/nf-seqera/src/main/io/seqera/executor/Labels.groovy` — add `withProcessResourceLabels`, add `toStringMap`, add `delta`, remove `withUserLabels`
- Modify `plugins/nf-seqera/src/main/io/seqera/config/ExecutorOpts.groovy` — remove `labels` field
- Modify `plugins/nf-seqera/src/main/io/seqera/executor/SeqeraExecutor.groovy` — wire run labels + cache `runResourceLabels`
- Modify `plugins/nf-seqera/src/main/io/seqera/executor/SeqeraTaskHandler.groovy` — attach delta to `Task.labels`
- Modify `plugins/nf-seqera/src/test/io/seqera/executor/LabelsTest.groovy`
- Modify `plugins/nf-seqera/src/test/io/seqera/config/ExecutorOptsTest.groovy`
- Modify `plugins/nf-seqera/src/test/io/seqera/executor/SeqeraTaskHandlerTest.groovy`
- Modify `plugins/nf-seqera/src/test/io/seqera/executor/SeqeraExecutorTest.groovy`
- Modify `docs/reference/process.md` — add Seqera executor to support list
- Modify `plugins/nf-seqera/changelog.txt` — entry
- Modify `plugins/nf-seqera/VERSION` — bump to `0.18.0`
---
### Task 1: Bump `sched-client` and wire `includeBuild '../sched'`
**Files:**
- Modify: `plugins/nf-seqera/build.gradle:54`
- Modify: `settings.gradle:17-19`
The local `~/Projects/sched` checkout is at `0.51.0`; that is the version exposing `Task.labels`. Until 0.51.0 is published to the Seqera Maven repo, we use a Gradle composite build to substitute the dependency from the local checkout.
- [ ] **Step 1: Bump sched-client version**
Edit `plugins/nf-seqera/build.gradle:54`:
```gradle
api 'io.seqera:sched-client:0.51.0'
```
- [ ] **Step 2: Add `includeBuild '../sched'` block to `settings.gradle`**
Keep the commented `pluginManagement` block at `settings.gradle:17-19` as-is and add the `sched` lines directly below it: a commented hint for future reference, followed by the active `includeBuild`:
```gradle
// pluginManagement {
// includeBuild '../nextflow-plugin-gradle'
// }
// For local development against an unpublished sched-client, uncomment:
// includeBuild '../sched'
includeBuild '../sched'
```
(The uncommented `includeBuild '../sched'` line is required for the build to resolve `sched-client:0.51.0` until the artifact is published. The commented hint stays for future reference.)
- [ ] **Step 3: Verify the build resolves**
Run: `./gradlew :plugins:nf-seqera:compileGroovy`
Expected: BUILD SUCCESSFUL. If it fails with a missing `sched-client:0.51.0`, confirm `~/Projects/sched/VERSION` contains `0.51.0` and that `~/Projects/sched/sched-client` builds locally (`cd ~/Projects/sched && ./gradlew :sched-client:assemble`).
- [ ] **Step 4: Commit**
```bash
git add settings.gradle plugins/nf-seqera/build.gradle
git commit -s -m "build(nf-seqera): bump sched-client to 0.51.0 via includeBuild"
```
---
### Task 2: Add `Labels.withProcessResourceLabels` (TDD)
**Files:**
- Modify: `plugins/nf-seqera/src/main/io/seqera/executor/Labels.groovy`
- Modify: `plugins/nf-seqera/src/test/io/seqera/executor/LabelsTest.groovy`
- [ ] **Step 1: Write the failing tests**
Append to `plugins/nf-seqera/src/test/io/seqera/executor/LabelsTest.groovy` (before the closing `}`):
```groovy
def 'should add process resource labels coercing values to string'() {
when:
def labels = new Labels()
.withProcessResourceLabels([team: 'genomics', priority: 7, retain: true])
then:
labels.entries['team'] == 'genomics'
labels.entries['priority'] == '7'
labels.entries['retain'] == 'true'
}
def 'should ignore null or empty process resource labels'() {
when:
def a = new Labels().withProcessResourceLabels(null)
def b = new Labels().withProcessResourceLabels([:])
then:
a.entries.isEmpty()
b.entries.isEmpty()
}
def 'should let process resource labels override workflow metadata on key collision'() {
given:
def workflow = Mock(WorkflowMetadata) {
getProjectName() >> 'hello'
getRunName() >> 'happy_turing'
getSessionId() >> UUID.randomUUID()
isResume() >> false
getManifest() >> new Manifest([:])
}
when:
def labels = new Labels()
.withWorkflowMetadata(workflow)
.withProcessResourceLabels(['nextflow.io/runName': 'custom', team: 'a'])
then:
labels.entries['nextflow.io/runName'] == 'custom'
labels.entries['team'] == 'a'
labels.entries['nextflow.io/projectName'] == 'hello'
}
```
- [ ] **Step 2: Run the tests to verify they fail**
Run: `./gradlew :plugins:nf-seqera:test --tests 'io.seqera.executor.LabelsTest' -i`
Expected: FAIL — `MissingMethodException: No signature of method ... withProcessResourceLabels`.
- [ ] **Step 3: Implement `withProcessResourceLabels`**
Edit `plugins/nf-seqera/src/main/io/seqera/executor/Labels.groovy`. After the `withUserLabels` method (which will be removed in Task 4), add:
```groovy
/**
* Add config-level {@code process.resourceLabels}. Values are coerced to
* string via {@link String#valueOf} to satisfy the scheduler API typing.
*/
Labels withProcessResourceLabels(Map<String,?> map) {
if( !map ) return this
map.each { k, v -> entries.put(k.toString(), String.valueOf(v)) }
return this
}
```
- [ ] **Step 4: Run the tests to verify they pass**
Run: `./gradlew :plugins:nf-seqera:test --tests 'io.seqera.executor.LabelsTest' -i`
Expected: PASS for the three new tests; existing tests still PASS.
- [ ] **Step 5: Commit**
```bash
git add plugins/nf-seqera/src/main/io/seqera/executor/Labels.groovy \
plugins/nf-seqera/src/test/io/seqera/executor/LabelsTest.groovy
git commit -s -m "feat(nf-seqera): add Labels.withProcessResourceLabels"
```
---
### Task 3: Add `Labels.delta` and `Labels.toStringMap` helpers (TDD)
**Files:**
- Modify: `plugins/nf-seqera/src/main/io/seqera/executor/Labels.groovy`
- Modify: `plugins/nf-seqera/src/test/io/seqera/executor/LabelsTest.groovy`
These two static helpers compute the per-task delta and coerce arbitrary `Map<String,?>` values to strings — used by both the executor (to cache the run baseline) and the task handler (to compute the delta).
- [ ] **Step 1: Write the failing tests**
Append to `plugins/nf-seqera/src/test/io/seqera/executor/LabelsTest.groovy` (before the closing `}`):
```groovy
def 'should coerce map values to strings'() {
expect:
Labels.toStringMap(null) == [:]
Labels.toStringMap([:]) == [:]
Labels.toStringMap([a: 1, b: 'x', c: true]) == [a: '1', b: 'x', c: 'true']
}
def 'should compute null delta when task labels are empty'() {
expect:
Labels.delta(null, [team: 'a']) == null
Labels.delta([:], [team: 'a']) == null
}
def 'should return full task labels when run labels are empty'() {
expect:
Labels.delta([team: 'a', region: 'us'], null) == [team: 'a', region: 'us']
Labels.delta([team: 'a', region: 'us'], [:]) == [team: 'a', region: 'us']
}
def 'should keep only differing or missing keys in delta'() {
expect:
Labels.delta([team: 'a', region: 'us'], [team: 'a']) == [region: 'us']
Labels.delta([team: 'b'], [team: 'a']) == [team: 'b']
Labels.delta([team: 'a', region: 'us'], [team: 'a', region: 'us']) == null
}
```
- [ ] **Step 2: Run the tests to verify they fail**
Run: `./gradlew :plugins:nf-seqera:test --tests 'io.seqera.executor.LabelsTest' -i`
Expected: FAIL — `MissingMethodException: ... toStringMap` and `... delta`.
- [ ] **Step 3: Implement the helpers**
Edit `plugins/nf-seqera/src/main/io/seqera/executor/Labels.groovy`. Inside the class, after the existing `runId` method, add:
```groovy
/**
* Coerce arbitrary map values to strings via {@link String#valueOf}.
* Returns an empty map for null/empty input.
*/
static Map<String,String> toStringMap(Map<String,?> map) {
if( !map ) return Collections.<String,String>emptyMap()
final result = new LinkedHashMap<String,String>(map.size())
map.each { k, v -> result.put(k.toString(), String.valueOf(v)) }
return result
}
/**
* Return the entries of {@code task} that are missing from {@code run}
* or have a different value. Returns {@code null} if the resulting
* map would be empty (so callers can omit the field).
*/
static Map<String,String> delta(Map<String,String> task, Map<String,String> run) {
if( !task ) return null
final result = new LinkedHashMap<String,String>()
task.each { k, v ->
if( run == null || !run.containsKey(k) || run.get(k) != v )
result.put(k, v)
}
return result.isEmpty() ? null : result
}
```
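As an illustration of how the two helpers combine into the run/task split, here is a standalone sketch. The helpers are restated compactly as script methods so the example runs on its own, and the label values are invented sample data.

```groovy
// Compact restatements of the helpers, so this script is self-contained.
Map<String,String> toStringMap(Map<String,?> map) {
    if( !map ) return [:]
    return map.collectEntries { k, v -> [(k.toString()): String.valueOf(v)] } as Map<String,String>
}

Map<String,String> delta(Map<String,String> task, Map<String,String> run) {
    if( !task ) return null
    final result = task.findAll { k, v -> run == null || !run.containsKey(k) || run.get(k) != v }
    return result ?: null          // an empty map becomes null so callers can omit the field
}

// Config-level baseline, sent once as run labels (sample values)
final runLabels = toStringMap([team: 'genomics', priority: 7])

// Cumulative labels of one task: the baseline plus an override and an addition
final taskLabels = toStringMap([team: 'genomics', priority: 9, stage: 'align'])

// Only the override and the new key travel with the task
assert delta(taskLabels, runLabels) == [priority: '9', stage: 'align']

// A task that adds nothing to the baseline yields no task labels at all
assert delta(runLabels, runLabels) == null
```

Coercing both sides with `toStringMap` before comparing matters: without it, the integer `7` from the config and the string `'7'` would count as different values.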
- [ ] **Step 4: Run the tests to verify they pass**
Run: `./gradlew :plugins:nf-seqera:test --tests 'io.seqera.executor.LabelsTest' -i`
Expected: PASS for all new tests; existing tests still PASS.
- [ ] **Step 5: Commit**
```bash
git add plugins/nf-seqera/src/main/io/seqera/executor/Labels.groovy \
plugins/nf-seqera/src/test/io/seqera/executor/LabelsTest.groovy
git commit -s -m "feat(nf-seqera): add Labels.toStringMap and Labels.delta helpers"
```
---
### Task 4: Remove `seqera.executor.labels` config option
**Files:**
- Modify: `plugins/nf-seqera/src/main/io/seqera/config/ExecutorOpts.groovy:74-79,130,168-170`
- Modify: `plugins/nf-seqera/src/main/io/seqera/executor/Labels.groovy:81-88`
- Modify: `plugins/nf-seqera/src/main/io/seqera/executor/SeqeraExecutor.groovy:120`
- Modify: `plugins/nf-seqera/src/test/io/seqera/config/ExecutorOptsTest.groovy:131-165`
- Modify: `plugins/nf-seqera/src/test/io/seqera/executor/LabelsTest.groovy:127-150,192-199`
The user-facing `seqera.executor.labels` option is replaced by the standard Nextflow `process.resourceLabels` directive.
- [ ] **Step 1: Remove the field and getter from `ExecutorOpts`**
Edit `plugins/nf-seqera/src/main/io/seqera/config/ExecutorOpts.groovy`:
Remove lines 74-79 (the `@ConfigOption` block and `final Map<String, String> labels` field):
```groovy
@ConfigOption
@Description("""
Custom labels to apply to AWS resources for cost tracking and resource organization.
Labels are propagated to ECS tasks, capacity providers, and EC2 instances.
""")
final Map<String, String> labels
```
Remove the assignment in the constructor (around line 129-130):
```groovy
// labels for cost tracking
this.labels = opts.labels as Map<String, String>
```
Remove the getter (around line 168-170):
```groovy
Map<String, String> getLabels() {
return labels
}
```
- [ ] **Step 2: Remove the `withUserLabels` method from `Labels`**
Edit `plugins/nf-seqera/src/main/io/seqera/executor/Labels.groovy`. Delete the entire `withUserLabels` method (lines 81-88):
```groovy
/**
* Add user-configured labels. These take precedence over implicit labels.
*/
Labels withUserLabels(Map<String,String> labels) {
if( labels )
entries.putAll(labels)
return this
}
```
- [ ] **Step 3: Remove the `withUserLabels` call site in `SeqeraExecutor.createRun()`**
Edit `plugins/nf-seqera/src/main/io/seqera/executor/SeqeraExecutor.groovy`. Delete the line:
```groovy
labels.withUserLabels(seqeraConfig.labels)
```
- [ ] **Step 4: Remove obsolete tests**
Edit `plugins/nf-seqera/src/test/io/seqera/config/ExecutorOptsTest.groovy`. Delete the three tests (lines 131-165): `'should create config with labels'`, `'should handle null labels'`, `'should handle empty labels'`.
Edit `plugins/nf-seqera/src/test/io/seqera/executor/LabelsTest.groovy`. Delete the two tests: `'should allow user labels to override implicit labels'` (lines 127-150) and `'should handle null user labels'` (lines 192-199).
- [ ] **Step 5: Compile and run tests**
Run: `./gradlew :plugins:nf-seqera:compileGroovy :plugins:nf-seqera:test`
Expected: BUILD SUCCESSFUL; all remaining tests PASS. If the compiler complains about a stray reference to `seqeraConfig.labels` or `withUserLabels`, grep for and remove them: `rg "seqeraConfig\.labels|withUserLabels" plugins/nf-seqera`.
- [ ] **Step 6: Commit**
```bash
git add plugins/nf-seqera/src/main/io/seqera/config/ExecutorOpts.groovy \
plugins/nf-seqera/src/main/io/seqera/executor/Labels.groovy \
plugins/nf-seqera/src/main/io/seqera/executor/SeqeraExecutor.groovy \
plugins/nf-seqera/src/test/io/seqera/config/ExecutorOptsTest.groovy \
plugins/nf-seqera/src/test/io/seqera/executor/LabelsTest.groovy
git commit -s -m "refactor(nf-seqera)!: remove seqera.executor.labels in favour of process.resourceLabels"
```
---
### Task 5: Wire `process.resourceLabels` into `SeqeraExecutor.createRun()` and expose `runResourceLabels` (TDD)
**Files:**
- Modify: `plugins/nf-seqera/src/main/io/seqera/executor/SeqeraExecutor.groovy`
- Modify: `plugins/nf-seqera/src/test/io/seqera/executor/SeqeraExecutorTest.groovy`
The executor reads the config-level `process.resourceLabels` map once at run creation, attaches it to the run labels via `Labels.withProcessResourceLabels`, and caches the coerced map so task handlers can compute deltas.
- [ ] **Step 1: Write the failing test**
Append to `plugins/nf-seqera/src/test/io/seqera/executor/SeqeraExecutorTest.groovy` (before the final closing `}`):
```groovy
def 'should expose run resource labels coerced from config-level process.resourceLabels'() {
given:
def executor = new SeqeraExecutor()
executor.@session = Mock(Session) {
getConfig() >> [process: [resourceLabels: [team: 'a', priority: 7]]]
}
when:
executor.computeRunResourceLabels()
then:
executor.runResourceLabels == [team: 'a', priority: '7']
}
def 'should yield empty run resource labels when process.resourceLabels is absent'() {
given:
def executor = new SeqeraExecutor()
executor.@session = Mock(Session) {
getConfig() >> [:]
}
when:
executor.computeRunResourceLabels()
then:
executor.runResourceLabels == [:]
}
```
(`Session` is already imported at line 21; if not, add the import.)
- [ ] **Step 2: Run the test to verify it fails**
Run: `./gradlew :plugins:nf-seqera:test --tests 'io.seqera.executor.SeqeraExecutorTest' -i`
Expected: FAIL — `computeRunResourceLabels` / `runResourceLabels` don't exist.
- [ ] **Step 3: Implement on `SeqeraExecutor`**
Edit `plugins/nf-seqera/src/main/io/seqera/executor/SeqeraExecutor.groovy`.
Add a private field near the other private fields (after `runId` at line 65):
```groovy
private volatile Map<String,String> runResourceLabels = Collections.<String,String>emptyMap()
```
Add a method to compute the run resource labels (place near other protected/package methods, e.g. before `createRun()` at line 110):
```groovy
@groovy.transform.PackageScope
void computeRunResourceLabels() {
final processMap = session.config.process as Map
final raw = processMap?.get('resourceLabels') as Map<String,?>
this.runResourceLabels = Labels.toStringMap(raw)
}
```
Add the public getter (after `getRunId()` around line 204):
```groovy
Map<String,String> getRunResourceLabels() {
return runResourceLabels
}
```
Wire it into `createRun()`. Replace the labels-building block at `SeqeraExecutor.groovy:117-120` (after the deletion in Task 4 it should contain only the `final labels`, `if( seqeraConfig.autoLabels )`, and `withWorkflowMetadata` lines shown below) with:
```groovy
computeRunResourceLabels()
final labels = new Labels()
if( seqeraConfig.autoLabels )
labels.withWorkflowMetadata(session.workflowMetadata)
labels.withProcessResourceLabels(runResourceLabels)
```
- [ ] **Step 4: Run the test to verify it passes**
Run: `./gradlew :plugins:nf-seqera:test --tests 'io.seqera.executor.SeqeraExecutorTest' -i`
Expected: PASS for both new tests; existing tests still PASS.
- [ ] **Step 5: Commit**
```bash
git add plugins/nf-seqera/src/main/io/seqera/executor/SeqeraExecutor.groovy \
plugins/nf-seqera/src/test/io/seqera/executor/SeqeraExecutorTest.groovy
git commit -s -m "feat(nf-seqera): attach process.resourceLabels to Sched run labels"
```
---
### Task 6: Send per-task delta on `Task.labels` from `SeqeraTaskHandler.submit()` (TDD)
**Files:**
- Modify: `plugins/nf-seqera/src/main/io/seqera/executor/SeqeraTaskHandler.groovy`
- Modify: `plugins/nf-seqera/src/test/io/seqera/executor/SeqeraTaskHandlerTest.groovy`
Capture the `Task` passed to the batch submitter and assert its `labels` field reflects the delta between the task's `getResourceLabels()` and the executor's `runResourceLabels`.
- [ ] **Step 1: Write the failing tests**
Append to `plugins/nf-seqera/src/test/io/seqera/executor/SeqeraTaskHandlerTest.groovy` (before the final closing `}`):
```groovy
def 'submit attaches Task.labels containing only the per-task delta'() {
given:
Task captured = null
def batchSubmitter = Mock(SeqeraBatchSubmitter) {
submit(_, _) >> { args -> captured = args[1] as Task }
}
def taskConfig = Mock(TaskConfig) {
getCpus() >> 2
getMemory() >> MemoryUnit.of('1 GB')
getAccelerator() >> null
getResourceLabels() >> [team: 'a', region: 'us-east-1']
getResourceLimit('memory') >> null
getResourceLimit('cpus') >> null
getDisk() >> null
}
def taskRun = Mock(TaskRun) {
getConfig() >> taskConfig
getWorkDir() >> Paths.get('/work/ab/cd1234')
getWorkDirStr() >> '/work/ab/cd1234'
getContainer() >> 'docker.io/library/alpine:3'
getContainerPlatform() >> 'linux/amd64'
getId() >> TaskId.of(1)
getHash() >> HashCode.fromInt(1)
lazyName() >> 'sample_task'
}
def executor = Mock(SeqeraExecutor) {
getClient() >> Mock(SchedClient)
getBatchSubmitter() >> batchSubmitter
getSeqeraConfig() >> Mock(ExecutorOpts) {
getMachineRequirement() >> Mock(io.seqera.config.MachineRequirementOpts)
getTaskEnvironment() >> [:]
}
getRunResourceLabels() >> [team: 'a']
ensureRunCreated() >> {}
}
def handler = Spy(new SeqeraTaskHandler(taskRun, executor)) {
fusionEnabled() >> true
fusionLauncher() >> Mock(nextflow.fusion.FusionScriptLauncher) {
fusionEnv() >> [:]
}
fusionSubmitCli() >> ['/bin/sh', '-c', 'true']
fusionConfig() >> Mock(nextflow.fusion.FusionConfig) {
snapshotsEnabled() >> false
}
}
when:
handler.submit()
then:
captured != null
captured.getLabels() == [region: 'us-east-1']
}
def 'submit leaves Task.labels unset when the task labels equal the run baseline'() {
given:
Task captured = null
def batchSubmitter = Mock(SeqeraBatchSubmitter) {
submit(_, _) >> { args -> captured = args[1] as Task }
}
def taskConfig = Mock(TaskConfig) {
getCpus() >> 2
getMemory() >> MemoryUnit.of('1 GB')
getAccelerator() >> null
getResourceLabels() >> [team: 'a']
getResourceLimit('memory') >> null
getResourceLimit('cpus') >> null
getDisk() >> null
}
def taskRun = Mock(TaskRun) {
getConfig() >> taskConfig
getWorkDir() >> Paths.get('/work/ab/cd1234')
getWorkDirStr() >> '/work/ab/cd1234'
getContainer() >> 'docker.io/library/alpine:3'
getContainerPlatform() >> 'linux/amd64'
getId() >> TaskId.of(1)
getHash() >> HashCode.fromInt(1)
lazyName() >> 'sample_task'
}
def executor = Mock(SeqeraExecutor) {
getClient() >> Mock(SchedClient)
getBatchSubmitter() >> batchSubmitter
getSeqeraConfig() >> Mock(ExecutorOpts) {
getMachineRequirement() >> Mock(io.seqera.config.MachineRequirementOpts)
getTaskEnvironment() >> [:]
}
getRunResourceLabels() >> [team: 'a']
ensureRunCreated() >> {}
}
def handler = Spy(new SeqeraTaskHandler(taskRun, executor)) {
fusionEnabled() >> true
fusionLauncher() >> Mock(nextflow.fusion.FusionScriptLauncher) {
fusionEnv() >> [:]
}
fusionSubmitCli() >> ['/bin/sh', '-c', 'true']
fusionConfig() >> Mock(nextflow.fusion.FusionConfig) {
snapshotsEnabled() >> false
}
}
when:
handler.submit()
then:
captured != null
captured.getLabels() == null
}
```
- [ ] **Step 2: Run the tests to verify they fail**
Run: `./gradlew :plugins:nf-seqera:test --tests 'io.seqera.executor.SeqeraTaskHandlerTest' -i`
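Expected: the delta test FAILS because `submit()` does not yet set `Task.labels`. The baseline-equal test asserts `captured.getLabels() == null` and may already pass at this point, since the field is unset either way.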
- [ ] **Step 3: Wire the delta into `submit()`**
Edit `plugins/nf-seqera/src/main/io/seqera/executor/SeqeraTaskHandler.groovy`. After the `final schedTask = new Task() ... .nextflow(...)` block ending around line 140, before the `log.debug` call at line 141, insert:
```groovy
// attach per-task resource labels delta (over run-level baseline)
final taskLabels = Labels.toStringMap(task.config.getResourceLabels())
final delta = Labels.delta(taskLabels, executor.runResourceLabels)
if( delta )
schedTask.labels(delta)
```
- [ ] **Step 4: Run the tests to verify they pass**
Run: `./gradlew :plugins:nf-seqera:test --tests 'io.seqera.executor.SeqeraTaskHandlerTest' -i`
Expected: PASS for both new tests; existing tests still PASS.
- [ ] **Step 5: Run the full plugin test suite**
Run: `./gradlew :plugins:nf-seqera:test`
Expected: BUILD SUCCESSFUL; no regressions.
- [ ] **Step 6: Commit**
```bash
git add plugins/nf-seqera/src/main/io/seqera/executor/SeqeraTaskHandler.groovy \
plugins/nf-seqera/src/test/io/seqera/executor/SeqeraTaskHandlerTest.groovy
git commit -s -m "feat(nf-seqera): send per-task resourceLabels delta on Sched task"
```
---
### Task 7: Docs, changelog, and version bump
**Files:**
- Modify: `docs/reference/process.md:1388-1393`
- Modify: `plugins/nf-seqera/changelog.txt`
- Modify: `plugins/nf-seqera/VERSION`
- [ ] **Step 1: Update docs**
Edit `docs/reference/process.md`. Replace the executor support list at lines 1388-1393 with:
```markdown
Resource labels are currently supported by the following executors:
- {ref}`awsbatch-executor`
- {ref}`azurebatch-executor`
- {ref}`google-batch-executor`
- {ref}`k8s-executor`
- {ref}`seqera-executor`
```
(If `seqera-executor` is not a defined ref, drop the `{ref}` wrapper and write `Seqera executor` as plain text.)
- [ ] **Step 2: Update plugin changelog**
Edit `plugins/nf-seqera/changelog.txt`. Add a new entry at the top, above the `0.17.0` block:
```
0.18.0 - <today's date>
- Support process.resourceLabels: config-level labels attached to Sched run, per-task delta attached to Sched task
- Remove seqera.executor.labels config option (use process.resourceLabels instead)
- Bump sched-client@0.51.0
```
- [ ] **Step 3: Bump plugin VERSION**
Edit `plugins/nf-seqera/VERSION`:
```
0.18.0
```
- [ ] **Step 4: Verify everything builds**
Run: `./gradlew :plugins:nf-seqera:check`
Expected: BUILD SUCCESSFUL.
- [ ] **Step 5: Commit**
```bash
git add docs/reference/process.md plugins/nf-seqera/changelog.txt plugins/nf-seqera/VERSION
git commit -s -m "docs(nf-seqera): document resourceLabels support and bump to 0.18.0"
```
---
## Self-review checklist (executed)
- **Spec coverage:** every section of `2026-04-17-seqera-resource-labels-design.md` maps to a task — sched-client bump (Task 1), `withProcessResourceLabels` (Task 2), `delta` + `toStringMap` (Task 3), removal of `seqera.executor.labels` (Task 4), run-level wiring + `runResourceLabels` (Task 5), per-task delta on `Task.labels` (Task 6), docs / changelog / VERSION (Task 7).
- **Placeholder scan:** no TBDs, no "implement later", every code step has the actual code.
- **Type consistency:** `Labels.toStringMap(Map<String,?>)` and `Labels.delta(Map<String,String>, Map<String,String>)` referenced consistently in Tasks 3, 5, 6; `runResourceLabels` field, `computeRunResourceLabels()` method, and `getRunResourceLabels()` getter consistent across Tasks 5 and 6.

@@ -0,0 +1,170 @@
# Seqera executor: support `process.resourceLabels`
Date: 2026-04-17
Status: Approved
## Problem
The `nf-seqera` executor does not honour the `process.resourceLabels`
directive. `SeqeraTaskHandler.submit()` builds the scheduler `Task` with
`name`, `image`, `command`, `environment`, `resourceRequirement`,
`resourceLimit`, `machineRequirement`, and `nextflow(taskId/hash/workDir)`;
it never reads `task.config.getResourceLabels()`. The plugin's only label
path is at the run level (`SeqeraExecutor.createRun()`), where
`Labels.withUserLabels(seqeraConfig.labels)` and optional auto-labels are
attached to `CreateRunRequest`.
`AbstractComputePlatformProvider.addConfigResourceLabels()` emits
`process.resourceLabels = [...]` into the Nextflow config for every compute
environment (CE) type, including the Seqera Compute default config. AWS Batch / GCP Batch / Azure /
K8s honour the directive; on the Seqera Compute path the directive is
effectively dead.
## Goal
Implement support for `process.resourceLabels` in the Seqera executor and
pass labels through to the `sched-client`, with cumulative semantics that
mirror Nextflow's existing label model.
## Label model
Nextflow labels are cumulative:
- `process.resourceLabels` at the top level of `nextflow.config` is the
common baseline — it applies to every task across every process.
- Selector-scoped (`withName:`, `withLabel:`) and in-process-body
`resourceLabels` directives merge on top, per process.
- `TaskConfig.getResourceLabels()` returns the final merged map for a given
task.
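As an illustration of the cumulative model, a config along the following lines would give every task a baseline label and add one more label for a single process. This is only a sketch: the process name `ALIGN` and the label values are hypothetical.

```groovy
// nextflow.config (illustrative sketch; `ALIGN` and the label values are hypothetical)
process {
    // common baseline: applies to every task of every process
    resourceLabels = [team: 'a']

    // selector-scoped labels merge on top of the baseline for matching processes
    withName: 'ALIGN' {
        resourceLabels = [region: 'us-east-1']
    }
}
```

Under the model described above, `TaskConfig.getResourceLabels()` would be expected to return `[team: 'a', region: 'us-east-1']` for `ALIGN` tasks and `[team: 'a']` for all other tasks.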
The `sched-api` (≥ 0.51.0) exposes labels at two scopes:
- `CreateRunRequest.labels` — set once at run creation
- `Task.labels` — set per task
We map cumulative Nextflow labels onto these two scopes:
- **Run-level labels** = config-level `process.resourceLabels` (the common
baseline) + `nextflow.io/*` auto-labels (when `seqera.executor.autoLabels`
is enabled).
- **Per-task labels** = the *delta* between `task.config.getResourceLabels()`
and the run-level baseline:
- keys present on the task but absent from the run baseline
- keys present in both where the task value differs from the run value
- keys present in both with identical values are omitted
When the delta is empty, `Task.labels` is left unset.
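The delta rule can be sketched as a small standalone Groovy script. The function name `labelDelta` and the sample labels are assumptions made for illustration; this is not the plugin's actual helper.

```groovy
// Standalone sketch of the delta rule; `labelDelta` is a hypothetical name.
Map<String,String> labelDelta(Map<String,String> task, Map<String,String> baseline) {
    final result = new LinkedHashMap<String,String>()
    task?.each { k, v ->
        // keep keys missing from the run baseline, or whose value differs from it
        if( baseline == null || !baseline.containsKey(k) || baseline[k] != v )
            result[k] = v
    }
    return result
}

def runLabels = [team: 'a', env: 'prod']
// `env` differs and `region` is new, so both appear; `team` matches and is omitted
assert labelDelta([team: 'a', env: 'dev', region: 'us-east-1'], runLabels) == [env: 'dev', region: 'us-east-1']
// task labels identical to the baseline produce an empty delta
assert labelDelta([team: 'a', env: 'prod'], runLabels).isEmpty()
```

This sketch returns an empty map when nothing differs, which leaves the "unset" decision to the caller.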
The Sched scheduler is expected to merge run + task labels with task labels
overriding run labels on key collision; this preserves the Nextflow
semantic where a per-process `resourceLabels` directive overrides the
config-level default for the same key.
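The expected merge can be modelled with plain Groovy maps. How the scheduler actually merges labels is not specified here, so the snippet below only illustrates the assumed "task overrides run" semantic; the label values are hypothetical.

```groovy
// Illustrative only: models the assumed "task overrides run" merge with plain maps.
def runLabels = [team: 'a', env: 'prod']           // sent once on CreateRunRequest.labels
def taskDelta = [env: 'dev', region: 'us-east-1']  // sent on Task.labels
// Groovy's Map `+` copies the left map, then applies the right map's entries on top
def effective = runLabels + taskDelta
assert effective == [team: 'a', env: 'dev', region: 'us-east-1']
```

If the scheduler behaves this way, sending only the delta loses no information: the run baseline plus the delta reproduces the task's full merged label map.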
## Changes
### 1. Remove `seqera.executor.labels`
This config option becomes redundant once `process.resourceLabels` is the
canonical user-facing way to attach run-level labels.
- `ExecutorOpts` (`plugins/nf-seqera/src/main/io/seqera/config/ExecutorOpts.groovy`):
remove the `labels` field, getter, and `@ConfigOption` declaration.
- `SeqeraExecutor.createRun()`: drop the
`labels.withUserLabels(seqeraConfig.labels)` call.
- `Labels`: remove `withUserLabels(Map)` (no remaining callers).
- `ExecutorOptsTest`, `LabelsTest`, `SeqeraExecutorTest`: drop assertions
for the removed option.
- Plugin `changelog.txt`: note the removal as a breaking change for plugin
`nf-seqera` 0.18.0.
### 2. Add `withProcessResourceLabels` to `Labels`
`plugins/nf-seqera/src/main/io/seqera/executor/Labels.groovy`:
```groovy
Labels withProcessResourceLabels(Map<String,Object> map) {
if( map )
map.each { k, v -> entries.put(k.toString(), String.valueOf(v)) }
return this
}
```
Values are coerced to `String` via `String.valueOf` to satisfy
`sched-api`'s `Map<String,String>` typing without rejecting non-string
values that Nextflow's `resourceLabels` directive accepts.
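The effect of the coercion can be checked in isolation with a small script. The helper name `toStrings` is hypothetical and stands in for whatever coercion helper the plugin ends up using; the sample labels are made up.

```groovy
// Standalone sketch; `toStrings` is a hypothetical helper name.
Map<String,String> toStrings(Map<String,?> raw) {
    final result = new LinkedHashMap<String,String>()
    // stringify every key and value; a null input map yields an empty result
    raw?.each { k, v -> result.put(k.toString(), String.valueOf(v)) }
    return result
}

// numeric and boolean values become their string forms
assert toStrings([team: 'a', priority: 7, billable: true]) == [team: 'a', priority: '7', billable: 'true']
assert toStrings(null).isEmpty()
```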
### 3. Wire run-level labels in `SeqeraExecutor.createRun()`
```groovy
final processLabels = (session.config.process as Map)?.resourceLabels as Map<String,Object>
final labels = new Labels()
if( seqeraConfig.autoLabels )
labels.withWorkflowMetadata(session.workflowMetadata)
labels.withProcessResourceLabels(processLabels)
this.runResourceLabels = coerceToStringMap(processLabels)
```
The coerced map is cached on the executor as `runResourceLabels` so task
handlers can compute the delta without re-reading config or duplicating the
coercion logic. `coerceToStringMap` lives next to `Labels` (or as a
`static` helper on it) and applies `String.valueOf` to each value.
### 4. Compute and attach the per-task delta in `SeqeraTaskHandler.submit()`
```groovy
final taskLabels = coerceToStringMap(task.config.getResourceLabels())
final delta = deltaLabels(taskLabels, executor.runResourceLabels)
if( delta )
schedTask.labels(delta)
```
`deltaLabels(task, run)` returns a `Map<String,String>` containing entries
in `task` that are missing in `run` or whose value differs from the value
in `run`. When no entries remain, it returns `null` so the caller can omit the field.
The helper lives alongside `Labels` (e.g. `Labels.delta(task, run)`).
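One detail worth noting about the `if( delta )` guard: under Groovy truth, `null` and an empty map are both falsy, so the field is skipped whichever of the two the helper returns. A minimal check, with hypothetical variable names:

```groovy
// Groovy truth: null and an empty map both evaluate to false,
// so an `if( delta )` guard skips the field in either case.
Map<String,String> emptyDelta = [:]
Map<String,String> nullDelta = null
Map<String,String> nonEmpty = [region: 'us-east-1']
assert !emptyDelta
assert !nullDelta
assert nonEmpty   // a non-empty map is truthy
```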
### 5. `sched-client` dependency
- `plugins/nf-seqera/build.gradle`: bump `io.seqera:sched-client` to the
released version exposing `Task.labels` (≥ 0.51.0 once published).
- `settings.gradle`: add an `includeBuild '../sched'` block matching the
existing commented `includeBuild '../nextflow-plugin-gradle'` pattern, so
development against an unreleased sched is opt-in via uncommenting.
### 6. Tests (Spock)
- `LabelsTest`:
- `withProcessResourceLabels` merges entries, coerces non-String values,
no-ops on null/empty.
- `delta(task, run)` returns missing keys, returns differing keys, omits
matching keys, returns empty/null when fully covered.
- Existing `withUserLabels` assertions removed.
- `SeqeraExecutorTest`:
- `createRun` populates `CreateRunRequest.labels` with config-level
`process.resourceLabels` merged with auto-labels.
- `runResourceLabels` accessor returns the coerced baseline map.
- Removed assertions for `seqera.executor.labels`.
- `SeqeraTaskHandlerTest`:
- `submit` attaches `Task.labels` containing the delta when the task adds
new labels or overrides values.
- `submit` leaves `Task.labels` unset when the task labels equal the run
baseline.
- `ExecutorOptsTest`: remove `labels` parsing test.
### 7. Docs
- `docs/reference/process.md` (resourceLabels section, around line 1388):
add `` {ref}`seqera-executor` `` to the list of executors that support
`resourceLabels`.
- `plugins/nf-seqera/changelog.txt`: entry covering the new behaviour and
the removal of `seqera.executor.labels`.
## Out of scope
- No deprecation warning shim for `seqera.executor.labels` — the plugin is
early (0.17.0) and the user has approved removal.
- No changes to `seqera.executor.autoLabels` semantics or to the
`nextflow.io/*` / `seqera:sched:*` label namespaces.
- No changes to the sched-api or sched-client itself; the `Task.labels`
field is assumed already published in the version we depend on.