add nextflow d30e48d
99
nextflow/docs/amazons3.md
Normal file
@@ -0,0 +1,99 @@
(amazons3-page)=

# Amazon S3

Nextflow includes support for AWS S3 storage. Files stored in an S3 bucket can be accessed transparently in your pipeline script like any other file in the local file system.

## S3 path

To access an S3 file, prefix the file path with the `s3` scheme and the name of the bucket where it is stored.

For example, the file `/data/sequences.fa` stored in a bucket named `my-bucket` can be accessed using the following fully qualified path:

```
s3://my-bucket/data/sequences.fa
```

The usual file operations can be applied to a path handle with the above notation. For example, the content of an S3 file can be printed as follows:

```nextflow
println file('s3://my-bucket/data/sequences.fa').text
```

See {ref}`working-with-files` and the {ref}`stdlib-types-path` reference to learn more about available file operations.
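
Other path operations follow the same pattern. The sketch below is illustrative only: it reuses the hypothetical `my-bucket` bucket from the example above, assumes that credentials are already configured, and assumes that the bucket holds several `.fa` files under `data/`.

```nextflow
workflow {
    // Hypothetical object; replace with a path in a bucket you can read
    def fasta = file('s3://my-bucket/data/sequences.fa')

    // Path attributes and checks work as they do for local files
    println "name: ${fasta.name}"
    println "exists: ${fasta.exists()}"

    // Glob patterns are also accepted when creating a channel from S3 paths
    channel.fromPath('s3://my-bucket/data/*.fa').view()
}
```

The script never mentions credentials or a region; those are resolved from the configuration and environment.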

## Security credentials

AWS access credentials can be provided in two ways:

1. Using AWS access and secret keys in your pipeline configuration.
2. Using IAM roles to grant access to S3 storage on AWS EC2 instances.

### AWS access and secret keys

The AWS access and secret keys can be specified by using the `aws` section in the `nextflow.config` configuration file as shown below:

```groovy
aws {
    accessKey = '<Your AWS access key>'
    secretKey = '<Your AWS secret key>'
    region = '<AWS region identifier>'
}
```

If the access credentials are not found in the above file, Nextflow looks for AWS credentials in the following order:

1. The `nextflow.config` file in the pipeline execution directory
2. The environment variables `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`
3. The environment variables `AWS_ACCESS_KEY` and `AWS_SECRET_KEY`
4. The profile in the AWS credentials file located at `~/.aws/credentials`
   - Uses the `default` profile or the environment variable `AWS_PROFILE` if set
5. The profile in the AWS client configuration file located at `~/.aws/config`
   - Uses the `default` profile or the environment variable `AWS_PROFILE` if set
6. The temporary AWS credentials provided by an IAM instance role. See the [IAM Roles](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html) documentation for details.

More information about [AWS Security Credentials](http://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html) is available in the AWS documentation.
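
To select a named profile from the credentials file instead of embedding keys in the configuration, a minimal sketch could look like the following. It assumes a Nextflow version that supports the `aws.profile` configuration option; `my-profile` and `eu-west-1` are placeholder values, not defaults.

```groovy
aws {
    // Hypothetical profile name; it must be defined in ~/.aws/credentials or ~/.aws/config
    profile = 'my-profile'
    // Placeholder region; use the region that hosts your bucket
    region = 'eu-west-1'
}
```

This keeps the access and secret keys out of `nextflow.config`, which is useful when the configuration file is committed to version control.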

### IAM roles with AWS EC2 instances

When running your pipeline on an EC2 instance, IAM roles can be used to grant access to AWS resources.

In this scenario, you only need to launch the EC2 instance with an IAM role that includes the `AmazonS3FullAccess` policy. Nextflow will detect and automatically acquire the permission to access S3 storage, without any further configuration.

Learn more about [Using IAM Roles to Delegate Permissions to Applications that Run on AWS EC2](http://docs.aws.amazon.com/IAM/latest/UserGuide/roles-usingrole-ec2instance.html) in the AWS documentation.

## China regions

To use an AWS China region, specify the corresponding AWS S3 API endpoint in the Nextflow configuration file as shown below:

```groovy
aws {
    client {
        endpoint = "https://s3.cn-north-1.amazonaws.com.cn"
    }
}
```

Read more about AWS API endpoints in the [AWS documentation](https://docs.aws.amazon.com/general/latest/gr/s3.html).

## S3-compatible storage

To use S3-compatible object storage such as [Ceph](https://ceph.io) or [Minio](https://min.io), specify the endpoint of
your storage provider and enable [S3 path-style access](https://docs.aws.amazon.com/AmazonS3/latest/userguide/VirtualHosting.html#path-style-access)
in your Nextflow configuration as shown below:

```groovy
aws {
    accessKey = '<Your access key>'
    secretKey = '<Your secret key>'
    client {
        endpoint = '<Your storage endpoint URL>'
        s3PathStyleAccess = true
    }
}
```

## Advanced configuration

Read the {ref}`AWS configuration<config-aws>` section to learn more about advanced S3 client configuration options.