This pipeline setup uses nf-core, which is built on Nextflow and automates the upstream steps of transcriptome (RNA-seq) analysis.

Installation

Download the latest release: https://github.com/nextflow-io/nextflow/releases

When I installed it, the latest release was nextflow-21.04.0-edge-all

wget https://github.com/nextflow-io/nextflow/releases/download/v21.04.0-edge/nextflow-21.04.0-edge-all
mv nextflow-21.04.0-edge-all nextflow
chmod +x nextflow

Install nf-core rnaseq

You can either clone it with Git or download an archive and extract it into the pipeline directory.

Website: https://nf-co.re/rnaseq

GitHub: https://github.com/nf-core/rnaseq
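The Git route can be sketched as below. The repository URL is the one listed above; the helper name `clone_rnaseq` and the default target directory `rnaseq` are assumptions made for this example (the directory name was chosen to match the `../../rnaseq` path used in the launch command).

```shell
# Hypothetical helper: clone nf-core/rnaseq into a target directory,
# skipping the download if a Git checkout is already there.
clone_rnaseq() {
    target_dir="${1:-rnaseq}"
    if [ -d "$target_dir/.git" ]; then
        echo "already present: $target_dir"
        return 0
    fi
    git clone https://github.com/nf-core/rnaseq.git "$target_dir"
}
```

Running `clone_rnaseq rnaseq` from the pipeline directory should leave the checkout in `./rnaseq`; a second call only reports that the directory is already present.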


#### Install the AWS CLI

```shell
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
sudo ./aws/install
```

Download the reference genome

aws s3 --no-sign-request --region eu-west-1 sync s3://ngi-igenomes/igenomes/Homo_sapiens/Ensembl/GRCh37/Annotation/Genes/ ./references/Homo_sapiens/Ensembl/GRCh37/Annotation/Genes/ --exclude "*" --include "genes.gtf"
aws s3 --no-sign-request --region eu-west-1 sync s3://ngi-igenomes/igenomes/Homo_sapiens/Ensembl/GRCh37/Sequence/WholeGenomeFasta/ ./references/Homo_sapiens/Ensembl/GRCh37/Sequence/WholeGenomeFasta/
aws s3 --no-sign-request --region eu-west-1 sync s3://ngi-igenomes/igenomes/Homo_sapiens/Ensembl/GRCh37/Sequence/STARIndex/ ./references/Homo_sapiens/Ensembl/GRCh37/Sequence/STARIndex/
aws s3 --no-sign-request --region eu-west-1 sync s3://ngi-igenomes/igenomes/Homo_sapiens/Ensembl/GRCh37/Sequence/BWAIndex/ ./references/Homo_sapiens/Ensembl/GRCh37/Sequence/BWAIndex/
aws s3 --no-sign-request --region eu-west-1 sync s3://ngi-igenomes/igenomes/Homo_sapiens/Ensembl/GRCh37/Sequence/Bowtie2Index/ ./references/Homo_sapiens/Ensembl/GRCh37/Sequence/Bowtie2Index/
aws s3 --no-sign-request --region eu-west-1 sync s3://ngi-igenomes/igenomes/Homo_sapiens/Ensembl/GRCh37/Annotation/Genes/ ./references/Homo_sapiens/Ensembl/GRCh37/Annotation/Genes/ --exclude "*" --include "genes.bed"
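The commands above differ only in the sub-path below `GRCh37`, so they can be generated in a loop. The following is a minimal sketch, not a tested replacement: the bucket path, region and flags are copied from the commands above, while `S3_BASE`, `LOCAL_BASE`, `DRY_RUN` and `sync_igenomes` are names made up for this example. It also merges the two `Annotation/Genes` calls into one by passing `--include` twice. With `DRY_RUN=1` (the default here) it only prints the commands, without the original quoting.

```shell
# Assumed variable names; the values come from the commands above.
S3_BASE="s3://ngi-igenomes/igenomes/Homo_sapiens/Ensembl/GRCh37"
LOCAL_BASE="${LOCAL_BASE:-./references/Homo_sapiens/Ensembl/GRCh37}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands only, 0 = run them

# Hypothetical helper: sync one sub-path; extra arguments are passed to aws.
sync_igenomes() {
    subpath="$1"
    shift
    # Rebuild the positional parameters as the full aws command line.
    set -- aws s3 --no-sign-request --region eu-west-1 sync \
        "$S3_BASE/$subpath/" "$LOCAL_BASE/$subpath/" "$@"
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Annotation: fetch only genes.gtf and genes.bed.
sync_igenomes Annotation/Genes --exclude "*" --include "genes.gtf" --include "genes.bed"

# Genome FASTA and prebuilt aligner indices.
for sub in Sequence/WholeGenomeFasta Sequence/STARIndex Sequence/BWAIndex Sequence/Bowtie2Index; do
    sync_igenomes "$sub"
done
```

Setting `DRY_RUN=0` in the environment before running the script would execute the downloads instead of printing them.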

Reference: https://ewels.github.io/AWS-iGenomes/

Test data

Data source: GSE101571

Build the sample sheet for the test data, rnaseq-test.csv

group,replicate,fastq_1,fastq_2,strandedness
2cell,1,/data/baimoc/data/rnaseq-test/SRR5837392_1.fastq.gz,/data/baimoc/data/rnaseq-test/SRR5837392_2.fastq.gz,unstranded
2cell,1,/data/baimoc/data/rnaseq-test/SRR5837393_1.fastq.gz,/data/baimoc/data/rnaseq-test/SRR5837393_2.fastq.gz,unstranded
8cell,1,/data/baimoc/data/rnaseq-test/SRR5837402_1.fastq.gz,/data/baimoc/data/rnaseq-test/SRR5837402_2.fastq.gz,unstranded
8cell,1,/data/baimoc/data/rnaseq-test/SRR5837403_1.fastq.gz,/data/baimoc/data/rnaseq-test/SRR5837403_2.fastq.gz,unstranded
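A sheet like this is easy to get subtly wrong, so a quick format check before launching can save a failed run. The sketch below is an assumption-laden example rather than the pipeline's own validation: `check_samplesheet` is a hypothetical helper, the expected header is the one shown above, and the accepted strandedness values (`unstranded`, `forward`, `reverse`) are what I understand the pipeline to accept. It also prints a note when a `group,replicate` pair repeats, because as far as I understand the sample sheet format used here, such rows are treated as re-sequencing runs of the same sample and merged; if the rows are meant to be separate replicates, the replicate numbers should differ.

```shell
# Hypothetical helper: basic format check for a sample sheet.
# Usage: check_samplesheet /data/baimoc/data/rnaseq-test/rnaseq-test.csv
check_samplesheet() {
    awk -F',' '
        NR == 1 {
            if ($0 != "group,replicate,fastq_1,fastq_2,strandedness") {
                print "line 1: unexpected header"; bad = 1
            }
            next
        }
        /^[ \t]*$/ { next }   # ignore blank lines
        NF != 5 {
            print "line " NR ": expected 5 columns, got " NF; bad = 1; next
        }
        {
            key = $1 "," $2
            if (key in seen)
                print "note: line " NR " repeats group,replicate " key " (first on line " seen[key] ")"
            else
                seen[key] = NR
        }
        $2 !~ /^[0-9]+$/ { print "line " NR ": replicate is not an integer"; bad = 1 }
        $3 !~ /\.f(ast)?q\.gz$/ { print "line " NR ": fastq_1 is not a .fastq.gz/.fq.gz path"; bad = 1 }
        $4 != "" && $4 !~ /\.f(ast)?q\.gz$/ { print "line " NR ": fastq_2 is not a .fastq.gz/.fq.gz path"; bad = 1 }
        $5 !~ /^(unstranded|forward|reverse)$/ { print "line " NR ": unknown strandedness " $5; bad = 1 }
        END { exit (bad ? 1 : 0) }
    ' "$1"
}
```

The function exits with status 0 when only notes (or nothing) are printed and with status 1 when at least one line fails a check.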

Project directory

(Image: project directory layout)

Launch the pipeline

../../nextflow run ../../rnaseq --input /data/baimoc/data/rnaseq-test/rnaseq-test.csv --genome GRCh37 --igenomes_base /data/baimoc/references/ -profile docker
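Because a run takes a while to fail, it can help to confirm beforehand that every FASTQ file named in the sample sheet actually exists. This is a small sketch under the assumption that the sheet has the five-column layout shown earlier; `missing_fastqs` is a hypothetical helper, not part of Nextflow or nf-core.

```shell
# Hypothetical helper: print every fastq_1/fastq_2 path in the sheet
# that does not exist on disk. Prints nothing if all files are present.
missing_fastqs() {
    # Skip the header line, then split each row on commas.
    tail -n +2 "$1" | while IFS=, read -r group replicate fq1 fq2 strand || [ -n "$group" ]; do
        for fq in "$fq1" "$fq2"; do
            if [ -n "$fq" ] && [ ! -e "$fq" ]; then
                echo "$fq"
            fi
        done
    done
}
```

Calling `missing_fastqs /data/baimoc/data/rnaseq-test/rnaseq-test.csv` before the launch command lists any paths that need fixing; empty output means all listed files were found.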