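This analysis pipeline is built with nf-core, which is based on Nextflow and automates upstream transcriptome (RNA-seq) analysis.
Installation
Download the latest release: https://github.com/nextflow-io/nextflow/releases
When I installed it, the latest release was nextflow-21.04.0-edge-all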
wget https://github.com/nextflow-io/nextflow/releases/download/v21.04.0-edge/nextflow-21.04.0-edge-all
mv nextflow-21.04.0-edge-all nextflow
chmod +x nextflow  # make the launcher executable so it can be called directly
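Install nf-core rnaseq
You can either use git clone or download the archive and unpack it into the pipeline directory.
Website: https://nf-co.re/rnaseq
GitHub: https://github.com/nf-core/rnaseq
Download the reference genome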
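For the git clone route, a minimal sketch might look like the following. The helper name `fetch_rnaseq` is hypothetical, and the default target directory `rnaseq` is an assumption chosen to match the `../../rnaseq` path used when launching the pipeline later in this post.

```shell
# Hypothetical helper: clone nf-core/rnaseq unless it is already there.
# Usage: fetch_rnaseq [target_dir]   (target_dir defaults to "rnaseq")
fetch_rnaseq() {
    dest="${1:-rnaseq}"
    if [ -d "$dest/.git" ]; then
        # An existing clone is left untouched.
        echo "already present: $dest"
    else
        git clone https://github.com/nf-core/rnaseq.git "$dest"
    fi
}
```

Calling `fetch_rnaseq` in the pipeline directory then either clones the repository or reports that a clone already exists.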
aws s3 --no-sign-request --region eu-west-1 sync s3://ngi-igenomes/igenomes/Homo_sapiens/Ensembl/GRCh37/Annotation/Genes/ ./references/Homo_sapiens/Ensembl/GRCh37/Annotation/Genes/ --exclude "*" --include "genes.gtf"
aws s3 --no-sign-request --region eu-west-1 sync s3://ngi-igenomes/igenomes/Homo_sapiens/Ensembl/GRCh37/Sequence/WholeGenomeFasta/ ./references/Homo_sapiens/Ensembl/GRCh37/Sequence/WholeGenomeFasta/
aws s3 --no-sign-request --region eu-west-1 sync s3://ngi-igenomes/igenomes/Homo_sapiens/Ensembl/GRCh37/Sequence/STARIndex/ ./references/Homo_sapiens/Ensembl/GRCh37/Sequence/STARIndex/
aws s3 --no-sign-request --region eu-west-1 sync s3://ngi-igenomes/igenomes/Homo_sapiens/Ensembl/GRCh37/Sequence/BWAIndex/ ./references/Homo_sapiens/Ensembl/GRCh37/Sequence/BWAIndex/
aws s3 --no-sign-request --region eu-west-1 sync s3://ngi-igenomes/igenomes/Homo_sapiens/Ensembl/GRCh37/Sequence/Bowtie2Index/ ./references/Homo_sapiens/Ensembl/GRCh37/Sequence/Bowtie2Index/
aws s3 --no-sign-request --region eu-west-1 sync s3://ngi-igenomes/igenomes/Homo_sapiens/Ensembl/GRCh37/Annotation/Genes/ ./references/Homo_sapiens/Ensembl/GRCh37/Annotation/Genes/ --exclude "*" --include "genes.bed"
https://ewels.github.io/AWS-iGenomes/
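Since the sync commands fetch several directories, it can help to confirm that everything arrived before starting a run. The following is only a sketch: `check_references` is a hypothetical helper, and the paths it checks are simply the sync targets used above under `./references`.

```shell
# Hypothetical helper: report which of the downloaded GRCh37 reference
# paths exist below a base directory (default: ./references).
# Returns non-zero if any of them is missing.
check_references() {
    base="${1:-./references}/Homo_sapiens/Ensembl/GRCh37"
    rc=0
    for path in \
        Annotation/Genes/genes.gtf \
        Annotation/Genes/genes.bed \
        Sequence/WholeGenomeFasta \
        Sequence/STARIndex \
        Sequence/BWAIndex \
        Sequence/Bowtie2Index
    do
        if [ -e "$base/$path" ]; then
            echo "ok      $path"
        else
            echo "missing $path"
            rc=1
        fi
    done
    return "$rc"
}
```

Running `check_references ./references` from the directory where the downloads were started prints one line per expected path. It only tests for existence, so it will not catch a partially synced index.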
Test data
Data source: GSE101571
Build the sample sheet for the test data, rnaseq-test.csv
group,replicate,fastq_1,fastq_2,strandedness
2cell,1,/data/baimoc/data/rnaseq-test/SRR5837392_1.fastq.gz,/data/baimoc/data/rnaseq-test/SRR5837392_2.fastq.gz,unstranded
2cell,1,/data/baimoc/data/rnaseq-test/SRR5837393_1.fastq.gz,/data/baimoc/data/rnaseq-test/SRR5837393_2.fastq.gz,unstranded
8cell,1,/data/baimoc/data/rnaseq-test/SRR5837402_1.fastq.gz,/data/baimoc/data/rnaseq-test/SRR5837402_2.fastq.gz,unstranded
8cell,1,/data/baimoc/data/rnaseq-test/SRR5837403_1.fastq.gz,/data/baimoc/data/rnaseq-test/SRR5837403_2.fastq.gz,unstranded
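Before launching, a quick sanity check of the sample sheet can save a failed run. The sketch below uses two hypothetical helpers, `duplicate_samples` and `missing_fastq`, and assumes the five-column layout shown above with no commas inside the paths. As far as I understand the pipeline documentation for this column layout, rows that share the same group and replicate are treated as repeated sequencing runs of one sample and merged. That is an assumption worth verifying against the pipeline version in use, and the first helper makes such rows visible.

```shell
# Hypothetical helper: list (group,replicate) pairs that appear on more
# than one row, followed by the number of rows. The header is skipped.
duplicate_samples() {
    awk -F',' 'NR > 1 { count[$1 "," $2]++ }
               END { for (k in count) if (count[k] > 1) print k, count[k] }' "$1" | sort
}

# Hypothetical helper: print every FASTQ path from columns 3 and 4
# that does not exist on disk. An empty fastq_2 field is ignored.
missing_fastq() {
    awk -F',' 'NR > 1 { print $3; if ($4 != "") print $4 }' "$1" |
    while IFS= read -r path; do
        [ -e "$path" ] || echo "missing: $path"
    done
}
```

For the sheet above, `duplicate_samples rnaseq-test.csv` would print `2cell,1 2` and `8cell,1 2`, because both rows in each group use replicate 1. If the two runs in a group are separate biological replicates rather than re-sequencing of one sample, numbering them 1 and 2 may be what is intended.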
Project directory
Launch the pipeline
../../nextflow run ../../rnaseq --input /data/baimoc/data/rnaseq-test/rnaseq-test.csv --genome GRCh37 --igenomes_base /data/baimoc/references/ -profile docker