For bioinformatics discussion and collaboration, follow the WeChat official account @生信摸索.
Code repositories
https://jihulab.com/BioQuest/smkhss
https://github.com/BioQuestX/smkhss
GATK best practices workflow
Pipeline summary
Snakemake workflow for human somatic short variants (SNVs and indels)
Expected fastq inputs
Matched normal and tumor samples.
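As a minimal sketch of what "matched" means in practice, the function below pairs FASTQ files per patient. It assumes the naming scheme `<patient>.<N|T>.fastq.gz` inferred from the example files `P1.N.fastq.gz` and `P1.T.fastq.gz` listed under Outputs; the function name `pair_samples` is hypothetical, and the real workflow may instead rely on `config/samples.tsv`.

```python
from collections import defaultdict
from pathlib import Path

SUFFIX = ".fastq.gz"


def pair_samples(fastq_paths):
    """Group files named <patient>.<N|T>.fastq.gz into normal/tumor pairs."""
    by_patient = defaultdict(dict)
    for path in fastq_paths:
        name = Path(path).name
        if not name.endswith(SUFFIX):
            continue  # skip files that do not follow the assumed naming scheme
        patient, _, kind = name[: -len(SUFFIX)].rpartition(".")
        if patient and kind in ("N", "T"):
            by_patient[patient][kind] = str(path)
    # Keep only patients that have both a normal (N) and a tumor (T) sample.
    return {
        patient: files
        for patient, files in by_patient.items()
        if "N" in files and "T" in files
    }
```

Patients with only a tumor or only a normal file are dropped, since the calling steps of this workflow expect a matched pair.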
Reference
- Reference genome related files and GATK bundle files (GATK)
- VEP variation annotation files (VEP)
Prepare
- Adapter trimming (Fastp)
- Alignment (bwa-mem2)
- Mark duplicates (samblaster)
- Generate a recalibration table for Base Quality Score Recalibration (BaseRecalibrator)
- Apply base quality score recalibration (ApplyBQSR)
- Merge the CRAMs of each sample, respectively (Picard)
- Create CRAM index (samtools)
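The Prepare steps above can be sketched as an ordered list of shell command lines for one sample. This is an illustration, not the workflow's actual rules: the function name `prepare_commands`, the file names under `results/`, the `samtools sort` step, and the `PL:ILLUMINA` read-group field are all assumptions, and the Picard merge step is left out because the sketch handles a single FASTQ per sample.

```python
import shlex


def prepare_commands(sample, fastq, ref, known_sites, threads=4):
    """Return shell command lines for the Prepare steps of one sample, in order."""
    # All file names below are hypothetical; only the directories appear in the post.
    trimmed = f"results/trimmed/{sample}.fastq.gz"
    cram = f"results/prepared/{sample}.cram"
    table = f"results/prepared/{sample}.recal.table"
    recal = f"results/prepared/{sample}.bqsr.cram"
    # bwa-mem2 expects a literal "\t" in the read group; PL:ILLUMINA is an assumption.
    read_group = rf"@RG\tID:{sample}\tSM:{sample}\tPL:ILLUMINA"

    trim = ["fastp", "-i", fastq, "-o", trimmed,
            "--json", f"report/{sample}.fastp.json",
            "--html", f"report/{sample}.fastp.html",
            "--thread", str(threads)]
    align = ["bwa-mem2", "mem", "-t", str(threads), "-R", read_group, ref, trimmed]
    markdup = ["samblaster"]  # reads SAM on stdin, writes SAM on stdout
    # Coordinate sorting is assumed here; the post does not list it explicitly.
    sort = ["samtools", "sort", "-O", "cram", "--reference", ref, "-o", cram, "-"]

    recal_table = ["gatk", "BaseRecalibrator", "-R", ref, "-I", cram]
    for vcf in known_sites:
        recal_table += ["--known-sites", vcf]
    recal_table += ["-O", table]
    apply_bqsr = ["gatk", "ApplyBQSR", "-R", ref, "-I", cram,
                  "--bqsr-recal-file", table, "-O", recal]
    index = ["samtools", "index", recal]

    return [
        shlex.join(trim),
        " | ".join(shlex.join(cmd) for cmd in (align, markdup, sort)),
        shlex.join(recal_table),
        shlex.join(apply_bqsr),
        shlex.join(index),
    ]
```

Calling `prepare_commands("P1.T", "raw/P1.T.fastq.gz", "ref.fa", ["dbsnp.vcf.gz"])` yields five command lines; the reference and known-sites file names in that call are placeholders.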
Quality control report
- Fastp report (MultiQC)
- Alignment report (MultiQC)
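A small sketch of how one of these aggregated reports might be produced. The helper name `multiqc_command` and the `input_dir` argument are hypothetical; the output file name follows the `fastp_multiqc.html` and `prepare_multiqc.html` entries shown under Outputs.

```python
def multiqc_command(input_dir, name, outdir="report"):
    """Build a MultiQC command that writes <outdir>/<name>_multiqc.html."""
    return [
        "multiqc", input_dir,
        "--outdir", outdir,
        "--filename", f"{name}_multiqc.html",
        "--force",  # overwrite an existing report
    ]
```

With `name="fastp"`, MultiQC should also write its parsed data next to the report in a directory named `fastp_multiqc_data`, which would match the layout shown under Outputs.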
Call
- Call somatic SNVs and indels via local assembly of haplotypes (Mutect2)
- Tabulate pileup metrics for inferring contamination (GetPileupSummaries)
- Calculate the fraction of reads coming from cross-sample contamination (CalculateContamination)
- Get the maximum likelihood estimates of artifact prior probabilities in the orientation bias mixture model filter (LearnReadOrientationModel)
- Filter somatic SNVs and indels called by Mutect2 (FilterMutectCalls)
- Merge all the VCF files (Picard)
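The way the Call steps feed into each other can be sketched as follows. The function name `call_commands`, the output file names, and the resource arguments (`germline`, `common_sites`) are assumptions for illustration; the final Picard merge is omitted because the sketch does not scatter the calling over intervals.

```python
def call_commands(patient, tumor, normal, normal_name, ref,
                  germline, common_sites, intervals=None):
    """Return GATK argument lists for the Call steps of one tumor/normal pair."""
    out = f"results/called/{patient}"  # hypothetical output prefix
    unfiltered = f"{out}.unfiltered.vcf.gz"
    f1r2 = f"{out}.f1r2.tar.gz"
    priors = f"{out}.read-orientation-model.tar.gz"
    contamination = f"{out}.contamination.table"
    tumor_pileups = f"{out}.T.pileups.table"
    normal_pileups = f"{out}.N.pileups.table"

    # normal_name must equal the SM tag in the normal sample's read group.
    mutect2 = ["gatk", "Mutect2", "-R", ref, "-I", tumor, "-I", normal,
               "-normal", normal_name, "--germline-resource", germline,
               "--f1r2-tar-gz", f1r2, "-O", unfiltered]
    if intervals:
        mutect2 += ["-L", intervals]  # e.g. config/captured_regions.bed

    def pileup(cram, table):
        return ["gatk", "GetPileupSummaries", "-R", ref, "-I", cram,
                "-V", common_sites, "-L", common_sites, "-O", table]

    contam = ["gatk", "CalculateContamination", "-I", tumor_pileups,
              "-matched", normal_pileups, "-O", contamination]
    learn = ["gatk", "LearnReadOrientationModel", "-I", f1r2, "-O", priors]
    # Filtering consumes both the contamination table and the orientation priors.
    filt = ["gatk", "FilterMutectCalls", "-R", ref, "-V", unfiltered,
            "--contamination-table", contamination, "--ob-priors", priors,
            "-O", f"{out}.filtered.vcf.gz"]

    return [mutect2, pileup(tumor, tumor_pileups), pileup(normal, normal_pileups),
            contam, learn, filt]
```

The point of the sketch is the data flow: Mutect2 emits both the unfiltered calls and the F1R2 counts, the pileup tables feed CalculateContamination, and FilterMutectCalls combines all three results.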
Annotation
- Annotate variant calls with VEP (VEP)
Snakemake report
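A hedged sketch of an offline VEP invocation for this step. The helper name `vep_command`, the cache directory argument, and the default assembly `GRCh38` are assumptions; the statistics file name follows the `report/vep_report.html` entry shown under Outputs.

```python
def vep_command(vcf_in, vcf_out, cache_dir, assembly="GRCh38", forks=4):
    """Build a VEP command that annotates a VCF using a local cache."""
    return [
        "vep",
        "--input_file", vcf_in,
        "--output_file", vcf_out,
        "--vcf",                        # write VCF rather than VEP's default format
        "--compress_output", "bgzip",
        "--cache", "--offline",         # use the downloaded annotation files only
        "--dir_cache", cache_dir,
        "--species", "homo_sapiens",
        "--assembly", assembly,         # assumed; must match the reference genome
        "--stats_file", "report/vep_report.html",
        "--fork", str(forks),
        "--force_overwrite",
    ]
```

The assembly passed here has to agree with the reference genome used for alignment and with the downloaded cache, otherwise VEP cannot find matching annotation files.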
Outputs
├── config
│ ├── captured_regions.bed
│ ├── config.yaml
│ └── samples.tsv
├── dag.svg
├── logs
│ ├── annotate
│ ├── call
│ ├── prepare
│ ├── qc
│ ├── ref
│ └── trim
├── raw
│ ├── P1.N.fastq.gz
│ └── P1.T.fastq.gz
├── report
│ ├── fastp_multiqc_data
│ ├── fastp_multiqc.html
│ ├── P1.N.fastp.html
│ ├── P1.N.fastp.json
│ ├── P1.T.fastp.html
│ ├── P1.T.fastp.json
│ ├── prepare_multiqc_data
│ ├── prepare_multiqc.html
│ └── vep_report.html
├── results
│ ├── annotated
│ ├── called
│ ├── prepared
│ └── trimmed
└── workflow
  ├── envs
  ├── report
  ├── rules
  ├── schemas
  ├── scripts
  └── Snakefile
Directed Acyclic Graph
Reference
https://gatk.broadinstitute.org/hc/en-us/articles/360035894731-Somatic-short-variant-discovery-SNVs-Indels-