For bioinformatics discussion and collaboration, follow the WeChat official account @生信摸索
Code repository
https://jihulab.com/BioQuest/smkhss
https://github.com/BioQuestX/smkhss
GATK best practices workflow
Pipeline summary
Snakemake workflow for human somatic short variants (SNVs and indels)
Expected fastq inputs
Matched normal and tumor samples.
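As a rough illustration of what "matched" means in practice, the sketch below groups FASTQ files into one normal/tumor pair per patient. It assumes the `<patient>.<N|T>.fastq.gz` naming seen in the `raw` directory of the Outputs listing (`P1.N.fastq.gz`, `P1.T.fastq.gz`). The function name `pair_samples` is hypothetical, and the real workflow presumably reads its samples from `config/samples.tsv`, whose format is not shown here.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable

# Assumed naming scheme: <patient>.<N|T>.fastq.gz (N = normal, T = tumor).
FASTQ_NAME = re.compile(r"^(?P<patient>[^.]+)\.(?P<kind>[NT])\.fastq\.gz$")


def pair_samples(filenames: Iterable[str]) -> Dict[str, Dict[str, str]]:
    """Group FASTQ file names into {patient: {"N": file, "T": file}}."""
    pairs: Dict[str, Dict[str, str]] = defaultdict(dict)
    for name in filenames:
        match = FASTQ_NAME.match(name)
        if match is None:
            raise ValueError(f"unexpected FASTQ name: {name}")
        pairs[match["patient"]][match["kind"]] = name

    # Somatic calling needs both members of each pair.
    incomplete = sorted(p for p, kinds in pairs.items() if set(kinds) != {"N", "T"})
    if incomplete:
        raise ValueError(f"patients without a matched pair: {incomplete}")
    return dict(pairs)
```

Calling `pair_samples(["P1.N.fastq.gz", "P1.T.fastq.gz"])` returns a single entry for `P1`, while a tumor file with no normal counterpart raises an error rather than silently proceeding to tumor-only calling.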
Reference
- Reference genome related files and GATK bundle files (GATK)
- VEP variation annotation files (VEP)
Prepare
- Adapter trimming (Fastp)
- Read alignment (bwa-mem2)
- Mark duplicates (samblaster)
- Generate a recalibration table for Base Quality Score Recalibration (BaseRecalibrator)
- Apply base quality score recalibration (ApplyBQSR)
- Merge the CRAM files of each sample separately (Picard)
- Create CRAM index (samtools)
Quality control report
- Fastp report (MultiQC)
- Alignment report (MultiQC)
Call
- Call somatic SNVs and indels via local assembly of haplotypes (Mutect2)
- Tabulate pileup metrics for inferring contamination (GetPileupSummaries)
- Calculate the fraction of reads coming from cross-sample contamination (CalculateContamination)
- Get the maximum likelihood estimates of artifact prior probabilities in the orientation bias mixture model filter (LearnReadOrientationModel)
- Filter somatic SNVs and indels called by Mutect2 (FilterMutectCalls)
- Merge all the VCF files (Picard)
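To make the data flow between the calling steps concrete, here is a minimal sketch that builds (but does not run) one GATK argument list per step for a single tumor/normal pair. The tool names and flags follow the public GATK somatic short variant tutorial. Every file name, the `<patient>.N` sample name passed to `-normal`, and the function name `somatic_call_commands` are hypothetical placeholders; the repository's actual rules may split the work by interval and pass additional resources. The final Picard merge step is left out.

```python
from typing import Dict, List


def somatic_call_commands(patient: str, ref: str, common_vcf: str) -> Dict[str, List[str]]:
    """Build (without running) one GATK argument list per Call step.

    All file names and the "<patient>.N" sample name are hypothetical.
    """
    tumor, normal = f"{patient}.T.cram", f"{patient}.N.cram"
    raw_vcf = f"{patient}.unfiltered.vcf.gz"
    f1r2 = f"{patient}.f1r2.tar.gz"
    priors = f"{patient}.read-orientation-model.tar.gz"
    tumor_pileups = f"{patient}.T.pileups.table"
    normal_pileups = f"{patient}.N.pileups.table"
    contamination = f"{patient}.contamination.table"

    def pileup(cram: str, table: str) -> List[str]:
        # Tabulate read support at common germline sites for one sample.
        return ["gatk", "GetPileupSummaries", "-R", ref, "-I", cram,
                "-V", common_vcf, "-L", common_vcf, "-O", table]

    return {
        # Call candidate variants and collect F1R2 counts for the
        # orientation bias model in the same pass.
        "Mutect2": ["gatk", "Mutect2", "-R", ref, "-I", tumor, "-I", normal,
                    "-normal", f"{patient}.N",  # assumed SM tag of the normal
                    "--f1r2-tar-gz", f1r2, "-O", raw_vcf],
        "GetPileupSummaries (tumor)": pileup(tumor, tumor_pileups),
        "GetPileupSummaries (normal)": pileup(normal, normal_pileups),
        # Estimate cross-sample contamination from the two pileup tables.
        "CalculateContamination": ["gatk", "CalculateContamination",
                                   "-I", tumor_pileups,
                                   "-matched", normal_pileups,
                                   "-O", contamination],
        # Learn artifact priors from the F1R2 counts written by Mutect2.
        "LearnReadOrientationModel": ["gatk", "LearnReadOrientationModel",
                                      "-I", f1r2, "-O", priors],
        # Filtering consumes the outputs of all previous steps.
        "FilterMutectCalls": ["gatk", "FilterMutectCalls", "-R", ref,
                              "-V", raw_vcf,
                              "--contamination-table", contamination,
                              "--ob-priors", priors,
                              "-O", f"{patient}.filtered.vcf.gz"],
    }
```

Printing each list with `shlex.join` gives copy-pasteable commands. The point of the sketch is the wiring: FilterMutectCalls takes the contamination table and the orientation priors as inputs, so it can only run once the other steps have finished.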
Annotation
- Annotate variant calls with VEP (VEP)
Snakemake Report
Outputs
├── config
│   ├── captured_regions.bed
│   ├── config.yaml
│   └── samples.tsv
├── dag.svg
├── logs
│   ├── annotate
│   ├── call
│   ├── prepare
│   ├── qc
│   ├── ref
│   └── trim
├── raw
│   ├── P1.N.fastq.gz
│   └── P1.T.fastq.gz
├── report
│   ├── fastp_multiqc_data
│   ├── fastp_multiqc.html
│   ├── P1.N.fastp.html
│   ├── P1.N.fastp.json
│   ├── P1.T.fastp.html
│   ├── P1.T.fastp.json
│   ├── prepare_multiqc_data
│   ├── prepare_multiqc.html
│   └── vep_report.html
├── results
│   ├── annotated
│   ├── called
│   ├── prepared
│   └── trimmed
└── workflow
    ├── envs
    ├── report
    ├── rules
    ├── schemas
    ├── scripts
    └── Snakefile
Directed Acyclic Graph
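The idea behind the graph can be sketched with Python's standard `graphlib` module: each stage lists the stages it depends on, and a topological sort yields a valid execution order. The stage names are taken from the `logs` subdirectories in the Outputs listing. The dependency edges are assumptions inferred from the pipeline summary, not read from the actual `Snakefile`, and `execution_order` is a hypothetical helper name.

```python
from graphlib import TopologicalSorter  # Python 3.9+
from typing import Dict, List, Set

# stage -> stages it depends on (edges are assumed for illustration)
STAGE_DEPENDENCIES: Dict[str, Set[str]] = {
    "ref": set(),                 # reference genome, GATK bundle, VEP files
    "trim": set(),                # adapter trimming of the raw FASTQ files
    "prepare": {"ref", "trim"},   # alignment, duplicate marking, BQSR
    "qc": {"trim", "prepare"},    # MultiQC reports
    "call": {"ref", "prepare"},   # Mutect2 and filtering
    "annotate": {"ref", "call"},  # VEP annotation
}


def execution_order(dependencies: Dict[str, Set[str]]) -> List[str]:
    """Return one valid order in which the stages can run."""
    # static_order() raises graphlib.CycleError if the graph is not acyclic.
    return list(TopologicalSorter(dependencies).static_order())
```

Several orders are valid for this graph (for example, `qc` may run before or after `call`), which is what allows a workflow engine to schedule independent jobs in parallel.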
Reference
https://gatk.broadinstitute.org/hc/en-us/articles/360035894731-Somatic-short-variant-discovery-SNVs-Indels-