Basic Pipeline

alignment

  • build index (if needed - how to check?)
$ bowtie2-build -f <reference.fa>
  • align reads
$ bowtie2 -p <num_cores> \
          -x <index> \
          -U <unpaired_reads.fq> \
          -S <output.sam>
$ samtools view -@ <num_cores> \
                -b <input.sam> \
                -o <output.bam>
  • sort BAM
$ samtools sort -@ <num_cores> <input.bam> <output.sorted.bam>
  • index BAM
$ samtools index <input.sorted.bam>

Call variants

  • filter by BED (optional)
  • call using Freebayes
    • quite sensitive
    • actual variants will have quite high score
    • possibly parallelisable, using freebayes-parallel.sh
$ freebayes -f <ref.fa> <alignment.sorted.bam> > <output.vcf>

Filter results

  • Apply a simple filter to the results - remove all variants with a QUAL scores less than 30
  • 2-step process:
    • apply filter to VCF file - does not remove any lines
$ cat <variants.vcf> | vcf-annotate -f Q=30 > <variants.filtered.vcf>
  • Can then remove all failing variants
$ grep -E "^#" <variants.filtered.vcf> > > <final.vcf>     # select lines starting with '#' - VCF header lines
$ grep "PASS" <variants.filtered.vcf> >> <final.vcf>       # vcf-annotate adds "PASS" to FILTER column of variants scoring above threshold
                                                           # the double redirect ">>" appends an existing file.
Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License