Tools

Aligners

Alignment guide

Name      URL                                         Read Size (bp)   Assembly Size   De Novo support?
Bowtie    http://bowtie-bio.sourceforge.net           35               whole-genome    ???
Bowtie2   http://bowtie-bio.sourceforge.net/bowtie2   50-100+          whole-genome    ???
BWA       http://bio-bwa.sourceforge.net              70-100+          ???             ???
maq       ???                                         ???              ???             ???
scalpel   http://scalpel.sourceforge.net              ???              ???             ???

TODO

Missense Analysis

TODO

Splice Analysis

Splice Site

TODO

Enhancers / repressors

TODO

Sequence tools

Command line tools

  • Bedtools
  • Samtools
  • wgsim
    • simulates NGS by generating artificial short reads from a reference sequence.
    • $C = (N * L) / G$
      • C = average coverage depth
      • N = number of reads
      • L = length of reads
      • G = size of genome
      • Therefore:
        • $N = (C * G) / L$
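
As a worked example of the coverage formula above, here is a minimal Python sketch. The function names `reads_needed` and `coverage_depth` are made up for illustration, and the genome size, read length, and target coverage are arbitrary example values, not recommendations.

```python
import math


def reads_needed(coverage, genome_size, read_length):
    """N = (C * G) / L, rounded up to a whole number of reads."""
    return math.ceil(coverage * genome_size / read_length)


def coverage_depth(n_reads, read_length, genome_size):
    """C = (N * L) / G."""
    return n_reads * read_length / genome_size


# Example values (assumed): 30x coverage of a 5 Mbp genome with 100 bp reads.
n = reads_needed(coverage=30, genome_size=5_000_000, read_length=100)
print(n)                                   # 1500000
print(coverage_depth(n, 100, 5_000_000))   # 30.0
```

Note that wgsim's `-N` option counts read pairs rather than single reads, so for paired-end simulation the value passed to `-N` would be roughly half of the N computed here (assuming both mates have length L).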

Python utilities

  • pyfaidx
    • reads and edits FASTA files quickly, without loading the whole file, by using samtools-style .fai indexing.
    • simple to use in interactive Python interpreter
> from pyfaidx import Fasta
> genes = Fasta(<filename>, mutable=True)     # mutable switch edits file in place
> genes[<seqID>][<position>]                  # prints single nucleotide at 0-indexed <position>
> genes[<seqID>][<position>] = "T"            # replaces nucleotide with T
> genes[<seqID>][<start>:<end>]               # prints a range of nucleotides. Range includes start, but excludes end.
> genes[<seqID>][10001:10003]                 # prints 2 nucleotides, positions 10001 & 10002
> genes[<seqID>][10001:10003] = "TT"          # replaces those nucleotides with TT. Must conserve total length though, cannot do indels this way.
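
To illustrate why fai-style indexing avoids reading the whole file, here is a rough pure-Python sketch of the idea. It is not pyfaidx's actual implementation: `build_fai` and `fetch` are hypothetical helper names, and the demo sequences are made up. The index stores, per sequence, its length, the byte offset of its first base, the bases per line, and the bytes per line (including the newline), which is enough to compute where any base sits in the file and seek straight to it.

```python
import os
import tempfile


def build_fai(path):
    """Return {name: (length, offset, linebases, linewidth)} for a FASTA file."""
    index = {}
    name = None
    with open(path, "rb") as fh:
        while True:
            line = fh.readline()
            if not line:
                break
            if line.startswith(b">"):
                name = line[1:].split()[0].decode()
                # offset = byte position of the first base of this record
                index[name] = [0, fh.tell(), 0, 0]
            elif name is not None:
                bases = len(line.rstrip(b"\r\n"))
                entry = index[name]
                if bases and entry[2] == 0:
                    entry[2] = bases       # bases per line
                    entry[3] = len(line)   # bytes per line, newline included
                entry[0] += bases
    return {k: tuple(v) for k, v in index.items()}


def fetch(path, index, name, start, end):
    """Return bases [start, end) of `name`: 0-indexed, end excluded."""
    length, offset, linebases, linewidth = index[name]
    end = min(end, length)
    if start >= end:
        return ""
    # Byte position of base i: offset + (i // linebases) * linewidth + i % linebases
    first = offset + (start // linebases) * linewidth + start % linebases
    last = offset + (end // linebases) * linewidth + end % linebases
    with open(path, "rb") as fh:
        fh.seek(first)                     # jump directly, no full scan
        raw = fh.read(last - first)
    return raw.replace(b"\n", b"").replace(b"\r", b"").decode()


# Demo on a throwaway FASTA file (made-up sequences).
demo = b">seq1 demo\nACGT\nTTGA\nCC\n>seq2\nGGGG\n"
with tempfile.NamedTemporaryFile("wb", suffix=".fa", delete=False) as tmp:
    tmp.write(demo)
try:
    index = build_fai(tmp.name)
    print(index["seq1"])                        # (10, 11, 4, 5)
    print(fetch(tmp.name, index, "seq1", 3, 6)) # TTT
finally:
    os.remove(tmp.name)
```

The same offset arithmetic also suggests why in-place edits must conserve length: overwriting bytes at a computed position leaves every other offset valid, whereas an insertion or deletion would shift everything after it and invalidate the index.
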
Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License