Run multiple sequence alignment
Now, let’s begin the phylogenetic analysis by performing Multiple Sequence Alignment (MSA) on our SARS-CoV-2 sequences. Recall that, like human genomes, viral genomes “evolve” (i.e. mutate) as they replicate. Common mutations include substitutions, insertions, and deletions. Given a set of viral sequences, each of which differs from the sequence of the common ancestor by a series of mutations, our first job is to “line up” the sequences so that each position in the resulting “alignment” corresponds to the same position in the sequence of the common ancestor.
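To make the idea of “lining up” concrete, here is a minimal Python sketch on toy data. The ancestor sequence, the three descendant sequences, and the hand-made alignment are all invented for illustration (they are not SARS-CoV-2 data), and `differences` is a hypothetical helper, not part of any alignment tool. It only shows that once a gap character (`-`) is placed where a deletion occurred, every column can be compared position by position.

```python
# Toy data, invented for illustration: a hypothetical ancestor and three descendants.
ancestor = "ACGTACGT"

unaligned = {
    "seq_sub": "ACGTTCGT",   # one substitution relative to the ancestor
    "seq_del": "ACGACGT",    # one deleted base, so the sequence is shorter
    "seq_same": "ACGTACGT",  # identical to the ancestor
}

# A hand-made alignment: a gap character marks the deleted position,
# so every sequence now has the same length as the ancestor.
aligned = {
    "seq_sub": "ACGTTCGT",
    "seq_del": "ACG-ACGT",
    "seq_same": "ACGTACGT",
}


def differences(aligned_seq, reference):
    """Return (1-based position, reference base, observed base) for each mismatch."""
    return [
        (i + 1, ref_base, obs_base)
        for i, (ref_base, obs_base) in enumerate(zip(reference, aligned_seq))
        if ref_base != obs_base
    ]


for name, seq in aligned.items():
    print(name, differences(seq, ancestor))
```

Without the gap, `seq_del` would be compared one position out of step from the deletion onward, and almost every later column would look like a mismatch.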
We will use ViralMSA to perform MSA. Check out how ViralMSA should be used before running it.
To run MSA for the sequences in sarscov2_sequences.fas:
Take a quick look at ./ViralMSA_Out/sarscov2_sequences.fas.aln. This file is still in FASTA format, but what changed?
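One way to explore that question programmatically is sketched below. It is a minimal example, assuming the file is plain FASTA with `-` as the gap character. `read_fasta` and `summarize_alignment` are hypothetical helpers written for this illustration, and the toy input stands in for the real file so that the snippet runs on its own.

```python
from io import StringIO


def read_fasta(handle):
    """Parse FASTA text from an open file-like object into {name: sequence}."""
    records = {}
    name = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            records[name] = []
        elif name is not None:
            # Sequences may be wrapped over several lines; collect the pieces.
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def summarize_alignment(records):
    """Report sequence count, distinct lengths, and gap counts per sequence."""
    lengths = {len(seq) for seq in records.values()}
    return {
        "n_sequences": len(records),
        "lengths": sorted(lengths),
        "same_length": len(lengths) == 1,
        "gaps": {n: seq.count("-") for n, seq in records.items()},
    }


# Toy stand-in for the real alignment file (invented data).
toy = StringIO(">a\nACG-AC\nGT\n>b\nACGTACGT\n")
summary = summarize_alignment(read_fasta(toy))
print(summary)

# To inspect the real output instead, open it in place of the toy input, e.g.:
# with open("./ViralMSA_Out/sarscov2_sequences.fas.aln") as f:
#     print(summarize_alignment(read_fasta(f)))
```

Running the same summary on the original sarscov2_sequences.fas and comparing the two reports is a quick way to see what the alignment step changed.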