Viral Phylogenetics
Generate Rooted Tree

Use LSD2 and a dates file to generate a rooted tree

In the previous step, we created an unrooted phylogenetic tree through phylogenetic inference. Because we know the collection dates of the sequences in our SARS-CoV-2 dataset, we can also “root” the tree (find the most likely position of the most recent common ancestor, or MRCA) and “date” it (scale the branch lengths to units of time). We will use LSD2 to generate the rooted tree, with an outgroup to help. An outgroup is a known organism that is distantly related to the species of interest; it serves as a reference point when inferring a rooted tree, which can make rooting and dating more accurate. In our case, we will use a RaTG13 bat coronavirus sequence as our outgroup.

  1. Take a look at the usage instructions for LSD2 (for example, by running lsd2 -h).

  2. Now, to generate our rooted tree:

lsd2 \
   -i sarscov2_sequences.unrooted_tree.nwk \
   -d sarscov2_dates.txt \
   -g sarscov2_outgroup.txt \
   -G -l -1 -o lsd2_out

The above command incorporates the following flags:

  • -i specifies the input file, which is our unrooted phylogenetic tree from Step 2
  • -d specifies the file with the sequence dates, which LSD2 needs in order to date the tree
  • -g specifies the file with outgroup sequences
  • -G removes the outgroups from the tree (uses it to root, but does not show it on the tree)
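  • -l sets the branch-length threshold below which LSD2 treats a branch as uninformative; with -1, no branch falls under the threshold
  • -o specifies the base name of our output files (here, lsd2_out)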
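The files passed to -d and -g are plain text. As a rough sketch, assuming the format described in LSD2's documentation (a first line giving the number of entries, then one entry per line), the snippet below writes two toy files and checks that the count on the first line matches the entries that follow. The file names example_dates.txt and example_outgroup.txt, the labels seqA to seqC, the dates, and the helper check_count_header are all made up for illustration; the tutorial's own files may look different.

```shell
# Toy inputs with hypothetical sequence names and dates.
# They use their own file names, so the tutorial's real files are not touched.
cat > example_dates.txt <<'EOF'
3
seqA 2020-01-15
seqB 2020-02-03
seqC 2020-03-21
EOF

cat > example_outgroup.txt <<'EOF'
1
RaTG13
EOF

# Hypothetical helper: compare the count declared on line 1
# with the number of non-empty lines that follow it.
check_count_header() {
    awk 'NR == 1 { declared = $1; next }
         NF > 0  { found++ }
         END     { if (declared == found + 0) print "ok"; else print "mismatch" }' "$1"
}

check_count_header example_dates.txt
check_count_header example_outgroup.txt
```

A "mismatch" here would suggest that an entry was added or removed without updating the first line, which is worth ruling out before running LSD2.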
  3. We now have a rooted tree stored in a file called lsd2_out.nwk. As in Step 2, we can view the first 10 lines of the Newick file at the command line with:

head -10 lsd2_out.nwk

  4. Let’s visualize the tree in the terminal using:

nw_display lsd2_out.nwk

  5. Again, we can download the file so that it can be uploaded to Taxonium for better visualization:

download lsd2_out.nwk
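Because -G removes the outgroup, the rooted tree should contain only our SARS-CoV-2 sequences. One quick sanity check is to count the tips: in a Newick string, the number of tips equals the number of commas plus one, as long as no label or comment contains a comma. The sketch below demonstrates this on a toy tree; count_tips, example_tree.nwk, and the labels seqA to seqD are hypothetical names used only for illustration.

```shell
# Hypothetical helper: count the tips of a Newick tree as (number of commas + 1).
# Assumes a single tree per file and no commas inside labels or comments.
count_tips() {
    commas=$(tr -cd ',' < "$1" | wc -c)
    echo $((commas + 1))
}

# Toy tree with made-up labels, written to a scratch file.
printf '((seqA:0.1,seqB:0.2):0.05,(seqC:0.1,seqD:0.3):0.02);\n' > example_tree.nwk

count_tips example_tree.nwk   # prints 4
```

Running count_tips lsd2_out.nwk in the same way should report the number of sequences we expect to see in the rooted tree.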

Take a look at the lsd2_out log file. When did the MRCA exist?

WHO declared COVID-19 a pandemic on March 11, 2020. Does our MRCA date to before or after this day?
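If the log reports the MRCA date as a decimal year rather than as a calendar date (this depends on how the dates file was written, so treat it as an assumption), a small conversion makes the comparison with March 11, 2020 easier. The helper below, decimal_year_to_date, is a hypothetical name and only a sketch: it maps the fractional part of the year onto a day of that year, accounting for leap years.

```shell
# Hypothetical helper: convert a decimal year (e.g. 2020.1913) to YYYY-MM-DD.
decimal_year_to_date() {
    awk -v d="$1" 'BEGIN {
        year = int(d)
        leap = (year % 4 == 0 && year % 100 != 0) || (year % 400 == 0)
        split("31 28 31 30 31 30 31 31 30 31 30 31", mdays, " ")
        if (leap) mdays[2] = 29
        # Zero-based day of the year covered by the fractional part.
        doy = int((d - year) * (leap ? 366 : 365))
        month = 1
        while (month < 12 && doy >= mdays[month]) {
            doy -= mdays[month]
            month++
        }
        printf "%04d-%02d-%02d\n", year, month, doy + 1
    }'
}

# 2020.1913 is the decimal-year form of the pandemic declaration date itself.
decimal_year_to_date 2020.1913   # prints 2020-03-11
```

Dates in YYYY-MM-DD form sort in calendar order, so once the MRCA date is in this form it can be compared with 2020-03-11 directly.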

We used 10 SARS-CoV-2 sequences to generate this rooted tree. Which statement is true about this approach?
