Brief presentation and graphical overview of the pipeline

This manual provides the usage details for GET_PHYLOMARKERS, a software package primarily designed to select “well-behaved” phylogenetic markers for estimating a maximum likelihood (ML) species tree from the supermatrix of concatenated, top-scoring alignments. These markers are identified through a series of sequential filters that operate on orthologous gene/transcript/protein clusters computed by GET_HOMOLOGUES to exclude:

  1. alignments with evidence for recombinant sequences
  2. sequences that yield “outlier gene trees” in the context of the distributions of topologies and tree-lengths expected under the multispecies coalescent
  3. poorly resolved gene trees
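The sequential nature of these filters can be illustrated with a minimal Python sketch. It is not the GET_PHYLOMARKERS implementation: the `Cluster` record, its fields, the `apply_filters` helper and the `MIN_SUPPORT` threshold are all hypothetical names and values chosen for illustration. It only shows the idea that each filter sees just the clusters that survived the previous one.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Cluster:
    """Hypothetical per-cluster summary; not a GET_PHYLOMARKERS data structure."""
    name: str
    has_recombination: bool      # assumed result of a recombination test on the alignment
    is_outlier_tree: bool        # assumed result of the gene-tree outlier test
    mean_branch_support: float   # assumed summary of gene-tree resolution, in [0, 1]


def apply_filters(
    clusters: Iterable[Cluster],
    filters: List[Callable[[Cluster], bool]],
) -> List[Cluster]:
    """Apply the filters in order; each one only sees survivors of the previous one."""
    kept = list(clusters)
    for keep in filters:
        kept = [c for c in kept if keep(c)]
    return kept


MIN_SUPPORT = 0.7  # illustrative threshold only, not a pipeline default

filters = [
    lambda c: not c.has_recombination,               # 1. drop recombinant alignments
    lambda c: not c.is_outlier_tree,                 # 2. drop outlier gene trees
    lambda c: c.mean_branch_support >= MIN_SUPPORT,  # 3. drop poorly resolved gene trees
]

clusters = [
    Cluster("geneA", False, False, 0.91),
    Cluster("geneB", True, False, 0.95),
    Cluster("geneC", False, True, 0.88),
    Cluster("geneD", False, False, 0.42),
]

selected = apply_filters(clusters, filters)
print([c.name for c in selected])
```

In this toy input only `geneA` passes all three filters. Ordering the filters this way means a later test is never run on clusters that an earlier test has already rejected.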

Beyond marker selection, GET_PHYLOMARKERS can also estimate maximum likelihood and parsimony trees from pan-genome matrices, and compute basic population genetic statistics and neutrality tests.

Figure 1 provides a graphical overview of the GET_PHYLOMARKERS pipeline. This manual describes each of these steps in detail, along with the options available to the user to control the pipeline’s behaviour, the stringency of the filters, the number of substitution models evaluated, and the thoroughness of the tree search. In addition, the pipeline can search for ML and parsimony pan-genome phylogenies using the pan-genome matrix computed by the GET_HOMOLOGUES suite, as shown in the pipeline’s flowchart below.