PHylogeographic infeRence using APproximated Likelihood

What is PHRAPL?

PHRAPL is a phylogeographic model selection method based on approximate likelihoods. It estimates the probability of observing a set of gene trees under a model by calculating the frequency at which the observed tree topologies occur in a distribution of expected tree topologies. The relative probability of the models within a set can then be assessed using the Akaike information criterion (AIC). Because the method uses gene tree topologies only (excluding branch lengths), it can compare, relatively quickly, the fit of a broad range of models that include coalescence times, migration rates, and distinct or fluctuating population sizes, potentially all acting simultaneously.
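The topology-matching idea above can be sketched in a few lines of Python. This is only an illustration, not PHRAPL's R implementation: the function name `approx_log_likelihood`, the representation of topologies as canonical strings, and the `floor` value used when an observed topology never appears among the simulated trees are all assumptions made for this example.

```python
import math
from collections import Counter


def approx_log_likelihood(observed, simulated, floor=None):
    """Approximate log-likelihood of observed topologies under one model.

    Each observed topology contributes the log of its frequency among the
    topologies simulated under the model. Topologies are assumed to be
    canonical strings, so identical trees compare equal.
    """
    counts = Counter(simulated)
    n = len(simulated)
    if floor is None:
        # Assumption for this sketch: a small stand-in frequency for
        # topologies with zero matches, so that log(0) is avoided.
        floor = 1.0 / (n + 1)
    total = 0.0
    for topology in observed:
        frequency = counts[topology] / n
        total += math.log(max(frequency, floor))
    return total


# Hypothetical data: three observed gene trees and ten simulated ones.
observed = ["((A,B),C);", "((A,B),C);", "((A,C),B);"]
simulated = ["((A,B),C);"] * 6 + ["((A,C),B);"] * 3 + ["((B,C),A);"]
print(approx_log_likelihood(observed, simulated))
```

A model whose simulated distribution contains the observed topologies at high frequency receives a log-likelihood closer to zero than a model under which they are rare.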

Why use PHRAPL?

Phylogeographic research aims to understand the recent history of species. Over the last decades, researchers have increasingly incorporated demographic models in order to estimate parameters (e.g., divergence times, population sizes, and rates of migration and expansion) that can contribute to phylogeographic inference. Typically, this is done with software packages that contain prespecified models (n-island models or fixed topologies). Alternatively, simulation-based approaches allow researchers to customize models to the particular details of their system, and may be useful in testing preexisting biogeographic hypotheses. Because the demographic model is central to the analysis in either case, researchers may wish to assess how appropriate their model is for the data. PHRAPL is designed to provide researchers with such a tool.

How PHRAPL works

PHRAPL simulates genealogies under a wide range of demographic models and compares the empirical genealogies to the simulated gene tree distributions. Demographic models that are probable given the data will produce many simulated genealogies that match the estimated gene trees. Because the proportion of matching gene trees for a given model approximates the probability of the data given the model and parameter values, we can use this value in an information-theoretic framework to evaluate the relative weight of all models. This provides the researcher with an independent assessment of the best model given the data, as well as the ability to calculate the model likelihoods of classes of models (e.g., n-island vs. isolation models). Watch these YouTube videos to learn more about PHRAPL. After installing PHRAPL, type library(help=phrapl) to get a list of functions with documentation. To open a help file for a particular function, type ?function_name.
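The information-theoretic weighting of models described in this section can be illustrated with the standard AIC and Akaike weight formulas. The Python sketch below is not taken from PHRAPL itself: the function names, the model names, the log-likelihoods, and the parameter counts are all hypothetical values chosen for the example.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2K - 2 lnL."""
    return 2 * n_params - 2 * log_likelihood


def akaike_weights(aic_values):
    """Relative weight of each model, computed from AIC differences."""
    best = min(aic_values)
    relative = [math.exp(-0.5 * (value - best)) for value in aic_values]
    total = sum(relative)
    return [r / total for r in relative]


# Hypothetical models: name -> (approximate log-likelihood, free parameters).
models = {
    "isolation": (-10.2, 1),
    "n_island": (-12.9, 1),
    "isolation_with_migration": (-9.8, 2),
}

names = list(models)
scores = [aic(lnl, k) for lnl, k in models.values()]
weights = akaike_weights(scores)

for name, score, weight in zip(names, scores, weights):
    print(f"{name}: AIC={score:.2f}, weight={weight:.3f}")

# Summing the weights of models that share a feature gives the support
# for that class of models (here, the two models that include isolation).
isolation_class = sum(w for n, w in zip(names, weights) if "isolation" in n)
print(f"isolation class weight: {isolation_class:.3f}")
```

Weights sum to one across the model set, so the total weight of a class of models can be read directly as its share of the support.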

CODE

PHRAPL is written in R, but it uses Perl and ms to perform simulations. The pre-CRAN version (code under development) can be found on GitHub.

CITATION

Jackson N, Morales AE, Carstens BC, O’Meara BC (2017) PHRAPL: Phylogeographic Inference using Approximate likelihoods. Systematic Biology. 66:1045-1053.

OTHER REFERENCES

Jackson N, Carstens BC, Morales AE, O’Meara BC (2017) Species delimitation with gene flow. Systematic Biology. 66:799-812.
Morales AE, Jackson N, Dewey T, O’Meara BC, Carstens BC (2017) Speciation with gene flow in North American Myotis bats. Systematic Biology. 66:440-452.
Carstens BC, Morales AE, Jackson N, O’Meara BC (2017) Objective choice of Phylogeographic Models. Molecular Phylogenetics and Evolution. 116:136-140.

DO YOU HAVE A QUESTION ABOUT PHRAPL OR WANT TO REPORT A BUG?

Post questions and comments in the phrapl-users Google Group.

THE PHRAPL TEAM

(listed in alphabetical order by last name)

FUNDING

The National Science Foundation funded this research (DEB 1257784/DEB 1257669). The Ohio Supercomputer Center allocated resources to support part of this study (PAS1184). Additional computational resources were allocated by the Carstens and O’Meara Labs.

Acknowledgement: Template modified from Just the Docs, a documentation theme for Jekyll.