INPUT FILES
PHRAPL
requires two input files: an assignment file (e.g. cladeAssignments.txt) and a file with phylogenetic trees in newik format (e.g. trees.tre). All files must have the same taxon names (labels) across all loci.
You can have missing data. PHRAPL
does not require gene trees to include all individuals/alleles per population/species/group.
You can include both haploid and diploid data in the same analyses using the argument popScaling
in the GridSearch
function.
Assignment file
The assignment file (cladeAssignments.txt) specifies which individuals belong to which populations/species/groups. This file is a table that must consist of two columns:
- The first column lists the individuals in the dataset, whose names must match those at the tips of the trees. Note that not all the individuals listed in the table must exist on every tree (i.e., missing data/unique tip names for each tree are fine).
- The second column should provide the population or species name to which each individual is assigned (e.g., “A”, “B”, “C”, “D”).
## Example of an input assignment file.
# The first column lists individuals
# The second column lists populations.
# This example does not include an outgroup.
Indv PopLabel
ind1 A
ind2 A
ind3 A
ind4 B
ind5 B
ind6 B
ind7 C
ind8 C
ind9 C
ind10 D
ind11 D
ind12 D
If there is an outgroup taxon, it MUST be listed as the last population in the table and the first letter in the population name should also come last alphanumerically (e.g., "Z.outgroup"
).
## Example of input assignment file with an outgroup (all listed in the last line).
# The first column lists individuals
# The second column lists populations.
Indv PopLabel
ind1 A
ind2 A
ind3 A
ind4 B
ind5 B
ind6 B
ind7 C
ind8 C
ind9 C
ind10 D
ind11 D
ind12 D
ind13 Z.outgroup
Note:
PHRAPL
assigns population indexes (i.e., 1, 2, 3, etc.) to taxa/populations alphanumerically, such that population 1 corresponds to the population name in your assignment table that comes first in the alphabet, and so forth. This is important to remember when interpreting parameter nomenclature in the output.A header row of some sort must also be included.