P4 — Introduction

2024-10-07.

What is p4?

P4 implements tree-heterogeneous models of evolution

Heterogeneity in the process of evolution should be reflected in heterogeneity in the models that we use. The process of evolution can change over time, and so can differ in different parts of the tree; the tree-heterogeneous models in p4 are meant to accommodate that.

Different genes behave differently in evolution, and so we can have data-heterogeneous models, where things like the rate matrix, composition, and data type can differ among data partitions. P4 implements such data-partitioned models. However, models that accommodate compositional site-heterogeneity, such as the CAT model in PhyloBayes, are not implemented in p4.

P4 implements an MCMC for doing Bayesian phylogenetic analysis. It can use both the tree-heterogeneous and multi-partition models mentioned above.

P4 is a phylogenetics toolkit

P4 can be used as a phylogenetic toolkit, the elements of which you can string together in different ways depending on the job at hand. It is useful for programmatic manipulation of phylogenetic data and trees. If you want to do something interesting with your trees or data, p4 might have at least some of what you want to do already in place.

P4 will read data in a few of the common phylogenetic formats (eg Nexus, Phylip, clustalw, fasta, pir/nbrf), but does not read other formats in bioinformatics (eg EMBL, genbank). P4 will read in trees in Nexus or Phylip format.

P4 will do some elementary data manipulation, eg extracting a Nexus-defined charset from an alignment, or converting data from one format to another. P4 will also do tree manipulation, and tree drawing.

P4 is meant to be easily extensible, so if you want to do something that it cannot do, it is often easy to add that functionality.
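
To make the data manipulation described above concrete, here is a minimal sketch in plain Python of what extracting a charset from an alignment and converting it to another format involves. It does not use p4 and is not p4's API: the function names (parse_fasta, charset_columns, extract_charset, to_phylip) and the toy data are hypothetical, invented for illustration. It assumes a simple charset made of 1-based positions and inclusive ranges, and does not handle the full Nexus syntax (for example, step values such as \3).

```python
# Plain-Python illustration only; this is NOT p4's API.
# All function names and the toy alignment are hypothetical.

def parse_fasta(text):
    """Return a dict of {name: sequence} from FASTA-formatted text."""
    seqs = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]   # taxon name is the first word after '>'
            seqs[name] = []
        elif name is not None:
            seqs[name].append(line)      # sequences may span several lines
    return {n: "".join(parts) for n, parts in seqs.items()}


def charset_columns(spec):
    """Expand a simple charset such as '1-3 7' into 0-based column indices."""
    cols = []
    for token in spec.split():
        if "-" in token:
            start, end = (int(x) for x in token.split("-"))
        else:
            start = end = int(token)
        # Charset positions are 1-based and inclusive.
        cols.extend(range(start - 1, end))
    return cols


def extract_charset(alignment, spec):
    """Return a new alignment containing only the columns in the charset."""
    cols = charset_columns(spec)
    return {name: "".join(seq[c] for c in cols)
            for name, seq in alignment.items()}


def to_phylip(alignment):
    """Format an alignment as sequential Phylip text."""
    n_chars = len(next(iter(alignment.values())))
    lines = [f" {len(alignment)} {n_chars}"]
    for name, seq in alignment.items():
        lines.append(f"{name:<10} {seq}")
    return "\n".join(lines) + "\n"


fasta_text = """>taxA
ACGTACGTAC
>taxB
ACGTTCGTAA
"""

alignment = parse_fasta(fasta_text)
gene1 = extract_charset(alignment, "1-3 8-10")   # keep columns 1-3 and 8-10
phylip_text = to_phylip(gene1)
print(phylip_text)
```

In p4 itself these steps are provided by the toolkit, so you would not normally write them by hand; the sketch only shows the kind of operation involved.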

A citation

For the moment, until there is a better one, the best citation is:

Foster, P.G. 2004. Modeling compositional heterogeneity. Syst. Biol. 53: 485-495. https://doi.org/10.1080/10635150490445779