Compositional homogeneity tests¶
The usual, poor, way to do a compositional homogeneity test is to do a
Chi-square test on the data. Paup does this on the data overall, and
Tree-Puzzle does it on individual sequences. In the latter, since there
are many simultaneous comparisons, the power of the conclusions might be
compromized. P4 does the test both ways, using the method
Data.Data.compoChiSquaredTest()
.
This test uses the X^2 statistic. It is well-known that this sort of test suffers from a high probability of type II error, because the chi-square curve is not an appropriate null distribution by which to assess significance of the X^2 statistic.
A better null distribution can be obtained by simulating data on the
tree and model in question, and using X^2 statistics extracted from
those simulations to make a null distribution appropriate to the problem
at hand. This is done using the Tree method
Tree.Tree.compoTestUsingSimulations()
.
Model fit tests¶
For Bayesian model fit assessment you can do posterior predictive simulations during the MCMC. See Posterior predictive simulations.
P4 implements two ML-based model fit tests– the tree- and model-based
composition fit test, and Goldman’s version of the Cox test. Both
rely on simulations that generally need to be followed by
optimizations, so they are expensive tests. In both tests the
simulation/optimization part is the time-consuming part, and since
both tests can use the same simulations, in p4 they are done together.
First, the simulations are done using the Tree method
Tree.Tree.simsForModelFitTests()
. After that, the simulation
results are digested with the Tree method
Tree.Tree.modelFitTests()
.