Saturday 19 March 2011

Consistent tree drawing

In his VizBi 2011 presentation, Rod Page discussed how the same tree can be drawn in different ways. This is a problem because a tree may display differently between viewers, and its layout may change following the addition or removal of leaves.
Image extracted from Rod's VizBi 2011 presentation

He showed how leaves can be ordered using sequential ids, for example from the NCBI taxonomy. This ordering is then used to specify how leaves are displayed in the Y dimension, leading to consistent display between viewers. In addition, this ordering is not affected by the addition of new, or deletion of existing, leaves.
Image extracted from Rod's VizBi 2011 presentation
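The idea can be sketched in a few lines of Python. This is only one possible reading of the approach, not Rod's implementation: the tree is assumed to be a nested list in which a leaf is a string and a clade is a list, the ids are made up for illustration, and the rule of sorting the children of each clade by their smallest leaf id is my own assumption for keeping clades contiguous.

```python
def min_id(node, ids):
    """Smallest id among the leaves below node (a leaf is a str, a clade a list)."""
    if isinstance(node, str):
        return ids[node]
    return min(min_id(child, ids) for child in node)


def ordered_leaves(node, ids):
    """Leaves in drawing order: children of each clade sorted by smallest leaf id."""
    if isinstance(node, str):
        return [node]
    leaves = []
    for child in sorted(node, key=lambda c: min_id(c, ids)):
        leaves.extend(ordered_leaves(child, ids))
    return leaves


def y_positions(tree, ids):
    """Map each leaf name to a row (Y position) in the drawing."""
    return {leaf: row for row, leaf in enumerate(ordered_leaves(tree, ids))}


# Hypothetical leaf names and sequential ids, for illustration only.
ids = {"Aus": 1, "Bus": 2, "Cus": 3, "Dus": 4}

# The same topology as two viewers might store it, with children in different order.
viewer_one = [["Cus", "Aus"], ["Dus", "Bus"]]
viewer_two = [["Bus", "Dus"], ["Aus", "Cus"]]

# A new leaf receives the next sequential id and joins an existing clade.
ids_grown = dict(ids, Eus=5)
grown = [["Cus", "Aus", "Eus"], ["Dus", "Bus"]]

print(ordered_leaves(viewer_one, ids))
print(ordered_leaves(viewer_two, ids))
print(ordered_leaves(grown, ids_grown))
```

Both viewers arrive at the same leaf order whatever order the children were stored in, and in this example the existing leaves keep their relative order when the new leaf is added.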

Ordering from an arbitrarily ordered list, such as the sequence of names added to the NCBI taxonomy, results in an arbitrary node ordering. Yet leaves are often names, not numbers, and we are used to viewing text in alphabetical order (a-z).

Leaves may, however, be ordered alphabetically when a tree is first built. To prevent the tree being restructured when new groups of leaves are added, each new group can be sorted alphabetically and appended to the end of the initial list. Such ordering maintains consistency in tree rendering while supporting partial alphabetical ordering of tips.
Image edited from Rod's VizBi 2011 presentation

When desired, the sequential, alphabetically ordered groups can be sorted to create a new, fully ordered list.
Image edited from Rod's VizBi 2011 presentation
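A minimal sketch of this bookkeeping, assuming the ordered list is stored as a list of groups, each sorted when it is added. The function names and leaf names are hypothetical, and sorting case-insensitively with `str.casefold` is my own choice.

```python
def new_order(names):
    """Start an ordered list with one alphabetically sorted group."""
    return [sorted(names, key=str.casefold)]


def append_group(groups, names):
    """Return the groups with a new, alphabetically sorted group added at the end.

    Existing groups are untouched, so existing leaves keep their positions.
    """
    return groups + [sorted(names, key=str.casefold)]


def flatten(groups):
    """Leaf order used for drawing: the groups one after another."""
    return [name for group in groups for name in group]


def fully_sorted(groups):
    """Collapse all groups into a single, fully alphabetical group."""
    return [sorted(flatten(groups), key=str.casefold)]


# Hypothetical leaf names, for illustration only.
groups = new_order(["Cus", "Aus", "Bus"])
groups = append_group(groups, ["Zus", "Abus"])

partial = flatten(groups)              # alphabetical within each group
full = flatten(fully_sorted(groups))   # alphabetical throughout

print(partial)
print(full)
```

The partial order keeps the first three leaves where they were and places the new group after them. The full re-sort moves `Abus` to the front, so it changes the drawing and would only be applied when that is acceptable.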

Whether such partial alphabetical ordering improves tree comprehension requires evaluation, but it is likely to be most beneficial for trees with many polytomies, e.g. taxonomies. The overhead of maintaining the ordered list(s) alongside the trees must also be considered.

Thursday 3 March 2011

Areas of Endemism and Event-Based Methods

In a recent Journal of Biogeography editorial, Ontology of areas of endemism, Brian Crother and Christopher Murray argue that areas of endemism should be the preferred unit in historical biogeography, including in event-based methods such as Dispersal-Vicariance Analysis and Lagrange. I disagree.

Event-based methods reconstruct the history of a clade from an observed distribution of taxa and their evolutionary relationships, given a biogeographic model that defines three things:
  1. How geographical space is divided into units
  2. How those units relate through time
  3. How organisms respond to different configurations of units.
Crother and Murray state that the geographic units should be areas of endemism. Like the 'niche', an 'area of endemism' is a simple concept that most biogeographers recognise, yet they are unable to agree on a clear, simple definition. It is, however, essentially an area occupied by a group of species that share similar ranges. It is argued that shared ranges imply a shared history of the taxa and, by inference, a history of place. A common phylogenetic pattern confirms hypotheses of shared taxon history. Areas of endemism are further believed to be hierarchical, as they are (usually) created by vicariance, and they are geographical units that can be nested within other such units, e.g. Jamaica can be nested within North America.

I am interested in reconstructing the spatial and temporal history of a family of freshwater fish species as they diversified with the rise of the Trans-Mexican Volcanic Belt. I know where the fish live now and their evolutionary relationships, and I have a partial hydrological history inferred from geological evidence. The fish cannot disperse between drainage basins, and the configuration of the drainage basins changes through time. Basins split, coalesce, or part of one may be exchanged with an adjacent basin in a river capture event. While identifying areas of endemism occupied by sister taxa may help identify past splitting and exchange events, the units that determine the history of the clade are the basins themselves and, to a lesser extent, environmental variation within basins. These basins change in extent and connectivity through time and are not in any way hierarchical.

Areas of endemism may exist, and they may be correlated with shared history, but they are only one of many artifacts left by history. To reconstruct taxon histories, fragmented information must be integrated from many sources to build scenarios consistent with all the available information. Areas of endemism are just one source of information.