A phylogeny charts the evolutionary relationships among organisms and is usually represented as a tree. These relationships can be depicted at several levels, e.g. at the level of genes, cells, organs, or species. Recently, gene trees have become an important tool for predicting gene function based on orthology. Currently, the relationships between genes are inferred from their sequences using various methods: parsimony, likelihood, and Bayesian analysis. However, the root of these trees is difficult to determine with assurance. In practice, the location and timing of gene duplications and losses are generally not determined, partly because programs for the reconciliation of gene trees with species trees are poorly written or not rigorous. Several methods are available to reconcile trees. The fastest is parsimony, in which the number of duplications and losses is minimized. Arvestad et al. (2003) explored an alternative method for gene/species tree reconciliation using Bayesian methods, which may find solutions discarded by the deterministic parsimony method. In the first part of this project, we would like to implement their method in a tool that is usable and accessible for biologists; no such tool is currently available.
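To make the parsimony baseline concrete, here is a minimal sketch of reconciliation by least-common-ancestor (LCA) mapping, the standard textbook approach: each gene-tree node is mapped to the lowest species-tree node containing all of its species, and a node is called a duplication when it maps to the same species node as one of its children. This is not the Bayesian method of Arvestad et al., and it counts duplications only (losses are omitted for brevity). The nested-tuple tree encoding, the "species_gene" leaf naming, and all function names are assumptions made for this illustration.

```python
# Trees are nested tuples; leaves are strings (an assumed encoding).
# Gene leaves are named "<species>_<gene>" (a hypothetical convention).

def species_parents(tree, parent=None, table=None):
    """Map every species-tree node to its parent (the root maps to None)."""
    if table is None:
        table = {}
    table[tree] = parent
    if isinstance(tree, tuple):
        for child in tree:
            species_parents(child, tree, table)
    return table

def ancestors(node, parents):
    """Return the path from a node up to the root, starting with the node."""
    path = []
    while node is not None:
        path.append(node)
        node = parents[node]
    return path

def lca(a, b, parents):
    """Lowest common ancestor of two species-tree nodes."""
    seen = set(ancestors(a, parents))
    for node in ancestors(b, parents):
        if node in seen:
            return node
    raise ValueError("nodes are not in the same tree")

def count_duplications(gene_tree, parents):
    """Return (species node this gene node maps to, duplications below it)."""
    if isinstance(gene_tree, str):
        return gene_tree.split("_")[0], 0
    left, right = gene_tree
    map_left, dup_left = count_duplications(left, parents)
    map_right, dup_right = count_duplications(right, parents)
    mapped = lca(map_left, map_right, parents)
    # Duplication: the node maps to the same species node as a child.
    is_duplication = mapped in (map_left, map_right)
    return mapped, dup_left + dup_right + int(is_duplication)

species_tree = (("human", "mouse"), "fly")
gene_tree = ((("human_a", "mouse_a"), ("human_b", "mouse_b")), "fly_a")

parents = species_parents(species_tree)
root_mapping, n_duplications = count_duplications(gene_tree, parents)
print(n_duplications)
```

In this toy example the two human/mouse gene pairs both map to the human/mouse ancestor, so their common parent is inferred as a single duplication predating the human/mouse split. A Bayesian approach would instead weigh many candidate reconciliations rather than returning this single deterministic one.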
In the second part of our project, we would like to explore hierarchical representations of reconciled phylogenies (Serb and Oakley 2005). A common problem with large reconciled phylogenies is the difficulty in interpreting the resulting tree. Which genes belong to which species? Which species have losses and when did losses happen in the species tree history? We believe phylogenies could be made easier to read and understand by displaying trees within trees (for example, the gene tree within the species tree) and allowing the user to interact with the display by choosing their level of detail (or zoom). This type of interface was successfully used in the UCSC Genome Browser to handle presentation of multiple levels of data (Karolchik et al. 2003). The viewer sees more detail as they narrow their focus to a specific part of the dataset, and less detail as they broaden their focus to the entire dataset. This tool could be useful as an aid both in research and education.
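The level-of-detail idea can be sketched in a few lines: below a chosen depth, a subtree is replaced by a one-line summary, so that zooming in (a larger depth) reveals more of the tree. This is only an illustration of the concept under an assumed nested-tuple tree encoding; the function names and the summary format are hypothetical and do not describe any existing viewer.

```python
# Trees are nested tuples; leaves are strings (an assumed encoding).

def count_leaves(tree):
    """Number of leaves under a node."""
    if isinstance(tree, str):
        return 1
    return sum(count_leaves(child) for child in tree)

def collapse(tree, max_depth, depth=0):
    """Copy the tree, summarizing every internal node at or below max_depth."""
    if isinstance(tree, str):
        return tree
    if depth >= max_depth:
        return f"[{count_leaves(tree)} leaves]"
    return tuple(collapse(child, max_depth, depth + 1) for child in tree)

gene_tree = ((("human_a", "mouse_a"), ("human_b", "mouse_b")), "fly_a")

for zoom in range(4):
    print(zoom, collapse(gene_tree, zoom))
```

An interactive viewer would presumably tie the depth threshold to the user's current focus rather than to a fixed number, and would summarize collapsed subtrees with something more informative than a leaf count (for example, the species present or the number of inferred losses).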
We have three unpublished datasets from the Oakley lab: opsin, phosphodiesterase, and G-protein data. We would like to map these to the species tree from the Tree of Life project using the Bayesian method and compare the results with those from other methods.
A second goal is to evaluate the effectiveness of the level-of-detail interface in informal user studies.
Papers we have already read are marked (x).
(x) Arvestad et al., Bayesian gene/species tree reconciliation and orthology analysis using MCMC. Bioinformatics 2003.
(x) Goodman et al., Fitting the gene lineage into its species lineage, a parsimony strategy illustrated by cladograms constructed from globin sequences. Systematic Zoology 1979.
( ) Page and Charleston, From gene to organismal phylogeny: reconciled trees and the gene tree/species tree problem. Molecular Phylogenetics and Evolution 1997.
(x) Serb and Oakley, Hierarchical phylogenetics as a quantitative analytical framework for evolutionary developmental biology. BioEssays 2005.
( ) Karolchik et al., The UCSC Genome Browser. Nucleic Acids Research 2003.
( ) Martin Reddy, Perceptually modulated level of detail for virtual environments. Ph.D. thesis, University of Edinburgh, 1997.
( ) Luebke et al., Level of Detail for 3D Graphics. Morgan Kaufmann Publishers, San Francisco 2002.