

One of the first challenges in handling large quantities of data is finding new ways to visualize it so that the relevant patterns and important exceptions are easy to see. While some good general-purpose tools for exploring large amounts of data do exist, they are not equipped to handle the structural or phylogenetic information that can accompany sequence data. Conversely, tools built for sequence data do not provide all of the basic features needed for good data visualization and exploration. We are therefore developing our own tool, which combines the best features of both types of software with new features specifically tuned for large alignments. This tool, VELMA, is very much a work in progress; the currently available implementation provides only a subset of the full functionality we will ultimately develop.

Molecular Determinants of a Phenotype

The emergence of a phenotype, such as a new drug resistance, is usually accompanied by a search for the underlying molecular causes. Part of this process involves choosing a single wildtype strain and a single mutant strain, identifying the amino acid differences between them, and introducing mutations in both directions in an attempt to identify the minimal set of changes that is necessary and sufficient to switch the phenotype from wildtype to mutant and back. This process can be expensive and labor-intensive, and it is not guaranteed to find the complete answer. We are developing an algorithm that analyzes all available data in order to minimize the number of experiments required and to improve the quality of the results. This algorithm involves a combinatorial search through all possible explanations for the change in phenotype.
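
To make the idea of a combinatorial search concrete, the following is a minimal sketch and not the algorithm under development. It assumes that an "explanation" is a set of alignment columns whose residues alone distinguish wildtype from mutant sequences, and it enumerates candidate sets from smallest to largest. The function name `minimal_explanations`, the `max_size` cutoff, and the toy sequences are all hypothetical and chosen only for illustration.

```python
from itertools import combinations


def minimal_explanations(sequences, phenotypes, max_size=3):
    """Return the smallest sets of alignment columns that separate phenotypes.

    Assumption (illustrative only): a column set "explains" the phenotype
    if no two sequences with different phenotypes share the same residues
    at those columns.
    """
    length = len(sequences[0])
    # Columns that never vary cannot distinguish anything, so skip them.
    variable = [i for i in range(length)
                if len({seq[i] for seq in sequences}) > 1]

    for size in range(1, max_size + 1):
        hits = []
        for cols in combinations(variable, size):
            seen = {}  # residue pattern -> phenotype first observed with it
            consistent = True
            for seq, pheno in zip(sequences, phenotypes):
                key = tuple(seq[i] for i in cols)
                if seen.setdefault(key, pheno) != pheno:
                    consistent = False
                    break
            if consistent:
                hits.append(cols)
        if hits:
            # Stop at the first size that yields any explanation.
            return hits
    return []


# Toy aligned sequences (hypothetical data).
sequences = ["MKVLA", "MKVLG", "MRVLA", "MRILG"]
phenotypes = ["wildtype", "wildtype", "mutant", "mutant"]
result = minimal_explanations(sequences, phenotypes)
print(result)
```

In this toy alignment only the second column (index 1) tracks the phenotype on its own. The number of candidate sets grows combinatorially with the number of variable columns, which is why the sketch caps the set size; a practical search over a large alignment would need stronger pruning than this.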