Carnegie Mellon
Computational Molecular Biology Symposium

What, how and how many: Genome-wide analysis of RNA splicing

Javier Lopez, Department of Biological Sciences, Carnegie Mellon University

The complexity of multicellular animals appears to be generated with surprisingly small numbers of protein-coding genes. One answer to this paradox is that the number of functionally distinct proteins exceeds the number of genes. The difference is a consequence of the discontinuous structure of most genes, whose coding regions (exons) are separated by non-coding introns. As RNA is transcribed from the gene, a process called splicing removes the introns and joins the exons together to form the functional messenger RNA, which is then translated into protein. A gene can encode more than one type of protein by combinatorial use of exons ("alternative splicing"). Between 30% and 65% of human genes exhibit alternative splicing, in some cases generating thousands of different proteins from one gene. Alternative splicing is also regulated to produce different protein combinations in different cell types. Gene-finding programs cannot predict the occurrence of alternative splicing, and current approaches to genome-scale identification and comparison of splicing events are inefficient, subject to artifacts, and blind to certain types of splicing. These limitations can be surmounted with novel approaches that focus on an information-rich intermediate of the splicing reactions instead of the final products. The methods can be used for gene annotation, for quantitative comparisons of splicing in different cell types or disease states, and for uncovering fundamental mechanisms of splicing.

