The research in our lab uses a mixture of computational and experimental genomics to study two main areas:
- Identification and characterization of non-coding RNA (ncRNA) genes
On the computational side of the lab, we develop and refine methods to detect RNA genes in genomic sequence using probabilistic models and comparative genomics. We have created RNA genefinders for known classes of RNAs (transfer RNAs, C/D box & H/ACA guide snoRNAs), and seek to identify and model new classes of RNAs that, among other traits, are highly structured or contain stretches of complementarity to other DNA/RNA targets (like microRNAs).
Postdoc Peter Schattner has created a new program that searches for any specific RNA using flexible probabilistic models and allows for base pairings to target RNAs in trans. His first application of the program was a genefinder for the H/ACA class of guide snoRNA genes. In recently published work, we applied the program to the yeast genome and found and confirmed five new genes. Peter is now applying the program to mammalian and Archaeal genomes.
On the experimental side, we assay ncRNA transcriptional profiles via hybridization to in-house generated DNA microarrays, augmented by traditional molecular biology characterization. We believe tight integration of theoretical and experimental approaches is the quickest, most efficient path to discovery.
One "dry/wet" research team has been screening the human and mouse genomes for novel non-coding RNAs. Bioinformatics grad student Andy Pohl has performed the computational analyses to create a ranked list of novel ncRNA candidates from hundreds of thousands of highly conserved regions with hallmarks of non-coding RNAs. Biology graduate student Sergio Barberan has created DNA microarrays to test for expression of these candidates in a variety of tissue types, and is verifying positive array signals with Northern blot analyses. Sergio's thesis work will continue with biochemical characterization of verified candidates (cell localization, protein binding assays), and Andy will begin creating probabilistic models to search for orthologs in other genomes.
- Understanding the unique biology of Archaeal "extremophiles" -- microbes that live at the limits of life (high/low temperature, pH, salt, pressure)
We have created full-genome DNA microarrays for two of the most extreme hyperthermophilic Archaea sequenced to date, Pyrococcus furiosus and Pyrobaculum aerophilum, which natively grow at boiling temperatures. The organisms are particularly mysterious because no genetic systems exist to study individual gene function -- all analyses to date have been biochemical or computational. In collaboration with groups from UCLA and the University of Maryland, we have challenged these organisms with a series of stress conditions to measure global transcriptional response. Grad student Aaron Cozen continues this work in our own lab with new stress and varied-nutrient growth experiments.
On the dry side, postdoc Katie Pollard analyzes the extremophile array data to predict functional roles for genes of unknown function, identify the major players in various cellular stresses, and develop robust functional clusters. Work with these organisms is particularly challenging because fewer than a third of all genes have reliable functional annotation.
Bioinformatics graduate student Patricia Chan is re-analyzing all gene predictions in our two hyperthermophile lab species to find novel RNA genes. She is using microarray data and computational analyses to detect ncRNAs in intergenic regions or in regions that have been mis-annotated as proteins. Evidence from researchers working on bacterial genomes suggests that a large number of regions annotated as hypothetical proteins are actually highly conserved RNA genes. Patricia is using tools such as QRNA, RNAfold, positional promoter analysis, and custom RNA genefinders developed in our lab to detect regions that show expression on microarrays but do not appear to be proteins on closer examination. Aaron will work with Patricia to verify new RNA gene predictions via Northern analysis and primer extension experiments.
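As a rough illustration of the first step in this kind of screen, the sketch below flags array probes that show expression but fall outside every annotated gene. It is not the lab's pipeline: the data layout, the function names, and the signal threshold are all assumptions made for the example. It covers only the intergenic case and does not involve QRNA or RNAfold.

```python
from typing import List, Tuple

Interval = Tuple[int, int]          # half-open genomic interval [start, end)
Probe = Tuple[int, int, float]      # (start, end, expression signal); hypothetical layout


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two half-open intervals share at least one base."""
    return a[0] < b[1] and b[0] < a[1]


def expressed_intergenic(
    probes: List[Probe],
    genes: List[Interval],
    threshold: float,
) -> List[Interval]:
    """Return probe intervals with signal >= threshold that overlap no annotated gene.

    `threshold` is an arbitrary cutoff chosen by the caller (an assumption here);
    real array data would need normalization and replicate handling first.
    """
    hits = []
    for start, end, signal in probes:
        if signal < threshold:
            continue  # not expressed under this cutoff
        if any(overlaps((start, end), gene) for gene in genes):
            continue  # already explained by an annotated gene
        hits.append((start, end))
    return hits
```

Regions returned by a filter like this would still be only candidates; the structure and conservation checks and the Northern and primer extension experiments described above do the real work of confirming an RNA gene.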
Bioinformatics graduate student Matthew Weirauch is also developing improved operon predictions for these species, and creating a database of Archaeal-specific transcription factor binding sites. The approach makes use of the much-better-studied bacterium E. coli: it takes co-regulated functional groups determined by other labs, maps their genes to orthologs in archaeal species, and looks for similar, unifying upstream control sequences. By leveraging the new, unpublished array data in our lab, we seek to improve the accuracy of these computational analyses.
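The three steps of this approach can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the method as implemented in the lab: the function names, the dictionary-based ortholog table, the 50-base upstream window, and the exact-match k-mer comparison (standing in for real motif finding) are all hypothetical choices.

```python
from typing import Dict, List, Set


def lift_groups(
    ecoli_groups: Dict[str, Set[str]],
    orthologs: Dict[str, str],
    min_genes: int = 2,
) -> Dict[str, List[str]]:
    """Map each E. coli co-regulated group onto archaeal genes via an ortholog table.

    Groups with fewer than `min_genes` mapped genes are dropped, since a single
    gene gives nothing to compare upstream regions against.
    """
    lifted = {}
    for name, genes in ecoli_groups.items():
        mapped = sorted(orthologs[g] for g in genes if g in orthologs)
        if len(mapped) >= min_genes:
            lifted[name] = mapped
    return lifted


def upstream(genome: str, gene_start: int, length: int = 50) -> str:
    """Bases immediately upstream of a forward-strand gene start (0-based index)."""
    return genome[max(0, gene_start - length):gene_start]


def shared_kmers(seqs: List[str], k: int = 6) -> Set[str]:
    """k-mers present in every sequence; a crude stand-in for motif discovery."""
    kmer_sets = [{s[i:i + k] for i in range(len(s) - k + 1)} for s in seqs]
    return set.intersection(*kmer_sets) if kmer_sets else set()
```

In use, one would call `lift_groups` with a table of E. coli groups, extract `upstream` regions for each mapped archaeal gene, and pass those to `shared_kmers`. A realistic version would handle both strands, tolerate mismatches in the motif, and score candidates against a background model.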
Finally, as a research tool for in-lab use, and as a service to the community, Andy Pohl and undergrad Kevin Schneider have created full-genome browsers using the superb Haussler/Kent human genome browser framework. We have added our microarray data, promoter analyses, refined operon predictions, RNA gene predictions, and phylo-classified BlastP similarity matches to the browser in the context of the existing GenBank and COG annotation. We expect to branch out to the remaining sequenced archaeal species gradually as we turn our computational and microarray transcription studies to focus on them.
Last updated September, 2004