Institute of Molecular
Evolutionary Genetics











Spring 2009






Speaker: Dr. Mike Axtell - Dept of Biology

Title: Plant small RNAs:  The old, the new, and what they do.



Small regulatory RNAs between 20 and 30 nucleotides in length are a ubiquitous component of eukaryotic transcriptomes.  Several functionally and evolutionarily distinct subclasses of small RNAs have been described, including microRNAs (miRNAs; which function to regulate specific target messenger RNAs), and short interfering RNAs (siRNAs; which often function to repress the expression of transposons or RNA viruses).  Research in my laboratory focuses upon the function and evolution of small RNAs within the land plants.  I shall present results from four recent and/or ongoing projects:  1)  The discovery and functional characterization of universally conserved plant miRNAs, 2)  Analyses of recently evolved, lineage-specific plant miRNAs, 3)  The characterization of anti-transposon siRNA systems in a basal land plant, and 4)  The development of a novel, high-throughput  methodology for detection of small RNA targets using empirical data instead of computational predictions.


Axtell, M.J., and Bowman, J.L. (2008). Evolution of plant microRNAs and their targets. Trends Plant Sci. 13, 343-349.

Cho, S.H., Addo-Quaye, C., Coruh, C., Arif, M.A., Ma, Z., Frank, W., and Axtell, M.J. (2008). Physcomitrella patens DCL3 is required for 22-24 nt siRNA accumulation, suppression of retrotransposon-derived transcripts, and normal development. PLoS Genet. 4, e1000314.

Addo-Quaye, C., Eshoo, T.W., Bartel, D.P., and Axtell, M.J. (2008). Endogenous siRNA and miRNA Targets Identified by Sequencing of the Arabidopsis Degradome. Curr. Biol. 18, 758-762.


Speaker: Nick Polato - Dept of Biology

Title: Connectivity and Climate Change: Impacts on a Reef Building Coral in the Central Pacific


Connectivity is a primary factor determining community structure, species cohesion, and population persistence. Weather phenomena which alter wind and current patterns will significantly impact patterns of reef connectivity. Long-distance dispersal to the east from the coral rich central Pacific may be favored during El Nino Southern Oscillation (ENSO) events due to reversal of surface currents, providing larvae to areas denuded by recurring high Sea Surface Temperatures (SSTs) caused by ENSO, and supplementing coral populations in marginal habitats in the Eastern Pacific. In an effort to understand patterns of connectivity influencing the origin, evolution, and regeneration of coral populations in the eastern Pacific, microsatellite loci were generated for Porites lobata, a massive reef building species found throughout the Pacific Ocean. Genotype data have been examined for individuals from throughout the Hawaiian archipelago providing insight into the origins of reef populations in this region. Ongoing work will apply the genotyping markers to additional populations. Nuclear sequence markers will be used in conjunction with microsatellite data to estimate the timing and extent of long distance dispersal events from central to eastern Pacific populations, and patterns of connectivity will be combined with oceanographic current models to identify primary corridors for genetic connectivity. Whether recruits to eastern Pacific populations are produced locally or regionally will affect the rate of colonization following disturbance and the ability of populations to adapt to local conditions, with important implications for the design of reserves intended to protect them.


Lessios, H.A. and Robertson, D.R. (2006) Crossing the impassable: genetic connections in 20 reef fishes across the eastern Pacific barrier. Proc. R. Soc. B. 273, 2201-2208.

Glynn, P.W. & Ault, J.S. (2000) A biogeographic analysis and review of the far eastern Pacific coral reef region. Coral Reefs. 19, 1-23.

Baums, I.B., Miller, M.W. & Hellberg, M.E. (2005) Regionally isolated populations of an imperiled Caribbean coral, Acropora palmata. Molecular Ecology. 14, 1377-1390.


Speaker: Dr. Hong Ma - Dept of Biology

Title: Distinct birth-and-death patterns of eukaryotic gene families: a summary of gene family studies and functional implications, a case study with two gene families encoding histone demethylases, and a hypothesis of an early eukaryotic genome duplication


Birth-and-death is a dominant form of eukaryotic gene family evolution, resulting in gene copy number changes that establish the foundation for functional divergence. We have studied the evolution of gene families that regulate development and control key cellular processes, and have found a correlation between gene copy number stability and functional conservation. Histone methylation is an important mechanism for controlling chromatin structure and gene expression. Recently, others have discovered genes encoding demethylases. We have analyzed all detectable homologs of histone demethylases and their prokaryotic relatives from several representative organisms with whole-genome sequences, and performed detailed phylogenetic studies of these genes. We show that there are two separate gene families with distinct evolutionary patterns. In addition, the fates of ancestral genes in the animal and plant lineages are different, with evidence for convergent evolution. Gene family studies that we and others have performed have provided strong evidence for duplication before the divergence of animals and plants. In principle, these duplications could have been the result of an ancient genome duplication. We have analyzed several thousand gene families, and obtained evidence that hundreds of gene families show duplication before the animal-plant split. These might be the remnants of a hypothesized genome duplication, which might have provided new copies of genes that facilitated early eukaryotic diversification.


Lin, Z., Kong, H., Nei, M.*, Ma, H.* 2006. Origins and evolution of the recA/RAD51 gene family: Evidence for ancient gene duplication and endosymbiont gene transfer. Proc. Natl. Acad. Sci. USA. 103: 10328-10333. (*corresponding authors)

Zahn, L. M., Leebens-Mack, J., Arrington, J. M., Hu, Y., Landherr, L.L., dePamphilis, C. W., Becker, A., Theissen, G., and Ma, H.* 2006. Conservation and divergence in the AGAMOUS subfamily of MADS-box genes: Evidence for independent sub- and neofunctionalization events. Evol. Dev. 8: 30-45.

Kong, H., Frohlich, M., Leebens-Mack, J., Ma, H., dePamphilis, C. 2007. Rapid birth of plant SKP1 genes by tandem duplication and retrotransposition. Plant J. 50: 873-885.

Lin, Z., Nei, M., Ma, H. 2007. The origins and early evolution of DNA mismatch repair genes - multiple horizontal gene transfers and co-evolution. Nucleic Acids Res. 35: 7591-7603.

Sun, Y., Zhou, X., Ma, H. 2007. Genome wide analysis of Kelch repeat-containing F-box family (KFB). J. Integ. Plant Biol. 49: 940-952.

Hu, W., dePamphilis, C., Ma, H. 2008. Phylogenetic analysis of the plant-specific Zinc Finger-Homeobox and Mini Zinc Finger gene families. J. Integ. Plant Biol. 50: 1031-1045.

Quan, L.#, Xiao, R.#, Li, W., Oh, S.-A., Ambrose, J.C., Cyr, R., Twell, D., Ma, H.* 2008. Functional divergence of the duplicated Arabidopsis AtKIN14a and AtKIN14b genes: critical roles in meiosis and gametophyte development. Plant J. 53: 1013-1026. (# co-first authors).

Surcel, A., Zhou, X., Quan, L., Ma, H. 2008. Long-term maintenance of stable copy number in the eukaryotic SMC family: origin of a vertebrate meiotic SMC1 and fate of recent segmental duplicates. J. System. Evol. 46: 405-


Zhou, X., Ma, H. 2008. Evolutionary history of histone demethylase families: distinct evolutionary patterns suggest functional divergence. BMC Evol. Biol. 8: 294.

Xu, G., Ma, H., Nei, M., Kong, H. 2009. Evolution of F-box genes in plants: different modes of sequence divergence and their relationships with functional diversification. Proc. Natl. Acad. Sci. USA. In press.


Speaker: Samir Wadhawan - Dept of Biochemistry, Microbiology, and Molecular Biology - CANCELLED

Title: Analysis of Mammalian Overlapping  Reading Frames and Metagenomic DNA


Recent advances in sequencing technologies have resulted in an exponential increase in the amount of sequence data available in the public domains. This not only makes it feasible to compare sequences from different species in order to analyze genomic signatures that distinguish them, but also facilitates the taxonomic classification of newly discovered species. Here we utilize the currently available sequence data to (i) study the evolution of mammalian overlapping reading frames and (ii) develop approaches to analyze metagenomic DNA.
The ability of a single stretch of DNA to encode different proteins using alternate reading frames was first discovered in bacteriophages phiX174 and G4. Since then, dual-coding regions have been shown to occur frequently in genomes of other viruses and bacteria. Here they serve to increase the diversity of encoded proteins in a genome constrained by smaller size. However, in higher eukaryotes where genome size is not a limitation, dual-coding regions are rare and are believed to be involved in gene regulation. Preserving overlapping reading frames in large genomes is costly as it not only decreases the substitution rate but also limits the choice of codon usage. Therefore most dual-coding regions in eukaryotic genomes are short and poorly conserved. Against these odds, there exist three human genes (XBP1, INK4A and GNAS1) with well-conserved dual-coding regions.  Studies of overlaps in XBP1 and INK4A have shown an inconsistency in the mode of evolution of dual-coding regions in mammals. The overlapping reading frames of   XBP1 co-evolve by accumulating nonsynonymous changes at similar rates. In contrast, those of INK4A evolve asymmetrically. One of the reading frames exhibits signatures of diversifying selection while the other is under purifying selection. Here we report the analysis of the dual-coding region of mammalian GNAS1 and show that it evolves differently in rodents as compared to other mammals. Further we analyze the selection constraints on other transcripts of GNAS1 and trace back the evolutionary history of the locus by comparing it with its paralog, GNAL.
     In an altogether different application, we have utilized publicly available sequences to develop a robust framework to analyze metagenomic DNA. The field of metagenomics, which aims to characterize organisms inhabiting a particular environment, is evolving rapidly due to the development of sequencing technologies that circumvent the need to clone DNA and culture microbes.  As a consequence, most studies to date have focused on the identification and characterization of prokaryotic species. Further, there still remains the need to establish a uniform methodology to analyze metagenomic data. Here we describe an approach we utilized to compare eukaryotic species diversity between distinct geographical locations in the Northeastern United States. The computational framework we have developed can not only be applied to other metagenomic data sets but can also be used for other read-count based applications of next-generation sequencing, such as the comparison of expression levels.


Wadhawan S, Dickins B, Nekrutenko A. (2008) Wheels within wheels: clues to the evolution of the Gnas and Gnal loci. Mol. Biol. Evol. 25(12):2745-57.

Szklarczyk R, Heringa J, Pond SK, Nekrutenko A. (2007) Rapid asymmetric evolution of a dual-coding tumor suppressor INK4a/ARF locus contradicts its function. Proc Natl Acad Sci U S A. 104(31):12807-12.

Chung WY, Wadhawan S, Szklarczyk R, Pond SK, Nekrutenko A. (2007) A first look at ARFome: dual-coding genes in mammalian genomes. PLoS Comput Biol. 3(5):e91.

Nekrutenko A, He J. (2006) Functionality of unspliced XBP1 is required to explain evolution of overlapping reading frames. Trends Genet. 22(12):645-8.


Speaker: Dr. Abdelali Barakat - Dept of School of Resources

Title: The Cinnamyl Alcohol Dehydrogenase gene family in Populus: Genome organization, expression, and evolution



The cinnamyl alcohol dehydrogenase (CAD) enzyme catalyzes the last step in the synthesis of the phenolic monomers that plants use to build their cell walls. CAD has been studied extensively; however, little is known about the evolution of CAD genes. In plants, only a couple of CAD genes were previously shown to be involved in wood development. The other CAD genes from model plants do not seem to be associated with wood development under normal growth conditions, which suggests that they serve other roles. Previous studies showed that members of the CAD family fall into different classes. However, we do not know whether the distribution of CAD genes into various classes reflects a functional divergence. Are genes from one class associated with xylem development while genes from the others function in defense against biotic stresses? Do genes from the different classes compensate for each other under both normal growth and stress conditions? We recently started a project on the CAD gene family and are attempting to answer some of these questions. We analyzed the phylogeny, the gene organization, the gene structure, the promoter sequences, and the expression of CAD genes in various tissues from Populus plants grown under normal conditions and from stressed plants. The phylogeny showed evidence of three main classes of CAD, with the bona fide cell wall-related genes clustering together in a separate class. Analysis of the genome organization of the CAD family in Populus showed that most CAD genes were duplicated and that a large fraction is still in conserved positions on the chromosomes. Expression profiling in Populus showed that the expression of CAD genes is coordinated in tissues from plants under normal growth and stress conditions. We showed that only two genes are involved in lignin biosynthesis in xylem (stem) cells while the others function together in plant defense against pest attack.
We also found that Populus CAD genes may play their role in defense in a tissue-specific manner. We showed that CAD gene duplicates have diversified and may have evolved modified functions.



Raes J, Rohde A, Christensen JH, Van de Peer Y, Boerjan W: Genome-wide characterization of the lignification toolbox in Arabidopsis. Plant Physiol 2003, 133(3):1051-1071.

Kim SJ, Kim KW, Cho MH, Franceschi VR, Davin LB, Lewis NG: Expression of cinnamyl alcohol dehydrogenases and their putative homologues during Arabidopsis thaliana growth and development: lessons for database annotations? Phytochemistry 2007, 68(14):1957-1974.

Tobias CM, Chow EK: Structure of the cinnamyl-alcohol dehydrogenase gene family in rice and promoter activity of a member associated with lignification. Planta 2005, 220(5):678-688.


Speaker: Dr. Greg Wray - Professor of Biology and Evolutionary Anthropology - Duke University

Title: Identifying changes in transcriptional regulation that contributed to human origins


The genetic differences that separate us from our closest living relatives seem modest, yet humans are morphologically, physiologically, and cognitively a prominent outlier among the extant great apes. Identifying the genomic regions and specific mutations that contributed to the origin of human traits requires that we develop methods to identify candidates throughout the genome. We are currently pursuing two such approaches: statistical methods based on maximum likelihood models to scan for evidence of branch-specific positive selection on noncoding regions, and empirical methods based on next-generation sequencing to identify genes whose expression differs between humans and apes. These analyses are proving useful in several ways. We carried out a meta-analysis of scans for positive selection on coding and noncoding regions, which suggests that mutations underlying adaptation differ depending upon gene function and tissue of expression. We are also carrying out focal analyses of particularly interesting candidate genes that have emerged from the scans. An example involves the SLC2A gene family, where two paralogs may have contributed to the expansion of brain size along the human lineage.





Speaker: Dr. David Geiser - Dept of Plant Pathology

Title: Fungal forensics and agriculture in the information age


The introduction of exotic pests and pathogens into natural and agricultural ecosystems has been a problem for millennia. Following the accidental introduction of the chestnut blight fungus into North America a century ago and the subsequent obliteration of the American Chestnut, the US government passed the Plant Quarantine Act of 1912, which set up a regulatory system to protect plants from introduced threats. As globalization has only increased in the last century, agriculture has faced a constant onslaught of new and introduced pathogens. Our ability to respond to these introductions is severely limited by a lack of knowledge about the diversity of microbes. We propose that by proactively characterizing important groups of pathogens in existing germplasm collections and making the data publicly available, we can greatly improve our ability to respond to new and introduced microbial pathogens. I will describe the Phytophthora Database here at Penn State, an online database for this important genus of fungus-like organisms that cause some devastating plant diseases, including the recently discovered “Sudden Oak Death” organism Phytophthora ramorum. I will also discuss how this database and others like it are helping in the discovery of new microbial species, and improving our ability to respond to disease outbreaks.


1. O’Donnell, K., Sutton, D.A., Fothergill, A., McCarthy, D., Rinaldi, M.G., Brandt, M.E., Zhang, N. and Geiser, D.M. 2008. Molecular phylogenetic diversity, multilocus haplotype nomenclature and in vitro antifungal resistance within the Fusarium solani species complex. J. Clin. Microbiol. 46: 2477-2490.

2. Park, J. and 28 co-authors. 2008. Phytophthora database: A forensic database supporting the identification and monitoring of Phytophthora. Plant Dis. 92: 966-972.

3. Blair, J.E., Coffey, M.D., Park, S-Y, Geiser, D.M. and Kang, S. 2008. A multilocus phylogeny for Phytophthora utilizing markers derived from complete genome sequences. Fungal Genet. Biol. 45: 266-267.

4. O’Donnell, K., Sarver, B.A.J., Sutton, D.A., Benjamin, L., Lindsley, M., Padhye, A., Geiser, D.M. and Ward, T.J. 2007. Phylogenetic diversity and microsphere array-based genotyping of human pathogenic fusaria, including isolates from the 2005-06 multistate contact lens-associated US keratitis outbreaks. J. Clin. Microbiol., 45: 2235-2248.

5. Zhang, N., O’Donnell, K., Sutton, D.A., Nalim, F.A., Summerbell, R.C., Padhye, A.A., and Geiser, D.M. 2006. Members of the Fusarium solani species complex causing infections in both humans and plants are common in the environment. J. Clin. Microbiol., 44: 2186-2190.



Speaker: Dr. Ross Hardison - Dept of Biochemistry & Molecular Biology

Ross Hardison, Yong Cheng, Lou Dore, X. Zhang, Y. Zhou, D.C. King, Y. Zhang, C. Dorman, D. Abebe, S. Kumar, F. Chiaromonte, Webb Miller, R. Green, Mitchell Weiss

Center for Comparative Genomics and Bioinformatics of the Huck Institutes of Life Sciences, The Pennsylvania State University, University Park, Pennsylvania; Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania; NimbleGen Systems Inc., Madison, Wisconsin


Sophisticated models of evolutionary change based on multispecies alignments in noncoding DNA sequences can be used to find gene regulatory regions. About half the predictions are correct (under favorable circumstances) and most nonfunctional regions are rejected. However, these methods are much less effective in finding regulatory regions that are lineage-specific or have undergone turnover in sites for transcription factor binding. We are combining several lines of investigation into the role of the transcription factor GATA-1 in erythroid maturation to better understand DNA sequences involved in gene regulation, both in terms of their evolutionary history and their current mechanisms of action. We mapped occupancy of DNA segments by GATA-1 along 66 million bp of mouse chromosome 7 in G1E erythroid cells (derived from Gata1-null mice) in which GATA-1 function has been restored. Initial ChIP-chip results were re-tested by independent quantitative PCR to generate a high quality dataset of occupied sites. Combining these results with patterns of gene expression in response to GATA-1, direct biological assays for functions of the occupied sites, and analysis of DNA sequences and evolutionary patterns revealed important insights. The protein GATA-1 is an exquisite discriminator of binding site motifs; only about 1/500 potential sites are occupied in vivo. Motifs for other transcription factors found in erythroid cells partially explain the specific binding. The occupied DNA segments tend to be in the “vicinity” (about a hundred kilobases) of genes regulated by GATA-1, with about half active as enhancers and about a quarter in promoters. Virtually all the occupied DNA segments contain the canonical binding site motif for GATA-1, WGATAR, but that motif is conserved in multiple mammalian orders in less than half the segments. These occupied segments with conserved binding site motifs are strongly associated with enhancement when tested in transfected mammalian cells.
These and other results strongly support the hypothesis that the evolutionary constraint on regulatory function of a subset of occupied sites preserves the sequence and location of the binding site motif. Occupied segments with lineage-specific motifs may have modulatory or storage functions for the transcription factor.
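
As an illustration of what counting "potential sites" for the WGATAR motif can involve, the following is a minimal sketch of a motif scan on both strands. It assumes the standard IUPAC ambiguity codes (W = A or T, R = A or G); the function name `find_wgatar` and the example sequence are hypothetical, and this is not the pipeline used in the study.

```python
import re

# IUPAC codes in the motif: W = A or T, R = A or G.
# Lookaheads are used so that overlapping matches are all reported.
FORWARD = re.compile(r"(?=([AT]GATA[AG]))")
# The reverse complement of WGATAR is YTATCW (Y = C or T).
REVERSE = re.compile(r"(?=([CT]TATC[AT]))")


def find_wgatar(seq):
    """Return sorted (start, strand, match) tuples for every WGATAR hit.

    Start positions are 0-based on the given strand; "-" hits are
    reported as they appear in the given sequence (YTATCW).
    """
    seq = seq.upper()
    hits = [(m.start(), "+", m.group(1)) for m in FORWARD.finditer(seq)]
    hits += [(m.start(), "-", m.group(1)) for m in REVERSE.finditer(seq)]
    return sorted(hits)


if __name__ == "__main__":
    # Made-up sequence with one hit on each strand.
    for hit in find_wgatar("CCAGATAAGGCTTATCTCC"):
        print(hit)
```

A scan like this only lists candidate motifs; deciding which ones are occupied in vivo requires experimental data such as the ChIP-chip and quantitative PCR results described above.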


Cheng Y, King DC, Dore LC, Zhang X, Zhou Y, Zhang Y, Dorman C, Abebe D, Kumar SA, Chiaromonte F, Miller W, Green RD, Weiss MJ, Hardison RC. (2008) Transcriptional enhancement by GATA1-occupied DNA segments is strongly associated with evolutionary constraint on the binding site motif. Genome Res. 18:1896-1905.






Speaker: Dr. Webb Miller - Dept of Biology
Title: Genomics and Species Conservation



This talk describes some of our recent efforts to better understand the genomic properties that affect, or are affected by, extinction of a species. We have sequenced the woolly mammoth (Miller et al. 2008) and the so-called Tasmanian Tiger (Miller et al. 2009) using new methods for handling ancient DNA. Currently, we are using high-throughput sequencing and genotyping methods to help save the Tasmanian Devil, an iconic Australian marsupial carnivore that is greatly endangered by Devil Facial Tumor Disease. This project has led to the formulation of a novel computational problem concerning the analysis of DNA sequences.



Miller et al. 2008. Sequencing the genome of the extinct woolly mammoth. Nature 456, 387-390.

Miller et al. 2009. The mitochondrial genome sequence of the Tasmanian tiger (Thylacinus cynocephalus). Genome Research 19, 213-220.


Speaker: Xin Ye - Dept of Genetics

Title: Scalloped, a critical partner of Yorkie in tissue growth control



The Hippo (Hpo) signaling pathway is a newly identified tumor suppressor pathway in Drosophila that regulates cell growth, proliferation, and apoptosis. Components of the Hpo pathway range from a transmembrane receptor to a nuclear transcription coactivator and are highly conserved in mammals. In this work, we focused on identifying the transcription factor(s) that bring the Hpo pathway's transcription coactivator, Yorkie (Yki), to target genes to drive their expression. The results showed that TEAD/Sd interacts with YAP/Yki to promote tissue growth.



Zhao, B., Kim, J., Ye, X., Lai, Z.-C., and K.-L. Guan. (2009). Both TEAD-Binding and WW Domains Are Required for the Growth Stimulation and Oncogenic Transformation Activity of Yes-Associated Protein. Cancer Res. 69 (3): 1089-1098.

Zhao, B., Ye, X., Yu, J., Li, L., Li, W., Li, S., Yu, J., Lin, J. D., Chinnaiyan, A. M., Lai, Z.-C. and K.-L. Guan. (2008). TEAD mediates YAP dependent gene induction and growth control. Genes & Development 22: 1962-1971.


Speaker: Dr. Eddie Holmes - Dept of Biology

Title: The Evolutionary Genomics of Influenza


Center for Infectious Disease Dynamics, Department of Biology, The Pennsylvania State University, University Park, PA 16802. USA
Fogarty International Center, National Institutes of Health, Bethesda, MD 20892. USA

Most studies of the evolution and epidemiology of influenza A virus have focused on the hemagglutinin (HA) protein in isolation, with relatively little attention paid to evolutionary dynamics at the genomic level, particularly the relationship between natural selection, reassortment, and the functional interactions among segments. However, recent developments in comparative genomics, particularly the Influenza Genome Sequencing Project begun in 2005 (with ~3500 genomes generated to date), have provided a vital new perspective on the epidemiology and evolution of human influenza A virus. Herein, I will show how genomic data are altering our view of influenza virus evolution. I will focus on the following areas: (i) the nature of viral evolution within epidemic seasons, particularly in the United States, (ii) the extent of global genetic diversity and pattern of migration, and (iii) the evolutionary genomics of influenza A virus, particularly its population dynamics and the evolutionary role played by reassortment. Together, these analyses reveal that the genome-wide evolution of influenza A virus is characterized by a complex pattern of frequent reassortment, often involving HA, interspersed with dramatic reductions in genetic diversity that most likely reflect selective sweeps, some of which affect multiple segments. The H3N2 and H1N1 subtypes of influenza A virus also exhibit strikingly different (and out-of-phase) epidemiological dynamics, with the latter characterized by the circulation of multiple lineages, indicative of weaker antigenic drift. Finally, I will show how intensive intra-host genome sequence data are providing a unique insight into the evolutionary dynamics of influenza, with mixed infection a particularly frequent occurrence.




Speaker: Sayaka Miura - Dept of Biology

Title: Linkage maps and birth-and-death evolution of MicroRNA genes in some flowering plant and insect species


MicroRNAs (miRNAs) are 21- to 24-nucleotide long RNAs that regulate gene expression at the posttranscriptional stage. MiRNAs can be classified into several gene families. It has been suggested that these families evolve following the birth-and-death model of evolution, but the detailed mechanism remains unclear. We have therefore constructed linkage maps for a few groups of closely related species of plants and animals and investigated how gene gains and losses occur in the evolutionary process. BLAST searches for conserved families of miRNA genes were conducted with the genome sequences of two rice subspecies (Japonica and Indica), two Arabidopsis species (A. thaliana and A. lyrata), and four Drosophila species (D. melanogaster, D. simulans, D. sechellia, and D. yakuba). About 130, 80, and 150 miRNA genes were found in rice, Arabidopsis, and Drosophila, respectively. The linkage maps showed that a substantial number of miRNA gene rearrangements have occurred in Arabidopsis and Drosophila species and that miRNA gene gains or losses were often generated tandemly. The total number of gene gains and losses was about 15 even between the two subspecies of rice, which diverged about 400,000 years ago. The rate of gene gains and losses per gene per year was 0.12 × 10^-6, 0.02 × 10^-6, and 0.01 × 10^-6 in rice, Arabidopsis, and Drosophila, respectively. The rate of gene gains and losses of miRNA genes was similar to that of olfactory receptor coding genes in Drosophila. We also studied the rates of nucleotide substitution in different regions of miRNA genes. The rates of substitution in the mature miRNA and the remainder regions were 1.7 × 10^-9 and 3.6 × 10^-9 per site per year between the two subspecies of rice, 0.32 × 10^-9 and 5.9 × 10^-9 between the two species of Arabidopsis, and 0.82 × 10^-9 and 2.71 × 10^-9 between D. melanogaster and D. simulans, respectively. These rates are considerably lower than the rates of synonymous and nonsynonymous substitution in protein-coding genes.
In general, however, the evolution of miRNA gene families is qualitatively similar to that of protein-coding multigene families, and follows the birth-and-death model of evolution. Mature miRNA sequences may change in the evolutionary process, but they remain functional as long as the complementary sequences match them.
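
The per-gene, per-year rate of gains and losses quoted above can be approximated with a short back-of-the-envelope calculation. The sketch below assumes that the rate is the number of events divided by the number of genes and by twice the divergence time (because changes accumulate along both lineages after the split). That formula is an assumption made for illustration, not necessarily the estimator used in the study, and the function name `gain_loss_rate` is hypothetical.

```python
def gain_loss_rate(events, n_genes, divergence_years):
    """Gene gains and losses per gene per year.

    Assumption: events accumulated along both lineages since the split,
    so the total elapsed evolutionary time is 2 * divergence_years.
    """
    return events / (n_genes * 2 * divergence_years)


# Rounded figures from the abstract for the two rice subspecies:
# about 15 gains/losses, about 130 miRNA genes, about 400,000 years.
rice_rate = gain_loss_rate(events=15, n_genes=130, divergence_years=400_000)

if __name__ == "__main__":
    print(f"{rice_rate:.2e} gains/losses per gene per year")
```

With these rounded inputs the result is about 0.14 × 10^-6, the same order of magnitude as the reported 0.12 × 10^-6; the difference is expected because the abstract gives only approximate counts.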


Tanzer, A. and Stadler, PF. (2004), Molecular Evolution of a MicroRNA Cluster, J. Mol. Biol. 339, 327-335.

Li, A. and Mao, L. (2007), Evolution of plant microRNA gene families, Cell Research 17, 212-218.


Speaker: Yogeshwar Kelkar - Dept of Integrative Biosciences

Title: What should be called a microsatellite: A case study with dinucleotides


Microsatellites have proved to be very useful in population genetic studies because of their abundance in eukaryotic genomes and the dynamic nature of their mutations. Some microsatellites are known to have functional effects, and microsatellite instability is responsible for a significant number of neurological diseases. Despite its relevance to these concerns, a consistently elusive property of microsatellites is their minimal requisite size. Here, we have attempted to determine the threshold sizes for uninterrupted dinucleotide microsatellites. We define the threshold value at which a repetitive motif becomes a microsatellite as the number of units for which the insertion-deletion (indel) frequency within the repetitive motif is higher than the average indel frequency in non-repetitive DNA sequence. We used two approaches to obtain the threshold size. In the first, experimental approach, we analyzed in vitro replication-slippage error rates of DNA polymerases at repetitive sequences of different repeat numbers. In the second, computational approach, we studied microsatellite indel polymorphism in ten ENCODE regions, using the HapMap-II resequencing data for 48 humans. To find the minimal size of a mature microsatellite, we studied the relationship between the proportion of polymorphic microsatellite loci and the size of the smallest microsatellite allele.
We found that loci with the smallest allele size of four or five repeats were rarely polymorphic, but the proportion of polymorphic loci increased steadily thereafter, up to an allele size of 12 repeats, at which point all microsatellite loci became polymorphic. This corresponded very well with results from the in vitro studies, suggesting that measuring the proportion of polymorphic loci as a function of repeat number is an appropriate method to study dynamic microsatellite mutational activity.
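
The threshold definition and the polymorphism summary described above can be sketched in a few lines. The example assumes the threshold is the smallest repeat number whose indel frequency exceeds the background frequency in non-repetitive DNA; the function names and all input values are hypothetical and chosen only to show the shape of the calculation.

```python
def threshold_repeat_number(indel_freq_by_repeats, background_freq):
    """Smallest repeat number whose indel frequency exceeds the background.

    indel_freq_by_repeats: dict mapping repeat number -> indel frequency.
    Returns None if no repeat number exceeds the background frequency.
    """
    for repeats in sorted(indel_freq_by_repeats):
        if indel_freq_by_repeats[repeats] > background_freq:
            return repeats
    return None


def polymorphic_fraction(loci):
    """Proportion of polymorphic loci, grouped by smallest allele size.

    loci: iterable of (smallest_allele_repeats, is_polymorphic) pairs.
    """
    totals, polymorphic = {}, {}
    for size, is_poly in loci:
        totals[size] = totals.get(size, 0) + 1
        polymorphic[size] = polymorphic.get(size, 0) + int(is_poly)
    return {size: polymorphic[size] / totals[size] for size in sorted(totals)}


if __name__ == "__main__":
    # Made-up frequencies, for illustration only.
    freqs = {4: 0.001, 5: 0.002, 6: 0.010, 7: 0.050}
    print(threshold_repeat_number(freqs, background_freq=0.005))
    print(polymorphic_fraction([(4, False), (4, False), (6, True), (6, False), (12, True)]))
```

Taking the first repeat number above the background gives a single threshold even if the measured frequencies are noisy at larger sizes; a real analysis would also need confidence intervals on each frequency.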


Yogeshwar D. Kelkar (1,2), Noelle Candiello (3), Suzanne E. Hile (3), Francesca Chiaromonte (2,4), Kateryna D. Makova (1,2), and Kristin A. Eckert (3)

(1) Department of Biology, Penn State University, University Park, Pennsylvania, 16802, USA

(2) Center for Comparative Genomics and Bioinformatics, Penn State University, University Park, Pennsylvania, 16802, USA

(3) Department of Pathology, Gittlen Cancer Research Foundation, The Pennsylvania State University College of Medicine, 500 University Drive, Hershey, PA 17033, USA

(4) Department of Statistics, Penn State University, University Park, Pennsylvania, 16802, USA


Speaker: Dr. Ken Weiss - Dept of Anthropology

Title: Current issues in inferring the genetic basis of complex traits



I will be discussing informally some of the issues being faced in the attempt to understand the genetic basis of complex traits. In particular, I will discuss current mapping approaches in human genetics, and a problem in craniofacial developmental genetics that we are working on.


Weiss, K. Tilting at Quixotic Trait Loci (QTLs): an evolutionary perspective on genetic causation, Genetics, 179: 1741-1756.
Buchanan, A, Sholtis, S, Richtsmeier, J, Weiss, K. What are genes for or where are traits from? What is the question? BioEssays, 31:198-208.


Speaker: Dr. Benjamin Dickins - Dept of Biochemistry & Molecular Biology

Title: High-resolution mapping of viral evolutionary trajectories



Experimental evolution of rapidly reproducing viruses offers a robust means to infer substitution trajectories during evolution. But, with approaches based on conventional sequencing, this inference is limited by how many individual genotypes can be sampled from the population at a time. Low frequency changes are difficult to detect, potentially rendering early stages of adaptation unobservable. We circumvent this using short-read sequencing technology in a fine-grained analysis of polymorphism dynamics in the single-stranded DNA phage PhiX174. Two replicate populations were adapted to their bacterial host, Escherichia coli C, under continuous culture conditions designed to minimize co-adaptation. Three aliquots, taken from each lineage during the first 32 hours, were subjected to in vivo DNA amplification followed by Solexa high-throughput sequencing. Polymorphism frequencies were calculated from reads aligned to the reference genome and signal was educed from noise with binomial filtering methods that harnessed quality scores and separate data from brief phage amplifications. Several metrics revealed evolutionary change in the two viral lineages. Ten out of 54 highly polymorphic sites showed monotonic increases in polymorphism frequencies over time in both populations, and, of these, seven were identified in previous PhiX174 studies that used conventional sequencing. Out of 33 pairs of polymorphic sites within a read-length of each other, seven showed statistically significant linkage in both lineages at all time points. Further parallel changes were apparent in the sharing of polymorphic sites between lineages and in correlated polymorphism frequencies. We also observed that missense mutations were more likely to occur than silent mutations. Our study offers the first glimpse into "real-time" substitution dynamics and offers a robust conceptual framework for future viral re-sequencing studies.
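
To illustrate the idea of separating signal from noise with a binomial filter, here is a minimal sketch that asks whether the number of variant reads at a site is higher than sequencing error alone would plausibly produce. It assumes a single fixed per-base error rate and a fixed significance cutoff; both values and the function names are hypothetical, and the study's actual filter, which drew on quality scores and control amplifications, is not reproduced here.

```python
from math import exp, lgamma, log


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), with 0 < p < 1.

    Terms are computed in log space so deep coverage does not overflow.
    """
    def log_pmf(i):
        return (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                + i * log(p) + (n - i) * log(1 - p))
    return sum(exp(log_pmf(i)) for i in range(k, n + 1))


def is_real_polymorphism(variant_reads, depth, error_rate, alpha=1e-3):
    """True if the variant count is unlikely under sequencing error alone.

    error_rate and alpha are illustrative assumptions, not values
    taken from the study.
    """
    return binomial_tail(variant_reads, depth, error_rate) < alpha


if __name__ == "__main__":
    # 30 variant reads out of 1000 at a 1% error rate: far above the ~10 expected.
    print(is_real_polymorphism(30, 1000, 0.01))
    # 12 variant reads out of 1000: consistent with error alone.
    print(is_real_polymorphism(12, 1000, 0.01))
```

Because many sites are tested at once, a real analysis would also correct the cutoff for multiple comparisons.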


 Wichman, H. A., Badgett, M. R., Scott, L. A., Boulianne, C. M. and Bull, J. J. (1999) "Different trajectories of parallel evolution during viral adaptation." Science 285(5426): 422-424.