Institute of Molecular
Evolutionary Genetics
















Fall 2010






Speaker: Dr. Jim Marden, Penn State University, Department of Biology

Title: Evolutionary origins of flight in insects

    Recent findings regarding the evolutionary origin of insects, their gas exchange physiology, wing development and flight ability are rapidly changing what we know and hypothesize about the history and biology of this major animal group. Hexapods now appear to be a sister taxon of branchiopod crustaceans, arising within the Pancrustacea. Developmental studies support the wings-from-gills hypothesis by identification of specific expression of orthologous genes during gill/wing formation in crustaceans and insects, and repression of abdominal appendages in modern insects by variants of genes that, in crustaceans, do not repress abdominal appendages. As gills evolved into wings, gas exchange physiology must have changed simultaneously, and there is now evidence for this in both the origin of tracheae from cells in the wing primordia, under control of wing-development genes, and in the discovery that some modern Plecoptera still express high levels of functional haemocyanin in their blood (i.e. insects that use both tracheal and blood-based gas exchange, similar to the overlap of dinosaur and avian traits in Archaeopteryx). Plecoptera and Ephemeroptera also display a wide range of mechanisms for aerodynamic locomotion across the surface of water, while their body weight is fully or partly supported by the water. These behaviors demonstrate a series of mechanically intermediate stages by which flapping gills could have evolved into flight-capable wings in an aquatic environment.

Marden, J.H.  2008. Evolution and physiology of flight in aquatic insects.  In: Aquatic Insects: Challenges to Populations, ed. J. Lancaster.  CABI Press. 
Hagner-Holler, S., Pick, C., Girgenrath, S., Marden, J.H., and Burmester, T. 2007. Diversity of stonefly hexamerins and implication for the evolution of insect storage proteins. Insect Biochem. Mol. Biol. 37:1064-74.
Hagner-Holler, S., A. Schoen, W. Erker, J.H. Marden, R. Rupprecht, H. Decker, T. Burmester. 2004.  A respiratory hemocyanin from an insect. Proceedings of the National Academy of Sciences 101: 871-874.
Marden, J.H. and M.A. Thomas. 2003. Rowing locomotion by a stonefly that possesses the ancestral pterygote condition of co-occurring wings and abdominal gills. Biological Journal of the Linnean Society 79: 341–349.
Thomas, M.A., K.A. Walsh, M.R. Wolf, B.A. McPheron, and J.H. Marden.  2000.  Molecular phylogenetic analysis of evolutionary trends in stonefly wing structure and locomotor behavior. Proceedings of the National Academy of Sciences 97:13178-13183.


Speaker: Dr. Stephan Schuster, Penn State University, Department of Biochemistry and Molecular Biology

Title: Assessing African human diversity: The Southern African Genome project.


        Advances in sequencing technology have made human whole-genome sequencing more accessible, now also allowing the inclusion of a wider array of human indigenous groups. Human diversity is believed to be greatest on the continent of human origin, Africa. While the original sequencing efforts focused on European, Asian and Western African groups, our efforts targeted representatives of the southern parts of Africa, e.g. the Bantu and Bushman groups of South Africa and Namibia. Both groups represent highly diverse linguistic and cultural communities that have adopted lifestyles as herders and agriculturalists (Bantu) and hunter/gatherers (Bushman). Although the two groups share a cultural history that is hundreds of centuries old, their genetic ancestry is far less well known.

        In this study we investigate the extent of genetic variation of the Bushman genome in comparison to other publicly available human genomes. For this undertaking we have performed whole-genome shotgun sequencing using Roche/454 Titanium chemistry of two complete genomes, together with exome sequencing of two matched and three additional samples. The genome of the Bantu representative, Archbishop Tutu, was sequenced on the SOLiD platform in combination with exome sequencing on the Roche platform. In addition, the Bushman and Bantu genomes were resequenced using the Illumina GA platform, allowing for a three-way reconfirmation of any newly reported human variants.

        Our study investigates the SNP diversity, indel and repeat content, and copy number variation of the two ethnic groups against the human reference sequence. The genetic variations detected by sequencing (multi-platform) are being additionally validated in parallel using DNA SNP arrays. The study therefore aims at generating a high-quality version of a human genome that defines the outer boundaries of human diversity, as well as providing new markers for novel-content arrays.

        Our data will also support the interpretation of other human genome sequences. As a large number of the discovered genetic variants are novel and currently not contained in dbSNP, this project will aid future studies on rare human alleles. With current genome-wide association studies being largely limited to European populations, disease associations have generally been mapped to broad genomic regions. Human diversity studies of the Bushmen and Bantu will facilitate the narrowing of these regions.

        More importantly, we believe that this genome project will bring advantages to the Southern African people at the medical level. It is hoped that availability of the first Southern African human genome sequences will aid the development of drugs that no longer exclude these ethnic groups. The immediate use of 1.3 million novel SNP markers from this study in ongoing medical research is therefore seen only as the very first step for extending the benefits of modern medicine also to indigenous groups throughout the world.


S.C. Schuster, W. Miller, A. Ratan, L.P. Tomsho, B. Giardine, L.R. Kasson, R.S. Harris, D.C. Petersen, F. Zhao, J. Qi, C. Alkan, J.M. Kidd, Y. Sun, D.I. Drautz, P. Bouffard, D.M. Muzny, J.G. Reid, L.V. Nazareth, Q. Wang, R. Burhans, C. Riemer, N.E. Wittekindt, P. Moorjani, E.A. Tindall, C.G. Danko, W.S. Teo, A.M. Buboltz, Z. Zhang, Q. Ma, A. Oosthuysen, A.W. Steenkamp, H. Oostuisen, P. Venter, J. Gajewski, Y. Zhang, B.F. Pugh, K.D. Makova, A. Nekrutenko, E.R. Mardis, N. Patterson, T.H. Pringle, F. Chiaromonte, J.C. Mullikin, E.E. Eichler, R.C. Hardison, R.A. Gibbs, T.T. Harkins, V.M. Hayes.


Speaker: Dr. Hiroki Goto, Penn State University, Department of Biology

(Makova Lab)


            The endangered Przewalski’s horse is the closest relative of the domestic horse and is the only true wild horse species surviving today. The question of whether Przewalski’s horse is the direct progenitor of the domestic horse has been hotly debated. Studies of DNA diversity within Przewalski’s horses have been sparse, but are urgently needed to ensure their successful reintroduction to the wild. To resolve the controversy surrounding the phylogenetic position and genetic diversity of Przewalski’s horses, we used massively parallel sequencing technology to decipher the complete mitochondrial and partial nuclear genomes for all four surviving maternal lineages of Przewalski’s horses. Three mitochondrial haplotypes were discovered: two similar ones, haplotypes I/II, and one substantially divergent from the other two, haplotype III. Haplotypes I/II vs. III did not cluster together on a phylogenetic tree, rejecting the monophyly of Przewalski’s horse maternal lineages, and were estimated to have split 0.117-0.186 million years ago, significantly preceding horse domestication. In the phylogeny based on X chromosomal sequences, Przewalski’s and domestic horse lineages were intermixed, while in that built from autosomal sequences, Przewalski’s horse lineages were monophyletic. Despite a recent genetic bottleneck, Przewalski’s horses exhibited nucleotide diversity comparable with that of other mammalian species. Thus, Przewalski’s horses have ancient polyphyletic origins and are not the direct progenitors of domestic horses. The analysis of the vast amount of sequence data presented here suggests that Przewalski’s and domestic horse lineages diverged at least 0.117 million years ago, but since then have retained ancestral genetic polymorphism and/or experienced gene flow.
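For readers unfamiliar with the nucleotide diversity statistic mentioned above, the following is a minimal sketch of how it is commonly computed: the average number of pairwise differences per site among aligned sequences. The function name and the toy sequences are hypothetical illustrations, not the speaker's data or pipeline, and the sketch assumes gap-free sequences of equal length.

```python
from itertools import combinations


def nucleotide_diversity(sequences):
    """Average pairwise differences per site among aligned sequences.

    Assumes (hypothetically) that all sequences are already aligned,
    have equal length, and contain no gaps or ambiguous bases.
    """
    n = len(sequences)
    if n < 2:
        raise ValueError("need at least two sequences")
    length = len(sequences[0])
    if length == 0 or any(len(s) != length for s in sequences):
        raise ValueError("sequences must be non-empty and of equal length")

    # Count mismatching sites for every unordered pair of sequences.
    total_differences = sum(
        sum(a != b for a, b in zip(s1, s2))
        for s1, s2 in combinations(sequences, 2)
    )
    number_of_pairs = n * (n - 1) // 2
    return total_differences / (number_of_pairs * length)


# Toy example with three made-up 4-bp haplotypes.
toy_haplotypes = ["ACGT", "ACGA", "TCGA"]
pi = nucleotide_diversity(toy_haplotypes)
```

In the toy example the three pairs differ at 1, 2 and 1 sites, so the estimate is 4 differences over 3 pairs and 4 sites.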



Speaker: Dr. Timothy Jegla, Penn State University, Department of Biology

Title: The evolutionary history of metazoan ion channels

        The metazoan nervous system requires a large and functionally diverse set of ion channel genes to generate complex behavioral responses. We can trace how and when this ion channel set evolved with great precision through comparison of genomes from phylogenetically diverse species. The basic structures of metazoan ion channel proteins are ancient (most first appear in prokaryotes), but evolution of the characteristic set of metazoan channel gene families appears to have been driven by the advent of neuronal signaling. Continuing duplication of this neural-specific channel gene set has led to unique channel suites in extant metazoan phyla.

Review discussing human ion channel set and its evolutionary origins revealed by genome analysis. Jegla TJ, Zmasek CM, Batalov S, Nayak SK. Evolution of the human ion channel set. Comb Chem High Throughput Screen. 2009 Jan;12(1):2-23. Review. PubMed PMID: 19149488.

An early view of K+ channel evolution gleaned from C. elegans genome sequence. Wei A, Jegla T, Salkoff L. Eight potassium channel families revealed by the C. elegans genome project. Neuropharmacology. 1996;35(7):805-29. PubMed PMID: 8938713.

The Cnidarian genome – shows what genes were in the first nervous systems. Putnam NH, Srivastava M, Hellsten U, Dirks B, Chapman J, Salamov A, Terry A, Shapiro H, Lindquist E, Kapitonov VV, Jurka J, Genikhovich G, Grigoriev IV, Lucas SM, Steele RE, Finnerty JR, Technau U, Martindale MQ, Rokhsar DS. Sea anemone genome reveals ancestral eumetazoan gene repertoire and genomic organization. Science. 2007 Jul 6;317(5834):86-94. PubMed PMID: 17615350.


Speaker: Heather Simmons, Penn State University, Department of Biology

(Stephenson Lab)

Title: The effect of transmission mode on genetic diversity in Zucchini Yellow Mosaic virus (ZYMV)

    As a result of selective pressures and population bottlenecks, viral genetic diversity is modulated by the manner in which transmission occurs. By sequencing the coat protein (CP) gene of ZYMV, we generated clone data from three distinct populations: two horizontally transmitted populations (one via aphid, the other without) and one vertically transmitted population (via seed). As individual transmission events are at least partially random, both selection and drift may shape the evolutionary trajectories of RNA viruses. Thus our clone data provide a unique opportunity to look at how the interplay between transmission mode and genetic diversity is reflected in clonal populations of ZYMV. As our data encompass the epidemiological spread of the virus, they have also enabled a determination of the intra-host RNA viral diversity. This is important for understanding how RNA viruses, such as ZYMV, emerge, overcome host resistance, switch hosts, and adapt to rapidly changing environments. In addition, our data indicate that the ZYMV seed transmission rate is approximately three orders of magnitude higher than the most commonly cited rate. Correctly accounting for seed transmission will permit the refinement of management strategies for this devastating crop pathogen that annually costs millions in agricultural losses.
Simmons, H.E., Holmes, E.C., Stephenson, A.G. 2008. Rapid evolutionary dynamics of zucchini yellow mosaic virus. J. Gen. Virol. 89, 1081-1085.


Speaker: Dr. Mathias Stiller, Penn State University, Department of Biology
(Shapiro Lab)
Title: Employing next generation high-throughput sequencing technologies to improve the genetic analysis of extinct animals

        In my talk I will give an overview of the main challenges I faced during my PhD when analyzing ancient DNA extracted from fossil remains. In particular, I will focus on miscoding lesions in ancient DNA sequence data generated using next generation high-throughput sequencing (NGS) technologies (e.g. the 454 platform) compared to traditional Sanger sequencing (Stiller et al. 2006). I will further present a case study of cave bears in which I reconstructed their demographic history over time, and discuss possible reasons for their extinction (Stiller et al. 2010). In the third part, I will introduce a new method that allows the reconstruction of large amounts of ancient DNA sequence data from multiple individuals. This combination of multiplex PCR with molecular barcoding and NGS technologies allows a detailed reconstruction of the phylogenetic relationships among cave bears (Stiller et al. 2009).
Stiller M., Green R. E., Ronan M., Simons J. F., Du L., He W., Egholm M., Rothberg J. M., Keates S. G., Ovodov N. D., Antipina E. E., Baryshnikov G. F., Kuzmin Y. V., Vasilevski A. A., Wuenschell G. E., Termini J., Hofreiter M., Jaenicke-Després V. and Pääbo S. (2006) “Patterns of nucleotide misincorporations during enzymatic amplification and direct large-scale sequencing of ancient DNA.” Proceedings of the National Academy of Sciences, USA 103(37), 13578-84.

Stiller M., Knapp M., Stenzel U., Hofreiter M. and Meyer M. (2009) “Direct multiplex sequencing (DMPS) – a novel method for targeted high-throughput sequencing of ancient and highly degraded DNA.” Genome Research 19(10), 1843-8.

Stiller M., Baryshnikov G., Bocherens H., Grandal d'Anglade A., Hilpert B., Münzel S.C., Pinhasi R., Rabeder G., Rosendahl W., Trinkaus E., Hofreiter M. and Knapp M. (2010) “Withering away – 25,000 years of genetic decline preceded cave bear extinction.” Molecular Biology and Evolution 27(5):975-8.


Speaker: Dr. Josh Der, Penn State University, Department of Biology

(dePamphilis Lab)

Title: High throughput genome and transcriptome sequencing reveal extremely high levels of RNA editing in the plastome of bracken fern.


RNA editing is the post-transcriptional modification of RNA nucleotides relative to their encoding genomic DNA sequences. RNA editing in land plant organelles occurs in the form of pyrimidine exchanges. In general, levels of RNA editing are higher in mitochondrial genomes than in chloroplast genomes, and in seed-free vascular plants and hornworts relative to other lineages of land plants (i.e. liverworts, mosses, and seed plants). To date, genome-wide chloroplast RNA editing has been systematically examined in only seed-free plants: a fern (Adiantum capillus-veneris) and a hornwort (Anthoceros formosae). The extremely high levels of RNA editing observed in chloroplast transcripts of ferns and hornworts relative to seed plants (up to 25 times higher) raise a number of interesting questions on the evolution of RNA editing in plastomes. We present a genome-wide analysis of chloroplast RNA editing in a second fern species. We use a novel, rapid method to determine the complete chloroplast genome sequence and identify RNA editing sites using second-generation high-throughput shotgun DNA sequencing technologies. This study also represents the first application of RNA-seq to examine RNA editing in a chloroplast transcriptome. The complete circular mapping chloroplast genome sequence of Pteridium aquilinum is 152,362 bp long and contains a gene set and gene order identical to Adiantum capillus-veneris. There are 117 different genes, including 84 protein coding genes, 4 ribosomal RNA genes, and 29 transfer RNA genes. We experimentally detected 551 unique C to U RNA editing sites and 300 U to C RNA editing sites in the chloroplast genome, 2.5 times that detected in Adiantum and approaching the level of RNA editing observed in Anthoceros. RNA editing events have been observed in protein coding genes, tRNA, rRNA, introns, and intergenic regions.
This work will enable a reexamination of the evolution of chloroplast RNA editing by surveying RNA editing in the complete plastid transcriptome of a second fern. Our approach should also be widely applicable to studying RNA editing in the chloroplasts of other plant lineages.
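The idea of calling pyrimidine-exchange editing sites by comparing a genomic sequence with its transcript can be illustrated with a minimal sketch. This is not the speaker's method: the function name and toy sequences are hypothetical, and the sketch assumes a single gap-free, position-by-position alignment of genomic DNA to cDNA (where U is read as T). A real RNA-seq analysis would map many reads and apply coverage and frequency thresholds.

```python
def find_editing_sites(genomic_dna, cdna):
    """List pyrimidine-exchange differences between aligned DNA and cDNA.

    Hypothetical helper: both strings must be the same length and
    aligned position by position. A 'T' in the cDNA stands for a 'U'
    in the RNA. Positions are reported 1-based.
    """
    if len(genomic_dna) != len(cdna):
        raise ValueError("sequences must be aligned to the same length")

    sites = []
    pairs = zip(genomic_dna.upper(), cdna.upper())
    for position, (dna_base, rna_base) in enumerate(pairs, start=1):
        if dna_base == "C" and rna_base == "T":
            sites.append((position, "C-to-U"))
        elif dna_base == "T" and rna_base == "C":
            sites.append((position, "U-to-C"))
        # Any other mismatch is not a pyrimidine exchange and is ignored.
    return sites


# Toy example with made-up 6-bp sequences.
example_sites = find_editing_sites("ACGTCA", "ATGCCA")
```

In the toy example, position 2 is reported as a C-to-U site and position 4 as a U-to-C site.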


Paul G. Wolf; Joshua P. Der; Aaron M. Duffy; Jacob B. Davidson; Amanda L. Grusz; and Kathleen M. Pryer. "The evolution of chloroplast genes and genomes in ferns". (2010) Plant Molecular Biology, in press.


Der, Joshua P. "Genomic Perspectives on Evolution in Bracken Fern" (2010). All Graduate Theses and Dissertations. Paper 663.


Sugiura, M.: RNA Editing in Chloroplasts. In H.U. Göringer (ed) RNA Editing. Nucleic Acids and Molecular Biology 20 (2008) 123-142.


Wolf, P.G., Rowe, C.A. and Hasebe, M.: High levels of RNA editing in a vascular plant chloroplast genome: analysis of transcripts from the fern Adiantum capillus-veneris. Gene 339 (2004) 89-97.



Speaker: Dr. Masafumi Nozawa, Penn State University, Department of Biology

(Nei Lab)

Title: Extensive birth-and-death evolution of microRNA gene repertoires in plant species

        MicroRNAs (miRs) are among the most important regulatory elements in animals and plants. However, their origin and evolutionary dynamics have not been studied systematically in plants. In this study, we have identified putative miR genes in 11 plant species using a bioinformatic approach and examined their evolutionary changes. Our homology search revealed the possibility that miR genes existed before the divergence of land plants and green algae. The number of miR genes increased in the land plant lineage, but after the divergence of eudicots and monocots the number has fluctuated considerably in a lineage-specific manner. We also found that miR genes have mainly originated from the duplication of preexisting miR genes or protein-coding genes, although transposable elements have potentially been important as well for the origin of miR genes. By contrast, the contribution of random hairpin structures seems to be minor. After origination, old miR gene families have been retained in a genome more often than new gene families. In addition, we found that single-copy miR genes have been lost more frequently than the genes belonging to multigene families. These results suggest that young single-copy genes may be under less functional constraint, whereas old miR genes belonging to multigene families have essential functions. These evolutionary patterns of plant miR genes are partly similar to but partly quite different from those of Drosophila miR genes.


Fahlgren, N., S. Jogdeo, K.D. Kasschau, C.M. Sullivan, E.J. Chapman, S. Laubinger, L.M. Smith, M. Dasenko, S.A. Givan, D. Weigel et al. 2010. MicroRNA Gene Evolution in Arabidopsis lyrata and Arabidopsis thaliana. Plant Cell 22: 1074-1089.

Ma, Z., C. Coruh, and M.J. Axtell. 2010. Arabidopsis lyrata Small RNAs: Transient MIRNA and Small Interfering RNA Loci within the Arabidopsis Genus. Plant Cell 22: 1090-1103.

Nozawa, M., S. Miura, and M. Nei. 2010. Origins and Evolution of MicroRNA Genes in Drosophila Species. Genome Biol Evol 2: 180-189.



Speaker: Dr. Webb Miller, Penn State University, Department of Biology

Title: Analysis of primate gene clusters

        We have organized a project to sequence and analyze some primate gene clusters of biomedical interest. Most of the talk will describe our automatic and interactive computational tools for reconstructing the evolutionary history of gene clusters, focusing on detection of conversion events. Our conversion-detection tools are efficient enough to comprehensively catalog conversion events for the entire human genome.  

Zhang, Y., Song, G., Vinar, T., Green, E.D., Siepel, A., and W. Miller. 2009. Evolutionary history reconstruction for mammalian complex gene clusters. J. Comp. Biol. 16:1051-1070.

Hsu, C., Zhang, Y., Hardison, R., NISC COMPARATIVE SEQUENCING PROGRAM, Green, E.D., and W. Miller. 2010. An effective method for detecting gene conversion events in whole genomes. J. Comp. Biol. 17: 1281-1297.



Speaker: Dr. Hielim Kim, Penn State University, Department of Biology

(Nei Lab)

Title: Evaluation of the Markov cluster (MCL) algorithm for identifying gene families from genome sequence data

        The eukaryotic genome is composed of thousands of multigene families, and therefore it is important to know how these gene families change in the process of evolution. Because the number of gene families is so large, it is necessary to use bioinformatic techniques to identify each gene family and study its evolutionary change. There are many such computational methods, but the most frequently used one is the Markov cluster (MCL) method of Enright et al. (2002). In this method, a group of closely related genes is identified by a special mathematical definition of similarity without considering the phylogenetic relationships of the genes. We have therefore decided to examine the accuracy of finding the correct gene clusters or gene families using several different sets of animal genomes. The results showed that the number of multigene families identified varies considerably with the sets of genomes used. Because MCL is used for identifying single-copy genes that can be used for constructing phylogenetic trees of different species, we also examined the accuracy of identification of single-copy genes from Blast and phylogenetic analysis. Our results have shown that the single-copy genes identified by MCL are often incorrect and therefore the genes identified for a set of species cannot be used for constructing trees for different sets of species. We are currently identifying gene families by using the PANTHER hidden Markov model (HMM).
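As background on the clustering step discussed above, here is a minimal sketch of the core Markov cluster loop, which alternates expansion (matrix squaring) and inflation (element-wise powering followed by column normalization) on a similarity graph. It is a simplified illustration, not the published implementation or the speaker's analysis: the function names, the inflation value of 2.0, the convergence tolerance and the toy graph are all assumptions, and in practice the input weights would come from sequence-similarity scores rather than a hand-written matrix.

```python
def normalize_columns(matrix):
    """Scale each column so that it sums to 1 (column-stochastic)."""
    n = len(matrix)
    sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    return [[matrix[i][j] / sums[j] for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mcl(similarity, inflation=2.0, max_iter=100, tol=1e-6):
    """Cluster nodes of a non-negative, symmetric similarity matrix.

    Hypothetical helper illustrating the expansion/inflation loop.
    Returns clusters as sorted lists of node indices.
    """
    n = len(similarity)
    # Add self-loops so every column has a positive sum, then normalize.
    m = normalize_columns(
        [[similarity[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    )
    for _ in range(max_iter):
        expanded = matmul(m, m)  # expansion: flow spreads along paths
        inflated = normalize_columns(  # inflation: strong flow is boosted
            [[value ** inflation for value in row] for row in expanded]
        )
        change = max(abs(inflated[i][j] - m[i][j])
                     for i in range(n) for j in range(n))
        m = inflated
        if change < tol:
            break
    # Each row that still carries flow defines one cluster: the set of
    # columns (nodes) with non-negligible entries in that row.
    clusters = {frozenset(j for j, v in enumerate(row) if v > 1e-6)
                for row in m}
    return sorted(sorted(c) for c in clusters if c)


# Toy graph: two disconnected triangles (nodes 0-2 and nodes 3-5).
toy_similarity = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
toy_clusters = mcl(toy_similarity)
```

Note that nothing in this loop consults a phylogeny: clusters depend only on the similarity weights and the inflation parameter, which is consistent with the abstract's concern that the families recovered can vary with the set of genomes supplied.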

Duarte JM, Wall PK, Edger PP, Landherr LL, Ma H, Pires JC, Leebens-Mack J, Depamphilis CW. 2010. Identification of shared single copy nuclear genes in Arabidopsis, Populus, Vitis and Oryza and their phylogenetic utility across various taxonomic levels. BMC Evol Biol 10:61.

Enright AJ, Van Dongen S, Ouzounis CA. 2002. An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res 30:1575-1584.


Speaker: Dr. Chris House, Penn State University, Department of Geosciences
Title: Metagenomics in low DNA environments

Microbial environments with low activity and relatively small amounts of DNA are challenging to study, but important for the field of Astrobiology. The application of metagenomics to several low DNA environments will be presented. These include past research on the subsurface of sediments off of the Peru Margin and the surface waters of the Dead Sea. On-going work on a variety of subseafloor environments will also be discussed.


Rhodes, M.E., Fitz-Gibbon, S.T., Oren, A., House, C.H. 2010.  Amino Acid Signatures of Salinity on an Environmental Scale with a Focus on the Dead Sea. Environmental Microbiology.  12:  2613–2623, September 2010

Biddle, J.F., Fitz-Gibbon, S.T., Schuster, S.C., Brenchley, J.E., House, C.H.,  2008.  Metagenomic signatures of the subseafloor biosphere.  PNAS 105: 10583-10588.


~~~~~~~~~~~~~~~ THANKSGIVING BREAK ~~~~~~~~~~~~~~~~~~~~


Speaker: Dr. Kazuhiko Kawasaki, Penn State University, Department of Anthropology

(Weiss Lab)

Title: The evolution of milk casein genes from tooth genes

Milk caseins are a composite of proteins, comprising two distinct types. The calcium-sensitive casein interacts with calcium, whereas the calcium-insensitive casein stabilizes the casein-calcium complex. Enzymatic digestion of calcium-insensitive caseins results in milk coagulation and separation of the curd (cheese) and whey. By analogy, it has been thought that the calcium-insensitive casein arose from γ-fibrinogen, a blood coagulation factor. However, we discovered two different genes, each ancestral to one type of casein gene, in the lizard and mammalian genomes. Both ancestral genes are involved in tooth formation and are unrelated to blood coagulation. I will discuss the duplication history of these and other related genes.

Evolutionary genetics of tissue mineralization: the origin and evolution of the secretory calcium-binding phosphoprotein family. Kawasaki, K. and Weiss, K. M. 2006 J. Exp. Zool. (Mol. Dev. Evol.) 306B: 295-316.
Mineralized tissue and vertebrate evolution: The secretory calcium-binding phosphoprotein gene cluster. Kawasaki, K. and Weiss, K. M. 2003 Proc. Natl. Acad. Sci. USA. 100: 4060-4065.