Previous IMEG Seminars and Abstracts:
Speaker: Doug Cavener – Penn State University – Department of Biology
Title: Evolution of regulating protein
synthesis and why it matters to defining gene coding sequences
Abstract: Gene expression is
highly regulated at the level of protein synthesis, specifically at the
translation initiation step. Regulation of translation initiation is
mediated by variation in mRNA structure and sequence and by translation
initiation factors. The 5’ untranslated region (5’UTR) of eukaryotic mRNAs
is often long and structurally complex, and often contains multiple upstream open
reading frames (uORFs) and internal ribosome entry sites (IRES), which
affect the frequency and site of translation initiation by the ribosome.
In addition, translational regulatory proteins, including the family of eIF2 alpha
kinases, play a dominant role in regulating both the level of translation
initiation and start site selection. Previously, I performed the first
comparative analysis of start and stop codon sequence context among major
eukaryotic groups and analyzed 5’ UTR sequences. More recently, my research
group has developed genetic model systems in mice to determine the
functions of the eIF2 alpha kinase genes. We have discovered that
the eIF2 alpha kinase genes play diverse roles in metabolism, stress
responses, development, and neurological functions. An important
consequence of our work and that of others is that considerable variation in coding
sequences within genes is caused by the use of alternative translation
start and stop sites that are regulated by physiological and developmental
Zhang, W., Feng, D., Li, Y., Iida, K., McGrath, B.,
Cavener, D. R. (2006) PERK EIF2AK3 control of
pancreatic β cell differentiation and proliferation is required for
postnatal glucose homeostasis. Cell Metabol. 4:491-497.
Cavener, D. R., Ray, S. C.
(1991) Eukaryotic start and stop translation sites. Nuc. Acids Res.
Hao, S., Sharp, J. W.,
Ross-Inta, C. M., McDaniel, B. J., Anthony, T. G., Wek, R. C.,
Cavener, D. R., McGrath, B. C., Rudell, J. B., Koehnle, T. J., Gietzen, D.
W. (2005) Uncharged tRNA and sensing of amino acid deficiency in mammalian
piriform cortex. Science 307:1776-1778.
Speaker: Dr. David Rand - Brown University - Department of
Ecology & Evolutionary Biology
Title: Running hot and cold about
balancing selection: thermal selection in flies and barnacles
Abstract: Balancing selection can explain the maintenance of
genetic variation in populations. The popularity of this model has
waxed and waned over the years. In this seminar I will present
empirical data from two systems that make a case for balancing selection,
or certainly environmentally variable selection regimes, related to
temperature. In Drosophila melanogaster, we did two distinct thermal
selection experiments from two different stock populations and mapped
thermal QTL. Notably, a marker in the shaggy locus at band 3A was significantly
differentiated in both experiments, implicating a connection between
circadian rhythms and thermotolerance. The same allele that increased
in frequency in the high temperature populations is significantly clinal in
North America and is more common in Florida than in Maine. In the
acorn barnacle, Semibalanus balanoides, the Mpi locus has a common
polymorphism that shows genotype-specific zonation in the intertidal,
related to thermal stress. Experimental transplants, as well as DNA
sequence data of the Mpi locus, implicate a history of balancing selection,
and its modulation by gene flow. Together, these systems implicate
thermal selection as a likely source of genetic heterogeneity.
References: Schmidt, P. S. and Rand, D. M.
(2001) Adaptive maintenance of genetic polymorphism in an intertidal
barnacle: habitat- and life-stage-specific survivorship of MPI genotypes.
Schmidt, P. S. and Rand, D. M.
(1999) Intertidal microhabitat and selection at MPI: Interlocus contrasts
in the northern acorn barnacle, Semibalanus balanoides. Evolution:
Schmidt, P. S., Bertness, M. D.,
and Rand, D. M. (2000) Environmental
heterogeneity and balancing selection in the acorn barnacle Semibalanus
balanoides. Proc. R. Soc. Lond. B. 267:379-384.
Rand, D. M., Spaeth, P. S.,
Sackton, T. B., and Schmidt, P. S. (2002) Ecological genetics of Mpi and
Gpi polymorphisms in the acorn barnacle and the spatial scale of neutral
and non-neutral variation. Integ. and Comp. Biol. 42:825–836.
Ma – Penn State University – Department of Integrative BioSci
Title: Comparative study of Arabidopsis
thaliana and Arabidopsis lyrata small RNAs
Abstract: Small RNAs are short non-coding RNAs that regulate
gene expression post-transcriptionally. In this seminar I will present my
research on two classes of plant small RNAs, microRNAs (miRNAs) and siRNAs,
by comparing two closely related Brassicaceae species, Arabidopsis
thaliana and Arabidopsis lyrata. Some plant miRNAs are known to be
newly evolved and lineage-specific. Is this group of newly evolved miRNAs
under the same level of evolutionary constraint as ancient and
more conserved miRNAs? We classified Arabidopsis miRNAs into two
groups: "Brassicaceae-specific" and "more conserved",
based on whether they are identified only in Brassicaceae species or in
other species as well. We found that Brassicaceae-specific miRNAs have
greater divergence between Arabidopsis thaliana and Arabidopsis
lyrata in MIRNA sequences and target complementarity sites, and have
lower processing accuracy for miRNA/miRNA* production compared to more
conserved miRNAs. Arabidopsis siRNAs are known to have
"hotspots" of regions with high siRNA production, but it is not
known whether these siRNA hotspots are retained between species. We
compared siRNA hotspots in both genomes and found no evidence of retention
of 24nt siRNA hotspots.
Meyers BC, Axtell MJ, Bartel B, Bartel
DP, Baulcombe D, Bowman JL, Cao X, Carrington JC, Chen X, Green PJ,
Griffiths-Jones S, Jacobsen SE, Mallory AC, Martienssen RA, Poethig RS, Qi
Y, Vaucheret H, Voinnet O,
Watanabe Y, Weigel D, Zhu JK (2008) Criteria for
Annotation of Plant MicroRNAs. Plant Cell 20: 3186-3190.
Axtell MJ (2008) Evolution of microRNAs and their targets: Are all
microRNAs biologically relevant? Biochim. Biophys. Acta. 1779, 725-734.
Kasschau KD, Fahlgren N, Chapman EJ, Sullivan CM, Cumbie JS, Givan SA,
Carrington JC (2007) Genome-wide profiling and analysis of Arabidopsis
siRNAs. PLoS Biol. 5: e57.
Firth – Penn State University – Department of Biology
Title: A phylodynamic approach to the evolution of emerging infectious
Abstract: Eighty-seven novel human pathogens have been described
since 1980, with viruses comprising 75% of these. As a result, there
has been increasing interest in the processes that support and shape the
emergence of new pathogens. Studying the evolutionary trajectories of
both host and pathogen during emergence can reveal population- and
species-level changes in dynamics that characterize these events. In
particular, phylogenetic methods use the information in genetic data to add
insight into emergence by exploring features such as the direction and
speed of geographic spread of a pathogen, changes in population sizes over
time, and the relationships between a group of hosts and their pathogens, and
by inferring the timing of critical events such as cross-species transmission
and global dissemination. Here, the evolution of two emerging viral
pathogens, one human (Hantaviruses) and one livestock (Porcine Circovirus
2), will be explored to help determine the origin of these pathogens, as
well as the timeframe and geographic context under which they have emerged.
Holmes EC (2008) Evolutionary
history and phylogeography of human viruses. Annu Rev Microbiol. 62:
Finsterbusch T, Mankertz A
(2009) Porcine circoviruses—small but powerful. Virus Res. 143: 177-183.
Ramsden C, Holmes EC, Charleston
MA (2009) Hantavirus evolution in relation to its rodent and insectivore
hosts: no evidence for codivergence. Mol Biol Evol. 26: 143-153.
Speaker: MARKER LECTURE - Dr. John Doebley, University of
Wisconsin, 4:00 PM, 100 Berg Auditorium
Host: Blair Hedges
Title: Darwin and Domestication
Abstract: In his book “On the Origin of Species”, Charles Darwin used plant and animal
domestication as a model to inform his theory on evolution under natural selection.
Artificial selection during plant domestication is thought to have been
largely unconscious, the inevitable product of a sowing-reaping cycle.
Selection pressures placed by humans on crops are analogous to those placed
by seed-dispersers such as birds on wild species. Nevertheless,
Darwin’s use of domestication as a model for natural evolution has been
controversial. Over the past 20 years, genetic and molecular research
has begun to uncover the genetic basis of the changes involved in the evolution
of plant form under both natural and artificial selection. In the
case of domestication, approximately 15 genes involved in the changes in
morphology have been isolated. For most of these genes, the nature of the
alteration in the gene is understood. I will review what has been
learned about change in form under domestication and whether any patterns
are beginning to emerge.
Speaker: MARKER LECTURE - Dr. John Doebley, University of
Wisconsin, 4:00 PM, 100 Berg Auditorium
Host: Blair Hedges
Title: Unraveling a developmental pathway
involved in maize domestication
Abstract: Maize is a domesticated form of a wild Mexican grass called teosinte. The
domestication of maize from teosinte occurred about 8,000 years ago.
As a result of human (artificial) selection during the domestication
process, dramatic changes in morphology arose such that maize no longer
closely resembles its teosinte ancestor in ear and plant
architecture. Quantitative trait locus (QTL) mapping has shown that
many genes contributed to the differences between maize and teosinte, but
among these are several of very large effect. We have cloned and
analyzed two of these large-effect genes. teosinte branched1 (tb1)
is largely responsible for the difference between the long branches of
teosinte versus the short branches of maize. tb1 encodes a
transcriptional regulator that functions as a repressor of branch
elongation. Gene expression analysis indicates that the product of the
teosinte allele of tb1 accumulates at about half the level of the
maize allele. Fine-mapping experiments show that the differences in
phenotype and gene expression are controlled by an enhancer that is 65 kb
upstream of the ORF. teosinte glume architecture1 (tga1) is
largely responsible for the formation of a casing that surrounds teosinte
seeds but is lacking in maize. tga1 also encodes a
transcriptional regulator; however, in this case a single amino acid change
represents the functional difference between maize and teosinte. This
single amino acid change appears to convert the maize allele into a
transcriptional repressor of target genes. Analysis of the
interactions between tb1, tga1 and other domestication genes
indicates that they form a cascade of transcriptional regulators that were
a target of human selection during the domestication process.
Speaker: Dr. Charles Addo-Quaye – Penn
State University – Department of Biology
detection of cleaved RNA targets of small silencing RNAs in plants by using
the degradome sequencing method
Abstract: Small silencing RNAs are non-coding RNA sequences, 20-30 nucleotides (nt)
long, that play a critical role in gene and genome
regulation in eukaryotes. The two main modes of post-transcriptional gene
regulation are the suppression of translation and the cleavage of targeted
RNA transcripts. microRNAs (miRNAs) are a major category of small silencing
RNAs and are usually 21-24 nt long. In plants, the predominant role of
well-characterized miRNAs is the cleavage of messenger RNAs of members of
gene families of transcription factors and other regulatory genes involved
in growth and development. Finding the targets of a miRNA is essential to
discovering its biological significance. In this talk, I will discuss the
degradome sequencing method we designed and implemented for the global
detection of cleaved targets of small silencing RNAs and the CleaveLand
computational pipeline used in the analysis of degradome sequences. The
moss Physcomitrella patens (P. patens) is an important model
organism for investigating the evolution of land plants. I will also
discuss the results of our analyses of degradome sequences derived from
the Physcomitrella transcriptome.
Voinnet, Olivier. (2009).
Origin, Biogenesis, and Activity of Plant MicroRNAs. Cell 136: 669–687.
C., Eshoo, T.W., Bartel, D.P., and Axtell, M.J. (2008). Endogenous
siRNA and miRNA targets identified by sequencing of the Arabidopsis
degradome. Curr. Biol., 18, 758–762.
C., Miller, W., and Axtell, M.J. (2009). CleaveLand: A pipeline for using
degradome data to find cleaved small RNA targets. Bioinformatics 25:
Speaker: Dr. Todd LaJeunesse – Penn State University – Department of Biology
turnover of sequence variants in the ribosomal arrays of eukaryotes
inferred from molecular ecological studies of coral endosymbionts.
Abstract: Ribosomal DNA sequences have provided the basis of
phylogenetic reconstructions for most of the planet’s biota. Ironically,
rRNA genes evolve differently from single copy nuclear or plastid
genes. The enigmatic processes of concerted evolution, molecular
drive, and gene conversion appear to “homogenize” the sequences of hundreds to
thousands of copies arrayed in tandem repeats found on one or multiple
chromosomes in a typical eukaryotic genome. In reality, the
intragenomic arrays of most eukaryotes are not completely homogenized, but
instead comprise numerous functional and non-functional sequence
variants. The bacterial cloning and sequencing of rDNA recovers this
variation, which is often incorrectly interpreted as inter-individual
differences. However, the genomes of most individuals in a population
appear to possess one variant that is numerically dominant. The
tracking of dominant intragenomic variants among species of endosymbiotic
dinoflagellates associated with reef corals using denaturing gradient gel
electrophoresis (DGGE) fingerprinting of the Internal Transcribed Spacers
(ITS) reveals patterns suggestive of the mode and tempo of rDNA sequence
turnover. It would seem that rare variants, which differ from the
dominant sequence by one base change, periodically proliferate to displace
the dominant variant, resulting in the deliberate stepwise divergence of
rDNA among isolated populations. These data may help refine molecular
clock estimates of rDNA sequence evolution.
Thornhill DJ, LaJeunesse TC, Santos SR (2007)
Measuring rDNA diversity in eukaryotic microbial systems: How intragenomic
variation, pseudogenes, and PCR artifacts confound biodiversity estimates.
Mol Ecol 16:5326-5340.
Pinzón JH (2007) Screening
intragenomic rDNA for dominant variants can provide a consistent retrieval
of evolutionarily persistent ITS (rDNA) sequences. Molecular Phylogenetics
and Evolution. 45:417-422.
Speaker: Dr. Christina Grozinger – Penn State University – Department of Entomology
Title: Genomics and
evolution of chemical communication in social insects
Abstract: Chemical communication plays a
critical role in regulating social behavior in honey bees. Furthermore,
this communication system is exquisitely tuned to the environmental context
and physiological state of both the signaling and receiving animal, and
thus represents a subtle and intricate system for coordinating the
activities of thousands of individuals in a colony. We seek to
understand the molecular and physiological basis of modulation of chemical
communication in honey bees, both in terms of production of the chemical
signal and responsiveness of the receiving individual. We are also
extending these studies to other related species, to determine if the genes
associated with pheromone response are conserved across species, and to
elucidate the evolution of pheromonal regulation of social behavior.
Kocher, S.D., Richard, F.J.,
Tarpy, D.R., and C.M. Grozinger. “Queen reproductive state modulates
queen pheromone production and queen-worker interactions in honey bees”
Behavioral Ecology. Advance Access published on July 2, 2009; doi:
Richard, F.J., Tarpy, D.R, and
C.M. Grozinger. “Effects of insemination quantity on honey bee queen
physiology”. PLoS ONE, 2(10):e980 (2007).
Grozinger, C. M., Sharabash, N.
M., Whitfield, C. W. and Robinson, G. E. (2003) “Pheromone mediated gene
expression in the honey bee brain.” Proc Natl Acad Sci U S A 100 (Suppl
Robinson, G.E., Grozinger, C.M.,
and Whitfield, C.W. (2005) “Social life in molecular terms.” Nat Rev Genet 6,
Speaker: Dr. Trudy Mackay – North
Carolina State University – Dept of Genetics
Huck Institute Lecture Series – 4:00 PM 100 Berg
Title: Systems Genetics of Quantitative Traits in Drosophila
Abstract: Population variation for
quantitative traits is caused by segregating alleles at multiple
interacting loci, with effects that are sensitive to the environment.
Knowledge of the detailed genetic architecture of quantitative traits is
important from the perspectives of evolutionary biology, human health and
plant and animal breeding. Mapping quantitative trait loci to the level of
individual genes and causal molecular variants is challenging because large
numbers of individuals need to be assessed for the trait phenotype and a
dense panel of polymorphic molecular markers in order to detect loci with
modest effects; further, allelic effects can be sex-, environment- and
genetic background-specific. Our understanding of the genetic architecture
of quantitative traits will benefit from interrogating a single resource
population for variation in DNA sequence, transcript abundance, proteins
and metabolites; for multiple organismal phenotypes; and in multiple
environments. This ‘systems genetics’ approach will yield a detailed map of
genetic variants associated with each organismal phenotype in each
environment; provide a functional context for interpreting the phenotypes;
elucidate the genetic underpinnings that govern the interdependence of
multiple phenotypes; and address the long-standing question of the genetic
basis of genotype by environment interaction. The Drosophila Genetic Reference Panel (DGRP) is one such common
resource population, which consists of 192 inbred lines derived from the
Raleigh, USA population. The National Institutes of Health National Human
Genome Research Institute has approved the sequencing of these lines by the
Baylor College of Medicine Sequencing Center, using next generation
sequencing technologies. The DGRP is a living library of common
polymorphisms affecting complex traits, and a community resource for whole
genome association mapping of quantitative trait loci. I will report the
current status of the sequencing effort, as well as initial systems genetic
analyses of several Drosophila
life history traits.
Mackay, T. F. C., Stone, E. A., Ayroles, J. F. (2009).
The genetics of quantitative traits: challenges and prospects. Nat Rev Genet
10: 565 – 577.
Harbison, S. T., Carbone, M. A., Ayroles, J. F., Stone, E. A., Lyman,
R. F., Mackay, T. F. C. (2009). Co-regulated transcriptional networks
contribute to natural genetic variation in Drosophila sleep. Nat Genet
41: 371 – 375.
Ayroles, J. F., Carbone, M. A., Stone, E. A., Jordan, K. W., Lyman, R.
F., Magwire, M. M., Rollmann, S. M.,
Duncan, L. H., Lawrence, F., Anholt,
R. R. H., Mackay, T. F. C. (2009). Systems genetics of complex traits in Drosophila
melanogaster. Nat Genet 41: 299 – 307.
Speaker: Dr. Masafumi
Nozawa – Penn State University – Department
Title: Origin and
evolution of microRNA genes in Drosophila species
Abstract: MicroRNA (miR)
genes are known to regulate many genes at the posttranscriptional level. However,
their origin and evolutionary processes after their birth are still
unclear. I have therefore identified miR genes in 12 Drosophila
species using a bioinformatics approach and examined their evolutionary
mechanisms. The results showed that the extant and ancestral Drosophila
species have >100 miR genes and frequent gains and losses of miR genes
have occurred during evolution. A majority of gene gains generated new gene
families, suggesting that many miR genes have originated from non-miR
sequences. However, miR genes showed no sequence similarity to transposable
elements or protein-coding genes. Instead, nearly half of miR genes were
located within introns of protein-coding genes. These observations suggest
that miR genes have largely originated from random hairpin structures or
introns. I also found that new miR genes show a similar substitution rate
to synonymous sites of protein-coding genes, implying that most of the
“potential” miR genes may not have acquired any function yet and could
become nonfunctional. By contrast, old miR genes showed a substitution rate
much lower than that of protein-coding genes. Substitution patterns showed a strong
tendency for paired and unpaired sites in stem regions to retain
the same status even after substitutions. Therefore,
once miR genes acquire functions, they appear to evolve very slowly,
keeping their original structures over a long evolutionary period. This
study revealed contrasting patterns of evolution of Drosophila miR genes
over the short and long run.
Bartel, D.P. 2004. MicroRNAs: genomics, biogenesis,
mechanism, and function. Cell 116: 281-297.
Sempere, L.F., Cole, C.N., McPeek, M.A., and Peterson,
K.J. 2006. The phylogenetic distribution of metazoan microRNAs: insights
into evolutionary complexity and constraint. J Exp Zoolog B Mol Dev Evol
Lu, J., Shen, Y., Wu, Q., Kumar, S., He, B., Shi, S.,
Carthew, R.W., Wang, S.M., and Wu, C.I. 2008. The birth and death of
microRNA genes in Drosophila. Nat Genet 40: 351-355.
Speaker: Yuannian Jiao – Penn State University – Department of Plant Biology
Title: The history of genome
duplications in flowering plants: evidence from global gene family
Abstract: There is strong evidence that the ancestors of major eudicot lineages have
undergone one or more rounds of whole-genome duplication (WGD) following
the divergence of monocots and eudicots. Although the occurrence of WGD
event(s) is well accepted, the actual number, phylogenetic timing, and age
of the event(s) remain equivocal. To address these issues, we built a
phylogenomic pipeline to reconstruct the evolutionary relationships of 4433
gene families from the complete gene sets of Arabidopsis, Populus, Vitis,
and Oryza. 1787 families were characterized by a surviving duplication
shared by rosid I (Populus) and rosid II (Arabidopsis). We populated these alignments
with unigenes of Asteridae and re-estimated the phylogenies
to track potential WGD event(s) in eudicots, rosids, and asterids. Very
little evidence was found to support large-scale duplications shared only
by rosid I and rosid II, rejecting prior hypotheses of a rosid-wide WGD.
The overwhelming majority of resolved duplications shared by rosid I/II
were placed before the separation of rosids and asterids, providing
evidence for WGD (γ) early in eudicot evolution. Concentrations of
gene duplications also suggested potential WGD events in the lineages
leading to Solanaceae and to Asteraceae, but not across all Asteridae. Finally,
our results support two rounds of WGD (β and α) in the
Arabidopsis lineage after the divergence of rosid I/II. Global gene family
phylogenies are a valuable complement to genome-scale structural analysis,
incorporating extensive evidence even without conservation of gene order or
a sequenced genome, and facilitate a better understanding of WGD events in
Bowers JE, Chapman
BA, Rong J, Paterson AH: Unravelling angiosperm genome evolution by
phylogenetic analysis of chromosomal duplication events.
Nature 2003, 422(6930):433-438.
Tang H, Bowers JE,
Wang X, Ming R, Alam M, Paterson AH: Synteny and collinearity in plant
genomes. Science 2008,
NO IMEG SEMINAR - HAPPY THANKSGIVING
Speaker: Dr. Stephen Schaeffer – Penn State University – Department of Biology
Title: The Evolutionary Significance
of Inversions in Natural Populations of Drosophila
Abstract: The Drosophila 12 genomes project has provided
an opportunity to understand how evolution has shaped the organization of
genes on chromosomes, but it is unclear what evolutionary forces allow new
chromosomal rearrangements to invade natural populations. Chromosomal
rearrangements may play an important role in how populations adapt to a
local environment. The gene arrangement polymorphism on the third
chromosome of Drosophila pseudoobscura is a model system to help
determine the role that inversions play in the evolution of this species.
The gene arrangements are the likely target of strong selection because
they form classical clines across diverse geographic habitats, they cycle
in frequency over seasons, and they form stable equilibria in population
cages. A numerical approach was developed to estimate the fitness sets for
15 gene arrangement karyotypes in six niches based on a model of
selection–migration balance. Gene arrangement frequencies in the six
different niches were able to reach a stable metapopulation equilibrium
that matched the observed gene arrangement frequencies when recursions used
the estimated fitnesses with a variety of initial inversion frequencies.
These analyses show that a complex pattern of selection is operating in the
six niches to maintain the D. pseudoobscura gene arrangement
polymorphism. Models of local adaptation predict that the new inversion
mutations were able to invade populations because they held combinations of
two to 13 local adaptation loci together.
BHUTKAR, A. et al., 2008 Chromosomal
rearrangement inferred from comparisons of twelve Drosophila genomes.
SCHAEFFER, S. W., 2008 Selection in heterogeneous
environments maintains the gene arrangement polymorphism
of Drosophila pseudoobscura. Evolution 62: 3082-3099.
Speaker: Yogeshwar Kelkar – Penn State University – Department of Integrative
Should We Call a Microsatellite?
Abstract: Microsatellites are repeats of short
(1 to 6 bp) DNA motifs, and are ubiquitous in eukaryotic genomes.
Microsatellites experience rapid insertion-deletion (of the motif)
mutations as they are hotspots for polymerase slippage.
Microsatellite mutation rate estimates from pedigree studies and
experimental assays range from ~10^-6 to ~10^-2
mutations per locus per generation, orders of magnitude higher than for
non-repetitive DNA (Ellegren 2000). Due to their high polymorphism levels,
microsatellites are valuable genetic markers. While many (especially
intergenic) microsatellites are thought to evolve neutrally, some,
particularly the ones located within or in the vicinity of genes, are known
to affect gene expression, splicing, or protein sequence (Li et al. 2002), and
have been implicated in many diseases (Pearson, Nichol Edamura, and
Cleary 2005). Previously we have shown that
microsatellite size (repeat number) is a primary determinant of
mono-, di-, tri-, and tetranucleotide microsatellite mutation rates (Kelkar
et al. 2008). One of the more debated issues pertaining to the very definition
of microsatellites is whether there is a critical (‘threshold’) size
required for a repeat to qualify as a microsatellite. Previous
approaches to address this question involved phylogenetic observations of
microsatellite growth, or inferences based on size-frequency distributions
of microsatellites in genomes. In contrast, because the defining
characteristic of microsatellites is the dynamic nature of their mutations,
we used an operational definition of microsatellite threshold as the size
at which the rate of polymerase slippage at a repeat significantly and
sharply exceeds that of the background slippage process taking place at the
smallest repeats in the genome. Here, we present a combined computational
and experimental approach to determine the threshold value for [A/T]n
mononucleotides, and for [TG/AC]n and [TC/AG]n
dinucleotides. In our computational analysis, we assessed microsatellite
polymorphism levels from the extensive re-sequencing of ten ENCODE regions
in human populations (The International HapMap Consortium 2005). Our
premise was that the presence of polymorphisms at repeats of a certain repeat
number reflects their dynamic mutation activity. In our experimental
analysis, we modified our published HSV-tk in vitro mutagenesis
system (Eckert, Yan, and Hile 2002) to quantify DNA polymerase error frequencies
within tandemly repeated sequences differing by increments of one unit.
With this combined approach, we find evidence for the existence of threshold sizes
for all microsatellites investigated. Importantly, our results indicate that
the microsatellite threshold is characterized by a minimal number of
nucleotides, rather than a minimal number of repeats, irrespective of the
size of the motif involved. With our approach, we aim to set an unambiguous
standard for what loci should be called microsatellites in future studies.
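Illustrative sketch (not part of the talk): the distinction drawn above between a threshold in repeat number and a threshold in nucleotides can be made concrete with a small Python example. It scans a sequence for perfect tandem repeats of 1 to 6 bp motifs and keeps only those at least `min_len_nt` nucleotides long. The function name `find_repeats`, the regular-expression approach, and the cutoff of 10 nt are all assumptions chosen for illustration; they are not the speaker's pipeline, and the abstract does not state the threshold values found.

```python
import re

def find_repeats(seq, max_motif=6, min_len_nt=10):
    """Find perfect tandem repeats of 1..max_motif bp motifs.

    Returns a sorted list of (start, motif, repeat_number, length_nt).
    The cutoff is applied to total length in nucleotides (hypothetical
    default of 10), not to the number of repeat units.
    """
    seq = seq.upper()
    hits = []
    for k in range(1, max_motif + 1):
        # A k-bp motif followed by one or more exact copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1+" % k)
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are just a shorter motif repeated (e.g. "TGTG").
            if any(motif == motif[:d] * (k // d)
                   for d in range(1, k) if k % d == 0):
                continue
            length = m.end() - m.start()
            if length >= min_len_nt:
                hits.append((m.start(), motif, length // k, length))
    return sorted(hits)

# Toy sequence: a 12-nt [A]n run and a 12-nt [TG]n run.
seq = "CG" + "A" * 12 + "CC" + "TG" * 6 + "CA"
print(find_repeats(seq))
# [(2, 'A', 12, 12), (16, 'TG', 6, 12)]
```

Both toy repeats pass the same 10 nt cutoff although they differ in repeat number (12 versus 6 units), which is the sense in which a nucleotide-based threshold is independent of motif size. The sketch handles perfect repeats only and ignores interruptions.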
References: The International Hapmap Consortium. 2005. A haplotype
map of the human genome. Nature 437:1299-1320.
Eckert, K. A., G. Yan, and S. E. Hile. 2002. Mutation
rate and specificity analysis of tetranucleotide microsatellite DNA alleles
in somatic human cells. Mol Carcinog 34:140-150.
Ellegren, H. 2000. Microsatellite mutations in the
germline: implications for evolutionary inference. Trends Genet 16:551-558.
Kelkar, Y. D., S. Tyekucheva, F. Chiaromonte, and K. D.
Makova. 2008. The genome-wide determinants of human and chimpanzee
microsatellite evolution. Genome Res 18:30-38.
Li, Y. C., A. B. Korol, T. Fahima, A. Beiles, and E.
Nevo. 2002. Microsatellites: genomic distribution, putative functions and
mutational mechanisms: a review. Mol Ecol 11:2453-2465.
Pearson, C. E., K. Nichol Edamura, and J. D. Cleary.
2005. Repeat instability: mechanisms of dynamic mutations. Nat Rev Genet 6:729-742.