Preparing for a Changing Climate and an Altered Landscape:
A Symposium for Uniting Theoretical & Empirical Approaches to Biodiversity Science
Friday, June 7, 2019 (9:00 AM-5:00 PM)
Anne D. Yoder, Duke University, USA
Ziheng Yang, University College London, UK
Thomas Flouris, University College London, UK
Xiyun Jiao, University College London, UK
Jelmer Poelstra, Duke University, USA
Bruce Rannala, University of California, Davis, USA
George Tiley, Duke University, USA
Dong-dong Wu, Kunming Institute of Zoology, CAS, CHINA
Ziheng Yang, University College London, UK
Anne Yoder, Duke University, USA
Tianqi Zhu, Academy of Mathematics & Systems Science, CAS, CHINA
|09:15am-09:45am||Oral talk: Anne Yoder|
|09:45am-10:15am||Oral talk: George Tiley|
|10:30am-11:00am||Oral talk: Jelmer Poelstra|
|11:00am-11:30am||Oral talk: Dong-dong Wu|
|01:45pm-02:15pm||Oral talk: Bruce Rannala|
|02:15pm-02:45pm||Oral talk: Thomas Flouris|
|03:00pm-03:30pm||Oral talk: Xiyun Jiao|
|03:30pm-04:00pm||Oral talk: Tianqi Zhu|
|04:00pm-04:30pm||Oral talk: Ziheng Yang|
|04:30pm-05:00pm||Open Discussion & Concluding Remarks|
Abstracts for Public Symposium:
Preparing for a Changing Climate and an Altered Landscape:
A Symposium for Uniting Theoretical & Empirical Approaches to Biodiversity Science
June 7, 2019
Morning Session: Empirical biodiversity studies using the multispecies coalescent
Madagascar as a natural laboratory for understanding climatic impacts on the genomics of speciation and species delimitation in cryptic mammals
Presenter: Anne Yoder, Duke University, USA
Madagascar is one of the world’s greatest natural evolutionary laboratories. Having been isolated from other significant landmasses for at least 80 million years, its biodiversity comprises a complex mix of plants and animals that have evolved independently for millions of years. Some lineages are truly ancient, existing on the island prior to the explosive extinction event around the Cretaceous/Tertiary boundary, whereas others are more recent arrivals, having been transported from other landmasses via transoceanic dispersal. In virtually every circumstance, independent of biogeographic origins, this diversity is poorly documented and severely threatened by human-mediated habitat destruction. Thus, as evolutionary geneticists with an eye towards conservation, it is our challenge to characterize this biodiversity and to prioritize it for protection. A powerful method for doing so lies in the application of the Multispecies Coalescent (MSC) to comparative genomic data. With such an approach we can characterize fundamental evolutionary parameters such as phylogeny, divergence time, and historical demography, even with limited population-level sampling. Combined with the relative ease and low cost of genome-wide SNP data generated via RADseq, the MSC offers one of the most effective strategies for identifying those lineages and populations in decline and thus in most dire need of protection.
Resolving recent biogeographic history and migration corridors among populations of mouse lemurs
Presenter: George Tiley, Duke University, USA
Biodiversity in Madagascar is threatened by human-driven forest loss and fragmentation. As closed-canopy forest is removed, grasslands take over and transform the ecology and community structure. Central Madagascar is largely composed of grassland, with patches of forests that create the Central Highland Savannah. However, whether the origins of the Central Highland Savannah are anthropogenic or due to past climate change remains a topic of debate. We investigated fragmentation and the age of the Central Highland Savannah with Goodman’s mouse lemur (Microcebus lehilahytsara), which is assumed to be restricted to closed-canopy forest. Goodman’s mouse lemur is regarded as a highland specialist and has a uniquely disjunct distribution in both lowland eastern rainforest and relict forest patches in the Central Highland Savannah. We investigated RADseq data with statistically rigorous coalescent methods to perform model-based hypothesis testing and infer migration corridors among sampling sites. Analyses of molecular evolution revealed that forest fragmentation occurred rapidly and affected both the eastern rainforests and Central Highlands during a period of decreased precipitation near the last glacial maximum. Multiple lines of evidence suggest population substructure with a high degree of connectivity between the eastern rainforests and the Central Highland Savannah. Although RADseq loci are short relative to traditional molecular markers and may contain little variation individually, full-likelihood coalescent methods implemented in a Bayesian framework are capable of leveraging information about demographic processes across multiple loci, as demonstrated by comparisons between posterior and marginal prior distributions.
RADseq data for non-model taxa combined with coalescent methods can be a useful tool for similar investigations of natural history between 10 kya and 2 Mya, but a number of practical and computational challenges for investigations of increased scope are discussed. Overall, our findings support origins of the Central Highland Savannah that predate human arrival, but also show that continued habitat loss from deforestation would likely endanger Goodman’s mouse lemur, which is currently classified as vulnerable.
Divergence and gene flow in mouse lemurs inferred by RAD-sequencing
Presenter: Jelmer Poelstra, Duke University, USA
The process of speciation is one of the most fundamental biological processes, yet remains poorly understood in many groups of organisms. Mouse lemurs are a highly diverse genus of small primates endemic to Madagascar, in which nearly all species look extremely similar and most diversity has only recently been described using genetic data. However, in many cases it is not clear to what extent described species represent distinct biological species, since they are geographically isolated from each other and have been sequenced only at mitochondrial loci. Studies are needed in areas where mouse lemur taxa come into contact, using genomic data to co-estimate divergence times and rates of gene flow. Widespread hybridization was recently inferred in a contact zone between two closely related species of mouse lemurs. Here, we demonstrate with RADseq data that, contrary to expectations, little or no hybridization and no introgression is currently taking place despite local co-occurrence of the two species. Using formal admixture statistics and coalescent-based modelling, we infer that some ancestral gene flow did take place but has ceased towards the present. These two species are therefore reproductively isolated, and since we estimate that they diverged less than a million years ago, this provides evidence for rapid speciation in mouse lemurs. At the same time, we find that two populations that were recently split as distinct species have diverged only within the last 30,000 years, substantiating concerns that taxa may be “oversplit” when genetic data is limited to mitochondrial DNA. We conclude that, particularly in cryptic species, genomic data is key to inferring divergence histories which, in turn, can be used to delimit species and understand patterns of speciation.
Genetic introgression and high-altitude adaptation of domestic animals
Presenter: Dong-dong Wu, Kunming Institute of Zoology, CAS, CHINA
Abundant and diverse domestic mammals living on the Tibetan Plateau provide useful material for investigating the genetic underpinnings of environmental adaptation. We used large-scale genome data from horses, sheep, goats, cattle, pigs, dogs and donkeys living at both high and low altitudes to disentangle the genetic mechanisms underlying local adaptation. We found that convergent positive selection at the gene level occurs frequently in these Tibetan domestic mammals. Genetic introgression was found to be an important mechanism driving high-altitude adaptation of domestic animals. We also report a potential function in response to hypoxia for the gene C10orf67, which underwent positive selection in the domestic mammals. Our data provide insight into convergent evolution and genetic introgression of high-altitude domestic mammals, and should facilitate the search for additional novel genes involved in the hypoxia response pathway.
Afternoon Session: Theoretical applications for measuring biodiversity using the multispecies coalescent
Species delimitation and species tree estimation using genomic sequence data
Presenter: Bruce Rannala, University of California, Davis, USA
Species play a central role in all branches of biological research and are the fundamental unit used to measure biodiversity on earth. The current rate of species extinctions due to anthropogenic activity (including climate change) is difficult to estimate precisely, both because species boundaries are in some cases unclear and because millions of species have yet to be described. Species delimitation is the process of determining which groups of individual organisms constitute different populations of a single species and which constitute different species. Genomic data carries extensive information about the degree of genetic isolation among species, including ancient and recent introgressions, and thus can play an important role in species delimitation. Several large initiatives are currently underway to either generate DNA barcodes for most species (International Barcode of Life), or to sequence the genomes of all known species and discover the remaining species (Earth BioGenome Project). An estimated 80 to 90% of species on earth are currently undiscovered. Such efforts require methods for identifying new species (species delimitation) or assigning organisms to known species. Genomic species delimitation is therefore at the forefront of modern biodiversity science. For many organisms, morphological species delimitation has been very effective; however, some domains of life have proven difficult to classify using morphology. Bacteria, for example, have few distinctive traits. Other groups may have a distinct morphology that is influenced by environment, leading to morphological convergence or divergence that is loosely tied to genetic or evolutionary relatedness. Moreover, morphological species delimitation requires a high level of expertise and is time consuming, making it impractical for use with very large groups that urgently need to be classified.
A semi-automated process of delimitation in which experts need only confirm results obtained using genomic data and computer algorithms is therefore very attractive. Current methods for automated species delimitation and species tree inference using genomic data are described. The methods are based on an explicit population genetic model, the multispecies coalescent, that accounts for factors that can cause gene trees and species trees to differ and accommodates the statistical uncertainty of inferred gene trees. Several outstanding difficulties and future challenges for genomic species delimitation are discussed.
Bayesian MCMC implementation of the multispecies-coalescent-with-introgression (MSci) model
Presenter: Thomas Flouris, University College London, UK
Recent analyses suggest that cross-species gene flow or introgression is common in nature, especially during species divergences. Genomic sequence data can be used to infer introgression events and to estimate the timing and intensity of introgression, providing an important means to advance our understanding of the role of gene flow in radiative speciations. We have implemented the multispecies-coalescent-with-introgression (MSci) model, an extension of the multispecies-coalescent (MSC) model to incorporate introgression, in our Bayesian Markov chain Monte Carlo (MCMC) program bpp. The MSci model accommodates deep coalescence (or incomplete lineage sorting) as well as introgression/hybridization and provides a natural framework for such inference. Both computer simulation and real data analysis suggest that hundreds or thousands of loci are needed to estimate the introgression proportion reliably. Re-analysis of datasets from the purple cone spruce confirms the hypothesis of hybrid speciation, but with no evidence for extinction of the parental species at the formation of the hybrid species. We also estimated the intensity of introgression using genomic sequence data from six mosquito species in the Anopheles gambiae species complex; the intensity varies considerably across the genome, driven by differential selection against introgressed alleles.
The impact of migration/introgression on species tree estimation
Presenter: Xiyun Jiao, University College London, UK
Recent analyses of genomic sequence data suggest that cross-species gene flow is common in both plants and animals and may pose serious challenges to inference of species phylogeny. In extreme cases, pervasive gene flow causes the whole-genome phylogeny to differ from the species tree, while the sex chromosomes (X or Z), presumably enriched with hybrid sterility genes and thus resistant to gene flow, may reflect the true history of species divergence. Here we examine the amount of gene flow that is necessary to mislead species tree estimation in the case of three species, with either episodic introgressive hybridization or continuous migration between the outgroup species and one of the ingroup species. We focus on two species tree methods, which use the gene tree topologies (the majority-vote method) and the average sequence distances (or average coalescent times) between species, respectively. Both are based on the multispecies coalescent model but do not account for gene flow. Our analysis suggests that a small amount of introgression or a low rate of migration may be sufficient to mislead species tree methods, especially if the species diverged through radiative speciation events over short time intervals. Recent analyses of genomic data from the Anopheles gambiae species complex and the Heliconius butterflies suggest that these are examples of such extreme impact of gene flow on species phylogeny.
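To make the majority-vote idea concrete, here is a minimal sketch for three species A, B and C with species tree ((A,B),C). It uses the standard multispecies coalescent result that, without gene flow, a gene tree matches the species tree with probability 1 - (2/3)e^(-T), where T is the internal branch length in coalescent units. The function names and the example topology counts are hypothetical, and gene flow is deliberately not modelled, mirroring the assumption made by the methods under study.

```python
import math
from collections import Counter


def msc_topology_probs(t_internal: float) -> dict:
    """Gene-tree topology probabilities for species tree ((A,B),C).

    t_internal is the internal branch length in coalescent units.
    Assumes the multispecies coalescent with no gene flow.
    """
    p_discordant = math.exp(-t_internal) / 3.0
    return {
        "((A,B),C)": 1.0 - 2.0 * p_discordant,  # matches the species tree
        "((A,C),B)": p_discordant,
        "((B,C),A)": p_discordant,
    }


def majority_vote(gene_tree_topologies) -> str:
    """Return the most frequent gene-tree topology across loci."""
    counts = Counter(gene_tree_topologies)
    return counts.most_common(1)[0][0]


# Hypothetical per-locus topologies, for illustration only.
loci = ["((A,B),C)"] * 5 + ["((A,C),B)"] * 3 + ["((B,C),A)"] * 2
print(majority_vote(loci))
print(msc_topology_probs(0.5))
```

When T is small, the three probabilities are nearly equal, so even a modest excess of one discordant topology caused by gene flow could change the winner of the vote.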
Maximum likelihood implementation of the isolation-with-migration model
Presenter: Tianqi Zhu, Academy of Mathematics & Systems Science, CAS, CHINA
We develop a maximum likelihood (ML) method for estimating migration rates between species using genomic sequence data. A species tree is used to accommodate the phylogenetic relationships among three species, allowing for migration between the two sister species, while the third species is used as an out-group. A Markov chain characterization of the genealogical process of coalescence and migration is used to integrate out the migration histories at each locus analytically, whereas Gaussian quadrature is used to integrate over the coalescent times on each genealogical tree numerically. This is an extension of our earlier implementation of the symmetrical isolation-with-migration model for three species to accommodate arbitrary loci with two or three sequences per locus and to allow asymmetrical migration rates. Our implementation can accommodate tens of thousands of loci, making it feasible to analyze genome-scale data sets to test for gene flow. We calculate the posterior probabilities of gene trees at individual loci to identify genomic regions that are likely to have been transferred between species due to gene flow. We conduct a simulation study to examine the statistical properties of the likelihood ratio test for gene flow between the two in-group species and of the ML estimates of model parameters such as the migration rate. Inclusion of data from a third out-group species is found to increase dramatically the power of the test and the precision of parameter estimation.
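The numerical integration step mentioned above can be illustrated with a fixed-order Gauss-Legendre rule. This is a minimal sketch and not the authors' implementation: the 3-point rule, the function name `gauss_legendre_3` and the example integrand (the exponential density of a pairwise coalescent time in coalescent units) are assumptions chosen for illustration. A real implementation would likely use more quadrature points and the per-locus likelihood as the integrand.

```python
import math

# Nodes and weights of the 3-point Gauss-Legendre rule on [-1, 1].
_NODES = (-math.sqrt(3.0 / 5.0), 0.0, math.sqrt(3.0 / 5.0))
_WEIGHTS = (5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0)


def gauss_legendre_3(f, a: float, b: float) -> float:
    """Approximate the integral of f over [a, b] with 3 quadrature points."""
    half_width = (b - a) / 2.0
    midpoint = (a + b) / 2.0
    # Map each node from [-1, 1] onto [a, b] and form the weighted sum.
    return half_width * sum(
        w * f(midpoint + half_width * x) for x, w in zip(_NODES, _WEIGHTS)
    )


# Illustrative integrand: density exp(-t) of the coalescent time of two
# lineages in one population, integrated over one coalescent time unit.
approx = gauss_legendre_3(lambda t: math.exp(-t), 0.0, 1.0)
exact = 1.0 - math.exp(-1.0)
print(approx, exact)
```

Only three evaluations of the integrand are needed here, which is the appeal of quadrature when the integrand is smooth and must be evaluated at many loci.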
Coalescent-based models and methods for inferring gene flow between species using genomic data
Presenter: Ziheng Yang, University College London, UK
I'll provide an overview of models and methods for testing and estimating gene flow between species. Mainly two kinds of models have been developed. First, the continuous migration model assumes a certain proportion of immigrants between species per generation. This is based on the structured coalescent model in population genetics and, by incorporating the species phylogeny, has become known as the isolation-with-migration (IM) or isolation-with-initial-migration (IIM) model. Second, the hybridization or introgression model assumes that at a certain time point in the past two parental species merged to form a hybrid species, or some individuals of one species moved into another through introgressive hybridization. While in theory continuous migration and episodic introgression can be distinguished using genomic data, and they may also operate on the same species phylogeny, it is currently a great challenge even to implement either of them properly. A number of methods and software programs have been developed to detect the presence of gene flow, and some of the methods can also estimate the migration rate or introgression intensity. Some methods are based on data summaries, such as the proportions of (estimated) gene trees from different gene loci or genomic segments (such as SNaQ/PhyloNetworks, PhyloNet), or genome-scale summaries (the ABBA-BABA test or D statistic and methods based on the multi-population joint site frequency spectrum). These methods are typically computationally efficient but have poor statistical performance. Other methods make full likelihood-based inference using sequence alignments. These are more powerful in terms of statistical properties but suffer from high computational costs. I will discuss the strengths and weaknesses of the different methods and speculate on directions for future research.
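As a small illustration of one genome-scale summary named above, the D statistic (ABBA-BABA test) compares the counts of two discordant site patterns for taxa arranged as (((P1, P2), P3), O), with D = (ABBA - BABA) / (ABBA + BABA). The sketch below assumes four aligned sequences of equal length, one per taxon; the function names and the toy sequences are hypothetical, and no significance test (such as a block jackknife) is included.

```python
def count_abba_baba(p1: str, p2: str, p3: str, outgroup: str):
    """Count ABBA and BABA site patterns in four aligned sequences."""
    n_abba = n_baba = 0
    for a, b, c, o in zip(p1, p2, p3, outgroup):
        if any(base not in "ACGT" for base in (a, b, c, o)):
            continue  # skip gaps and ambiguous bases
        if a == o and b == c and a != b:
            n_abba += 1  # P2 and P3 share the derived allele
        elif a == c and b == o and a != b:
            n_baba += 1  # P1 and P3 share the derived allele
    return n_abba, n_baba


def d_statistic(n_abba: int, n_baba: int) -> float:
    """D = (ABBA - BABA) / (ABBA + BABA)."""
    total = n_abba + n_baba
    if total == 0:
        raise ValueError("no ABBA or BABA sites to compare")
    return (n_abba - n_baba) / total


# Toy alignment, for illustration only.
abba, baba = count_abba_baba("AGAAAG", "GAGACG", "GGGACA", "AAAAAA")
print(abba, baba, d_statistic(abba, baba))
```

With no gene flow the two patterns are expected to be equally frequent, so D is expected to be near zero; a value far from zero is what the test treats as a signal of gene flow between P3 and either P1 or P2.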