22.3: Selection Pressures and Drivers - Biology

  1. Competition for sunlight. Seedless vascular plants were able to reach heights of up to 100 feet. In the lineage leading to the gymnosperms and angiosperms, some plants developed the ability to grow wider as they grew taller. This secondary growth provided increased stability and, eventually, allowed plants to reach heights of over 300 feet.
  2. Drought. Dry conditions would have selected for plants with thicker cuticles, leaves with less surface area to evaporate from, and propagules that could disperse without water and survive through dry periods to germinate when water was available.
  3. Herbivory. In addition to leaves that could resist drought, the presence of insects would have driven selection for plants that could defend against herbivory. The thick cuticle and tough texture of xerophytic leaves made them difficult to eat, while resin canals in both leaves and stems provided another line of defense.

Parasitoids as drivers of symbiont diversity in an insect host

Immune systems have repeatedly diversified in response to parasite diversity. Many animals have outsourced part of their immune defence to defensive symbionts, which should be affected by similar evolutionary pressures as the host’s own immune system. Protective symbionts provide efficient and specific protection and respond to changing selection pressure by parasites. Here we use the aphid Aphis fabae, its protective symbiont Hamiltonella defensa, and its parasitoid Lysiphlebus fabarum to test whether parasite diversity can maintain diversity in protective symbionts. We exposed aphid populations with the same initial symbiont composition to parasitoid populations that differed in their diversity. As expected, single parasitoid genotypes mostly favoured a single symbiont that was most protective against that particular parasitoid, while multiple symbionts persisted in aphids exposed to more diverse parasitoid populations, which in turn affected aphid population density and rates of parasitism. Parasite diversity may be crucial to maintaining symbiont diversity in nature.


In silico Evolution of Lysis-Lysogeny Strategies Reproduces Observed Lysogeny Propensities in Temperate Bacteriophages

Bacteriophages are the most abundant organisms on the planet and both lytic and temperate phages play key roles as shapers of ecosystems and drivers of bacterial evolution. Temperate phages can choose between (i) lysis: exploiting their bacterial hosts by producing multiple phage particles and releasing them by lysing the host cell, and (ii) lysogeny: establishing a potentially mutually beneficial relationship with the host by integrating their chromosome into the host cell's genome. Temperate phages exhibit lysogeny propensities in the curiously narrow range of 5-15%. For some temperate phages, the propensity is further regulated by the multiplicity of infection, such that single infections go predominantly lytic while multiple infections go predominantly lysogenic. We ask whether these observations can be explained by selection pressures in environments where multiple phage variants compete for the same host. Our models of pairwise competition, between phage variants that differ only in their propensity to lysogenize, predict the optimal lysogeny propensity to fall within the experimentally observed range. This prediction is robust to large variation in parameters such as the phage infection rate, burst size, decision rate, as well as bacterial growth rate, and initial phage to bacteria ratio. When we compete phage variants whose lysogeny strategies are allowed to depend upon multiplicity of infection, we find that the optimal strategy is one which switches from full lysis for single infections to full lysogeny for multiple infections. Previous attempts to explain lysogeny propensity have argued for bet-hedging that optimizes the response to fluctuating environmental conditions. Our results suggest that there is an additional selection pressure for lysogeny propensity within phage populations infecting a bacterial host, independent of environmental conditions.

Keywords: competition in silico evolution lysis-lysogeny mathematical modeling phage-bacteria.

Figures

Dynamics of the bacterial and phage populations as a function of time, starting…

The payoff matrix for Player 1, ( L 1L 2 )/(…

Variation of the optimal (minimax) lysogeny propensity, f_opt, as parameters are…

The evolution of the winning strategy f(m), where m is…
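
The captions above refer to a payoff matrix and to an optimal (minimax) lysogeny propensity. As a rough illustration of how such a choice can be read off a payoff matrix, the sketch below picks the pure strategy whose worst-case payoff is largest. The candidate propensities and the payoff values are invented toy numbers, not results from the paper, and the function name is hypothetical.

```python
def maximin_strategy(payoff):
    """Return (row index, guaranteed payoff) of the pure maximin strategy.

    payoff[i][j] is the payoff to Player 1 when Player 1 plays strategy i
    and Player 2 plays strategy j.
    """
    # Worst case for each of Player 1's strategies over all opponent replies.
    worst_case = [min(row) for row in payoff]
    # Choose the strategy whose worst case is best.
    best = max(range(len(payoff)), key=lambda i: worst_case[i])
    return best, worst_case[best]


# Toy example (assumed values): three candidate lysogeny propensities and an
# antisymmetric payoff matrix in which the middle propensity never loses.
propensities = [0.0, 0.1, 0.5]
toy_payoff = [
    [0.0, -1.0, -0.5],
    [1.0, 0.0, 0.5],
    [0.5, -0.5, 0.0],
]
index, value = maximin_strategy(toy_payoff)
print(propensities[index], value)
```

With these toy numbers the middle propensity is selected because no competitor can push its payoff below zero. A real analysis would fill the matrix from simulated competitions between phage variants.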


22.1 Prokaryotic Diversity

By the end of this section, you will be able to do the following:

  • Describe the evolutionary history of prokaryotes
  • Discuss the distinguishing features of extremophiles
  • Explain why it is difficult to culture prokaryotes

Prokaryotes are ubiquitous. They cover every imaginable surface where there is sufficient moisture, and they also live on and inside virtually all other living things. In the typical human body, prokaryotic cells outnumber human body cells by about ten to one. They comprise the majority of living things in all ecosystems. Some prokaryotes thrive in environments that are inhospitable for most living things. Prokaryotes recycle nutrients, essential substances such as carbon and nitrogen, and they drive the evolution of new ecosystems, some of which are natural and others man-made. Prokaryotes have been on Earth since long before multicellular life appeared. Indeed, eukaryotic cells are thought to be the descendants of ancient prokaryotic communities.

Prokaryotes, the First Inhabitants of Earth

When and where did cellular life begin? What were the conditions on Earth when life began? We now know that prokaryotes were likely the first forms of cellular life on Earth, and they existed for billions of years before plants and animals appeared. The Earth and its moon are dated at about 4.54 billion years in age. This estimate is based on evidence from radiometric dating of meteorite material together with other substrate material from Earth and the moon. Early Earth had a very different atmosphere than it does today (one that contained less molecular oxygen) and was subjected to strong solar radiation; thus, the first organisms probably would have flourished where they were more protected, such as in the deep ocean or far beneath the surface of the Earth. Strong volcanic activity was common on Earth at this time, so it is likely that these first organisms, the first prokaryotes, were adapted to very high temperatures. Because early Earth was prone to geological upheaval and volcanic eruption, and was subject to bombardment by mutagenic radiation from the sun, the first organisms were prokaryotes that must have withstood these harsh conditions.

Microbial Mats

Microbial mats or large biofilms may represent the earliest forms of prokaryotic life on Earth; there is fossil evidence of their presence starting about 3.5 billion years ago. It is remarkable that cellular life appeared on Earth only a billion years after the Earth itself formed, suggesting that pre-cellular “life” that could replicate itself had evolved much earlier. A microbial mat is a multi-layered sheet of prokaryotes (Figure 22.2) that includes mostly bacteria, but also archaeans. Microbial mats are only a few centimeters thick, and they typically grow where different types of materials interface, mostly on moist surfaces. The various types of prokaryotes that comprise them carry out different metabolic pathways, and that is the reason for their various colors. Prokaryotes in a microbial mat are held together by a glue-like sticky substance that they secrete called extracellular matrix.

The first microbial mats likely obtained their energy from chemicals found near hydrothermal vents. A hydrothermal vent is a breakage or fissure in the Earth’s surface that releases geothermally heated water. With the evolution of photosynthesis about three billion years ago, some prokaryotes in microbial mats came to use a more widely available energy source—sunlight—whereas others were still dependent on chemicals from hydrothermal vents for energy and food.

Stromatolites

Fossilized microbial mats represent the earliest record of life on Earth. A stromatolite is a sedimentary structure formed when minerals are precipitated out of water by prokaryotes in a microbial mat (Figure 22.3). Stromatolites form layered rocks made of carbonate or silicate. Although most stromatolites are artifacts from the past, there are places on Earth where stromatolites are still forming. For example, growing stromatolites have been found in the Anza-Borrego Desert State Park in San Diego County, California.

The Ancient Atmosphere

Evidence indicates that during the first two billion years of Earth’s existence, the atmosphere was anoxic, meaning that there was no molecular oxygen. Therefore, only those organisms that can grow without oxygen (anaerobic organisms) were able to live. Autotrophic organisms that convert solar energy into chemical energy are called phototrophs, and they appeared within one billion years of the formation of Earth. Then, cyanobacteria, also known as “blue-green algae,” evolved from these simple phototrophs at least one billion years later. It was the ancestral cyanobacteria (Figure 22.4) that began the “oxygenation” of the atmosphere: increased atmospheric oxygen allowed the evolution of more efficient O2-utilizing catabolic pathways. It also opened up the land to increased colonization, because some O2 is converted into O3 (ozone) and ozone effectively absorbs the ultraviolet light that could have otherwise caused lethal mutations in DNA. The current evidence suggests that the increase in O2 concentrations allowed the evolution of other life forms.

Microbes Are Adaptable: Life in Moderate and Extreme Environments

Some organisms have developed strategies that allow them to survive harsh conditions. Almost all prokaryotes have a cell wall, a protective structure that allows them to survive in both hypertonic and hypotonic aqueous conditions. Some soil bacteria are able to form endospores that resist heat and drought, thereby allowing the organism to survive until favorable conditions recur. These adaptations, along with others, allow bacteria to remain the most abundant life form in all terrestrial and aquatic ecosystems.

Prokaryotes thrive in a vast array of environments: Some grow in conditions that would seem very normal to us, whereas others are able to thrive and grow under conditions that would kill a plant or an animal. Bacteria and archaea that are adapted to grow under extreme conditions are called extremophiles, meaning “lovers of extremes.” Extremophiles have been found in all kinds of environments: the depths of the oceans, hot springs, the Arctic and the Antarctic, in very dry places, deep inside Earth, in harsh chemical environments, and in high radiation environments (Figure 22.5), just to mention a few. Because they have specialized adaptations that allow them to live in extreme conditions, many extremophiles cannot survive in moderate environments. There are many different groups of extremophiles: They are identified based on the conditions in which they grow best, and several habitats are extreme in multiple ways. For example, a soda lake is both salty and alkaline, so organisms that live in a soda lake must be both alkaliphiles and halophiles (Table 22.1). Other extremophiles, like radioresistant organisms, do not prefer an extreme environment (in this case, one with high levels of radiation), but have adapted to survive in it (Figure 22.5). Organisms like these give us a better understanding of prokaryotic diversity and open up the possibility of finding new prokaryotic species that may lead to the discovery of new therapeutic drugs or have industrial applications.

Table 22.1

Extremophile         Conditions for Optimal Growth
Acidophiles          pH 3 or below
Alkaliphiles         pH 9 or above
Thermophiles         Temperature 60 to 80 °C (140 to 176 °F)
Hyperthermophiles    Temperature 80 to 122 °C (176 to 250 °F)
Psychrophiles        Temperature of -15 to 10 °C (5 to 50 °F) or lower
Halophiles           Salt concentration of at least 0.2 M
Osmophiles           High sugar concentration
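
The thresholds in the table can be read as simple rules. The sketch below is one possible encoding, assuming the inputs describe the conditions under which an organism grows best. The function name and parameters are hypothetical, the shared 80 °C boundary is assigned to hyperthermophiles by assumption, and osmophiles are left out because the table gives no numeric threshold for sugar concentration.

```python
def extremophile_classes(ph=None, temp_c=None, salt_molar=None):
    """Return the extremophile categories matched by optimal growth conditions."""
    classes = []
    if ph is not None:
        if ph <= 3:
            classes.append("acidophile")
        elif ph >= 9:
            classes.append("alkaliphile")
    if temp_c is not None:
        if 80 <= temp_c <= 122:
            # Assumption: the shared 80 °C boundary counts as hyperthermophile.
            classes.append("hyperthermophile")
        elif 60 <= temp_c < 80:
            classes.append("thermophile")
        elif temp_c <= 10:
            classes.append("psychrophile")
    if salt_molar is not None and salt_molar >= 0.2:
        classes.append("halophile")
    return classes


# A soda lake is both alkaline and salty (illustrative values).
print(extremophile_classes(ph=10, salt_molar=1.0))
```

An organism can match more than one category, which mirrors the soda lake example in the text: the call above returns both "alkaliphile" and "halophile".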

Prokaryotes in the Dead Sea

One example of a very harsh environment is the Dead Sea, a hypersaline basin that is located between Jordan and Israel. Hypersaline environments are essentially concentrated seawater. In the Dead Sea, the sodium concentration is 10 times higher than that of seawater, and the water contains high levels of magnesium (about 40 times higher than in seawater) that would be toxic to most living things. Iron, calcium, and magnesium, elements that form divalent ions (Fe²⁺, Ca²⁺, and Mg²⁺), produce what is commonly referred to as “hard” water. Taken together, the high concentration of divalent cations, the acidic pH (6.0), and the intense solar radiation flux make the Dead Sea a unique, and uniquely hostile, ecosystem (Figure 22.6).

What sort of prokaryotes do we find in the Dead Sea? The extremely salt-tolerant bacterial mats include Halobacterium, Haloferax volcanii (which is found in other locations, not only the Dead Sea), Halorubrum sodomense, and Halobaculum gomorrense, and the archaean Haloarcula marismortui, among others.

Unculturable Prokaryotes and the Viable-but-Non-Culturable State

The process of culturing bacteria is complex and is one of the greatest discoveries of modern science. German physician Robert Koch is credited with discovering the techniques for pure culture, including staining and using growth media. Microbiologists typically grow prokaryotes in the laboratory using an appropriate culture medium containing all the nutrients needed by the target organism. The medium can be liquid, broth, or solid. After an incubation time at the right temperature, there should be evidence of microbial growth (Figure 22.7). Koch’s assistant Julius Petri invented the Petri dish, whose use persists in today’s laboratories. Koch worked primarily with the Mycobacterium tuberculosis bacterium that causes tuberculosis and developed guidelines, called Koch’s postulates, to identify the organisms responsible for specific diseases. Koch’s postulates continue to be widely used in the medical community. Koch’s postulates include that an organism can be identified as the cause of disease when it is present in all infected samples and absent in all healthy samples, and it is able to reproduce the infection after being cultured multiple times. Today, cultures remain a primary diagnostic tool in medicine and other areas of molecular biology.

Koch’s postulates can be fully applied only to organisms that can be isolated and cultured. Some prokaryotes, however, cannot grow in a laboratory setting. In fact, over 99 percent of bacteria and archaea are unculturable. For the most part, this is due to a lack of knowledge as to what to feed these organisms and how to grow them: they may have special requirements for growth that remain unknown to scientists, such as needing specific micronutrients, pH, temperature, pressure, co-factors, or co-metabolites. Some bacteria cannot be cultured because they are obligate intracellular parasites and cannot be grown outside a host cell.

In other cases, culturable organisms become unculturable under stressful conditions, even though the same organism could be cultured previously. Those organisms that cannot be cultured but are not dead are in a viable-but-non-culturable (VBNC) state. The VBNC state occurs when prokaryotes respond to environmental stressors by entering a dormant state that allows their survival. The criteria for entering into the VBNC state are not completely understood. In a process called resuscitation, the prokaryote can go back to “normal” life when environmental conditions improve.

Is the VBNC state an unusual way of living for prokaryotes? In fact, most of the prokaryotes living in the soil or in oceanic waters are non-culturable. It has been said that only a small fraction, perhaps one percent, of prokaryotes can be cultured under laboratory conditions. If these organisms are non-culturable, then how is it known whether they are present and alive? Microbiologists use molecular techniques, such as the polymerase chain reaction (PCR), to amplify selected portions of DNA of prokaryotes, e.g., 16S rRNA genes, demonstrating their existence. (Recall that PCR can make billions of copies of a DNA segment in a process called amplification.)
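
The "billions of copies" figure follows from repeated doubling. As a small worked example, assuming an idealized reaction in which every template is copied in every cycle, the number of copies after n cycles is the starting count times 2 to the power n. The function names and the `efficiency` parameter below are illustrative assumptions. Real reactions fall short of perfect doubling and eventually plateau.

```python
def pcr_copies(start_copies, cycles, efficiency=1.0):
    """Copies after `cycles` rounds; efficiency=1.0 means perfect doubling."""
    return start_copies * (1.0 + efficiency) ** cycles


def cycles_needed(target_copies, start_copies=1):
    """Smallest number of perfect-doubling cycles that reaches target_copies."""
    cycles = 0
    copies = start_copies
    while copies < target_copies:
        copies *= 2
        cycles += 1
    return cycles


print(int(pcr_copies(1, 30)))      # 2**30 = 1073741824 copies from one template
print(cycles_needed(1_000_000_000))
```

Under the perfect-doubling assumption, a single template passes one billion copies at the 30th cycle.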

The Ecology of Biofilms

Some prokaryotes may be unculturable because they require the presence of other prokaryotic species. Until a couple of decades ago, microbiologists used to think of prokaryotes as isolated entities living apart. This model, however, does not reflect the true ecology of prokaryotes, most of which prefer to live in communities where they can interact. As we have seen, a biofilm is a microbial community (Figure 22.8) held together in a gummy-textured matrix that consists primarily of polysaccharides secreted by the organisms, together with some proteins and nucleic acids. Biofilms typically grow attached to surfaces. Some of the best-studied biofilms are composed of prokaryotes, although fungal biofilms have also been described, as well as some composed of a mixture of fungi and bacteria.

Biofilms are present almost everywhere: they can cause the clogging of pipes and readily colonize surfaces in industrial settings. In recent, large-scale outbreaks of bacterial contamination of food, biofilms have played a major role. They also colonize household surfaces, such as kitchen counters, cutting boards, sinks, and toilets, as well as places on the human body, such as the surfaces of our teeth.

Interactions among the organisms that populate a biofilm, together with their protective exopolysaccharidic (EPS) environment, make these communities more robust than free-living, or planktonic, prokaryotes. The sticky substance that holds bacteria together also excludes most antibiotics and disinfectants, making biofilm bacteria hardier than their planktonic counterparts. Overall, biofilms are very difficult to destroy because they are resistant to many common forms of sterilization.

Visual Connection

Compared to free-floating bacteria, bacteria in biofilms often show increased resistance to antibiotics and detergents. Why do you think this might be the case?


Discussion

Genetic algorithms have been employed in multiple studies to search for phylogenetic trees (Lewis 1998; Katoh, Kuma, and Miyata 2001; Brauer et al. 2002; Lemmon and Milinkovitch 2002; Kim, Lee, and Moon 2003; Shen and Heckendorn 2004). Our study is the first to our knowledge that uses a genetic algorithm to search for models of substitution in a phylogenetic context. Classical GAs employ large population sizes (hundreds or thousands of individuals), and implicitly rely on the ability to compute the fitness of a single individual (in this case, the AICc of a given model) very quickly. Unfortunately, most molecular evolution problems, such as those considered here, are computationally challenging, and the number of fitness evaluations must be limited. Fortunately, there exist aggressive versions of GA, such as the CHC algorithm used in this study, which work well with smaller population sizes. Alternatively, a reversible jump Markov chain Monte Carlo procedure could be used to simultaneously estimate the number of classes of ω and the assignments to branches; however, this approach is likely to be far more computationally intensive than the GA approach pursued here. The GA method can be run on a small cluster of computers in a matter of hours and is therefore feasible for immediate use by researchers.
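
To give a feel for why CHC copes with small populations, here is a minimal sketch of the algorithm as it is commonly described: elitist selection over parents and children, half-uniform crossover (HUX), incest prevention, and a restart when the population converges. It is not the implementation used in the study. The bit-string encoding, the parameter defaults, and the toy fitness function (count of 1 bits, maximized) are all assumptions, whereas the study scores models by AICc, where lower is better.

```python
import random


def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))


def hux(a, b, rng):
    """Half-uniform crossover: swap half of the bits that differ."""
    differing = [i for i in range(len(a)) if a[i] != b[i]]
    rng.shuffle(differing)
    child1, child2 = a[:], b[:]
    for i in differing[: len(differing) // 2]:
        child1[i], child2[i] = child2[i], child1[i]
    return child1, child2


def chc(fitness, n_bits, pop_size=10, generations=100, divergence=0.35, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    threshold = n_bits // 4
    history = []
    for _ in range(generations):
        parents = pop[:]
        rng.shuffle(parents)
        children = []
        for a, b in zip(parents[::2], parents[1::2]):
            # Incest prevention: only sufficiently different parents mate.
            if hamming(a, b) // 2 > threshold:
                children.extend(hux(a, b, rng))
        # Elitist selection: keep the best pop_size of parents plus children.
        new_pop = sorted(pop + children, key=fitness, reverse=True)[:pop_size]
        accepted = any(c is ind for c in children for ind in new_pop)
        if not accepted:
            threshold -= 1
        pop = new_pop
        if threshold < 0:
            # Restart: keep the best individual, rebuild the rest by flipping
            # a fraction `divergence` of its bits.
            best = pop[0]
            pop = [best[:]] + [
                [bit ^ (rng.random() < divergence) for bit in best]
                for _ in range(pop_size - 1)
            ]
            threshold = int(divergence * (1 - divergence) * n_bits)
        history.append(max(fitness(ind) for ind in pop))
    return max(pop, key=fitness), history


best, history = chc(sum, n_bits=20, generations=50, seed=1)
print(sum(best), history[-1])
```

Because the best individual always survives, the best fitness never decreases from one generation to the next. That property is what lets the method work with few, expensive fitness evaluations.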

Akaike's Information Criterion (AIC) has been criticized as a measure of goodness-of-fit for being too liberal, in that it may select models that are overly complex; instead, we have used a small-sample AIC, AICc. In principle, information criteria other than AICc could be used, for example Schwarz's Bayesian Information Criterion (BIC) (Schwarz 1978), which is consistent (the probability that it will recover a true low-dimensional model approaches 1 as the sample size tends to infinity). However, the BIC assumes that the true model is contained within the candidate set of models, whereas the AIC does not assume that any of the candidate models is necessarily true; rather, it quantifies the discrepancy between the probability density generated by the model and the data as measured by Kullback-Leibler information (Kullback and Leibler 1951).
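
For reference, the three criteria mentioned here have standard closed forms in terms of the maximized log-likelihood, the number of estimated parameters k, and the sample size n. The sketch below writes them out. The function and argument names are illustrative, and what counts as the sample size in a phylogenetic analysis is a modeling choice that this sketch does not settle.

```python
import math


def aic(log_likelihood, k):
    """Akaike's Information Criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_likelihood


def aicc(log_likelihood, k, n):
    """Small-sample AIC: AIC plus a correction that vanishes as n grows."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    return aic(log_likelihood, k) + (2 * k * (k + 1)) / (n - k - 1)


def bic(log_likelihood, k, n):
    """Schwarz's Bayesian Information Criterion: k ln n - 2 ln L."""
    return k * math.log(n) - 2 * log_likelihood


# Illustrative numbers only: ln L = -100 with 3 parameters and 20 observations.
print(aic(-100.0, 3), aicc(-100.0, 3, 20), bic(-100.0, 3, 20))
```

Lower values indicate a better trade-off between fit and complexity for all three. The AICc penalty exceeds the AIC penalty whenever n is small relative to k, which is why it is less prone to favoring overly complex models.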

Given the large number of possible models and limited data, it is likely that a substantial number (tens or hundreds) of models will be consistent with the data. With the use of Akaike weights, which can be directly interpreted as conditional probabilities of the models, the robustness of conclusions to the choice of model can be assessed by averaging over models. Furthermore, an a priori biological model can be placed in the context of other credible models. If, for example, such a model falls outside the inferred 95% confidence set, it may not be well supported, as there are many more models which fit the data significantly better, even when the a priori model may be significantly better than the single-ratio model, as assessed by a nested comparison using a likelihood ratio test. Additionally, we can test whether an a priori biological model is significantly worse than the best fitting model found by the GA using a Shimodaira-Hasegawa test. It is important to consider whether the a priori model is simply a submodel of the best fitting model, and whether the biological conclusions are altered under the best fitting model. Based on the small lysozyme data set, we find that the hypothesis that the lineages leading to Colobines and Hominoids have elevated nonsynonymous to synonymous substitution rates is consistent with our GA-based analysis. However, our GA results based on the large data set suggest that positive selection is ongoing following the radiation of these clades, in contrast to the hypothesis of Messier and Stewart (1997).
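
Akaike weights and the resulting confidence set can be computed directly from a list of AICc scores. The sketch below uses the standard definition, in which each weight is proportional to exp(-Δ/2) and Δ is the score difference from the best model. The function names and the three example scores are illustrative and are not values from the study.

```python
import math


def akaike_weights(scores):
    """Convert information-criterion scores (lower is better) to weights."""
    best = min(scores)
    relative = [math.exp(-(s - best) / 2.0) for s in scores]
    total = sum(relative)
    return [r / total for r in relative]


def confidence_set(scores, level=0.95):
    """Indices of the smallest set of models whose weights sum to `level`."""
    weights = akaike_weights(scores)
    order = sorted(range(len(scores)), key=lambda i: weights[i], reverse=True)
    chosen, cumulative = [], 0.0
    for i in order:
        chosen.append(i)
        cumulative += weights[i]
        if cumulative >= level:
            break
    return chosen


def model_average(values, weights):
    """Weighted average of a per-model quantity, e.g. a branch's dN/dS."""
    return sum(v * w for v, w in zip(values, weights))


toy_scores = [100.0, 102.0, 110.0]  # made-up AICc values for three models
print(akaike_weights(toy_scores))
print(confidence_set(toy_scores))
```

With these toy scores the first two models carry more than 99% of the weight, so the third falls outside the 95% set. An a priori model that landed in the third position would be poorly supported relative to the alternatives.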

When an a priori model compares favorably with the set of “good” models, we may be more confident in biological conclusions derived from studying such an a priori model. If, however, there are many alternative models which fit the data much better than our preconceived hypothesis, further exploration of the data, and perhaps a reassessment of our assumptions, is in order. In many practical cases the model itself is a nuisance component of the analysis, for example when one is interested in evolution along a specific lineage, and the rest of the phylogeny forms the “background.” By weighing over credible models, quantities of biological interest (such as the probability that along a particular branch dN/dS > 1) may be derived without relying on a specific a priori model and concomitant assumptions, such as uniformity of selective pressure along background branches.


3 Themes from the Open Discussions

Substantial time in the workshop was left for open discussion, and a couple of important conclusions emerged. One was pluralism about OEE. It seems clear that there is more than one interesting and important kind of OEE. This means that those discussing OEE should whenever possible be explicit and precise about the kind of OEE of interest. Every kind of OEE should be identified and defined as precisely as possible, taking care not to lose those kinds that have intuitive appeal but cannot be precisely defined. Each successful definition should be operational and quantitative. But no definition is the one and only right definition of OEE if there is more than one kind of OEE. Some people might be especially interested in one kind of OEE, and others in another kind. Later in this section we attempt to identify the broad categories of kinds of OEE discussed at the workshop.

“Open-ended evolution” refers to a distinctive kind of behavior exhibited by some evolving systems, and different kinds of OEE correspond to somewhat different kinds of behavior. The workshop discussion highlighted the importance of distinguishing observable behavioral hallmarks of systems undergoing OEE from hypothesized underlying mechanisms that explain why a system exhibits those hallmarks. These “mechanisms” might be merely causally necessary conditions for OEE, or necessary boundary conditions. A sufficiently large population, or a sufficiently long duration, or a sufficiently large evolutionary search space have all been proposed as necessary conditions for the appearance of OEE. Perhaps no single mechanism is causally sufficient to produce OEE, but presumably each kind of OEE is produced by something like a set of individually necessary and jointly sufficient mechanisms. Identifying these key mechanisms for (each kind of) OEE is the question driving much of the research on OEE.

Both hallmarks of OEE and mechanisms for OEE are important, but they are important for different reasons. The hallmarks identify the important distinctive observable signs of (different kinds of) OEE. A given kind of OEE might have more than one behavioral hallmark, and different kinds will have somewhat different hallmarks, so the list of hallmarks of OEE can be expected to be somewhat heterogeneous.

Ongoing generation of new adaptations is a very simple kind of OEE, and detecting it was the motivation for the original evolutionary activity statistics of Bedau and Packard [5, 4]. New adaptations could arise through a combination of different evolutionary and ecological mechanisms, such as competitive exclusion, random drift among neutral variants, and kin selection. Adaptation comes in different forms; for example, sometimes it is possible and important to distinguish new instances of a familiar kind of adaptation from qualitatively new kinds of adaptations, and the ongoing generation of qualitatively new kinds of adaptations is more interesting and more challenging to understand. The ongoing generation of new adaptations might seem to involve populations of agents with an unlimited number of different basic adaptive traits. However, practical considerations often impose a finite ceiling on the number of different basic adaptations distinguished in computer models or natural systems. Nevertheless, if evolution can produce finite combinations (sets) of adaptive traits, then the number of potential new adaptive combinations increases dramatically.
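
The size of that increase is easy to quantify. Assuming each combination is simply a non-empty set of basic traits, n traits yield 2^n - 1 possible combinations. The helper below, whose name and `max_size` parameter are illustrative, counts them and optionally caps the number of traits per combination.

```python
import math


def trait_combinations(n_traits, max_size=None):
    """Count non-empty sets of traits, optionally limited to max_size traits."""
    if max_size is None:
        max_size = n_traits
    return sum(math.comb(n_traits, k) for k in range(1, max_size + 1))


print(trait_combinations(10))      # 1023
print(trait_combinations(20))      # 1048575
print(trait_combinations(10, 2))   # 55
```

Doubling the ceiling from 10 to 20 basic traits multiplies the number of possible combinations roughly a thousandfold, so even a modest finite set of traits leaves a very large space to explore.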

Ongoing generation of new kinds of entities is one way to bring about the ongoing generation of new kinds of adaptations. The emergence of dynamical hierarchies described by Rasmussen et al. [38] is one mechanism for generating new kinds of entities with new kinds of properties. However, since the underlying mechanism in dynamical hierarchies can be merely chemical and physical self-assembly and self-organization, the properties of the entities at different levels in the hierarchy might not be adaptations. But if a dynamical hierarchy incorporates new material and information from the environment, and if the whole hierarchy can reproduce similar daughter hierarchies, then adaptive evolution could arise and start to shape a population of new kinds of entities.

A major transition in evolution involves the emergence of a dynamical hierarchy and does involve adaptive evolution, and ongoing major transitions in evolution constitute another kind of OEE. The major transitions in evolution discussed by Maynard Smith and Szathmáry [31] (and recently revisited by Szathmáry [47]) are an especially interesting form of dynamical hierarchies, and they are special because each new level in the hierarchy consists of a new population of reproducing and evolving entities. A major transition in evolution is preceded by the evolution of one or several distinct kinds of reproducing entity. Eventually certain groups of those entities come to interact very tightly, and they become members of a new population of higher-level reproducing wholes. Entities in the old lower-level population become parts of the new wholes, but they cannot reproduce independently. Now the process repeats once more. Certain groups in the population of new wholes come to interact very tightly, and they become new even-higher-level wholes that reproduce and form an even-higher-level population, and so on. Maynard Smith and Szathmáry [31] conclude that the major transitions in evolution they survey are quite contingent: they could easily not have happened, and there may be no more major transitions. So, the existence of some major transitions in evolution is not necessarily any kind of OEE. But major transitions can spur many further adaptations and help make evolution open-ended. And ongoing major transitions would be an especially impressive kind of OEE.

Since major transitions in evolution typically create new kinds of entities with new kinds of adaptations, the transitions are one way in which the ability to evolve can itself evolve. But there are many other ways in which the ability to evolve can itself evolve. Thus, the ongoing evolution of evolvability is another kind of OEE. One especially critical step in the evolution of evolvability is the very first step: the emergence of the ability to evolve at all (the subject of Packard's talk at the workshop).

One focus is the complexity of the entities in an evolving population, and one kind of OEE is the ongoing growth in complexity of entities in the evolving population. The property of interest here is the complexity of the most complex entities, rather than entities with mean or modal complexity [18]. Further kinds of OEE involve ongoing growth of other global properties of the evolving population, such as diversity or disparity [17]. Note that growth of entity complexity is a side effect of major transitions in evolution, when the old evolving entities become parts of the new ones. But other mechanisms could also produce entities that are more and more complex.

Another way in which an evolving system can become more complex is for the interactions among the entities in it to become more complex. Ongoing growth of complexity of interactions among entities is another kind of OEE. Even if the internal properties of the entities in a system remain the same, the interactions among entities can become more and more complex, as when food webs among species become more complex.

The emphasis on “ongoing” novelty itself deserves a brief mention. “Ongoing” is better than another common expression used in this context, “perpetual” novelty, because OEE is not actually perpetual although it is ongoing. The discussion in York focused partly on David Ackley's idea of indefinite scalability, after this concept was emphasized in his talk. Ackley [2] defines indefinite scalability “as supporting open-ended computational growth without requiring substantial re-engineering.” The key criterion for indefinite scalability is that, should an upper bound be reached (e.g., in the number of novel entities encountered over the course of evolution or in the diversity or complexity of entities), increasing the values of physical limits (e.g., available matter, population size, or memory) should enable an unbounded sequence of greater upper bounds to be achieved (after sufficient increases in the limits). However, it is not possible in finite system time to establish that a metric is truly unbounded. And it is not possible, over a finite number of increases in system parameter(s), to establish that a metric is truly indefinitely scalable. Further, an increase in parameter(s) may require a longer system time before a greater scale (higher value metric) is achieved. Claims about systems can, though, be expressed and evaluated in terms such as a metric increasing without bound up to a certain system time (or number of generations, etc.) or a metric increasing as system parameter(s) are increased up to certain value(s), where it was necessary to increase these to establish increases in the metric's maximum observed value over successive runs. Furthermore, we can define boundedness of a metric within a system in a rigorous way by fitting mathematical functions to the data and using statistics to ascertain which function is the best fit (e.g., see [57]). If the best-fit function is unbounded, that is a good indication that the system is exhibiting unbounded behavior.
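
A minimal sketch of this curve-fitting idea follows. It assumes only two candidate functions, an unbounded straight line and a bounded saturating exponential, and compares their squared errors. This is not the procedure of [57]. The two model forms, the grid search over the time constant, and all names are illustrative assumptions. Both candidates have two free parameters, so comparing raw squared error is reasonable here. Candidates with different parameter counts would call for a penalized criterion.

```python
import math


def fit_linear(ts, ys):
    """Least-squares fit of the unbounded model y = a + b*t. Returns the SSE."""
    n = len(ts)
    mean_t = sum(ts) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in ts)
    b = sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, ys)) / sxx
    a = mean_y - b * mean_t
    return sum((y - (a + b * t)) ** 2 for t, y in zip(ts, ys))


def fit_saturating(ts, ys, taus):
    """Fit the bounded model y = c*(1 - exp(-t/tau)). Returns the best SSE.

    tau is chosen from the grid `taus`; for each tau the ceiling c has a
    closed-form least-squares solution.
    """
    best_sse = None
    for tau in taus:
        g = [1.0 - math.exp(-t / tau) for t in ts]
        c = sum(gi * y for gi, y in zip(g, ys)) / sum(gi * gi for gi in g)
        sse = sum((y - c * gi) ** 2 for gi, y in zip(g, ys))
        if best_sse is None or sse < best_sse:
            best_sse = sse
    return best_sse


def classify_growth(ts, ys, taus):
    """Label a metric's time series by whichever candidate fits better."""
    if fit_linear(ts, ys) < fit_saturating(ts, ys, taus):
        return "unbounded"
    return "bounded"


ts = list(range(1, 51))
taus = [1, 2, 5, 10, 20, 50, 100, 200, 500]
print(classify_growth(ts, [2.0 * t for t in ts], taus))
print(classify_growth(ts, [10 * (1 - math.exp(-t / 5)) for t in ts], taus))
```

As the text cautions, a better fit by an unbounded function over a finite run is only an indication. A saturating curve with a long enough time constant looks nearly linear within any finite observation window.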

Clarifying the hallmarks of OEE is a crucial step in clearly identifying and distinguishing the different kinds of OEE. After the hallmarks are clear, another crucial step is identifying and testing possible mechanisms that would produce and explain each of the kinds of OEE. Different mechanisms might be proposed to produce or explain a given kind of OEE; the mechanisms could provide competing explanations, or they could be cooperating mechanisms. Also, a single mechanism might be involved in the explanation of more than one kind of OEE. So, the list of hypothetical mechanisms for a given kind of OEE could be rather heterogeneous. In addition, OEE pluralism means that different kinds of OEE could have different underlying mechanisms. Some mechanisms might be necessary for one kind but not another kind of OEE; other mechanisms might be necessary for every kind.

The discussion in York was weighted towards hallmarks of OEE, but some mechanisms for OEE were also mentioned and discussed. For example, one might think that the evolution of the genetic code is the mechanism behind OEE. Certain mechanisms are very obvious, but often insufficient by themselves. For example, since OEE involves adaptive evolution, natural selection helps explain it, and we already know a lot about evolution by natural selection. The participants in the discussion were divided about whether we already know enough to explain each kind of OEE, with some conjecturing that a fundamentally new mechanism is required for some kinds of OEE, such as major transitions in evolution.

Note that major transitions, the evolution of the genetic code, and the evolution of evolvability in general, are both kinds of OEE and mechanisms for kinds of OEE. This shows how one and the same thing can appear on the lists of both hallmarks of (one kind of) OEE and mechanisms for (another kind of) OEE.

An important research goal is to document examples of each hallmark and requirement of OEE, both in computer models and in natural systems. Positive examples that demonstrate a kind of OEE in a model or natural system are especially critical, but also important are negative examples of model or natural systems that do not demonstrate some kind of OEE.

Open-ended evolution is an ongoing process, so a single instance of the behavioral hallmarks of OEE falls short of being genuinely open-ended. A single new adaptation is not OEE, and neither is the growth in complexity of one organism, one instance of the evolution of evolvability, or one major transition in evolution. Nevertheless, it can be a significant scientific achievement to document even single instances of some especially challenging hallmarks, such as major transitions in evolution.

Earth's biosphere has been classified, through fossil data sets at the level of taxonomic families, as exhibiting open-ended evolutionary dynamics according to Bedau and Packard's evolutionary activity measures [6, 7]. Bedau et al. reasoned that it was not necessary to include a shadow mechanism in this analysis, as “the mere fact that a family appears in the fossil record is good evidence that its persistence reflects its adaptive significance” [7].

The Long-Term Evolution Experiment (LTEE) (Lenski et al.). The LTEE [28] is the most extensive laboratory study of ongoing biological evolution. Publications from this highly tractable system exhibit dynamics in evolving populations of E. coli that appear to be open-ended. Specifically, the LTEE has shown continuous increases in fitness that are best described by an unbounded power law function [57, 29]. Individual populations have shown continuous generation of novelty, such as: new portions of the fitness landscape continually being explored [53], numerous selective sweeps [30], new diversity arising after sweeps [9], and epistatic interactions among mutations where later benefits depend upon earlier mutations [56]. Finally, multiple populations in the LTEE exhibit frequency dependence [42, 24, 41, 30], including a special case [10, 9, 54] shown to be driven by ecological specialization and crossfeeding [55]. Most prominently, the LTEE gained substantial attention when a drastically new phenotype appeared, giving rise to what amounted to a new species [10, 9].

Tierra (Ray). Tierra [39] is perhaps the most well-known example of an early ALife system in which digital organisms (self-replicating computer programs) were free to evolve in an open-ended manner without the guidance of an explicit fitness function. However, each particular run of the system would eventually reach a state of stasis where only selectively neutral variations were seen to emerge [40, 49]. Bedau and colleagues analyzed the dynamics of a Tierra-like system named Evita (but not Tierra itself), and found it to have qualitatively different evolutionary dynamics from those displayed in biological evolution as evidenced by the fossil record [6].

Avida (Ofria et al.). Avida [32] is currently the most widely used digital evolution system, and is used to study a wide range of evolutionary and ecological dynamics in populations of self-replicating computer programs. Avida has enabled the evolution of qualitatively novel behaviors such as complex features completely absent in the ancestor organism (ongoing generation of new adaptations) [27], novel collaboration strategies among organisms (ongoing growth of complexity of interactions) [16], and novel ecological interactions through coevolution promoting even greater levels of complexity [58]. Dolson and collaborators are actively testing their complexity barriers in this system, as well as analyzing evolutionary activity statistics. Initial results of the boundedness of fitness growth in simple to complex environments in Avida indicate that fitness continues to increase without an asymptote in the default environment. Many ongoing projects use Avida to evolve cooperation, ecosystems, sexual reproduction, parasitism/mutualism, pleiotropy, intelligence, evolvability, and complexity.

Geb (Channon). Geb was the first ALife system to be classified as exhibiting open-ended evolutionary dynamics according to Bedau and Packard's evolutionary activity measures [7] and is the only one to have been classified as such according to an enhanced version of these measures developed by Channon [14, 15]. Novel adaptations reported in Geb include behaviors such as following, fighting, fleeing, and mimicking, and novel artifacts such as matching I&O channels in agents' neurocontrollers. Preliminary (unpublished) results presented at the Artificial Life XI conference in 2008 further indicated that component diversity (a simple measure of system complexity) may be indefinitely scalable (although that term was not yet in use); a more complete study of this is now planned.

Pichler's computational ecosystem [35, 36, 34] is the only other ALife system to date to have been classified as exhibiting open-ended evolutionary dynamics according to Bedau and Packard's evolutionary activity measures. To the best of our knowledge, it has not been subjected to the enhanced test.

Stringmol (Hickinbotham et al.). Stringmol is an artificial chemistry system that has been shown to exhibit the ongoing appearance of new chemical species [21]. In some cases the system has been shown to evolve multi-species hypercycles that persist for prolonged periods. Thus, Stringmol demonstrates the ongoing generation of new adaptations. These adaptations affect a species' binding affinity to other species, as well as its reaction rules. Quantitative novelties are certainly arising in the system (e.g., in binding affinities), although it is yet to be established whether any qualitative novelties are arising at the level of the individual chemical species. The appearance of hypercycles also demonstrates growth of complexity of interactions, and a qualitatively new organization of the system.

Novelty search (Lehman and Stanley). Mentioned in Dolson's talk, Lehman and Stanley's novelty search technique has attracted considerable interest in recent years [25, 26]. The approach has been shown to produce the ongoing generation of new adaptations. However, this is achieved by employing a selection mechanism that specifically looks for novel phenotypes. Hence, by design the approach will produce the hallmark of ongoing generation of new adaptations if (and only if) the system has implemented the necessary mechanisms for the ongoing generation of such adaptations. Novelty search, by itself, does not take a stand on what kinds of mechanisms are required. Furthermore, it requires a measure of phenotypic distance between individuals to tell whether two individuals exhibit sufficiently different behaviors. Like a fitness function in traditional EAs, this definition of phenotypic uniqueness needs to be carefully chosen. Defining a more general measure, applicable to OEE, appears to be a major research challenge, but potentially a rewarding one. Further work is required to understand the similarities and differences between novelty search and OEE; one line of research along these lines has recently been initiated by Soros and Stanley [44].
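The dependence on a phenotypic distance measure can be made concrete with a small sketch. It scores a candidate by its mean distance to the k nearest behaviors seen so far, which is one common way of quantifying novelty. The Euclidean distance, the value of k, and the two-dimensional behavior vectors are assumptions chosen for illustration and are not specified in the text.

```python
import math

def distance(x, y):
    """Euclidean distance between two behavior vectors (an assumed measure)."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

def novelty(candidate, archive, k=3):
    """Mean distance from the candidate to its k nearest neighbours in the archive."""
    if not archive:
        return float("inf")  # nothing seen yet, so anything counts as novel
    nearest = sorted(distance(candidate, other) for other in archive)[:k]
    return sum(nearest) / len(nearest)

# Hypothetical archive of previously observed behaviors.
archive = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]

near = novelty((0.5, 0.5), archive, k=2)   # sits in the middle of known behaviors
far = novelty((5.0, 5.0), archive, k=2)    # far from everything in the archive
print(near, far)
```

Swapping the `distance` function changes which individuals count as novel, which is the sense in which the measure has to be chosen as carefully as a fitness function.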

Dynamical hierarchies (Rasmussen et al.). Rasmussen and colleagues reported results in a model of a physicochemical system that exhibited dynamical hierarchies. They demonstrated the emergence of two higher orders of entities and interactions on top of the basic first-order elements built into the system. This work was based upon a model of self-assembly rather than evolution; to be of direct relevance to OEE, it would need to be augmented with mechanisms for self-replication, variation, and selection of the emergent dynamical hierarchies. Enabling populations of newly emerging dynamical hierarchies to undergo adaptive evolution would unify the processes of self-assembly and self-organization with the process of adaptive evolution, and this could explain one kind of novelty in OEE: the evolution of new kinds of wholes with new kinds of properties. In this context one mechanism driving the ongoing generation of novelty is the availability in the environment of new materials that can aggregate and generate novel properties.

Emergence of coding (Packard and Guttenberg). In his talk, Packard described preliminary work on a model in which alternation between unstable and fixed-point dynamics produced conditions suitable for the emergence of informationally-stable components. While preliminary, the results have relevance for the pre-biotic transition to information-driven systems. Packard reported that colleagues are applying these ideas to models of other major evolutionary transitions too. This work has not yet been published.

Patented technology (Bedau et al.). Bedau suggested that the actual evolution of technology (detected in the patent record) is a real-world system that exhibits a form of OEE that he termed ongoing door-opening evolution [11], which occurs when one technological innovation enables a whole new kind of technology to arise and diversify. Bedau conjectured that door-opening innovations are an important mechanism behind the ongoing generation of new kinds of adaptations, and he proposed some first steps to observing and measuring door-opening innovations in the patent record.

Social media tags (Ikegami). In his talk, Ikegami argued that the social media tag system he described represented an example of OEE. With respect to the evolution of new combinations of tags, this would be ongoing generation of new adaptations (at a quantitative level). The role of human users as an integral part of the system, who both supply the new tags and upload new images to be tagged, is a complicating factor in this case.

Of the example systems discussed by the speakers in York, many focused on the ongoing generation of quantitatively new adaptations, where “quantitatively new” means that the adaptations are novel, but identifiable within a determined class of possibilities, and as a result of their identifiability, they may be statistically quantified. In contrast, qualitatively new adaptations lie outside any predetermined class of possibilities. It is clear that qualitatively new adaptations are a part of natural evolutionary processes, but less clear whether and how they might occur in example systems considered so far—indeed, clearer criteria are required for what counts as qualitative, rather than quantitative, novelty in these systems. Sharpening this distinction should lead toward progress in understanding how open-ended evolution manifests properties such as growth of complexity of interactions, ongoing generation of new entities, ongoing generation of new functionalities, and major evolutionary transitions.


Results

Establishing Method Performance

False Positives

The rejection rate on null data with two branch sets, that is, on all sites where the evolutionary rates among branch sets were equal, was, in aggregate, slightly lower than the nominal rate (the diagonal line in fig. 1A marks a test statistic that performs exactly as expected under the null model). We restricted the calculations to variable sites only, because Contrast-FEL returns null results on invariant sites by definition (all maximum likelihood estimates for rate parameters are 0 at such sites) and because including invariant sites would only lower observed rejection rates. Contrast-FEL may become anticonservative (rejection rates above nominal) for very high divergence rates (fig. 1A and E, see below). The rate of false positives on null sites was largely independent of the values of synonymous and nonsynonymous rates, the levels of mean sequence divergence (within reason) in branch sets, and data set size (fig. 1B–D). The test is more conservative for low rates and smaller sets of branches, as expected. Permutation P-values and false discovery rate (FDR) q-values delivered more conservative detection rates than standard LRT P-values but mirrored the trends of the latter (fig. 1C–D). This is expected because permutation P-values are only computed conditioned on the significant LRT, so they can only be more conservative, and q-values incorporate a multiple testing correction. When a site is very saturated, that is, the product of the maximal rate estimate (α or β) and the alignment-wide tree length in expected substitutions per site exceeds 100, the test becomes anticonservative; the Q–Q plot for the sites with log10 divergence rate between 2.5 and 3.5 is shown in figure 1A, as those are the saturated sites with a detection rate permutation P ≥ 0.05 as per figure 1E. Our implementation reports the total branch length for each tested site, and saturated sites can be screened out using this metric. Such sites are rare in simulated data and should be even rarer in empirical alignments (we did not detect any in the empirical data).
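The saturation criterion stated above is simple enough to express directly. The following is a minimal sketch, assuming per-site estimates are available as plain numbers; the function name and the record fields are hypothetical and do not correspond to the output format of any particular implementation.

```python
# Threshold taken from the text: a site is treated as saturated when
# max(alpha, beta) * tree length exceeds 100.
SATURATION_THRESHOLD = 100.0

def is_saturated(alpha, betas, tree_length, threshold=SATURATION_THRESHOLD):
    """True if the largest rate estimate times the tree length exceeds the threshold."""
    return max([alpha, *betas]) * tree_length > threshold

# Hypothetical per-site records (field names are assumptions).
sites = [
    {"site": 1, "alpha": 0.8, "betas": [0.2, 1.1], "tree_length": 3.0},
    {"site": 2, "alpha": 0.5, "betas": [60.0, 0.1], "tree_length": 3.0},
]

kept = [s["site"] for s in sites
        if not is_saturated(s["alpha"], s["betas"], s["tree_length"])]
print(kept)
```

Screening before interpreting P-values keeps the anticonservative sites out of the reported results without changing how the test itself is run.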

Contrast-FEL performance on null data (error control). The plots are based on 1,090,929 variable sites simulated with equal nonsynonymous rates on two branch sets (see text for simulation details). (A) Q–Q plots of LRT P-values for all sites (blue line) and for 3,684 saturated sites (log10 of the divergence level between 2.5 and 3.5, orange line). (B) Detection rate as a function of simulated synonymous and nonsynonymous substitution rates (log10 scale). (C) Detection rate as a function of the number of branches in the test set (binned in increments of 5). Blue line: proportion of sites with LRT P < 0.05, red line: proportion of sites with permutation P < 0.05, gray line: proportion of sites with q < 0.20. Blue area plot shows the proportion of sites with LRT P < 0.01 (lower) and LRT P < 0.1 (upper). Orange circles reflect the number of sites contributing to each bin. (D) Detection rate as a function of the total branch length of the test set of branches (binned in increments of 0.5); same notation as in (C) otherwise. (E) Detection rate as a function of the log10 of the divergence level at the site (binned in increments of 0.25); same notation as in (C) otherwise.

Only 1 in 100 simulated data sets showed false positive rates (FPRs) of 8% or greater (10% or greater for 1 in 1,000), implying that one rarely encounters a simulated data set where the FPR is notably above the typical level.

Precision and Recall

The ability of Contrast-FEL to identify sites that experience differential selective pressures is influenced by the effective sample size, which depends in turn on the number of branches in the group and the extent of sequence divergence, and by the effect size, that is, the magnitude of differences in nonsynonymous substitution rates, β. For simulations with two sets of branches, restricted to detectable sites (i.e., sites that were not invariable), power to detect differences aggregated over simulation scenarios is summarized in table 2. Over all detectable sites, the power using the FDR of 20% is 0.319. Restricted to the sites where the difference in β rates between groups was at least 1 (“Large effect”), the power rises to 0.603, and further restricting to only those sites where both the test and the background branch sets had at least three expected substitutions per site (“Large sample size”) increases the power to 0.860; see 3. Similar trends occur for testing using LRT P-values or permutation-based P-values. Perfectly ladder-like trees on average yield somewhat higher power than either perfectly balanced or random/biological trees.

Power of Contrast-FEL for Detecting Differences in Selection.

Simulation                    | N       | p ≤ 0.05 | q ≤ 0.2 | Permutation p ≤ 0.05
Overall                       | 139,753 | 0.418    | 0.319   | 0.361
Large effect                  | 30,923  | 0.727    | 0.603   | 0.665
Large effect/sample size      | 14,265  | 0.902    | 0.860   | 0.867
Perfectly balanced trees      | 39,671  | 0.411    | 0.316   | 0.354
Perfectly ladder-like trees   | 27,439  | 0.479    | 0.378   | 0.417
Random/biological trees       | 72,643  | 0.399    | 0.298   | 0.343
Four-class simulations
Overall (omnibus)             | 18,141  | 0.355    | 0.276   | 0.379
Overall (any test)            | 18,141  | 0.415    | N/A     | N/A
Large effect (omnibus)        | 8,684   | 0.516    | 0.411   | 0.542
Large effect (any test)       | 8,684   | 0.587    | N/A     | N/A

Note: N, total number of differentially selected sites in the set. Large effect is defined as having the absolute difference in simulated β rates of at least 1. Large sample size is defined as having at least three substitutions occurring along both test and reference branch sets.

The power of Contrast-FEL adheres to the expected patterns: it increases with the sample size and the effect size. For example, greater levels of divergence at a site (up to a point) corresponded to notable gains in the power of the test (fig. 2A), as did greater numbers of substitutions occurring in the test set of branches, with power rising from ∼26.4% (p ≤ 0.05) for two substitutions to 50.7% for eight substitutions (fig. 2B). The best power is achieved when the difference between substitution rates on the two sets of branches is large (fig. 2C), exceeding 80% for sufficiently disparate rates and dropping to <10% for rates that are very similar. Power is high when neither of the two sets is too small (fig. 2D).

Contrast-FEL performance on data with rate differences (power). The plots are based on 139,753 variable sites simulated with unequal nonsynonymous rates on two branch sets (see text for simulation details). (A) Detection rate as a function of the log10 of the divergence level at the site. Blue line: proportion of sites with LRT P < 0.05, red line: proportion of sites with permutation P < 0.05, gray line: proportion of sites with q < 0.20. Blue area plot shows the proportion of sites with LRT P < 0.01 (lower) and LRT P < 0.1 (upper). Orange circles reflect the number of sites contributing to each bin. (B) Detection rate as a function of the number of inferred substitutions in the test set; same notation as in (A) otherwise. (C) Detection rate as a function of the simulated nonsynonymous rates in test and background branch sets and (D) the numbers of branches in the test and reference set.

Next, we focus on the data simulated with the relatively small (31 sequences) biological tree of vertebrate rhodopsins from Yokoyama et al. (2008) and three different test branch sets: small clade, large clade, and branches grouped by phenotype (absorption wavelength), shown in figure 3. For sufficiently stringent FDR (q-value) cutoffs, high (90%) precision (positive predictive value [PPV]) can be achieved for all three cases, although the cutoffs need to be more stringent for the small clade scenario. High precision is achieved at the cost of fairly low recall (20–25%), and the small clade scenario has the worst performance among the three scenarios considered.

Contrast-FEL performance on vertebrate rhodopsin simulations. Precision–recall curves for the three sets of simulations, all based on the vertebrate rhodopsin tree from Yokoyama et al. (2008), with different choices for the “test” branch set (precision = true positives/all test positives, recall = true positives/positive training cases). Dotted lines show corresponding base rates for “no-skill” classifiers in each case (i.e., classify all sites as differentially selected). Circles on the individual curves show (left-to-right) precision–recall values for q = 0.1, q = 0.2, and q = 0.5. There were a total of 37,565 variable sites for the “small” case, 15,010 sites for the “large” case, and 37,401 sites for the “blue” case.
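The precision and recall definitions used for these curves can be evaluated at any q-value cutoff with a few lines of code. This is a minimal sketch on made-up data; the q-values and truth labels below are invented for illustration and are not simulation results from the study.

```python
def precision_recall(q_values, truly_different, cutoff):
    """Precision and recall when sites with q <= cutoff are called differential."""
    called = [q <= cutoff for q in q_values]
    true_pos = sum(1 for c, t in zip(called, truly_different) if c and t)
    n_called = sum(called)
    n_true = sum(truly_different)
    precision = true_pos / n_called if n_called else None  # undefined with no calls
    recall = true_pos / n_true if n_true else None
    return precision, recall

# Invented example: eight sites, four of them truly differentially selected.
q_values = [0.01, 0.05, 0.15, 0.30, 0.45, 0.60, 0.80, 0.90]
truly_different = [True, True, False, True, False, True, False, False]

results = {cutoff: precision_recall(q_values, truly_different, cutoff)
           for cutoff in (0.1, 0.2, 0.5)}
for cutoff, (precision, recall) in results.items():
    print(cutoff, precision, recall)
```

Loosening the cutoff can only add calls, so recall never decreases, while precision typically falls as more false positives are admitted. That is the trade-off traced by each curve.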

Four Branch Classes

Contrast-FEL remained conservative on null data when we applied it to alignments simulated with four branch classes (fig. 4A), for all types of tests: FWER (family-wise error rate) corrected pairwise differences, the omnibus test (any rates are different), and when considering simulations where only some (but not all) of the groups had equal rates. As was the case with the simpler two-class simulations, Type I error for severely saturated sites was somewhat elevated. Power to detect differences among any pair of branch groups, either via the pairwise or the omnibus test, was strongly influenced by the effect size, ranging from near 0 for rates that were close in magnitude to over 80% for sites where the largest substitution rate was sufficiently high (>1) and sufficiently larger (e.g., 5×) than the smallest rate (fig. 4B). Power of the method is strongly influenced by the effect size, that is, the magnitude of differences between β rates (fig. 4C), and by the information content or saturation of the site, measured as a function of expected substitutions per site (fig. 4D). Introducing multiple branch classes increases the number of tests performed at each site and, because of the site-level multiple test correction, dilutes the power compared with the two-class case (table 2). Calling a site differentially evolving if any of the tests returns a significant corrected P-value realizes a 5–6% power boost compared with relying only on the omnibus test.

Contrast-FEL performance on data with rate differences using four branch sets. (A) Q–Q plots of either omnibus test P-values (blue) or FWER (orange), which are based on rejections of any of the true nulls among 151,838 sites simulated where all branches had the same nonsynonymous rate. Green line shows the Q–Q plot of the FWERs on 1,944 saturated sites (log10 of the divergence level above 2.5). Red line shows the FWER for the 3,702 data sets where some, but not all, of the nulls were true (i.e., some branch sets shared rates, whereas others did not). (B) Detection rate as a function of the log10 of the lowest and highest nonsynonymous rates (rates lower than 0.01 are shown as 0.01) computed on 18,141 sites where at least two rates were different. (C) Detection rate as a function of the effect size, measured as the maximum difference between nonsynonymous rates among branch classes. Blue line: proportion of sites with LRT P < 0.05 (the omnibus test), red line: proportion of sites with permutation P < 0.05, gray line: proportion of sites with q < 0.20. Blue area plot shows the proportion of sites with LRT P < 0.01 (lower) and LRT P < 0.1 (upper). Orange circles reflect the number of sites contributing to each bin. (D) Detection rate as a function of the log10 of the divergence level at the site.

If a differentially evolving site was identified as such by the omnibus test at p ≤ 0.05 (FWER corrected), in 99.6% of cases it was also identified by one or more of the individual pairwise tests, implying that in most cases (at least for our simulation), it is possible to pinpoint specific pairs of branch sets that are responsible. For the remaining 0.4% of sites, the omnibus test was significant, but none of the individual tests was. Conversely, among the sites that were identified by at least one of the pairwise tests, 85.2% were also identified by the omnibus test; for the remainder of the sites, the omnibus test was not significant. Among those sites, 89.6% had a single pairwise significant test (median omnibus corrected P = 0.103, interquartile range [0.072–0.159]), 9.5% had two pairwise significant tests (0.066 [0.056–0.081]), and 0.9% had three pairwise significant tests (0.058 [0.053–0.071]).

To boost power in low information settings (small branch sets or low divergence), it may be advisable to run only the omnibus test, that is, forego pairwise tests and the attendant FWER correction.
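The trade-off between the omnibus test and the FWER-corrected family of tests can be illustrated with a small sketch. The text does not say which correction is applied, so the Holm–Bonferroni procedure used here is an assumption, as is the choice to adjust the omnibus and the six pairwise P-values jointly; the P-values themselves are invented.

```python
def holm_adjust(p_values):
    """Holm-Bonferroni adjusted P-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # Step-down multiplier m, m-1, ..., 1, kept monotone via the running max.
        running_max = max(running_max, (m - rank) * p_values[i])
        adjusted[i] = min(1.0, running_max)
    return adjusted

# Invented raw P-values at one site with four branch classes A-D:
# one omnibus test plus six pairwise comparisons.
raw = {"omnibus": 0.02, "A-B": 0.004, "A-C": 0.2, "A-D": 0.5,
       "B-C": 0.6, "B-D": 0.9, "C-D": 0.03}

names = list(raw)
corrected = dict(zip(names, holm_adjust([raw[n] for n in names])))

any_test_call = any(p <= 0.05 for p in corrected.values())  # "any test" rule
omnibus_corrected_call = corrected["omnibus"] <= 0.05        # omnibus within the family
omnibus_only_call = raw["omnibus"] <= 0.05                   # omnibus run on its own
print(any_test_call, omnibus_corrected_call, omnibus_only_call)
```

In this invented case the omnibus test is significant when run alone but not after sharing the correction with six pairwise tests, which is the situation the recommendation above is aimed at.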

Comparison with Post Hoc Tests

A reasonable heuristic approach for identifying sites that evolve differently between branch sets B1 and B2 is to run an existing test that can determine whether the site evolved nonneutrally along either set, and to call the site differentially selected if there is evidence of positive selection on one group but not the other. Approaches like this have been commonly used in the literature, for example, Kapralov et al. (2012). One can also call a site differentially evolving if nonneutrality tests of B1 and B2 return discordant results: for example, B1 is negatively selected but B2 is neutral, or B2 is positively selected and B1 is negatively selected. Contrast-FEL is, of course, a direct test of rate differences, so it could additionally identify, for instance, sites where B1 and B2 are both negatively (or both positively) selected, but not at the same level. To illustrate the benefit of Contrast-FEL compared with the post hoc approach (which also requires at least two separate computational analyses, one for each branch set, and so may be less computationally efficient), we performed post hoc analyses based on independent FEL tests (one for each branch set) on a subset of 185,070 sites from 425 alignments.

Using the LRT P-value cutoff of 5%, over all variable sites, Contrast-FEL achieves an FPR of 3.4%, power of 37.2%, PPV of 62.0%, and negative predictive value (NPV) of 91.1%. The “discordant results” post hoc FEL approach, by comparison, has an FPR of 53.0%, power of 63.6%, PPV of 15.2%, and NPV of 89.6%. The dramatic increase in the rate of false positives for the post hoc method is mostly (93.7%) due to cases where a site that was simulated under the null is misclassified because one of the branch sets is determined to be nonneutral by FEL and the other is determined to be neutral. All of the sites that were identified by Contrast-FEL but not by the post hoc heuristic were those where FEL (correctly) determined that both branch sets were conserved (β_i/α < 1), but the degrees of conservation, measured by the β_i/α ratio, were different. Empirical data sets analyzed in the following section provide concrete examples of such sites, which are labeled CC, meaning that they are conserved on the treatment set and conserved on the naive set (for more information, see table 3 or 4).
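
For readers less familiar with the four summary statistics quoted above, the sketch below shows how they follow from the counts of true and false calls in a simulation. The function name and the example counts are hypothetical and are not taken from the simulations reported here.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute FPR, power, PPV, and NPV from confusion-matrix counts.

    tp: sites simulated as different and called different
    fp: sites simulated under the null but called different
    tn: sites simulated under the null and not called
    fn: sites simulated as different but not called
    """
    def ratio(numerator, denominator):
        # Return NaN rather than raising when a category is empty.
        return numerator / denominator if denominator else float("nan")

    return {
        "FPR": ratio(fp, fp + tn),    # false positive rate
        "power": ratio(tp, tp + fn),  # sensitivity
        "PPV": ratio(tp, tp + fp),    # positive predictive value
        "NPV": ratio(tn, tn + fn),    # negative predictive value
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
example = classification_metrics(tp=30, fp=10, tn=90, fn=20)
```

Because PPV depends on the number of false positives relative to true positives, a method with higher power can still have a much lower PPV when its FPR is high, which is the pattern reported for the “discordant results” heuristic.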

Table 3. Sites Evolving Differentially between the Treated and the Naive Sets in the HIV-1 RT Data Set, at p ≤ 0.05.

| Codon | α | β (substitutions), Treated | β (substitutions), Naive | P-Value | q-Value | Standard FEL P-Value, Treated | Standard FEL P-Value, Naive | FEL Pattern |
|---|---|---|---|---|---|---|---|---|
| 44 | 1.31 | 0.00 (9) | 1.13 (8) | 0.0286 | 0.799 | 0.003(−) | 0.885(−) | CN |
| 65 | 1.16 | 2.12 (11) | 0.00 (3) | 0.0156* | 0.655 | 0.226(+) | 0.075(−) | NN |
| 67 | 1.24 | 1.39 (20) | 0.00 (3) | 0.0207* | 0.694 | 0.792(+) | 0.024(−) | NC |
| 70 | 1.31 | 1.56 (17) | 0.00 (5) | 0.0374* | 0.963 | 0.737(+) | 0.051(−) | NN |
| 75 | 0.86 | 1.80 (15) | 0.00 (4) | 0.0161* | 0.600 | 0.130(+) | 0.087(−) | NN |
| 100ᵃ | 1.56 | 3.26 (29) | 0.00 (13) | 0.0150 | 0.836 | 0.094(+) | 0.075(−) | NN |
| 103ᵃ | 1.47 | 36.51 (104) | 0.00 (7) | 0.0000* | 0.000 | 0.000(+) | 0.073(−) | PN |
| 151ᵃ | 0.93 | 2.67 (10) | 0.00 (8) | 0.0150* | 0.719 | 0.023(+) | 0.124(−) | PN |
| 181ᵃ | 3.32 | 4.41 (21) | 0.00 (7) | 0.0010* | 0.083 | 0.442(+) | 0.004(−) | NC |
| 184ᵃ | 0.00 | 8.29 (58) | 0.34 (1) | 0.0000* | 0.000 | 0.023(+) | 0.110(−) | PN |
| 188ᵃ | 0.18 | 2.99 (14) | 0.00 (0) | 0.0061 | 0.411 | 0.000(+) | 0.491(−) | PN |
| 190ᵃ | 1.52 | 3.41 (33) | 0.00 (10) | 0.0004* | 0.041 | 0.031(+) | 0.011(−) | PC |
| 215ᵃ | 0.44 | 1.50 (12) | 0.00 (13) | 0.0255* | 0.775 | 0.021(+) | 0.199(−) | PN |
| 228ᵃ | 1.53 | 1.30 (21) | 0.00 (9) | 0.0436 | 0.974 | 0.753(−) | 0.019(−) | NC |
| 302 | 0.62 | 0.00 (1) | 8.05 (3) | 0.0420* | 1.000 | 0.458(−) | 0.054(+) | NN |

Note: α, maximum likelihood estimate (MLE) of the site-specific synonymous rate; β, nonsynonymous rate. Substitutions are counted along branches in the corresponding set using joint maximum likelihood inference of ancestral states under the site-level alternative model. Codons in italics are known to be involved in drug resistance (Rhee et al. 2003). FEL P-values are computed by separately testing for nonneutral evolution on the corresponding set of branches, with the + or − sign indicating the nature of selection (positive or negative). FEL pattern encodes the inferred pattern of evolution for treated/naive branches: P, positively selected (at p ≤ 0.05); C, conserved; N, neutral. For example, PC means “positively selected” on the treated set and “conserved” on the naive set.

An asterisk (*) next to a P-value indicates that the permutation P-value is ≤ 0.05.

ᵃ Codon identified as directionally evolving in table 2 of Murrell et al. (2012).

The heuristic in which a site is called differentially evolving when only one of the sets is under positive selection (the method commonly used in the literature) has an FPR of 3.3%, a much reduced power of 17.1%, NPV of 94.6%, and PPV of 25.9%.

Exploring the Effect of Model Misspecification

To see whether the performance of Contrast-FEL suffers when data are generated under models that make different parametric assumptions, we fitted Contrast-FEL to data generated under a branch-site model (Wisotsky et al. 2020), where ω rates vary from site to site and branch to branch, but as a random effect; that is, sets of branches fixed a priori are not expected to show detectably different patterns of site-level ω. This model allows independent unrestricted rate variation between branches and sites and is computationally faster and less parameter-rich than covarion-type models that allow complex correlation structures between rates (see discussion in Murrell et al. [2015]). Using the simulation scenario (CV3o6, see http://data.hyphy.org/web/busteds/) and the branch partition shown in supplementary figure S1, Supplementary Material online, we obtained a slightly elevated false positive rate of 0.07 with the LRT at nominal P = 0.05, suggesting that the LRT is misattributing some “random” rate variation to fixed branch partitions. However, the permutation P-value test, which we designed as a nonparametric guard to correct for some model misspecification, maintains a false positive rate of 0.045 at nominal P = 0.05 (see supplementary fig. S2, Supplementary Material online). Similarly, no elevated false positive rates are seen with multiple-test corrected q-values. Because of the limited scope of these simulations, we cannot draw general conclusions about robustness to model misspecification.

Empirical Data

DR in HIV-1 Reverse Transcriptase

We applied Contrast-FEL to an alignment of 476 HIV-1 reverse transcriptase (RT) sequences with 335 codons isolated from 288 HIV-1 infected individuals, previously analyzed in Murrell et al. (2012). There were two sequences sampled from each individual: one prior to treatment with RT inhibitors and one following treatment. We partitioned the branches in the tree into three groups: pretreatment terminal branches (naive), posttreatment terminal branches (treated), and the rest of the branches (nuisance group; supplementary fig. S1, Supplementary Material online, HIV-1 RT). Because we expect that the primary difference between the selective regimes on naive versus treated branches is due to the action of the antiretroviral drugs, most of the sites that have detectable differences in selective pressures should be implicated in conferring drug resistance (DR). Using a nominal P-value cutoff of 0.05, Contrast-FEL identifies 15 sites that evolve differentially, of which 12 are known DR sites, achieving a PPV of 0.8. Of the three non-DR sites that are found, codons 44 and 302 are actually more conserved (lower β) in the treated group, which is a different mode of selective pressure than the positive selection exerted by antiretroviral drugs. They are also not supported by the permutation test, which could indicate that these sites are picked up due to sampling variation. Among the 12 DR sites identified by LRT P-values, ten are also supported by the permutation test, which is an indicator of robustness to branch sampling artifacts. The most conservative approach, based on FDR corrected q-values of 0.20 or lower, identifies four codons: 103, 181, 184, and 190, all of which are on the list of canonical escape mutations with very strong phenotypic effects (Rhee et al. 2003). All of these sites have many inferred mutations in the treated group and large differences between inferred β rates, which places them in the large effect/large sample size category.

As a comparison with one common practice used to screen for differential selection today, we also used fixed effects likelihood (FEL; Kosakovsky Pond and Frost 2005) to test each branch set for evidence of deviations from neutrality (either positive or negative selection). For site 190, subject to differential selection with a large effect size, FEL reveals that the treated branches experience positive selection (FEL p ≤ 0.05) and the naive branches experience negative selection. However, no other sites show such a clean pattern. For sites 103, 151, 184, and 215, treated branches are subject to positive selection (FEL p ≤ 0.05), whereas naive branches evolve neutrally. For sites 67, 181, and 228, naive branches are subject to negative selection (FEL p ≤ 0.05), whereas treated branches evolve neutrally. For the remainder of the sites, neither branch class evolves in a way that is detectably different from neutrality according to FEL. This comparison highlights that comparing the results of two independent tests applied to subsets of the data to detect evolutionary differences is statistically suboptimal.
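
The relationship between the nominal P-value screen and the more conservative q-value screen can be reproduced directly from the reported per-site values. The sketch below is only an illustration: the helper names are hypothetical, and the tuples hold the codon, LRT P-value, and q-value for the 15 sites listed for the HIV-1 RT data set.

```python
# (codon, LRT P-value, q-value) for the 15 sites reported for HIV-1 RT.
RT_SITES = [
    (44, 0.0286, 0.799), (65, 0.0156, 0.655), (67, 0.0207, 0.694),
    (70, 0.0374, 0.963), (75, 0.0161, 0.600), (100, 0.0150, 0.836),
    (103, 0.0000, 0.000), (151, 0.0150, 0.719), (181, 0.0010, 0.083),
    (184, 0.0000, 0.000), (188, 0.0061, 0.411), (190, 0.0004, 0.041),
    (215, 0.0255, 0.775), (228, 0.0436, 0.974), (302, 0.0420, 1.000),
]


def codons_passing(sites, p_cutoff=0.05, q_cutoff=None):
    """Return codons whose P-value (and, optionally, q-value) pass the cutoffs."""
    selected = []
    for codon, p_value, q_value in sites:
        if p_value > p_cutoff:
            continue
        if q_cutoff is not None and q_value > q_cutoff:
            continue
        selected.append(codon)
    return selected


def ppv(true_hits, total_hits):
    """Positive predictive value: fraction of reported sites that are true hits."""
    return true_hits / total_hits


nominal = codons_passing(RT_SITES)                 # all 15 sites
fdr_20 = codons_passing(RT_SITES, q_cutoff=0.20)   # [103, 181, 184, 190]
contrast_fel_ppv = ppv(12, 15)                     # 0.8, as stated in the text
```

Filtering on q ≤ 0.20 keeps only the four sites with the largest effects, which illustrates how much the FDR correction narrows the list relative to the nominal cutoff.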

The performance of Contrast-FEL (a generalist method) in identifying sites of phenotypic relevance compares favorably with the performance of the purpose-built Model of Episodic Directional Selection (MEDS) method (Murrell et al. 2012), designed to find directional evolution along selected branches. MEDS identified 17 sites, of which 10 were known DR sites (PPV of 58.9%; see table 2 in Murrell et al. [2012]), and both methods agreed on nine sites. Of course, unlike MEDS, Contrast-FEL is not able to identify specific residues that may be the targets of selection at specific sites.

Selection on HIV-1 Envelope Conditioned on the Route of Transmission

We reanalyzed an alignment of 131 partial HIV-1 envelope (no variable loops) sequences with 806 codons from Tully et al. (2016); these sequences were isolated from acute/early infections and represent “founder” viruses. The sequences were labeled according to whether the individual was infected via heterosexual (HSX) exposure or men who have sex with men (MSM) exposure. Interior branches were labeled as HSX or MSM if all of their descendants had the same label, and were left unlabeled (nuisance set) otherwise. Tully et al. (2016) found gene-wide differences in selection among branch groups (a larger proportion of sites, but subject to weaker positive selection, on HSX branches compared with MSM branches), using the RELAX test (Wertheim et al. 2015), but lacked the framework to pinpoint specific residues that evolved differentially. Contrast-FEL identified 32 differentially selected sites (nominal P-value ≤ 0.05), of which three passed the FDR correction (table 4). One of these sites, 626, is conserved in both branch sets, but at a stronger level (lower β) in the MSM set, whereas another (786) is positively selected in both, but at a stronger level on HSX branches.
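
The rule used to label interior branches can be sketched as a post-order traversal of the tree. This is a minimal illustration under stated assumptions: the tree is given as a hypothetical dictionary mapping each node to its children, each branch is identified by the node at its lower end, and the function name `label_branches` is not part of any published tool.

```python
def label_branches(children, tip_labels, root, nuisance="nuisance"):
    """Label each node's branch by the labels of the tips descending from it.

    children:   dict mapping an interior node to a list of its child nodes
    tip_labels: dict mapping each tip to its exposure label (e.g. "HSX", "MSM")
    """
    labels = {}

    def visit(node):
        kids = children.get(node, [])
        if not kids:
            # A tip keeps its own label.
            labels[node] = tip_labels[node]
            return {tip_labels[node]}
        # Collect the set of labels found among all descendant tips.
        seen = set()
        for child in kids:
            seen |= visit(child)
        # An interior branch is labeled only if its descendants all agree.
        labels[node] = next(iter(seen)) if len(seen) == 1 else nuisance
        return seen

    visit(root)
    return labels


# Toy tree: ((t1, t2)A, t3)root with two HSX tips and one MSM tip.
toy_children = {"root": ["A", "t3"], "A": ["t1", "t2"]}
toy_tips = {"t1": "HSX", "t2": "HSX", "t3": "MSM"}
toy_labels = label_branches(toy_children, toy_tips, "root")
```

In the toy tree, the branch leading to node A inherits the HSX label because both of its descendants are HSX, while the node that spans both exposure groups falls into the nuisance set.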

Table 4. Sites Evolving Differentially between the HSX and the MSM Sets in the HIV-1 Env Data Set from Tully et al. (2016).

| Codon | α | β (substitutions), HSX | β (substitutions), MSM | P-Value | q-Value | Standard FEL P-Value, HSX | Standard FEL P-Value, MSM | FEL Pattern |
|---|---|---|---|---|---|---|---|---|
| 49 | 0.56 | 0.12 (2) | 0.69 (12) | 0.0365* | 1.000 | 0.151(−) | 0.742(+) | NN |
| 50 | 0.30 | 0.61 (6) | 0.00 (2) | 0.0009* | 0.177 | 0.331(+) | 0.015(−) | NC |
| 53 | 0.88 | 0.00 (2) | 0.32 (9) | 0.0391 | 1.000 | 0.002(−) | 0.088(−) | CN |
| 142 | 4.13 | 5.09 (34) | 2.86 (37) | 0.0497 | 1.000 | 0.373(+) | 0.338(−) | NN |
| 158 | 0.62 | 2.24 (9) | 0.84 (10) | 0.0168 | 0.905 | 0.010(+) | 0.582(+) | PN |
| 197 | 1.88 | 2.93 (20) | 1.17 (23) | 0.0053* | 0.616 | 0.361(+) | 0.287(−) | NN |
| 233 | 0.20 | 0.25 (3) | 0.00 (1) | 0.0337* | 1.000 | 0.836(+) | 0.046(−) | NC |
| 264 | 0.33 | 1.91 (11) | 0.65 (9) | 0.0107 | 0.859 | 0.005(+) | 0.427(+) | PN |
| 275 | 2.45 | 4.16 (28) | 1.71 (28) | 0.0031 | 0.498 | 0.128(+) | 0.344(−) | NN |
| 303 | 1.63 | 6.98 (34) | 3.55 (39) | 0.0093* | 0.837 | 0.000(+) | 0.082(+) | PN |
| 336 | 3.80 | 1.09 (11) | 2.47 (35) | 0.0254 | 1.000 | 0.009(−) | 0.297(−) | CN |
| 344 | 0.40 | 1.25 (8) | 0.31 (7) | 0.0074* | 0.742 | 0.042(+) | 0.682(−) | PN |
| 408 | 0.52 | 0.70 (4) | 1.80 (22) | 0.0436 | 1.000 | 0.663(+) | 0.007(+) | NP |
| 442 | 0.00 | 0.31 (1) | 0.00 (0) | 0.0362* | 1.000 | 0.031(+) | 1.000(−) | PN |
| 530 | 0.69 | 0.29 (2) | 0.00 (3) | 0.0359* | 1.000 | 0.305(−) | 0.002(−) | NC |
| 572 | 1.55 | 4.01 (29) | 2.27 (33) | 0.0469 | 1.000 | 0.046(+) | 0.456(+) | PN |
| 574 | 1.10 | 0.52 (6) | 0.10 (5) | 0.0389 | 1.000 | 0.230(−) | 0.002(−) | NC |
| 598 | 0.59 | 1.13 (9) | 0.28 (7) | 0.0410 | 1.000 | 0.235(+) | 0.255(−) | NN |
| 626 | 12.77 | 4.14 (37) | 1.40 (41) | 0.0003* | 0.120 | 0.001(−) | 0.000(−) | CC |
| 672 | 2.46 | 3.15 (25) | 1.46 (37) | 0.0139* | 0.799 | 0.433(+) | 0.079(−) | NN |
| 683 | 1.88 | 2.05 (15) | 0.85 (17) | 0.0404* | 1.000 | 0.786(+) | 0.039(−) | NC |
| 685 | 0.00 | 1.16 (9) | 0.12 (2) | 0.0008* | 0.213 | 0.001(+) | 0.259(+) | PN |
| 690 | 1.24 | 0.28 (5) | 1.33 (18) | 0.0176* | 0.837 | 0.050(−) | 0.983(+) | CN |
| 702 | 0.00 | 1.13 (8) | 0.28 (4) | 0.0174* | 0.877 | 0.002(+) | 0.101(+) | PN |
| 703 | 1.31 | 1.27 (7) | 0.13 (11) | 0.0119* | 0.739 | 0.958(−) | 0.003(−) | NC |
| 720 | 0.49 | 1.47 (11) | 0.44 (13) | 0.0107 | 0.785 | 0.026(+) | 0.860(−) | PN |
| 722 | 0.00 | 0.13 (1) | 0.78 (9) | 0.0325 | 1.000 | 0.288(+) | 0.007(+) | NP |
| 734 | 0.50 | 2.23 (14) | 0.75 (10) | 0.0111 | 0.749 | 0.005(+) | 0.536(+) | PN |
| 773 | 0.25 | 0.00 (0) | 0.31 (7) | 0.0406 | 1.000 | 0.093(−) | 0.778(+) | NN |
| 781 | 0.51 | 0.50 (5) | 0.00 (2) | 0.0031* | 0.416 | 0.975(−) | 0.005(−) | NC |
| 786 | 1.40 | 6.67 (24) | 3.33 (38) | 0.0274 | 1.000 | 0.000(+) | 0.009(+) | PP |
| 804 | 0.25 | 0.00 (0) | 1.53 (15) | 0.0000* | 0.031 | 0.119(−) | 0.002(+) | NP |

Note: Other notation is the same as in table 3.

Cell Shape in Epidermal Leaf Trichomes

Mazie and Baum (2016) investigated which codons in a developmentally important gene (BLT) in Brassicaceae (58 sequences, 318 codons) may be associated with the evolution of a different trichome cell shape in the genus Physaria. Using gene-level mean differences in ω between subsets of branches, they found that the average strength of selection is different in Physaria compared with the rest of the taxa. They then used a revised restricted branch-site method (Zhang et al. 2005) to detect ten codons that were subject to positive selection in the Physaria clade; four further codons were “distinctive” (the majority amino acid was different in Physaria), but not positively selected. Contrast-FEL identified 29 differentially selected codons at p ≤ 0.05 (18 at q ≤ 0.20), including all ten positively selected codons from Mazie and Baum (2016) and one out of four “distinctive” codons (table 5). Given the generally conservative nature of Contrast-FEL, it is reasonable to assume that it is more powerful (rather than prone to making more Type I errors) than the original test, which was limited to a 50% subset of branches and used much more stringent parametric assumptions on rate distributions, including shared negative selection regimes on background branches and a single ω to account for the positive selection rate class.

Table 5. Sites Evolving Differentially between Physaria and Other Taxa in the BLT Gene from Mazie and Baum (2016).

| Codon | α | β (substitutions), Physaria | β, Other | P-Value | q-Value | Standard FEL P-Value, Physaria | Standard FEL P-Value, Other | FEL Pattern |
|---|---|---|---|---|---|---|---|---|
| 4 | 0.00 | 80.00 (2) | 0.76 | 0.0320 | 0.392 | 0.018(+) | 0.220(+) | PN |
| 43 | 0.69 | 4.27 (1) | 0.00 | 0.0110* | 0.194 | 0.251(+) | 0.043(−) | NC |
| 60 | 0.00 | 57.88 (1) | 0.82 | 0.0265 | 0.421 | 0.020(+) | 0.140(+) | PN |
| 107 | 0.93 | 1.70 (3) | 0.16 | 0.0123* | 0.206 | 0.511(+) | 0.108(−) | NN |
| 122 | 0.21 | 0.00 (0) | 1.40 | 0.0308* | 0.426 | 0.476(−) | 0.094(+) | NN |
| 126 | 1.62 | 0.89 (2) | 0.08 | 0.0467 | 0.550 | 0.563(−) | 0.017(−) | NC |
| 153ᵃ | 0.24 | 4.17 (6) | 0.23 | 0.0001* | 0.006 | 0.001(+) | 0.960(−) | PN |
| 156 | 0.66 | 2.91 (7) | 0.67 | 0.0283* | 0.409 | 0.051(+) | 0.986(+) | NN |
| 163 | 0.35 | 1.31 (2) | 0.12 | 0.0473 | 0.538 | 0.276(+) | 0.476(−) | NN |
| 164 | 0.32 | 2.04 (3) | 0.25 | 0.0267 | 0.404 | 0.088(+) | 0.847(−) | NN |
| 167ᵃ | 0.37 | 6.54 (10) | 0.70 | 0.0000* | 0.003 | 0.001(+) | 0.549(+) | PN |
| 169 | 5.04 | 0.74 (4) | 0.07 | 0.0476 | 0.522 | 0.016(−) | 0.000(−) | CC |
| 171ᵃ | 2.03 | 4.00 (4) | 0.00 | 0.0000* | 0.003 | 0.320(+) | 0.000(−) | NC |
| 173ᵃ | 0.47 | 4.12 (6) | 0.36 | 0.0011* | 0.044 | 0.026(+) | 0.820(−) | PN |
| 174ᵃ | 0.00 | 11.60 (8) | 0.31 | 0.0000* | 0.000 | 0.000(+) | 0.346(+) | PN |
| 175ᵃ | 0.85 | 3.67 (7) | 0.00 | 0.0000* | 0.002 | 0.046(+) | 0.006(−) | PC |
| 176 | 0.56 | 1.22 (1) | 0.00 | 0.0057* | 0.130 | 0.450(+) | 0.025(−) | NC |
| 178 | 1.29 | 1.93 (3) | 0.12 | 0.0085 | 0.160 | 0.636(+) | 0.026(−) | NC |
| 179 | 0.77 | 1.85 (2) | 0.00 | 0.0008* | 0.038 | 0.306(+) | 0.008(−) | NC |
| 180ᵃ | 0.17 | 2.22 (3) | 0.00 | 0.0001* | 0.006 | 0.008(+) | 0.161(−) | PN |
| 187 | 0.46 | 0.91 (2) | 0.00 | 0.0066* | 0.141 | 0.503(+) | 0.023(−) | NC |
| 188 | 1.31 | 1.03 (3) | 0.00 | 0.0075* | 0.149 | 0.790(−) | 0.001(−) | NC |
| 190ᵃ | 0.00 | 1.47 (4) | 0.07 | 0.0015* | 0.053 | 0.020(+) | 0.562(+) | PN |
| 191ᵃ | 0.45 | 2.49 (4) | 0.13 | 0.0025* | 0.080 | 0.046(+) | 0.302(−) | PN |
| 198 | 0.66 | 2.17 (2) | 0.00 | 0.0026* | 0.076 | 0.210(+) | 0.023(−) | NC |
| 255ᵃ | 2.06 | 1.47 (3) | 0.07 | 0.0047* | 0.124 | 0.636(−) | 0.000(−) | NC |
| 262 | 1.43 | 1.27 (3) | 0.17 | 0.0320* | 0.407 | 0.872(−) | 0.009(−) | NC |
| 270 | 2.32 | 0.87 (3) | 0.00 | 0.0047* | 0.115 | 0.248(−) | 0.000(−) | NC |
| 278ᵇ | 1.64 | 0.86 (1) | 0.00 | 0.0320* | 0.424 | 0.567(−) | 0.001(−) | NC |

Note: Other notation is the same as in table 3.

ᵃ Codon identified as positively selected in Physaria (table 3 of Mazie and Baum [2016]).

ᵇ Codon identified as “distinctive” in Physaria (table 4 of Mazie and Baum [2016]).

Sites Evolving Differentially between Physaria and Other Taxa in the BLT Gene from Mazie and Baum (2016).

Codon . α . β (substitutions)Physaria . Other . P-Value . q-Value . Standard FEL P-ValuePhysaria . Other . FEL Pattern .
4 0.00 80.00 (2) 0.76 0.0320 0.392 0.018(+) 0.220(+) PN
43 0.69 4.27 (1) 0.00 0.0110* 0.194 0.251(+) 0.043(−) NC
60 0.00 57.88 (1) 0.82 0.0265 0.421 0.020(+) 0.140(+) PN
107 0.93 1.70 (3) 0.16 0.0123* 0.206 0.511(+) 0.108(−) NN
122 0.21 0.00 (0) 1.40 0.0308* 0.426 0.476(−) 0.094(+) NN
126 1.62 0.89 (2) 0.08 0.0467 0.550 0.563(−) 0.017(−) NC
153 a 0.24 4.17 (6) 0.23 0.0001* 0.006 0.001(+) 0.960(−) PN
156 0.66 2.91 (7) 0.67 0.0283* 0.409 0.051(+) 0.986(+) NN
163 0.35 1.31 (2) 0.12 0.0473 0.538 0.276(+) 0.476(−) NN
164 0.32 2.04 (3) 0.25 0.0267 0.404 0.088(+) 0.847(−) NN
167 a 0.37 6.54 (10) 0.70 0.0000* 0.003 0.001(+) 0.549(+) PN
169 5.04 0.74 (4) 0.07 0.0476 0.522 0.016(−) 0.000(−) CC
171 a 2.03 4.00 (4) 0.00 0.0000* 0.003 0.320(+) 0.000(−) NC
173 a 0.47 4.12 (6) 0.36 0.0011* 0.044 0.026(+) 0.820(−) PN
174 a 0.00 11.60 (8) 0.31 0.0000* 0.000 0.000(+) 0.346(+) PN
175 a 0.85 3.67 (7) 0.00 0.0000* 0.002 0.046(+) 0.006(−) PC
176 0.56 1.22 (1) 0.00 0.0057* 0.130 0.450(+) 0.025(−) NC
178 1.29 1.93 (3) 0.12 0.0085 0.160 0.636(+) 0.026(−) NC
179 0.77 1.85 (2) 0.00 0.0008* 0.038 0.306(+) 0.008(−) NC
180 a 0.17 2.22 (3) 0.00 0.0001* 0.006 0.008(+) 0.161(−) PN
187 0.46 0.91 (2) 0.00 0.0066* 0.141 0.503(+) 0.023(−) NC
188 1.31 1.03 (3) 0.00 0.0075* 0.149 0.790(−) 0.001(−) NC
190 a 0.00 1.47 (4) 0.07 0.0015* 0.053 0.020(+) 0.562(+) PN
191 a 0.45 2.49 (4) 0.13 0.0025* 0.080 0.046(+) 0.302(−) PN
198 0.66 2.17 (2) 0.00 0.0026* 0.076 0.210(+) 0.023(−) NC
255 a 2.06 1.47 (3) 0.07 0.0047* 0.124 0.636(−) 0.000(−) NC
262 1.43 1.27 (3) 0.17 0.0320* 0.407 0.872(−) 0.009(−) NC
270 2.32 0.87 (3) 0.00 0.0047* 0.115 0.248(−) 0.000(−) NC
278 b 1.64 0.86 (1) 0.00 0.0320* 0.424 0.567(−) 0.001(−) NC
Codon . α . β (substitutions)Physaria . Other . P-Value . q-Value . Standard FEL P-ValuePhysaria . Other . FEL Pattern .
4 0.00 80.00 (2) 0.76 0.0320 0.392 0.018(+) 0.220(+) PN
43 0.69 4.27 (1) 0.00 0.0110* 0.194 0.251(+) 0.043(−) NC
60 0.00 57.88 (1) 0.82 0.0265 0.421 0.020(+) 0.140(+) PN
107 0.93 1.70 (3) 0.16 0.0123* 0.206 0.511(+) 0.108(−) NN
122 0.21 0.00 (0) 1.40 0.0308* 0.426 0.476(−) 0.094(+) NN
126 1.62 0.89 (2) 0.08 0.0467 0.550 0.563(−) 0.017(−) NC
153 a 0.24 4.17 (6) 0.23 0.0001* 0.006 0.001(+) 0.960(−) PN
156 0.66 2.91 (7) 0.67 0.0283* 0.409 0.051(+) 0.986(+) NN
163 0.35 1.31 (2) 0.12 0.0473 0.538 0.276(+) 0.476(−) NN
164 0.32 2.04 (3) 0.25 0.0267 0.404 0.088(+) 0.847(−) NN
167 a 0.37 6.54 (10) 0.70 0.0000* 0.003 0.001(+) 0.549(+) PN
169 5.04 0.74 (4) 0.07 0.0476 0.522 0.016(−) 0.000(−) CC
171 a 2.03 4.00 (4) 0.00 0.0000* 0.003 0.320(+) 0.000(−) NC
173 a 0.47 4.12 (6) 0.36 0.0011* 0.044 0.026(+) 0.820(−) PN
174 a 0.00 11.60 (8) 0.31 0.0000* 0.000 0.000(+) 0.346(+) PN
175 a 0.85 3.67 (7) 0.00 0.0000* 0.002 0.046(+) 0.006(−) PC
176 0.56 1.22 (1) 0.00 0.0057* 0.130 0.450(+) 0.025(−) NC
178 1.29 1.93 (3) 0.12 0.0085 0.160 0.636(+) 0.026(−) NC
179 0.77 1.85 (2) 0.00 0.0008* 0.038 0.306(+) 0.008(−) NC
180 a 0.17 2.22 (3) 0.00 0.0001* 0.006 0.008(+) 0.161(−) PN
187 0.46 0.91 (2) 0.00 0.0066* 0.141 0.503(+) 0.023(−) NC
188 1.31 1.03 (3) 0.00 0.0075* 0.149 0.790(−) 0.001(−) NC
190 a 0.00 1.47 (4) 0.07 0.0015* 0.053 0.020(+) 0.562(+) PN
191 a 0.45 2.49 (4) 0.13 0.0025* 0.080 0.046(+) 0.302(−) PN
198 0.66 2.17 (2) 0.00 0.0026* 0.076 0.210(+) 0.023(−) NC
255 a 2.06 1.47 (3) 0.07 0.0047* 0.124 0.636(−) 0.000(−) NC
262 1.43 1.27 (3) 0.17 0.0320* 0.407 0.872(−) 0.009(−) NC
270 2.32 0.87 (3) 0.00 0.0047* 0.115 0.248(−) 0.000(−) NC
278 b 1.64 0.86 (1) 0.00 0.0320* 0.424 0.567(−) 0.001(−) NC

Note: Other notation the same as in table 3.

a Codon identified as positively selected in Physaria (table 3 of Mazie and Baum [2016]).

b Codon identified as “distinctive” in Physaria (table 4 of Mazie and Baum [2016]).

Evolution of Rubisco in C3 versus C4 Photosynthetic Pathway Plants

Several studies comparing selective pressures on the rbcL gene in C3 and C4 plants have identified sites that appear to be under positive selection in either C3 or C4 plants (Kapralov and Filatov 2007; Kapralov et al. 2012), as well as others that have different targets for directional evolution depending on the pathway (Parto and Lartillot 2018). In this alignment of 179 sequences and 447 codons, Contrast-FEL identified 15 sites that evolve differentially between C3 and C4 plants (LRT p ≤ 0.05), of which six had been previously identified as being subject to differential directional selection by a mutation-selection model, and five additional sites were identified by this model (cf. table 6). An interesting example in this data set is site 309, which was previously found to be positively selected but is classified as conserved in both C3 and C4 plants by FEL; this appears to be a result of the high synonymous rate inferred at the site, a hallmark of false positives in standard selection analyses that ignore site-to-site synonymous rate variation (Kosakovsky Pond and Muse 2005; Wisotsky et al. 2020). However, Contrast-FEL infers a weaker extent of conservation in C4 plants at this site.
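As a rough illustration of the site-level threshold "LRT p ≤ 0.05" used above, the following sketch turns a pair of log-likelihoods into a likelihood ratio test p-value. It assumes a two-group comparison in which the null model (one shared β) and the alternative model (separate β for C3 and C4 branches) differ by a single parameter, so that the statistic can be referred to a chi-square distribution with one degree of freedom. The function name and the log-likelihood values are hypothetical and are not taken from the Contrast-FEL implementation.

```python
import math


def lrt_p_value(lnl_null: float, lnl_alt: float) -> float:
    """Likelihood ratio test p-value, assuming one degree of freedom."""
    # Test statistic: twice the log-likelihood gain of the alternative model.
    # Clamp at zero in case numerical optimization leaves a tiny negative gain.
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    # For a chi-square variable with 1 df, P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(stat / 2.0))


# Hypothetical site: the alternative model improves the log-likelihood by 4.5.
example_p = lrt_p_value(lnl_null=-1234.5, lnl_alt=-1230.0)
site_differs = example_p <= 0.05
```

Under these assumptions a site would be reported when `site_differs` is true; a real analysis would also need the multiple-testing step that produces the q-values in the table.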

Sites Evolving Differentially between C3 and Other Taxa in the rbcL Gene.

Codon | α | β C3 (substitutions) | β C4 (substitutions) | P-Value | q-Value | Standard FEL P-Value, C3 | Standard FEL P-Value, C4 | FEL Pattern
23 | 0.00 | 0.87 (4) | 0.00 (0) | 0.0494 | 1.000 | 0.083(+) | 1.000(−) | NN
86 a | 1.02 | 1.22 (7) | 0.00 (1) | 0.0164* | 1.000 | 0.794(+) | 0.039(−) | NC
249 | 0.98 | 0.99 (6) | 0.00 (0) | 0.0375* | 1.000 | 0.995(+) | 0.062(−) | NN
251 | 0.00 | 0.95 (5) | 0.00 (0) | 0.0169 | 1.000 | 0.342(+) | 1.000(−) | NN
262 a,b | 0.48 | 4.15 (15) | 1.15 (3) | 0.0292* | 1.000 | 0.005(+) | 0.460(+) | PN
281 a,c | 0.00 | 0.44 (2) | 3.90 (10) | 0.0008* | 0.372 | 0.216(+) | 0.001(+) | NP
295 | 2.46 | 0.00 (1) | 0.62 (4) | 0.0482* | 1.000 | 0.000(−) | 0.084(−) | CN
309 a,c | 76.21 | 0.00 (0) | 0.95 (5) | 0.0047* | 0.700 | 0.000(−) | 0.013(−) | CC
315 | 1.43 | 0.00 (2) | 0.72 (3) | 0.0479* | 1.000 | 0.008(−) | 0.452(−) | CN
332 | 0.00 | 0.00 (1) | 0.85 (2) | 0.0472* | 1.000 | 1.000(−) | 0.101(+) | NN
354 a | 0.49 | 1.32 (6) | 0.00 (1) | 0.0160* | 1.000 | 0.309(+) | 0.187(−) | NN
367 | 0.96 | 1.21 (6) | 0.00 (1) | 0.0285 | 1.000 | 0.780(+) | 0.074(−) | NN
439 a,b | 1.04 | 2.26 (10) | 0.31 (4) | 0.0143* | 1.000 | 0.136(+) | 0.216(−) | NN
443 a,b | 0.00 | 2.29 (9) | 0.00 (0) | 0.0018* | 0.412 | 0.005(+) | 1.000(−) | PN
456 | 3.15 | 0.00 (0) | 0.72 (8) | 0.0481* | 1.000 | 0.000(−) | 0.051(−) | CN

Note: Other notation the same as in table 3.

a Codon reported as differentially evolving by the mutation-selection directional DS3 model in Parto and Lartillot (2018).

b Codon identified as positively selected in C3 plants (c: C4 plants) previously (table 1 of Parto and Lartillot [2018]).

Selection on Cytochrome B of Haemosporidians Infecting Different Hosts

Pacheco et al. (2018) performed an in-depth evolutionary analysis of three mitochondrial genes from 102 haemosporidian parasite species partitioned into four groups based on their hosts. The analysis concluded that the genes were subject to mostly purifying selection, with different gene-level strengths of selection established using RELAX. For example, in the cytochrome B gene (376 codons), which we reanalyze here, selection in the clade of Plasmodium infecting avian hosts was intensified relative to the Plasmodium infecting primate/rodent hosts. Because this analysis contained more than two branch groups, Contrast-FEL conducted seven tests per site: the omnibus test and six pairwise group comparisons (table 7). Overall, 28 sites showed evidence of differential selection with at least one test (FWER corrected), and five tests passed FDR correction for the omnibus test. For clarity, we did not consider FEL analyses on individual branch sets and focused only on Contrast-FEL inference. Twenty-two of 28 sites were detected by the omnibus test and between one and three pairwise tests, whereas six sites were reported by only one of the pairwise tests, highlighting the additional resolution offered by these more focused tests. Patterns of differences at individual sites varied widely, with every possible pair being significantly different at least once. The simplest case (e.g., sites 160 and 179) is a significant discordance between two groups of branches. Another repeated pattern is when one group of branches stands apart from all others (e.g., sites 89 and 102).
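The count of seven tests per site follows from simple combinatorics: four branch groups give six unordered pairs, plus one omnibus test. The sketch below enumerates those tests and applies a family-wise error rate correction to a set of per-site p-values. The text states only that results were "FWER corrected"; Holm's step-down procedure is used here as an assumed stand-in, and the function names and p-values are hypothetical.

```python
from itertools import combinations

# Group names as they appear in the column headers of table 7.
GROUPS = ["Haemoproteidae", "Avian Hosts", "Mammalian Hosts", "Leucocytozoon"]


def site_tests(groups):
    """List the omnibus test followed by every pairwise group comparison."""
    pairs = [f"{a} vs {b}" for a, b in combinations(groups, 2)]
    return ["omnibus"] + pairs


def holm_adjust(p_values):
    """Holm step-down adjusted p-values (an assumed FWER procedure)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k + 1), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


tests = site_tests(GROUPS)
# Hypothetical raw p-values for the seven tests at a single site.
raw = [0.001, 0.40, 0.30, 0.004, 0.50, 0.006, 0.008]
significant = [t for t, p in zip(tests, holm_adjust(raw)) if p <= 0.05]
```

The pair order produced by `combinations` matches the order of the codes HA, HM, HL, AM, AL, ML used in the table note.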

Sites Evolving Differentially among the Four Branch Groups in the Cytochrome B Mitochondrial Gene of Haemosporidians from Pacheco et al. (2018), According to the Omnibus Test or At Least One Pairwise Test at LRT Corrected P-value of ≤ 0.05.

Codon | α | β Haemoproteidae (substitutions) | β Avian Hosts (substitutions) | β Mammalian Hosts (substitutions) | β Leucocytozoon (substitutions) | β Other | P-Value | q-Value | Significant Pairwise Tests
56 | 3.81 | 0.00 (2) | 0.00 (2) | 0.00 (3) | 0.75 (4) | 0.00 | 0.0465* | 0.795 | AL, ML
89 | 0.94 | 0.00 (1) | 0.00 (3) | 0.00 (4) | 1.33 (5) | 0.00 | 0.0001* | 0.017 | HL, AL, ML
102 | 1.66 | 0.00 (2) | 0.00 (2) | 0.23 (5) | 1.83 (6) | 0.42 | 0.0014* | 0.103 | HL, AL, ML
150 | 0.79 | 0.00 (1) | 0.00 (2) | 1.27 (9) | 0.00 (2) | 0.56 | 0.0075* | 0.403 | AM, ML
158 | 0.42 | 0.00 (0) | 0.00 (0) | 0.00 (2) | 1.30 (5) | 0.00 | 0.0207* | 0.518 | AL, ML
160 | 1.25 | 0.94 (6) | 0.00 (1) | 0.08 (8) | 0.43 (2) | 0.75 | 0.0422 | 0.755 | HA
179 | 0.89 | 0.48 (2) | 0.00 (0) | 0.25 (6) | 1.46 (6) | 1.10 | 0.0171* | 0.495 | AL
182 | 1.43 | 0.00 (0) | 0.00 (0) | 0.00 (0) | 2.23 (9) | 0.42 | 0.0001* | 0.013 | HL, AL, ML
183 | 1.80 | 0.00 (0) | 0.00 (3) | 0.08 (2) | 1.20 (4) | 0.54 | 0.0058* | 0.361 | HL, AL, ML
186 | 0.00 | 0.22 (1) | 0.00 (0) | 0.17 (2) | 0.65 (3) | 0.16 | 0.3127* | 1.000 | AL
193 | 1.53 | 0.00 (0) | 0.00 (5) | 0.00 (1) | 1.07 (5) | 0.00 | 0.0147* | 0.462 | AL, ML
194 | 1.05 | 0.00 (0) | 0.00 (0) | 0.00 (1) | 0.49 (4) | 0.19 | 0.1347* | 1.000 | ML
222 | 0.64 | 0.00 (0) | 0.00 (0) | 0.00 (0) | 0.70 (4) | 0.00 | 0.0246* | 0.514 | AL, ML
223 | 1.45 | 0.48 (3) | 0.00 (0) | 0.00 (1) | 0.70 (6) | 0.00 | 0.0200* | 0.538 | AL, ML
248 | 0.48 | 0.00 (0) | 0.10 (1) | 1.01 (9) | 0.22 (1) | 0.64 | 0.0573* | 0.937 | AM
253 | 1.67 | 0.00 (4) | 0.00 (4) | 0.00 (6) | 0.95 (4) | 0.40 | 0.0132* | 0.452 | AL, ML
256 | 1.92 | 0.00 (6) | 0.00 (3) | 0.73 (8) | 0.00 (0) | 0.64 | 0.0386* | 0.725 | AM
283 | 0.38 | 0.00 (1) | 0.16 (1) | 0.00 (0) | 1.36 (6) | 0.64 | 0.0241* | 0.534 | ML
285 | 1.17 | 0.22 (1) | 0.00 (1) | 0.39 (5) | 1.44 (6) | 1.01 | 0.0852* | 1.000 | AL
289 | 3.28 | 0.00 (0) | 0.00 (1) | 0.00 (0) | 1.31 (3) | 0.00 | 0.0012* | 0.110 | HL, AL, ML
309 | 1.05 | 0.83 (2) | 0.36 (6) | 1.11 (9) | 4.24 (8) | 0.61 | 0.0822* | 1.000 | AL
310 | 1.96 | 0.22 (1) | 0.00 (1) | 0.00 (2) | 0.83 (4) | 0.00 | 0.0119* | 0.499 | AL, ML
331 | 0.15 | 0.00 (0) | 0.00 (0) | 0.00 (0) | 7.31 (9) | 0.00 | 0.0000* | 0.012 | HL, AL, ML
338 | 0.00 | 0.00 (0) | 0.00 (0) | 0.00 (0) | 0.98 (3) | 0.00 | 0.0115 | 0.541 | AL, ML
341 | 0.43 | 0.49 (3) | 0.28 (2) | 0.00 (2) | 0.83 (6) | 0.45 | 0.1118* | 1.000 | ML
343 | 0.25 | 0.97 (3) | 0.14 (1) | 0.00 (1) | 1.23 (4) | 1.05 | 0.0224* | 0.527 | HM, ML
351 | 1.75 | 0.00 (2) | 0.00 (3) | 0.91 (8) | 0.88 (4) | 0.25 | 0.0367* | 0.727 | AM
366 | 0.52 | 1.43 (5) | 0.20 (3) | 0.00 (1) | 0.61 (6) | 0.67 | 0.0123* | 0.463 | HM, ML

Note: Individual pairwise tests are listed in the last column, coded as follows. HA, Haemoproteidae versus Avian Hosts (one site); HM, Haemoproteidae versus Mammalian Hosts (two sites); HL, Haemoproteidae versus Leucocytozoon (six sites); AM, Avian Hosts versus Mammalian Hosts (four sites); AL, Avian Hosts versus Leucocytozoon (eighteen sites); ML, Mammalian Hosts versus Leucocytozoon (twenty sites). Other notation the same as in table 3.


Introduction

Climate change affects various aspects of biodiversity across the planet (e.g., [1, 2]). In particular, shifts in phenotypic distributions within populations are widely reported, for a variety of morphological, phenological, or life-history traits [2–4]. Surprisingly, however, little is still known about the relative contributions of mechanisms underlying these shifts [5]. Within a population, phenotypic distributions may change due to a change in population structure (e.g., age structure or sex ratio), due to phenotypic plasticity (within or between individuals), and due to genetic change [6–8]. The exact mixture of mechanisms driving phenotypic change will determine the future of a population facing a prolonged change in environmental conditions [9], for several reasons. First, the consequences of changing population structure are variable and may be idiosyncratic (e.g., [8, 10]). Second, phenotypic plasticity can provide an efficient way to cope with a changing environment, but its effect may be short-lived and even maladaptive [11–13]. Third, genetic evolution, when driven by natural selection, can improve population growth rate, potentially contributing to long-term population persistence [12].

In wild populations, the respective contributions of plasticity versus evolution remain unknown for the vast majority of documented phenotypic changes [14, 15] (note that by evolution we mean genetic change, here and in the rest of the manuscript). To date, most of the evidence for evolutionary responses to climate change comes from plants [16]. In contrast, despite numerous examples of phenotypic changes apparently related to climate, there have been surprisingly few examples demonstrating unambiguously that a vertebrate population is evolving in response to climate change (see discussions in [17–20]). This lack of evidence may, in part, be due to the question not being prioritized [14, 15]. However, it probably also reflects the substantial challenges inherent in testing for adaptive evolution, in terms of requirements for appropriate data and statistical methods. For wild populations in which experimental manipulations are not feasible, the most plausible means of testing for the genetic basis of phenotypic changes is to use long-term pedigree data to test for changes in “breeding values,” the estimated genetic merit of individuals as ascertained from the phenotypes of their relatives [21]. This needs to be done with care, as trends in predicted breeding values can be confounded with environmental trends unless appropriately controlled for [22], and the precision of estimates of evolutionary rates can be inflated if the correlation structure of breeding value estimates is not properly handled [23]. To our knowledge, among the studies of wild vertebrate populations that properly account for uncertainty in breeding value predictions, only 3 have found evidence of genetic change underlying phenotypic change in line with selection pressures changing with climate: plumage colouration in collared flycatchers [20], and body size in Siberian Jays [24] and snow voles [25]. 
However, only with more empirical studies explicitly testing for evolution will it become possible to say whether the current lack of evidence also reflects a generally slow rate of adaptation to environmental change in natural populations [26].

Climate change may have impacts on numerous aspects of an organism’s biology, but phenology (i.e., the seasonal timing of life-history events) appears to be particularly affected [3, 27–29]. Dramatic changes of phenologies in response to earlier onset of spring are particularly well documented in mid- and high-latitude passerines, where breeding times are occurring earlier in numerous populations and species [18, 30]. The study of avian systems in particular has shown that a fine-tuning of phenology to the climate is crucial in determining individual fitness. Mistiming between mean breeding date and a fitness optimum that shifts with climate may re-shape selective pressures and hence potentially reduce population growth rate [31, 32], although establishing the link between individual-level and population-level processes is challenging [33, 34]. The effects of climate change on mammalian phenology are less well documented and less clear than those of birds [29] and may be more complex because mammals’ long gestation times likely make their breeding phenology sensitive to climate across a longer timeframe [17]. Furthermore, despite the extensive evidence for phenotypic shifts in phenology, the few studies that test for a genetic basis to changes in phenology in wild populations have not found evidence of genetic changes [35–38]. One possible exception is the change of egg hatching date in winter moths [39], for which a common garden experiment suggested a contribution of genetic change.

In a population of red deer (Cervus elaphus, Linnaeus 1758) on the Isle of Rum, Northwest Scotland, parturition date has advanced at a rate of 4.2 days per decade since 1980, a change that has been linked to temperatures and other weather conditions in the year preceding parturition, especially around the time of conception [40, 41]. Previous studies of this population have shown that phenotypic plasticity in response to temperature and population structure explain a substantial proportion (23%) of the advance in parturition dates [41] and also that within-individual plasticity is sufficient to explain the population-level relationship between temperature and parturition date [42]. However, the documented plasticity does not explain the majority of the observed phenotypic change over time, leaving room for processes that have not been investigated as of yet. It is plausible that evolution plays a role because the observed phenotypic change is qualitatively consistent with a genetic response to selection: parturition date is heritable in this population [43] and also under selection for earlier dates [44].

In this study, we use quantitative genetic animal models [21, 45] to estimate the rate of evolution in parturition date and the contribution of plastic and demographic processes to the observed shift in phenology in the Rum red deer study population. We start by considering the response to selection that might be expected from the observed strength of selection and (narrow-sense) heritability of parturition date, based on a simple “breeder’s equation” prediction [46]. One of the most striking conclusions from the recent application of quantitative genetic theory in evolutionary ecology has been the failure of univariate “breeder’s equation” predictions to capture trait dynamics in wild populations [47, 48]. This may be for multiple reasons, foremost of which is likely to be the unrealistic assumption that only the focal trait is relevant. We therefore also consider a multivariate breeder’s equation [49] and ask how selection on offspring size and the genetic correlation between parturition date and size alters the expected evolutionary response. However, there is a second, less well-explored reason for the failure of the theory: predicted genetic responses to selection are often compared to observed rates of phenotypic change rather than of genetic change. Phenotypic changes are generally affected not only by genetic changes but also by numerous nongenetic processes and therefore may poorly reflect underlying genetic changes. As the central analysis of this work we use trends in breeding values and the secondary theorem of natural selection (STS) to estimate the rate of evolution in parturition date. We then test whether the estimated rate of evolution is compatible with the response to selection predicted by either the univariate or multivariate “breeder’s equation,” or with genetic drift. 
We also consider the effect of nongenetic processes contributing to phenotypic change and quantify the role of phenotypic plasticity in response to warming temperatures and of changes in population structure.
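To make the two predictions mentioned above concrete, the following minimal sketch computes the expected per-generation response under the univariate breeder's equation (response = heritability × selection differential) and under its multivariate form (response vector = G matrix × selection gradient vector). All numbers are hypothetical and chosen only for illustration; they are not estimates from the Rum red deer population, and the function names are ours.

```python
def univariate_response(heritability, selection_differential):
    """Univariate breeder's equation: R = h^2 * S."""
    return heritability * selection_differential


def multivariate_response(g_matrix, selection_gradients):
    """Multivariate breeder's equation: response = G * beta (matrix-vector product)."""
    return [
        sum(g * b for g, b in zip(row, selection_gradients))
        for row in g_matrix
    ]


# Hypothetical values: trait 1 is parturition date (days), trait 2 is offspring size.
h2 = 0.2          # assumed narrow-sense heritability of parturition date
S = -1.5          # assumed selection differential (days), favouring earlier dates
R_univariate = univariate_response(h2, S)

G = [[4.0, 1.0],  # assumed additive genetic (co)variances; off-diagonal terms
     [1.0, 2.0]]  # represent the genetic covariance between date and size
beta = [-0.5, 0.25]  # assumed selection gradients on date and on size
R_multivariate = multivariate_response(G, beta)
```

With these made-up inputs, the positive genetic covariance combined with positive selection on size partly offsets the predicted advance in date, which is the kind of effect the multivariate prediction is meant to capture.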


Acknowledgments

We thank P. Visconti and M. Čengić for their contributions to the habitat cross-walk. We thank M. Busana and S. Hoeks for helping out with improving the efficiency of the code. We are thankful to Phylopic (www.phylopic.org) for providing nice free silhouettes and especially to Lukasiniho and T. Michael Keesey. J.H. and A.S. were supported by the GLOBIO project (www.globio.info). A.B.L. was supported by a Juan de la Cierva-Incorporación grant (IJCI-2017-31419) from the Spanish Ministry of Science, Innovation and Universities.

Description of land-use allocation by the GLOBIO model (Appendix S1) and of crosswalk between the GLOBIO land-use map and the IUCN habitat classification (Appendix S2), comparison between hunting database and selected tropical mammal species (Appendix S3), area loss by land use and hunting with 2 thresholds (Appendix S4), area loss due to different pressures (Appendix S5), patterns of tropical mammal species richness (Appendix S6), species affected by land use or hunting pressure as main driver of distribution reduction (Appendix S7), model selection results for the binomial hunting model (Appendix S8), mean area loss due to different pressures (Appendix S9), and cross-walk from IUCN habitat classes to ESA CII and GLOBIO classes 11 (Appendix S10) are available online. The authors are solely responsible for the content and functionality of these materials. Queries (other than absence of the material) should be directed to the corresponding author.


