7.4: Crossovers Allow Recombination of Linked Loci - Biology

Thus far, we have only considered situations with either no linkage (50% recombination) or complete linkage (0% recombination). It is also possible to obtain recombination frequencies between 0% and 50%, which is a situation we call incomplete (or partial) linkage. Genes that are on the same chromosome are said to be syntenic regardless of whether they are completely or incompletely linked. All linked genes are syntenic, but not all syntenic genes are linked, as we will learn later.

Crossovers occur during prophase I of meiosis, when pairs of homologous chromosomes have aligned with each other in a process called synapsis. Crossing over begins with the breakage of DNA of a pair of non-sister chromatids. The breaks occur at corresponding positions on two non-sister chromatids, and then the ends of non-sister chromatids are connected to each other, resulting in a reciprocal exchange of double-stranded DNA (Figure 4). Generally every pair of chromosomes has at least one (and often more) crossovers during meiosis (Figure 5).

Figure 5: A crossover between two linked loci can generate recombinant genotypes (AB, ab), from the chromatids involved in the crossover. Remember that multiple, independent meioses occur in each organism, so this particular pattern of recombination will not be observed among all the meioses from this individual. (Original-Deyholos-CC:AN)

Because the location of crossovers is essentially random along the chromosome, the greater the distance between two loci, the more likely a crossover will occur between them. Furthermore, loci that are on the same chromosome, but are sufficiently far apart from each other, will on average have multiple crossovers between them and they will behave as though they are completely unlinked. A recombination frequency of 50% is therefore the maximum recombination frequency that can be observed, and is indicative of loci that are either on separate chromosomes, or are located very far apart on the same chromosome.
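As a small illustration of the three cases described above, the sketch below computes a recombination frequency from progeny counts and classifies it. The counts are hypothetical, and the parental chromosomes are assumed to be Ab and aB so that AB and ab are the recombinant classes, as in the figure caption.

```python
def recombination_frequency(parental: int, recombinant: int) -> float:
    """Fraction of scored progeny that carry a recombinant chromosome."""
    total = parental + recombinant
    if total == 0:
        raise ValueError("no progeny counted")
    return recombinant / total


def linkage_class(rf: float) -> str:
    """Classify a recombination frequency (0 to 0.5 in expectation)."""
    if rf == 0:
        return "complete linkage"
    if rf < 0.5:
        return "incomplete linkage"
    return "unlinked"


# Hypothetical counts: Ab and aB are parental, AB and ab are recombinant.
counts = {"Ab": 410, "aB": 390, "AB": 105, "ab": 95}
rf = recombination_frequency(
    parental=counts["Ab"] + counts["aB"],
    recombinant=counts["AB"] + counts["ab"],
)
print(f"RF = {rf:.1%} ({linkage_class(rf)})")
```

With real data, sampling noise can push an observed frequency slightly above 50% even though the expected maximum is 50%, so the final branch treats any value at or above 0.5 as unlinked.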


Progressive Recombination Suppression and Differentiation in Recently Evolved Neo-sex Chromosomes

Recombination suppression leads to the structural and functional differentiation of sex chromosomes and is thus a crucial step in the process of sex chromosome evolution. Despite extensive theoretical work, the exact processes and mechanisms of recombination suppression and differentiation are not well understood. In threespine sticklebacks (Gasterosteus aculeatus), a different sex chromosome system has recently evolved by a fusion between the Y chromosome and an autosome in the Japan Sea lineage, which diverged from the ancestor of other lineages approximately 2 Ma. We investigated the evolutionary dynamics and differentiation processes of sex chromosomes based on comparative analyses of these divergent lineages using 63 microsatellite loci. Both chromosome-wide differentiation patterns and phylogenetic inferences with X and Y alleles indicated that the ancestral sex chromosomes were extensively differentiated before the divergence of these lineages. In contrast, genetic differentiation appeared to have proceeded only in a small region of the neo-sex chromosomes. The recombination maps constructed for the Japan Sea lineage indicated that recombination has been suppressed or reduced over a large region spanning the ancestral and neo-sex chromosomes. Chromosomal regions exhibiting genetic differentiation and suppressed or reduced recombination were detected continuously and sequentially in the neo-sex chromosomes, suggesting that differentiation has gradually spread from the fusion point following the extension of recombination suppression. Our study illustrates an ongoing process of sex chromosome differentiation, providing empirical support for the theoretical model postulating that recombination suppression and differentiation proceed in a gradual manner in the very early stage of sex chromosome evolution.


Introduction

High-throughput sequencing technology has greatly accelerated the construction of dense linkage maps in nonmodel species (e.g. Amores et al. 2011; Miller et al. 2012; Everett & Seeb 2014). The emerging maps allow unprecedented insights into the genomic architecture of adaptive divergence in the wild (e.g. Chutimanitsakun et al. 2011; Gagnaire et al. 2013; Richards et al. 2013). These studies catalyse our understanding of the number, genomic location and colocation of loci affected by natural selection. Unfortunately, centromeres are rarely included on linkage maps because the additional mapping efforts such as mapping half-tetrads add significant logistical hurdles (cf. Thorgaard et al. 1983; Brieuc et al. 2014).

Centromeres represent a fundamental component of chromosomal structure and function (Henikoff et al. 2001), and information about centromere location is vital for properly understanding genomes. Studies describing genetic divergence have shown striking patterns in either centromeric or telomeric regions (Carneiro et al. 2009; Ellegren et al. 2012). Hence, knowledge about chromosome type (i.e. acrocentric vs metacentric) is of paramount importance for understanding the underlying architecture of adaptive traits. However, interpretations often suffer from the lack of known centromeres on reference maps and genomes; this has impoverished interpretations of results plotted along high-density maps (e.g. Wang et al. 2012; Gagnaire et al. 2013; Carlson et al. 2015) or genome assemblies (e.g. Ellegren et al. 2012; Tine et al. 2014; Xu et al. 2014).

Here, we demonstrate a straightforward method to identify centromeric regions on linkage maps by phasing the same recombination data used to construct the map. We validate the method by comparing phased centromere placement with more direct centromere placement using half-tetrad analysis in sockeye salmon, Oncorhynchus nerka. Finally, we provide test examples that highlight advantages and limitations of the method for mapping centromeres in different sexes and taxa.


Abstract

During meiosis, a programmed induction of DNA double-strand breaks (DSBs) leads to the exchange of genetic material between homologous chromosomes. These exchanges increase genome diversity and are essential for proper chromosome segregation at the first meiotic division. Recent findings have highlighted an unexpected molecular control of the distribution of meiotic DSBs in mammals by a rapidly evolving gene, PR domain-containing 9 (PRDM9), and genome-wide analyses have facilitated the characterization of meiotic DSB sites at unprecedented resolution. In addition, the identification of new players in DSB repair processes has allowed the delineation of recombination pathways that have two major outcomes, crossovers and non-crossovers, which have distinct mechanistic roles and consequences for genome evolution.


Materials and methods

Sampling and DNA sequencing

X-chromosomes were sampled from 14 geographic populations that span the D. neotestacea species range (Table S1; see table 1 of Dyer, 2012). These males are the same as used in Dyer (2012), in which wild-caught males were identified as carrying ST or SR X-chromosomes by the proportion of female offspring they produced. From each population where SR is present, at least three ST and three SR males were sequenced at each of 11 randomly chosen X-linked loci, which included six protein-coding genes (marf, mof, pgd, rpl, spk and sxl) and the flanking regions of six microsatellite loci (neo5261, neo6002, neo7029, neo7040, neo8377 and neo8385) (Table S2) (Dyer, 2007). In total, each male was sequenced at 4510 base pairs on the X-chromosome. In populations without SR present, only ST males were sampled. At least 53 ST and 41 SR males were included for every marker, although the total sample number is variable for each individual locus. From each population, at least three males were chosen randomly with respect to their X-chromosome type using a random number generator (www.random.org). These males were sequenced at seven arbitrary autosomal protein-coding loci located on all of the other Muller elements except F, which included esc, gl, mago, ntid, sia, tpi and wee for a total of 2657 bp per individual (Table S2). All these loci were also sequenced in one or two individuals of D. orientacea and D. putrida, which are two of the other members of the testacea species group. From the final member of the testacea species group, D. testacea, one female from each of 24 isofemale lines collected in Munich, Germany, was chosen for sequencing.

DNA extractions were performed with Qiagen Puregene Core Kit A. Fragments were amplified using standard PCR protocols (Table S3) and sequenced on an Applied Biosystems (Foster City, CA, USA) 3730xl DNA Analyzer at the Georgia Genomics Facility. Base calls were confirmed using Geneious (Kearse et al., 2012), and heterozygous SNPs in diploid loci were phased using PHASE (Scheet & Stephens, 2006). Sequences were aligned by hand in Geneious, and microsatellite repeats were removed manually, leaving only the flanking regions for use in analyses (Kearse et al., 2012). Open reading frames were assigned using annotated D. melanogaster orthologs as a guide (www.flybase.org).

Recombination mapping

We used recombination mapping to determine the order of the loci on the ST chromosome. We performed single pair crosses with flies from isofemale lines originally collected in Seattle, WA (Sea), and Coeur d'Alene, ID (ID-1). Sea females were crossed to ID-1 males, and the F1 females were collected and crossed to males from an inbred laboratory stock. The F2 males (carrying a maternally derived X-chromosome) were collected and frozen for genotyping. All flies were reared on Instant Drosophila Medium (Carolina Biological Supply, Burlington, NC, USA) supplemented with commercial mushroom (Agaricus bisporus) at 20 °C with 60% relative humidity on a 12-h light/dark cycle. DNA from the parents and 92 F2 males was extracted as described above. Parents were genotyped using repeat number at X-linked microsatellite markers (as described in Pinzone & Dyer, 2013) and sequenced at X-linked protein-coding loci to identify SNPs at restriction enzyme cut sites. BsaWI (New England Biolabs, Ipswich, MA, USA) was used for restriction fragment analysis of spk. F2 males were genotyped at the microsatellite loci and spk and were sequenced at marf and pgd to identify which parental allele they carried. The remaining two loci (rpl and mof) contained no polymorphisms in the parental cross and thus could not be mapped. The most likely map order and distances were calculated using the Kosambi mapping function in MapDisto with 1000 bootstrap replicates (Kosambi, 1943; Lorieux, 2012).
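For readers unfamiliar with the Kosambi mapping function mentioned above, the following sketch shows the standard formula that converts a recombination fraction into a map distance, together with its inverse. It is a standalone illustration and not MapDisto's code; the function names are chosen here for the example.

```python
import math


def kosambi_cm(r: float) -> float:
    """Kosambi map distance in cM for a recombination fraction r in [0, 0.5)."""
    if not 0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return 25.0 * math.log((1 + 2 * r) / (1 - 2 * r))


def kosambi_r(d_cm: float) -> float:
    """Inverse: recombination fraction expected at a Kosambi distance in cM."""
    if d_cm < 0:
        raise ValueError("map distance must be non-negative")
    return 0.5 * math.tanh(2 * d_cm / 100.0)


# Illustrative recombination fractions, not values from the study.
for r in (0.01, 0.10, 0.30, 0.45):
    print(f"r = {r:.2f} -> {kosambi_cm(r):6.2f} cM")
```

For small recombination fractions the map distance in cM is close to 100 × r, and it grows without bound as r approaches 0.5, which is why loosely linked markers can only be reported as being at least some large distance apart.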

As SR/SR females are fertile, we attempted to use two independently collected and maintained SR lines for recombination mapping of the SR chromosome. These lines were collected in Eugene, OR, in 2001 (SR-Par) and in Rochester, New York, in 1990 (SR-Lab). However, based on sequencing each locus from each line, there was not enough variation between them to determine anything except that neo6002 and neo7029 are at least 50 cM apart on SR, the same as on ST.

The ST laboratory stock was also originally collected in New York in 1990, and the SR-Lab stock is maintained in the same genetic background. Every generation, ST/Y males are crossed to SR/SR females to generate SR/Y males, and SR/Y males are also crossed to SR/SR females to produce more SR/SR females. We tested for recombination between ST and SR by crossing ST/SR heterozygous females with ST/Y males and then by genotyping 95 male offspring at the six X-linked microsatellites described above. Parental ST and SR genotypes were identified using four SR/SR females and two ST/Y males from these highly inbred SR and ST stocks.

Phylogenetic analysis

Multilocus phylogenetic species trees were constructed using *BEAST with the HKY nucleotide substitution model and a chain length of 100 million, with 20% burn-in removed (Heled & Drummond, 2010; Bouckaert et al., 2014). Orthologs from D. testacea, D. orientacea and D. putrida were included in the analysis, and based on previous phylogenetic work, D. putrida was used as an outgroup to root the trees (Perlman & Jaenike, 2003; Dyer et al., 2011). Trees for X-chromosome and autosomal loci were constructed separately. For the X-linked species tree, D. neotestacea samples were divided into ST and SR. The X-linked tree also included the marker sxl, which was dropped from further analyses due to extremely low polymorphism. The autosomal tree used all seven autosomal markers. The individual gene trees generated as a part of these *BEAST analyses were used to infer the relationship between D. neotestacea ST and SR and D. testacea at the individual locus level.

Relationships between individual samples were inferred using a multilocus phylogenetic tree constructed with all the X-linked markers except neo5261, which lacked a sequence from D. putrida used to root the tree. This tree was also built in *BEAST using the HKY substitution model and a chain length of 2 billion with 20% burn-in removed.

Patterns of genetic differentiation and recombination

To estimate the patterns of genetic differentiation, we calculated KST and Snn (Hudson et al., 1992; Hudson, 2000) in DnaSP (Librado & Rozas, 2009) using both geographic sampling location and ST and SR as groupings. Significance of individual KST and Snn values was determined using 1000 random permutations according to the method of Hudson et al. (1992) as implemented in DnaSP (Librado & Rozas, 2009). Statistical analyses were carried out in RStudio (2014).

To estimate the patterns of LD within and across loci, all X-linked markers were concatenated in the order of the ST genetic map and the pairwise correlation coefficient (R²) between each pair of parsimony informative sites was calculated according to the method of Hill & Robertson (1968). Significance of each association was calculated in DnaSP using a Fisher's exact test with a Bonferroni correction for multiple testing using α = 0.05 (Librado & Rozas, 2009). LD was inferred using all samples (SR and ST) and separately for SR and ST chromosomes. ZnS for each locus was calculated according to the method of Kelly (1997). The population recombination rate (ρ) was estimated for each locus using the composite-likelihood method of Hudson (2001) in the program LDHat (McVean et al., 2004). For the autosomes, ρ = (1/2) × 4Ner = 2Ner (as males do not recombine in D. neotestacea), where Ne is the effective population size and r is the per-generation recombination rate. For the X-chromosome, ρ = (2/3) × 3Ner = 2Ner. ρ was calculated for ST and SR separately for each marker and then scaled by the number of sites. Genetic exchange between SR and ST was detected in the concatenated alignment of all X-linked markers using the method of Betran et al. (1997) as implemented in DnaSP (Librado & Rozas, 2009).
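The pairwise correlation statistic used above (often written r²) can be sketched for two biallelic sites as follows. This is a minimal illustration with made-up haplotypes, not DnaSP's implementation; alleles are assumed to be coded 0/1 on phased chromosomes, and the Bonferroni helper simply divides α by the number of tests.

```python
def r_squared(site_a: list, site_b: list) -> float:
    """Squared allelic correlation between two biallelic sites.

    Each list holds one 0/1 allele per phased chromosome, in the same order.
    """
    if len(site_a) != len(site_b) or not site_a:
        raise ValueError("sites must be non-empty and of equal length")
    n = len(site_a)
    p_a = sum(site_a) / n                      # frequency of allele 1 at site A
    p_b = sum(site_b) / n                      # frequency of allele 1 at site B
    p_ab = sum(1 for a, b in zip(site_a, site_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                       # linkage disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic")
    return d * d / denom


def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


# Hypothetical haplotypes: four chromosomes scored at three sites.
site_1 = [0, 0, 1, 1]
site_2 = [0, 0, 1, 1]   # perfectly associated with site_1
site_3 = [0, 1, 0, 1]   # independent of site_1
print(r_squared(site_1, site_2), r_squared(site_1, site_3))
print(bonferroni_alpha(0.05, 3))
```

Averaging this statistic over all pairs of segregating sites within a locus gives a ZnS-style summary, although the exact weighting used in the study is not reproduced here.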

Patterns of nucleotide polymorphism

For each marker, nucleotide polymorphism was analysed separately for ST and SR sample groups. In markers that contained open reading frames, segregating sites were split into silent site polymorphisms (synonymous changes and changes outside the open reading frame) and nonsynonymous polymorphisms. Microsatellite flanking regions were assumed to be silent sites. To evaluate the patterns of nucleotide polymorphism, average pairwise nucleotide differences (π) (Nei, 1987), Watterson's θ (Watterson, 1975) and Tajima's D (Tajima, 1989) were calculated in DnaSP using only silent sites (Librado & Rozas, 2009). π was also calculated using only nonsynonymous variation for the protein-coding genes. Net nucleotide substitutions per site (Da) between ST and SR were calculated for a combined set of all X-linked markers (Nei, 1987; Librado & Rozas, 2009). Parsimony informative sites were identified for ST and SR individually in DnaSP and used to identify private alleles. Expected Tajima's D values were generated using 10 000 coalescent simulations in the program HKA (https://bio.cst.temple.edu/~hey/software/software.htm#HKA). Ka/Ks and πa/πs were calculated for all protein-coding loci in DnaSP (Librado & Rozas, 2009), with orthologs from D. putrida used to identify the substitutions.
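To make the two diversity estimators named above concrete, here is a minimal sketch that computes average pairwise differences (π) and Watterson's θ from a toy alignment. The sequences are invented, values are per sequence rather than per site, and no distinction between silent and nonsynonymous sites is attempted, so this only illustrates the definitions and not the DnaSP analysis.

```python
from itertools import combinations


def pairwise_diversity(seqs: list) -> float:
    """Average number of differences between all pairs of aligned sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    total = sum(sum(1 for x, y in zip(a, b) if x != y) for a, b in pairs)
    return total / len(pairs)


def segregating_sites(seqs: list) -> int:
    """Number of alignment columns with more than one allele."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def watterson_theta(seqs: list) -> float:
    """Watterson's estimator: S divided by the harmonic number a1."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    a1 = sum(1.0 / i for i in range(1, n))
    return segregating_sites(seqs) / a1


# Invented alignment of four sequences, four sites each.
alignment = ["AAGT", "AAGT", "ACGT", "ACGA"]
pi = pairwise_diversity(alignment)
theta_w = watterson_theta(alignment)
print(f"pi = {pi:.3f}, theta_W = {theta_w:.3f}, pi - theta_W = {pi - theta_w:.3f}")
```

Tajima's D is a normalized form of the difference π − θ, so its sign matches the sign of the last value printed; the variance term needed for the normalization is omitted here.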

Polytene chromosome squashes

Inversions on SR were confirmed using squashes of polytene chromosomes from the salivary glands of D. neotestacea third-instar larvae. Salivary glands were dissected out in phosphate-buffered saline, fixed in 45% acetic acid, stained with orcein and then physically squashed on a microscope slide to separate the chromosomes (Sullivan et al., 2000). Chromosome spreads were examined under 400× magnification using phase-contrast microscopy. The X-chromosome was identified in ST males from the inbred laboratory stock because it showed no synapses or chromosomal inversions; ST/SR heterozygote females were then generated and used to identify the inversions between ST and SR.


Abstract

The Morgan2McClintock Translator permits prediction of meiotic pachytene chromosome map positions from recombination-based linkage data using recombination nodule frequency distributions. Its outputs permit estimation of DNA content between mapped loci and help to create an integrated overview of the maize nuclear genome structure.

Two fundamentally different but colinear types of gene maps can be produced, linkage maps and physical maps. Classical linkage (genetic) maps are based on allele-recombination frequencies, whereas physical maps are based on the linear DNA molecules that compose the chromosomes.

In maize, a model genetic and major agricultural species, high-resolution linkage maps composed of thousands of markers are available, whereas detailed physical maps of DNA sequence and chromosome structure are still in development. The three main types of maize physical maps differ in the level of molecular resolution. They are (1) genome sequence assembly maps at DNA base-pair resolution (see, e.g., Dong et al. 2005; Fu et al. 2005); (2) fingerprint-contig maps, resolved at the level of overlapping restriction fragments from cloned segments of genomic DNA (see, e.g., Pampanwar et al. 2005); and (3) cytological maps constructed by microscopic observation of pachytene chromosome structure (e.g., the Cytogenetic FISH 9 map created by Koumbaris and Bass 2003 and Amarillo and Bass 2004).

Linkage and physical maps have different coordinate systems for positioning loci. The genetic map unit is called a “centiMorgan” (cM) in honor of Thomas Hunt Morgan. One centimorgan is equal to 1% crossing over between two linked loci. Fingerprint-contig and genomic-assembly maps are measured in base pairs, whereas physical maps based on pachytene chromosome structure (also called cytological or cytogenetic maps) position each locus as the fractional distance along the arm from the centromere to the telomere. Recently, maize researchers have begun to call the unit of this sort of map denomination a “centiMcClintock” (cMC) in honor of maize genetics pioneer Barbara McClintock. Here we formally define 1 cMC as 1% of the length of the chromosome arm upon which a given locus resides. For example, if the short arm of chromosome 9 is 8.70 μm in length and the bronze1 (bz1) locus lies 5.66 μm from the centromere on that chromosome arm, bz1 lies (5.66/8.70 × 100 =) 65% of the distance from the centromere to the chromosome tip or 65 cMC from the centromere. A locus at position 66 would lie exactly 1 cMC from the bz1 locus. Because maize chromosome arm lengths vary and the centiMcClintock is a relative unit, 1 cMC on, e.g., the short arm of chromosome 9 does not necessarily consist of the same number of micrometers as 1 cMC on any of the 19 other chromosome arms. The cytological conventions are further described and defined at http://www.maizegdb.org/coordinateDef.php.
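The centiMcClintock definition above reduces to a one-line calculation, sketched here with the bz1 numbers from the text. The function name and the argument checks are illustrative choices and are not part of the Morgan2McClintock Translator.

```python
def centimcclintock(distance_from_centromere_um: float, arm_length_um: float) -> float:
    """Position of a locus as a percentage of its chromosome arm length (cMC)."""
    if arm_length_um <= 0:
        raise ValueError("arm length must be positive")
    if not 0 <= distance_from_centromere_um <= arm_length_um:
        raise ValueError("locus must lie on the arm")
    return 100.0 * distance_from_centromere_um / arm_length_um


# Values from the text: short arm of chromosome 9, bronze1 (bz1) locus.
arm_9s_um = 8.70
bz1_cmc = centimcclintock(5.66, arm_9s_um)
um_per_cmc = arm_9s_um / 100.0   # physical size of 1 cMC on this arm only

print(f"bz1 lies at about {bz1_cmc:.0f} cMC; 1 cMC on 9S is {um_per_cmc:.3f} um")
```

Because the unit is relative, `um_per_cmc` has to be recomputed for each arm, which is the point made in the text about the 19 other chromosome arms.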

Recombination rates vary tremendously along individual chromosomes such that the map distance between two loci on a linkage map may not accurately predict the physical distance between them (Anderson et al. 2004). This variation has made integrating the two types of maps difficult and also has important implications for genome-assembly efforts and positional-cloning strategies (Sadder and Weber 2002).

A method for linking genetic maps with chromosome structure has recently been developed. Anderson et al. (2003) determined the frequency distributions of recombination nodules (RN) along the 10 pachytene chromosomes of maize. Because each RN represents a crossover on the physical structure of the chromosome, these RN maps are unique in that they contain both linkage and cytological information that allows the prediction of the cytological position of any genetically mapped marker (Anderson et al. 2004). We have developed a tool, the Morgan2McClintock Translator (accessible at http://www.lawrencelab.org/Morgan2McClintock), which automates the cytological-position prediction process for any input linkage data.

Conversion of maize linkage map coordinates into cytological coordinates requires both linkage data and RN frequencies as input. The Morgan2McClintock Translator includes as data files the maize RN map (Anderson et al. 2003) as well as two genetic maps, the University of Missouri at Columbia (UMC) 1998 map (Davis et al. 1999) and the 1997 genetic map (Neuffer et al. 1995). More than a thousand other genetic maps, which also can be used as input files, are available at MaizeGDB (Lawrence et al. 2005 and http://www.maizegdb.org/map.php). The translator itself was coded with PHP, and the equations that it uses to convert linkage maps into cytological maps are those described by Anderson et al. (2004). The application can be run online, or it can be downloaded for local use on any machine equipped to serve PHP. Aspects of the input and output displays for the translator for the UMC 98 genetic map are shown in Figure 1 (Davis et al. 1999).

The Morgan2McClintock Translator. Screen capture images taken from http://www.lawrencelab.org/Morgan2McClintock show examples of data input (top) and output (bottom). (A) The user first chooses the maize linkage group as chromosome number (arrow at Step 1) and then the corresponding centimorgan linkage-map data set (arrow at Step 2). The linkage map data can be chosen from among stored data sets available for common maps or pasted directly into a text box for map data not currently stored. Clicking the “Calculate” button submits input data and calculates centiMcClintock values from the RN frequency distribution. The output web page contains a table that summarizes one locus per row and includes columns that describe the input data in centimorgans (B) and the output data in predicted locations along the pachytene chromosome, expressed in microns and in centiMcClintocks (C).

The distribution of RNs provides an important connection between genetic maps and chromosomal structure, which has allowed the examination of gene distribution at the chromosomal level in maize (Anderson et al. 2006). This integration also permits estimation of DNA and chromosomal distances between genetic loci, a feature that will assist in the sequence assembly of the maize genome. Theoretically, this approach is applicable to other organisms with comparable cytological crossover-distribution data such as tomato (Sherman and Stack 1995) and mouse (Froenicke et al. 2002), and we plan to develop a set of similar tools for these organisms that should be useful in comparing genetic and chromosomal aspects of genomes in different species.


Materials and methods

Plant material

Two half-sib panels of 11 and 13 half-sib families were established in the Plant-KBBE project CornFed, one for European Dent and one for European Flint maize. The lines used in this study, their origins, and assignment to Dent or Flint pools are listed in Additional file 1. Each of the two panels consists of a central line (or common parent) that was crossed to founder lines that represent important and diverse breeding lines of the European maize germplasm. The central line (F353) of the Dent panel was crossed with 10 Dent founder lines. For the Dent panel the prefix for populations is CFD. In the Flint panel, the central line UH007 was crossed with 11 Flint founder lines and the prefix for populations of this panel is CFF. In addition, each of the founder lines was crossed with B73 and also the reciprocal populations F353 × UH007 and UH007 × F353 were generated. These additional populations were made to connect the two panels with each other and with the US NAM population [32] via the parental line B73. The crossing scheme of the two half-sib panels and their connection is shown in Additional file 15. All progenies were homozygous DH lines obtained from F1 plants. The resulting 24 DH populations consisted of 35 to 129 lines (Table 1). In total, 2,267 DH lines were used for analysis in this study.

Crosses between central and founder lines were made by hand-pollinations using F353 or UH007 as female lines and the founder lines as males. F1 plants were pollinated with an inducer line for in vivo haploid production followed by chromosome doubling and selfing of D0 plants, and subsequent multiplication to obtain D1 plants [36]. Atypical lines within a cross or atypical plants within rows of DH lines were eliminated based on phenotypic observations.

SNP genotyping

Bulk samples of dried leaves or kernels from up to eight D1 plants derived from the same D0 were used for DNA extraction using the cetyl trimethylammonium bromide (CTAB) procedure. DNA samples were adjusted to 50 to 70 ng/μl and 200 ng per sample were used for genotyping. DH line purity and integrity was first checked using a custom 96plex VeraCode assay (Illumina®, San Diego, CA, USA) with genome-wide SNP markers to ensure that the lines carried only one of the parental alleles at each SNP, that they did not carry alleles of the inducer line and that they were derived from true F1 plants. For a subset of DH lines, 13 proprietary SNP markers assayed with the KASP™ technology (LGC Genomics, Berlin, Germany) were used for testing line purity and integrity. True DH lines were then used for genotyping with the Illumina® MaizeSNP50 BeadChip [33] on an Illumina® iScan platform. Array hybridization and raw data processing were performed according to manufacturer’s instructions (Illumina®). Raw data were analyzed in Illumina®’s Genome Studio software version v2011 (Illumina®) using an improved version of the public cluster file (MaizeSNP50_B.egt, [62]). SNP data were filtered based on the GTscore using a threshold of 0.7. Heterozygous SNPs were set to missing values (NA) and only markers with a minor allele frequency >0.1 per population were used for mapping. For each population, the allele of the central line was coded as the 'A' allele, and the allele of the founder line was coded as 'B' allele (Additional file 4). Raw genotyping data of parents and DH lines are available at NCBI Gene Expression Omnibus as dataset GSE50558 [63].

Analysis of parental genetic diversity

Genetic diversity between parental lines was assessed with genome-wide SNP markers by principal coordinate analysis, cluster analysis, and by a pairwise genome scan for polymorphism between the parents of each population. For details, see Additional file 8.

Genetic map construction

Genetic maps were constructed for each individual population as described earlier [33] using CarthaGene [64] called from custom R scripts. In the first step, statistically robust scaffold maps were constructed with marker distances of at least 10 cM. In a second step, marker density was increased to produce framework maps containing as many markers as possible, while keeping a LOD score >3.0 for the robustness of marker orders. Finally, the complete maps were obtained by placement of additional markers using bin-mapping [65]. CentiMorgan (cM) distances were calculated using Haldane’s mapping function [66]. Individual genetic maps and genotypic data used for construction of the maps (Additional file 4) were deposited at MaizeGDB under the project acronym CORNFED [67].
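The Haldane mapping function referred to above assumes that crossovers occur independently (no interference). A minimal sketch of the standard formula and its inverse follows; it is not CarthaGene code, and the function names are chosen only for this example.

```python
import math


def haldane_cm(r: float) -> float:
    """Haldane map distance in cM for a recombination fraction r in [0, 0.5)."""
    if not 0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return -50.0 * math.log(1 - 2 * r)


def haldane_r(d_cm: float) -> float:
    """Inverse: recombination fraction expected at a Haldane distance in cM."""
    if d_cm < 0:
        raise ValueError("map distance must be non-negative")
    return 0.5 * (1 - math.exp(-2 * d_cm / 100.0))


# A scaffold-map spacing of 10 cM corresponds to roughly 9% recombinants.
print(f"10 cM -> r = {haldane_r(10.0):.4f}")
print(f"r = 0.10 -> {haldane_cm(0.10):.2f} cM")
```

Under this function, distances add along the map while recombination fractions do not, which is why maps are reported in cM.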

Physical map coordinates of SNPs

Chromosome and position assignments of SNPs of the MaizeSNP50 BeadChip supplied by the manufacturer (Illumina®, San Diego, CA, USA) are based on the B73 AGPv1 assembly, with many markers lacking chromosome and/or position information. We therefore performed a new mapping of the SNPs on the B73 AGPv2 assembly [68] using BWA [69]. The new assignments were used for all analyses involving the physical mapping information. Assignments are available in Additional file 4.

Construction of bare and masked Marey maps

Given a chromosome and the associated genetic map of an individual population, we determined the marker positions on the B73 assembly. From these physical and genetic positions, we constructed a first Marey map [70] containing all syntenic markers. This Marey map was smoothed using cubic spline interpolations [71], producing a 'bare' Marey map that was forced to be monotonic. Then regions where mapping information was lacking (for example, segments IBD in the parents) were masked, producing 'masked' Marey maps (Additional file 9). The detailed procedure is explained in Additional file 8.

Recombination landscapes

Once a bare Marey map was constructed, we defined the recombination landscape function as its derivative. Since the bare map was monotonic by construction, the recombination rates were positive as they should be. In effect, this landscape function provided the local recombination rate (in cM per Mbp) for any physical position of the B73 assembly. Note that this procedure did not distinguish the regions where these recombination rates were estimated reliably from those where they were not (unmasked versus masked regions). For comparison tests, it was thus necessary to resort to imputation, that is, to infer missing data from other maps in a conservative way.
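The idea of taking the derivative of a monotonic Marey map can be sketched as follows. Note the substitution: the study smoothed the map with cubic splines, whereas this example uses a simpler stand-in, a running maximum to force monotonicity followed by piecewise-linear slopes. Marker positions are invented.

```python
def force_monotonic(genetic_cm: list) -> list:
    """Replace each genetic position by the running maximum so the map never decreases."""
    result = []
    current = float("-inf")
    for value in genetic_cm:
        current = max(current, value)
        result.append(current)
    return result


def recombination_landscape(physical_mbp: list, genetic_cm: list) -> list:
    """Return (start_mbp, end_mbp, rate in cM per Mbp) for each marker interval."""
    if len(physical_mbp) != len(genetic_cm):
        raise ValueError("need one genetic position per physical position")
    cm = force_monotonic(genetic_cm)
    landscape = []
    for i in range(len(physical_mbp) - 1):
        width = physical_mbp[i + 1] - physical_mbp[i]
        if width <= 0:
            raise ValueError("physical positions must be strictly increasing")
        landscape.append((physical_mbp[i], physical_mbp[i + 1], (cm[i + 1] - cm[i]) / width))
    return landscape


# Invented markers: the third one is out of order on the genetic map.
physical = [0.0, 10.0, 20.0, 40.0]
genetic = [0.0, 20.0, 18.0, 30.0]
for start, end, rate in recombination_landscape(physical, genetic):
    print(f"{start:5.1f}-{end:5.1f} Mbp: {rate:.2f} cM/Mbp")
```

Because the map is monotonic by construction, every interval rate is non-negative, mirroring the property noted above for the bare Marey map.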

Imputed Marey maps for comparison tests

To compare the genetic lengths and the recombination landscapes between two different populations or pools of populations, we used 'imputed' Marey maps where the information missing in the masked Marey map of either population was replaced by the information available in the other population. If a region had masked data in both populations, its content was imputed using the averaged data of all other populations. In all cases, the imputation procedure was designed so that missing or unreliable data in either population to be compared never induced artificial differences. The detailed procedure used for imputation is explained in Additional file 8.

Comparing genetic map lengths

For any given population, mapping data led to an estimate of the genetic length for a given chromosome. We examined pairwise differences in genetic length between populations as well as differences between pools of populations and tested them for their level of statistical significance. Such comparisons were performed from the imputed Marey maps with a significance threshold of 5%, using the welch.test() function in the R software and the conservative Bonferroni correction for multiple testing. The origin of the information (original or imputed data along the chromosomes) was taken into account when comparing populations or pools of populations. For details see Additional file 8.

Effect of population structure on recombination rate

Genetic map lengths tended to be longer in populations involving Flint parents, suggesting that some alleles of factors controlling recombination rate may be differentially fixed in the two pools. To obtain a solid and objective measure of the degree of 'flintness', we estimated the probability that each of the 22 parental lines belongs to one of the two main groups (Dent or Flint). To do this, we estimated admixture in a combined analysis of the Dent and Flint diversity panels described earlier [37], which also included the lines of our study. This analysis was done with the Admixture software (version 1.22) [72], using 25,237 PANZEA SNPs and 559 maize lines. We chose k = 2 for the number of groups, assuming the two pools Dent and Flint, and used the probability of each of the 22 parental lines belonging to the Flint pool for a correlation analysis with recombination rates. More precisely, we analyzed the correlation between the genome-wide recombination rates (GWRRs) of the 23 populations and the average 'flintness' of the two parents of each population using the function lm() of the R software. The associated R² specifies what fraction of the variance in the GWRR is explained by the group structure of the parental lines. The function lm() also provides the P value for testing the absence of correlation.
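The regression step can be illustrated with a minimal ordinary least-squares fit in Python. This sketch stands in for the R lm() call rather than reproducing it: the flintness and GWRR values are invented, the function name is hypothetical, and the P value that lm() reports is not computed here.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = intercept + slope * x, with R squared."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical data: Flint-membership probability of the two parents of each
# population, and that population's GWRR (cM/Mbp).
parent_flintness = [(0.0, 0.0), (0.0, 0.5), (0.5, 0.5), (0.5, 1.0)]
gwrr = [0.5, 0.7, 0.6, 0.8]

# The predictor is the average flintness of the two parents.
mean_flintness = [(a + b) / 2 for a, b in parent_flintness]
slope, intercept, r_squared = fit_line(mean_flintness, gwrr)
```

With these toy numbers, R² is 0.64, meaning that 64% of the variance in the invented GWRR values is explained by average parental flintness.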

Individual additive effects for recombination rate

The 23 genetic map lengths we estimated showed a clear positive correlation with the average 'flintness' of the parents in the crosses. However, the two central lines could be driving this correlation. To remove effects coming from the two central lines, we considered an additive model whereby the GWRR of a population produced by a cross is given by the average of two effects, one from each parent in the cross. Each founder line other than B73 is involved in a single cross. We took the GWRR of that cross and subtracted the GWRR of the cross involving the same central line and B73. This difference gives the individual additive effect of the founder line minus that of B73, up to a constant. This constant does not affect a putative correlation between individual-specific 'flintness' and GWRR. We tested the significance of this correlation using the same procedures as in the previous section, with the central lines and B73 omitted.
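The subtraction described above is simple enough to sketch directly. In the Python below, the cross labels ("C1", "C2", "F_a", "F_b") and all GWRR values are hypothetical placeholders, and the dictionary layout is an assumption made for illustration.

```python
def relative_additive_effects(gwrr_by_cross, reference_line="B73"):
    """GWRR of each founder's cross minus the GWRR of the cross between the
    same central line and the reference line.

    Under the additive model this equals the founder's effect minus the
    reference line's effect, up to a constant factor.
    """
    effects = {}
    for (central, founder), gwrr in gwrr_by_cross.items():
        if founder == reference_line:
            continue
        effects[founder] = gwrr - gwrr_by_cross[(central, reference_line)]
    return effects


# Hypothetical GWRRs keyed by (central line, founder line).
gwrr_by_cross = {
    ("C1", "B73"): 0.60,
    ("C1", "F_a"): 0.70,
    ("C2", "B73"): 0.55,
    ("C2", "F_b"): 0.50,
}
effects = relative_additive_effects(gwrr_by_cross)
```

Because each difference is taken within the same central line, the contribution of that central line cancels out.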

Comparing recombination landscapes

Just as the genetic length of chromosome maps may differ, two recombination landscapes can have different features (different shapes of the Marey maps). To test whether these differences were statistically significant, we normalized the genetic lengths of the two maps or pools of maps to be compared, by rescaling both of them to their mean value. Then, to compare the shape of both normalized Marey maps, our approach was based on binning the landscapes, representing each as a histogram and then applying a chi-squared test with a conservative Bonferroni-corrected significance threshold of 5% (Additional file 11). The detailed procedure is explained in Additional file 8.
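A generic version of this comparison is sketched below in Python: two binned landscapes are rescaled to their mean total and compared with a two-histogram chi-squared statistic. This is a textbook statistic applied to invented bin values, treating them as counts; it is not the authors' exact procedure (see Additional file 8), and the P value lookup is omitted.

```python
def rescale_to_mean_total(hist_a, hist_b):
    """Rescale both histograms so each sums to the mean of the two totals."""
    target = (sum(hist_a) + sum(hist_b)) / 2
    scale_a = target / sum(hist_a)
    scale_b = target / sum(hist_b)
    return [v * scale_a for v in hist_a], [v * scale_b for v in hist_b]


def chi_squared_two_histograms(hist_a, hist_b):
    """Chi-squared statistic for two histograms with equal totals.

    With equal totals the expected value in each bin is (a + b) / 2, which
    reduces the statistic to sum((a - b)^2 / (a + b)). Empty bins are skipped.
    Returns the statistic and its degrees of freedom (bins - 1).
    """
    stat = sum((a - b) ** 2 / (a + b) for a, b in zip(hist_a, hist_b) if a + b > 0)
    return stat, len(hist_a) - 1


# Hypothetical binned landscapes (genetic length per bin, arbitrary units).
landscape_a = [10.0, 20.0, 30.0]
landscape_b = [20.0, 40.0, 40.0]
norm_a, norm_b = rescale_to_mean_total(landscape_a, landscape_b)
chi2, dof = chi_squared_two_histograms(norm_a, norm_b)
```

Two landscapes with the same shape but different total lengths give a statistic of zero after rescaling, which is the purpose of the normalization step.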

Interference analyses

CO interference was modeled in the framework of the Gamma model [73], including a second pathway using the sprinkling procedure [11] whereby non-interfering pathway P2 COs are simply added to those of P1. So the features of CO distributions along chromosomes were modeled using two parameters: the intensity nu of interference in the interfering pathway P1, and the proportion p of COs formed through the non-interfering pathway P2. The detailed implementation of the maximum-likelihood method used to estimate the values of the two parameters nu and p of the model was described earlier [17].
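To make the two-pathway model concrete, the Python sketch below simulates crossover positions on a single bivalent. It assumes the usual parameterization of the Gamma model with sprinkling: P1 crossovers form a gamma renewal process with shape nu and rate 2 nu (1 - p), and P2 crossovers form a Poisson process with rate 2p, with positions measured in Morgans. The function name, the burn-in length, and the parameter values are illustrative choices, and this is a simulation, not the maximum-likelihood estimation described in [17].

```python
import random


def simulate_bivalent_crossovers(length_morgans, nu, p, rng, burn_in=10.0):
    """Simulate crossover positions on one bivalent as (pathway, position) pairs."""
    crossovers = []
    if p < 1.0:
        # P1: interfering pathway. Gamma spacings with shape nu and a scale
        # chosen so that the mean spacing is 1 / (2 * (1 - p)) Morgans.
        scale = 1.0 / (2.0 * nu * (1.0 - p))
        x = -burn_in  # start left of the chromosome to approximate stationarity
        while True:
            x += rng.gammavariate(nu, scale)
            if x > length_morgans:
                break
            if x >= 0.0:
                crossovers.append(("P1", x))
    if p > 0.0:
        # P2: non-interfering pathway, sprinkled on independently of P1.
        x = 0.0
        while True:
            x += rng.expovariate(2.0 * p)
            if x > length_morgans:
                break
            crossovers.append(("P2", x))
    return sorted(crossovers, key=lambda item: item[1])


# Illustrative parameter values only.
example = simulate_bivalent_crossovers(1.5, nu=3.0, p=0.1, rng=random.Random(42))
```

Larger values of nu make P1 crossovers more evenly spaced, while nu = 1 or p = 1 reduces the model to no interference.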


Acknowledgements

We thank J. Wu for technical support, J. Chen (MD Anderson Cancer Center) for 53BP1 −/− mouse embryonic fibroblasts, M. Jasin (Memorial Sloan-Kettering Cancer Center) for U2OS DR-GFP reporter cells, G. Stewart (University of Birmingham) for RIDDLE cells and S. Janicki (Wistar Institute) and D. Spector (Cold Spring Harbor Laboratories) for the U2OS 2-6-3 reporter cell line. We thank K.M. Miller (University of Texas) and A. Sfeir (New York University) for their critical reading of the manuscript and helpful comments. R.A.G. is supported by grant 1R01CA138835-01 from the National Cancer Institute (NCI), a Research Scholar Grant from the American Cancer Society, a Department of Defense Breast Cancer Idea Award, a UPENN–Fox Chase Cancer Center (FCCC) Specialized Program of Research Excellence (SPORE) Pilot Grant and funds from the Abramson Family Cancer Research Institute and Basser Research Center for BRCA. G.M. acknowledges support from NCI grant 1R01CA132878 and funds from the Mayo Clinic Breast Cancer SPORE NCI grant P50CA116201.



Understanding the molecular underpinnings of evolutionary adaptations is a central focus of modern evolutionary biology. Recent studies have uncovered a panoply of complex phenotypes, including locally adapted ecotypes and cryptic morphs, divergent social behaviours in birds and insects, as well as alternative metabolic pathways in plants and fungi, that are regulated by clusters of tightly linked loci. These ‘supergenes’ segregate as stable polymorphisms within or between natural populations and influence ecologically relevant traits. Some supergenes may span entire chromosomes, because selection for reduced recombination between a supergene and a nearby locus providing additional benefits can lead to locus expansions with dynamics similar to those known for sex chromosomes. In addition to allowing for the co-segregation of adaptive variation within species, supergenes may facilitate the spread of complex phenotypes across species boundaries. Application of new genomic methods is likely to lead to the discovery of many additional supergenes in a broad range of organisms and reveal similar genetic architectures for convergently evolved phenotypes.


Methods

Simulations

I conducted 200 replicate forward-time simulations of a metapopulation adapting to a heterogeneous spatial environment (Figure 1) with SLiM v. 3.2 (Haller and Messer 2017) to create SNP data for each individual. The simulations resulted in a population that had isolation-by-distance structure along an environmental gradient (e.g., isolation by environment, Wang and Bradburd 2014). For simplicity in interpreting the results, only one type of genomic heterogeneity was simulated on each linkage group (LG), such that each LG evolved approximately independently. Each of the 9 LGs was 50,000 bases and 50 cM in length. The base recombination rate N_e r = 0.01 (unless manipulated as described below) gave a resolution of 0.001 cM between proximate bases. The recombination rate was scaled to mimic the case where SNPs were collected across a larger genetic map than what was simulated (similar to a SNP chip), but still low enough to allow signatures of selection to arise in neutral loci linked to selected loci (in the simulations, 50,000 bases × (r = 1e-05) × 100 = 50 cM; in humans, 50,000 bp would correspond to 0.05 cM). Thus, SNPs at the opposite ends of linkage groups were likely to have a recombination rate between them of 0.5 (unlinked), but there would otherwise be some degree of linkage among SNPs within linkage groups. For all LGs, the population-scaled mutation rate N_e μ equaled 0.001. For computational efficiency, 1000 individuals were simulated with scaling of mutation rate and recombination rate as described above (Fisher 1930; Wright 1931, 1938; Crow and Kimura 1970; Bürger 2000). In the first generation, individuals were placed randomly on a spatial map between the coordinates 0 and 1. Individuals dispersed a distance given by a bivariate normal distribution with zero mean and variance (Table 2).
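The scaling arithmetic in this paragraph can be checked with a few lines of Python. The sketch assumes that the effective population size equals the 1000 simulated individuals; the variable names are illustrative, and the human comparison uses the 0.05 cM per 50,000 bp figure quoted in the text.

```python
# Values taken from the text; N_e = 1000 is an assumption (N_e = census size).
population_size = 1000
scaled_recombination = 0.01   # N_e * r
scaled_mutation = 0.001       # N_e * mu
bases_per_lg = 50_000

# Per-base, per-generation rates implied by the population-scaled values.
r_per_base = scaled_recombination / population_size
mu_per_base = scaled_mutation / population_size

# A recombination fraction of 0.01 corresponds to 1 cM, hence the factor 100.
cm_between_adjacent_bases = r_per_base * 100
lg_length_cm = bases_per_lg * r_per_base * 100

# The text's human comparison: 50,000 bp corresponds to about 0.05 cM.
human_cm_for_same_span = 0.05
map_inflation = lg_length_cm / human_cm_for_same_span
```

Under these assumptions each simulated LG is about 50 cM long, roughly a thousand times the genetic length of the same physical span in humans.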

Example landscape simulation. Each box is an individual, colored by their phenotypic value. The background is the selective environment. This output was generated after 1900 generations of selection by the environment, resulting in a correlation of 0.52 between the phenotype and the environment.

