(A) GC content variation around CO breakpoints (blue dots and line). Window 0 on the x-axis gives the GC content at the breakpoints; negative and positive values give the distance from the breakpoints. Each window is a 2 kb sequence, and GC content is calculated for each window. The red dots and line show one random GC content sample, simulated with the same number of positions as there are CO breakpoints. After 10,000 repeats, not one of the random samples is as extreme as the observed (blue line) (P < 0.0001). (B) Relationship between recombination and GC content. When the chromosomes are divided into 10 kb non-overlapping windows, recombination rate (cM/Mb) and GC content can be obtained for each window. After the windows are sorted by GC content, they are divided into 31 groups (approximately 20% to 51%, 1% interval), and the average recombination rate (and s.e.m.) is reported for each group.
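The resampling test described for panel A can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the use of the mean GC of window 0 as the test statistic, the one-sided comparison, and the uniform sampling of random positions are all assumptions made for the example.

```python
import random


def gc_fraction(seq):
    """GC fraction among unambiguous bases (A, C, G, T); None if there are none."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else None


def mean_gc_at(sequence, positions, window=2000):
    """Mean GC fraction of windows of length `window` centred on each position."""
    half = window // 2
    values = []
    for pos in positions:
        start = max(0, pos - half)
        gc = gc_fraction(sequence[start:start + window])
        if gc is not None:
            values.append(gc)
    return sum(values) / len(values) if values else None


def resampling_test(sequence, breakpoints, n_repeats=10_000, window=2000, seed=0):
    """Count random samples whose mean GC is at least as high as the observed one.

    Each random sample has as many positions as there are breakpoints.
    Returns (observed_mean_gc, number_of_samples_as_extreme).
    """
    rng = random.Random(seed)
    observed = mean_gc_at(sequence, breakpoints, window)
    as_extreme = 0
    for _ in range(n_repeats):
        sample = [rng.randrange(len(sequence)) for _ in breakpoints]
        simulated = mean_gc_at(sequence, sample, window)
        if simulated is not None and simulated >= observed:
            as_extreme += 1
    return observed, as_extreme
```

If no random sample is as extreme as the observed value, the empirical P value is reported as below 1/n_repeats, which for 10,000 repeats gives P < 0.0001.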
In both analyses we dissect the genome into 10 kb non-overlapping windows, of which there are 19,297. First, we ask about the raw correlation between GC% and cM/Mb for these windows, which as expected is positive and significant (Spearman's rho = 0.192; P < 10⁻¹⁵). Second, we wish to know the average effect of a one-unit increase in either parameter on the other. Given the noise in the data (and given that the current recombination rate need not reflect the ancestral recombination rate), we approach this issue by smoothing. We start by rank ordering all windows by GC content and then dividing them into bins of 1% GC range, after excluding windows with more than 10% 'N'. The resulting plot is highly skewed by bins with very high GC (55% to 58%), as these have very few data points (Additional file 1: Figure S10E); the same outliers likely affect the raw correlation too. Removing these three bins results in a more consistent trend (Additional file 1: Figure S10F). This also suggests that below circa 20% GC the recombination rate is zero (Additional file 1: Figure S10F). Removing bins with GC < 20% and, more generally, any bins with fewer than 100 windows (all bins with GC < 20% have fewer than 100 windows) leaves 18,680 (96.8%) of the windows, with a GC content between approximately 20% and 51%.
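The smoothing procedure above (exclude windows with more than 10% 'N', group by 1% GC bins, drop bins with fewer than 100 windows, then summarise each bin) might look like the sketch below. The input layout, a sequence of `(gc_percent, cm_per_mb, n_fraction)` tuples, and the function name are hypothetical choices for illustration.

```python
import math
import statistics
from collections import defaultdict


def bin_by_gc(windows, bin_width=1.0, max_n_fraction=0.10, min_windows=100):
    """Summarise recombination rate per GC bin.

    windows: iterable of (gc_percent, cm_per_mb, n_fraction) tuples, one per
             10 kb window (an assumed layout for this sketch).
    Returns a list of (bin_lower_edge, n_windows, mean_rate, sem_rate),
    sorted by GC, keeping only bins with at least `min_windows` windows.
    """
    groups = defaultdict(list)
    for gc, rate, n_fraction in windows:
        if n_fraction > max_n_fraction:
            continue  # too many ambiguous bases to trust the GC estimate
        groups[math.floor(gc / bin_width)].append(rate)

    summary = []
    for key in sorted(groups):
        rates = groups[key]
        n = len(rates)
        if n < min_windows:
            continue  # sparse bins give unstable averages
        mean = statistics.mean(rates)
        # Standard error of the mean; undefined for a single observation.
        sem = statistics.stdev(rates) / math.sqrt(n) if n > 1 else float("nan")
        summary.append((key * bin_width, n, mean, sem))
    return summary
```

The `min_windows` threshold is what removes both the very high GC bins and those below 20% GC in the description above, since each of those bins contains few windows.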
Relationship between recombination and GC-content
The retained windows (GC content between approximately 20% and 51%) are used to build Figure 4B, which presents a relatively noise-free (after smoothing) monotonic relationship between the two variables.
From this observation, we estimate that on average a 1 cM/Mb increase in recombination rate is associated with an increase in GC content of around 0.5%. Conversely, a 1% increase in GC content corresponds to an approximately 2 cM/Mb increase in recombination rate. We conclude that, given the apparent rarity of NCO gene conversion, at least in the bee genome, extrapolation from GC content to average crossing-over rate appears to be justifiable, at least for GC content over 20%. We note too that at extreme GC contents the recombination rate can be over- or underestimated. This might reflect a discordance between current and past recombination rates.
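One way to obtain per-unit estimates of this kind from the binned averages is a least-squares slope fitted in each direction. The text does not state how the 0.5% and 2 cM/Mb figures were derived, so this is only an assumed approach, and the bin means below are made-up values lying on an exact straight line.

```python
def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical bin means (illustrative only, not values from the study).
gc_percent = [20.0, 21.0, 22.0, 23.0]
rate_cm_per_mb = [1.0, 3.0, 5.0, 7.0]

rate_per_gc = ols_slope(gc_percent, rate_cm_per_mb)  # cM/Mb per 1% GC
gc_per_rate = ols_slope(rate_cm_per_mb, gc_percent)  # % GC per 1 cM/Mb
```

For points on an exact line the two slopes are reciprocals (here 2 and 0.5). With noisy data their product equals the squared Pearson correlation, so the two estimates are close to reciprocal only when the smoothed relationship is nearly linear.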
Crossing-over rate is also associated with nucleotide diversity, gene density, and copy number variation regions (Figures S11-S13 in Additional file 1). Given our removal of hetSNPs from the analysis, the latter result is not trivially a CNV-associated artifact. Our fine-scale analyses show a positive correlation between nucleotide diversity and recombination rate at all scales of 10, 100, 200, or 500 kb sequence windows (Figure S11 in Additional file 1). This bolsters prior analyses, one of which reported the trend but found it to be non-significant, while another reported a trend between population genetic estimates of recombination and genetic diversity. The trend accords with the view that recombination reduces Hill-Robertson interference, thereby lowering rates of hitchhiking and background selection and so allowing greater diversity. We also find a strong negative correlation between recombination and gene density (Figure S12 in Additional file 1) and a strong positive correlation between recombination and the length of multi-copy regions at the various window sizes (Figure S13 in Additional file 1). The correlation with CNVs is consistent with a role for non-allelic recombination generating duplications and deletions via unequal crossing over.
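Checking a correlation at several window sizes can be sketched by averaging consecutive 10 kb windows into larger ones and computing a rank correlation at each scale. The helper names are hypothetical, the choice of Spearman's rho for these comparisons is an assumption, and the code assumes each list holds the windows of a single chromosome in order, so that merged windows never span two chromosomes.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def spearman(xs, ys):
    """Spearman's rho: the Pearson correlation of the average ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


def coarsen(values, factor):
    """Average consecutive runs of `factor` values, dropping an incomplete final run."""
    return [
        sum(values[i:i + factor]) / factor
        for i in range(0, len(values) - factor + 1, factor)
    ]


def correlation_by_scale(rate, other, factors=(1, 10, 20, 50)):
    """Spearman's rho between two per-window tracks at several merge factors.

    With 10 kb base windows, factors 1, 10, 20, and 50 correspond to
    10, 100, 200, and 500 kb windows.
    """
    return {f: spearman(coarsen(rate, f), coarsen(other, f)) for f in factors}
```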