Extended MKT

The null hypothesis of neutrality is rejected in an MKT when D_i/D₀>P_i/P₀, which is interpreted as evidence of adaptation, but also when P_i/P₀>D_i/D₀. In the latter case, there is an excess of polymorphism relative to divergence in the non-synonymous class, due to (i) slightly deleterious variants under weak negative selection segregating at low frequency in the population, which contribute to polymorphism but not to divergence, or (ii) relaxation of selection, where sites previously under strong or weak purifying selection have become neutral, causing an increased level of polymorphism relative to divergence. Adaptive and weakly deleterious mutations act in opposite directions on the MKT, so α will be underestimated when both selection regimes occur.

Figure 1. Probability of fixation of a mutation depending on the regimes of selection.

Because slightly deleterious mutations tend to segregate at lower frequencies than neutral mutations, they can be partially controlled for by removing low-frequency polymorphisms from the analysis, generally those with a frequency below 5% (see FWW correction for more information about this correction). However, this method is still expected to lead to biased estimates. To take adaptive and slightly deleterious mutations into account simultaneously, P_i, the count of segregating sites in the non-synonymous class, should be separated into the number of neutral variants and the number of weakly deleterious variants, P_i=P_{i neutral}+P_{i weakly del.} (Mackay et al. 2012). If both numbers are estimated, adaptive and weakly deleterious selection can be evaluated independently. Consider the following pair of 2×2 contingency tables:

The table on the left is the standard MKT table, with the theoretical counts of segregating sites (P_i, P₀) and divergent sites (D_i, D₀) in each cell. The table on the right contains the counts of P_i and P₀ for two derived allele frequency (DAF) categories, DAF<5% and DAF≥5%. The fraction of sites segregating neutrally below the 5% DAF cutoff, f_{neutral DAF<5%}, is estimated from the neutral class as f_{neutral DAF<5%}=P_{0 DAF<5%}/P₀.

The expected number of segregating sites in the non-synonymous class that are neutral within the DAF<5% category is P_{neutral DAF<5%}=P_i×f_{neutral DAF<5%}. The expected number of neutral segregating sites in the non-synonymous class is then P_{i neutral}=P_{neutral DAF<5%}+P_{i DAF≥5%}.

To estimate α from the standard MKT table while correcting for the segregation of weakly deleterious variants, we substitute P_i with the expected number of neutral segregating sites, P_{i neutral}. The corrected estimate of α is then α=1-(P_{i neutral}/P₀)×(D₀/D_i).
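The correction described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function name `emkt_alpha`, its argument names, and the example counts are hypothetical, and the neutral fraction below the 5% cutoff is assumed to be computed from the neutral class as P_{0 DAF<5%}/P₀.

```python
def emkt_alpha(p_i_low, p_i_high, p0_low, p0_high, d_i, d0):
    """Sketch of the corrected alpha estimate (hypothetical helper).

    p_i_low, p_i_high: non-synonymous polymorphic sites with DAF<5% and DAF>=5%
    p0_low, p0_high:   neutral-class polymorphic sites with DAF<5% and DAF>=5%
    d_i, d0:           divergent sites in the non-synonymous and neutral classes
    """
    p_i = p_i_low + p_i_high
    p0 = p0_low + p0_high
    if p0 == 0 or d_i == 0:
        raise ValueError("P0 and D_i must be non-zero")

    # Fraction of neutral-class polymorphism below the 5% DAF cutoff
    # (assumption: estimated as P_{0 DAF<5%} / P0).
    f_neutral_low = p0_low / p0

    # Expected neutral non-synonymous sites below the cutoff.
    p_neutral_low = p_i * f_neutral_low

    # Expected neutral non-synonymous sites overall.
    p_i_neutral = p_neutral_low + p_i_high

    # Corrected alpha: P_i is replaced by P_i_neutral.
    alpha = 1 - (p_i_neutral / p0) * (d0 / d_i)
    return alpha, p_i_neutral


# Made-up counts, for illustration only.
alpha, p_i_neutral = emkt_alpha(
    p_i_low=60, p_i_high=40, p0_low=50, p0_high=150, d_i=80, d0=100
)
print(f"P_i neutral = {p_i_neutral:.1f}, alpha = {alpha:.3f}")
```

With these invented counts, P_{i neutral} is 65 and the corrected α is roughly 0.59, whereas the uncorrected value 1-(P_i/P₀)×(D₀/D_i) would be 0.375, which illustrates how low-frequency non-synonymous variants pull the standard estimate downwards.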

Furthermore, the DGRP approach also quantifies negative selection. The excess of sites segregating with DAF below 5% relative to the neutral class is considered to be weakly deleterious, so the fraction of weakly deleterious sites, b, can be estimated as b=(P_{i weakly del.}/P₀)×(m₀/m_i), where m_i and m₀ are the numbers of sites in the non-synonymous and neutral classes, respectively. Then, the neutral fraction, estimated from the neutral class after correcting for weakly deleterious sites, is f=(m₀×P_{i neutral})/(m_i×P₀). Finally, the fraction of new mutations that are strongly deleterious (d), and therefore not segregating, is d=1-f-b.
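The three fractions can be computed together once P_{i neutral} is known. The sketch below is illustrative only: the function name `negative_selection_fractions` and the example values are hypothetical, m_i and m₀ are assumed to be the numbers of sites analyzed in the non-synonymous and neutral classes, and P_{i weakly del.} is taken as P_i-P_{i neutral}, following the decomposition of P_i given earlier.

```python
def negative_selection_fractions(p_i, p_i_neutral, p0, m_i, m0):
    """Sketch of the b, f, d estimates (hypothetical helper).

    p_i:         non-synonymous polymorphic sites
    p_i_neutral: expected neutral non-synonymous polymorphic sites
    p0:          neutral-class polymorphic sites
    m_i, m0:     number of sites in each class (assumed meaning of m)
    """
    if p0 == 0 or m_i == 0:
        raise ValueError("P0 and m_i must be non-zero")

    # Weakly deleterious segregating sites: the part of P_i not explained
    # by neutral variation (P_i = P_i_neutral + P_i_weakly_del).
    p_i_weakly_del = p_i - p_i_neutral

    b = (p_i_weakly_del / p0) * (m0 / m_i)   # weakly deleterious fraction
    f = (m0 * p_i_neutral) / (m_i * p0)      # neutral fraction
    d = 1 - f - b                            # strongly deleterious fraction
    return b, f, d


# Made-up values, for illustration only.
b, f, d = negative_selection_fractions(
    p_i=100, p_i_neutral=65, p0=200, m_i=3000, m0=1000
)
print(f"b = {b:.3f}, f = {f:.3f}, d = {d:.3f}")
```

Note that b+f reduces to (P_i/P₀)×(m₀/m_i), so d depends only on the total non-synonymous polymorphism, while the split between b and f depends on the estimate of P_{i neutral}.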

Figure 2. Negative selection fraction. b is the fraction of weakly deleterious sites, d the fraction of strongly deleterious sites, and f is the neutral fraction.