Lecture notes for ZOO 4400/5400 Population Ecology

Lecture 36 (24-Apr-06)


Go to worked example for take-home exam       

  Go to Excel spreadsheet for calculating gene frequencies and local inbreeding coefficient from genotypic counts

Population Genetics (continued)

      Go to Glossary of genetic terms

       Go to worked FST calculation web page.

Last time I introduced the topic of hierarchical F-statistics.  We build those up by calculating three kinds of heterozygosities, HI, HS, and HT.  Let's look at the general formulae for these heterozygosities, and how they contribute to the calculation of hierarchical F-statistics, and then I will work through an example.

We begin with HI, the observed heterozygosity in individuals, calculated as a weighted average across the subpopulations.

     $$H_I = \frac{\sum_{s=1}^{n} H_{obs,s}\, N_s}{\sum_{s=1}^{n} N_s}$$          Eqn 36.1
where the subscript s refers to the sth of n subpopulations, Hobs is that subpopulation's observed heterozygosity, and N is its size.  That is, first we multiply each subpopulation's observed heterozygosity by its population size.  Then we sum those weighted heterozygosities.  Finally, we divide by the sum of all the subpopulation sizes.  See a worked instance of this calculation in the FST example page.
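
The three steps just described (weight, sum, divide) can be sketched in a few lines of Python.  This is only an illustration: the function name and the numbers are hypothetical and are not taken from the FST example page.

```python
def weighted_mean_heterozygosity(h_values, sizes):
    """Size-weighted average of per-subpopulation heterozygosities (Eqn 36.1)."""
    if len(h_values) != len(sizes):
        raise ValueError("need one heterozygosity per subpopulation size")
    total_size = sum(sizes)
    if total_size <= 0:
        raise ValueError("total population size must be positive")
    # Weight each subpopulation's heterozygosity by its size, sum, then normalize.
    return sum(h * n for h, n in zip(h_values, sizes)) / total_size


# Hypothetical data for three subpopulations (assumed values, for illustration only).
h_obs = [0.40, 0.50, 0.20]   # observed heterozygosity in each subpopulation
sizes = [100, 50, 50]        # subpopulation sizes N_s

h_i = weighted_mean_heterozygosity(h_obs, sizes)
print(round(h_i, 4))
```

With these assumed numbers the weighted sum is 40 + 25 + 10 = 75, and dividing by the total size of 200 gives HI = 0.375.  Note that the large first subpopulation pulls the average toward its own value.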

Next we calculate HS as the global weighted average of the expected heterozygosities across all the subpopulations:

     $$H_S = \frac{\sum_{s=1}^{n} H_{exp,s}\, N_s}{\sum_{s=1}^{n} N_s}$$          Eqn 36.2
The formula differs from that of Eqn 36.1 only in that we now use Hexp (calculated from each subpopulation's gene frequencies by Eqn 37.1) instead of Hobs.

Finally we use the global mean gene frequencies to calculate HT, the global expected heterozygosity.  Unless the gene frequencies are identical in every subpopulation, this will not give us the same answer as the weighted average of the separate subpopulation values for expected heterozygosity.  The formula is:

     $$H_T = 1 - \sum_{i=1}^{k} \bar{p}_i^{\,2}$$          Eqn 36.3
The only difference between this formula and that of Eqn 37.1 is that here we specify the global mean (pi-bar) for the frequency of each of the k alleles over all the subpopulations, rather than the subpopulation-by-subpopulation values.
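
To see why HS and HT differ, here is a minimal Python sketch that computes both from the same gene frequencies.  The function names and data are hypothetical, and the sketch assumes that the global mean frequencies are weighted by subpopulation size (the same weighting used for HI and HS).

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity: one minus the sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def h_s_and_h_t(freqs_by_subpop, sizes):
    """Return (H_S, H_T) for a list of per-subpopulation allele-frequency lists."""
    total = sum(sizes)
    # H_S: size-weighted average of each subpopulation's own expected heterozygosity.
    h_s = sum(expected_heterozygosity(f) * n
              for f, n in zip(freqs_by_subpop, sizes)) / total
    # Global mean frequency of each allele (assumed here to be size-weighted).
    n_alleles = len(freqs_by_subpop[0])
    p_bar = [sum(f[i] * n for f, n in zip(freqs_by_subpop, sizes)) / total
             for i in range(n_alleles)]
    # H_T: expected heterozygosity computed from the global mean frequencies.
    h_t = expected_heterozygosity(p_bar)
    return h_s, h_t


# Hypothetical two-allele data for two equal-sized subpopulations.
freqs = [[0.8, 0.2], [0.4, 0.6]]
sizes = [50, 50]
h_s, h_t = h_s_and_h_t(freqs, sizes)
print(round(h_s, 4), round(h_t, 4))
```

Here the subpopulation values of Hexp are 0.32 and 0.48, so HS = 0.40.  The global mean frequencies are 0.6 and 0.4, so HT = 0.48.  Averaging the heterozygosities is not the same as taking the heterozygosity of the averaged frequencies.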

With HI, HS, and HT in hand we are ready to calculate our hierarchical F-statistics.  First, FIS:

     $$F_{IS} = \frac{H_S - H_I}{H_S}$$          Eqn 36.4
You will often see this written in the mathematically equivalent form:
     $$F_{IS} = 1 - \frac{H_I}{H_S}$$          Eqn 36.5

This first "global" F-statistic is the ratio of the difference between the global-average expected and observed heterozygosities in subpopulations (HS - HI) to the global-average expected heterozygosity (HS).  It gives us a view of the average inbreeding over the entire set of subpopulations (that is, it very closely resembles the local F, or Fs, of Step 5 in the FST example page).
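
The equivalence of the ratio form and the "one minus" form of FIS is a one-line rearrangement: divide each term of the numerator by HS.

```latex
F_{IS} = \frac{H_S - H_I}{H_S}
       = \frac{H_S}{H_S} - \frac{H_I}{H_S}
       = 1 - \frac{H_I}{H_S}
```

As a quick check with hypothetical values HI = 0.30 and HS = 0.40, the ratio form gives 0.10 / 0.40 = 0.25 and the second form gives 1 - 0.75 = 0.25.  A positive value means fewer heterozygotes were observed than expected.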

Next we calculate the F-statistic that tells us the most about the degree of genetic difference among the subpopulations -- FST.  It is calculated as

     $$F_{ST} = \frac{H_T - H_S}{H_T}$$          Eqn 36.6
Here we assess the difference between the expected heterozygosities in the subpopulations and the expected heterozygosity based on the global gene frequencies.

Let's consider two extreme examples that illustrate how FST can range from zero to one.  Consider a system with three alleles and three subpopulations.

Case 1: Maximal FST.  If each subpopulation is fixed for a different allele, then HS will be zero (if we have only one allele, we don't expect any heterozygotes).  In that case, Eqn 36.6 simplifies to (HT - 0) / HT = HT / HT = 1.

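Case 2: Minimal FST.  If the gene frequencies are the same in each of the subpopulations, every subpopulation will have the same expected heterozygosity, so HS (their weighted average) will be the same as HT.  In that case, the numerator of Eqn 36.6 goes to zero and FST is zero.  Why can't FST be negative?  You cannot arrange a set of populations to have HS > HT: pooling subpopulations whose gene frequencies differ can only raise the expected heterozygosity, never lower it.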
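
Both extreme cases are easy to check numerically.  The following is a minimal sketch with hypothetical helper names, assuming equal subpopulation sizes and size-weighted global mean frequencies; the frequencies in Case 2 are arbitrary.

```python
def expected_heterozygosity(freqs):
    """One minus the sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def f_st(freqs_by_subpop, sizes):
    """F_ST = (H_T - H_S) / H_T, using size-weighted averages throughout."""
    total = sum(sizes)
    h_s = sum(expected_heterozygosity(f) * n
              for f, n in zip(freqs_by_subpop, sizes)) / total
    p_bar = [sum(f[i] * n for f, n in zip(freqs_by_subpop, sizes)) / total
             for i in range(len(freqs_by_subpop[0]))]
    h_t = expected_heterozygosity(p_bar)
    return (h_t - h_s) / h_t


sizes = [100, 100, 100]  # hypothetical equal subpopulation sizes

# Case 1: each subpopulation fixed for a different allele.
case1 = f_st([[1, 0, 0], [0, 1, 0], [0, 0, 1]], sizes)

# Case 2: identical gene frequencies in every subpopulation.
case2 = f_st([[0.5, 0.3, 0.2]] * 3, sizes)

print(round(case1, 6), round(case2, 6))
```

In Case 1 every subpopulation has Hexp = 0, while the global mean frequencies are all 1/3, giving HT = 2/3 and FST = 1.  In Case 2 HS and HT are both 0.62, so FST is zero (up to floating-point rounding).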

The final (and least often used) global F-statistic is FIT, given by the formula:

     $$F_{IT} = \frac{H_T - H_I}{H_T}$$          Eqn 36.7
FIT is relatively little used for two reasons.  First, it is often quite similar to FIS, thereby providing little new information.  Second, if and when it does differ from FIS it may even be somewhat misleading.  It is possible to construct scenarios in which FIT produces an "overall" picture that differs from the picture in any particular subpopulation.  The reasonable context for an individual (observed heterozygosity) is its own subpopulation.  Juxtaposing individual against total population is less intuitively meaningful.
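
To put the three statistics side by side, here is a small Python sketch (function name and input values are hypothetical).  It also checks a standard algebraic link among them, (1 - FIT) = (1 - FIS)(1 - FST), which follows directly from the three ratio definitions and shows why FIT stays close to FIS whenever FST is small.

```python
def f_statistics(h_i, h_s, h_t):
    """Return (F_IS, F_ST, F_IT) from the three heterozygosities."""
    f_is = (h_s - h_i) / h_s   # individuals relative to their subpopulations
    f_st = (h_t - h_s) / h_t   # subpopulations relative to the total
    f_it = (h_t - h_i) / h_t   # individuals relative to the total
    return f_is, f_st, f_it


# Hypothetical heterozygosities (assumed values, not from the worked example page).
f_is, f_st, f_it = f_statistics(h_i=0.30, h_s=0.40, h_t=0.50)
print(round(f_is, 4), round(f_st, 4), round(f_it, 4))

# The three statistics are linked: (1 - F_IT) = (1 - F_IS) * (1 - F_ST).
assert abs((1 - f_it) - (1 - f_is) * (1 - f_st)) < 1e-12
```

With these assumed inputs FIS = 0.25, FST = 0.20, and FIT = 0.40, and indeed 0.60 = 0.75 x 0.80.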
