Lecture notes for ZOO 4400/5400 Population Ecology

Lecture 34 (19-Apr-06)


Population Genetics (continued): Hardy-Weinberg Principle

Probably the most important basic concept in population genetics is the Hardy-Weinberg principle. It provides an expectation for genotypic patterns in populations. Deviations from the predicted pattern can provide very important insights into processes of genetic and evolutionary change.

Before we examine Hardy-Weinberg, consider the five primary forces that drive evolutionary change: non-random mating, genetic drift, mutation, migration (gene flow), and natural selection.

If the mating system is non-random, if individuals with different genotypes move in or out of populations (leading to gene flow), if mutations arise in the DNA, or if particular alleles are favored or disfavored by selection, then the gene frequencies will change and the population will evolve. Patterns of genetic variation in natural populations (such as the WY black bears I will discuss in a future lecture) are extraordinarily complex. As a starting point, we look at a very simple case (Hardy-Weinberg equilibrium or HWE) as a basis for seeing where and how actual patterns of genetic variation differ from the simple expectation.

Developed by G. H. Hardy and Wilhelm Weinberg in the early 1900s (both published in 1908), the Hardy-Weinberg principle is a model that relates allele frequencies to genotype frequencies. Like most models, Hardy-Weinberg is a simplification of real-world complexities.

Remember that an allele is a variant form of a gene (piece of DNA) at a single locus (Latin for "place", so we are referring to a particular stretch -- for example a stretch of 275 base pairs on Chromosome 13). An allele frequency (geneticists call it "gene frequency") is therefore a measure of the commonness of an allele in a population (the proportion of a specific allele in a population -- how common is the A ["big A"] allele, or the a ["little a"] allele). A genotype is the specific allele composition for a certain locus or set of loci (Aa, AA, or AaBBcc for several loci vs. a second genotype AabbCc). Genotype frequency is a measure of the commonness of a genotype in a population; i.e., the proportion of a specific genotype in a population. Two major terms are important in discussing genotypes: homozygote and heterozygote. A homozygote has two copies of the same allele (e.g., AA or bb). A heterozygote has two different alleles at a given locus (e.g., Aa or Dd). Because the allele and genotype frequencies are proportions, they always sum to 1.0, provided we have included all the possible variants.

Allele frequencies:

p + q = 1                                                                              Eqn 34.1

Expected genotype frequencies:

p^2 + 2pq + q^2 = 1                                                                Eqn 34.2

The possible range for an allele frequency or genotype frequency therefore lies between zero and one. Zero means complete absence of that allele or genotype from the population (no individual in the population carries that allele or genotype). One means complete fixation of the allele or genotype (fixation means that every individual in the population is homozygous for the allele -- i.e., has the same genotype at that locus).

Hardy-Weinberg Principle

The Hardy-Weinberg model makes eight assumptions (note that the first five of these relate directly to the five major forces that drive evolutionary change)

With these assumptions, it is easy to calculate the genotype frequencies for a gene with two alleles (A and a). The frequency of homozygous genotype AA is the probability of one allele A being in combination with another allele A. The expected frequency is simply the product of the separate allele frequencies. We will use the term p to refer to the frequency of allele A:

Frequency of AA = p^2    (Homozygote for A)                                         Eqn 34.3

The frequency of heterozygous genotype Aa is the probability of allele A being in combination with allele a. We will use the term q to refer to the frequency of allele a. Note that there are two possible ways to get those combinations -- A from Dad and a from Mom, or vice versa (examine Fig. 34.1 below).

Frequency of Aa = 2pq    (Heterozygote)                                                 Eqn 34.4

The frequency of homozygous genotype aa is the probability of one allele a in combination with another allele a.

Frequency of aa = q^2     (Homozygote for a)                                          Eqn 34.5

Fig. 34.1. Diagram of Hardy-Weinberg genotype proportions from male (sperm) and female (egg) contributions. Given a locus with two alleles designated A and a that occur with frequencies p and q, the chart shows the genotype frequencies (p^2, 2pq, and q^2) as differently colored areas. Note that the heterozygotes (blue + yellow = green) can be formed in two different ways.
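
The egg-by-sperm chart in Fig. 34.1 can also be worked through by brute force. The following is a minimal sketch, not part of the original notes: the function name genotype_freqs_by_enumeration and the value p = 0.5 are illustrative assumptions. It multiplies the allele frequencies for each of the four egg/sperm combinations and pools the two routes that produce a heterozygote.

```python
from itertools import product


def genotype_freqs_by_enumeration(p):
    """Sum egg x sperm probabilities for a locus with alleles A (freq p) and a (freq q)."""
    q = 1.0 - p  # Eqn 34.1: p + q = 1
    gametes = {"A": p, "a": q}
    freqs = {"AA": 0.0, "Aa": 0.0, "aa": 0.0}
    # Four combinations: A/A, A/a, a/A, a/a (egg allele first, sperm allele second).
    for (egg, f_egg), (sperm, f_sperm) in product(gametes.items(), repeat=2):
        # Sorting puts 'A' before 'a', so both "Aa" and "aA" count as heterozygote "Aa".
        genotype = "".join(sorted(egg + sperm))
        freqs[genotype] += f_egg * f_sperm
    return freqs


# Hypothetical allele frequency chosen only for illustration.
freqs = genotype_freqs_by_enumeration(0.5)
print(freqs)
```

The heterozygote entry receives two contributions (pq from each route), which is where the factor of 2 in Eqn 34.4 comes from.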

Example 1 -- calculation of expected genotype frequencies from gene (allele) frequencies:

If p = 0.75 and q = 0.25 we can use Eqns 34.3, 34.4, and 34.5 to calculate the expected genotype frequencies.

AA = p^2 = 0.75 * 0.75          =         0.5625

Aa = 2pq = 2 * 0.75 * 0.25    =         0.375

aa = q^2 = 0.25 * 0.25          =         0.0625                                       Eqns 34.6
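
The arithmetic of Example 1 can be checked with a few lines of code. This is a minimal sketch rather than anything from the original notes; the function name expected_genotype_freqs is an assumption made for illustration.

```python
def expected_genotype_freqs(p):
    """Return the expected Hardy-Weinberg frequencies (AA, Aa, aa) given p, the frequency of A."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("an allele frequency must lie between 0 and 1")
    q = 1.0 - p  # Eqn 34.1
    # Eqns 34.3, 34.4, and 34.5 in order.
    return p * p, 2 * p * q, q * q


f_AA, f_Aa, f_aa = expected_genotype_freqs(0.75)
print(f_AA, f_Aa, f_aa)  # prints: 0.5625 0.375 0.0625
```

The three values sum to one, as Eqn 34.2 requires.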

The values we have just calculated are EXPECTED genotype frequencies IF the Hardy-Weinberg assumptions hold. We now turn to how we could check that from actual OBSERVED genotypic data (such as microsatellite data for Wyoming black bears). In order to calculate allele frequencies all we need are the observed genotype frequencies.

p = p^2 + (2pq/2) and q = q^2 + (2pq/2)                                                   Eqn 34.7
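
In practice we usually start from counts of genotyped individuals rather than from frequencies. The sketch below assumes that setting. The function name and the counts are hypothetical and not from the notes. It applies Eqn 34.7 with every term multiplied by the sample size: each homozygote carries two copies of its allele, each heterozygote carries one copy of each, and the sample contains 2N allele copies in total.

```python
def allele_freqs_from_counts(n_AA, n_Aa, n_aa):
    """Return (p, q) from observed counts of the three genotypes at one locus."""
    n = n_AA + n_Aa + n_aa
    if n == 0:
        raise ValueError("need at least one genotyped individual")
    p = (2 * n_AA + n_Aa) / (2 * n)  # copies of A divided by total allele copies
    q = (2 * n_aa + n_Aa) / (2 * n)  # copies of a divided by total allele copies
    return p, q


# Hypothetical sample of 100 individuals, for illustration only.
p, q = allele_freqs_from_counts(60, 30, 10)
print(p, q)  # prints: 0.75 0.25
```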



Example 2 -- fit of observed genotype frequencies to Hardy-Weinberg expectation:

Let's work through an example from the beginning. We will examine a population of trout with a di-repeat (two base pair repeat) microsatellite marker that has two alleles, 120 and 122. For simplicity, let's call allele 120 A and allele 122 a. We genotype 100 individuals and find genotype frequencies of AA = 0.25, Aa = 0.5, and aa = 0.25 (check that when summed these frequencies add to one). We ask whether this population is in Hardy-Weinberg equilibrium. We first need to calculate p and q (the allele frequencies of A and a; note that A and a refer to the alleles themselves, whereas p and q refer to the frequencies of those alleles). We calculate the frequencies using Eqn 34.7.

p = p^2 + (2pq/2) = 0.25 + (0.5/2)    =    0.5

q = q^2 + (2pq/2) = 0.25 + (0.5/2)    =    0.5                                        Eqns 34.8

We see that the allele frequencies sum to one, as required by Eqn 34.1. Using the allele frequencies, we then calculate the expected genotype frequencies using Eqns 34.3, 34.4, and 34.5.

AA = p^2 = 0.5 * 0.5 = 0.25

Aa = 2pq = 2 * 0.5 * 0.5 = 0.5

aa = q^2 = 0.5 * 0.5 = 0.25                                                                Eqns 34.9

The expected genotype frequencies are the same as the observed genotype frequencies (from the microsatellite data). This tells us that our population is in Hardy-Weinberg equilibrium. If the expected genotype frequencies calculated from the allele frequencies differed from the observed genotype frequencies, we would assess whether the difference is statistically significant using a chi-square test, as we will see shortly; a significant difference would indicate that the population is not in Hardy-Weinberg equilibrium.
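
As a preview of the chi-square comparison, here is a minimal sketch; the treatment in the notes themselves may differ in detail. The function name, the constant name, and the second set of counts are assumptions made for illustration. With three genotype classes and one allele frequency estimated from the data, the test has 3 - 1 - 1 = 1 degree of freedom, and 3.841 is the standard chi-square critical value for 1 degree of freedom at the 0.05 level.

```python
def hwe_chi_square(n_AA, n_Aa, n_aa):
    """Chi-square statistic comparing observed genotype counts with Hardy-Weinberg expectations."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)  # frequency of A estimated from the sample
    q = 1.0 - p
    observed = (n_AA, n_Aa, n_aa)
    expected = (n * p * p, n * 2 * p * q, n * q * q)  # expected counts, not frequencies
    # Skip classes with an expected count of zero to avoid dividing by zero.
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


CRITICAL_VALUE_DF1_ALPHA_05 = 3.841  # chi-square critical value, 1 df, alpha = 0.05

# Trout data from Example 2: 100 fish with frequencies 0.25, 0.5, 0.25.
trout_stat = hwe_chi_square(25, 50, 25)
print(trout_stat, trout_stat > CRITICAL_VALUE_DF1_ALPHA_05)  # prints: 0.0 False

# Hypothetical sample with a deficit of heterozygotes, for contrast.
deficit_stat = hwe_chi_square(40, 20, 40)
print(deficit_stat, deficit_stat > CRITICAL_VALUE_DF1_ALPHA_05)  # prints: 36.0 True
```

Note that the statistic is computed on counts. Using frequencies instead would understate it by a factor equal to the sample size.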

The expected frequency distribution of genotypes AA, Aa, and aa in proportions p^2, 2pq, and q^2, respectively, is called the Hardy-Weinberg equilibrium. If the population meets the eight assumptions listed above, then the population will go to the Hardy-Weinberg equilibrium in the first generation, and remain there. Again, the Hardy-Weinberg principle, with its predicted equilibrium, is a simple model that serves as a starting point for examining the genetic structure of populations.
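
The claim that a population reaches equilibrium in a single generation and then stays there can be illustrated numerically. The sketch below is a deterministic calculation under the model's assumptions, not a simulation of a real population. The function name next_generation and the starting frequencies (a population with no heterozygotes at all) are hypothetical.

```python
def next_generation(f_AA, f_Aa, f_aa):
    """Genotype frequencies after one round of random mating under the model's assumptions."""
    p = f_AA + f_Aa / 2  # Eqn 34.7
    q = f_aa + f_Aa / 2
    return p * p, 2 * p * q, q * q  # Eqns 34.3, 34.4, and 34.5


gen0 = (0.5, 0.0, 0.5)          # hypothetical start: only homozygotes, far from HWE
gen1 = next_generation(*gen0)   # one generation of random mating
gen2 = next_generation(*gen1)   # a second generation changes nothing

print(gen1)  # prints: (0.25, 0.5, 0.25)
print(gen2)  # prints: (0.25, 0.5, 0.25)
```

The allele frequencies (p = q = 0.5) never change. Only the way the alleles are packaged into genotypes changes, and that settles after the first round of random mating.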

Violating Hardy-Weinberg assumptions

How likely are we to meet the major assumptions of random mating, no drift, no mutation, no migration, and no natural selection? If we violate the assumptions, how much difference does it make? Here is a list of processes that violate the Hardy-Weinberg assumptions and some discussion of each of them.

Non-random mating

Random mating means that alleles (as carried by the gametes -- eggs or sperm) come together strictly in proportion to their frequencies in the population as a whole. Example: if p = 0.6 and q = 0.4, then the probability of an Aa heterozygote is 0.48 (the product of the allele frequencies, plus consideration of the fact that two ways exist to make a heterozygote; see Fig. 34.1). Situations where the random mating assumption does not hold include:

Often, a moderate amount of non-random mating has a negligible impact on our conclusions about the patterns and causes of genetic variation.

(To be continued with the four remaining forces in  the next lecture)
