How to find Hamilton’s Rule
In 1964, William Hamilton published two papers that revolutionized the study of the evolution of altruism. His famous eponymous inequality, rb > c, purported to show the conditions under which a gene for altruism could spread in a population. It is difficult for an altruistic trait to spread, since altruists are easily exploited by others who benefit from their actions without returning the favor. Hamilton approached this problem by identifying a particular condition in which the gene for altruism (in reality it is almost certainly not a single gene, but it can be thought of roughly as the inherited components of the genotype that lead to an altruistic phenotype) would be more likely to be shared. If altruistic acts could be focused only on kin who are likely to share the gene through common descent, then the gene would be advantageous and its frequency in the population would grow.
I’m not going to discuss the real-life implications of Hamilton’s Rule here, nor will I discuss the empirical evidence for its application to social evolution. I’ve just recently learned how Hamilton’s Rule was first derived (using the book Mathematical Models of Social Evolution by Richard McElreath and Robert Boyd), and I want to repeat it here because I think it’s interesting and I don’t want to forget how it was done. A few years after Hamilton published his pair of papers, his rule was re-derived using the mathematical methods of covariance genetics introduced by George Price. That more recent version also arrives at rb > c, though its implications and assumptions are clearer (apparently, according to McElreath and Boyd’s book; I haven’t worked through it yet).
The model uses a tool for studying altruism known as the prisoner’s dilemma. This is one of the better-known scenarios in game theory, so I’ll only discuss it briefly. In short, it poses a situation in which two players (let’s call them P_{1} and P_{2}) must each decide whether to cooperate with the other in order to receive a mutual benefit at a cost. There must be a cost to the cooperation for it to be considered altruism; if an act simply benefited both parties, it would need no special explanation. I will introduce a more concrete description shortly. All of the scores described are for P_{1}. If P_{1} decides to cooperate and P_{2} also cooperates, then P_{1} receives the benefit of the cooperation, b, though the cooperative act comes at a cost, c. More precisely, b – c represents the average payoff over all interactions between cooperators: each individual interaction will be to the benefit of one at a cost to the other, but averaged over all interactions, the payoff will be b – c. However, if P_{1} chooses not to cooperate (to defect) and P_{2} decides to cooperate, then P_{1} will receive the benefit at no cost. Conversely, if P_{1} decides to cooperate and P_{2} decides to defect, then P_{1} will pay the cost and receive nothing, for a payoff of – c. If they both defect, then P_{1} will receive 0.
The payoffs in each cell are for P_{1}.
| | P_{2} Cooperates | P_{2} Defects |
| P_{1} Cooperates | b – c | – c |
| P_{1} Defects | b | 0 |
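As a minimal sketch, the payoff table can be written as a small Python function. The function name `payoff` and the values b = 3 and c = 1 are illustrative assumptions, not taken from McElreath and Boyd's book.

```python
def payoff(p1_cooperates: bool, p2_cooperates: bool, b: float, c: float) -> float:
    """Return the payoff to P1 for one cell of the prisoner's dilemma table."""
    benefit = b if p2_cooperates else 0.0  # P1 gains b only when P2 cooperates
    cost = c if p1_cooperates else 0.0     # P1 pays c only when P1 cooperates
    return benefit - cost


if __name__ == "__main__":
    b, c = 3.0, 1.0  # hypothetical values, chosen only for illustration
    for p1 in (True, False):
        for p2 in (True, False):
            print(f"P1 cooperates={p1}, P2 cooperates={p2}: {payoff(p1, p2, b, c)}")
```

Whatever P_{2} does, P_{1} scores exactly c more by defecting, which is why altruism needs a special explanation.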
To make this less abstract, consider a fictional species of rodents that forage in pairs. It is in the interest of both to avoid predators. If either rodent sees a predator, it has the option of either running away and hiding or alerting the other, unaware rodent to the danger so that both can avoid the predator. Alerting the other rodent comes at a cost, however, since it reveals the caller’s location and increases its chance of being killed by the predator.
Let’s say that the tendency for individuals of the rodent species to alert another to a predator has a genetic component and that there are two versions of that gene (alleles) in the population: one that makes the rodent alert others and one that doesn’t. The foraging pairs are also frequently reshuffled at random, so the number of times that a rodent is paired with either type over time will reflect the frequency of each type in the population. Over time, assuming each rodent randomly encounters about the same number of predators as any other, the average payoff for any individual will depend on whether it is an altruist, A, or a non-altruist, N, and how often it is paired with others of each type.
We can then find the payoffs to individuals of each type when they encounter another individual of either type. These take the form V(P_{1}|P_{2}), which in plain language is the payoff, V, to P_{1} when encountering P_{2}. These terms can appear bulky, but each is only a single variable that could as easily be replaced by a single letter. However, when tracking so many variants, it is preferable to keep at hand the information that names like V(A|A) carry.
| | A | N |
| A | V(A|A) = b – c | V(A|N) = – c |
| N | V(N|A) = b | V(N|N) = 0 |
(The payoffs are again for the individual of the type in the first column.)
Also relevant are the probabilities of being paired with either type of rodent, since the frequencies of the two types may be uneven in the population. These take the form of conditional probabilities such as “Pr(A|A).” Again, these are single variables. However, it is important to note that the arrangement within the parentheses is reversed relative to the payoffs: Pr(P_{2}|P_{1}) is “the probability of encountering P_{2} if you are P_{1},” so the focal individual now comes second.
Thus another table can be made:
| | A | N |
| A | Pr(A|A) | Pr(N|A) |
| N | Pr(A|N) | Pr(N|N) |
With all of that set up, we can now find the average fitness of being either an altruist or a non-altruist. That fitness will depend on the payoff of each strategy given the frequencies of each type in the group in question. One strategy may be more advantageous when, for example, the frequency of altruists is low but less advantageous when there are too many altruists.
We can call the average fitness, W, of altruists W(A) and the average fitness of non-altruists W(N). These equations typically include an extra variable, w_{0}, which represents a baseline fitness that determines how much impact the rest of the equation has (i.e., if w_{0} is high then the payoffs have relatively little impact, and vice versa). However, that baseline fitness is the same across strategies, so it cancels out when the two strategies are compared.
The average fitness of an altruist is then the baseline fitness plus the sum of each payoff multiplied by the probability of the pairing that produces it:
W(A) = w_{0} +Pr(A|A)V(A|A) + Pr(N|A)V(A|N)
or,
W(A) = w_{0} +Pr(A|A)(b – c) + Pr(N|A)( – c)
W(A) = w_{0} +Pr(A|A)b – Pr(A|A) c – Pr(N|A)c
W(A) = w_{0} +Pr(A|A)b – c{Pr(A|A) + Pr(N|A)}
And since {Pr(A|A) + Pr(N|A)} = 1,
W(A) = w_{0} +Pr(A|A)b – c
And the average fitness of being a non-altruist is:
W(N) = w_{0} + Pr(A|N)V(N|A) + Pr(N|N)V(N|N)
W(N) = w_{0} + Pr(A|N)(b) + Pr(N|N)(0)
W(N) = w_{0} + Pr(A|N)b
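The two fitness expressions can be checked numerically. The sketch below computes W(A) and W(N) from the unsimplified sums and confirms that they match the simplified forms; the function names and all numeric values are hypothetical, chosen only for illustration.

```python
import math


def fitness_altruist(w0, pr_a_given_a, b, c):
    """W(A): baseline plus each payoff weighted by its pairing probability."""
    pr_n_given_a = 1.0 - pr_a_given_a  # an altruist's two pairing probabilities sum to 1
    return w0 + pr_a_given_a * (b - c) + pr_n_given_a * (-c)


def fitness_non_altruist(w0, pr_a_given_n, b):
    """W(N): a non-altruist collects b from altruist partners and never pays c."""
    pr_n_given_n = 1.0 - pr_a_given_n
    return w0 + pr_a_given_n * b + pr_n_given_n * 0.0


if __name__ == "__main__":
    w0, b, c = 1.0, 3.0, 1.0   # hypothetical values
    pr_aa, pr_an = 0.6, 0.2    # hypothetical Pr(A|A) and Pr(A|N)
    w_a = fitness_altruist(w0, pr_aa, b, c)
    w_n = fitness_non_altruist(w0, pr_an, b)
    # The simplified forms derived above should give the same numbers.
    assert math.isclose(w_a, w0 + pr_aa * b - c)
    assert math.isclose(w_n, w0 + pr_an * b)
    print("W(A) =", w_a, " W(N) =", w_n)
```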
If altruism is going to spread in the population, then the average fitness of altruists must be higher than that of non-altruists, or:
W(A) > W(N)
Pr(A|A)b – c > Pr(A|N)b
Pr(A|A)b – Pr(A|N)b > c
{Pr(A|A) – Pr(A|N)}b > c
As long as the probability of an altruist interacting with another altruist, Pr(A|A), is greater than the probability of a non-altruist interacting with an altruist, Pr(A|N), there is a chance for altruism to spread, though only when b is sufficiently greater than c, since the difference between the two probabilities can be at most 1. If the two probabilities are equal, then altruism cannot evolve, since the left-hand side of the inequality would be zero and the cost would have to be negative.
McElreath and Boyd give three ways in which the probability of an altruist interacting with an altruist can increase to a high enough level to allow altruism to evolve. The first is limited dispersal, in which individuals live their lives close to their place of birth, making it likely that they will interact with others of the same type. The second is behavioral bookkeeping, in which individuals track whether another is an altruist or not by observing and recalling its past behavior. This strategy is effective, but requires a higher level of cognitive ability. The third, which will be treated below, is kin recognition. If individuals preferentially interact with kin, then the chance of sharing the allele increases due to common descent.
In the model above, all interactions between individuals are random, which means that the frequency of altruists in the population, p, determines the probability that either type will be paired with an altruist or a non-altruist. In that case, we can say that:
Pr(A|A) = Pr(A|N) = p
and,
Pr(N|A) = Pr(N|N) = 1 – p
since the probability of being paired with an altruist or a non-altruist depends only on the proportion of altruists in the population.
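To see what random pairing does to the condition {Pr(A|A) – Pr(A|N)}b > c, here is a short sketch. The helper `fitness_gap` and the values of b, c and p are assumptions for illustration only.

```python
def fitness_gap(pr_a_given_a, pr_a_given_n, b, c):
    """W(A) - W(N); the baseline w0 cancels, leaving (Pr(A|A) - Pr(A|N)) * b - c."""
    return (pr_a_given_a - pr_a_given_n) * b - c


if __name__ == "__main__":
    b, c = 3.0, 1.0  # hypothetical benefit and cost
    for p in (0.1, 0.5, 0.9):
        # Random pairing: Pr(A|A) and Pr(A|N) both equal p.
        gap = fitness_gap(p, p, b, c)
        print(f"p = {p}: W(A) - W(N) = {gap}")
```

Under random pairing the gap is always – c, whatever the frequency of altruists, so altruists are always at a disadvantage.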
However, if individuals only direct their altruistic behaviors at kin, then the degree of relatedness, r, can be used to compute the likelihood that a relative will have inherited the same allele and therefore also be an altruist. The degree of relatedness is an approximate measure of the genes one shares with another individual through shared ancestry (siblings share ½ of their genes, as do a parent and their offspring; half-siblings share ¼; cousins share 1/8; etc.). If it is to one’s benefit to be altruistic only with others that share the gene for altruism (called positive assortment), then one can use the fact that a sibling has a 50% chance of sharing it strategically. Of course that “knowledge” is not necessarily cognitive; organisms have developed different non-cognitive mechanisms for identifying kin, such as chemical signals.
When interacting with kin, the probability that an altruist is paired with another altruist then depends on two things: the likelihood, r, that the partner carries the same allele through common descent, and the likelihood that the partner carries it simply because of how common it is in the population, p (weighted by 1 – r, the degree to which they are not related). Thus:
Pr(A|A) = r(1) + (1 – r)p
Pr(N|A) = r(0) + (1 – r)(1 – p)
Pr(A|N) = r(0) + (1 – r)p
Pr(N|N) = r(1) + (1 – r)(1 – p)
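The four probabilities can be sanity-checked in a few lines. This is only a sketch: the function name `pairing_probabilities`, the dictionary layout, and the values r = 0.5 and p = 0.2 are assumptions made for illustration.

```python
import math


def pairing_probabilities(r, p):
    """Return Pr(partner | self) for kin-directed interactions.

    Keys are (partner, self), mirroring the Pr(P2|P1) notation.
    """
    return {
        ("A", "A"): r * 1 + (1 - r) * p,
        ("N", "A"): r * 0 + (1 - r) * (1 - p),
        ("A", "N"): r * 0 + (1 - r) * p,
        ("N", "N"): r * 1 + (1 - r) * (1 - p),
    }


if __name__ == "__main__":
    r, p = 0.5, 0.2  # hypothetical: full siblings, altruists at 20% of the population
    pr = pairing_probabilities(r, p)
    # Each focal type must be paired with someone, so each pair of probabilities sums to 1.
    assert math.isclose(pr[("A", "A")] + pr[("N", "A")], 1.0)
    assert math.isclose(pr[("A", "N")] + pr[("N", "N")], 1.0)
    # The difference Pr(A|A) - Pr(A|N) equals r, independent of p.
    assert math.isclose(pr[("A", "A")] - pr[("A", "N")], r)
    print(pr)
```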
If we take these new probabilities, we can substitute them into the inequality that we produced above:
{Pr(A|A) – Pr(A|N)}b > c
{[r + (1 – r)p] – [(1 – r)p]}b > c
{r + (1 – r)p – (1 – r)p}b > c
rb > c
And here we arrive at Hamilton’s Rule. It implies that when interacting with kin, altruism can spread in a population as long as the cost of the altruistic action to the actor is less than the benefit to the recipient weighted by their degree of relatedness. In the rodent example, predator alarm calling can spread in a population if it is directed at kin who may share the trait and if the increased risk of getting killed after giving an alarm call, c, is less than the benefit to those warned, weighted by their degree of relatedness. This is also known as kin selection, in which traits are selected for due to their effects on the genes of one’s kin rather than one’s own, since there is a chance that they are shared by common descent.
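To watch the rule at work, the sketch below iterates the frequency of altruists over generations. The update p' = p·W(A) / (p·W(A) + (1 – p)·W(N)) is an assumption added here (a simple haploid, asexual, discrete-generation model with r held fixed); it is not part of the derivation above, and the function names and parameter values are likewise hypothetical.

```python
def next_frequency(p, r, b, c, w0=1.0):
    """One generation of an assumed replicator-style update for the altruist frequency."""
    w_a = w0 + (r + (1 - r) * p) * b - c  # W(A) with kin-biased pairing
    w_n = w0 + (1 - r) * p * b            # W(N) with kin-biased pairing
    mean_w = p * w_a + (1 - p) * w_n      # population mean fitness
    return p * w_a / mean_w


def simulate(p0, r, b, c, generations=50):
    """Iterate the update and return the final frequency of altruists."""
    p = p0
    for _ in range(generations):
        p = next_frequency(p, r, b, c)
    return p


if __name__ == "__main__":
    b, c, p0 = 3.0, 1.0, 0.1  # hypothetical values
    for r in (0.5, 0.25):     # r*b = 1.5 > c in the first case, 0.75 < c in the second
        final = simulate(p0, r, b, c)
        trend = "rises" if final > p0 else "falls"
        print(f"r = {r}: rb > c is {r * b > c}; altruist frequency {trend} from {p0} to {final:.3f}")
```

With these numbers, altruists increase when r = 0.5 and decline when r = 0.25, matching the sign of rb – c.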
There are many more implications of and concerns about Hamilton’s Rule that could be treated, but since I intended this post only as an introduction to the math, I will end here for now. Perhaps I’ll write more on the topic later. For more information, I very much recommend the book mentioned earlier, Mathematical Models of Social Evolution by Richard McElreath and Robert Boyd.