May 12, 2011
Altruism and evolutionarily stable strategies

One thing that fascinates me more than anything else is strategy.  Usually equated with games of the mind such as chess rather than everyday life, strategy can be both subtle and overt, but at the end of the day, it’s all-encompassing.

Another thing I’m pretty head-over-heels for is public radio.  If you don’t listen to public radio, honestly, you should.  There is so much variety and tons of quality programming.  So now you’re wondering why I’ve completely changed the subject and have gone off on a tangent…Well, it’s because, by the glory of all things good in public radio and science, there just happens to be a show that took on this very issue.  That show, RadioLab, does an absolutely excellent job not only on this one specific episode, but in general.  I won’t ramble too much more about it, but know that it’s a worthwhile listen for anyone interested in science, be it on those actual radio things, or as the free podcast.

Finally, the show, entitled “The Good Show,” covers a few of the basics regarding what strategy is, what strategies there are, and a few of the key players involved in furthering research on the subject.

On that note: you should go listen to it…

But before I end this brief post, let me just introduce you to a simple thought experiment that has been foundational to research on strategy.

This thought experiment is something you may very well have heard of before; it’s called “The Prisoner’s Dilemma.”

Imagine two crooks, bank robbers, murderers, or some other type of criminal offenders have been caught for a crime.  Although apprehended, the police don’t have quite all the evidence they need to really put the two in the slammer for committing the aforementioned crime(s).  So, what the police decide to do is split the two prisoners up for a period of time, then take each one individually into a room and say, “Your other guy, you know he ratted you out; you’re going to go away for 5 years in prison.  But if you spill the beans on him, we’ll make you a deal, cutting that time to only a year.”  The prisoner ponders this offer and has to decide whether he is going to rat on his partner, or keep silent and not say anything.  These two options are referred to as “defect” and “cooperate,” respectively.

Here’s the kicker: the police don’t actually have much evidence at all, and they’re liars.  The other prisoner did not say anything; in fact, the police have yet to even meet with him.  They are trying to trick the prisoner into giving them the testimony they need in order to convict him and his accomplice.

So let’s step back away from the thought experiment for a moment and consider that each prisoner has two choices he can make.  We can therefore look at all the possible outcomes, just like a Mendelian Punnett square.

There are four possible outcomes:

(C,C), (C,D), (D,C) and (D,D)

where C = cooperate and D = defect

It’s easy to understand the outcomes of each of these situations by assigning a value to each decision and outcome.  The graphic to the left (which I’m so graciously borrowing) assigns the most logical values for this situation: “years in prison.”

So let’s run through these situations:

(C,C) Both prisoners cooperate, and each gets 2 years in prison.

(C,D) Prisoner 2, Henry in the case of the graphic, defects, squealing on his partner, leaving himself with a quick 1 year in prison and his partner with 5 years in prison.

(D,C) Prisoner 1, Dave, defects, and ends up with the same deal Henry would have gotten if he had defected: 1 year in prison for the rat, and 5 years for the transgressed.

(D,D) Both prisoners defect, each giving testimony that the other was the leader of the crime and that he was simply the accomplice.  Due to the contradiction, neither can be fully convicted, but they each still end up with 3 years in prison.
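The four outcomes above can also be written down as a small lookup table.  Here’s a sketch in Python (the dictionary name is mine; the year values are the ones from the graphic):

```python
# Payoff table for the Prisoner's Dilemma, in years of prison.
# Keys are (prisoner 1's choice, prisoner 2's choice); values are
# (prisoner 1's sentence, prisoner 2's sentence).
# "C" = cooperate (stay silent), "D" = defect (rat out the partner).
PAYOFFS = {
    ("C", "C"): (2, 2),  # both stay silent: 2 years each
    ("C", "D"): (5, 1),  # prisoner 2 squeals: 1 year for him, 5 for his partner
    ("D", "C"): (1, 5),  # prisoner 1 squeals: the mirror image
    ("D", "D"): (3, 3),  # both squeal: 3 years each
}

# Whatever the partner does, defecting always yields the shorter sentence:
for partner_move in ("C", "D"):
    assert PAYOFFS[("D", partner_move)][0] < PAYOFFS[("C", partner_move)][0]
```

The little loop at the bottom is the whole dilemma in two lines: defecting is the individually rational choice no matter what your partner does, even though mutual cooperation beats mutual defection.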

SO.  What is a prisoner to do?  It’s obvious that if they testify that their partner was in charge, they will receive the least amount of time in prison.  If they say nothing, they will end up with more time in the slammer.  This is where the dilemma comes in.  Both prisoners are told that their partner ratted them out.  This means that if they believe the police officers’ words, they will go away for 5 years and their partner will get off with a much reduced sentence.  So what is to be done?

None of the options is clearly best for the prisoners, as they are not privy to exactly what evidence is held against them.  They also cannot speak with their accomplice to confirm or deny the statements made against them.

Again, I ask: what is a prisoner to do?  Well, Robert Axelrod, a professor of public policy at the University of Michigan, wondered the same thing.  Except he approached the problem with a rather more robust treatment than just thinking and anecdote.  Axelrod decided an excellent way to determine what to do was to hold a contest.  The guidelines of this contest were relatively simple: contestants would write a program that would “play out” this situation a number of times against a program from another contestant.  Instead of “years in prison,” point values would be assigned to the programs based upon their decisions.  The highest point value would be assigned to the best outcome, i.e., the least amount of time in prison.  Therefore, the prisoner who defected against his cooperating accomplice would receive the most points.
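The setup can be sketched in a few lines of Python.  This is not Axelrod’s actual tournament code, and the function names are my own; the point values (3 each for mutual cooperation, 5 for a lone defector against a cooperator’s 0, 1 each for mutual defection) are the conventional ones used in his tournament:

```python
# Conventional point values for the iterated Prisoner's Dilemma:
# higher scores correspond to less time in prison.
POINTS = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def play_match(strategy_a, strategy_b, rounds=200):
    """Pit two strategies against each other and return their total scores.
    A strategy is any function mapping the opponent's move history to "C" or "D"."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a = strategy_a(hist_b)  # each program sees only the other's past moves
        b = strategy_b(hist_a)
        pa, pb = POINTS[(a, b)]
        score_a += pa
        score_b += pb
        hist_a.append(a)
        hist_b.append(b)
    return score_a, score_b

# Two of the simplest possible entrants:
always_defect = lambda opponent_history: "D"
always_cooperate = lambda opponent_history: "C"

print(play_match(always_defect, always_cooperate, rounds=10))  # (50, 0)
```

Run every entrant against every other entrant, total up the points, and the strategies sort themselves from worst to best.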

By running these simulations over time, particular strategies, or (sometimes) conditional behaviors, would be highlighted as good, bad, better, etc.  One could therefore use these as guides for what to do in the Prisoner’s Dilemma.  I’m not going to elaborate too much on these strategies, except for the winning one, known as “Tit-for-Tat,” which I believe Wikipedia does a good job of explaining (this is verbatim from the site):

This strategy is dependent on four conditions, which have allowed it to become the most successful strategy for the iterated prisoner’s dilemma:[1]

  1. Unless provoked, the agent will always cooperate
  2. If provoked, the agent will retaliate
  3. The agent is quick to forgive
  4. The agent must have a good chance of competing against the opponent more than once.

In the last condition, the definition of “good chance” depends on the payoff matrix of the prisoner’s dilemma. The important thing is that the competition continues long enough for repeated punishment and forgiveness to generate a long-term payoff higher than the possible loss from cooperating initially.

A fifth condition applies to make the competition meaningful: if an agent knows that the next play will be the last, it should naturally defect for a higher score. Similarly if it knows that the next two plays will be the last, it should defect twice, and so on. Therefore the number of competitions must not be known in advance to the agents.

Therefore: cooperate first, and if someone burns you, burn them back; but be forgiving.
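The four conditions quoted above amount to a remarkably small program.  Here’s a sketch (the function name is mine, not from Axelrod’s tournament entry):

```python
def tit_for_tat(opponent_history):
    """Cooperate on the first move, then simply mirror the opponent's
    previous move: retaliate after a defection, forgive after cooperation."""
    if not opponent_history:
        return "C"               # unless provoked, cooperate
    return opponent_history[-1]  # retaliate once, then forgive

# Trace it against an opponent who defects once and then behaves:
opponent_moves = ["C", "D", "C", "C"]
my_moves = [tit_for_tat(opponent_moves[:i]) for i in range(len(opponent_moves) + 1)]
print(my_moves)  # ['C', 'C', 'D', 'C', 'C']
```

Note the single retaliatory “D” followed immediately by a return to cooperation: that quick forgiveness is what lets Tit-for-Tat rack up long stretches of mutually cooperative points over repeated play.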

That’s the Prisoner’s Dilemma, foundational for a basic understanding of strategy, and a central tenet of game theory.

Thanks for sticking with me for the rather lengthy explanation.  Now if you haven’t already, GO LISTEN TO RADIOLAB.  It’s a great show, and the episode features Robert Axelrod, who does a much better job of explaining his own work than I do.

If you’d like to pursue this subject a little further, I’d suggest two books: one which is rather technical (read: full of math), and the other written for a popular audience but still featuring all of the meaty bits of science (sorry I can’t equate science to veggies, folks).

Read: “Evolution and the Theory of Games” by John Maynard Smith

Read: “The Selfish Gene” by Richard Dawkins
