In 1950, John Nash, then a 21-year-old Princeton graduate student studying game theory, solved the problem of finding equilibria in nonzero sum games. In its simplest terms, his idea of equilibrium strategies asks, “what is the best reply to the other person’s strategy?” But it goes deeper than that, and in this guide we’ll explain it all.
When solving for equilibria in zero sum games you don’t worry about the opponent’s payoffs, just their strategies. The reason is that the opponent’s payoffs are implicitly known—they are always diametrically opposed to yours. But in nonzero sum games, when computing the equilibrium strategies, you have to consider what the opponent’s payoffs are. As you might expect, the solution technique will be a little different.
Take a look at the Coordination game matrix. Imagine that Sony picks Blu-ray, while Toshiba decides to give in and opts for Blu-ray as well. The outcome is much more convenient for Sony but is better for Toshiba than if they didn’t coordinate. The payoff of (4, 1) reflects this.
The thing is, (Blu-ray, Blu-ray) is a pure strategy equilibrium. Both players are playing their best reply to the other’s strategy. If Sony and Toshiba both pick Blu-ray, they are at the outcome (4, 1). Suppose Toshiba reconsiders and is thinking about picking HD DVD. Given that Sony’s choice has restricted the outcomes to the first row of the matrix, Toshiba’s choice of HD DVD will leave it with a payoff of 0, which is worse than the payoff of 1 from selecting Blu-ray. So Toshiba’s best response is to stick with Blu-ray. The same check works for Sony: given that Toshiba picks Blu-ray, switching to HD DVD would drop Sony’s payoff from 4 to 0, so Sony sticks with Blu-ray too.
Similarly, (HD DVD, HD DVD), with its payoff of (1, 4), is another pure strategy equilibrium. Game theorists call equilibria like these, in which each player’s strategy is a best reply to the other’s, Nash equilibria.
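The best-reply check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the payoff table is reconstructed from the payoffs used throughout this section ((4, 1) when both pick Blu-ray, (1, 4) when both pick HD DVD, and (0, 0) when they fail to coordinate), and the names `PAYOFFS`, `STRATEGIES`, and `pure_nash_equilibria` are made up for this example.

```python
# Payoffs are (Sony, Toshiba), keyed by (Sony's choice, Toshiba's choice).
# Assumption: table reconstructed from the payoffs quoted in the text.
STRATEGIES = ("Blu-ray", "HD DVD")
PAYOFFS = {
    ("Blu-ray", "Blu-ray"): (4, 1),
    ("Blu-ray", "HD DVD"): (0, 0),
    ("HD DVD", "Blu-ray"): (0, 0),
    ("HD DVD", "HD DVD"): (1, 4),
}


def pure_nash_equilibria(payoffs, strategies):
    """Return every cell where neither player gains by deviating alone."""
    equilibria = []
    for sony in strategies:
        for toshiba in strategies:
            sony_pay, toshiba_pay = payoffs[(sony, toshiba)]
            # Sony's alternatives, holding Toshiba's choice fixed.
            sony_ok = all(
                payoffs[(alt, toshiba)][0] <= sony_pay for alt in strategies
            )
            # Toshiba's alternatives, holding Sony's choice fixed.
            toshiba_ok = all(
                payoffs[(sony, alt)][1] <= toshiba_pay for alt in strategies
            )
            if sony_ok and toshiba_ok:
                equilibria.append((sony, toshiba))
    return equilibria


print(pure_nash_equilibria(PAYOFFS, STRATEGIES))
```

Run as written, the function should report the two coordinated outcomes and reject the two mismatched ones, since in a mismatched cell each player can do better by switching.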
The Coordination game represents a commonplace problem in the world of business and technology and beyond. And although it’s nice to know there are two “reasonable” solutions to the game in the form of the Nash equilibria, neither solution is entirely satisfactory. Either Sony or Toshiba is going to be somewhat inconvenienced. It is important to discover whether a mixed strategy Nash equilibrium exists, and if so, whether it somehow improves on the pure strategy pairs.
To solve for the mixed strategy Nash equilibrium, you solve for each player’s mix one at a time. The guiding principle behind finding the best reply is that for any player, each pure strategy that is used in their equilibrium mix must yield the same expected payoff. This means that each player must solve for the mix that equalizes the other player’s average payoffs.
If you’re Sony, you will find the mixed strategy x for Blu-ray and (1 − x) for HD DVD that makes Toshiba indifferent between their two choices. We use Toshiba’s payoffs and equate its expected values for Blu-ray and HD DVD.
If Toshiba picks Blu-ray | | If Toshiba picks HD DVD
---|---|---
1 * x + 0 * (1 − x) | = | 0 * x + 4 * (1 − x)
x | = | 4 − 4 * x
5 * x | = | 4
So x | = | ⅘
So Sony’s equilibrium strategy is to randomly pick Blu-ray ⅘ of the time, and to pick HD DVD the other ⅕ of the time.
Now Toshiba will mix strategies (y, 1 − y) to neutralize Sony’s expected outcomes under each of Sony’s strategies. (We use Sony’s payoffs now.)
If Sony picks Blu-ray | | If Sony picks HD DVD
---|---|---
4 * y + 0 * (1 − y) | = | 0 * y + 1 * (1 − y)
4 * y | = | 1 − y
5 * y | = | 1
So y | = | ⅕
Toshiba is thus advised to pick Blu-ray (randomly) ⅕ of the time and HD DVD the remaining ⅘ of the time.
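If you would like to check the two indifference calculations, here is a minimal Python sketch. It assumes a two-by-two game like this one and uses exact fractions; the helper name `indifference_mix` and its argument layout are inventions for this example, not anything from the book.

```python
from fractions import Fraction


def indifference_mix(option_1, option_2):
    """Probability p of the mixer's first strategy that equalizes the
    opponent's two expected payoffs.

    Each option is (opponent's payoff if the mixer plays its first
    strategy, opponent's payoff if the mixer plays its second strategy).
    Solves: option_1[0]*p + option_1[1]*(1-p)
         == option_2[0]*p + option_2[1]*(1-p)
    """
    denominator = (option_1[0] - option_1[1]) - (option_2[0] - option_2[1])
    if denominator == 0:
        raise ValueError("no unique indifference point for these payoffs")
    return Fraction(option_2[1] - option_1[1], denominator)


# Sony's mix x uses Toshiba's payoffs: (if Sony Blu-ray, if Sony HD DVD).
x = indifference_mix(option_1=(1, 0), option_2=(0, 4))

# Toshiba's mix y uses Sony's payoffs: (if Toshiba Blu-ray, if Toshiba HD DVD).
y = indifference_mix(option_1=(4, 0), option_2=(0, 1))

print(x, y)  # 4/5 1/5
```

Note that each player's mix is computed from the other player's payoffs, which is exactly the step that sets nonzero sum games apart from zero sum ones.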
The mixed strategy Nash equilibrium has both Sony and Toshiba go for their preferred choices ⅘, or 80 percent, of the time, and give in (randomly) the other ⅕, or 20 percent, of the time. This might have a certain intuitive appeal. But what are the average payoffs?
Finding the equilibrium solution is crucial, but you will also want to determine the average payoffs to the players when they play their equilibrium strategies. Let’s do Sony’s equilibrium payoff first. (Since the game is symmetric, i.e., Toshiba’s strategies and payoffs precisely mirror Sony’s, Toshiba’s equilibrium payoff will be identical.)
Sony picks Blu-ray ⅘ of the time. But the two players coordinate on only ⅕ of those selections since that is the probability that Toshiba will also pick Blu-ray. The other ⅘ of the time they do not coordinate.
Sometimes Sony will give in and pick HD DVD; in fact, this will happen the other ⅕ of the time. When Sony picks HD DVD, ⅕ of the time Toshiba gives in and picks Blu-ray (tragic!) and ⅘ of the time Toshiba picks HD DVD.
Four outcomes are possible, and now you can find the probabilities associated with all of them. For example, the probability that both pick Blu-ray is ⅘ * ⅕. The following expression gives the expected value for Sony under its equilibrium strategy:
(⅘) * (⅕) * 4 + (⅘) * (⅘) * 0 + (⅕) * (⅕) * 0 + (⅕) * (⅘) * 1 = ⅘.
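The same sum can be checked by looping over all four outcomes. The sketch below is an illustration only: the payoff table is reconstructed from the numbers used in this section, and the names `PAYOFFS`, `SONY_MIX`, `TOSHIBA_MIX`, and `expected_payoffs` are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Payoffs are (Sony, Toshiba), keyed by (Sony's choice, Toshiba's choice).
# Assumption: table reconstructed from the payoffs quoted in the text.
PAYOFFS = {
    ("Blu-ray", "Blu-ray"): (4, 1),
    ("Blu-ray", "HD DVD"): (0, 0),
    ("HD DVD", "Blu-ray"): (0, 0),
    ("HD DVD", "HD DVD"): (1, 4),
}

# Equilibrium mixes found above.
SONY_MIX = {"Blu-ray": Fraction(4, 5), "HD DVD": Fraction(1, 5)}
TOSHIBA_MIX = {"Blu-ray": Fraction(1, 5), "HD DVD": Fraction(4, 5)}


def expected_payoffs(payoffs, sony_mix, toshiba_mix):
    """Weight each outcome's payoffs by the probability that it occurs."""
    sony_total = Fraction(0)
    toshiba_total = Fraction(0)
    for sony, toshiba in product(sony_mix, toshiba_mix):
        # The players randomize independently, so probabilities multiply.
        probability = sony_mix[sony] * toshiba_mix[toshiba]
        sony_pay, toshiba_pay = payoffs[(sony, toshiba)]
        sony_total += probability * sony_pay
        toshiba_total += probability * toshiba_pay
    return sony_total, toshiba_total


print(expected_payoffs(PAYOFFS, SONY_MIX, TOSHIBA_MIX))
```

Both totals come out to ⅘, matching the hand calculation for Sony and the symmetric result for Toshiba.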
Unfortunately, Sony realizes that this mixed strategy equilibrium, with its average payoff of ⅘, leaves them worse off than if they had simply ended up with HD DVD, which pays Sony 1. Because the game is entirely symmetric, the equilibrium payoff for Toshiba is the same as Sony’s, i.e., equal to ⅘. Thus Toshiba discovers that they would have been better off agreeing to switch to Blu-ray, which pays Toshiba 1.
Now that you understand the Nash Equilibrium, studying game theory should become much easier. Good luck!
From The Complete Idiot’s Guide to Game Theory by Edward C. Rosenthal, Ph.D.