Justin Spencer: The Art of Strategy by Avinash K. Dixit & Barry J. Nalebuff

Strategic thinking is the art of outdoing an adversary, knowing that the adversary is trying to do the same to you. It is also the art of finding ways to cooperate, even when others are motivated by self-interest, not benevolence. It is the art of convincing others, and even yourself, to do what you say. It is the art of interpreting and revealing information. It is the art of putting yourself in others’ shoes so as to predict and influence what they will do.
The branch of social science that studies strategic decision making is called game theory.
Science and art, by their very nature, differ in that science can be learned in a systematic and logical way, whereas expertise in art has to be acquired by example, experience, and practice.
The key lesson of game theory is to put yourself in the other player’s shoes.
They point out that if you flip a coin long enough, you will find some very long series of consecutive heads.
If you have the lead, the surest way to stay ahead is to play monkey see, monkey do.
There are two ways to move second. You can imitate as soon as the other has revealed his approach or wait longer until the success or failure of the approach is known. The longer wait is more advantageous in business because, unlike in sports, the competition is usually not winner-take-all. As a result, market leaders will not follow the upstarts unless they also believe in the merits of their course.
A compromise in the short term may prove a better strategy over the long haul.
Most of the time, the future self wins because it gets to move last. The trick is to change the incentives for the future self so as to change its behavior.
How do you get people to do something that is against their interest? Put them in what is known as the prisoners’ dilemma.
It turns out most people fall into predictable patterns. You can test this yourself online where computer programs are able to find the pattern and beat you. In an effort to mix things up, players often rotate their strategies too much. This leads to the surprise success of the “avalanche” strategy: rock, rock, rock.
People are also too influenced by what the other side did last time.
The importance of randomized strategies was one of the early insights of game theory. The idea is simple and intuitive but needs refinement to be useful in practice.
Every action someone takes tells us something about what he knows, and you should use these inferences along with what you already know to guide your actions.
You need to understand the other player’s perspective. You need to consider what they know, what motivates them, and even how they think about you.
When thinking strategically, you have to work extra hard to understand the perspective and interactions of all the other players in the game, including ones who may be silent.
You may be thinking you are playing one game, but it is only part of a larger game. There is always a larger game.
“For every action we take, there is a reaction.” We do not live and act in a vacuum. Therefore, we cannot assume that when we change our behavior everything else will remain unchanged.
The essence of a game of strategy is the interdependence of the players’ decisions. These interactions arise in two ways. The first is sequential, as in the Charlie Brown story. The players make alternating moves. [...] The second kind of interaction is simultaneous, as in the prisoners’ dilemma. The players act at the same time, in ignorance of the others’ current actions. However, each must be aware that there are other active players, who in turn are similarly aware, and so on.
When you find yourself playing a strategic game, you must determine whether the interaction is simultaneous or sequential.
The general principle for sequential-move games is that each player should figure out the other players’ future responses and use them in calculating his own best current move. This idea is so important that it is worth codifying into a basic rule of strategic behavior:

Rule 1: Look forward and reason backward.

Anticipate where your initial decisions will ultimately lead and use this information to calculate your best choice.
A tree diagram of the choices in the game sometimes serves as a visual aid for correct reasoning in such games.
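As a rough illustration of Rule 1, the sketch below solves a small game tree by starting at the final decisions and working back to the first move. The entry game, its player labels, and all payoffs are hypothetical and chosen only for illustration; they are not taken from the book.

```python
# Minimal backward-induction sketch on a hypothetical two-player game tree.
# A leaf is a tuple of payoffs (player 0, player 1). An internal node is a
# dict naming the player to move and the subtree reached by each action.

def solve(node):
    """Return (payoffs, path) reached when each mover picks his best branch."""
    if isinstance(node, tuple):          # leaf: nothing left to decide
        return node, []
    player = node["player"]
    best = None
    for action, child in node["moves"].items():
        payoffs, path = solve(child)     # first work out what follows this action
        if best is None or payoffs[player] > best[0][player]:
            best = (payoffs, [action] + path)
    return best

# Hypothetical entry game: an entrant (player 0) moves first,
# then an incumbent (player 1) responds if entry occurs.
game = {
    "player": 0,
    "moves": {
        "stay out": (0, 10),
        "enter": {
            "player": 1,
            "moves": {"fight": (-2, 2), "accommodate": (3, 5)},
        },
    },
}

payoffs, path = solve(game)
print(path, payoffs)  # ['enter', 'accommodate'] (3, 5)
```

The entrant enters only because it first works out that the incumbent, once faced with entry, does better by accommodating than by fighting.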
In single-person decisions, greater freedom of action can never hurt. But in games, it can hurt because its existence can influence other players’ actions. Conversely, tying your own hands can help.
To figure out what the other players will choose at future points in the game, you need to know what their objectives are and, in the case of multiple objectives, how they will trade one off against the other. You can almost never know this for sure and must make educated guesses. You must not assume that other people will have the same preferences as you do, or as a hypothetical “rational person” does, but must genuinely think about their situation.
Putting yourself in the other person’s shoes is a difficult task, often made more complicated by your emotional involvement in your own aims and pursuits.
Players in many games must face uncertainty about other players’ choices; this is sometimes called strategic uncertainty to distinguish it from the natural aspects of chance, such as a distribution of cards or the bounce of a ball from an uneven surface.
Backward reasoning along a tree is the correct way to analyze and solve games where the players move sequentially. Those who fail to do so either explicitly or intuitively are harming their own objectives; they should read our book or hire a strategic consultant. But that is an advisory or normative use of the theory of backward reasoning.
Hypotheses don’t have to be either fully correct or totally wrong; accepting one need not mean rejecting all others.
To really look forward and reason backward, you have to predict what the other players will actually do, not what you would have done in their shoes. The problem is that when you try to put yourself in the other players’ shoes, it is hard if not impossible to leave your own shoes behind. You know too much about what you are planning to do in your next move and it is hard to erase that knowledge when you are looking at the game from the other player’s perspective.
One of the general morals of this story is that if you have to take some risks, it is often better to do so as quickly as possible.
The wisdom of taking risks early applies to most aspects of life, whether it be career choices, investments, or dating.
A player is said to have a dominant strategy if that same strategy is better for him than all of his other available strategies no matter what strategy or strategy combination the other player or players choose. And we have a simple rule for behavior in simultaneous-move games:

Rule 2: If you have a dominant strategy, use it.
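
One way to check Rule 2 mechanically is to scan a payoff table for a strategy that beats every alternative against every choice the rival might make. The function name, the table layout, and the prisoners' dilemma payoffs (years in prison, written as negative numbers) are assumptions made for this sketch.

```python
def dominant_strategy(payoffs):
    """payoffs[mine][theirs] is my payoff.

    Return the strategy that is strictly better than each of my other
    strategies against every rival choice, or None if there is none.
    """
    strategies = list(payoffs)
    rival_choices = list(next(iter(payoffs.values())))
    for s in strategies:
        if all(
            payoffs[s][r] > payoffs[t][r]
            for t in strategies if t != s
            for r in rival_choices
        ):
            return s
    return None

# Hypothetical prisoners' dilemma, seen from one prisoner's side.
prisoner = {
    "confess": {"confess": -10, "silent": -1},
    "silent":  {"confess": -25, "silent": -3},
}

# Hypothetical matching-pennies payoffs: no strategy is always best.
pennies = {
    "heads": {"heads": 1, "tails": -1},
    "tails": {"heads": -1, "tails": 1},
}

print(dominant_strategy(prisoner))  # confess
print(dominant_strategy(pennies))   # None
```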

Before cheating can be punished, it must be detected. If detection is fast and accurate, the punishment can be immediate and accurate. That reduces the gain from cheating while increasing its cost, and thus increases the prospects for successful cooperation.
Next, there is the choice of punishment. Sometimes the players have available to them actions that hurt others, and these can be invoked after an instance of cheating even in a one-time interaction.
The boundaries of acceptable behavior, and the consequences of cheating, should be clear to a prospective cheater. If these things are complex or confusing, the player may cheat by mistake or fail to make a rational calculation and play by some hunch.
Players should have confidence that defection will be punished and cooperation rewarded. This is a major problem in some international agreements like trade liberalization in the World Trade Organization (WTO).
If the punishment is strong enough to deter cheating, it need never actually be inflicted. Therefore it may as well be set at a sufficiently high level to ensure deterrence.
If successive elimination of dominated strategies (or never-best-response strategies) and choice of dominant strategies does lead to a unique outcome, that is a Nash equilibrium. When this works, it is an easy way to find Nash equilibria. Therefore we summarize our discussion of finding Nash equilibria into two rules:

Rule 3: Eliminate from consideration any dominated strategies and strategies that are never best responses, and go on doing so successively.
Rule 4: Having exhausted the simple avenues of looking for dominant strategies or ruling out dominated ones, next search all the cells of the game table for a pair of mutual best responses in the same cell, which is a Nash equilibrium of the game.
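
Rule 3 can be sketched as a loop that keeps deleting strategies until nothing more can be removed. This sketch assumes a strategy counts as dominated when some other remaining strategy is strictly better against every remaining rival choice; it does not handle never-best-response strategies. The 3x3 game table and its payoffs are hypothetical.

```python
def eliminate_dominated(table, rows, cols):
    """table[(row, col)] = (row player's payoff, column player's payoff).

    Repeatedly drop strictly dominated rows and columns; return the survivors.
    """
    rows, cols = list(rows), list(cols)
    changed = True
    while changed:
        changed = False
        for r in rows:
            # Is some other row strictly better than r against every column?
            if any(all(table[(o, c)][0] > table[(r, c)][0] for c in cols)
                   for o in rows if o != r):
                rows.remove(r)
                changed = True
                break
        for c in cols:
            # Is some other column strictly better than c against every row?
            if any(all(table[(r, o)][1] > table[(r, c)][1] for r in rows)
                   for o in cols if o != c):
                cols.remove(c)
                changed = True
                break
    return rows, cols

# Hypothetical game table.
table = {
    ("U", "L"): (3, 2), ("U", "C"): (2, 3), ("U", "R"): (4, 1),
    ("M", "L"): (2, 2), ("M", "C"): (1, 4), ("M", "R"): (3, 0),
    ("D", "L"): (4, 1), ("D", "C"): (0, 2), ("D", "R"): (1, 3),
}

survivors = eliminate_dominated(table, ["U", "M", "D"], ["L", "C", "R"])
print(survivors)  # (['U'], ['C'])
```

In this example no column is dominated at the start; L and R only become dominated after rows have been removed, which is why the elimination has to be repeated.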

A Nash equilibrium is a configuration of strategies where each player’s choice is his best response to the other player’s choice (or the other players’ choices when there are more than two players in the game). If some outcome is not a Nash equilibrium, at least one player must be choosing an action that is not his best response. Such a player has a clear incentive to deviate from that action, which would destroy the proposed solution.
A Nash equilibrium is a combination of two conditions:

Each player is choosing a best response to what he believes the other players will do in the game.
Each player’s beliefs are correct. The other players are doing just what everyone else thinks they are doing.
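
The cell-by-cell search for mutual best responses can be sketched as follows. It looks only for equilibria in pure strategies, and the two example games with their payoffs are hypothetical.

```python
def pure_nash(table, rows, cols):
    """table[(row, col)] = (row player's payoff, column player's payoff).

    Return every cell in which each player's choice is a best response
    to the other's choice.
    """
    equilibria = []
    for r in rows:
        for c in cols:
            row_best = all(table[(r, c)][0] >= table[(o, c)][0] for o in rows)
            col_best = all(table[(r, c)][1] >= table[(r, o)][1] for o in cols)
            if row_best and col_best:
                equilibria.append((r, c))
    return equilibria

# Hypothetical coordination game: two cells pass the test.
meeting = {
    ("opera", "opera"): (2, 1), ("opera", "game"): (0, 0),
    ("game", "opera"): (0, 0),  ("game", "game"): (1, 2),
}

# Hypothetical matching pennies: in every cell someone wants to deviate.
pennies = {
    ("H", "H"): (1, -1), ("H", "T"): (-1, 1),
    ("T", "H"): (-1, 1), ("T", "T"): (1, -1),
}

print(pure_nash(meeting, ["opera", "game"], ["opera", "game"]))
# [('opera', 'opera'), ('game', 'game')]
print(pure_nash(pennies, ["H", "T"], ["H", "T"]))
# []
```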

A game is a situation of strategic interdependence: the outcome of your choices (strategies) depends upon the choices of one or more other persons acting purposely. The decision makers involved in a game are called players, and their choices are called moves. The interests of the players in a game may be in strict conflict; one person’s gain is always another’s loss. Such games are called zero-sum. More typically, there are zones of commonality of interests as well as of conflict and so, there can be combinations of mutually gainful or mutually harmful strategies. Nevertheless, we usually refer to the other players in a game as one’s rivals.
The moves in a game may be sequential or simultaneous. In a game of sequential moves, there is a linear chain of thinking: If I do this, my rival can do that, and in turn I can respond in the following way. Such a game is studied by drawing a game tree. The best choices of moves can be found by applying Rule 1: Look forward and reason backward.
In a game with simultaneous moves, there is a logical circle of reasoning: I think that he thinks that I think that...and so on. This circle must be squared; one must see through the rival’s action even though one cannot see it when making one’s own move. To tackle such a game, construct a table that shows the outcomes corresponding to all conceivable combinations of choices.
Begin by seeing if either side has a dominant strategy--one that outperforms all of that side’s other strategies, irrespective of the rival’s choice. This leads to Rule 2: If you have a dominant strategy, use it. If you don’t have a dominant strategy, but your rival does, then count on his using it, and choose your best response accordingly.
Next, if neither side has a dominant strategy, see if either has a dominated strategy--one that is uniformly worse for the side playing it than all the rest of its strategies. If so, apply Rule 3: Eliminate dominated strategies from consideration. Go on doing so successively. If during the process any dominant strategies emerge in the smaller games, they should be chosen.
Finally, if there are neither dominant nor dominated strategies, or after the game has been simplified as far as possible using the second step, apply Rule 4: Look for an equilibrium, a pair of strategies in which each player’s action is the best response to the other’s.
Rule 5: In a game of pure conflict (zero-sum game), if it would be disadvantageous for you to let the opponent see your actual choice in advance, then you benefit by choosing at random from your available pure strategies. The proportions in your mix should be such that the opponent cannot exploit your choice by pursuing any particular pure strategy from the ones available to him--that is, you get the same average payoff when he plays any of his pure strategies against your mixture.
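For a two-by-two zero-sum game, the mixing proportions in Rule 5 can be found by setting your average payoff against each of the opponent's pure strategies equal. If you play A with probability p, the condition p*a + (1-p)*c = p*b + (1-p)*d gives p = (d - c) / (a - b - c + d). The sketch below assumes integer payoffs and a game in which mixing is actually needed; the function name and the example numbers are hypothetical.

```python
from fractions import Fraction

def equalizing_mix(a, b, c, d):
    """My strategies are A and B; the opponent's are X and Y.

    Payoffs to me (integers): a = A vs X, b = A vs Y, c = B vs X, d = B vs Y.
    Return (p, value): the probability of playing A that gives the same
    average payoff whichever column the opponent picks, and that payoff.
    """
    denom = (a - b) - (c - d)
    if denom == 0:
        raise ValueError("no unique equalizing mix exists")
    p = Fraction(d - c, denom)
    if not 0 <= p <= 1:
        raise ValueError("no valid mix; check for a dominant pure strategy")
    value = p * a + (1 - p) * c
    return p, value

# Matching pennies: mix half and half, for an average payoff of zero.
print(equalizing_mix(1, -1, -1, 1))   # (Fraction(1, 2), Fraction(0, 1))

# Hypothetical lopsided game.
p, value = equalizing_mix(2, -1, -3, 4)
print(p, value)                       # 7/10 1/2
```

With the 7/10 mix, the opponent's first strategy yields 0.7*2 + 0.3*(-3) = 0.5 and his second yields 0.7*(-1) + 0.3*4 = 0.5, so neither lets him exploit the mixture.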
A commitment is an unconditional strategic move; as the Nike slogan says, you “just do it”; then the other players are followers.
Threats and promises, on the other hand, are more complex conditional moves; they require you to fix in advance a response rule, stating how you would respond to the other player’s move in the actual game.
A threat is a response rule that punishes others who fail to act as you would like them to. A promise is an offer to reward other players who act as you would like them to.
The response rule prescribes your action as a response to the others’ moves. Although you act as a follower in the actual game, the response rule must be put in place before others make their moves.
You must seize the first-move status in the matter of putting the response rule in place and communicating it to the other player. You must ensure that your response rule is credible, namely that if and when the time comes for you to make the stated response, you will actually choose it. This may require changing the game in some way to ensure that the choice is in fact best for you in that situation.
The overall purpose of threats and promises is similar to that of commitments, namely, to induce the others to take actions different than they would otherwise.
When you want to stop the others from doing something they would otherwise do, that is deterrence. Its mirror image, namely to compel the others to do something they would not otherwise do, can then be termed compellence.
Promises can also be compellent or deterrent. A compellent promise is designed to induce someone to take a favorable action.
A deterrent promise is designed to prevent someone from taking an action that is against your interests.
Like the two kinds of threats, the two promises also share a common feature. After the other player has complied with one’s wishes, the promiser no longer needs to pay the cost of delivering the reward and has the temptation to renege.
All threats and promises have a common feature: the response rule requires you to take actions that you would not take in its absence. If instead the rule merely says that you will do what is best at the time, this is as if there is no rule: there is no change in others’ expectations about your future actions and hence no change in their actions. Still, there is an informal role for stating what will happen, even without any rule; these statements are called warnings and assurances.
When it is in your interest to carry out a threat, we call this a warning.
When it is in your interest to carry out a promise, we call this an assurance.
Strategic moves, therefore, contain two elements: the planned course of action and the associated actions that make this course credible.
It is never advantageous to allow others to threaten you. You could always do what they wanted you to do without the threat. The fact that they can make you worse off if you do not cooperate cannot help, because it limits your available options. But this maxim applies only to allowing threats. If the other side can make promises, then you can both be better off.
Sometimes the distinctions between threats and promises are blurred.
So should you use a threat or a promise? The answer depends on two considerations. The first is the cost. A threat can be less costly; in fact, it is costless if it is successful. If it changes the other player’s behavior in the way you want, you don’t have to carry out the costly action you had threatened. A promise, if successful, must be fulfilled--if the other player acts as you want him to, you have to deliver the costly action you had promised.
Deterrence does not necessarily have a deadline. It simply involves telling the other player not to do such and such, and credibly communicating the bad consequences that would follow if he takes the forbidden action. [...] Therefore deterrence can be achieved more simply and better by a threat. You set up a tripwire, and it is up to the other to decide whether to trigger it.
In contrast, compellence must have a deadline. [...] Therefore compellence is often better achieved by giving the other player the incentive not to procrastinate. This means that earlier performance must get a better reward or lighter punishment. This is a promise.
When making a threat or a promise, you must communicate to the other player quite clearly what actions will bring what punishment (or what reward). Otherwise, the other may form a wrong idea of what is forbidden and what is encouraged and miscalculate the consequences of his actions.
For a threat or promise to have its desired effect, the other player must believe it. Clarity without certainty doesn’t cut it.
If a threat is successful, the threatened action does not have to be carried out. Even though it may be costly for you to carry it out, since you don’t have to do so, the cost is irrelevant.
Very often you don’t know the exact size of a threat that is needed to deter or compel your adversary. You want to keep the size as low as possible to minimize the cost to you in the event that things go wrong and you have to go through with the action. So you start small and gradually raise the size of the threat. This is the delicate strategy of brinkmanship.
The key to understanding brinkmanship is to realize that the brink is not a sharp precipice but a slippery slope, getting gradually steeper.
The essence of brinkmanship is the deliberate creation of risk. This risk should be sufficiently intolerable to your opponent to induce him to eliminate the risk by following your wishes.
In other words, brinkmanship is “chicken in real time”: a game of increasing risk, just like the interrogation games in the movies.
Commitments, threats, and promises will not improve your outcome in a game if they are not credible.
In most situations, mere verbal promises should not be trusted. As Sam Goldwyn put it, “A verbal contract isn’t worth the paper it’s written on.”
A straightforward way to make your commitment credible is to agree to pay a penalty if you fail to follow through.
Alternative institutions of enforcement, such as the Mafia, get their credibility by developing a reputation. They may also develop expertise, which enables them to evaluate evidence faster or more accurately than the court system can. These advantages can prevail even when the court system is reliable and fair, and the alternative tribunals coexist with the formal machinery of the law.
If you try a strategic move in a game and then back off, you may lose your reputation for credibility. In a once-in-a-lifetime situation, reputation may be unimportant and therefore of little commitment value. But you typically play several games with different rivals at the same time, or the same rivals at different times. Future rivals will remember your past actions and may hear of your past actions in dealing with others. Therefore you have an incentive to establish a reputation, and this serves to make your future strategic moves credible.
Cutting off communication succeeded as a credible commitment device because it can make an action truly irreversible.
Armies often achieve commitment by denying themselves an opportunity to retreat.
A reputation is valuable only to the extent that it gets publicized; you can make it ineffective by maintaining secrecy.
Cutting off communication may protect the player making a strategic move by making his action irreversible. But if the other player is unavailable to receive the information about the opponent’s commitment or threat in the first place, the strategic move is pointless.
The credibility of mutual promises can be enhanced by breaking large actions into a sequence of small ones. But you can try to destroy the credibility of an opponent's threat by going against his wishes in small steps. Each step should be so small in relation to the threatened costly action that it is not in the interests of the other to invoke it.
In general, when there is a problem of commitment, one way around the issue is to rent rather than sell the product. That way no one has an incentive to take advantage of the used book stockpile, because there isn’t any.
When we recognize that players in a game may not have perfect information, it becomes important, even essential, to specify who knows what. The better informed player may want to convey the truth to a skeptical adversary; the less informed player generally wants to find the truth. This makes the true game between them one of manipulating information. Concealing, revealing, and interpreting information each require their own special strategies.
Why can’t we just rely on others to tell the truth? The answer is obvious: because it might be against their interests.
The general principle governing all such situations is: Actions speak louder than words. Players should watch what another player does, not what he or she says. And, knowing that the others will interpret actions in this way, each player should in turn try to manipulate actions for their information content.
Strategic game players who possess any special information try to conceal it if they will be hurt when other players find out the truth. And they will take actions that, when appropriately interpreted, reveal information that works favorably for them. They know that their actions, like their faces, leak information. They will choose actions that promote favorable leakage; such strategies are called signalling. They will act in ways that reduce or eliminate unfavorable leakage; this is signal jamming. It typically consists of mimicking something that is appropriate under different circumstances than the ones at hand.
If you want to elicit information from someone else, you should set up a situation where that person would find it optimal to take one action if the information was of one kind, and another action if it was of another kind; action (or inaction) then reveals the information. This strategy is called screening.
To be an effective signal, an action should be incapable of being mimicked by a rational liar: it must be unprofitable when the truth differs from what you want to convey.
Actions that are intended to convey a player’s private information to other players are called signals. For a signal to be a credible carrier of a specific item of information, it must be the case that the action is optimal for the player to take if, but only if, he has that specific information.
You want evidence that is credible and hard to mimic.
An MBA serves as a credible signal that the person intends to work for several years. If she was planning to drop out of the labor force in a year, it would not have made sense to have invested the two years in getting an MBA.
Practically speaking, it likely takes at least five years to recover the cost of the MBA in terms of tuition and lost salary.
Everything you do sends a signal, including not sending a signal.
In some circumstances, the most powerful signal you can send is that you don’t need to signal.
The application of the concept of screening that most impinges on your life is price discrimination. For almost any good or service, some people are willing to pay more than others--either because they are richer, more impatient, or just have different tastes. So long as the cost of producing and selling the good to a customer is less than what the customer is willing to pay, the seller would like to serve that customer and get the highest possible price.
Sellers do not know exactly how much each individual customer is willing to pay.
When one participant improves his own ranking, he necessarily worsens everyone else’s ranking. But the fact that one’s victory requires someone else’s defeat does not make the game zero-sum. In a zero-sum game it is not possible to make everyone better off.
The essential point is that it is not necessary to convert everyone, just a critical mass. Given enough of a toehold, the better technology can take it from there.
If nobody is abiding by the law, then you have [...] reasons to break it too.
This median is not necessarily the average position. The median position is determined by where there are an equal number of voices on each side, while the average gives weight to how far the voices are away.
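A small numeric sketch of the difference between the two, using made-up positions on a line. Moving the most extreme voice further out shifts the average but leaves the median where it was.

```python
from statistics import mean, median

# Hypothetical positions of five voices on a one-dimensional scale.
positions = [1, 2, 3, 4, 100]
print(median(positions), mean(positions))      # 3 22

# Push the most extreme voice even further out.
more_extreme = [1, 2, 3, 4, 1000]
print(median(more_extreme), mean(more_extreme))  # 3 202
```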
The trick is to get a critical mass of people to switch, and then the bandwagon effect makes the new equilibrium self-sustaining. In contrast, a little bit of pressure over a long period of time would not have the same effect.
A dominant strategy is your best play no matter what others are doing. Thus you don’t need to know how many others there are or what they are thinking or doing. Your best strategy doesn’t depend on what anyone else bids.
The larger takeaway here is that you may change the rules of the game, but the players will adapt their strategies to take those new rules into account. In many cases, they will precisely offset what you’ve done.
A powerful idea in game theory is the concept of acting like a consequentialist. By that we mean to look ahead and see where your actions have consequences. You should then assume that situation is the relevant one at the time of your initial play. It turns out that this perspective is critical in auctions and in life. It is the key tool to avoid the winner’s curse.
The larger moral here is that you can write down a set of rules for a game, but the players can undo those rules.
The way we describe a preemption game suggests a duel and that analogy is apt. If you fire too soon and miss, your rival will be able to advance and hit with certainty. But if you wait too long, you may end up dead without having fired a shot.
The opposite of the preemption game is a war of attrition. Instead of seeing who jumps in first, here the objective is to outlast your rival. Instead of who goes in first, the game is who gives in first.
Moral: If you don’t like the game you are playing, look for the larger game.
The first step in any negotiation is to measure the pie correctly.
The whole point of the negotiation is how much value can be created above and beyond the sum [...].
As always, actions speak louder than words. And, as always, conveying information by a signal entails a cost, or sacrifice of efficiency.
Risk and brinkmanship change the process of bargaining in a fundamental way.
An integral aspect of brinkmanship is that sometimes the parties do go over the brink.
Time is money in many different ways. Most simply, a dollar received earlier is worth more than the same dollar received later, because it can be invested and earn interest or dividends in the meantime.
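The point about earlier dollars can be made concrete with the standard present-value formula, amount / (1 + rate) ** years. The 10 percent rate and the dollar amounts below are hypothetical.

```python
def present_value(amount, rate, years):
    """Value today of `amount` received `years` from now, at annual `rate`."""
    return amount / (1 + rate) ** years

# At a hypothetical 10% annual return, $110 a year from now is worth
# about $100 today, because $100 invested now would grow to $110.
print(round(present_value(110, 0.10, 1), 2))

# The longer the wait, the less the same dollar is worth today.
print(round(present_value(100, 0.10, 5), 2))
```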
The foundation of a democratic government is that it respects the will of the people as expressed through the ballot box. Unfortunately, those lofty ideals are not so easily implemented. Strategic issues arise in voting, just as in any other multiperson game.
The most commonly used election procedure is simple majority voting. And yet the results of the majority-rule system can have paradoxical properties [...].
The way the U.S. judicial system works, a defendant is first found to be innocent or guilty. The punishment sentence is determined only after a defendant has been found guilty. It might seem that this is a relatively minor procedural issue. Yet the order of this decision making can mean the difference between life and death, or even between conviction and acquittal.
There are three alternative procedures to determine the outcome of a criminal court case. Each has its merits, and you might want to choose from among them based on some underlying principles.

Status Quo: First determine innocence or guilt; then, if guilty, consider the appropriate punishment.
Roman Tradition: After hearing the evidence, start with the most serious punishment and work down the list. First decide if the death penalty should be imposed for this case. If not, decide whether a life sentence is justified. If, after proceeding down the list, no sentence is imposed, the defendant is acquitted.
Mandatory Sentencing: First specify the sentence for the crime. Then determine whether the defendant should be convicted.

The only difference between these systems is one of agenda: what gets decided first.
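To see how the agenda alone can change the result, the sketch below runs the three procedures for a hypothetical panel of three judges. Several things here are assumptions for illustration: the judges' preference rankings, the restriction to two possible sentences (death or life), decisions by majority vote, and judges who look forward and reason backward at every vote.

```python
# Hypothetical preference rankings of three judges, best outcome first.
JUDGES = [
    ("death", "life", "acquit"),
    ("life", "acquit", "death"),
    ("acquit", "death", "life"),
]

def majority(x, y):
    """Outcome that a majority of judges prefers in a vote between x and y."""
    votes_x = sum(prefs.index(x) < prefs.index(y) for prefs in JUDGES)
    return x if votes_x * 2 > len(JUDGES) else y

def status_quo():
    sentence = majority("death", "life")    # what a guilty verdict would lead to
    return majority(sentence, "acquit")     # so the guilt vote is really this choice

def roman_tradition():
    fallback = majority("life", "acquit")   # what follows if death is rejected
    return majority("death", fallback)      # the first vote is death versus that

def mandatory_sentencing():
    if_death = majority("death", "acquit")  # conviction vote under a death sentence
    if_life = majority("life", "acquit")    # conviction vote under a life sentence
    return majority(if_death, if_life)      # the sentence vote picks between them

print(status_quo())            # acquit
print(roman_tradition())       # death
print(mandatory_sentencing())  # life
```

With these rankings the same three judges reach three different outcomes, one for each ordering of the decisions.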
A vote can have one of two effects. It can be instrumental in determining the outcome, or it can be a “voice” that influences the margin of victory or defeat without altering the outcome. In a decision-making body like the Senate, the first aspect is the more important one.
Much of a senator’s power comes from the work in committees.
A market economy has a better natural incentive mechanism, namely the profit motive. A company that succeeds in cutting costs or introducing a new product makes a greater profit; one that lags behind stands to lose money.
Economists and game theorists take a more relaxed attitude. They think it perfectly natural that people would respond in their own best interests to the incentives they face. If they can get away with shirking on the job, they will do so.
The main problem of moral hazard is the unobservability of a worker’s action or effort. Therefore payments cannot be based on effort, even though more or better effort is what you as an employer want to achieve. Payments must be based on some observable metric, such as the outcome or your profit. If there were a perfect and certain one-to-one relationship between such observable outcomes and the unobservable underlying action, perfect control of effort would be possible. But in practice outcome depends on other chance factors, not just the effort.
Rewarding an employee for a good outcome in part rewards him for good luck, and penalizing him for a poor outcome in part penalizes him for bad luck. If the chance element is too large, the reward is only poorly related to effort, and therefore the effect of outcome-based incentives on effort is weak. Realizing this, you would not offer such incentives to any great extent. Conversely, if the chance element is small, then stronger and sharper incentives can be used. This contrast will appear repeatedly in what follows.
An incentive payment scheme has two key aspects: the average payment to the worker, which must be enough to fulfill the participation constraint, and the spread of payments in good versus bad outcomes, which is what provides the incentive to exert more or better effort. The bigger the spread, the more powerful the incentive.
Promotion incentives are most useful for younger employees at lower and middle levels.
The purpose of the wage is to get the worker to put in the requisite effort and work more efficiently, and so it is called an efficiency wage. The excess above the basic wage elsewhere [...] is called the efficiency premium.
In some organizations the control structure is not a pyramid. In places the pyramid gets inverted: one worker is responsible to several bosses. This happens even in private companies but is much more common in the public sector. Most public sector agencies have to answer to the executive, the legislature, the courts, the media, various lobbies, and so on. The interests of these multiple owners are often imperfectly aligned or even totally opposed.
When you can’t observe the quality of effort, we know that you have to base your reward scheme on something you can observe. In the present instance the only thing that can be observed is the ultimate outcome, namely success or failure of the programming effort. This does have a link to effort, albeit an imperfect one--higher quality of effort means a greater chance of success. This link can be exploited to generate an incentive scheme. What you do is offer the expert a remuneration that depends on the outcome: a larger sum upon success and a smaller sum in the event of failure. The difference, or the bonus for success, should be just enough to make it in the employee’s own interest to provide high-quality effort.
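A back-of-the-envelope version of "just enough": high-quality effort is worth the employee's while when the extra chance of success times the bonus at least covers the extra cost of the effort. The sketch assumes the employee cares only about expected money, and the probabilities and cost figure are hypothetical.

```python
from fractions import Fraction

def minimum_bonus(p_success_high, p_success_low, extra_effort_cost):
    """Smallest success bonus that makes high-quality effort worthwhile.

    Assumes the employee weighs only expected money:
    (p_success_high - p_success_low) * bonus >= extra_effort_cost.
    """
    gap = p_success_high - p_success_low
    if gap <= 0:
        raise ValueError("better effort must raise the chance of success")
    return extra_effort_cost / gap

# Hypothetical numbers: success chance is 80% with high-quality effort,
# 60% otherwise, and the extra effort costs the employee 20,000.
bonus = minimum_bonus(Fraction(8, 10), Fraction(6, 10), 20_000)
print(bonus)  # 100000
```

The weaker the link between effort and outcome (the smaller the gap in success chances), the larger the bonus has to be.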
The inevitable truth about gambling is that one person’s gain must be another person’s loss. Thus it is especially important to evaluate a gamble from the other side’s perspective before accepting. If they are willing to gamble, they expect to win, which means they expect you to lose.
The general point is that in games it is not always an advantage to seize the initiative and move first. This reveals your hand, and the other players can use this to their advantage and your cost. Second movers may be in the stronger strategic position.
Your chances of survival depend on not only your own ability but also whom you threaten. A weak player who threatens no one may end up surviving if the stronger players kill each other off.
Basic training in the armed forces everywhere is a traumatic experience. The new recruit is maltreated, humiliated, and put under such immense physical and mental strain that the few weeks quite alter his personality. An important habit acquired in this process is an automatic, unquestioning obedience. There is no reason why socks should be folded, or beds made, in a particular way, except that the officer has so ordered. The idea is that the same obedience will occur when the order is of greater importance. Trained not to question orders, the soldier becomes a fighting machine; commitment is automatic.

20180729