
Identifying Predictable Players:*

Relating Behavioral Types and Subjects with Deterministic


Daniel E. Fragiadakis Daniel T. Knoepfle Muriel Niederle

Stanford University Stanford University Stanford University and NBER

December 29, 2013


Abstract

Behavioral game theory models are important in organizing experimental data of strategic decision making. However, are subjects classified as behavioral types more predictable than unclassified subjects? Alternatively, how many predictable subjects await new behavioral models to describe them? In our experiments, subjects play simple guessing games against random opponents and are subsequently asked to replicate or best-respond to their past choices. We find that existing behavioral game theory types capture 2/3 of strategic subjects, i.e., individuals who can best respond. However, there is additional room for non-strategic rule-of-thumb models to describe subjects who can merely replicate their actions.

1 Introduction

A robust finding of strategic choice experiments is that deviations from Nash equilibrium are common. This has led to alternative behavioral models with varying specifications of beliefs and derived choices; of these, hierarchy models, particularly the level-k model, are probably the most prominent.¹ In a typical empirical paper, participants in the laboratory play a set of games and are then classified as specific behavioral types given their observed choices; see Crawford, Costa-Gomes, and Iriberri (2013) for an overview.² Such studies generally allow some discrepancy between subjects’ choices and the models, with subjects assumed to be implementing their behavioral types’ prescriptions with error. Beyond some threshold, subjects are left unclassified and unexplained.

* We are especially grateful to Asen Ivanov for his impact on the design of the experiment. We thank Vince Crawford, Guillaume Fréchette and Matt Jackson for helpful comments and the NSF for generous support.

¹ A level-k player best-responds to beliefs that opponents are level-(k−1), with a level-0 player assumed to randomly choose any action or to choose a fixed action considered to be focal. The model originated in empirical papers that found it rationalized large fractions of behavior in beauty contest games (Nagel, 1995) and small normal-form games (Stahl and Wilson, 1994, 1995). The level-k model has since been used to model strategic behavior in a multitude of experiments, and has spawned a literature on extensions and theoretical underpinnings. A notable variant is the cognitive hierarchy model (Camerer, Ho, and Chong, 2004), in which frequencies of types k in the population are assumed to be distributed according to some distribution, and a player of type k has beliefs about opponent types corresponding to this distribution truncated at k−1.

This literature leads naturally to two questions that we address in this paper. The first concerns the extent to which existing behavioral game theory types coincide with the set of participants who play deliberately according to deterministic rules. We construct an environment and a test allowing us to assess whether subjects use a deterministic rule, even for subjects whose rules are unknown to us. If existing behavioral game theory models capture most subjects who deliberately use a deterministic rule, then we expect subjects matching behavioral game theory types to score better on our test than other subjects. Other well-performing subjects who do not match existing models indicate that there is deliberate behavior that may be captured by future models. The second question concerns the boundaries of classification methods given an existing set of behavioral types. We provide a quantification of the downside of classifying a greater number of subjects.

In our experiments, subjects first play a sequence of 20 two-player guessing games (of the form in Costa-Gomes and Crawford, 2006, henceforth CGC) with anonymous opponents and without feedback. We then present each subject a series of strategic decisions dependent on her original choices. These second-phase choices are unanticipated and we will show that exact purely numeric memory of Phase I choices is very limited. Our two main treatments differ only in their Phase II design.

In Phase II of the Replicate treatment, a subject plays the same 20 games in the same order and player role as in Phase I. However, subjects are now paid as a function of how close their Phase II guesses are to their corresponding Phase I guesses. Any subject who deliberately uses a well-defined deterministic rule and is aware of doing so should be able to replicate her actions.

Given reasonable assumptions of self-awareness and cognitive ability, a failure to replicate one’s actions suggests substantial idiosyncratic randomness in decisions.

In Phase II of the BestRespond treatment, participants play the same 20 games once more in the same order; however, they now take the role previously occupied by their opponents.

The subject is informed she plays against a computer whose guess in each game is the guess she herself made in that game in Phase I; however, she is not reminded of her exact Phase I choices.

In effect, subjects are playing against their own first-phase behavior.³ The payoff-maximizing choice is the best-response in that game to the subject’s original Phase I action. We reason that any subject who is playing purely according to a deliberate rule in Phase I, is aware of doing so, and who best-responds to beliefs about her opponents’ play, should be able to first replicate her former guess and then find the best-response to it.

² In addition to the works mentioned above, some leading examples of papers that seek to classify participants are Costa-Gomes, Crawford, and Broseta (2001) for normal form games, and Crawford, Gneezy, and Rottenstreich (2008) for coordination games.

³ This treatment is inspired by the design in Ivanov, Levin, and Niederle (2010) but has important differences we discuss below. A design in which subjects play against themselves is also a central component of Blume and Gneezy (2010).

Our results show most subjects are unsuccessful at replicating or best-responding to their past behavior; in both treatments, fewer than half of the participants exceed a permissive 40% threshold for the number of optimal Phase II choices. This aggregate failure suggests that much of the observed behavior is idiosyncratic, even to the decision-makers themselves. However, such a pessimistic outlook for the overall scope of behavioral game theory is counterbalanced by the success of its existing models in explaining the deliberate subjects.

Using a conventional approach and the Phase I observations, we classify subjects as matching given behavioral types (from a set of models that includes equilibrium, level-k, and others). Subjects whose behavior does not match better than a specific threshold are left unclassified. We find that a vast majority of subjects classified as matching a behavioral type are able to replicate and best-respond to their past behavior. Furthermore, classified subjects are substantially more successful in replicating or best-responding to their former guesses than are subjects not classified as a behavioral type. These results unequivocally confirm the success of existing behavioral models like level-k in accurately describing the decision-making of deliberate subjects. The remaining subjects as a group are different and cannot be described as equally deliberate decision makers who use well-defined rules in a similar manner.

In addition, our environment allows us to distinguish between strategic behavior and systematic but non-strategic behavior. For instance, while the level-k model is often described as best-response to non-equilibrium beliefs, some researchers have argued that the lowest levels, such as L1, may instead arise as players using “rules of thumb”, that is, behaving in systematic but not very strategic ways. Our BestRespond treatment is more strategically demanding than the Replicate treatment, requiring a change in behavior in response to the change in opponent. As the additional strategic reasoning required is minimal, the distinction might appear trivial; however, our empirical results show it to be significant. We show that classified subjects can best-respond to their former guesses just as well as they can replicate them. In contrast, unclassified subjects fail to best-respond to their past behavior far more often than they fail to replicate it. Subjects who can replicate their guesses but fail to best-respond to them are probably better described as using rules of thumb than as best-responding to beliefs.

Overall, only about 40% of subjects who have high rates of replicating their former guesses are classified as behavioral types, suggesting appreciable room for additional behavioral game theory models. However, existing behavioral models have been very successful in identifying a large majority of strategic subjects. Of subjects who have high rates of best-responding to their former guesses, roughly two-thirds are behavioral types. Note that Nash equilibrium only accounts for about 26% of strategic subjects, while the level-k model accounts for an additional 35%.⁴ There is still room for behavioral game theory models to describe the remaining 32% of strategic subjects.
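The replication and best-response rates referred to above can be illustrated with a minimal sketch. It assumes the payoff structure of the guessing games defined in Section 2.1, so that the optimal BestRespond choice is the target multiple times the subject's own Phase I guess, pulled into the allowable range. The function names, the tolerance `tol`, and the tuple layout of `roles` are hypothetical choices made for illustration and are not taken from the paper; only the 40% threshold comes from the text.

```python
def clamp(value, lower, upper):
    """Pull a value into the allowable range [lower, upper]."""
    return max(lower, min(upper, value))


def replicate_rate(phase1, phase2, tol=0.5):
    """Share of games in which the Phase II guess matches the Phase I guess.

    `tol` is an assumed tolerance, not a value from the paper.
    """
    hits = sum(abs(g2 - g1) <= tol for g1, g2 in zip(phase1, phase2))
    return hits / len(phase1)


def best_respond_rate(phase1, phase2, roles, tol=0.5):
    """Share of games in which the Phase II guess best-responds to the Phase I guess.

    `roles` holds (lower, upper, target) for the role taken in Phase II,
    i.e. the role previously occupied by the opponent.
    """
    hits = 0
    for g1, g2, (lower, upper, target) in zip(phase1, phase2, roles):
        optimal = clamp(target * g1, lower, upper)
        hits += abs(g2 - optimal) <= tol
    return hits / len(phase1)


def exceeds_threshold(rate, threshold=0.40):
    """True if the share of optimal Phase II choices exceeds the 40% threshold."""
    return rate > threshold
```

A subject's list of Phase I guesses and her list of Phase II guesses, in the same game order, are the only inputs needed. Whether a near miss should count as optimal is a design choice that the tolerance makes explicit.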

In the second part of the paper we confine attention to an existing set of behavioral game theory types. In addition to our previous approach, we now use maximum likelihood estimation to classify subjects. As we relax classification criteria, more subjects are classified as behavioral types. Our design allows us to show the trade-off between classifying more subjects and capturing subjects whose Phase II type matches that implied by their Phase I classification.

The paper proceeds as follows: Section 2 describes the experiment and Section 3 the non-parametric classification of subjects. In Section 4 we present results pertaining to the predictability of classified and non-classified subjects using a model-free approach. Section 5 confines attention to behavioral game theory types and discusses the trade-off between classifying more subjects and capturing subjects whose behavior conforms to the behavior predicted by their Phase I play. Section 6 discusses the literature and we conclude in Section 7.

2 The Experiment

2.1 Two-Person Guessing Games

Participants interact in simple complete information “two-person guessing games”.⁵ In a two-person guessing game, player $i$ facing opponent $j$ wishes to guess as close as possible to her goal, which equals her target multiple $t_i$ times her opponent’s guess $x_j$. Likewise, player $j$’s goal equals his target multiple $t_j$ times $x_i$. Each player has a range of allowable guesses $[l_i, u_i]$, and the two players simultaneously submit guesses $x_i$ and $x_j$. The payoff of $i$ is a function of the realized distance from the player’s guess $x_i$ to her goal $t_i x_j$, $e_i = |x_i - t_i x_j|$; the function strictly decreases until the payoff reaches zero. We present the 20 games used in the experiment, as well as the predictions of various behavioral game theory models, in Table 1. Further details are given later in this section.
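The structure of a two-person guessing game can be made concrete with a short sketch. Because the payoff strictly decreases in the distance $e_i$, the best response to a known opponent guess is the goal $t_i x_j$ pulled into the allowable range $[l_i, u_i]$. The `Role` class, the function names, and the example parameters are hypothetical. Representing a level-0 player by the midpoint of her range is an illustrative simplification assumed here, not a specification taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    """One player's parameters in a two-person guessing game."""
    lower: float   # l_i, smallest allowable guess
    upper: float   # u_i, largest allowable guess
    target: float  # t_i, target multiple


def clamp(value, lower, upper):
    """Pull a value into the allowable range [lower, upper]."""
    return max(lower, min(upper, value))


def error(own_guess, own_role, opp_guess):
    """Distance e_i = |x_i - t_i * x_j| between a guess and the goal."""
    return abs(own_guess - own_role.target * opp_guess)


def best_response(own_role, opp_guess):
    """Allowable guess that minimizes e_i against a known opponent guess."""
    return clamp(own_role.target * opp_guess, own_role.lower, own_role.upper)


def level_k_guess(k, own_role, opp_role):
    """Level-k guess, ASSUMING level-0 guesses the midpoint of its own range."""
    if k == 0:
        return (own_role.lower + own_role.upper) / 2
    # A level-k player best-responds to a level-(k-1) opponent.
    return best_response(own_role, level_k_guess(k - 1, opp_role, own_role))


# Hypothetical game parameters, chosen only for illustration.
own = Role(lower=100, upper=500, target=0.7)
opp = Role(lower=100, upper=900, target=1.5)
print(level_k_guess(1, own, opp), level_k_guess(2, own, opp))
```

Swapping the two roles in the recursive call is what lets each level reason from the opponent's perspective, so a single function covers both players.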

⁴ The dominance-k model adds another 6%.

⁵ Another “two-person guessing game” is that of Grosskopf and Nagel (2008). They consider the familiar “p-beauty contest” guessing game where n players guess a number between 0 and 100, and the winner is the player closest to p times the mean of all submitted guesses, with p < 1. When n = 2, as in their experiments, guessing 0 becomes a dominant strategy. We opt for CGC games as they allow us to have subjects play many different games in which models that have agents best-respond to beliefs result in different actions.
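The dominance claim for the two-player p-beauty contest can be checked by brute force. The sketch below assumes integer guesses between 0 and 100, the illustrative value p = 2/3, and a payoff of 1 for a win, 1/2 for a tie and 0 for a loss; these are assumptions made for the check, not details given in the text. With two players the target p times the mean always lies below the midpoint of the two guesses, so the lower guess is at least as close to it.

```python
from fractions import Fraction


def payoff(own, other, p):
    """Win (1), tie (1/2) or loss (0) for `own` in a two-player p-beauty contest."""
    target = p * Fraction(own + other, 2)
    own_dist, other_dist = abs(own - target), abs(other - target)
    if own_dist < other_dist:
        return Fraction(1)
    if own_dist == other_dist:
        return Fraction(1, 2)
    return Fraction(0)


def zero_is_weakly_dominant(p, grid):
    """Check that guessing 0 never does worse than any alternative guess on the grid."""
    return all(
        payoff(0, other, p) >= payoff(alt, other, p)
        for alt in grid
        for other in grid
    )


print(zero_is_weakly_dominant(Fraction(2, 3), range(0, 101)))
```

Exact rational arithmetic via `Fraction` avoids misclassifying ties through floating-point rounding. The check covers weak dominance only, which is the relevant notion here since two identical guesses always tie.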


2.2 Experimental Treatments

All treatments but one shared a common two-phase structure. In Phase I, subjects played a series of 20 two-person guessing games against anonymous opponents without feedback. Game parameters were public information in all games and were presented as in Figure 1. In Phase II, subjects were tasked with either replicating or best-responding to their own first-phase choices in the same series of games.

The experiment consisted of the Replicate, BestRespond, ShowGuesses, and Memory treatments. The Phase I tasks of the Replicate, BestRespond, and ShowGuesses treatments were the same and are described in a single subsection below. We then explain Phase II of each of these treatments separately. Finally, we discuss the Memory treatment.

2.2.1 Phase I of the Replicate, BestRespond, and ShowGuesses Treatments

Subjects played all 20 games in individually-specific random orders without feedback on realized payoffs or opponents’ guesses. Subjects were randomly and anonymously rematched with opponents before each game.⁶ Subjects always saw themselves in the role of player 1 (called “Decision Maker 1” or “DM1”) in instructions and the experimental task, as shown in Figure 1. If $i$ was matched to opponent $j$ in a given trial, she wished to make a guess $x_i$ as close as possible to her goal $t_i x_j$ and earned a payoff decreasing in $e_i = |x_i - t_i x_j|$.

2.2.2 Phase II of the Replicate Treatment

In Phase II of the Replicate treatment, a subject faced the same sequence of 20 games from Phase I in the same order.⁷ Participants were told they would be paid as a function of how close their Phase II guess was to the guess they made in Phase I in the same game. The payoff

⁶ Unknown to subjects, in each session we split them into two equal-sized groups, G1 and G2. The group specified the game role in each of the 20 games; that is, the matching was such that each pair of opposing players consisted of one member of G1 and one member of G2.

⁷ The motivation behind the preservation of order across phases was twofold. First, subjects may switch rules during Phase I and only remember the number of games played before the switch; they may not remember the specific games for which they used each of their rules. Second, for every game in Phase I, the subject made the same number of guesses, 19, before seeing that same game again in Phase II. On average, for both this treatment and the BestRespond treatment, 45 minutes passed between playing a given game in Phase I and playing that same game in Phase II.

Table 1.—The 20 two-person guessing games and behavioral game theory type guesses


