How stereotypes shape our perceptions of other minds

McGlothlin & Killen (2006) showed groups of (predominantly white) American elementary school children from ages 6 to 10 a series of vignettes depicting children in ambiguous situations. For instance, one picture (above) showed two children by a swing set, with one on the ground frowning, and one behind the swing with a neutral expression. 

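Two things might be going on in this picture: the child behind the swing might have intentionally pushed the other child, or the child on the ground might simply have fallen by accident.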

Crucially, McGlothlin and Killen varied the race of the children depicted in the image, such that some children saw a white child standing behind the swing (left), and some saw a black child (right). Children were asked to explain what had just happened in the scenario, to predict what would happen next, and to evaluate the action that had just happened. 

Overwhelmingly, children were more likely to give the harmful scenario interpretation (that the child behind the swing intentionally pushed the other child) when the child behind the swing was black than when she was white. The race of the child depicted, it seems, influenced whether or not participants made an inference to harmful intentions.

This is yet another depressing example of how racial bias can warp our perceptions of others. But this study (and others like it: Sagar & Schofield 1990; Burnham & Harris 1992; Condry et al. 1985) also hints at a relationship between two forms of social cognition that are not often studied together: mindreading and stereotyping.

The stereotyping component is clear enough. The mindreading component comes from the fact that race didn’t just affect kids’ attitudes towards the target — it affected what they thought was going on in the target’s mind. Although these two ways of thinking about other people — mindreading and stereotyping — both seem to play an important role in how we navigate the social world, curiously little attention has been paid to understanding the way they relate to one another. 

In this post, I want to explore this relationship. I’ll first briefly explain what I mean by “mindreading” and “stereotyping.” Next, I’ll discuss one existing proposal about the relationship between mindreading and stereotyping, and raise some problems for it. Then I will lay out the beginnings of a different way of cashing out this relationship.

First, let’s get clear on what I mean by “mindreading” and “stereotyping.”


In order to achieve our goals in highly social environments, we need to be able to accurately predict what other people will do, and how they will react to us. To do this, our brains generate complex models of other people’s beliefs, desires, and intentions, which we use to predict and interpret their behaviour. This capacity to represent other minds is known variously as theory of mind, mindreading, mentalising, and folk psychology.

In human beings, this ability begins to emerge very early in development. As adults, we use it constantly, in a fast, flexible, and unconscious fashion. We use it in many important social activities, including communication, social coordination, and moral judgment.


Stereotypes are ways of storing generic information about social groups (including races, genders, sexual orientation, age-groups, nationalities, professions, political affiliation, physical or mental ability, and so on) (Amodio 2014). A particularly important aspect of stereotypes is that they often contain information about stable personality traits. Unfortunately, it is all too easy for us to think of stereotypes about how certain social groups are lazy, or greedy, or aggressive, or submissive, and so on. 

According to Susan Fiske and colleagues’ Stereotype Content Model (SCM), there is an underlying pattern to the way we attribute personality traits to groups (Cuddy et al. 2009; Cuddy et al. 2007; Fiske et al. 2002; Fiske 2015). Personality trait attribution, on this view, varies along two primary dimensions: warmth and competence.

The warmth dimension includes traits like (dis-)honesty, (un-)trustworthiness, and (un-)friendliness. These are traits that tell you whether or not someone is liable to help you or harm you. The competence dimension contains traits like (un-)intelligence, skilfulness, persistence, laziness, clumsiness, etc. These traits tell you how effective someone is at achieving their goals.

Together, these two dimensions combine to yield four distinct clusters of traits, each of which picks out a different kind of stereotype:

The Stereotype Content Model (Fiske et al. 2002)

So what do stereotyping and mindreading have to do with one another? There are some obvious differences, of course: stereotypes are mainly about groups, while mindreading is mainly about individuals. But intuitively, it seems like knowing about somebody’s social group membership could tell you a lot about what they think: if I tell you that I am a liberal, for instance, that should tell you a lot about my beliefs, values, and social preferences — valuable information, when it comes to predicting and interpreting my behaviour.

Some philosophers and psychologists, such as Kristin Andrews, Anika Fiebich and Mark Coltheart, have suggested that stereotypes and mindreading may actually be alternative strategies for predicting and interpreting behaviour (Andrews 2012; Fiebich & Coltheart 2015). That is, it may be that sometimes we use stereotypes instead of mindreading to figure out what a person is going to do. 

According to one such proposal (Fiebich & Coltheart 2015), stereotypes allow us to predict behaviour because they encode associations between social categories, situations, and behaviours. Thus, one might form a three-way association between the social category police, the situation donut shops, and the behaviour eating donuts, which would lead one to predict that, when one sees a police officer in a donut shop, he or she will likely be eating a donut. 

A more complex version of this associationist approach would associate social groups with particular trait labels (as per the SCM), and would thus consist in four-way associations between social categories, trait labels, situations, and behaviours (Fiebich & Coltheart 2015; Andrews 2012). Thus, one might come to associate the trait of generosity with leaving large tips in restaurants, and associate the social category of uncles with generosity, and thereby come to expect uncles to leave large tips in restaurants. One might then explain this behaviour by referring to the uncle’s generosity.

The key thing to notice about these accounts is that their predictions do not rely at all upon mental-state attributions. This is by design: these proposals are meant to show that we often don’t need mindreading to predict or interpret behaviour.
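As a rough illustration, the associationist proposal can be sketched as a pair of lookup tables. This is only a minimal sketch in Python, not an implementation taken from Fiebich & Coltheart or Andrews: the table contents, the function names, and the assumption that associations are stored as exact-match keys are all hypothetical, chosen to mirror the police officer and uncle examples.

```python
from typing import Dict, Optional, Tuple

# Hypothetical three-way associations:
# (social category, situation) -> behaviour
THREE_WAY: Dict[Tuple[str, str], str] = {
    ("police officer", "donut shop"): "eating a donut",
}

# Hypothetical four-way route, split into two tables:
# social category -> trait label, then (trait label, situation) -> behaviour
CATEGORY_TO_TRAIT: Dict[str, str] = {
    "uncle": "generous",
}
TRAIT_TO_BEHAVIOUR: Dict[Tuple[str, str], str] = {
    ("generous", "restaurant"): "leaving a large tip",
}


def predict_three_way(category: str, situation: str) -> Optional[str]:
    """Predict a behaviour directly from category and situation."""
    return THREE_WAY.get((category, situation))


def predict_four_way(category: str, situation: str) -> Optional[Tuple[str, str]]:
    """Predict a behaviour via a trait label.

    Returns (behaviour, trait) so the trait can also serve as the
    explanation of the behaviour, or None if no association is stored.
    """
    trait = CATEGORY_TO_TRAIT.get(category)
    if trait is None:
        return None
    behaviour = TRAIT_TO_BEHAVIOUR.get((trait, situation))
    if behaviour is None:
        return None
    return behaviour, trait


print(predict_three_way("police officer", "donut shop"))
print(predict_four_way("uncle", "restaurant"))
```

Nothing in the sketch represents a belief, desire, or intention, which is the point of the proposal. It also simply takes the situation label as given, without saying where that label comes from.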

One problem for this sort of view comes from its invocation of “situations.” What information, one might wonder, is contained within the scope of a particular “situation”? Surely, a situation does not include everything about the state of the world at a given moment. 

Situations are probably meant to pick out local states of affairs. But not all the facts about a local state of affairs will be relevant to behaviour prediction. The presence of mice in the kitchen of a restaurant, for instance, will not affect your predictions about the size of your uncle’s tip. It might, however, affect your predictions about the behaviour of the health inspector, should one suddenly arrive.

Which local facts are predictively useful will ultimately depend upon their relevance to the agent whose behaviour we are predicting. But whether or not a fact is relevant to an agent will depend upon that agent’s beliefs about the local state of affairs, as well as her goals and desires. 

If this is how representations of predictively useful situations are computed, then the purportedly non-mentalistic proposal given above really includes a tacit appeal to mindreading. If this is not how situations are computed, then we are owed an explanation for how the non-mentalistic behaviour-predictor arrives at predictively useful representations of situations that do not depend upon considerations of relevance.

Instead of treating mindreading and stereotypes as separate forms of behaviour-prediction and interpretation, we might instead explore the ways in which stereotypes might inform mindreading. The key to this approach, I suggest, lies in the fact that stereotypes encode information about personality traits. 

In many ways, personality traits are like mental states: they are unobservable mental properties of individuals, and they are causally related to behaviour. But they also differ in one key respect: their temporal stability. Beliefs and desires are inherently unstable: a belief that P can be changed by the observation of not‑P; a desire for Q can be extinguished by the attainment of Q.

Personality traits, in contrast, cannot be extinguished or abandoned based on everyday events. Rather, they tend to last throughout a person’s lifetime, and manifest themselves in many different ways across many different situations. 

A unique feature of personality traits, in other words, is that they are highly stable mental entities (Doris 2002). So when stereotypes ascribe traits to groups, they are ascribing a property that one could reasonably expect to remain consistent across many different situations.

The temporal properties of mental states are extremely relevant for mindreading, especially in models that employ Bayesian Predictive Coding (Kilner & Frith 2007; Koster-Hale & Saxe 2013; Hohwy & Palmer 2014; Hohwy 2013; Clark 2015). To see why, let’s start with an example:

Suppose that we believe that Laura is thirsty, and have attributed to her the goal of getting a drink (G). As goals go, this one is relatively short-term (unlike, say, the goal of getting a PhD). But in order to achieve (G), we predict that Laura must form a number of even shorter-term sub-goals (G1, G2).

But each of these requires the formation of still shorter-term sub-sub-goals (G1a, G1b, G1c, G2a, G2b).

Predicting Laura’s behaviour in this context thus begins with the ascription of a longer-duration mental state (G), followed by the ascription of successively shorter-term mental-state attributions (G1, G2, G1a, G1b, G1c, G2a, G2b).

As mindreaders, we can use attributions of more abstract, temporally extended mental states to make top-down inferences about more transient mental states. At each level in this action-prediction hierarchy, we use higher-level goal-attributions to constrain the space of possible sub-goals that the agent might form. We then use our prior experience to select the most likely sub-goal from the hypothesis space, and the process repeats itself.
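The top-down procedure just described can be sketched as a greedy descent through a goal hierarchy. This is a deliberately simplified sketch rather than a predictive coding model: the sub-goals, the prior probabilities, and the names `SUBGOAL_PRIORS` and `predict_goal_chain` are all invented for illustration, and picking the single most probable sub-goal at each level stands in for what would be full probabilistic inference.

```python
from typing import Dict, List

# Hypothetical hierarchy: each attributed goal constrains the space of
# candidate sub-goals, and each candidate carries a prior probability
# standing in for "prior experience". All values are invented.
SUBGOAL_PRIORS: Dict[str, Dict[str, float]] = {
    "get a drink": {"go to the kitchen": 0.7, "go to a cafe": 0.3},
    "go to the kitchen": {"stand up": 0.9, "ask someone for help": 0.1},
    "stand up": {"push back the chair": 0.8, "lean on the desk": 0.2},
}


def predict_goal_chain(top_goal: str) -> List[str]:
    """Descend from a longer-term goal to ever shorter-term sub-goals.

    At each level the current goal restricts the hypothesis space to its
    own candidate sub-goals, and the most probable candidate is selected.
    The descent stops when a goal has no stored sub-goals.
    """
    chain = [top_goal]
    while chain[-1] in SUBGOAL_PRIORS:
        candidates = SUBGOAL_PRIORS[chain[-1]]
        chain.append(max(candidates, key=candidates.get))
    return chain


print(" -> ".join(predict_goal_chain("get a drink")))
```

The design choice worth noticing is that the higher-level attribution does the pruning: once "get a drink" is attributed, sub-goals belonging to other goals are never even considered.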

Ultimately, this yields fairly fine-grained expectations about motor-intentions, which manifest themselves as mirror-neuron activity (Kilner & Frith 2007; Csibra 2008). Action-prediction thus plays out as a descent from more stable mental-state attributions to more transient ones, which ultimately bottom out in highly concrete expectations about behaviour.

Personality traits, which are distinguished by their high degree of temporal stability, fit naturally into the upper levels of this action-prediction hierarchy. Warmth traits, for instance, can tell us about the general preferences of an agent: a generous person probably has a general preference for helping others, while a greedy person probably has a general desire to enrich herself. These broad preference-attributions can in turn inform more immediate goal-attributions, which can then be used to predict behaviour.
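To make the inferential role of traits concrete, here is a toy calculation in Python. It assumes, purely for illustration, that an observer holds a prior probability that a target is low in warmth, and that the probability of reading an ambiguous act as intentionally harmful depends on that trait. The function name and every number are hypothetical; none of them come from the studies cited in this post.

```python
def p_harmful_intention(
    p_low_warmth: float,
    p_harm_if_low_warmth: float = 0.8,
    p_harm_if_high_warmth: float = 0.2,
) -> float:
    """Probability of attributing a harmful intention to a target.

    Marginalises over the trait (law of total probability):
    P(harm) = P(harm | low warmth) * P(low warmth)
            + P(harm | high warmth) * (1 - P(low warmth))
    All parameter values are illustrative assumptions.
    """
    if not 0.0 <= p_low_warmth <= 1.0:
        raise ValueError("p_low_warmth must be a probability in [0, 1]")
    return (
        p_low_warmth * p_harm_if_low_warmth
        + (1.0 - p_low_warmth) * p_harm_if_high_warmth
    )


# Same ambiguous scene, two different trait priors.
for prior in (0.3, 0.7):
    print(prior, round(p_harmful_intention(prior), 2))
```

On these made-up numbers, the observed behaviour is identical in both cases; only the trait prior differs, and that alone is enough to shift the intention that gets attributed.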

This role for representations of personality traits in mental-state inference fits well with what we know about how we reason about traits more generally. For instance, we often make extremely rapid judgments about the warmth and competence traits of individuals based on fairly superficial evidence, such as facial features (Todorov et al. 2008); we also tend to over-attribute the causes of behaviour to personality traits, rather than situational factors, a phenomenon commonly known as the “fundamental attribution error” or the “correspondence bias” (Gawronski 2004; Ross 1977; Gilbert et al. 1995).

Prioritizing personality traits makes a lot of sense if they form the inferential basis for more complex forms of behaviour prediction. It also makes sense that this aspect of mindreading would need to rely on fast, rough-and-ready heuristics, since personality trait information would need to be inferred very quickly in order to be useful in action-planning.

From a computational perspective, then, using personality traits to make inferences about behaviour makes a lot of sense, and might make mindreading more efficient. But in exchange for this efficiency, we make a very disturbing trade.

Stereotypes, which can be activated rapidly based on easily available perceptual cues, provide the mindreading system with a rapid means for storing trait information (Mason et al. 2006; Macrae et al. 1994). With this speed comes one of the most morally pernicious forms of human social cognition, one that helps to perpetuate discrimination and social inequality.

The picture I’ve painted in this post is, admittedly, rather pessimistic. But just because the roots of discrimination are cognitively deep, we should not conclude that it is inevitable. 

More recent work from McGlothlin and Killen (2010) should give us some hope: while children from racially homogeneous schools (who had little direct contact with members of other races) tended to show signs of biased intention-attribution, McGlothlin and Killen also found that children from racially heterogeneous schools (who had regular contact with members of other races) did not display such signs of bias. Evidently, intergroup contact is effective in curbing the development of stereotypes — and, by extension, biased mindreading.


Amodio, D.M., 2014. The neuroscience of prejudice and stereotyping. Nature Reviews: Neuroscience, 15(10), pp.670–682.

Andrews, K., 2012. Do apes read minds?: Toward a new folk psychology, Cambridge, MA: MIT Press.

Burnham, D.K. & Harris, M.B., 1992. Effects of Real Gender and Labeled Gender on Adults’ Perceptions of Infants. Journal of Genetic Psychology, 15(2), pp.165–183.

Clark, A., 2015. Surfing uncertainty: Prediction, action, and the embodied mind, Oxford: Oxford University Press.

Condry, J.C. et al., 1985. Sex and Aggression: The Influence of Gender Label on the Perception of Aggression in Children. Child Development, 56(1), pp.225–233.

Csibra, G., 2008. Action mirroring and action understanding: an alternative account. In P. Haggard, Y. Rossetti, & M. Kawato, eds. Sensorimotor Foundations of Higher Cognition. Attention and Performance XXII. Oxford: Oxford University Press, pp. 435–459.

Cuddy, A.J.C. et al., 2009. Stereotype content model across cultures: Towards universal similarities and some differences. British Journal of Social Psychology, 48(1), pp.1–33.

Cuddy, A.J.C., Fiske, S.T. & Glick, P., 2007. The BIAS map: behaviors from intergroup affect and stereotypes. Journal of Personality and Social Psychology, 92(4), pp.631–48.

Doris, J.M., 2002. Lack of character: Personality and moral behavior, Cambridge, UK: Cambridge University Press.

Fiebich, A. & Coltheart, M., 2015. Various Ways to Understand Other Minds: Towards a Pluralistic Approach to the Explanation of Social Understanding. Mind and Language, 30(3), pp.235–258.

Fiske, S.T., 2015. Intergroup biases: A focus on stereotype content. Current Opinion in Behavioral Sciences, 3(April), pp.45–50.

Fiske, S.T., Cuddy, A.J.C. & Glick, P., 2002. A Model of (Often Mixed) Stereotype Content: Competence and Warmth Respectively Follow From Perceived Status and Competition. Journal of Personality and Social Psychology, 82(6), pp.878–902.

Gawronski, B., 2004. Theory-based bias correction in dispositional inference: The fundamental attribution error is dead, long live the correspondence bias. European Review of Social Psychology, 15(1), pp.183–217.

Gilbert, D.T. et al., 1995. The Correspondence Bias. Psychological Bulletin, 117(1), pp.21–38.

Hohwy, J., 2013. The predictive mind, Oxford: Oxford University Press.

Hohwy, J. & Palmer, C., 2014. Social Cognition as Causal Inference: Implications for Common Knowledge and Autism. In M. Gallotti & J. Michael, eds. Perspectives on Social Ontology and Social Cognition. Dordrecht: Springer Netherlands, pp. 167–189.

Kilner, J.M. & Frith, C.D., 2007. Predictive coding: an account of the mirror neuron system. Cognitive Processing, 8(3), pp.159–166.

Koster-Hale, J. & Saxe, R., 2013. Theory of Mind: A Neural Prediction Problem. Neuron, 79(5), pp.836–848.

Macrae, C.N., Stangor, C. & Milne, A.B., 1994. Activating Social Stereotypes: A Functional Analysis. Journal of Experimental Social Psychology, 30(4), pp.370–389.

Mason, M.F., Cloutier, J. & Macrae, C.N., 2006. On construing others: Category and stereotype activation from facial cues. Social Cognition, 24(5), p.540.

McGlothlin, H. & Killen, M., 2010. How social experience is related to children’s intergroup attitudes. European Journal of Social Psychology, 40(4), pp.625–634.

McGlothlin, H. & Killen, M., 2006. Intergroup Attitudes of European American Children Attending Ethnically Homogeneous Schools. Child Development, 77(5), pp.1375–1386.

Ross, L., 1977. The Intuitive Psychologist And His Shortcomings: Distortions in the Attribution Process. Advances in Experimental Social Psychology, 10(C), pp.173–220.

Sagar, H.A. & Schofield, J.W., 1990. Racial and behavioral cues in Black and White children’s perception of ambiguously aggressive acts. Journal of Personality and Social Psychology, 39(October), pp.590–598.

Todorov, A. et al., 2008. Understanding evaluation of faces on social dimensions. Trends in Cognitive Sciences, 12(12), pp.455–460.

Thanks to Melanie Killen and Joan Tycko for permission to use images of experimental stimuli from McGlothlin & Killen (2006, 2010).