Trusting the Uncanny Valley: Exploring the relationship between AI, mental state ascriptions, and trust.

25 July 2016 | Henry Powell (University of Warwick)

Interactive artificial agents such as social and palliative robots have become increasingly prevalent in the educational and medical fields (Coradeschi et al. 2006). Different kinds of robots, however, seem to engender different kinds of interactive experiences in their users.

Social robots, for example, tend to afford positive interactions that look analogous to the ones we might have with one another. Industrial robots, on the other hand, are rarely, if ever, treated in the same way. Some very lifelike humanoid robots seem to fit somewhere outside of these two spheres, inspiring feelings of discomfort or disgust in people who come into contact with them.

One way of understanding why this phenomenon obtains is via a conjecture developed by the Japanese roboticist Masahiro Mori in 1970 (Mori, 1970, pp. 33–35). This so-called “uncanny valley” conjecture has a number of potentially interesting theoretical ramifications. Most importantly, it may help us to understand a set of conditions under which humans could potentially ascribe mental states to beings without minds – in this case, the suggestion is that trusting an artificial agent can lead one to do just that.

With this in mind, the aims of this post are two-fold. Firstly, I wish to provide an introduction to the uncanny valley conjecture, and secondly, I want to raise doubts concerning its ability to shed light on the conditions under which mental state ascriptions occur, specifically in experimental paradigms that see subjects as trusting their AI co-actors.

Mori’s uncanny valley conjecture proposes that as robots increase in their likeness to human beings, their familiarity likewise increases. This trend continues up to a point at which their lifelike qualities are such that we become uncomfortable interacting with them. At around 75% human likeness, robots are seen as uncannily like human beings and are viewed with discomfort, or, in more extreme cases, disgust, significantly hindering their potential to galvanise positive social interactions.

Diagram of familiarity and human likeness of different entities

This effect has been explained in a number of ways. For instance, Saygin et al. (2011, 2012) have suggested that the uncanny valley effect is produced when there is a perceived incongruence between an artificial agent’s form and its motion. If an agent is seen to be clearly robotic but moves in a very human-like way, or vice versa, there is an incompatibility effect in the predictive, action-simulating cognitive mechanisms that seek to pick out and forecast the actions of humanlike and non-humanlike objects.

This predictive coding mechanism receives contradictory information from the visual system ([human agent] with [nonhuman movement]), which prevents it from carrying out predictive operations to its normal degree of accuracy (Urgen & Miller, 2015). I take it that the output of this cognitive system is presented in our experience as being uncertain and that this uncertainty accounts for the feelings of unease that we experience when interacting with these uncanny artificial agents.

Of particular philosophical interest in this regard is a strand of research suggesting that, in given situations, humans make mental state ascriptions to artificial agents that fall outside the uncanny valley. This idea was put forward in two studies published in 2012 and 2015 by Kurt Gray & Daniel Wegner and Maya Mathur & David Reichling respectively. As I believe that it contains the most interesting evidential basis for thinking along these lines, I will limit my discussion here to the latter experiment.

Mathur & Reichling’s study saw subjects partake in an “investment game” (Berg et al. 1995) – a generally accepted experimental standard in measuring trust – with a number of artificial agents whose facial features varied in their human likeness. This was to test whether subjects were willing to trust different kinds of artificial agents depending on where they fell on the uncanny valley scale. 
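For readers unfamiliar with the paradigm, the payoff structure of an investment game can be sketched roughly as follows. This is only a minimal illustration of the general game described by Berg et al. (1995), in which an investor sends part of an endowment to a trustee, the amount sent is multiplied in transit, and the trustee decides how much to send back. The function name, the parameter names, and the example amounts are assumptions made for illustration; they are not taken from Mathur & Reichling’s procedure, and the sketch ignores any endowment the trustee might hold.

```python
from dataclasses import dataclass


@dataclass
class Payoffs:
    investor: float
    trustee: float


def investment_game(endowment: float, sent: float, returned_fraction: float,
                    multiplier: float = 3.0) -> Payoffs:
    """Payoffs for one round of a Berg-style investment game (illustrative only)."""
    if not 0 <= sent <= endowment:
        raise ValueError("sent must lie between 0 and the endowment")
    if not 0 <= returned_fraction <= 1:
        raise ValueError("returned_fraction must lie between 0 and 1")
    pot = sent * multiplier             # the amount sent is multiplied in transit
    returned = pot * returned_fraction  # the trustee chooses how much to give back
    return Payoffs(investor=endowment - sent + returned,
                   trustee=pot - returned)


# Hypothetical amounts: full investment, half of the pot returned.
print(investment_game(endowment=10, sent=10, returned_fraction=0.5))
# Full investment, nothing returned.
print(investment_game(endowment=10, sent=10, returned_fraction=0.0))
```

The amount the investor chooses to send is what such studies read as a behavioural measure of trust: sending more only pays off if the trustee returns enough of the multiplied pot.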

What they found was that subjects played the game in a way that indicated that they trusted robots with certain kinds of facial features to act so as to reach a mutually beneficial outcome, rather than one favouring one party or the other. The authors surmised that because the subjects seemed to trust these artificial agents, in a way that suggested that they had thought about what the artificial agent’s intentions might be, the subjects had ascribed mental states to their robotic partners in these cases.

It was proposed that subjects had believed that the artificial agents had mental states encompassing intentional propositional attitudes (beliefs, desires, intentions etc.). This was because subjects seemed to assess the artificial agent’s decision-making processes in terms of what the robot’s “interests” in the various outcomes might be. This result is potentially very exciting, but I think that it jumps to conclusions rather too quickly. I’d now like to briefly give reasons for my thinking along these lines.

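Mathur and Reichling seem to be making two claims in the discussion of their study’s results:

i) that the subjects trusted the artificial agents, and

ii) that this trust implies that the subjects ascribed mental states to them.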

My objections here are the following. I think that the first claim, i), is more complicated than the authors make it out to be and that the second claim, ii), is just not at all obvious and does not follow from i) when i) is analysed in the proper way. Let us address i) first as it leads into the problem with ii).

When elaborated, I think that i) is making a claim that the subjects believed that the artificial agents would act in a certain way and that this action would be satisfactorily reliable. I think that this is plausible but I also think that the form of trust here is not that which is intended by Mathur and Reichling and is thus uninteresting in relation to ii). 

There are, as far as I can tell, at least two ways in which we can trust things. 

The first and perhaps most interesting form of trust is the one expressible in sentences like “I trust my brother to return the money that I lent him”. This implies that I think of my brother as the sort of person who would not, given the opportunity and upon rational reflection, do something contrary to what he had told me he would do.

The second form of trust is that which we might have towards a ladder or something similar. We might say of such objects that “I trust that if I walk up this ladder it will not collapse because I know that it is sturdy”. The difference here should be obvious. 

I trust the ladder because I can infer from its physical state that it will perform its designated function. It has no loose fixtures, rotting parts or anything else that might make it collapse when I walk up it. To trust the ladder in this way, I need not suppose that it commits itself to the action expected of it on the basis of a given set of ethical standards.

In the case of trusting my brother, my trust in him is expressible as a belief that, given the opportunity to choose not to do what I have asked of him, he will choose in favour of that which I have asked. The trust that I have in my brother requires that I believe that he has mental states that inform his choices and help him to choose to act in favour of my asking him to do something.

One form of trust implies the existence of mental states while the other does not. With regard to ii) then, as has just been argued, trust only implies mental states if it is of the form that I would ascribe to my brother in the example just given, but not if that trust is of the sort that we would normally ascribe to reliably functional objects like ladders. So ii) only follows from i) if the former kind of trust is evinced and not otherwise.

This analysis suggests that if we are to believe that the subjects in this experiment ascribed mental states to the artificial agents (or indeed subjects in any other experiment that reaches the same conclusions), then we need sufficient reasons for thinking that the subjects were treating the artificial agents like I would treat my brother and not like I would treat the ladder with respect to ascriptions of trust. Mathur and Reichling offer no such reasons, and thus we have no good reason for thinking that mental state ascriptions were taking place in the minds of the subjects in their experiment. While I do not think that it is entirely impossible that such a thing might obtain in some circumstances, it is just not clear from this experiment that it obtains in this instance.

What I have hopefully shown in this post is that it is important that we proceed with caution when making claims about our willingness to ascribe minds to certain kinds of objects and agents (either artificial or otherwise). Specifically, it is important to do so in relation to our ability to hold such things in seemingly special kinds of relations with ourselves, trust being an important example of this.


Berg, J., Dickhaut, J., & McCabe, K. (1995). Trust, Reciprocity, and Social History. Games and Economic Behavior, 10, 122–142.

Coradeschi, S., Ishiguro, H., Asada, M., Shapiro, S. C., Thielscher, M., Breazeal, C., … Ishida, H. (2006). Human-inspired robots. IEEE Intelligent Systems, 21(4), 74–85.

Gray, K., & Wegner, D. M. (2012). Feeling robots and human zombies: Mind perception and the uncanny valley. Cognition, 125(1), 125–130.

MacDorman, K. F. (2005). Androids as an experimental apparatus: Why is there an uncanny valley and can we exploit it? In CogSci-2005 workshop: Toward social mechanisms of android science (pp. 106–118).

Mathur, M. B., & Reichling, D. B. (2009). An uncanny game of trust: Social trustworthiness of robots inferred from subtle anthropomorphic facial cues. In Human-Robot Interaction (HRI), 2009 4th ACM/IEEE International Conference on, La Jolla, CA (pp. 313–314).

Saygin, A. P. (2012). What can the Brain Tell us about Interactions with Artificial Agents and Vice Versa? In Workshop on Teleoperated Androids, 34th Annual Conference of the Cognitive Science Society.

Saygin, A. P., & Stadler, W. (2012). The role of appearance and motion in action prediction. Psychological Research, 76(4), 388–394.

Urgen, B. A., & Miller, L. E. (2015). Towards an Empirically Grounded Predictive Coding Account of Action Understanding. Journal of Neuroscience, 35(12), 4789–4791.