Multisensory Perception

30 June 2015 | Alisa Mandrigin (University of Warwick)

We see, hear, touch, taste and smell. This is what common sense tells us about perception.

The view that the nature and breadth of our perceptual experience are accurately captured by those five verbs, and that the sensory systems are discrete and isolated from one another, has governed much of the philosophical research into perception. Recently, though, researchers in psychology and neuroscience have started to focus on the myriad interactions between the senses and the many other kinds of sensory information available to the nervous system.

How do we know that the sensory systems interact with one another? Some interactions result in effects that feature in everyday experience. 

For example, if we are presented with a light flash and a beep at different locations but at the same time and then asked to locate the beep, we judge it to be at, or at least close to, the location of the light flash (Bertelson, 1999). Judgements about location are one of the measures of the ventriloquism effect: the mis-localisation in perceptual experience of auditory objects or events as a result of seeing something at a different location.

We can measure the effect in the laboratory with visual and auditory stimuli, but in everyday life we are often in situations in which we acquire spatially discrepant information about what is apparently the same source. Ventriloquists make use of the effect in their acts, hence the name given to the effect. At the cinema the film’s soundtrack is played through speakers spread out around the screening room, and not from behind the parts of the screen on which visual images of moving lips, explosions, and so on are presented.

Another visual-auditory interaction gives rise to the McGurk effect. If we see a video clip of lip movements that should produce the phoneme /ga/ with an audio recording of the phoneme /ba/ dubbed over the top, the result is perception of the phoneme /da/ (McGurk & MacDonald, 1976). Again, it seems that processing in the visual system influences processing in the auditory system. 

We can appreciate this by listening to the same auditory stimulus with our eyes closed: without the visual stimulus there is no effect. You can try it for yourself by viewing this video:

Multisensory interactions are not limited to vision and audition. We have evidence of interactions between all five of the sensory systems, as traditionally conceived. It’s only now, though, that philosophers are beginning to take proper notice of the implications of these discoveries for perceptual experience. 

This brings us to a question that seems to be important if we are to make sense of these interactions and their consequences for perceptual experience: how do we distinguish the senses? Can we distinguish the senses on the basis of the different kinds of experience produced, or should we distinguish them on the basis of distinct sensory processing systems in the brain, or by means of the nature of the proximal stimulus of the experience, or by something else entirely (Macpherson, 2011)? 

There’s an analogous question about kinds of experience: How can we distinguish experiences from one another as being, for example, visual or auditory?

Settling on an answer to these questions seems to be necessary if we are to make any headway in classifying interactions as multisensory and deciding whether these interactions result in multimodal perceptual experiences.

For example, our perceptual experience when we eat and drink involves retro-nasal smell—the sensing of odours when we breathe out—as well as taste. When you’ve had a blocked nose you’ve probably noticed this, finding food to be temporarily flavourless and insipid. 

One response to this has been to claim that we have a distinct kind of flavour experience, resulting from interactions between the olfactory and taste systems (Smith, 2013). This view conceives of the experience as multisensory in so far as it involves processing in two distinct sensory systems, but the experience itself is not taken to be multimodal, since it is thought of as being a kind of experience in its own right, distinct from either smelling or tasting. The matter is complicated further by evidence that what we see and hear, and tactile sensations within the mouth, also contribute to our perceptual experience when we eat (Auvray & Spence, 2008).

Even if we can settle on a way of distinguishing the senses, there are further questions about the kinds of interactions that take place between the sensory systems. 

One kind of interaction might involve mere modulation of processing in one sensory system by processing in the other. Another kind of interaction might involve the integration of redundant information across the senses. A further kind of interaction might involve the binding together of information about different properties of the same object. 

For example, when you look at a key that you hold in your hand, visual information about colour might be bound together with tactile information about texture, generating a multisensory representation of the key as smooth and silver (O’Callaghan, 2014). 

These different kinds of interaction may have different kinds of impact on perceptual experience. What, for example, is the nature of the interaction between vision and audition in ventriloquism and how does it impact perceptual experience?

One approach to ventriloquism explains the effect in terms of the modulation of information in audition by information in vision. Ventriloquism is often measured by subjects’ pointing responses to the auditory stimulus. Subjects point to a position in between the actual locations of the auditory and the visual stimuli. How can we explain this in terms of modulation? 

We can say that the conflicting visual information about location modifies the auditory information about location (and vice versa). The result is that subjects hear the auditory stimulus as being in between the actual position of the auditory and the visual stimuli. This explanation of ventriloquism is consistent with perceptual experiences remaining modality-specific throughout.

There is, however, an alternative explanation of the mis-localisation of auditory stimuli in ventriloquism. This alternative explains subjects’ pointing behaviour in terms of the integration of conflicting spatial information. 

If sensory information is integrated, it seems possible that this integration will result in a single multimodal experience of an object at a location in space, in this case an audio-visual experience. If there is integration (or binding) of information across the senses, then we need to give some account of how the sensory systems determine that information belongs together.

The issues I’ve mentioned here offer just one avenue that we can pursue in rethinking and revising our views of perceptual experience in light of empirical discoveries about multisensory processing. Another avenue concerns crossmodal correspondences. 

We reliably match, for example, high-pitched sounds with bright lights, high spatial elevations or small objects (Spence, 2011). How, though, are these associations between what seem to be different kinds of properties established? Are pairs or groups of apparently unrelated features of objects, or dimensions of stimuli, encoded in the brain in the same way?

A further line of research concerns synaesthesia. In some cases of synaesthesia an experience in one sensory modality seems to induce an experience in another, non-stimulated sensory modality. For instance, for some synaesthetes, hearing sounds causes them to have colour experiences. Franz Liszt and Olivier Messiaen reportedly experienced colours when they heard particular tones in this way. 

As with crossmodal correspondences, synaesthetic experience is reliable and robust: hearing particular tones consistently induces experiences of particular colour hues. How do we explain the phenomenon? Do synaesthetes have two distinct modality-specific experiences (an auditory experience and a colour experience, for example), or are their experiences altogether different, experiences of coloured sounds, for instance (Deroy, in press)?

We are just now starting to understand the many and varied interactions that occur across the sensory systems and their impact on perceptual experience. What is clear, though, is that the multisensory nature of perceptual experience is relevant not just to those who work on the philosophy of perception or in psychology, or to those who work in the arts or in marketing, but to all of us, simply because we are perceivers.


Auvray, M. & Spence, C. (2008). The multisensory perception of flavour. Consciousness & Cognition. 17. p. 1016–1031. doi:10.1016/j.concog.2007.06.005

Bertelson, P. (1999). Ventriloquism: a case of cross-modal perceptual grouping. In Aschersleben, G., Bachmann, T., & Müsseler, J. (eds.). Cognitive contributions to the perception of spatial and temporal events. Amsterdam: Elsevier.

Deroy, O. (in press). Can sounds be red? A new account of synaesthesia as enriched experience. In Coates, P. & Coleman, S. (eds.). Phenomenal qualities. Oxford: Oxford University Press.

Macpherson, F. (2011). Cross-modal experiences. Proceedings of the Aristotelian Society. 111 (3). p. 429–468. doi:10.1111/j.1467-9264.2011.00317.x

McGurk, H. & MacDonald, J. (1976). Hearing lips and seeing voices. Nature. 264. p. 746–748. doi:10.1038/264746a0

O’Callaghan, C. (2014). Not all perceptual experience is modality specific. In Stokes, D., Matthen, M. & Biggs, S. (eds.). Perception and Its Modalities. Oxford: Oxford University Press.

Smith, B. C. (2013). Philosophical Perspectives on Taste. In Pashler, H. (ed.). The Encyclopaedia of Mind. Newbury Park, CA.: Sage.

Spence, C. (2011). Crossmodal correspondences: A tutorial review. Attention, Perception, & Psychophysics. 73. p. 971–995. doi:10.3758/s13414-010-0073-7