Reflections on Mirror Neurons and Our New Insights on Embodied Simulation

Amanda Gillis-Furutaka & Curtis Kelly


In 1992, Rizzolatti and a team of researchers at the University of Parma recorded from motor neurons in macaque monkeys' brains to map muscle-motor interactions. Then, as the story goes (Taylor, 2016), during a break, one of the researchers started eating lunch in the same room. The monitors began showing an odd pattern. Whenever the researcher grasped his lunch and lifted it for a bite, the neurons in the monkeys' brains for the same grasping and lifting actions fired as well, as if the monkeys were doing the eating. That is how mirror neurons were discovered.

This discovery was a big event and it got a lot of buzz. “What?” the world of science said. “Why are the monkey brains mirroring these actions? Is that what enables learning through mimicry? Have we discovered the secret of empathy?” These interpretations, especially in regard to empathy, have been both toasted and taunted over the years (for: Ramachandran, 2020; against: Hickok, 2014). Even though the empathy view has lost considerable ground, new studies like this one were still coming out in its support as recently as a year ago.

While in no position to make authoritative claims, I wonder if the way we have framed mirror neurons and drawn conclusions about them might have been misleading. First of all, what people write about mirror neurons makes them seem separate from all the other sensory and motor neurons, existing in a special system devoted just to mirroring. Yet I have seen nothing in the literature that says outright that mirror neurons are a special, separate system. So, instead of saying “mirror neurons,” maybe we should say that some “neurons mirror.” Indeed, that is more in line with Anderson’s neural reuse hypothesis.

In the last decade, Anderson caused a paradigm shift in how we think about the brain. Throughout history, we have tended to portray neural cortices, areas, and circuits as each having a specific function, much like the components on a computer motherboard. The motor cortex just controls muscles; Broca’s area just processes language. But neither assumption is true.

Anderson overturned this single-purpose thinking by showing how a particular group of neurons is likely to be pulled into all kinds of disparate tasks, just because that bit of brain tends to be good at the kind of processing those tasks need. For example, the intraparietal sulcus, a part of the brain that keeps track of our fingers, is also activated when we compare number sizes, among other functions. In fact, a particular circuit might be pulled into a processing team one time and then left out the next. Neural circuits are being redeployed in coalitions like this all the time.


You’re right. Dehaene (2020) identifies the same sort of process, which he calls neuronal recycling, and demonstrates how it takes place not just when we compare numbers, but also when we read. One example is the way the brain of a literate English-speaking person uses the face-recognition system to distinguish between different letters of the alphabet and then between words. The area repurposed in this way is referred to as the Visual Word Form Area (Dehaene video). When the silent letters on a page are deciphered, the word can then be “heard” in the auditory cortex and recognized, a phenomenon Baddeley (2003) calls the phonological loop. Bergen (2012) explains why we “hear” the words in our heads when we read. Although we make no sound when we produce inner speech, we use the same parts of the motor cortex that control the tongue and mouth. In other words, we do not actually move our mouths when thinking or reading silently, but the regions of our brain that control speech are activated, and therefore we hear what we are reading as if we were articulating the words. This process seems similar in some ways to what the discoverers of mirror neurons observed in the monkeys, but rather than responding to physical actions they see, readers respond to symbols on a page by hearing, in their mind’s ear, the spoken language the symbols represent. This is far more complex than mere imitation.


So, in trying to figure out why neurons mirror others’ actions, the early conclusions, mimicry and empathy, seem far too narrow. There seems to be something bigger going on, much bigger. Saying that mirror neurons exist so that we can empathize is like saying teeth exist so that we can smile: not wrong per se, but only one brushstroke in a much bigger picture. I think your comment on how reading words activates the same motor neurons as articulating them, even when there isn’t any speaking, is a big clue. I believe that motor firing is not just connected to the mental model of a word; it is the mental model.

So, this brings us to a fascinating theory in neuroscience, embodied simulation. It is described in Bergen’s (2012) book, Louder than Words, and might be the single most important concept in neuroscience for a language teacher to know. He says that, basically, neurons mirror in order to make meaning from what we observe.[1] Make meaning! As you can see, we are going far beyond the teeth-are-just-for-smiling way of thinking.

Sensory input is messy. Our senses are bombarded with gigabytes of mechanical and electromagnetic energy that our brains have to sort into sources, entities, and processes. To figure out what things are, we match the sensory input to models we already have in our heads, models laid down through previous encounters and experiences. These mental models exist as neural representations in one of the most amazing encyclopedias you could imagine: the groups of sensory and motor neurons that fire together for the sensory input of one entity (dark, breathy, heavy background music: Darth Vader!) become increasingly tied together the more times that particular set of firings happens. That is how we make mental models.

[1] To be fair, many others said this before Bergen. See Caroline Handley’s article in this issue.

If I see someone holding a sandwich, the visual input of that sandwich will also fire the olfactory, somatosensory (touch), and taste sequences recorded in my representation (mental model) of eating a sandwich. If I see someone lifting it, that particular sequence of visual input fires my own motor neurons for lifting, because that is how I recorded in my brain what lifting is. These representations were built up over decades of generally positive experiences with sandwiches, and thereby connected to other networks, such as lunch bags, spreading mustard, chewing, convenience, mom, hoagies, or whatever other features my sandwich experiences included, especially features that showed up repeatedly.

Once these models are made, identifying objects becomes much faster. Once a particular sequence starts firing, with the emphasis on “starts,” the rest of the model becomes activated as well, even before all the sensory input has arrived. That is what allows me to jump back almost instantly when I see a snake in the grass. The downside is that sometimes it is really just a stick, but the speed of my reaction is what matters.

The important part is that all these components of a model were first forged in my own physical experiences. When I learned to lift things as a baby, my brain made a model for lifting. When I ate my first sandwich, my brain made a model for sandwich that included the smell, texture, taste, appearance, and maybe a crunching sound if it was toasted. Since the models are based on my own physical experiences, this gives us a different picture of why neurons in the motor and sensory cortices are mirroring. Rather than mirroring for the less essential purposes of mimicry and empathy, mirroring represents the activation of our own internal models of the same action, thereby making meaning of what we see. Applying what I learned in my own direct experiences lets me understand others’ actions. Mirroring, although I’m not sure we should even use that term, is the fundamental way we make meaning.


Yes! Seeing the actions of others activates models in our mind, and so does language. Let’s go back to what I was explaining about the Visual Word Form Area and its role in reading an alphabet-based language. When we become efficient at recognizing combinations of letters automatically as words, the phonological loop can be bypassed and the visuo-spatial sketchpad is called into service to help us “see” in our mind’s eye what the words of the text are saying. But the BIG question is, how do we actually understand what we see and what the writer means?


It is likely that we draw on our experience and knowledge of the world to form movie scenarios in our mind. The brain does this by calling into action all the parts of our brain that deal with the physical and emotional sensations that we are reading about and which we also use when we are experiencing them firsthand.


So, you are saying that language activates the circuits for the physical and emotional sensations we are reading about as if we were experiencing them firsthand! This fits embodied simulation perfectly! On hearing a word, a particular sequence of neurons for the sound of the word is activated in the auditory cortex. That auditory sequence, part of the larger mental model, then activates all the downstream circuits also wired into that model, including those related to visual images and motor actions. As a result, we see the “movie,” as you put it. But let me add that it is a super movie with feel, smell, emotion, predicted outcomes, and much more in it.

In the same way the sensory input of long, skinny, and brown activated a model that made me perceive a snake in the grass, the word “snake” would activate that model as well. Likewise, if I hear or read the phrase “lift a sandwich,” the same networks fire up again as if I were seeing someone do it, or as if I were doing it myself. Words and phrases trigger the firing of the neural networks representing models the same way sensory input does. Embodied simulation (“embodied” meaning it happens in our sensory and body-related networks) is how we make meaning from language, too. That is amazing.


Exactly! And Bergen gives us a very memorable example of how we process language through sensory simulation:

“When hunting on land, the polar bear will often stalk its prey almost like a cat would, scooting along its belly to get right up close, and then pounce, claws first, jaws agape” (2012, p. 15). When reading this sentence, people probably see a frozen white landscape, possibly hear the sound of the bear sliding over the ice, and may smell the freezing air and feel the cold even though this information is not included in the passage. Bergen explains that meaning is “a creative process in which people construct virtual experiences–embodied simulations–in their mind’s eye” (p. 16). These words activate neural routines that give them meaning, sensory routines based on all our prior knowledge about polar bears, their appearance, and their habitat, not to mention the routines related to “stalk,” “scoot,” “ice,” and so on. That is how we make that mental image.

It seems that to understand what is happening, either in our immediate surroundings or on a page or screen filled with written text, the brain calls into action everything it has learned through the firsthand experience of our senses and carries out a vicarious simulation. It mirrors, matching and comparing what it already knows to what is happening and new.


I wonder, though. A lot of what we wrote above about mirroring is based on our own views of embodied simulation. I haven’t found much in the literature that supports this interpretation. The literature talks about embodied simulation and mirror neurons, but rarely together, and the debate on both topics is still pretty fierce. 

And meaning from language based on our own direct experiences? Embodied simulation might explain how we understand the word “lift” because we have had the direct experience of lifting things.  But what about abstract concepts like “fame” or “justice”?  Amanda, are we going too far?


Embodied simulation certainly makes sense for sandwiches and polar bears, but you are right to raise the question of how it would work for abstract concepts and words. We could write an entire article on that fascinating topic. Fortunately, both Brian and Caroline are going to discuss abstract language in their articles, coming right after this one.

So, for now, let us just leave our readers with the following delightful illustration of how we are not the only species to understand abstract concepts such as justice and injustice, fairness, and equal pay for equal work.

Frans de Waal and his team carried out a series of experiments on animal cooperation, empathy, and sense of fairness. Watch the last part of his talk (de Waal, 2012) on capuchin monkeys to see what happened when the monkeys received unequal pay for the same work. This experiment has since been replicated with other species (dogs, birds, and chimpanzees), so it is clear that certain abstract concepts can be understood and acted upon by several species. Judging from the behavior of the animals de Waal has worked with, the abstract concepts that many species understand are closely related to emotional states.


Seeing that monkey mirrors how I sometimes feel about my learners when they can’t remember the words I taught them. But now I realize that my way of teaching vocabulary, giving them word lists to memorize, is not very brain-friendly. If the meaning of language exists in rich motor and sensory images, I should include all those cues as well. Teaching words removed from a meaningful context, such as a story or physical gesture, reduces their potential to be internalized.


And an additional brain-friendly step when teaching vocabulary is to have the students create sentences about themselves, or people they know, using the new words, and then share their ideas (and the backstories) with each other. By doing this, they will be 1) setting up multiple neural networks for the new words when they connect these words to their own experiences and 2) firing the networks again when explaining the stories to their classmates, all of which will aid the learning process.

Amanda Gillis-Furutaka, PhD, program chair of the JALT Mind, Brain, and Education SIG, is a professor of English at Kyoto Sangyo University in Japan. She researches and writes about insights from psychology and neuroscience that can inform our teaching practices and improve the quality of our lives, both inside and outside the classroom.

Curtis Kelly (EdD) was the first coordinator of the JALT Mind, Brain, and Education SIG. He is a professor at Kansai University in Japan. He has written over 30 books and 100 articles, and given over 400 presentations. His life mission is “to relieve the suffering of the classroom.”
