Multi-word Expressions Made Easy or Difficult: What L1 and L2 Processing Tells Us

By: Harumi Kimura

When I wrote an article for the “Emotion” issue of the Think Tank in 2018, I learned that emotions are predictions our brain makes, not reactions to stimuli from the outside world (Barrett, 2017). Although this claim sounded surprising and went against common beliefs about how our minds work, neuroscientific evidence has accumulated that our brain is, in fact, a prediction-making machine. The brain constantly creates predictions about our senses, cognition, and behaviors, not just our emotions, and creating predictions is indeed its main function. Simply put, our brain uses statistics to make better and more fine-tuned predictions, and when the predictions turn out to be wrong, the brain adjusts and updates its model. This is learning.

Frequency is of central importance. The more we are exposed to a language, the more word combinations become learned as chunks, i.e., multi-word expressions (MWEs) at the basic level and combinations of those at a higher level. These become basic linguistic units themselves, and this reduces the amount of language processing we have to do. Chunking is an effective strategy in language processing. In fact, it is the only way we can function normally. Were we to think of the meaning of each word one by one to process language, we wouldn’t have enough mental resources left over to process overall meaning, and we can actually see that happening with beginners. Instead, our brain learns the patterns of language so that it can predict each utterance as it is happening, which is a huge reduction in cognitive load.

This is the key to predictive language processing. It reduces the processing load so that we have more processing power to use for meaning and relevance. It is crucial to both understanding and retrieving. Our brains use two tools to do predictive processing of language: 1) syntax, which limits the number of words that can come after a particular sequence: “I got wet because it was____” and 2) multi-word expressions. In this article, we will look at the latter, multi-word expressions, which are far more important in a language teaching syllabus than most teachers realize.

Psycholinguists, corpus linguists, cognitive linguists, and SLA researchers who take bottom-up approaches toward language acquisition, like usage-based and exemplar-based theories, support this computational view of language acquisition. It is believed that our brain takes statistics from the incoming language input and that it is sensitive to frequency; thus, language processing is sensitive to the frequency of language usage in context (Ellis, 2002). Every language-learning episode affects a learner’s knowledge of language—that is, a frequency-biased abstraction of patterns, or regularities, identified in language use. All aspects or levels of language are affected by probabilities of occurrence. Ellis (2002) put this idea as follows: “Frequency is thus a key determinant of acquisition because ‘rules’ of language, at all levels of analysis (from phonology, through syntax, to discourse), are structural regularities that emerge from learners’ lifetime analysis of the distributional characteristics of the language input. Learners have to figure language out” (p. 144).
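To make the idea of “taking statistics from the input” concrete, here is a minimal sketch in Python. It is an illustration only, not a model taken from Ellis (2002) or from any study cited here: the function name `next_word_probabilities` and the four toy sentences are invented for this example. It simply counts how often each word follows another and turns those counts into probabilities for the next word.

```python
from collections import Counter, defaultdict


def next_word_probabilities(sentences):
    """Estimate P(next word | current word) from raw bigram counts.

    Hypothetical helper for illustration; real input would be far larger.
    """
    bigram_counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        # Pair every word with the word that immediately follows it.
        for current, following in zip(words, words[1:]):
            bigram_counts[current][following] += 1

    probabilities = {}
    for current, followers in bigram_counts.items():
        total = sum(followers.values())
        probabilities[current] = {
            word: count / total for word, count in followers.items()
        }
    return probabilities


# Invented toy input standing in for a learner's language exposure.
toy_input = [
    "please take a photo",
    "i will take a photo",
    "take a break",
    "take a photo of us",
]

probs = next_word_probabilities(toy_input)

# The most expected continuation after "a" in this toy input.
best_guess = max(probs["a"], key=probs["a"].get)
print(best_guess, probs["a"])
```

In this toy input, “take” is always followed by “a”, and “a” is followed by “photo” three times out of four, so a frequency-sensitive learner would come to expect “photo” after “take a”. Every further exposure shifts these proportions a little, which is the sense in which each language-learning episode affects a learner’s knowledge.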

Studies of recurrent word combinations, called multi-word expressions here, question the traditional distinction between syntactic knowledge and lexical knowledge. It was generally believed that we produce language by correctly combining single words according to grammatical rules, so our utterances are new, creative, and original combinations of words. Well, not really. We operate with a large number of larger, formulaic units and use them in a highly automatic way without thinking about appropriate combinations all the time, because a previous word or words signal the following word(s). Therefore, processing prefabricated chunks is fast, effortless, and economical. Extensive exposure to language use and personal engagement in interaction make this predictive processing possible. Researchers have estimated that multi-word expressions comprise from 20 to over 50 per cent of our total language use (Siyanova-Chanturia & Martinez, 2015). Numbers vary, depending on what corpus data the researchers use, which group(s) of MWEs they investigate, and how they identify MWEs. Let us examine different groups of MWEs below.

What are multi-word expressions?

Multi-word expressions are combinations of words that co-occur frequently. In general, collocations (e.g., take a photo), verb-particle combinations (e.g., put off), binomials (e.g., bride and groom), lexical bundles (e.g., if you look at), prefabricated routines (e.g., What’s up?), and idioms (e.g., kick the bucket) are classified as multi-word expressions. (Formulaic language, that is, language fixed in form, includes one-word expletives such as “Damn!” in addition to MWEs and is therefore the more inclusive category.) Some multi-word expressions have figurative meanings (e.g., icing on the cake), while others have literal meanings (e.g., anti-icing agent). Some are polysemous, having multiple meanings (e.g., come across), and others are not (e.g., put up with). In some MWEs, individual components are more strongly associated with each other (e.g., bread and butter) than in others (e.g., rye bread). Remember that these distinctions are not binary and rigid, but continuous and flexible. Despite all of these differences, language is largely formulaic, and MWEs constitute an integral component of language knowledge.
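The idea that some components are “more strongly associated” than others can be put into numbers. The article does not name a particular measure, so the sketch below is an assumption: it uses pointwise mutual information (PMI), one association score commonly used in corpus work, defined as the base-2 logarithm of P(x, y) / (P(x) P(y)). The function name `pmi` and the toy token list are invented for illustration, and the resulting figures say nothing about real English.

```python
import math
from collections import Counter


def pmi(tokens, first, second):
    """Pointwise mutual information of the adjacent pair (first, second).

    Hypothetical helper: probabilities are simple relative frequencies
    taken from the token list, with no smoothing.
    """
    unigram_counts = Counter(tokens)
    bigram_counts = Counter(zip(tokens, tokens[1:]))

    pair_count = bigram_counts[(first, second)]
    if pair_count == 0:
        # The pair never occurs side by side in this sample.
        return float("-inf")

    p_pair = pair_count / (len(tokens) - 1)
    p_first = unigram_counts[first] / len(tokens)
    p_second = unigram_counts[second] / len(tokens)
    return math.log2(p_pair / (p_first * p_second))


# Invented toy sample; a real study would use a large corpus.
toy_tokens = (
    "strong tea and strong tea with the milk and the tea "
    "and the sugar and the cup"
).split()

strong_tea = pmi(toy_tokens, "strong", "tea")
the_tea = pmi(toy_tokens, "the", "tea")
print(f"strong tea: {strong_tea:.2f}, the tea: {the_tea:.2f}")
```

In this sample “strong” is always followed by “tea”, whereas “the” is followed by several different nouns, so “strong tea” receives the higher score. Because the score is a number on a scale rather than a yes-or-no label, it fits the point that these distinctions are continuous rather than binary.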

As previously noted, by knowing MWEs, we can predict the whole expression by encountering just one part of it. The combinations of constituents in multi-word expressions can be anticipated to a greater or lesser degree because they are fixed and familiar to language users. This predictability makes language processing easier and reduces language users’ cognitive load. Listeners are faster at processing more frequent items, and speakers find it easier to recall more frequent items than less frequent ones. Intuitively, then, MWEs are likely to be stored in language users’ mental lexicon as wholes and retrieved from memory as wholes, so that they can be used as meaningful chunks in real time. Research has demonstrated that although MWEs can be processed holistically as unanalyzed chunks, each constituent in the combination can also be accessed individually (Siyanova-Chanturia & Pellicer-Sánchez, 2019).

L1 speakers use more multi-word expressions than L2 speakers (Siyanova & Schmitt, 2007). In fact, skillful use of multi-word expressions is a hallmark of highly proficient users of the language. Even proficient L2 speakers are known to produce word combinations that are grammatical but unconventional, and that sound slightly odd to native speakers. Furthermore, this is true not only lexically, but phonetically as well, since MWEs are often phonetically reduced and involve connected speech (e.g., wanna, gonna, and whaddaya). In addition, MWEs often realize specific speech acts and help speakers to structure discourse in a socially appropriate way; thus, language is highly conventional, and knowledge of MWEs constitutes an integral part of sociolinguistic competence, or discourse knowledge (Bardovi-Harlig, 2019).

Research has demonstrated that both L1 and L2 readers appear to be sensitive to phrase frequency. Both L1 readers and advanced L2 readers process MWEs faster when the constituents are adjacent (e.g., provide information) compared to when they are nonadjacent (e.g., provide some of the information) (Hernández et al., 2016). It is likely that L2 learners learn to process MWEs faster as they become more proficient, which is good news for L2 learners: Efficient predictive processing of MWEs seems learnable.

On the other hand, lower intermediate learners were found to misinterpret MWEs with non-literal meanings, such as it’s about time, by attributing literal meanings to the individual words (non-literal = something should happen soon vs. literal = it’s a time issue) (Martinez & Murphy, 2011). They were also found to fail to comprehend figurative idioms, such as follow suit, although they might be able to discern the meaning from the context (Boers et al., 2007). Furthermore, L2 readers did not process MWEs as easily as L1 readers when the constituents were non-adjacent (e.g., provide some of the information) (Vilkaité & Schmitt, 2017). It appears that the processing advantage of MWEs almost disappeared for L2 readers when there was an intervening string (e.g., some of the) between the collocating constituents, while it remained consistent for L1 readers. We do not yet know why the predictive processing of non-adjacent collocations is hindered for L2 readers, but this finding may provide a clue for further investigations of L1 and L2 processing.

Teaching multi-word expressions

In this section, let us look at some suggestions from experts for learning and teaching multi-word expressions. Although MWEs are a hugely important tool for predictive language processing, their importance has generally been overlooked. We can correct that problem now by adopting some of the teaching strategies and techniques offered below.

First, L2 learners can benefit from incidental learning of multi-word expressions through extensive reading/listening (Siyanova-Chanturia, 2020). Learners can nurture their knowledge by picking up expressions from input without intentionally studying them. Hernández et al. (2016) demonstrated that both naturalistic (immersion) and classroom learners were sensitive to both word and phrasal frequency and able to recognize MWEs efficiently. Learners should be repeatedly exposed to MWEs through reading and listening, as repetition is the key to learning.

Second, learners will also benefit from deliberate learning and explicit instruction. There are three techniques teachers can use to teach multi-word expressions. All of them are designed to promote noticing and make learners pay attention to target items; therefore, they assist deliberate learning. The first is to explore text to identify MWEs. The activity is called “text chunking.” In this activity, learners are told to find MWEs in reading or listening text, for awareness-raising and better uptake. The second is to make use of text enhancement. Reading text can be enhanced by presenting target MWEs in bold type. Szudarski & Carter (2016) demonstrated that input flood (giving learners a lot of reading or listening text) using typographically enhanced text worked even better for learning collocations than input flood only. The third is to engage in decontextualized activities (Pellicer-Sánchez & Boers, 2019). Among them, presenting MWEs as chunks was the most effective, as shown in the following example:


Choose the appropriate verb/noun collocation for each blank from the list.

pay attention / take medicine / do harm

Don’t worry. These pills don’t (               ) to your system.

My son pretended to (               ) when I talked to him last night.

Why don’t you (               )? You’ve been coughing for a week.


This activity was found to be more helpful for retention than other activities such as matching a verb and an object, or choosing the right verb for the object. It appears that MWEs should be presented as intact chunks, since exercises that separate them into individual words might leave undesirable traces in the brain and confuse learners when they try to retrieve them (Boers et al., 2017).

Third, in terms of production, the 4/3/2 activity appears to be promising for fluency gains (Boers, 2014). In this activity, learners repeat the same task three times, usually with a different partner, under increasing time pressure. Boers examined the effects of repetition and time pressure and found that speech rate, measured by words per minute and syllables per minute, increased both with and without time pressure. The researcher did not specifically investigate MWEs in this study; however, considering that MWEs are ubiquitous, we can reasonably infer that the participants’ talk would include quite a few MWEs, and that they would be articulated more quickly, since the increase in overall speech rate was statistically significant.

Fourth, corrective feedback, such as recast, might be useful for learners to develop accuracy in the use of MWEs. Recast is the immediate reformulation of a learner’s inappropriate utterance during meaning-focused interaction. Look at the following example:

Student: The big typhoon made damage to the crops.

Teacher: Is that so? I didn’t know that. Did it do a lot of damage?

In this example, the teacher corrects the student’s utterance by changing the verb, but does so without spoiling the flow of communication. If the student notices the correction, she may reformulate her speech and produce an accurate utterance in the next turn. In the above-mentioned study, Boers (2014) pointed out that under time pressure, learners’ speech became faster, but less accurate, although there were also some self-corrections in later utterances. In fact, there were even cases in which a correct expression was replaced by an erroneous one in a subsequent turn. To consolidate the knowledge of target MWEs in their repertoires, learners need to strategically pay attention to teachers’ covert feedback, such as recast.

Fifth, teachers also need to be strategic in teaching multi-word expressions. For example, some of the MWEs include alliteration (repetition of initial consonants as in words of wisdom) and assonance (repetition of similar or identical vowel sounds as in my kind of guy). Both types of MWEs are easy to remember and fun to practice. Teachers can point out these mnemonic patterns, use them to raise awareness, and/or encourage learners to search for them in spoken and written text. Experimental studies have demonstrated that such instruction and activities are beneficial in long-term retention (e.g., Lindstromberg & Boers, 2008a for alliteration; Lindstromberg & Boers, 2008b for assonance).

Sixth, teachers need to help their students to pay more deliberate attention to multi-word expressions, especially when the constituents are non-adjacent (Vilkaité & Schmitt, 2017). Eye-tracking studies have demonstrated that advanced non-native speakers read adjacent collocations (e.g., spend time) faster than non-formulaic control phrases (e.g., eat apples), but they did not read non-adjacent collocations (e.g., spend a lot of time) faster at a statistically significant level. It appears that non-native speakers, even those who are proficient, did not process the constituents as linked when they were not next to each other. When the collocation is not salient—similar to when it is non-adjacent—explicit teaching or some kind of noticing activity would be of considerable importance to promote intentional learning, encourage uptake, and eventually increase automaticity.

Last but not least, teachers need to be consistent in teaching multi-word expressions. Sustained focus on MWEs throughout the course, with a series of MWE activities for learners to notice and practice, would be highly beneficial in language instruction. For this kind of course to be successful, teachers should first be convinced that language is highly formulaic, and that it is not an overstatement to say that predictive processing of MWEs supports language proficiency to a large extent.

To recap, we predict what we will hear or see. This system explains the automaticity of language processing. We L2 teachers should examine to what extent our classroom activities are helping our students to notice MWEs, to practice using them, and to develop both receptive and productive fluency in processing MWEs.


  • Bardovi-Harlig, K. (2019). Formulaic language in second language pragmatics research. In A. Siyanova-Chanturia & A. Pellicer-Sánchez (Eds.), Understanding formulaic language: A second language acquisition perspective (pp. 97–114). New York, NY: Routledge.

  • Barrett, L. F. (2017). How emotions are made: The secret life of the brain. New York, NY: Houghton Mifflin Harcourt.

  • Boers, F. (2014). A reappraisal of the 4/3/2 activity. RELC Journal, 45(3), 221–235.

  • Boers, F., Demecheleer, M., He, L., Deconinck, J., Stengers, H., & Eyckmans, J. (2017). Typographic enhancement of multi-word units in second language text. International Journal of Applied Linguistics, 27(2), 448–469.

  • Boers, F., Eyckmans, J., & Stengers, H. (2007). Presenting figurative idioms with a touch of etymology: More than mere mnemonics? Language Teaching Research, 11, 43–62.

  • Ellis, N. C. (2002). Frequency effects in language processing: A review with implications for theories of implicit and explicit language acquisition. Studies in Second Language Acquisition, 24(2), 143–188.

  • Hernández, M., Costa, A., & Arnon, I. (2016). More than words: Multiword frequency effects in non-native speakers. Language, Cognition and Neuroscience, 31(6), 785–800.

  • Lindstromberg, S., & Boers, F. (2008a). The mnemonic effect of noticing alliteration in lexical chunks. Applied Linguistics, 29, 200–222.

  • Lindstromberg, S., & Boers, F. (2008b). Phonemic repetition and the learning of lexical chunks: The mnemonic power of assonance. System, 36(3), 423–436.

  • Martinez, R., & Murphy, V. A. (2011). Effect of frequency and idiomaticity on second language reading comprehension. TESOL Quarterly, 45(2), 267–290.

  • Pellicer-Sánchez, A., & Boers, F. (2019). Pedagogical approaches to the teaching and learning of formulaic language. In A. Siyanova-Chanturia & A. Pellicer-Sánchez (Eds.), Understanding formulaic language: A second language acquisition perspective (pp. 153–173). New York, NY: Routledge.

  • Siyanova-Chanturia, A. (2020, January 25-26). On the role of multiword expressions in language learning and use [Seminar talk]. Temple University Japan Campus Weekend Seminar, Tokyo, Japan.

  • Siyanova-Chanturia, A., & Martinez, R. (2015). The idiom principle revisited. Applied Linguistics, 36(5), 549–569.

  • Siyanova-Chanturia, A., & Pellicer-Sánchez, A. (Eds.). (2019). Understanding formulaic language: A second language acquisition perspective. New York, NY: Routledge.

  • Siyanova, A., & Schmitt, N. (2007). Native and nonnative use of multi-word versus one-word verbs. International Review of Applied Linguistics, 45, 119–139.

  • Szudarski, P., & Carter, R. (2016). The role of input flood and input enhancement in EFL learners’ acquisition of collocations. International Journal of Applied Linguistics, 26(2), 245–265.

  • Vilkaité, L., & Schmitt, N. (2017). Reading collocations in an L2: Do collocation processing benefits extend to non-adjacent collocations? Applied Linguistics, 40(2), 329–354.

Harumi Kimura teaches at Miyagi Gakuin Women’s University, Sendai, Japan. She earned her doctorate from Temple University. She researched second language listening anxiety in her doctoral study and her academic interests include learner psychology and cooperative learning. She co-authored a book with G. M. Jacobs, Cooperative Learning and Teaching (2013).
