Cognitive Load Theory: What it is, and Why Teachers Should Care about it

By: Julia Daley

Here’s a scene from a classroom. It’s a language class where the students are learning to communicate in English. The teacher explains to the students their next task—to interview a classmate and then make a presentation introducing their partner. First, though, they must create their interview questions. The teacher instructs the students to write at least five questions and emphasizes that they should practice the grammar points from the unit. As the students work individually on writing their questions, the teacher walks around the room and monitors their progress. The teacher notices that one student, let’s call her Amy, hasn’t written any questions; instead, Amy is chatting in her native language with her neighbor (who has written a couple of questions down) about the concert she saw over the weekend. The teacher comes up to Amy and asks “Amy, where are your questions?” To which Amy replies “What questions?” 

I’m sure you can imagine the rest of the conversation because it’s one that we as teachers have pretty much every day, right? It can feel like it’s only in some fantasy dream world that we teachers have classrooms full of students who are perfectly on-task at all times. It’d be easy to look at Amy’s behavior and assume she just wasn’t paying attention or doesn’t care about the class at all (and perhaps, indeed, that is the case!), but there is also the possibility that Amy has simply forgotten what she is supposed to be doing because she has experienced cognitive overload.

All of us experience cognitive overload: it’s part of the human experience. You’re experiencing overload when you open the door to your refrigerator and suddenly can’t remember what you were even looking for to eat. It’s walking into a room and forgetting what you were doing there in the first place. It’s having a conversation with a new acquaintance and realizing you have forgotten what their name is partway through. It’s the expression “drawing a blank” in action—you reach in vain into your memory to try and salvage whatever situation you’ve found yourself in, but it’s like there’s nothing there for you to grab. That, in a nutshell, is cognitive overload. Let’s take a closer look at how this phenomenon occurs, before we return to the teaching scenario from earlier. 

Overview of Memory

To fully understand cognitive overload, we need to first look at memory and how our brains learn. Particularly, we will look at both Working Memory (WM) and Long-Term Memory (LTM). We’ve had a whole issue on WM in the past, so if you want a deep dive, please go back and look through that issue. I will only go over these concepts briefly here. 

Baddeley and Hitch first proposed the theory of working memory in 1974, and Baddeley made the most recent addition to the model in 2000 (Baddeley, 2000). In sum, working memory is the part of our minds dedicated to the conscious, real-time manipulation of information (Baddeley & Hitch, 1974). We are always using our working memory; it is the core of thought itself. When our students are learning in the classroom, they are using their working memory (and so are we when we are teaching!).

The most important thing to remember is that our working memories have a finite capacity; that is, we can only hold so much in our working memory in a given moment. So far, there is no definitive research showing that we can increase our capacities (Dehn, 2008). We can, however, learn to use our working memories efficiently and make the most of our limited mental workspace (Gathercole & Baddeley, 1993). The best way to do so is to rely upon our long-term memory. 

While our working memory capacities are finite, there is (theoretically) no limit to what we can store away in our long-term memory (though accessing some memories later may be challenging if the connections to them have weakened) (Bahrick, Bahrick, & Wittlinger, 1975). Long-term memory can be divided into three distinct memory types: procedural memory, semantic memory, and episodic memory (Tulving, 1972). Procedural memory is unconscious or automatic; it’s the things you can do without having to think about them, like riding a bike, holding a pencil, or brushing your teeth. Semantic memory, which you may also know as Declarative Knowledge (Cohen & Squire, 1980), is about knowing things about the world. When you play a trivia game, you’re relying heavily on your semantic memory; you are able to recall some fact or skill and can then “declare it” or “demonstrate it” to others. Episodic memory is where you store information about the events you have experienced over the course of your life; these are the memories you can “re-watch” when you consciously choose to remember them.

Our various long-term memories are grouped and connected together into schemata (Piaget, 1977/2001). A schema (plural: schemata) is a sort of mental framework that helps us to organize and interpret new information as we learn. Piaget (1977/2001) theorized that we are constantly adding information to existing schemata, modifying and adapting them as we need to, and even creating a new schema when we learn or experience something totally novel. The more recent theory of predictive processing, also known as the Bayesian brain, builds off of Piaget’s work; the Bayesian model theorizes that our brains build schemata to test and verify their predictions in addition to holding information in our long-term stores. The more our working memory can attach the information it’s holding to the pre-existing schemata in our long-term memory, the more information we are able to hold in our limited stores.

Cognitive Load Theory

Cognitive Load Theory takes the theories of working memory, long-term memory, and schemata and throws in one extra ingredient—effort—to create a cohesive understanding of how learning works (Sweller, 1988). Imagine your brain as a donkey pulling a cart: sometimes the cart is light and the donkey trots merrily on its way without breaking a sweat; other times, the cart is heavy, and the donkey has to strain and really use its muscles to get moving. That weight, or lack thereof, is cognitive load—the amount of mental effort we need to put into a task.

Cognitive load can be divided into three types: intrinsic, extraneous, and germane (Sweller, Van Merriënboer, & Paas, 1998). Intrinsic load has to do with task difficulty; this difficulty results from the nature of a particular task and cannot be changed by the way the task is presented. For example, learning how to add together two single-digit numbers, like 1 + 2 = 3, would be a task that is considered to have a low intrinsic load due to its simplicity; learning how to calculate the square root of a number would be a task with a higher intrinsic load due to its complexity. Difficulty, however, is relative: learning 1 + 2 might be easy for us as adults who have finished tertiary education, but for a 4-year-old experiencing a math lesson for the first time, that 1 + 2 would be much more difficult! Your prior level of experience (that is, the schemata you can draw upon) can change how much intrinsic load a task has for you. While your status as a novice or an expert on a given task can be in flux during your lifetime, at the moment of attempting the task itself, the intrinsic load you experience will remain constant for the duration of the task (Sweller, Van Merriënboer, & Paas, 1998).

Extraneous load is something we, as learners and teachers, have more control over. Extraneous load has nothing to do with the task itself, but instead has everything to do with the environment in which the task is happening and the way that the task is presented to the learner (Chandler & Sweller, 1992). For instance, if I want to teach someone about a unicorn, I could describe it with words—a mythical creature with the face and body of a horse, the neck and legs of a deer, and the tail of a lion, that is all white in color and has a single, spiraling horn growing from its forehead—or I could just show a picture. It takes a lot of mental effort to parse the verbal description versus looking at a picture, so much so that the lengthy definition is essentially unnecessary, or extraneous. Thus, that mental effort is considered extraneous load. Similarly, if I were trying to teach in a classroom next to a demolition site, my students would have to expend a lot of effort to focus on my teaching and to ignore the near-constant sound of jackhammers. The noisy environment, in this case, is creating extraneous load for the students and limiting their capacities for learning. 

The third type of cognitive load, germane load, is a more recent addition to the model and is also one that we, as teachers and learners, have some control over (Sweller, Van Merriënboer, & Paas, 1998). The mental effort spent connecting information to existing schemata, adapting older schemata to accommodate new information, or creating new schemata in our minds is considered to be germane load (Sweller, Van Merriënboer, & Paas, 1998). This type of cognitive load is the most conducive to learning because it involves the movement of new information from working memory to long-term memory.
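
For readers who like to see an idea in numbers, here is a minimal sketch of how the load types might share one limited workspace. It assumes that the loads simply add up against a fixed working memory capacity, which is a simplification. The class name, the capacity of 10, and every load value below are hypothetical and chosen only for illustration; none of them are measurements from the research cited in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CognitiveLoad:
    """Loads in arbitrary, made-up units (not a real measurement scale)."""

    intrinsic: float   # comes from the task itself and the learner's prior schemata
    extraneous: float  # comes from the presentation and the environment

    def room_for_germane(self, capacity: float) -> float:
        """Capacity left over for schema building.

        Assumes the loads are additive; the result never drops below zero.
        """
        return max(0.0, capacity - self.intrinsic - self.extraneous)


# Hypothetical numbers for the unicorn example: same task, two presentations.
CAPACITY = 10.0
verbal_description = CognitiveLoad(intrinsic=4.0, extraneous=5.0)
picture = CognitiveLoad(intrinsic=4.0, extraneous=1.0)

print(verbal_description.room_for_germane(CAPACITY))  # 1.0
print(picture.room_for_germane(CAPACITY))             # 5.0
```

In this toy model the task is equally hard in both cases (the intrinsic value is the same), yet the picture leaves far more room for the kind of effort that builds schemata.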

Cognitive Overload

Let’s come back to the Amy situation and take a look at what’s going on behind the scenes in her mind. Amy is a novice speaker of English, so she experiences high intrinsic load when she is listening to her teacher’s instructions given in her non-native language. It’s a Monday, and she’s still experiencing the lingering effects of the concert she saw over the weekend—she relives moments of it in her mind, and she’s rehearsing what she’ll describe to her friends later—and this is creating a significant amount of extraneous load in her working memory. She is able to use her existing schema of “instruction English” words to understand some of the first instructions—that she should talk with a partner in English—but she has so little capacity left that this extra mental effort tips her over the edge and she experiences cognitive overload, causing her to forget all elements of the task she is supposed to be doing. Of course, Amy doesn’t necessarily have the vocabulary or metacognition to understand or explain that she has experienced cognitive overload. She just knows that she doesn’t know what she’s supposed to be doing, and really, speaking English makes her tired and she’d rather just talk with her neighbor about that cool concert.

Cognitive overload is when our mental donkey carts become so overwhelmingly heavy and full that the carts collapse entirely, and the donkey comes to a complete and confused standstill, wondering what it should do now. We all experience cognitive overload, that moment when our working memory capacity is overwhelmed by too much information and too many tasks, and Amy is no exception. Language learning tends to have a higher intrinsic load than most tasks (Osada, 2004), so it is especially prone to this problem. Students experiencing overload tend to display similar behaviors: they may suddenly drop an activity or task, chat about unrelated topics with their classmates, “zone out” or otherwise not engage with the task, be unable to explain what they’re supposed to be doing, or even repeat an already-completed task because they forgot they’d already finished it (Gathercole & Alloway, 2008). Instead of being annoyed with students for being “bad,” as it’s so very easy for teachers to do, try to reframe it in your mind as “Oh, this student is experiencing overload! How can I help them to get back on task?”

Figure: Common Signs of Cognitive Overload (adapted from Gathercole & Alloway, 2008)

Fortunately, by applying Cognitive Load Theory in our classrooms, we can do quite a bit to mitigate the consequences of overload in our students. This application involves a two-pronged approach: reducing extraneous load and increasing germane load. Basically, we want our students’ minds to be focused on learning, so we want to free up as much space in their working memories as we can for that good-old germane load.

First, we can change our lesson delivery in order to reduce extraneous load in their minds. When giving directions, we should do so in the most efficient way possible: by starting with the first step students need to do and explaining it in language they can understand (Gathercole & Alloway, 2008). We can pair verbal instructions with written ones so that, in the event of cognitive overload, our students can promptly recover the information they need to complete the task. We can ensure that all materials presented to students, on paper or on a projector, are well-designed, with no distracting visuals or animations. We can also control our classroom environment, making sure that it is a safe space where students can feel comfortable making mistakes—because students who are worried and anxious are students with reduced working memory capacities (Chen & Chang, 2009).

Secondly, we can promote germane load by bridging new material to old, helping our students activate their pre-existing schemata (Gathercole & Alloway, 2008). We can do this by reviewing previous concepts and demonstrating how they build into the new ones—something that is not often clear or obvious to learners. In addition, we can connect our lessons into a continuous unit so that each one leads smoothly to the next—this helps to reinforce existing schemata and allows students to efficiently sort new information into an appropriate schema in their working memories. Students need numerous opportunities to practice and apply what they’ve learned in the classroom, and we can build time for this into our lessons. With repeated practice, students will move from being novices at a task to being experts, freeing up cognitive capacity to tackle more complex tasks as the intrinsic loads they “carry” slowly become lighter with experience.

The more comfortable students become completing a particular task, the more automatic it becomes—signaling that the skill is moving from being stored in their semantic (or declarative) memory system to their procedural one. That means that they need fewer and fewer cognitive resources each time they activate that schema in their long-term memory. To use a language teaching example, let’s say that a student is learning to use the simple past tense (“-ed” form) in conversation. At first the rule has only a tenuous connection to the “English Grammar” schema in their long-term memory (if such a specific schema happens to exist). As they continue to practice applying that concept, consistently recalling it into their working memories, the rule becomes more firmly solidified in their semantic memories. At this stage, they still have to pause and think about how to apply the rule each time they use it. This can make their use of it in conversation halting and inconsistent. But, with more practice, time, and effort, the “knowledge” of the simple past tense and when to use it makes its way into their procedural memory; now they can say words like “talked” and “walked” and “worked” appropriately when speaking about the past. They are now “experts” in the simple past tense, experiencing hardly any intrinsic load when they apply the rule to their conversations.

We’ll make one final visit to the scenario presented at the beginning. The teacher answers Amy’s confusion by carefully explaining the first step—writing interview questions—in language that Amy can understand. The teacher points to the directions written on the board, showing Amy where she can recover the task information should she lose it again to overload. They write the first interview question together. Seeing that Amy is able to continue on her own, the teacher resumes monitoring the other students. Amy completes her questions and interviews her partner, using the best English she can. It may not have been pretty, but she finished the task. Now, class is finished for the day, and Amy can go off and chat with her friends to her heart’s content. Tomorrow, perhaps, she’ll have a bit more working memory capacity to spare for learning English.

(Comment from Julia, July 27, 2021)

I first learned about Cognitive Load Theory early in my bachelor’s degree when I took a required course on developmental psychology. I had a little “Eureka!” moment within myself as my professor introduced this concept because it just made such intuitive sense to me. That feeling has always stuck with me, and everything else I’ve ever learned about best teaching practices always seems to come back to WM and Cognitive Load Theory.

In everything I do as a teacher (every lesson I plan, every activity I design, every explanation I give), I keep the principles of Cognitive Load Theory in mind. Is this necessary to the lesson, or is it just creating extraneous load? Could I present this information in a different, “lighter” way for next time? How can I bridge this more clearly to what they’ve learned in the past and help them draw on their LTM? I take notes of the activities that seemed to trigger more “overload” behaviors in students, so that I can go back and refine my materials for next time. I don’t think it’s realistic to expect my activities to 100% never overload my students ever (because they’d likely be too easy or boring in that case!), but I’m doing my best to find the right balance in my classroom. I’ve no scientific method to my madness, just a gut feeling of rightness when an activity or lesson seems to be appropriate for my students.

If there were one thing I wish every teacher could be exposed to in their pedagogical courses, it would be Cognitive Load Theory. I hope this article can be a useful introduction for you to this (in my opinion) fundamental theory of cognition.

With much appreciated permission from Dan Piraro,


  • Baddeley, A. D. (2000). The episodic buffer: A new component of working memory? Trends in Cognitive Sciences, 4(11), 417-423.

  • Baddeley, A. D., & Hitch, G. (1974). Working memory. In G. H. Bower (Ed.), The psychology of learning and motivation: Advances in research and theory (Vol. 8, pp. 47–89). New York, NY: Academic Press.

  • Bahrick, H. P., Bahrick, P. O., & Wittlinger, R. P. (1975). Fifty years of memory for names and faces: A cross-sectional approach. Journal of Experimental Psychology: General, 104, 54–75.

  • Chandler, P., & Sweller, J. (1992). The split-attention effect as a factor in the design of instruction. British Journal of Educational Psychology, 62(2), 233–246.

  • Chen, I., & Chang, C. (2009). Cognitive load theory: An empirical study of anxiety and task performance in language learning. Electronic Journal of Research in Educational Psychology, 7(2), 729-746.

  • Cohen, N. J., & Squire, L. R. (1980). Preserved learning and retention of pattern analyzing skill in amnesia: Dissociation of knowing how and knowing that. Science, 210, 207–209.

  • Dehn, M. J. (2008). Working memory and academic learning: Assessment and intervention. Hoboken, NJ: John Wiley & Sons, Inc.

  • Gathercole, S. E., & Alloway, T. P. (2008). Working memory and learning: A practical guide for teachers. Thousand Oaks, CA: Sage Publications.

  • Gathercole, S. E., & Baddeley, A. D. (1993). Working memory and language. Hillsdale, NJ: L. Erlbaum Associates.

  • Osada, N. (2004). Listening comprehension research: A brief review of the past thirty years. Dialogue, 3, 53-66.

  • Piaget, J. (1977/2001). Studies in reflecting abstraction (R. L. Campbell, Ed. & Trans.). Sussex, UK: Psychology Press.

  • Sweller, J. (1988, June). Cognitive load during problem solving: Effects on learning. Cognitive Science, 12(2), 257–285.

  • Sweller, J., Van Merriënboer, J., & Paas, F. (1998). Cognitive architecture and instructional design. Educational Psychology Review, 10(3), 251–296. 

  • Tulving, E. (1972). Episodic and semantic memory. In E. Tulving & W. Donaldson (Eds.), Organization of Memory (pp. 381–403). New York, NY: Academic Press.

Julia Daley is a lecturer at Hiroshima Bunkyo University, where she teaches English conversation and writing. She earned her MA in TESL at Northern Arizona University and is certified to teach secondary English in Arizona. She appreciates everyone’s patience as she’s been learning how to build a website.
