Upgrading the Class Evaluation Questionnaire System

By: Curtis Kelly

To quote a student I once had: “Class evaluations suck!” I am not sure he was right, but it is something we should find out. At 45 minutes per university-issued questionnaire like the one shown here, my 300 Japanese students used up 13,500 hours of potential learning time per semester, or 500,000 hours (equal to 56 years) for the entire student body, and for the whole country…a couple of millennia. There is no room for class evaluations that suck!

Semester-end class evaluation questionnaires are used almost everywhere (McKeachie, 1994) and, I suspect, generally look pretty much the same. So now I confess: I agree with my student. They seem a waste of class time that could be better used for something else. We university teachers had to administer those computer mark-sheet evaluations for every class. The results were collated and made available to the teacher online on request, but I got the impression that the only teachers who looked at them were those likely to get good evaluations, not those likely to get poor ones, although the latter group is the one that most needed to see them. As for the administration, the results, whether positive or negative, were never acted on.

In my experience, many of the questions are so scattered and general that they have little value. In the questionnaires at the five universities where I taught over my 40 years, there were always items to be rated from 1 to 5 that displayed the weaknesses illustrated below (sample from the Ritsumeikan Asia Pacific University survey, 2024):

  • of little value:
    • I could hear the teacher.
      (strongly agree to strongly disagree)
    • I could see the board.
  • hard to answer:
    • The instructor contrasted the implications of various theories/ideas/concepts.
    • During class the instructor encouraged students from various backgrounds to learn from each other.
  • of value to the teacher, but do not reveal reasons:
    • I learned something which I consider valuable.
    • Overall, I would say this course was…
      (very poor to very good)


An illustration of a hand pointing at a three-color sliding scale. The red scale on top has a sad face, the yellow scale in the middle has a "meh" face, and the green scale at the bottom has a smiley face.

The responses to the latter items were the only ones an egotist like me ever looked at, but in reality, they have so little value. What can one learn from results showing that our course is 30% poor and 70% good, or worse, 70% poor and 30% good? How can one know what is working or what should be changed? Or how do these results differ for the most attentive and least attentive students? In short, such questionnaires are almost useless.

Nonetheless, we need evaluations

Unfortunately, as every teacher knows, and as is well noted in the literature, we desperately need information on how to improve our classes. This need is particularly strong in Japan, where a major problem language teachers face is low student motivation (Wigzell & Al-Ansari, 1993), so low as to create a “problem of wastage and low productivity in foreign language courses” (p. 303). Furthermore, despite years of English education, few Japanese become fluent in the language. Most critics blame teachers, methods, and materials for the problem (Hansen, 1985). So, obviously, schools need to know which teachers are failing, and teachers need to know which methods or materials work. In short, like it or not, we need an effective way to evaluate classes.

McKeachie agrees. He wrote, “improvement in teaching is facilitated by feedback” (1994, p. 313), but he also noted that class evaluations suffer serious problems with providing useful information and with validity. What are we measuring? Teaching effectiveness? But then, how is that construct defined? Whether the students retain more information? Or improve their critical thinking skills? Or become more interested in the subject? He also cites studies that show that students are able to validly rate teacher behavior, such as “rapport,” but points out that rapport is not necessarily an indicator of greater achievement (p. 320). Another potential problem of validity is that students tend to mix up two separate aspects of a course, the teaching and the subject being taught (Timpson & Andrew, 1997).

Pessimism is widespread. I have heard teachers (usually the ones with poor evaluations) say that class evaluations are meaningless because they just measure how entertaining the teacher is, but that view is contradicted by the research. Fink (1995) shows that students are generally able to separate being entertained from the value they get from learning, and McKeachie cites a number of studies indicating that student ratings are useful in evaluating teaching effectiveness broadly: “highly rated teachers tend to be those whose students achieve well” (1994, p. 314).

An illustration of a confused, worried woman grasping her head with her hands. Around her is an array of speech and thought bubbles representing her inner turmoil.

So, in conclusion, we have two conflicting points: 1) Both teachers and the administration need information on what is working or not in their respective arenas; but 2) class evaluations as they currently exist cannot give that information. So, I would like to propose a different type of class evaluation, developed during my doctoral research, that I believe would be useful to both parties. The system I will offer works for large, university-wide groups, or for small groups, like the 4-5 language teachers of a particular course that I coordinated. I found this modified class evaluation system extremely useful to me as a coordinator, and the teachers in the program seemed to appreciate it too. First, though, let’s figure out what we need to measure.

What class evaluations should measure and for whom

Any evaluation is measuring the gap between what is and what should be, and that is true for class evaluations too. In that regard, we can divide the stakeholders into three groups with different needs:

An illustration of three people sitting at a desk, discussing documents.

The Administration: a need to identify classes that work especially well or poorly

If you think about it, administrators don’t need to know what percentage of the students “could see the board,” nor even whether the students thought “the course was useful.” After all, the general curriculum is not for students to decide on. As long as the existing classes are being muddled through with some degree of success, the administration does not need to know anything particular about them.

I might be a bit idealistic, but there is something I think universities do need to know about. They need to know which classes (read as “teachers”) might be failing, at least in the eyes of students, so that they can arrange some kind of intervention. Finding out that something is going wrong, and then determining how to help the teacher, probably through training, or just removing that teacher, is their primary goal. After all, the teacher might be having a problem with just that particular course for some reason, while succeeding in others. Maybe that teacher just needs a few words of advice. Or, maybe they are one of those rare teachers who is truly terrible with students in any class (see my footnote in the next paragraph) and who shouldn’t be teaching.

On the other hand, there is an ancient view that says the administration has no right to control what a teacher, the absolute monarch of his or her class, does, but I don’t agree with it. I have seen too many young learners deeply wounded by poor teaching to support this tyranny1 and fortunately, that old view is fading. The trend is towards teacher accountability. Teachers, who manage the development of our youth, should be no less accountable than doctors, who manage the preservation of our health.

In addition to finding out which teachers are failing, there is also a benefit in finding out who the stellar teachers are, not just to reward those teachers, but also to spread their know-how to the rest of the faculty. On identifying super successful teachers, the administration might organize class observations or presentations for other teachers, or ask these educational leaders to mentor the failing teachers.

In other words, the only things the administration needs to identify are the exceptions: the super succeeders and unfortunate failures. Therefore, only one questionnaire, computer-based or not depending on numbers, need be given to the whole student body once per semester: a short questionnaire asking students to identify any teachers/classes they thought were excellent or terrible.

1 So, you want the dirt? Over the years, I had to fire or admonish a teacher…now brace yourself…who spit on a student, who put a movie on in class and went home by bicycle while it was playing, who repeatedly accused a child of cheating who was not, who assigned his underage female students to buy condoms, who sold goods in class, who threw a coffee cup at a learner, who cut to the front of a long bus line, who favored students belonging to the same fringe religion he did, who was seen dumping the term papers he had just collected in the trash on the way out, and so on. Whew. And none of these were me.

An illustration of a teacher pointing at a chalkboard. Next to the teacher is a student holding a piece of chalk, turning to look at the teacher for advice after writing on it.

The Teachers: a need to find out what is working or not, and to share resources

The crux is finding out what works or does not. And yet, as we well know, computer mark-sheet questionnaires with 5-point Likert scales are unable to produce that information. A questionnaire with open-ended questions, on the other hand, can. Open-ended questions with written responses are useful when seeking suggestions, gathering anecdotal information, or providing a way for respondents to vent dissatisfaction. A particularly useful type of open-ended question, commonly used to assess satisfaction, is the LB/LL question (Fink & Kosekoff, 1998, p. 27): respondents are asked what they “liked best” about an educational experience and what they “liked least,” giving up to three answers for each. Therefore, in addition to the large-scale, one-time survey done by the administration, a second questionnaire, one with open-ended questions, should be administered in each individual class.

As we saw, the administration does not need detailed information, nor can these open-ended written answers be tallied by computer, so the most effective way to get the results acted on is to have them processed by the teachers themselves. To make sure this happens, the administration might require teachers to summarize the class questionnaires in a short report, reflecting on what they think went well and what did not, and providing examples.

Who would these reports be written for? The administration? No. The overall purpose of this questionnaire is faculty development, not teacher ranking, so instead, the reports should be shared with other instructors teaching the same types of courses, so that they can see how things worked. They can learn from each other’s efforts to make a syllabus, employ certain teaching techniques, experiment with innovations, and use the textbook. Class evaluation becomes teacher training; improving teaching becomes a collaboration.

Naturally, reliability in evaluating one’s own performance will suffer, since teachers tend to cherry-pick what they divulge, but this is not an issue. No matter what they share, it is bound to be of value to other teachers.

By the way, I suggest including fixed-answer questions as well, to give us general averages, but also for one other very particular reason. We teachers are sensitive. Even if all the students except one give us glowing assessments, we will focus intensely on that one bad result. We need to know who it is coming from, one of the best students in the class or the worst, one who did homework and attended or one who did not. This is important to us, so I suggest including questions on how many classes the respondent missed and what percent of the homework was completed.

An illustration of students sitting and raising their hands.

The Students: a need to be heard

We tend to overlook the price students must pay by filling out class evaluations. In Japan, the average student has to grind through the same 31-item questionnaire, like the one from Ritsumeikan, fifteen times per semester. Semester after semester, potential learning time is given up for box-ticking. Nor are they really able to express the things they’d probably really like to say.

The wonderful/terrible classes questionnaire from the administration will give them the satisfaction of reporting on truly superior or heinous teachers, as should be their right. The open-ended questions in the in-class questionnaire will give them a chance to express their opinions directly, allowing them to say anything they think is important, while at the same time inducing them to reflect on their own educational gains as well. Imagine how much more satisfying that would be for them than ticking off boxes fifteen times about how well a teacher could be heard. Giving them this opportunity is more humane.

The questionnaires

This alternative system of evaluation was not just designed on the fly. It is the product of a doctoral research project using the development research method, which included a literature review, a formative panel to generate criteria, and a summative panel to examine criteria adherence. The questionnaires can be seen on the next three pages. Note that I used open-ended questions for the administration questionnaire, which is more useful for a department or coordinated course, but a computer-graded questionnaire could be developed for larger student bodies. Also, below are some links to additional documents written for use at a particular Japanese college.

An image of four types of questionnaires displayed as cards.

Overall System Manual here

Instructions to Teachers Doing the In-Class Questionnaire and Report here

Formatted Questionnaires with Japanese here

Literature Review on Class Evaluation Development here

Class Development Questionnaire

Help your teacher improve this class! Tell your teacher the best and worst points of this class and give suggestions for the future. Keep in mind that only your teacher will read your answers, but they will become part of a report to other teachers. You do not have to write your name on this paper, and nothing you write will affect your grades. Nonetheless, your comments are important to your teacher, so write your answers as clearly and fully as possible.

About the course:

    1. How would you rate this class?
      1. very good
      2. good
      3. average
      4. poor
      5. very poor
    2. How useful were the teaching materials?
      1. very useful
      2. useful
      3. average
      4. not very useful
      5. not at all useful
    3. How was the level of this class?
      1. extremely hard
      2. hard
      3. just right
      4. too easy
      5. far too easy
    4. How many classes in this course did you miss this semester?
      1. 0-1
      2. 2-3
      3. 4-5
      4. 6-7
      5. 8 or more
    5. How much of the homework did you finish by the due date?
      1. 100%
      2. 75%
      3. 50%
      4. 25%
      5. 0% 

Written Answers:

6. Write three things you liked about this course and explain why.

7. Write three things you would like to see changed in this course and explain why.

8. Do you have any additional comments for the teacher? 

Thank you for your help.

Department Scanning Questionnaire

Help us improve our school! Take some time and tell us the best and worst points of our program. Your opinions are very important to us, so our staff will read them. You do not have to write your name on this paper, and nothing you write will affect your grades. Nonetheless, we think your opinions are extremely valuable, so write your answers as clearly and fully as possible.

About our department:

    1. Are there any classes, teachers, or materials (textbooks, etc.) that you think are especially good?  Tell us specifically what they are, and why you think so.
    2. Are there any classes, teachers, or materials (textbooks, etc.) that you, or other students, are having trouble with?  Explain the problem in detail.  What should be changed?
    3. Are there any other classes or activities you would like to see us add to the schedule?  Why?
    4. Do you have any other problems at this school that we should know about? Or suggestions?

About you:

5. On average, how many classes do you miss per week at our school?

      1. 0-1
      2. 2-3
      3. 4-5
      4. 6-7
      5. 8 or more

6. On average, in all your classes, how much of your homework do you finish by the due date?

        1. 100%
        2. 75%
        3. 50%
        4. 25%
        5. 0%
An illustration of a woman leaning against a stack of three check boxes.
illustrations adapted from Alphavector on Canva

In conclusion

My student said, “class evaluations suck.” The great educator Parker Palmer adds depth to that comment:

The normal mode of “evaluating” teaching is to give students, toward the end of a course, a standardized questionnaire that reduces this complex craft to ten or fifteen dimensions, measured on a five-point scale: “Gives clear and concise instructions”; “Organizes lectures well”; “Establishes criteria for grading.”

Teachers have every right to be demoralized by such a simplistic approach – the nuances of teaching cannot possibly be captured this way. No uniform set of questions will apply with equal force to the many varieties in which good teaching comes. But if we insist on closing the door on our work, how can others evaluate us except by tossing some questionnaires over the transom just before the end of the term? Evaluations of this sort are not simply the result of administrative malfeasance, as faculty sometimes complain. They are the outcome of a faculty culture that offers no alternative.

Sadly, the limitations of these evaluations are so cynically accepted, and their outcomes so selectively invoked, that the data are easily used in an institutional shell game. (1998, pp. 142-3)

As for me, my life mission has always been to “relieve the suffering of the classroom,” but I always thought of that in terms of bored or alienated students, never their teachers. Maybe I should change that perspective, and this different way to evaluate classes is a start.


References

  • Fink, A. (1995). How to ask survey questions. The survey kit (Vol. 2, pp. 1-28). Sage Publications.

  • Fink, A., & Kosekoff, J. B. (1998). How to conduct surveys: A step-by-step guide. Sage Publications.

  • Hansen, H. (1985). English education in Japanese universities and its social context. In C. Wordell (Ed.), A guide to teaching English in Japan (pp. 145-170). The Japan Times.

  • McKeachie, W. J. (1994). Teaching tips (9th ed.). D. C. Heath and Company.

  • Palmer, P. (1998). The courage to teach. Jossey-Bass.

  • Timpson, W., & Andrew, D. (1997). Rethinking student evaluations and the improvement of teaching: Instruments for change at the University of Queensland. Studies in Higher Education, 22(1), 55-65.

  • Wigzell, R., & Al-Ansari, S. (1993). The pedagogical needs of low achievers. Canadian Modern Language Review, 49(2), 302-315.

Curtis Kelly is always looking for ways to improve and encourages criticism from his colleagues and fellow speakers. Offering criticism back can be tricky, though.
