Why Standards-based Grading is Not Enough

March 30, 2017 Arthur Chiaravalli

When I heard the learn’d astronomer,
When the proofs, the figures, were ranged in columns before me,
When I was shown the charts and diagrams, to add, divide,
and measure them,
When I sitting heard the astronomer where he lectured with
much applause in the lecture-room,
How soon unaccountable I became tired and sick,
Till rising and gliding out I wander’d off by myself,
In the mystical moist night-air, and from time to time,
Look’d up in perfect silence at the stars.

Proponents of standards-based learning and grading (myself included) can seem a bit like the learned astronomer of Whitman’s poem, their presentation slides and book appendices brimming with charts and columns, figures and diagrams, means, medians, and modes. There’s a kind of Enlightenment-era elation in contemplating the clear-headed, coherent systems of instruction, assessment, and reporting, and rooting for these stern reformers to topple the cruel despotism of outdated grading practices.

As a teacher of high school math and language arts, I have seen firsthand how these concepts and practices play out in both more quantitative and more qualitative contexts. And while the benefits of this approach are undeniable in both settings, I have lately found myself becoming “tired and sick” as these structures fail to account for the richness, complexity, and wonder of teaching and learning.

Mastering Math

Teaching in my minor of math, I never really noticed these limitations. In my classroom, learning targets were discrete, sequential, and algorithmic, and the coursework consisted of homework, quizzes, tests, and an occasional project. The closest my students ever came to any “mystical moist night-air” was determining the amount of mulch needed for my irregularly shaped garden beds and viewing an occasional episode of Numb3rs.

We all use math every day

When I first heard about standards-based learning and grading back in 2005, it all made sense. Without too much difficulty, I was able to put several standards-based strategies, structures, and tools into practice immediately. Students began keeping track of their own mastery of targets and I merely made sure that the grade in the gradebook continued to reflect their current level of achievement.


Homework was pure practice; due to the ease of copying, I didn’t want that to factor into a grade representing individual proficiency. Going over the correct answers, students flagged learning targets they struggled with and tried to shore them up before the eventual quizzes. Quizzes, occurring twice per chapter, did contribute to the grade. Students would again complete an item analysis on these, identifying the learning targets they still hadn’t met.

Students used this tracking sheet to flag learning targets from homework and quizzes (file)

After the second quiz, students would focus their study on targets they had flagged to prepare for the chapter test. Due to the targeted nature of this study, they often scored better on the test than on the quizzes. When this occurred, this score would become their stand-alone grade for the chapter, reflecting their overall level of performance.

Students used this item analysis after a chapter test to target further study and prepare for reassessment (file)

The process didn’t end there, however. Students could complete yet another item analysis for the test and, after successfully doing one or more corrective activities, take additional assessments addressing individual targets they had not yet mastered. Due to the discrete and sequential nature of our school’s approach to teaching mathematics, I was able to achieve all these aims within the preexisting framework.

Student and parent reaction to the program was almost universally positive. With the help of grading expert Ken O’Connor, I was able to make additional adjustments to the program that enhanced its effectiveness the following year. As time went on, I further augmented this system with Khan Academy-like online videos and accompanying notes. These videos, originally .wmv files, were downloaded thousands of times. To this day, the original page, which I hid in navigation after I stopped teaching math, has been visited 20,000 more times than my main page.

In other words, I lectured with much applause in the lecture-room.

As I reflect back on these experiences, however, I wonder if the standards-based approach gave me a warped view of teaching and learning mathematics. I had equipped my students with dozens of facts, concepts, and algorithms they could put into practice…on the multiple-choice final exam.

Somewhere, I’m sure, teachers were teaching math in a rich, interconnected, contextualized way. But that wasn’t the way I taught it, and my students likely never came to understand it in that way.

Liberating Language Arts

Fast forward to the present. For the past five years I have been back teaching in my major of language arts. Here the shortcomings of the standards-based method are compounded even further.

One of the more commonly stated goals of standards-based learning and grading is accuracy. First and foremost, accuracy means that grades should reflect academic achievement alone—as opposed to punctuality, behavior, compliance, or speed of learning. By implementing assessment, grading, and reporting practices similar to those I’d used in mathematics, I was able to achieve this same sort of accuracy in my language arts classes.

Accuracy, however, also refers to the quality of the assessments. Tom Schimmer, author of Grading From the Inside Out: Bringing Accuracy to Student Assessment through a Standards-based Mindset, states:

Low-quality assessments have the potential to produce inaccurate information about student learning. Inaccurate formative assessments can misinform teachers and students about what should come next in the learning. Inaccurate summative assessments may mislead students and parents (and others) about students’ level of proficiency. When a teacher knows the purpose of an assessment, what specific elements to assess…he or she will most likely see accurate assessment information.

Unfortunately, assessment accuracy in the language arts and humanities in general is notoriously elusive. In a 1912 study of inter-rater reliability, Starch and Elliott found that different teachers gave a single English paper scores ranging from 50 to 98%. Other studies have shown similar inconsistencies due to everything from penmanship to the order in which the papers are reviewed.

We might argue that this situation has improved due to common language, range-finding committees, rubrics, and other modern developments in assessment, but problems remain. In order to achieve a modicum of reliability, language arts teams must adopt highly prescriptive scoring guides or rubrics, which, as Alfie Kohn, Linda Mabry, and Maja Wilson have pointed out, necessarily neglect the central values of risk taking, style, and original thought.

This is because, as Maja Wilson observes, measurable aspects can represent “only a sliver of…values about writing: voice, wording, sentence fluency, conventions, content, organization, and presentation.” Just as the proverbial blind men touching the elephant receive an incorrect impression, so too do rubrics provide a limited—and therefore inaccurate—picture of student writing.

As Linda Mabry puts it,

The standardization of a skill that is fundamentally self-expressive and individualistic obstructs its assessment. And rubrics standardize the teaching of writing, which jeopardizes the learning and understanding of writing.

The second part of Mabry’s statement is even more disturbing, namely, that these attempts at accuracy and reliability not only obstruct accurate assessment, but paradoxically jeopardize students’ understanding of writing, not to mention other language arts. I have witnessed this trend as we have moved toward common assessments over the years. Our pre- and post-tests have become overwhelmingly populated with knowledge-based questions: terminology, vocabulary, punctuation rules. Pair this with formulaic, algorithmic approaches to the teaching and assessment of writing and you have a recipe for a false positive: students who score well with little vision of what counts as deep thinking or good writing.


It’s clear what we’re doing here: we’re trying to do to writing and other language arts what we’ve already done to mathematics. We’re trying to turn something rich and interconnected into something discrete, objective and measurable. Furthermore, the fundamentally subjective nature of student performance in the language arts renders this task even more problematic. Jean-Paul Sartre’s definition of subjectivity seems especially apt:

The subjectivity which we thus postulate as the standard of truth is no narrowly individual subjectivism…we are attaining to ourselves in the presence of the other, and we are just as certain of the other as we are of ourselves…Thus the man who discovers himself directly in the cogito also discovers all the others, and discovers them as the condition of his own existence. He recognises that he cannot be anything…unless others recognise him as such. I cannot obtain any truth whatsoever about myself, except through the mediation of another. The other is indispensable to my existence, and equally so to any knowledge I can have of myself…Thus, at once, we find ourselves in a world which is, let us say, that of “intersubjectivity.”

First and foremost, the language arts involve communication: articulating one’s own ideas and responding to those of others. Assigning a score on a student’s paper does not constitute recognition. While never ceding my professional judgment and expertise as an educator, I must also find ways to allow students and myself to encounter one another as individuals. I must, as Gert Biesta puts it, create an environment in which individuals “come into presence,” that is, “show who they are and where they stand, in relation to and, most importantly, in response to what and who is other and different”:

Coming into presence is not something that individuals can do alone and by themselves. To come into presence means to come into presence in a social and intersubjective world, a world we share with others who are not like us…This is first of all because it can be argued that the very structure of our subjectivity, the very structure of who we are is thoroughly social.

Coming to this encounter with a predetermined set of “specific elements to assess” may hinder and even prevent me from providing recognition, Sartre’s prerequisite to self-knowledge. But it also threatens to render me obsolete.

The way I taught mathematics five years ago was little more than, as Biesta describes, “an exchange between a provider and a consumer.” That transaction is arguably better served by Khan Academy and other online learning platforms than by me. As schools transition toward so-called “personalized” and “student-directed” approaches to learning, is it any wonder that the math component is often farmed out to self-paced online modules—ones that more perfectly provide the discrete, sequential, standards-based approach I developed toward the end of my tenure as math teacher?

Any teacher still teaching math in this manner should expect to soon be demoted to the status of “learning coach.” I hope we can avoid this same fate in language arts, but we won’t if we give in to the temptation to reduce the richness of our discipline to standards and progression points, charts and columns, means, medians, and modes.

What’s the alternative? I’m afraid I’m only beginning to answer that question now. Adopting the sensible reforms of standards-based learning and grading seems to have been a necessary first step. But is it the very clarity of its approach—clearing the ground of anything unrelated to teaching and learning—that now urges us onward toward an intersubjective future populated by human beings, not numbers?

Replacing grades with feedback seems to have moved my students and me closer toward this more human future. And although this transition has brought a kind of relief, it has also occasioned anxiety. As the comforting determinism of tables, graphs, charts, and diagrams fades from view, we are left with fewer numbers to add, divide, and measure. All that’s left is human beings and the relationships between them. What Simone de Beauvoir says of men and women is also true of us as educators and students:

When two human categories are together, each aspires to impose its sovereignty upon the other. If both are able to resist this imposition, there is created between them a reciprocal relation, sometimes in enmity, sometimes in amity, always in tension.

So much of this future resides in communication, in encounter, in a fragile reciprocity between people. Like that great soul Whitman, we find ourselves “unaccountable” — or as he says elsewhere, “untranslatable.” We will never fit ourselves into tables and columns. Instead, we discover ourselves in the presence of others who are unlike us. Learning, growth, and self-knowledge occur only within this dialectic of mutual recognition.

Here we are vulnerable, verging on a reality as rich and astonishing as the one Whitman witnessed.

Arthur Chiaravalli serves as House Director at Champlain Valley Union High School in Vermont and is co-founder of Teachers Going Gradeless. Over the course of his career, he has taught high school English, mathematics, and technology. Follow him on Twitter at @iamchiaravalli.