Grow Beyond Grades


AI = Assessment Innovation


Me: I’ve noticed there’s quite a bit of apprehension about the use of generative AI in your department. I think it’s important we engage with these tools cautiously yet constructively. What are your primary concerns at the moment?

Colleague: My main worry is that students will just start submitting essays written by ChatGPT. It seems like a shortcut that bypasses true learning.

Me: That's a valid concern, but let me offer a different perspective. If an essay produced by ChatGPT can pass our assessments, doesn't that suggest a potential flaw in the assessment itself? Shouldn't our assignments require more critical thinking, originality, and/or use of appropriate scholarship than a current AI tool can generate?

Colleague: I suppose that's possible, but doesn't it also indicate a lack of integrity on the student's part?

Me: I understand where you’re coming from, but let me ask you this—do you genuinely distrust most of your students? Do you feel that there is an inherent desire to be dishonest? 

Colleague: Well, no, I don’t think they all want to cheat. But the temptation is there, isn't it?

Me: Temptation has always been a part of academic life. Remember, cheating existed long before AI. The real question is, are we creating an environment where we increase distrust?  Have you considered the deeper purpose behind why we assess?

Colleague: Assess...Well, to measure the students’ understanding, right?

Me: Certainly, to evaluate, but also to develop their capabilities. Assessment shouldn't just be about judging and ranking but also about fostering growth. It's about the process as much as the product, wouldn't you agree?

Colleague: I see your point. Focusing solely on the final paper or the grade might indeed miss the mark in terms of promoting genuine learning.

Me: Precisely. We have an opportunity to innovate our assessment strategies. By shifting towards assessments that emphasize critical thinking, creativity, and the learning process itself, we can enhance the experience for everyone. Let's not view AI as the end of honest assessment but as a catalyst for rethinking and improving our methods. What do you think?

Colleague: I hadn't considered it that way before. It’s about using AI as a tool to evolve our teaching, not just a challenge to overcome. This could be a chance to really engage students in a new and meaningful way.

Me, thinking: And a way to energize conversations about the problem with grades!

• • •

The conversation above is an amalgam of many such conversations I have had recently with colleagues from across the faculties at my university. From Liberal Arts to Dentistry, from Psychiatry to History, the topic du jour remains AI (broadly) and generative AI (in particular) and how we will deal with the implications for the ways our students will work, as well as for the careers we are ostensibly preparing them for.

As implied in the exchange above, I am looking for ways to leverage those opportunities to talk (often with colleagues whom I would not usually get a chance to connect with) about teaching and assessment practices. My own view is that before long the technologies that underpin tools like ChatGPT will be so deeply embedded in the everyday tools we already use that anxieties about integrity will shift; we will come to accept the hybridization of writing and, with it, new ways of perceiving authorship and plagiarism.

This, I accept, is a radical change too far for many, and, given the liminal space we are in as we strive to understand the deeper implications, it is important to work in the ‘now’ given that the future of assessment practices is still uncertain. So, putting aside for a moment all the important and interesting conversations about AI ethics, AI focused assessments, AI marking, and AI chatbot tutors, let’s focus on the ways that AI is already serving as a catalyst for effective, positive changes to the ways we do assessment, feedback, and grading.

Time is moving fast at the moment. This may be a consequence of a recognized psychological phenomenon as I get older and creakier coupled with the seeming relentless rapidity of the changes in the realm of generative AI. Circumstance and fortune (only time will tell how good that fortune is) have led me to a change in role where I am now leading on AI in education across King’s College London.


It’s been almost a year since I argued that, despite the fear and uncertainty that were likely to hinder its progress, AI has the potential to revolutionize education by re-energizing student interest in learning for its own sake and de-centering grades. A more personalized, process-focused, and humanized approach to assessment is not only desirable but also driven by a need to engage with new ways of teaching, learning, and assessing. ChatGPT is still a frequent eponym for text-based generative AI, but the proliferation of tools and advances is nigh on impossible to keep up with, often fueling the fractious debates and impeding our ability to make considered responses.

I nevertheless maintain (most of) my early optimism and here share thoughts on some of the possibilities, stumbling blocks, and conversations I have witnessed as the last year has unfolded. With increased debate and dialogue around AI, we increase our understanding about the way education systems work and, in particular, about the way we assess. In other words, AI is a profoundly useful lens to focus attention on the ideas and issues of grading less and going gradeless.

Universities are seats of learning, research, and innovation, but in times of crisis and uncertainty, reactionary voices can elevate knee-jerk, nuance-free solutions to ‘problems’ that themselves are yet to be fully realized, let alone articulated. While my sense is that this is a common phenomenon globally and across all areas of education, I feel we are now moving towards a greater consensus.

Conversations, university policies and guidance, and high-profile position shifts suggest as much, but a panicked response is anything but uncommon. While outright bans are no longer part of most strategic planning, often the only proposal is for all students to be assessed by examination under controlled conditions. Often such solutions are based on limited exposure to the actual capabilities and limitations of these technologies. Assumptions about what they can and cannot do impinge on clarity of thought and behaviors of both staff and students.

At my own institution we have a broad ‘engage cautiously but positively’ multifaceted approach supporting the development of AI literacy through dialogue, experimentation, research, and a short course co-created by colleagues and students from across faculties. We, like many other institutions, have found that beyond guidance regarding critical issues such as data security, the maintenance of academic integrity, and the necessity for all key stakeholders to increase their AI literacy, there can be no “one size fits all” solution. Decision-making about what constitutes appropriate engagement has to come at a local level. Colleagues across our nine faculties are still contemplating the best ways to respond and adapt; those with roles at the center of the institution ponder how we might leverage the interest—which ranges from fascination to horror—to talk about what we have been trying to talk about for years: better teaching, better assessment, and healthier relationships with grades.

It’s amazing how often conversations about the implications of AI revert to narrow perspectives on the purpose of assessment and the obsession with grades. But the unsurprising call to use formal exams for all summative assessment can trigger wider discussions, ones that I find myself having with increasing frequency.


The logic goes as follows: The only way to be certain that it is the student who has done the work is to watch them like hawks while they do it! While I respect the argument that in-person, timed examinations may have a role to play in the assessment diets we serve up to our students, I do not see a massive increase in in-person exams as a panacea or even a stop-gap solution to the ‘problem’ that is AI. The traditional closed-book exam format is hugely problematic: it often lacks construct validity, tends toward a focus on knowledge (i.e. what is more easily measured), induces stress, and reinforces inequities. Additionally, despite offering efficiency, exams are far from incorruptible given issues of impersonation and the increasing availability of wearable technology. The examples of “cribbing garments” purportedly used in Chinese civil service exams show that subversion is anything but new and—then as now—is likely to be something more readily available to those with significant wealth.

A starting point in conversations is often to point out distinctions between product- and process-focused assessments. Process-focused assessments harness developmental potential and underscore the importance of growth as central to the learning process. By having these conversations, I am finding that colleagues begin to see more clearly that teaching and assessment are not separate entities, that they are perhaps even more than two sides of the same coin, that perhaps everything is assessment.

Likewise, programmatic assessment promotes a comprehensive perspective that can facilitate dialogue across programs and create opportunities for a more integrated understanding. Such approaches necessitate recalibrating feedback as dialogue and development, rather than as a rationale for the grade awarded. This does not mean more grading but rather grading at different times, in different ways, and, potentially, using different media. We might, for example, in those countries where writing remains dominant, such as the UK, reconsider the value we put on written assessments and incorporate more oral assessment, a well-established practice in some academic contexts globally. In my professional life, I talk a lot (and, I hope, listen more), certainly a lot more than I write. It often gives me pause to wonder why we (in the UK at least) privilege writing over other types of assessment.

Engaging students in a dynamic dialogue, one that evolves iteratively, can significantly enhance their critical thinking and communicative skills. Such practices are already the hallmark of elite institutions blessed with abundant resources. If we think this is good pedagogic practice, then we need to advocate for it. Of course there are logistical demands of such assessment practices, but having personally witnessed huge numbers of students engage in interactive oral assessments in medical science subjects in a single day, I am convinced we are often limited only by our assumptions about what is possible.

As class sizes grow and educational resources are stretched thin, the scalability of such personalized approaches remains a common objection. Nevertheless, I often think of one of the few lessons I remember from school: An appeal to cost or convention is a weak argument for not doing something (or continuing to do something badly). Change requires catalysts, and temporary change was one consequence of the global pandemic. 

AI is not a blip or a hurdle but a change that is here to stay. Wherever you position yourself on the enthusiasm spectrum, we have an opportunity to steer conversations towards more inclusive, developmental, dialogic assessment—conversations that simultaneously surface discussions about assessment, feedback, and, ultimately, the grades that still carry so much weight. We do not need to be on the same page in relation to AI, but we can collectively leverage the conversations it is sparking to move towards a more positive assessment future.


Martin Compton is an Associate Professor at the Arena Centre for Research-Based Education at University College London with a focus in program design, curriculum development, as well as teaching, assessment, and feedback enhancement. You can follow him on Twitter @mart_compton and on his blog.