“Nothing personal, but…”: Technology, Learning, and Assessment

The history of technology in education has a running theme: improving student outcomes through personalized instruction.

From a distance, this seems like a good thing. After all, a building block of good instruction is to know your students. Where do they start from? What are their interests and passions? What do they bring with them that might support or hinder their learning? What are their physical, cognitive, and psychological needs?

Good teaching meets students where they are and supports them on their journey to where they want or need to go. In addition, one-on-one instruction or tutoring has long been viewed as the most effective kind of instruction when done well, and much educational technology has attempted to replicate that relationship.

But there is also a darker side. The focus on personalization in educational technology often comes at the expense of the kinds of relationships we know are important for learning. The personal learning espoused by edtech entrepreneurs often leans towards extreme individualization, and a limited view of knowledge and learning. Can we do better? 

Thorndike vs. Dewey

In discussions of the history of education in the United States, you often see a quote from former Spencer Foundation President Ellen Condliffe Lagemann’s book An Elusive Science: The Troubling History of Education Research: “One cannot understand the history of education in the United States during the twentieth century unless one realizes that Edward L. Thorndike won and John Dewey lost.”

Both were influential thinkers in education from the late 1800s through around 1930. Thorndike was a psychologist who sought to make education scientific through quantification, championing the development of standardized testing that drove the move towards the kind of ranking and sorting that regular readers of this site decry. Also, he was a eugenicist.

Dewey was a philosopher and progressive educator who believed in learning from experience, and was more interested in the process of learning than in grading. He generally opposed separating records of learning from the context in which things were learned. It seems safe to say he would not approve of GPAs as a measure of student accomplishment, and that readers of this site might enjoy a day spent in his famous Laboratory School at the University of Chicago, at least back in Dewey’s time. 

Much of educational technology (edtech), and especially edtech related to assessment, falls into the Thorndike camp. Why is this? Critic Audrey Watters observes that edtech is “more Thorndike than Dewey because education is more Thorndike than Dewey.” The creators of edtech are the products of a system, and visions of how to improve that system are often limited by the entrepreneurs’ own experiences.

Watters also observes that, ironically, the meaning of “progressive” has been warped from its original Deweyan vision to embody the hyper-individualized vision of Thorndike that was oriented more towards sorting students than towards promoting their learning. Many in this camp also view teachers as a limiting factor when it comes to student success, and seek to sideline or replace teachers with technology.

Of course, there are some bright spots along the way. Dewey may have been defeated, but he did not disappear. Much research in the Learning Sciences, and movements around inquiry-based, problem-based, project-based, and collaborative learning are hopeful flickers of Deweyan thinking. And as we will see, some edtech developers do recognize the importance of viewing the learning environment holistically, elevating the role of the teacher as its organizer and guide.

Below, I present a few key examples of technologies that assess learning. I am limiting my focus to tools that combine learning and assessment—what we call formative assessment—rather than on those focused only on summative assessment. We won’t be getting into computer-adaptive testing here!

The tools I will discuss are: teaching machines, computer-based tutors, intelligent tutors, game-based or “stealth” assessment, and learning analytics. In all cases, we’ll come back to the theme of personalized learning and visions (or lack thereof) of learning. 

Before getting into the technologies, let’s briefly consider assessment itself. In the TG2 world, when we talk about “ungrading,” many of our peers think we are primarily talking about changing the way we record and communicate outcomes at the end of the learning experience, or what is known as summative assessment. But we are really talking about how we support learners throughout the entire learning process, with a focus on formative assessment: assessment in the midst of learning, meant to provide feedback and guidance for future learning.

Meta-reviews of different elements of the teaching and learning process document the centrality of formative assessment. Learners need feedback to self-correct and to gauge progress towards their goals. Feedback is key to self-regulation, and the wrong kind of feedback, or feedback delivered in the wrong way, can quickly derail learners’ motivation and the entire learning process. That’s what brings many of us to the Teachers Going Gradeless site, and to the broader discussion of ungrading and alternative grading and assessment systems. OK, now we can get into some edtech!

Teaching Machines and a Limited Vision of Learning

Teaching machines held the attention of the edtech world for a big chunk of the middle of the twentieth century. Heavily influenced by the work of Thorndike, they were first introduced by psychologist Sidney Pressey and are still associated most closely with B.F. Skinner (for a complete history of this technology, see the excellent book Teaching Machines by Audrey Watters). In many ways, teaching machines established the pattern for how learners interact with technology that is still followed to this day. Learners interact with content presented by the machine, are assessed continuously, and by virtue of completing the “program” are said to have mastered the material presented.

The learning theory behind teaching machines is strictly behavioral. For Skinner and his colleagues, there was no theory of mind, only the ability to observe what a student does: their behaviors. Teaching machines employed a pedagogical approach called “programmed instruction,” and the program came from a designer who authored the material students interact with. Notably, teaching machines were not “computers” in the modern sense, and they had no memory or means of recording a student’s score. They were purely mechanical devices that were designed to limit a student’s focus to a single and simple fill-in-the-blank statement. The entire script of items represented an instructional text. The steps between items were so small that a student would most likely correctly fill in the blank, but if not they could self-check and enter the correct response before moving on. The assessment and instruction were thus highly interactive, though the notion of “learning” material was pure call and response, and fully devoid of context. Learners using teaching machines did not interact with each other, or with their teacher. Only with the machine. Teachers were thus sidelined from the main task of instruction. 
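
To make this concrete, here is a minimal sketch, in Python, of the programmed-instruction loop a teaching machine embodied: tiny fill-in-the-blank steps, an immediate self-check, and nothing recorded. The items here are invented for illustration; a real “program” was a long, carefully sequenced script written by a designer.

```python
# A minimal sketch of Skinner-style programmed instruction.
# The items and answers are invented for illustration.

ITEMS = [
    ("A device that presents material in small steps is a teaching ____.", "machine"),
    ("The student checks the answer and corrects it ____ moving on.", "before"),
]

def run_program(items):
    for prompt, answer in items:
        while True:
            response = input(prompt + " ").strip().lower()
            if response == answer:
                print("Correct.")  # immediate confirmation, then the next small step
                break
            # No score is recorded anywhere; the student sees the correct
            # answer and must supply it before advancing.
            print(f"The answer is '{answer}'. Enter it to continue.")

run_program(ITEMS)
```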

I find it truly remarkable how similar Skinner’s way of talking about teaching and learning is to today’s edtech entrepreneurs. You can find many versions of Skinner discussing his teaching machines on YouTube, such as this one. In the video, Skinner talks about the inefficiency of teaching to the middle. “The student is free to move at his own pace. With techniques in which the whole class is forced to move forward together, the bright student wastes time waiting for others to catch up. And the slow student, who may not be inferior in any other respect, is forced to go too fast.”

In terms of feedback, Skinner argues that when using teaching machines, the student “learns immediately whether he is right or wrong. This is a great improvement over the system in which papers are corrected by a teacher, where the student must wait, perhaps until another day, to learn whether or not what he has written is right.” Efficiency is emphasized: “A conservative estimate seems to be that, with these machines, the average grade- or high-school student can cover about twice as much material with the same amount of time and effort as with traditional classroom techniques.” No child left behind, as it were.

Teaching machines represented a highly personalized form of instruction… so long as your notion of the learners is that they are all identical except for the speed at which they can make sense of the material. And so long as your notion of learning is that you can provide the proper response for a given stimulus. 

Computer-based Tutors Increase Interactivity

Around the same time that Skinner was attempting to get teaching machines into wide use (Narrator voice: he did not succeed), actual computer-based systems for teaching were starting to emerge, most notably the PLATO system (Programmed Logic for Automatic Teaching Operations), developed at the University of Illinois in the 1960s and spreading worldwide over the following four decades.

Because early versions of PLATO relied on networked mainframe computers with remote terminals, it was more widely employed in higher education and other specialized settings than in K-12 schools. PLATO was the first system for generalized computer-based tutoring (CBT). One reason for PLATO’s popularity was the accompanying TUTOR scripting language, which made it relatively simple for people to create new programs. At first, PLATO was completely text-based. Over the decades, as multimedia computing became widely available, pictures, audio, and video could be incorporated into PLATO programs. 

If you had an experience with computer-based learning before the 1990s, most likely that experience involved PLATO. When I was an undergrad in the 1980s, my university offered a philosophy course that could be completed (for college credit!) completely on PLATO. But for all of the (relative) razzle-dazzle of computers, instruction on PLATO was not radically different from teaching machines. Learners still interacted with a program created by an expert, and interacted with that program as an individual.

One difference is that instead of immediate assessment and feedback, the lessons designed for PLATO (and CBT in general) usually saved quizzes for the end of instructional modules. The TUTOR language allowed for branching, making it slightly more responsive to individual differences in performance compared to teaching machines, but again the system was dependent on user behavior to drive its responses. The benefits proposed by Skinner for his contemporaneous teaching machines also applied, as PLATO was in part inspired by the need to expand educational access to meet the demands of growing college populations after the GI Bill, in addition to the technical challenge represented by Sputnik.
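
As a rough sketch of what that branching looked like, here is the end-of-module logic a PLATO lesson might implement, written in Python rather than the actual TUTOR language; the module names, question, and passing threshold are all invented.

```python
# A rough sketch of end-of-module branching in the style of PLATO
# courseware. Python stands in for the TUTOR language; the modules,
# question, and passing threshold are invented for illustration.

MODULES = {
    "intro": {
        "quiz": [("Does PLATO use remote terminals? (y/n)", "y")],
        "pass": "advanced",      # branch taken on a passing score
        "fail": "intro_review",  # remedial branch otherwise
    },
}

def run_module(name, threshold=0.8):
    module = MODULES[name]
    correct = sum(input(q + " ").strip().lower() == a for q, a in module["quiz"])
    score = correct / len(module["quiz"])
    # The branch depends only on observed behavior (the quiz score),
    # not on any model of what the learner actually understands.
    return module["pass"] if score >= threshold else module["fail"]

print("Next module:", run_module("intro"))
```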

PLATO was a high-tech way to provide instruction to the masses, so long as those masses could sit in front of a hard-wired computer terminal. In many ways the instructional design conventions adopted by PLATO course designers mimicked the approach taken in any large lecture course, and the level of feedback was similar as well; learners could see what they had gotten right or wrong, and then repeat a module or follow a branch for more specialized instruction. PLATO is thus also a direct forerunner of MOOCs. Personalization was limited primarily to pacing, and though behavioral models of learning were starting to be replaced by cognitive models, it would be hard to tell that by looking at most CBT courseware.

Intelligent Tutors Pay Attention to the Learner…and Embrace Context

Cognitive Tutors, developed at Carnegie Mellon University starting in the 1980s, are one example of systems known as Intelligent Tutoring Systems (ITS), a label that caused some consternation among the CBT crowd, who resented the implication that their systems were “dumb.” (Narrator voice: They kind of were.) Cognitive Tutors and related tools represent the first deep foray into cognitive models of learning that are embodied in the instructional approach. Cognitive Tutors were based on a cognitive theory called ACT-R, developed by John Anderson and refined by many colleagues over decades of work.

Put as simply as possible, ACT-R required a knowledge base of rules related to the content area, along with information about what understanding certain concepts might mean in relation to understanding other concepts. If you wanted to teach algebra, for example, you needed to first construct a detailed representation of what understanding algebra looks like, and also what it looks like when someone is learning algebra. Because they are rule-based, Cognitive Tutors and ITS more generally represented what is known as symbolic artificial intelligence, as opposed to the connectionist models behind generative AI tools like ChatGPT.
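
As a loose illustration of the shape of the idea (and emphatically not the actual ACT-R implementation), a rule-based tutor needs an explicit map of the domain’s skills and how they depend on one another; the skill names below are invented.

```python
# A loose, invented illustration of a rule-based skill map -- not the
# actual ACT-R implementation. The tutor needs the skills that make up
# the domain and the prerequisite relations among them.

PREREQS = {
    "solve_linear_equation": ["combine_like_terms", "isolate_variable"],
    "isolate_variable": ["inverse_operations"],
}

def ready_to_learn(skill, mastered):
    """A skill is 'ready' once all of its prerequisites are mastered."""
    return all(p in mastered for p in PREREQS.get(skill, []))

mastered = {"combine_like_terms", "inverse_operations"}
for skill in ["isolate_variable", "solve_linear_equation"]:
    print(skill, "->", "ready" if ready_to_learn(skill, mastered) else "blocked")
```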

The ACT-R model allowed for continuous assessment of learning and truly customized responses as students followed different paths towards mastery. I would be the first to admit that I’m not doing the description of Cognitive Tutors justice, but suffice it to say that in their design and theory of knowledge/learning they differ markedly from CBT and teaching machines. The program of research associated with Cognitive Tutors is top-notch, and I consider them one of the most thoughtfully designed edtech tools of all time. Along with other ITS systems, these are the first real uses of AI in education.

One thing that sets the Cognitive Tutor crowd apart from the other systems and approaches described in this post is that they were perceptive enough to realize that, if they were to make a difference in schools, they required the cooperation of teachers.

To this end, Ken Koedinger and colleagues wrote what remains one of my personal favorite papers, called “Intelligent Tutoring Goes to School in the Big City,” published in 1997 in the International Journal of Artificial Intelligence in Education. In the article, the authors describe the importance of blending the “classroom expertise” of teachers with the “AI expertise” of the Carnegie Mellon team. As a practical matter, they recommended mixing experiences with the Cognitive Tutor with a (truly) progressive mathematics curriculum developed in the context of Pittsburgh Public Schools. Students went back and forth between time on the Cognitive Tutors and time working collaboratively in the classroom, and teachers were able to use data from the Cognitive Tutors to better support student learning.

In their current incarnation, the Cognitive Tutors live on in a product called MATHia, and in tools such as ASSISTments (the name itself meant to imply formative assessment that assists learning), which are tailored to support teachers in diagnosing areas where students are making progress or may require additional support. This approach starts to feel like a much better form of personalization, recognizing both individuals (though within the constraints of the cognitive model) and context, through connection to the broader classroom environment.

Stealth Assessment: A Whole New Game?

Our next candidate is not yet a concrete “product” like teaching machines, CBT, or ITS. But it is an area of edtech with interesting potential for assessment.

Video games were one of the reasons I first became interested in new forms of assessment and grading, leading to the development of what I call “gameful learning.” A well-designed game gets people to take on challenges and persist through multiple attempts (and even failures!) on the way to success. Well-designed games also have to walk a line between being too easy, and therefore boring, and too difficult, and therefore frustrating. In psychology, this sweet spot is known as the “flow zone.”

Some games find this space by paying attention to how the player is doing, and either adjusting difficulty or offering additional support as needed. To do this, a game needs to have an underlying model of performance against which the current player can be compared. Valerie Shute and Matthew Ventura of Florida State University recognized the way that games track player skill, tactics, and accomplishments and wondered if a similar approach could be employed to gauge hard-to-measure but desirable attributes in learners, like the development of problem-solving skills and creativity, alongside the development of more typically measured things like science knowledge. But remember, this assessment is happening in the context of playing a game. You don’t stop a game to give a player a quiz. The key is to conduct the assessment and provide feedback towards greater learning without the learner even being aware that the assessment is happening. Shute and Ventura called this approach stealth assessment.
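
As a toy sketch of the difficulty adjustment described above (the target band, step size, and success measure are all invented), a game might keep the player in the flow zone like this:

```python
# A toy sketch of flow-zone difficulty adjustment; the target band,
# step size, and success measure are invented for illustration.

def adjust_difficulty(difficulty, recent_successes, target=(0.6, 0.8), step=0.1):
    """Nudge difficulty so the player's recent success rate stays in the flow zone."""
    rate = sum(recent_successes) / len(recent_successes)
    if rate > target[1]:   # succeeding too often -> risk of boredom
        return difficulty + step
    if rate < target[0]:   # failing too often -> risk of frustration
        return difficulty - step
    return difficulty      # in the flow zone; leave it alone

print(adjust_difficulty(0.5, [1, 1, 1, 1, 1]))  # always succeeding -> rises to 0.6
```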

Stealth assessment is built on top of an approach called evidence-centered design (ECD), pioneered by assessment scholars at the Educational Testing Service (yes, that ETS) who sought to measure learning beyond what was possible with standardized multiple-choice tests. Designing a learning environment with ECD means constructing three interrelated models: a competency model that describes different stages of understanding, a task or action model that embodies the activities that might indicate understanding, and an evidence model that provides definitions and statistical rules for determining whether and how learner actions relate to competencies.
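
Here is one way to sketch those three interrelated models as simple data structures; in real stealth assessment these are statistical models (often Bayesian networks), and the competency, actions, and scoring rule below are invented for illustration.

```python
# An invented sketch of the three ECD models. In real stealth assessment
# these are statistical (often Bayesian network) models, not dictionaries.

# Competency model: stages of the attribute being measured.
COMPETENCY_MODEL = {"persistence": ["low", "medium", "high"]}

# Task/action model: in-game actions that can provide evidence.
TASK_MODEL = {"persistence": ["retry_after_failure", "attempt_optional_hard_level"]}

def evidence_model(observations):
    """Map observed actions to a competency estimate (toy scoring rule)."""
    retries = observations.count("retry_after_failure")
    if retries >= 3:
        return "high"
    return "medium" if retries >= 1 else "low"

log = ["retry_after_failure", "attempt_optional_hard_level", "retry_after_failure"]
print("persistence estimate:", evidence_model(log))  # -> medium
```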

The processes employed in ECD are analogous to the cognitive modeling underlying intelligent tutors, though the measurement targets are more ambitious. The team at Florida State built a game-based environment called Newton’s Playground to test their ideas, and successfully measured learners’ creativity, persistence, and conceptual understanding of Newtonian physics, all without interrupting students’ gameplay experience. Shute and colleagues caution that this approach is extremely challenging, and that building the required ECD models takes a lot of up-front effort that must be tailored to each environment and topic. Still, the approach demonstrates what is possible, and provides some hope that in the future assessment can be personalized in ways that mesh with the engagement and interests of learners.

Learning Analytics: Nudging and Tailoring

Somewhere around 2010, folks realized that all the technologies we interact with generate tons of data about us. To be clear, marketers and venture capitalists realized this much earlier, but around 2010 educators and educational researchers started to focus on how we might use this data to provide better support for learners, instructors, and institutions seeking to improve learning. The Society for Learning Analytics Research defines learning analytics (LA) as “the measurement, collection, analysis and reporting of data about learners and their contexts, for purposes of understanding and optimizing learning and the environments in which it occurs.”

Much academic work in LA has led to the construction of a range of dashboards and other tools—typically integrated with an institution’s learning management system—designed to support student success. For example, a tool developed at the University of Michigan called ECoach provides personalized guidance and nudges to learners based on their current activities and performance in a class, giving them feedback for improvement based on the past performance of students who are similar to them, at least with respect to trackable parameters in the system. When did a student start working on a particular assignment? If similar students who performed better started earlier, a nudge might recommend that they get moving.
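
A simplified sketch of the kind of nudge rule just described (not ECoach’s actual logic; the field names and data are invented): compare when a student started an assignment to when similar, higher-performing past students started theirs.

```python
# A simplified, invented sketch of a learning-analytics nudge rule --
# not ECoach's actual logic. Field names and data are made up.

from statistics import median

def start_day_nudge(student_start_day, similar_high_performers):
    """Nudge the student if similar students who did well started earlier."""
    typical_start = median(s["start_day"] for s in similar_high_performers)
    if student_start_day > typical_start:
        return (f"Students like you who did well started around day "
                f"{typical_start}. Consider getting started soon!")
    return None  # no nudge needed

peers = [{"start_day": 2}, {"start_day": 3}, {"start_day": 4}]
print(start_day_nudge(student_start_day=6, similar_high_performers=peers))
```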

This kind of approach is really useful for instructors in larger courses who want to personalize support for learners but find it difficult to keep track of individuals’ progress. LA tools like ECoach have been instrumental in helping educators do traditional instruction better (and are often paired with non-computer-mediated efforts to improve and innovate learning). But since these LA tools typically use traditional measures of academic performance (e.g., percentage-based grades, letter grades, and GPAs) as metrics of success, they rarely promote efforts to rethink grading or assessment.

Like many of the edtech tools before them, they are mostly trying to help students succeed in Thorndike’s world, not expand Deweyan approaches. Still, I’m a big fan of academic LA research. Even when it operates within a normative vision of what learning should be, it has tremendous potential for supporting academic innovation, and some of this work is connected to efforts to rethink assessment and learning more broadly.

In the private sector, however, we find a different and more cautionary tale. 

Perhaps the most infamous example of LA in the private sector is a company called Knewton, a start-up whose free (always a red flag) online tutoring platform used data from millions of students and, based on their answers to multiple-choice questions, recommended a just-right piece of content tailored to an individual learner’s needs. Programmed instruction and computer-based tutors all over again, but this time with massive data mining.

I was listening to Morning Edition on National Public Radio one morning in 2015 when a story and interview about Knewton came on that nearly caused me to choke on my coffee. Jose Ferreira, the company’s founder and CEO, described Knewton as “like a robot tutor in the sky that can semi-read your mind and figure out what your strengths and weaknesses are, down to the percentile.”

It’s hard to know where to start with a statement like that. But I am pleased to say that I don’t need to critique it, because Ferreira’s comments drew widespread ridicule and critique, including from the reporter in the NPR piece. Knewton failed to realize its vision, and was acquired on the cheap by textbook publisher Wiley. I’d like to say that Knewton is the apotheosis of personalized learning technologies for assessment, but honestly it’s just the latest in a long line of for-profit edtech selling personalization while the actual product is data about the learners themselves.

Conclusion

In this brief (and incomplete) tour of educational technologies built around assessment, the focus has been on personalizing learning aimed at individual students. In theory, more personalization leads to better outcomes for students. In practice, the approaches taken fall short because they are too often built on views of knowledge that are fact-based and transactional, reinforcing a Thorndike-driven behavioral view of what it means to learn. The competing strand of Deweyan thinking about learning has always been harder to embody in technology, even as modern technologies like Internet search and now generative AI and large language models like ChatGPT are again challenging what is worth teaching in school. As Martin Compton argues in his TG2 post, we need to continually question the weight we put on knowledge-based education, and hold out hope that we might treat the rise of ChatGPT as a “catalyst for systemic and sustained change to the way we do assessment.” But given the history of technology and education, this seems unlikely.

The point of the technologies described in this post was and is to make assessment in support of learning easier to accomplish in the classroom. I believe they found a measure of success in terms of helping teachers enact current practices better. But as I argued in an earlier post, the amount of labor involved in “doing better things” is a major barrier to true reform in grading and assessment practices. Emerging technologies often make us question current practices, but technology itself is rarely a catalyst for true change. For that we need to embrace new understandings of what it means to learn and be prepared to act in the world, along with the challenges that go along with meaningful change. 


Barry Fishman is Arthur F. Thurnau Professor and Professor of Learning Sciences in the Marsal Family School of Education and the School of Information at the University of Michigan in Ann Arbor. His research includes a focus on successful games as models for engaging learning environments, the creation of transformative and sustainable educational innovations using technology, and the design and implementation of new systems for supporting, recording, and reporting learning.