In the age of generative AI, should universities bother evaluating students?

Jun 15, 2023

AI-generated image representing a university degree: “university degree, photorealistic” | Midjourney

Last fall, when OpenAI released ChatGPT to the public, the world of education went into panic mode. There was now a freely accessible, easy-to-use tool that could produce college-level content on a variety of topics. "Write an essay on X": if the servers were not down because of overwhelming usage, the chatbot would comply in a few seconds. OpenAI reported that its latest large language model (LLM), GPT-4, aced several simulated exams such as the Uniform Bar Exam.

The humanities are especially concerned. The LLMs that power generative artificial intelligence systems like ChatGPT excel at producing output relevant for completing typical humanities assignments. And unlike with ordinary plagiarism, it is practically impossible to know for sure whether an LLM generated the prose, because the exact sequence of words is unique. How can we give grades and deliver diplomas if students don't need to actually read or write to pass assignments and if there is no way to verify that the work is theirs?

Various solutions were proposed. The return of the oral examination. More in-person exams. Embracing the technology. Interestingly, most of the discussion overlooked one fundamental question: Why do we want to give grades that reflect students' genuine understanding in the first place? Students enrol in courses to gain new knowledge and skills. If they don't actually want to learn, why should we care about the grade they receive? They will be the ones outcompeted by others who learned. They will be the ones struggling in their jobs. They will be the ones less equipped to navigate the world. Why should universities overhaul their evaluation methods for students who don't want to learn?

One reason why accurately evaluating students matters is that universities have a certification function. Certification signals the quality of a good or service and thus reduces informational asymmetries. It is ubiquitous, from fair trade labels to professional certifications. When institutions deliver a Bachelor of Science diploma to someone, that piece of paper certifies that the person acquired substantial skills and knowledge in one area of the scientific domain. Certification is useful for both the suppliers and the buyers of knowledge. How can someone prove they really know mechanical engineering versus others who don't? By obtaining a degree. They can also signal that their degree is especially valuable by obtaining it from a prestigious institution. Similarly, organizations looking for mechanical engineers don't necessarily need to trust particular applicants or run extensive tests. As long as they can trust the degree-granting institution, they can trust the degree it confers and what it certifies.

Certification also plays a role in constructing merit rankings. These rankings serve to allocate awards, admission slots, resources, etc. Resources being scarce, we usually want to grant them to the most deserving. Students with better grades often get admission and scholarships at better schools. And people with better grades or diplomas have a better chance of obtaining the most sought-after positions. The ranking and certification functions are also interrelated. For instance, justifiably or not, some institutions are more prestigious than others. Therefore, certifications from more prestigious institutions tend to place people higher in the merit rankings.

Generative AI tools like ChatGPT threaten universities' ability to fulfil their certification function. If universities cannot guarantee an accurate evaluation of students, the certification delivers a signal of inferior quality. A diploma is only as good as the certification process that underlies it. And without an accurate and reliable certification process, the piece of paper loses its meaning and social purpose. Organizations will not be able to trust that people have knowledge and skills on the basis of a diploma. Certification is also crucial in high-stakes contexts where not having the appropriate skills and knowledge can cause significant harm. Nobody wants to be treated by unqualified health professionals.

Likewise, if universities cannot guarantee an accurate evaluation of students, grades will stop being good evidence for knowledge and skills. As a result, grades will also stop being good evidence for merit. The person who got the highest grades may be especially adept at prompt engineering, but she should probably not get admitted to a top graduate programme for that reason. The general issue is that we often want to distribute opportunities according to a principle of merit and that grades currently play an important role in the construction of merit. Generative AI tools are disrupting this process.

Universities' primary reaction has been to adapt assignments and grading in order to secure the certification process. This is understandable. But this crisis also gives us the opportunity to question the social purpose of certification by universities. To what extent should universities certify students?

Here are some reasons why universities should be less concerned about certification.

A university degree is a low-quality signal
Certification by universities delivers a noisy signal. First, grade inflation is real. Although there may be social benefits to grade inflation, inflated grades provide less accurate information about a person's educational achievements. For various reasons, teachers and universities have an incentive to give high grades and deliver diplomas. Second, the certification process is not standardized across institutions or domains. For example, two universities that both offer a bachelor's degree in philosophy may have substantially different programmes. Although there are standardization efforts such as the Bologna Process in Europe, what grades and diplomas mean may vary from one institution to the other. Having a degree in a given discipline only provides imperfect information about the skills and knowledge that a person possesses. As a result, it is unclear why we should put too much effort into safeguarding an imperfect certification process.

Organizations have in-house certification processes
Organizations know that grades and degrees provide patchy information. This is why hiring processes often involve various assessments. Interviewers ask applicants questions to test their knowledge. Applicants may be asked to complete in-house examinations or to demonstrate their skills as part of the process. Companies like Google have extensive hiring processes. More generally, organizations are increasingly dropping degree requirements for hiring. LinkedIn believes that skills will become more important than degrees for hiring. If organizations already assess whether people have the requisite skills and knowledge, it is not clear what value degrees and grades are adding to that process.

Experience is better evidence
Actually having done X is better evidence that one can do X than a piece of paper stating that one can do X. As a result, university degrees are at their most useful when people don't have actual experience related to the position they want to obtain. But once they get the position, experience or success in that position quickly constitutes better evidence. If a company paid someone tens or hundreds of thousands of dollars to accomplish certain tasks, surely that person has relevant skills and knowledge. Instead of certifying through exams and essays, universities could develop more experience-based evaluations. It is common in many domains, for instance healthcare, to have students do internships. Experience is high-quality evidence and is harder to game.

Certification doesn't matter equally
There are good reasons why we want to make sure healthcare professionals actually know how to treat patients. In contrast, what difference does it make whether a philosophy major really understands John Searle's Chinese Room thought experiment? Not much. Universities want degrees to be of uniform quality, but the stakes differ widely. In one field, a badly certified person might be a life-and-death matter. In another, it might just be a matter of lacklustre copywriting. Accordingly, how hard we seek to secure the certification process should depend on the stakes. Universities' response need not be one-size-fits-all.

I don't consider these reasons decisive grounds for universities to stop certifying students' knowledge and skills. But I do take them to support further reflection on why we need certification and how we should do it.

Welcome to the Machine
