September 2022 was apparently the month when essay-related fears about artificial intelligence boiled over in academia, as several media outlets published op-eds lamenting the rise of AI writing systems that would ruin student writing and pave the way for unprecedented levels of academic misconduct. Then, on September 23, academic Twitter erupted into a bit of a panic about the topic. The firestorm was triggered by a post on the OpenAI subreddit in which user Urdadgirl69 claimed to be getting straight A's with essays "written" by artificial intelligence. Professors on Reddit and Twitter expressed frustration and concern about how best to tackle the threat posed by AI essays. One of the most poignant and most retweeted laments came from Redditor ahumanlikeyou, who wrote, "Judging something an AI has written is an incredibly depressing waste of my life."
While all this online hand-wringing was going on, my undergraduate Rhetoric and Algorithms students and I were conducting a little experiment in AI-generated student writing. After reviewing the 22 AI-generated essays I asked my students to create, I can confidently tell you that AI-generated essays are nothing to worry about. The technology just isn't there, and I doubt it will be anytime soon. For the AI essay activity, I borrowed an assignment sheet from my freshman writing class at the University of Texas at Austin. The assignment asks students to submit a 1,800-to-2,200-word proposal addressing a local problem. Students usually tackle issues on campus, advancing ideas like "It shouldn't be so hard to take computer science classes," "Student costs should be lower," or "Campus housing should be more affordable." For the Rhetoric and Algorithms class, I asked the students to rely on AI as much as possible. They were free to compose multiple prompts to generate AI outputs, and they were even allowed to use those prompts in their essays. The students were also free to rearrange paragraphs, eliminate obvious repetitions, and clean up the formatting. The primary requirement was simply that most of each essay had to be "written" by AI.
The students in this class were mostly juniors and seniors, and many were rhetoric and writing majors. They did a great job and put in a lot of effort, but in the end, the essays they submitted were not good. If I had thought these were real student essays, the very best would have earned somewhere around a C or C-minus. They met the minimum requirements of the assignment, but that was it. In addition, many of the essays had clear red flags for AI generation: outdated facts about the cost of tuition, quotes from past university presidents presented as if from current ones, fictitious professors, and named student organizations that don't exist. Few students in my class had experience with computer programming. As a result, they were particularly drawn to open-access text generators like EleutherAI's GPT-J-6B. Several students also chose to sign up for free trials of AI writing services like Jasper AI. Regardless of the language model they used, however, the results were consistently mediocre, and the fabrications were usually pretty obvious.
At the same time, I asked my students to write short reflections on the quality and difficulty of their AI essays. Nearly every student said they hated the assignment. They quickly realized that their AI-generated essays were subpar, and those who were used to earning high grades were reluctant to hand in the results. The students overwhelmingly reported that using AI took much more time than simply writing their essays the old-fashioned way. To get a little extra insight into the "writing process," I also asked the students to submit all the collected output from their AI text-generation "prewriting." Students regularly produced 5,000 to 10,000 words (sometimes as many as 25,000) of output to cobble together essays that barely met the 1,800-word minimum.
Quite a bit has been written about the alleged impressiveness of AI-generated text, and there are even several high-profile AI-written articles, essays, scientific papers, and screenplays meant to demonstrate it. In many of these cases, however, the "authors" had access to higher-quality language models than most students can currently use. More importantly, my experience with this assignment taught me that it takes a good writer to produce good algorithmic writing. The published samples generally benefited from professional writers and editors who crafted the prompts and edited the results into polished form. In contrast, many of my students' AI-generated essays showed the common problems of student writing: uncertainty about the appropriate style, trouble with organization and transitions, and incoherent paragraphs. Producing a quality essay with AI requires enough fluency in the target genre to craft prompts that lead the model toward the right kind of output, and it requires solid organizational and revision skills besides. As a result, the best writers among my students produced the best AI essays, while the developing writers generated essays with many of the same problems that would have appeared in their own writing.
All in all, this exercise suggests that we are not about to receive a deluge of algorithmically generated submissions from students. It's simply too much work to cheat that way. The activity also tells me that the best defense against AI essays is the same as the best defense against essay repositories: a good assignment sheet. If your assignment is "For today's homework assignment, describe the reasons for the American Civil War" (a literal stock prompt for the GPT-J model mentioned above), you are much more likely to receive AI-generated or downloaded essays than if you create a detailed assignment sheet specific to your classroom context. The assignment I used for my Rhetoric and Algorithms students posed a big challenge precisely because it asked them to address local problems. There just aren't enough relevant examples in the data the AI text generators draw on to produce plausible essays on such topics.
Beyond concerns about academic misconduct, this activity also showed me that AI text generation can be part of good writing pedagogy. Two of the most important and most difficult things to learn about writing are genre awareness and effective revision practices. Developing writers lack the experience necessary to sense the subtle differences between different types of essays or assignments, which is why student essays often feel over- or under-written. Students are often still figuring out how to strike the right tone and how to adapt their style for different writing tasks, and the usual delay between submission and feedback does little to develop this intuition. Prompt crafting for AI text generators, by contrast, provides nearly instant feedback. By experimenting with phrasings that do and do not produce the desired output, students can develop a feel for writing differently in different genres and contexts. Finally, and unfortunately, most of my students complete their writing assignments in a single session just before the deadline, and it is hard to get them to practice revision. AI-generated text offers an interesting opportunity for a kind of pedagogical training exercise: students can be asked to quickly generate a few thousand words and then revise those words into usable prose. This isn't "writing" in the same way that line drills aren't basketball, but that doesn't mean there's no useful pedagogical role here.
Ultimately, higher education will have to come to grips with AI text generation. Right now, most efforts to address these concerns lean toward either AI evangelism or algorithmic despair, which largely parallels the broader AI discourse. Yet neither evangelism nor despair seems to me the ideal response. To those who despair, I'd say it's highly unlikely that we will drown in AI-generated essays. With today's technology, cheating this way is simply harder and more time-consuming than actually writing an essay. At the same time, I'm highly skeptical that even the best models will ever truly enable students to produce writing that far exceeds their current skills. Effective prompt crafting and revision depend on high-level writing skills, and even as artificial intelligence improves, I wonder to what extent budding writers will be able to run text generators skillfully enough to produce impressive results. For the same reasons, I also question the enthusiasm of the AI evangelists. It has been just over five years since Google Brain computer scientist Geoffrey Hinton declared, "We should stop training radiologists now. It's just completely obvious that within five years deep learning is going to do better than radiologists." Well, we're still training radiologists, and there's no indication that deep learning will replace human doctors anytime soon. In the same way, I strongly suspect that fully robotic writing will always and forever be "just around the corner."