
Automating Self-Affirmation Essay Coding: Fine-Tuned BERT Performance Comparable to Human Coders and Comparison with GPT-4

Research output: Contribution to journal (article, peer-reviewed)

Abstract

Previous studies have shown that a self-affirmation writing intervention, in which students reflect on personally important values, improves students’ school performance, and the intervention remains an active area of research. This research, however, requires manual coding of students’ writing exercises, which has proved time-consuming and expensive. To assist future self-affirmation intervention studies and educators implementing the writing exercise, we used our labeled data to fine-tune a pre-trained BERT language model that performs comparably to human coders (Cohen’s Kappa: 0.85 between machine coding and human coders, compared with 0.83 between human coders). To explore the potential of more advanced language models that do not require a large training dataset, we also evaluated OpenAI’s GPT-4 in zero-shot and few-shot classification settings. GPT-4’s zero-shot predictions yield reasonable accuracy but reach neither the fine-tuned BERT model’s performance nor human-level agreement, and adding example essays (few-shot prompting) did not appreciably improve GPT-4’s results. Our analysis also finds that the BERT model’s performance is consistent across student subgroups, with minimal disparity between “stereotype-threatened” and “non-threatened” students, the focal comparison groups in the self-affirmation intervention. We further demonstrate that the fine-tuned model generalizes to an external dataset collected by a different research team: on this new sample, the model maintained high agreement with human coders (Cohen’s Kappa = 0.86). These results suggest that a fine-tuned transformer model can reliably code self-affirmation essays, reducing the coding burden for future researchers and educators. We make the fine-tuned model publicly available at https://github.com/visortown/bert-self-affirm to help the research community automate the burdensome task of coding.
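
The agreement figures in the abstract are Cohen's Kappa values, which correct raw agreement between two coders for the agreement expected by chance. The following is a minimal sketch of that statistic, not the authors' evaluation code; the function name `cohens_kappa` and the toy binary labels are assumptions made purely for illustration.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's Kappa for two coders who labeled the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    rate and p_e is the agreement expected by chance from each coder's
    marginal label frequencies.
    """
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("Both coders must label the same non-empty item set.")

    n = len(labels_a)

    # Observed agreement: share of items given the same label by both coders.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: sum over labels of the product of marginal rates.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    p_e = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    # Degenerate case: both coders always used the same single label.
    if p_e == 1.0:
        return 1.0

    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical labels (1 = essay coded as self-affirming, 0 = not).
human_codes = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
model_codes = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
kappa = cohens_kappa(human_codes, model_codes)
print(f"Cohen's Kappa: {kappa:.2f}")
```

In this toy example the coders agree on 9 of 10 essays and chance agreement is 0.5, so the statistic is well below the raw 90% agreement rate. This is why kappa, rather than accuracy, is the usual basis for comparing machine coding with human inter-rater reliability.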

Original language: English (US)
Pages (from-to): 66-88
Number of pages: 23
Journal: Journal of Educational Data Mining
Volume: 18
Issue number: 1
State: Published - 2026

Keywords

  • BERT fine-tuning
  • GPT-4
  • automated essay classification
  • self-affirmation intervention
  • student writing coding
  • text classification

ASJC Scopus subject areas

  • Education
  • Computer Science Applications
  • Artificial Intelligence
