Improving the Generalizability of Models of Collaborative Discourse

Abstract

We investigated methods to enhance the generalizability of large language models (LLMs) designed to classify dimensions of collaborative discourse during small group work. Our research utilized five diverse datasets that spanned various grade levels, demographic groups, collaboration settings, and curriculum units. We explored different model training techniques with RoBERTa and Mistral LLMs, including traditional fine-tuning, data augmentation paired with fine-tuning, and prompting. Our findings revealed that traditional fine-tuning of RoBERTa on a single dataset (serving as our baseline) led to overfitting, with the model failing to generalize beyond the training data’s specific curriculum and language patterns. In contrast, fine-tuning RoBERTa with embedding-augmented data led to significant improvements in generalization, as did pairing Mistral embeddings with a support vector machine classifier. However, fine-tuning and few-shot prompting Mistral did not yield similar improvements. Our findings highlight scalable alternatives to the resource-intensive process of curating labeled datasets for each new application, offering practical strategies to enhance model adaptability in diverse educational settings.
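As a rough illustration of the embedding-plus-classifier approach mentioned above, the sketch below pairs pre-computed utterance embeddings with a support vector machine in scikit-learn. The file names, the embedding source, and the hyperparameters are assumptions made for illustration only, not the authors' implementation.

```python
# Minimal sketch (not the authors' code): classify collaborative-discourse
# labels by feeding pre-computed LLM utterance embeddings to an SVM.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical inputs: one embedding vector per utterance (e.g., extracted
# from a Mistral model) and one label per utterance for a single dimension.
train_embeddings = np.load("train_embeddings.npy")  # shape: (n_train, dim)
train_labels = np.load("train_labels.npy")          # shape: (n_train,)
test_embeddings = np.load("test_embeddings.npy")    # held-out dataset

# Standardize features, then fit an RBF-kernel SVM on the training embeddings.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
clf.fit(train_embeddings, train_labels)

# Predict discourse labels for the held-out (cross-dataset) utterances.
predictions = clf.predict(test_embeddings)
```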

Keywords

Generalization, Natural language processing, Collaboration analytics