Education technology applications are continually being invented, and recent AI advances have only accelerated that pace. While there are widely accepted approaches for evaluating stable, late-stage products (e.g., Randomized Controlled Trials), there is much less clarity about how to conduct evaluations at earlier stages of development. Given the risks that AI-powered solutions carry, from hallucinations to fairness concerns, early evaluations are more important than ever.
Five New Tools From The LEVI AI Lab
The Learning Engineering Virtual Institute (LEVI) created an AI Lab to help develop new ChatGPT-based tools. LEVI is supported by the Walton Family Foundation, and through an extensive collaboration with educators, students, and parents, the AI Lab developed five new pilot tools that address critical issues in education.
CMU PLUS: Using Tutoring And AI To Overcome Math Anxiety
PLUS, part of the Learning Engineering Virtual Institute, is currently used in 13 schools across four states, reaching an estimated 2,800 students. Researchers have found that students who use the platform improve their math skills and may even double their rate of math learning. Read about how one student also overcame his math anxiety by using PLUS.
Development of Scenario-Based Mentor Lessons
This demonstration shows recent advances in scenario-based tutor training, which centers on a learn-by-doing approach. The 15-minute lessons outlined in this study use the predict-observe-explain inquiry method to develop tutors' skill in supporting student motivation. These methods are being developed within the Personalized Learning2 (PL2) program, an app that pairs student software with human tutors to improve mentoring. Enhancing mentor training helps raise student achievement while keeping costs low. This form of training works best when tutors get scenario-based practice with response-specific feedback.
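To make the lesson format concrete, here is a minimal, hypothetical sketch of how a predict-observe-explain scenario with response-specific feedback might be represented in code. The field names and sample content are illustrative assumptions, not PL2's actual schema.

```python
# A hypothetical sketch of a predict-observe-explain scenario lesson with
# response-specific feedback; field names and content are assumptions,
# not PL2's actual data model.
from dataclasses import dataclass, field

@dataclass
class ScenarioStep:
    phase: str                  # "predict", "observe", or "explain"
    prompt: str                 # what the tutor-in-training sees
    feedback: dict[str, str] = field(default_factory=dict)  # response -> targeted feedback

lesson = [
    ScenarioStep(
        phase="predict",
        prompt="A student says 'I'm just bad at math.' What do you say next?",
        feedback={
            "praise_effort": "Good: effort-focused praise supports motivation.",
            "give_answer": "Try again: supplying the answer skips the motivational issue.",
        },
    ),
    ScenarioStep(
        phase="observe",
        prompt="Watch how an experienced tutor responds to the same student.",
    ),
    ScenarioStep(
        phase="explain",
        prompt="Why does the expert's response work better than your prediction?",
    ),
]

def give_feedback(step: ScenarioStep, response_key: str) -> str:
    """Return feedback tied to the tutor's specific response, with a generic fallback."""
    return step.feedback.get(response_key, "Compare your response with the expert's.")
```

The point of the structure is that feedback is keyed to the specific response a tutor gives, rather than a single generic message per step.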
Rewriting Math Word Problems with Large Language Models
In a recent study, math word problems in Carnegie Learning's MATHia adaptive learning software were rewritten by human authors and by AI to improve clarity. Findings showed that students spent less time reading the human-rewritten problems and achieved higher mastery than students who read the originals. The team also used GPT-4 to rewrite the same set of word problems under the same guidelines the human authors followed, comparing zero-shot, few-shot, and chain-of-thought prompting strategies. Analysis of the human-rewritten, original, and GPT-rewritten problems showed that the GPT rewrites had the best readability, lexical diversity, and cohesion scores, though they used more low-frequency words. Carnegie Learning plans to test the rewritten problems in randomized field trials in MATHia.
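As a rough illustration of what comparing these prompting strategies can look like in practice, here is a minimal Python sketch using the OpenAI API. The guideline text, example rewrite, and prompt wording are assumptions for illustration, not the study's actual materials.

```python
# A minimal sketch (not Carnegie Learning's actual pipeline) of comparing
# zero-shot, few-shot, and chain-of-thought rewriting with the OpenAI API.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Illustrative stand-in for the study's rewriting guidelines.
GUIDELINES = "Rewrite the word problem for clarity; keep the math unchanged."

def build_messages(problem: str, strategy: str) -> list[dict]:
    """Assemble chat messages for the chosen prompting strategy."""
    system = {"role": "system", "content": GUIDELINES}
    if strategy == "zero_shot":
        user = f"Rewrite this problem:\n{problem}"
    elif strategy == "few_shot":
        # A worked (original, rewrite) pair demonstrates the target style.
        user = (
            "Original: A train leaves the station travelling 60 mph...\n"
            "Rewrite: A train travels at 60 miles per hour...\n\n"
            f"Original: {problem}\nRewrite:"
        )
    elif strategy == "chain_of_thought":
        # Ask the model to reason about clarity issues before rewriting.
        user = (
            "First, list what makes this problem hard to read. Then rewrite it.\n"
            f"{problem}"
        )
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [system, {"role": "user", "content": user}]

def rewrite(problem: str, strategy: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4", messages=build_messages(problem, strategy)
    )
    return response.choices[0].message.content
```

Each strategy varies only the user prompt, which makes it straightforward to hold the guidelines and model constant while comparing outputs.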
Scenario-Based Training and On-The-Job Support for Equitable Mentoring
Personalized Learning2 (PL2) is a professional mentoring platform created by researchers at Carnegie Mellon. Its goal is to make tutors' work more effective through personalized, situation-based instruction. PL2 combines AI with research-based mentor training to support under-trained tutors. The platform covers social-emotional learning, math content, and culturally responsive teaching practices, aiming to close achievement gaps for historically marginalized students by making tutors more effective and productive. PL2 offers a lower-cost option for deliberate practice, increasing tutors' impact and learning capacity.
Retrieval-augmented Generation to Improve Math Question-Answering: Trade-offs Between Groundedness and Human Preference
Interactive question-answering (QA) with tutors has been shown to be an effective way for middle school math students to learn. While not all students have access to a tutor, large language models make it possible to automate portions of the tutoring process, including interactive QA that supports students' discussion of mathematical concepts. Some have questioned how well LLM responses can be aligned with a school's curriculum. In this paper, Levonian and colleagues explore how retrieval-augmented generation (RAG) can improve response quality by incorporating textbook content and other educational resources, while also identifying the trade-offs of using RAG.
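For readers unfamiliar with RAG, here is a minimal sketch of the core idea: embed curriculum passages, retrieve the ones most similar to a student's question, and ground the model's answer in them. The toy corpus, embedding model, and prompt wording are illustrative assumptions, not the paper's implementation.

```python
# A minimal retrieval-augmented generation sketch in the spirit of the paper;
# the corpus, model choice, and prompt are illustrative assumptions.
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")

# Toy "textbook" corpus; in practice, curriculum-aligned passages.
passages = [
    "A ratio compares two quantities by division.",
    "To add fractions, first rewrite them with a common denominator.",
]
passage_vecs = encoder.encode(passages, normalize_embeddings=True)

def retrieve(question: str, k: int = 1) -> list[str]:
    """Return the k passages most similar to the student's question."""
    q_vec = encoder.encode([question], normalize_embeddings=True)
    scores = passage_vecs @ q_vec[0]        # cosine similarity (vectors are normalized)
    top = np.argsort(scores)[::-1][:k]
    return [passages[i] for i in top]

def build_prompt(question: str) -> str:
    """Ground the LLM's answer in retrieved curriculum text."""
    context = "\n".join(retrieve(question))
    return (
        "Answer the student's math question using only the textbook "
        f"excerpts below.\n\nExcerpts:\n{context}\n\nQuestion: {question}"
    )

print(build_prompt("How do I add 1/3 and 1/4?"))
```

The trade-off the paper examines follows directly from this design: constraining the model to retrieved text improves groundedness in the curriculum, but can make responses feel less natural or helpful to human raters.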