Abstract

This study tackles a dual challenge in writing assessment: validating the reliability of a Large Language Model (LLM) as an automated evaluator and assessing the efficacy of a Narrative Structure Generator (NSG) in improving the structural quality of story outlines. Through a systematic methodology of prompt engineering and repeated evaluations, we first established a highly reliable automated assessment framework using Google Gemini 1.5 Pro. The framework demonstrated excellent inter-rater reliability (ICC) and internal consistency (Cronbach's Alpha), successfully mitigating the model's inherent stochasticity. Leveraging this validated tool, a within-subjects experiment was conducted to compare outlines produced by students with and without NSG assistance. A paired-samples t-test showed that the NSG significantly enhanced outline quality across three core dimensions: Narrative Logic, Dramatic Conflict, and Emotional Arc, with the strongest impact observed in the construction of Dramatic Conflict. Consequently, this study not only presents a validated methodology for employing LLMs in rigorous academic research but also provides robust empirical evidence for the pedagogical potential of NSG technology.

Keywords

Narrative Structure; LLM-based Assessment; Writing Scaffolding; Automated Writing Evaluation

Creative Commons License

Creative Commons Attribution-NonCommercial 4.0 International License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License

Conference Track

Track 12 - Design Education

Dec 2nd, 9:00 AM – Dec 5th, 5:00 PM

Scaffolding the Story: An LLM-Based Assessment of a Next-Generation Narrative Structure Generator

