Abstract
This paper investigates how interaction modalities, specifically text-based versus speech-based interfaces, influence the experience and outcomes of academic co-writing with generative AI (GenAI). Drawing on distributed and situated cognition and human-AI creativity studies, I argue that modality is a structuring force in how ideas emerge, evolve, and stabilize in academic writing. Text-based interaction supports branching exploration, comparative revision, and temporal layering, enabling asynchronous control and persistent visual memory. In contrast, speech-based interaction facilitates associative thinking and spontaneous ideation, offering immediacy and flow. However, it can limit memory offloading, structural manipulation, and cross-comparison. I synthesize findings from related work in HCI, including multimodal interaction and mixed-initiative systems, to propose design implications for GenAI tools. These include supporting multi-threaded interfaces, prompt orchestration, hybrid modality integration, and friction-aware scaffolding. The paper reframes modality as a central design dimension in AI-supported knowledge work, contributing to ongoing conversations in HCI and creativity research.
Keywords
Interaction modality; Generative artificial intelligence; Writing; Creativity-support tools; Interaction design; Human-computer interaction
DOI
https://doi.org/10.21606/iasdr.2025.423
Citation
Dalsgaard, P. (2025) How Interaction Modalities Influence Academic Co-Writing with Generative Artificial Intelligence, in Chang, C.-Y., and Hsu, Y. (eds.), IASDR 2025: Design Next, 02-05 December, Taiwan. https://doi.org/10.21606/iasdr.2025.423
Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License
Conference Track
Track 4 - Human-Centered AI
How Interaction Modalities Influence Academic Co-Writing with Generative Artificial Intelligence