Abstract

Large Language Models are increasingly integrated into UX design. However, their effectiveness in meeting visual accessibility requirements is under-explored. This research evaluates the ability of ChatGPT and Microsoft Copilot to generate visually accessible interfaces, using a Research through Design methodology. First, an accessibility scoring system was created from the Apple, WCAG 2.2, and Microsoft accessibility guidelines. Second, design experiments were conducted using ChatGPT and Copilot, and the outputs were evaluated using the new scoring system. Findings indicate that ChatGPT and Copilot can respond effectively to well-structured prompts, but they demonstrate low competence in executing visually accessible interfaces. This research makes two valuable contributions to the field. It assesses the state-of-the-art capabilities of AI-generated design for visual accessibility, proposing a balanced positioning of AI as an assistive tool rather than an autonomous designer; and it provides a new ‘cross-standard’ scoring system and method for evaluating the visual accessibility of AI-generated outputs.

Keywords

visual accessibility, AI-generated, accessibility compliance, cross-standard scoring system

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

Jun 8th, 9:00 AM to Jun 12th, 5:00 PM

Evaluating visual accessibility of AI-generated interfaces


 
