Abstract

The use of generative artificial intelligence (AI) to create new content, especially images, is more vital than ever before. Recent breakthroughs in text-to-image diffusion models have shown the potential to drastically change the way we approach image content creation. However, artists still face challenges when attempting to create images that reflect their specific themes and formats, as current generative systems, such as Stable Diffusion models, require the right prompts to achieve the desired artistic outputs. In this paper, we propose future design considerations for developing more intuitive and effective interfaces for text-to-image prompt engineering from a human-AI interaction perspective, using a data-driven approach. We collected 78,911 posts from an online community and analyzed them through thematic analysis. Our proposed directions for interface design can help improve both user experience and usability, ultimately leading to a more effective image generation process that yields the outputs creators desire.

Keywords

Stable Diffusion, Human-AI Interaction, Thematic Analysis, Interface Design

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

Conference Track

Full Papers

Oct 9th, 9:00 AM

Designing interfaces for text-to-image prompt engineering using Stable Diffusion models: a human-AI interaction approach

