Social stereotypes in text-to-image generation: Examining user perceptions and debiasing strategies
Barve, Saharsh Sandeep
Permalink
https://hdl.handle.net/2142/129520
Description
Title
Social stereotypes in text-to-image generation: Examining user perceptions and debiasing strategies
Author(s)
Barve, Saharsh Sandeep
Issue Date
2025-04-10
Director of Research (if dissertation) or Advisor (if thesis)
Saha, Koustuv
Department of Study
Siebel School of Computing and Data Science
Discipline
Computer Science
Degree Granting Institution
University of Illinois Urbana-Champaign
Degree Name
M.S.
Degree Level
Thesis
Keyword(s)
Generative AI
Social Computing
Text-to-Image Generation
Social Stereotypes
User Perception
Large Language Models
Human-Computer Interaction
Language
English
Abstract
The rise of generative AI has fueled a shift from traditional search-based image retrieval to text-based image synthesis. While this advancement enables more creative and dynamic image generation, it also raises significant ethical concerns. The growing use of AI-generated images risks reinforcing or even amplifying societal stereotypes related to gender, race, and ethnicity, ultimately contributing to social divisions and distorting cultural representation in our digital world. As these technologies scale, robust frameworks to assess and mitigate social biases remain lacking. Our research addresses this gap by proposing an evaluation mechanism that leverages large language models (LLMs) as judges while keeping humans in the loop. We evaluate this method across three prompt categories (Geocultural, Occupational, and Adjective) and three text-to-image (T2I) models (DALL·E 3, Midjourney v6.1, and Stability AI Core). Additionally, we present a user study to understand how users perceive generated images and the social stereotypes embedded within them, ensuring our approach aligns with real-world expectations. Our findings reveal a key tension: while explicit prompt refinement can reduce stereotypical cues in images, it can also reduce contextual alignment with the original prompt. At the same time, our user study shows that people often perceive stereotypical cues as more contextually relevant and recognizable. Our work seeks to ensure that text-based image synthesis preserves global diversity, fosters social inclusivity, and accurately represents real-world society.