Files in this item

File: BHARGAVA-THESIS-2019.pdf (6MB), Restricted to U of Illinois
Description: (no description provided)
Format: PDF (application/pdf)

Description

Title: Exposing and correcting the gender bias in image captioning datasets and models
Author(s): Bhargava, Shruti
Advisor(s): Forsyth, David Alexander
Department / Program: Computer Science
Discipline: Computer Science
Degree Granting Institution: University of Illinois at Urbana-Champaign
Degree: M.S.
Genre: Thesis
Subject(s): Image captioning
gender bias
machine learning
deep learning
fairness
Abstract: The task of image captioning implicitly involves gender identification. However, gender identification by an image captioning model suffers due to gender bias in the data. Moreover, because captions are predicted word by word, the gender-activity bias in the data tends to influence the other words in the caption, resulting in the well-known problem of label bias. In this work, we investigate gender bias in the COCO captioning dataset and show that it arises not only from the statistical distribution of genders across contexts but also from flawed per-instance annotations provided by the human annotators. We then examine the issues this bias creates in models trained on the data. We propose a technique to remove the bias by splitting the task into two subtasks: gender-neutral image captioning and gender classification. This decoupling eliminates the gender-context influence. We train a gender-neutral image captioning model that does not exhibit the language-model bias arising from gender and produces good-quality captions. This model gives results comparable to a gendered model even when evaluated on a dataset that possesses bias similar to the training data. Interestingly, its predictions on images without humans are also visibly different from those of a model trained on gendered captions. To inject gender into the captions, we train gender classifiers on cropped portions of the images that contain only the person, which removes the context and focuses on the person when predicting gender. We train bounding-box-based and body-mask-based classifiers, achieving much higher gender-prediction accuracy than an image captioning model implicitly attempting to classify gender from the full image. By substituting the predicted genders into the gender-neutral captions, we obtain the final gendered predictions. Our predictions achieve performance similar to a model trained with gender while remaining free of gender bias. Finally, our main result is that on an anti-stereotypical dataset, our model outperforms a popular image captioning model trained with gender.
Issue Date: 2019-04-26
Type: Text
URI: http://hdl.handle.net/2142/105104
Rights Information: Copyright 2019 Shruti Bhargava
Date Available in IDEALS: 2019-08-23
Date Deposited: 2019-05
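
As a rough illustration of the substitution step described in the abstract, the sketch below shows how a predicted gender label might be injected into a gender-neutral caption. It is a minimal sketch under assumptions not taken from the thesis: the neutral captions are assumed to use a placeholder token such as "person", and the label names and the inject_gender function are hypothetical.

# Minimal sketch of injecting a predicted gender into a gender-neutral
# caption. The "person" placeholder token, the label names, and this
# function are illustrative assumptions, not the thesis's actual code.

GENDERED_WORDS = {"male": "man", "female": "woman"}

def inject_gender(neutral_caption: str, predicted_gender: str) -> str:
    """Replace the neutral 'person' token with a gendered word.

    predicted_gender is assumed to come from a classifier trained on
    cropped person regions (bounding box or body mask), as the abstract
    describes.
    """
    gendered = GENDERED_WORDS.get(predicted_gender)
    if gendered is None:
        # Fall back to the neutral caption if the label is unrecognized.
        return neutral_caption
    words = [gendered if w == "person" else w for w in neutral_caption.split()]
    return " ".join(words)

print(inject_gender("a person riding a horse", "female"))
# -> "a woman riding a horse"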

