Category information in real-world scenes: evaluation and reconstruction of human category spaces
Yang, Pei-Ling
Permalink
https://hdl.handle.net/2142/132654
Description
Title
Category information in real-world scenes: evaluation and reconstruction of human category spaces
Author(s)
Yang, Pei-Ling
Issue Date
2025-11-26
Director of Research (if dissertation) or Advisor (if thesis)
Beck, Diane M
Doctoral Committee Chair(s)
Beck, Diane M
Committee Member(s)
Koehn, Hans F
Hummel, John E
Simons, Daniel J
Federmeier, Kara D
Willits, Jon A
Department of Study
Psychology
Discipline
Psychology
Degree Granting Institution
University of Illinois Urbana-Champaign
Degree Name
Ph.D.
Degree Level
Dissertation
Keyword(s)
scene categorization, similarity tasks, convolutional neural network, multidimensional scaling
Abstract
Categorization is fundamental to scene understanding, yet there is relatively little research into the structure of human scene categories. This thesis examines whether various image-based feature spaces can approximate human category representations for real-world scenes. Similarity was used to assess category representation, and the category structure was visualized and compared by constructing geometric representations. Several feature spaces were tested, ranging from low- and mid-level visual features to layer activations of convolutional neural networks (CNNs) trained on real-world scenes and transformer models. Chapter 2 examined whether the layer activations of CNNs can capture human typicality effects. Chapter 3 compared various categorization tasks and laid the groundwork for building a reliable and valid human categorization space. Building on the results of Chapter 3, Chapter 4 measured the correspondence between feature spaces and human categorization results by comparing the correlations between distance matrices derived from ordinal multidimensional scaling (MDS) and the alignment of p-median clustering results. Among the features tested, the fc7 layer, the last layer of the CNN before the classification layer, showed the strongest typicality effect, a high correlation (r = 0.74) in the ordinal MDS distance matrix, and the highest correspondence to human p-median clusters (ARI = 0.56). This suggests that the category information aggregated in the later layers of CNNs agrees to some extent with the category information humans use in similarity tasks. Moreover, GPT-4o models, when prompted with the same task instructions as human participants, showed the highest correlations with human category spaces, indicating the importance of aligning the task given to models with the task given to humans. Clustering methods that better capture the flexible nature of similarity, however, are needed to support stronger claims.
Overall, this thesis demonstrates the usefulness of deep neural network (DNN) information in modeling human categorization of real-world scenes. Based on these findings, it may eventually be possible to use DNNs as a substitute for human participants when generating similarity measures for items.
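The two correspondence measures named in the abstract, a correlation between distance matrices and the adjusted Rand index (ARI) between cluster assignments, can be illustrated with a minimal sketch. This is not the thesis's own code: the toy matrices, the cluster labels, and all function names below are hypothetical, the ordinal MDS and p-median clustering steps are assumed to have been run already, and Pearson's r is assumed as the correlation measure.

```python
from collections import Counter
from itertools import combinations
from math import comb, sqrt


def upper_triangle(matrix):
    """Return the off-diagonal upper-triangle entries of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i, j in combinations(range(n), 2)]


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def adjusted_rand_index(labels_a, labels_b):
    """Chance-corrected agreement between two partitions of the same items."""
    n = len(labels_a)
    # Count item pairs placed together in both partitions, and in each alone.
    index = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:  # degenerate case, e.g. one cluster in both
        return 1.0
    return (index - expected) / (max_index - expected)


# Hypothetical pairwise distances among four scenes (symmetric, zero diagonal).
human_dist = [
    [0.0, 1.0, 4.0, 5.0],
    [1.0, 0.0, 3.0, 4.5],
    [4.0, 3.0, 0.0, 1.5],
    [5.0, 4.5, 1.5, 0.0],
]
model_dist = [
    [0.0, 1.2, 3.5, 4.0],
    [1.2, 0.0, 3.0, 5.0],
    [3.5, 3.0, 0.0, 2.0],
    [4.0, 5.0, 2.0, 0.0],
]

# Hypothetical cluster assignments for the same four scenes.
human_clusters = [0, 0, 1, 1]
model_clusters = [0, 0, 1, 2]

r = pearson_r(upper_triangle(human_dist), upper_triangle(model_dist))
ari = adjusted_rand_index(human_clusters, model_clusters)
print(f"distance-matrix correlation r = {r:.2f}, ARI = {ari:.2f}")
```

Only the upper triangle of each matrix enters the correlation, so each pair of scenes is counted once and the zero diagonal is excluded. ARI depends only on which items are grouped together, so the cluster numbering in the two partitions does not need to match.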