Evaluation of the split-data strategy in factor analysis
Zhou, Xinchang
Permalink
https://hdl.handle.net/2142/116043
Description
Title
Evaluation of the split-data strategy in factor analysis
Author(s)
Zhou, Xinchang
Issue Date
2022-07-08
Director of Research (if dissertation) or Advisor (if thesis)
Xia, Yan
Committee Member(s)
Jiang, Ge
Zhang, Jinming
Department of Study
Educational Psychology
Discipline
Educational Psychology
Degree Granting Institution
University of Illinois at Urbana-Champaign
Degree Name
M.S.
Degree Level
Thesis
Keyword(s)
confirmatory factor analysis
exploratory factor analysis
cross-validation
parallel analysis
model-data fit
Abstract
When evaluating the psychometric properties of an assessment, researchers can perform an exploratory factor analysis (EFA) followed by a confirmatory factor analysis (CFA) on the same dataset (the whole-sample strategy) to evaluate the model structure. However, a model structure obtained with the whole-sample strategy rests on a single dataset and is therefore subject to capitalization on chance. To strengthen the generalizability of models, researchers recommend cross-validating the structure on different datasets. Because collecting multiple datasets is not always feasible in practice, researchers commonly adopt the split-data strategy instead: they randomly split the dataset into two halves, perform EFA on the first half, and conduct CFA on the second half to validate the structure obtained from EFA. Despite the popularity of the split-data strategy, the literature offers insufficient evidence to support it. To examine its utility, this thesis presents two Monte Carlo simulation studies that explore whether the split-data strategy has advantages over the whole-sample strategy in correctly identifying two critical aspects of model structure in psychological assessments: the number of latent factors and the existence of cross-loadings. Results show that the split-data strategy is less effective than the whole-sample strategy in evaluating the number of factors and cross-loadings in all simulation conditions. Using the split-data strategy is only acceptable, though not necessary, under conditions with large samples (greater than 1,000 for the investigated models) and good model quality (i.e., large primary loadings, no cross-loadings, and small factor correlations).
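The comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's simulation design: the loading matrix, factor correlation, sample size, and function names below are hypothetical values chosen for illustration, and only the number-of-factors question is covered, using Horn's parallel analysis on correlation-matrix eigenvalues. The CFA validation step on the second half is not implemented here.

```python
import numpy as np


def simulate_factor_data(n, loadings, factor_corr, rng):
    """Draw n observations from a common factor model with unit-variance items."""
    loadings = np.asarray(loadings, dtype=float)
    factor_corr = np.asarray(factor_corr, dtype=float)
    # Communality of each item is the diagonal of L * Phi * L'.
    communality = np.einsum("ij,jk,ik->i", loadings, factor_corr, loadings)
    uniqueness = 1.0 - communality
    factors = rng.multivariate_normal(
        np.zeros(loadings.shape[1]), factor_corr, size=n
    )
    errors = rng.normal(size=(n, loadings.shape[0])) * np.sqrt(uniqueness)
    return factors @ loadings.T + errors


def split_halves(data, rng):
    """Randomly split the rows of data into two halves."""
    idx = rng.permutation(data.shape[0])
    half = data.shape[0] // 2
    return data[idx[:half]], data[idx[half:]]


def parallel_analysis(data, n_reps, rng, percentile=95):
    """Estimate the number of factors by comparing observed eigenvalues
    with eigenvalues from uncorrelated normal data of the same shape."""
    n, p = data.shape
    observed = np.sort(np.linalg.eigvalsh(np.corrcoef(data, rowvar=False)))[::-1]
    random_eigs = np.empty((n_reps, p))
    for r in range(n_reps):
        noise = rng.normal(size=(n, p))
        eigs = np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False))
        random_eigs[r] = np.sort(eigs)[::-1]
    threshold = np.percentile(random_eigs, percentile, axis=0)
    exceeds = observed > threshold
    # Count the leading eigenvalues that exceed their reference threshold.
    return p if exceeds.all() else int(np.argmin(exceeds))


# Hypothetical population: 8 items, 2 factors, primary loadings of 0.7,
# no cross-loadings, factor correlation of 0.3.
rng = np.random.default_rng(0)
loadings = np.zeros((8, 2))
loadings[:4, 0] = 0.7
loadings[4:, 1] = 0.7
factor_corr = np.array([[1.0, 0.3], [0.3, 1.0]])

whole = simulate_factor_data(1000, loadings, factor_corr, rng)
first_half, second_half = split_halves(whole, rng)

k_whole = parallel_analysis(whole, n_reps=50, rng=rng)
k_split = parallel_analysis(first_half, n_reps=50, rng=rng)
print("whole-sample estimate:", k_whole)
print("split-data estimate (first half):", k_split)
```

Repeating this over many replications and over weaker conditions (smaller samples, lower loadings, higher factor correlations) would give a rough picture of how often each strategy recovers the true number of factors. In the split-data strategy, `second_half` would then be used to fit a CFA of the structure suggested by the first half.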