Evaluation of the split-data strategy in factor analysis

Zhou, Xinchang

Permalink

https://hdl.handle.net/2142/116043

Description

Title

Evaluation of the split-data strategy in factor analysis

Author(s)

Zhou, Xinchang

Issue Date

2022-07-08

Director of Research (if dissertation) or Advisor (if thesis)

Xia, Yan

Committee Member(s)

Jiang, Ge
Zhang, Jinming

Department of Study

Educational Psychology

Discipline

Educational Psychology

Degree Granting Institution

University of Illinois at Urbana-Champaign

Degree Name

M.S.

Degree Level

Thesis

Keyword(s)

cross-validation
parallel analysis
model-data fit
confirmatory factor analysis
exploratory factor analysis

Language

eng

Abstract

When evaluating the psychometric properties of an assessment, researchers can perform an exploratory factor analysis (EFA) followed by a confirmatory factor analysis (CFA) on the same dataset (the whole-sample strategy) to evaluate the model structure. However, the model structure obtained by the whole-sample strategy is based on only one dataset and is therefore subject to capitalization on chance. To strengthen the generalizability of models, researchers recommend cross-validating the model structure on different datasets. Nevertheless, because collecting multiple datasets is not always feasible in practice, researchers commonly adopt the split-data strategy: randomly splitting the dataset into two halves, performing EFA on the first half, and conducting CFA on the second half to validate the structure obtained from EFA. Despite the popularity of the split-data strategy, the literature offers insufficient evidence to support it. To examine the utility of the split-data strategy, this thesis presents two Monte Carlo simulation studies that explore whether the split-data strategy has advantages over the whole-sample strategy in correctly identifying two critical aspects of model structure in psychological assessments: the number of latent factors and the existence of cross-loadings. Results show that the split-data strategy is less effective than the whole-sample strategy in evaluating the number of factors and cross-loadings in all simulation conditions. Using the split-data strategy is acceptable, though not necessary, only under conditions with large samples (greater than 1,000 for the investigated models) and good model quality (i.e., large primary loadings, no cross-loadings, and small factor correlations).
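The random split and the factor-number step described in the abstract can be illustrated with a minimal sketch. It is not the thesis's simulation code. The function names (`split_half`, `parallel_analysis`), the two-factor population model, the loading of 0.7, the factor correlation of 0.3, and the sample size of 600 are all assumptions chosen for illustration. The sketch uses a simple eigenvalue-based form of parallel analysis (comparing observed correlation-matrix eigenvalues with the mean eigenvalues of random normal data) and does not cover the CFA step.

```python
import numpy as np


def split_half(data, seed=0):
    """Randomly split the rows of `data` into two halves (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    n = data.shape[0]
    idx = rng.permutation(n)
    half = n // 2
    return data[idx[:half]], data[idx[half:]]


def parallel_analysis(data, n_sims=100, seed=0):
    """Suggest a number of factors by comparing observed eigenvalues of the
    correlation matrix with mean eigenvalues from random normal data."""
    rng = np.random.default_rng(seed)
    n, p = data.shape
    # eigvalsh returns ascending eigenvalues; reverse to descending order
    observed = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]
    random_eigs = np.empty((n_sims, p))
    for s in range(n_sims):
        noise = rng.standard_normal((n, p))
        random_eigs[s] = np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False))[::-1]
    reference = random_eigs.mean(axis=0)
    exceeds = observed > reference
    # Count the leading eigenvalues that exceed the random reference
    return p if exceeds.all() else int(np.argmin(exceeds))


# Assumed population model: 6 items, 2 correlated factors, no cross-loadings
rng = np.random.default_rng(42)
n = 600
loadings = np.zeros((6, 2))
loadings[:3, 0] = 0.7
loadings[3:, 1] = 0.7
phi = np.array([[1.0, 0.3], [0.3, 1.0]])  # assumed factor correlation
factors = rng.multivariate_normal(np.zeros(2), phi, size=n)
unique_sd = np.sqrt(1 - 0.7 ** 2)  # gives each item unit variance
data = factors @ loadings.T + rng.standard_normal((n, 6)) * unique_sd

# Split-data strategy: explore on one half, hold out the other for CFA
explore, confirm = split_half(data, seed=1)
k_split = parallel_analysis(explore)

# Whole-sample strategy: explore on the full dataset
k_whole = parallel_analysis(data)

print("factors suggested (split half):", k_split)
print("factors suggested (whole sample):", k_whole)
```

In this sketch the exploration half contains only 300 of the 600 simulated cases, which illustrates the trade-off the abstract refers to: the split-data strategy reserves data for validation at the cost of a smaller sample in each step.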

Graduation Semester

2022-08

Type of Resource

Thesis

Handle URL

https://hdl.handle.net/2142/116043

Owning Collections

Graduate Dissertations and Theses at Illinois PRIMARY

Graduate Theses and Dissertations at Illinois

Dissertations and Theses - Psychology

Dissertations and Theses from the Dept. of Psychology
