Assessing the Interpretive Component of Criterion-Referenced Test Item Validity
Secolsky, Charles
Permalink: https://hdl.handle.net/2142/65968
Description
Issue Date: 1980
Department of Study: Education
Discipline: Education
Degree Granting Institution: University of Illinois at Urbana-Champaign
Degree Name: Ph.D.
Degree Level: Dissertation
Keyword(s): Education, Tests and Measurements
Language: English
Abstract
The usefulness of an innovative testing technique for empirically detecting ambiguously worded or structurally deficient test items was explored. In addition to answering each item on an electricity examination, eighty-nine undergraduate students in the architecture curriculum at the University of Illinois at Urbana-Champaign were asked to classify each item as having been generated from one or more of the topics representing the sub-unit headings of their text. These judgments were compared with their professor's "standard" judgments. From these data, four sets of measures were computed. First, an index of item-domain divergence in perceived item meaning between the examinees, on average, and the professor as content specialist was computed. Second, the amount of variation in the response data left unexplained by the classification data was computed for each "standard" domain label for each item. Third, the mean, variance, and n of total test scores were reported for each of four examinee groups for each item: responded incorrectly with the same classification; responded correctly with the same classification; responded incorrectly with a different classification; and responded correctly with a different classification. Finally, for each "standard" domain for each item, a z score was computed for the difference between the mean total test score of each of the above cells and the mean total test score of all examinees.
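To make the third and fourth computations concrete, here is a minimal Python sketch. It is not the dissertation's own procedure: the function name cell_statistics, the array-based data layout, and the exact z formula (each cell mean standardized against the overall mean, using the overall standard deviation scaled by the cell's n) are assumptions made for illustration.

```python
import numpy as np

# Minimal sketch (assumptions, not the dissertation's code) of the per-cell
# summary statistics and z scores described above, for one item under one
# "standard" domain label.
def cell_statistics(total_scores, correct, same_classification):
    """total_scores: examinees' total test scores (float array);
    correct: True where the examinee answered this item correctly;
    same_classification: True where the examinee's topic classification
    matched the professor's "standard" classification."""
    overall_mean = total_scores.mean()
    overall_sd = total_scores.std(ddof=1)
    cells = {
        "incorrect_same": ~correct & same_classification,
        "correct_same": correct & same_classification,
        "incorrect_different": ~correct & ~same_classification,
        "correct_different": correct & ~same_classification,
    }
    results = {}
    for label, mask in cells.items():
        scores = total_scores[mask]
        n = scores.size
        if n == 0:
            results[label] = None  # empty cell: nothing to report
            continue
        cell_mean = scores.mean()
        cell_var = scores.var(ddof=1) if n > 1 else float("nan")
        # Assumed z form: the cell mean standardized against the mean total
        # test score of all examinees, with the overall SD scaled by cell n.
        z = (cell_mean - overall_mean) / (overall_sd / np.sqrt(n))
        results[label] = {"mean": cell_mean, "var": cell_var, "n": n, "z": z}
    return results
```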
Examinee perceptions of the topic(s) that generated each item, considered in relation to the professor's judgment and coupled with responses to the items, provided estimates of the extent to which responses reflected partial knowledge or careless errors (responded incorrectly, same classification), valid measurement (responded correctly, same classification), misinterpretation (responded incorrectly, different classification), or testwiseness (responded correctly, different classification). These new measures provide a means of detecting ambiguous items that are not otherwise detectable with biserial correlations based on response data alone. Items deemed ambiguous by these exploratory procedures were compared with items identified as ambiguous by examinees in taped interviews during the week following the examination. Limitations of the study, such as the use of pre-existing sub-unit headings as domain labels, which affected the divergence index, and avenues for future research are discussed.
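Continuing the sketch above, a hypothetical usage maps each cell to its interpretive label from this paragraph. The scores and classifications below are randomly generated stand-ins, not the study's data.

```python
# Hypothetical usage of the sketch above on synthetic data for one item
# (89 examinees, as in the study; values are random, not the study's data).
rng = np.random.default_rng(seed=0)
total_scores = rng.integers(10, 41, size=89).astype(float)
correct = rng.random(89) < 0.7               # answered the item correctly?
same_classification = rng.random(89) < 0.6   # matched the professor's label?

# Interpretive label for each cell, per the framework described above.
interpretation = {
    "incorrect_same": "partial knowledge or careless error",
    "correct_same": "valid response",
    "incorrect_different": "misinterpretation",
    "correct_different": "testwiseness",
}

for cell, stats in cell_statistics(total_scores, correct, same_classification).items():
    print(f"{cell} ({interpretation[cell]}): {stats}")
```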