A Comparison of Classical Test Theory and Nonparametric Item Response Theory with Respect to the Effects of Testees’ Characteristics on Item Characteristics and vice versa

Authors

1 Professor, Allameh Tabatabai University, Tehran

2 Master of Assessment and Measurement, Allameh Tabatabai University

3 Master of Educational Research, Tehran University

Abstract

This study compared Classical Test Theory (CTT) and Nonparametric Item Response Theory (NIRT) with respect to the effect of testees’ characteristics on item characteristics and vice versa. The research was applied and descriptive. Library research served to examine the issue from a theoretical point of view, and the descriptive method was employed to treat it from a practical perspective. In the practical study, the answer sheets of candidates for the university entrance exam in the physics-math and empirical science groups were examined for their performance in mathematics. First, 3,000 candidates in the physics-math group were selected by systematic sampling. Then several sample groups were drawn, using appropriate software and sample sizes suited to the research questions. From the empirical science group, 1,000 candidates were chosen for this research. The two research questions were treated theoretically, and for each question the advantages of NIRT over CTT were examined. The research questions were also examined from a practical point of view. To analyze the data under the classical test model, statistics such as the mean, item difficulty, item variance, biserial correlation (rb), and point-biserial correlation (rpb) were applied. KR-20, frequency distributions, and test score plots were used to analyze the total tests in this model. To analyze the data according to Item Response Theory, statistical methods such as the paired t-test, the Pearson correlation coefficient, and tests of significance were used. The results from the two methods of data analysis showed that the differences between the two groups of candidates affected the item indices in CTT but did not affect the item indices in NIRT.
The results also showed that the ability parameter was affected by item characteristics in CTT; however, in NIRT, item characteristics did not affect the ability parameter of the testees, and, like the item parameters, the ability parameter was invariant and non-skewed.
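
The classical item and test statistics named above (item difficulty, item variance, point-biserial correlation, and KR-20) can be illustrated with a minimal sketch. It is not the software used in the study. The function names and the small 0/1 response matrix are hypothetical and serve only to show the standard textbook formulas. The sketch assumes dichotomously scored items, population (divide-by-N) variances, and a point-biserial computed against the uncorrected total score.

```python
from math import sqrt

# Hypothetical toy data: rows are testees, columns are items (1 = correct).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

def item_difficulty(data, item):
    """Proportion of testees answering the item correctly (p)."""
    return sum(row[item] for row in data) / len(data)

def item_variance(data, item):
    """Variance of a dichotomous item, p * (1 - p)."""
    p = item_difficulty(data, item)
    return p * (1 - p)

def population_variance(values):
    """Variance with divisor N (an assumption; some texts use N - 1)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def point_biserial(data, item):
    """r_pb = (M1 - M0) / s_total * sqrt(p * q), using uncorrected totals.

    Undefined (division by zero) if every testee, or no testee,
    answers the item correctly, or if all total scores are equal.
    """
    totals = [sum(row) for row in data]
    right = [t for t, row in zip(totals, data) if row[item] == 1]
    wrong = [t for t, row in zip(totals, data) if row[item] == 0]
    p = len(right) / len(data)
    mean_right = sum(right) / len(right)
    mean_wrong = sum(wrong) / len(wrong)
    sd_total = sqrt(population_variance(totals))
    return (mean_right - mean_wrong) / sd_total * sqrt(p * (1 - p))

def kr20(data):
    """Kuder-Richardson formula 20 for dichotomous items."""
    k = len(data[0])
    totals = [sum(row) for row in data]
    sum_pq = sum(item_variance(data, j) for j in range(k))
    return k / (k - 1) * (1 - sum_pq / population_variance(totals))

if __name__ == "__main__":
    for j in range(len(responses[0])):
        print(j, item_difficulty(responses, j), point_biserial(responses, j))
    print("KR-20:", kr20(responses))
```

Because these indices are computed directly from the observed proportions and total scores of one sample, they change when the sample changes, which is the sample dependence of CTT item indices that the study examines.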

Keywords