Using Differential Item Functioning to Test for Interrater Reliability in Constructed Response Items
Educational and Psychological Measurement
classical test theory, constructed response items, differential item functioning, interrater reliability, rater severity
The purpose of this study was to investigate a new way of evaluating interrater reliability that can determine whether two raters differ in their ratings on a polytomous rating scale or constructed response item. Specifically, differential item functioning (DIF) analyses were used to assess interrater reliability and were compared with traditional interrater reliability measures. Three procedures that can serve as measures of interrater reliability were compared: (1) the intraclass correlation coefficient (ICC), (2) Cohen’s kappa statistic, and (3) the DIF statistic obtained from Poly-SIBTEST. The results of this investigation indicated that DIF procedures appear to be a promising alternative for assessing the interrater reliability of constructed response items, or other polytomous item types such as rating scales. Furthermore, using DIF to assess interrater reliability does not require a fully crossed design and makes it possible to determine whether a rater is more severe, or more lenient, in scoring each individual polytomous item on a test or rating scale.
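To make the two traditional agreement measures named above concrete, the following is a minimal, self-contained sketch (an illustration, not the authors' code) that computes Cohen's kappa and a one-way intraclass correlation, ICC(1,1), for two raters scoring hypothetical 0–3 constructed-response items; all data and function names are invented for the example.

```python
import numpy as np

def cohens_kappa(r1, r2, n_categories):
    """Cohen's kappa for two raters on a categorical/polytomous scale."""
    r1, r2 = np.asarray(r1), np.asarray(r2)
    po = np.mean(r1 == r2)                          # observed agreement
    pe = sum(np.mean(r1 == c) * np.mean(r2 == c)    # chance agreement from marginals
             for c in range(n_categories))
    return (po - pe) / (1 - pe)

def icc_oneway(scores):
    """ICC(1,1): one-way random-effects intraclass correlation.
    `scores` is an (n_subjects, k_raters) array."""
    scores = np.asarray(scores, dtype=float)
    n, k = scores.shape
    grand = scores.mean()
    row_means = scores.mean(axis=1)
    # Between-subjects and within-subjects mean squares from one-way ANOVA.
    msb = k * np.sum((row_means - grand) ** 2) / (n - 1)
    msw = np.sum((scores - row_means[:, None]) ** 2) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)

# Hypothetical 0-3 scores from two raters on eight examinees.
rater1 = [0, 1, 2, 3, 2, 1, 0, 3]
rater2 = [0, 1, 2, 2, 2, 1, 1, 3]
print(round(cohens_kappa(rater1, rater2, 4), 3))
print(round(icc_oneway(np.column_stack([rater1, rater2])), 3))
```

Unlike the DIF approach described in the abstract, both statistics above require every rater to score the same responses (a fully crossed design), and neither indicates whether disagreement reflects one rater's systematic severity or leniency on a given item.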
Walker, C., & Göçer Şahin, S. (2020). Using differential item functioning to test for interrater reliability in constructed response items. Educational and Psychological Measurement, 80(4), 808-820. https://doi.org/10.1177/0013164419899731