Inter-Rater Reliability
- Inter-rater reliability: Consistency of a measure when it is assessed by multiple raters
- Example: multiple PTs (physical therapists) performing the same test on the same patient
- Purpose: Determine the agreement between scores given independently by 2 practitioners¹
The important point here is the necessity for independence of the raters: 1 rater must be blind to the other's score. Otherwise, we would have a measure of the degree to which 1 person can accurately copy from another, which tells us much about the raters but little about the subject.
Tests for Inter-Rater Reliability
- Total Agreement: Percentage of examinations in which the task scores assigned by the two raters were identical (see the agreement sketch after this list)
- Cohen's Kappa: A chance-corrected measure of agreement on tasks (computed in the same sketch below)
- Krippendorff’s Alpha
- Has the advantage of high flexibility regarding the measurement scale and the number of raters, and, unlike Fleiss' K, can also handle missing values² (see the alpha sketch below)
- Pearson's Correlation Coefficient: The correlation between the two raters' scores for each subscale (see the correlation/t-test sketch below)
- t-Tests: Compare the two raters' mean scores for each subscale (same sketch below)
- Intraclass Correlation Coefficient (ICC) (see the ICC sketch below)
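A minimal sketch of the first two measures, assuming two raters independently score the same ten examinations on a hypothetical 0-2 task scale. Total agreement is simply the proportion of identical scores; Cohen's kappa corrects that proportion for the agreement expected by chance, here via scikit-learn's `cohen_kappa_score`.

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score

# Hypothetical task scores (0-2) assigned independently by two raters
rater_a = np.array([2, 1, 0, 2, 2, 1, 0, 1, 2, 0])
rater_b = np.array([2, 1, 0, 1, 2, 1, 0, 2, 2, 0])

# Total agreement: proportion of examinations with identical scores
total_agreement = np.mean(rater_a == rater_b)

# Cohen's kappa: observed agreement corrected for chance agreement,
# kappa = (p_o - p_e) / (1 - p_e)
kappa = cohen_kappa_score(rater_a, rater_b)

print(f"Total agreement: {total_agreement:.2f}")  # 0.80
print(f"Cohen's kappa:   {kappa:.2f}")            # ~0.70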
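For Krippendorff's alpha, a sketch using the third-party `krippendorff` package (an assumption here, not part of the source). Note how missing ratings are entered as NaN, which alpha tolerates, and how a third rater can simply be added as another row.

```python
import numpy as np
import krippendorff  # third-party package: pip install krippendorff

# One row per rater, one column per subject; np.nan marks missing
# ratings, which alpha can handle (unlike Fleiss' K)
ratings = np.array([
    [2, 1, 0, 2, np.nan, 1, 0, 1, 2,      0],  # rater A
    [2, 1, 0, 1, 2,      1, 0, 2, 2,      0],  # rater B
    [2, 1, 1, 2, 2,      1, 0, 1, np.nan, 0],  # rater C
])

alpha = krippendorff.alpha(reliability_data=ratings,
                           level_of_measurement="ordinal")
print(f"Krippendorff's alpha: {alpha:.2f}")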
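A sketch of the correlation and t-test approach, assuming both raters score the same eight subjects on one hypothetical subscale. Pearson's r captures the association between the raters' scores, while a paired t-test checks for a systematic difference between their means (both from SciPy).

```python
import numpy as np
from scipy import stats

# Hypothetical subscale totals for eight subjects, scored by two raters
rater_a = np.array([14, 18, 11, 20, 16, 13, 19, 15], dtype=float)
rater_b = np.array([15, 17, 12, 20, 15, 14, 18, 16], dtype=float)

# Pearson's r: strength of the linear association between the raters
r, r_p = stats.pearsonr(rater_a, rater_b)

# Paired t-test: systematic difference between the raters' mean scores
t, t_p = stats.ttest_rel(rater_a, rater_b)

print(f"Pearson r = {r:.2f} (p = {r_p:.4f})")
print(f"Paired t  = {t:.2f} (p = {t_p:.4f})")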
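Finally, a sketch of the ICC using the third-party `pingouin` package (an assumption, not part of the source), which expects long-format data and returns a table of ICC variants; the two-way random-effects form (ICC2) is a common choice for inter-rater reliability.

```python
import pandas as pd
import pingouin as pg  # third-party package: pip install pingouin

# Long format: one row per (subject, rater) observation
df = pd.DataFrame({
    "subject": list(range(1, 9)) * 2,
    "rater":   ["A"] * 8 + ["B"] * 8,
    "score":   [14, 18, 11, 20, 16, 13, 19, 15,   # rater A
                15, 17, 12, 20, 15, 14, 18, 16],  # rater B
})

# Returns several ICC forms (ICC1, ICC2, ICC3 and their average-score
# counterparts) with F-tests and 95% confidence intervals
icc = pg.intraclass_corr(data=df, targets="subject",
                         raters="rater", ratings="score")
print(icc[["Type", "ICC", "CI95%"]])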
References
1. Streiner DL. Statistics Commentary Series: Commentary #15 - Reliability. Journal of Clinical Psychopharmacology. 2016;36(4):305-307. doi:10.1097/JCP.0000000000000517
2. Zapf A, Castell S, Morawietz L, Karch A. Measuring inter-rater reliability for nominal data - which coefficients and confidence intervals are appropriate? BMC Medical Research Methodology. 2016;16:93. doi:10.1186/s12874-016-0200-9
Citation
For attribution, please cite this work as:
Yomogida N, Kerstein C. Inter-Rater Reliability. https://yomokerst.com/The Archive/Evidene Based Practice/Reliability/inter-rater_reliability.html