Intra-class Correlation Coefficient (ICC)
The intra-class correlation coefficient (ICC) is a modified Pearson correlation coefficient that indexes both the degree of correlation and the agreement between two measurements [1].
The ICC is the most comprehensive measure of reliability, since it reflects both the level of agreement (like kappa) and the correlation between two measures (like a correlation coefficient).
It is sensitive, for example, to the extent to which subjects (individuals) keep their rank order across repeated measurements. Moreover, it may indicate the ability of an experimental method to detect and measure systematic differences between subjects. This ability is limited, since those differences may be more or less masked by individual variations.
How to use this test
Dr. Monroe stated that the ICC is the most comprehensive form of reliability. If the ICC is high, then you do not need to examine the other statistical results. If the ICC is low, then examine the other statistics (e.g., kappa, correlation) to determine where the disagreement arises.
Models
One-Way Random-Effects Model
- Either participants or evaluators are random
- This is not very common
In this model, each subject is rated by a different set of raters who were randomly chosen from a larger population of possible raters. In practice, this model is rarely used in clinical reliability analysis, because the majority of reliability studies involve the same set of raters measuring all subjects. An exception would be multicenter studies in which the physical distance between centers prevents the same set of raters from rating all subjects. Under such circumstances, one set of raters may assess a subgroup of subjects in one center and another set of raters may assess a subgroup in another center; hence, a one-way random-effects model should be used in this case.
Two-Way Random-Effects Model
- Both the patients and the researchers (raters) are treated as random
Two-Way Mixed-Effects Model
- Most common
- Raters are fixed (not random)
Type
- Consistency
- Absolute agreement
Consistency
- Evaluates whether there is a linear relationship between the raters' scores; systematic differences between raters are ignored
Absolute Agreement
- Evaluates how close the raters were in terms of their scores
- Not interested in a linear relationship
- Focuses on whether raters gave close or identical ratings (see the sketch below)
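To make the consistency vs. absolute-agreement distinction concrete, here is a minimal sketch using the pingouin package (an assumed choice of library; the data are hypothetical). Rater B scores roughly 2 points above rater A on every subject, so the rank order is preserved almost perfectly but the raw scores never match:

```python
# pip install pingouin pandas
import pandas as pd
import pingouin as pg

# Hypothetical data: rater B sits about 2 points above rater A.
scores_a = [4, 5, 6, 7, 8, 9]
scores_b = [6, 7, 8, 10, 10, 11]

df = pd.DataFrame({
    "subject": list(range(6)) * 2,
    "rater":   ["A"] * 6 + ["B"] * 6,
    "score":   scores_a + scores_b,
})

icc = pg.intraclass_corr(data=df, targets="subject",
                         raters="rater", ratings="score")
print(icc[["Type", "Description", "ICC"]])
```

With these data, the consistency form (ICC3) comes out near 1.0, while the absolute-agreement form (ICC2) is noticeably lower, because the constant offset between raters counts against agreement but not against consistency.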
ICC Forms
There are 10 forms of ICC, based on the "Model" (one-way random effects, two-way random effects, or two-way mixed effects), the "Type" (single rater/measurement or the mean of k raters/measurements), and the "Definition" of the relationship considered to be important (consistency or absolute agreement) [1].
Choosing the Right ICC Form
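One practical approach is to compute all the forms and keep the one matching your design. Continuing the hypothetical pingouin sketch above, the output table contains six rows, one per form; you select the row whose model, type, and unit (single vs. average measures) match your study:

```python
# Continuing the sketch above: pg.intraclass_corr() returns six rows.
#   ICC1 / ICC1k : one-way random effects (single / average of k raters)
#   ICC2 / ICC2k : two-way random effects, absolute agreement
#   ICC3 / ICC3k : two-way mixed effects, consistency
# For a two-way mixed, consistency, single-rater study, keep ICC3:
icc3 = icc.loc[icc["Type"] == "ICC3", "ICC"].item()
print(f"ICC(3,1) = {icc3:.3f}")
```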
Calculation
In modern practice, the ICC is calculated from mean squares (i.e., estimates of the population variances based on the variability among a given set of measures) obtained through analysis of variance.
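As a sketch of that calculation, here is a one-way ANOVA version in Python. The ratings matrix is just example data, and the formula is the Shrout and Fleiss ICC(1,1): the single-rater form under the one-way random-effects model.

```python
import numpy as np

# Example ratings: n subjects (rows) x k raters (columns).
ratings = np.array([
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
])
n, k = ratings.shape

grand_mean = ratings.mean()
subject_means = ratings.mean(axis=1)

# One-way ANOVA mean squares.
ms_between = k * np.sum((subject_means - grand_mean) ** 2) / (n - 1)
ms_within = np.sum((ratings - subject_means[:, None]) ** 2) / (n * (k - 1))

# Shrout & Fleiss ICC(1,1): (MSB - MSW) / (MSB + (k - 1) * MSW)
icc_1_1 = (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
print(f"ICC(1,1) = {icc_1_1:.3f}")
```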
Alternative equation, from [3]:

$$\text{ICC} = \frac{\sigma^2_{\text{between subjects}}}{\sigma^2_{\text{between subjects}} + \sigma^2_{\text{error}}} = \frac{\text{variance of interest}}{\text{variance of interest} + \text{unwanted variance}}$$

The word "subject" in this equation likely refers to "rater," since this test assesses inter-rater reliability.
What happens when the unwanted variance (error variance) is larger than the variance of interest?
We will use an example of this, with the variance of interest being 5 and the unwanted variance being 6:

$$\text{ICC} = \frac{5}{5 + 6} = \frac{5}{11} \approx 0.455$$

As a result, the reliability (ICC) of the method will be poor, with a value of about 0.455 [2].
Output
An ICC analysis will generally report two values: a single-measures ICC and an average-measures ICC.
Single
- The reliability of a single rater's measurements
Average
- The reliability of the mean of all k raters' measurements; always higher than the single-measures ICC (see the sketch below)
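The single-measures and average-measures values are linked by the Spearman-Brown step-up formula, so either can be derived from the other. A minimal sketch (the input values are made up):

```python
def average_measures_icc(single_icc: float, k: int) -> float:
    """Spearman-Brown step-up: ICC of the mean of k raters,
    derived from the single-rater ICC."""
    return k * single_icc / (1 + (k - 1) * single_icc)

# Hypothetical single-rater ICC of 0.455 (the example above), 4 raters:
print(f"{average_measures_icc(0.455, 4):.3f}")  # ~0.770
```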
Scoring
Values range from 0.00 (not reliable) to 1.00 (perfectly reliable)
| Score | Reliability |
|---|---|
| 0.00 – 0.20 | Poor |
| 0.21 – 0.40 | Fair |
| 0.41 – 0.60 | Moderate |
| 0.61 – 0.80 | Good |
| 0.81 – 1.00 | Excellent |
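As a convenience, a small helper can map an ICC value onto the qualitative labels above (the thresholds are taken directly from the table):

```python
def icc_label(icc: float) -> str:
    """Qualitative reliability label for an ICC value,
    using the cut-offs from the table above."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("interpretation table covers 0.00 to 1.00")
    if icc <= 0.20:
        return "Poor"
    if icc <= 0.40:
        return "Fair"
    if icc <= 0.60:
        return "Moderate"
    if icc <= 0.80:
        return "Good"
    return "Excellent"

print(icc_label(0.455))  # Moderate
```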