Regression of the mean

A threat to internal validity

Authors
Affiliations

Doctor of Physical Therapy

B.S. in Kinesiology

Doctor of Physical Therapy

B.A. in Neuroscience

AKA
  • Regression toward mediocrity

Regression toward the mean (RTM) is the statistical tendency for extreme scores to move, or regress, toward the mean on repeated measurement [1]: if a score deviates far from the mean on one occasion, a subsequent measurement will tend to be closer to the average [1]. RTM can bias estimates of treatment effects. For example, when the control group starts with more extreme scores, it has greater potential for improvement from RTM alone, which can lead to an underestimation of the treatment effect.

Example

Patients in the control group have significantly higher pain scores at baseline than patients in the experimental group.

Figure: Example of selection and regression toward the mean

Example
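  • A study selects participants according to whether they score above or below a cut-off score [1].
  • Due to measurement error, some participants land on the wrong side of the cut-off and are enrolled with scores more extreme than their true scores.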

Explanation

According to measurement theory, the observed score (\(X_o\)) is the sum of the “true score” (\(X_T\)) and some error (\(e\)): \(X_o = X_T + e\). Measurement error in the instrument may cause some individuals who truly meet the criteria to score on the wrong side of the cut-off point and be erroneously excluded. Simultaneously, \(e\) can cause individuals who do not truly meet the criteria to record an extreme score far from their own mean and therefore be included as participants. Once the study begins and more data points are collected over time, the individuals who were improperly included will begin to “regress” toward their mean score.
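The selection effect described above can be illustrated with a small simulation. This is a minimal sketch, not taken from the cited sources: the function name `simulate_rtm`, the normal distributions, and all numeric values (population mean of 50, cut-off of 65, and so on) are hypothetical choices made only for illustration. No treatment is applied between the two measurements, so any drop in the selected group's mean comes from measurement error alone.

```python
import random
import statistics


def simulate_rtm(n=10_000, true_mean=50.0, true_sd=10.0, error_sd=5.0,
                 cutoff=65.0, seed=0):
    """Enroll people whose first observed score is at or above `cutoff`,
    then remeasure them with fresh, independent error and no treatment."""
    rng = random.Random(seed)
    first, second = [], []
    for _ in range(n):
        true_score = rng.gauss(true_mean, true_sd)          # X_T
        obs_1 = true_score + rng.gauss(0.0, error_sd)       # X_o at time 1
        if obs_1 >= cutoff:                                 # selected on an extreme score
            obs_2 = true_score + rng.gauss(0.0, error_sd)   # X_o at time 2, new error
            first.append(obs_1)
            second.append(obs_2)
    return statistics.mean(first), statistics.mean(second), len(first)


mean_1, mean_2, n_selected = simulate_rtm()
print(f"participants selected: {n_selected}")
print(f"selected group mean at time 1: {mean_1:.1f}")
print(f"selected group mean at time 2: {mean_2:.1f}")
```

Under these assumptions the selected group's mean at time 2 sits closer to the population mean than at time 1, even though nobody's true score changed.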

Note

True score (\(X_T\)) refers to the score that would be observed if there were no measurement error. It does not refer to the accuracy of the score.

Factors affecting RTM

  • Reliability: The less reliable the scale, the greater the RTM [1]
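
One way to see why, as a sketch under classical test theory assumptions (error independent of the true score; the variance symbols \(\sigma_T^2\) and \(\sigma_e^2\) are introduced here for illustration and do not appear elsewhere in these notes): reliability can be written as the share of observed-score variance that is due to true scores.

\[ r = \frac{\sigma_T^2}{\sigma_T^2 + \sigma_e^2} \]

A larger error variance \(\sigma_e^2\) lowers \(r\), so an extreme observed score is more likely to reflect error than true standing, and the retest has more room to move back toward the mean.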

Calculation

  • \(T_2\): Predicted score at time 2
  • \(\overline{X}\): Mean score
  • \(r\): Test-retest reliability
  • \(T_1\): Score at time 1
  • \(T_1 - \overline{X}\): How far the time 1 score deviates from the mean

\[ T_2 = \overline{X} + r(T_1 - \overline{X}) \]

How to interpret this result:

  • This does not indicate that a given score will change [1]
  • It shows the magnitude by which RTM can affect the results [1]
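
The formula above can be sketched in Python to see how the predicted retest score depends on reliability. The function name `predicted_retest_score` and the example values (a time 1 score of 80 against a mean of 50) are hypothetical, chosen only for illustration, and \(r\) is assumed here to lie between 0 and 1.

```python
def predicted_retest_score(t1, mean, r):
    """Predicted score at time 2: T2 = mean + r * (T1 - mean)."""
    if not 0.0 <= r <= 1.0:
        raise ValueError("this sketch assumes r lies between 0 and 1")
    return mean + r * (t1 - mean)


# Hypothetical example: a time 1 score 30 points above the mean.
baseline, group_mean = 80.0, 50.0
for r in (1.0, 0.9, 0.7, 0.5):
    t2 = predicted_retest_score(baseline, group_mean, r)
    # The expected shift from RTM alone is (1 - r) * (T1 - mean).
    print(f"r = {r:.1f}: predicted T2 = {t2:.1f}, expected shift = {baseline - t2:.1f}")
```

With perfect reliability (\(r = 1\)) the predicted score equals the time 1 score. As \(r\) falls, the predicted score moves closer to the mean, which matches the reliability factor listed earlier.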

Resources

  • To read more about RTM, see Streiner (2001), Regression toward the mean: its etiology, diagnosis, and treatment [2]
  • Deep dive: Campbell (1999), A Primer on Regression Artifacts [3]

References

1. Streiner DL. Statistics Commentary Series: Commentary #16-Regression Toward the Mean. Journal of Clinical Psychopharmacology. 2016;36(5):416-418. doi:10.1097/JCP.0000000000000551
2. Streiner DL. Regression toward the mean: Its etiology, diagnosis, and treatment. Canadian Journal of Psychiatry Revue Canadienne De Psychiatrie. 2001;46(1):72-76. doi:10.1177/070674370104600111
3. Campbell DT, Kenny DA, Reichardt CS. A Primer on Regression Artifacts. Paperback ed. Guilford Press; 2003.
