
A peer-reviewed electronic journal. ISSN 1531-7714

Copyright is retained by the first or sole author, who grants right of first publication to |

Osborne, Jason W. (2003). Effect sizes and the disattenuation of correlation and regression coefficients: lessons from educational psychology. Practical Assessment, Research & Evaluation, 8(11). Retrieved July 31, 2014 from http://PAREonline.net/getvn.asp?v=8&n=11 . This paper has been viewed 35,788 times since 5/27/2003.
The nature of social science research means that many variables we are interested in are also difficult to measure, making measurement error a particular concern. In simple correlation and regression, unreliable measurement causes relationships to be under-estimated, increasing the risk of Type II errors. In multiple regression or partial correlation, the effects of other variables can be over-estimated when a covariate is not reliably measured. In both cases this is a significant concern if the goal of research is to accurately model the “real” relationships evident in the population. Although most authors assume that reliability estimates (Cronbach alphas) of .70 and above are acceptable (e.g., Nunnally, 1978), and Osborne, Christensen, and Gunter (2001) reported that the average alpha reported in top Educational Psychology journals was .83, measurement of this quality still contains enough measurement error to make correction worthwhile, as illustrated below. Correction for low reliability is simple and is described in most texts on regression, but it is rarely seen in the literature. I argue that authors should correct for low reliability to obtain a more accurate picture of the “true” relationship in the population and, in the case of multiple regression or partial correlation, to avoid over-estimating the effect of another variable.
Since “the presence of measurement errors in behavioral research is the rule rather than the exception” and the “reliabilities of many measures used in the behavioral sciences are, at best, moderate” (Pedhazur, 1997, p. 172) it is important that researchers be aware of accepted methods of dealing with this issue. For simple correlation, Equation #1 provides an estimate of the “true” relationship between the IV and DV in the population:
$$ r^{*}_{12} = \frac{r_{12}}{\sqrt{r_{11}\, r_{22}}} \qquad (1) $$
In this equation, r_{12} is the observed correlation, and r_{11} and r_{22} are the reliability estimates of the variables. There are examples of the effects of disattenuation in Table 1. For example, even when reliability is .80, correction for attenuation substantially changes the effect size (increasing variance accounted for by about 50%). When reliability drops to .70 or below this correction yields a substantially different picture of the “true” nature of the relationship, and potentially avoids Type II errors.
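The correction for a simple correlation can be sketched in a few lines of Python. This is a minimal illustration, assuming the classical form in which the observed correlation is divided by the square root of the product of the two reliabilities; the function name `disattenuate` and the observed value of .30 are hypothetical, chosen only for the example.

```python
import math


def disattenuate(r12: float, r11: float, r22: float) -> float:
    """Estimate the 'true' correlation from an observed correlation r12
    and the reliabilities r11 and r22 of the two measures."""
    if not (0.0 < r11 <= 1.0 and 0.0 < r22 <= 1.0):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r12 / math.sqrt(r11 * r22)


if __name__ == "__main__":
    observed_r = 0.30  # hypothetical observed correlation, not from the paper
    for reliability in (0.90, 0.80, 0.70):
        corrected_r = disattenuate(observed_r, reliability, reliability)
        # Proportional change in variance accounted for (r squared).
        r2_gain = corrected_r ** 2 / observed_r ** 2 - 1.0
        print(f"reliability={reliability:.2f}  corrected r={corrected_r:.3f}  "
              f"r^2 increase={r2_gain:.0%}")
```

When both reliabilities are .80, the corrected r-squared is the observed r-squared divided by .64, an increase of roughly one half, which is in line with the figure quoted in the text.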
With each independent variable added to the regression equation, the effects of less than perfect reliability on the strength of the relationship become more complex and the results of the analysis more questionable. With the addition of one independent variable with less than perfect reliability, each succeeding variable entered has the opportunity to claim part of the error variance left over by the unreliable variable(s). The apportionment of the explained variance among the independent variables will thus be incorrect. The more independent variables with low reliability are added to the equation, the greater the likelihood that the variance accounted for is not apportioned correctly. This can lead to erroneous findings and increased potential for Type II errors for the variables with poor reliability, and Type I errors for the other variables in the equation. Obviously, this gets increasingly complex as the number of variables in the equation grows.
A simple example, drawing heavily from Pedhazur (1997), is a case where one is attempting to assess the relationship between two variables controlling for a third variable (r_{12.3}). When all three variables are corrected for low reliability, Equation #2 applies, where r_{11}, r_{22}, and r_{33} are reliabilities, and r_{12}, r_{23}, and r_{13} are relationships between variables:
$$ r^{*}_{12.3} = \frac{r_{33}\, r_{12} - r_{13}\, r_{23}}{\sqrt{r_{11}\, r_{33} - r_{13}^{2}}\;\sqrt{r_{22}\, r_{33} - r_{23}^{2}}} \qquad (2) $$
If one is only correcting for low reliability in the covariate one could use Equation #3:
$$ r^{*}_{12.3} = \frac{r_{33}\, r_{12} - r_{13}\, r_{23}}{\sqrt{r_{33} - r_{13}^{2}}\;\sqrt{r_{33} - r_{23}^{2}}} \qquad (3) $$
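The partial-correlation corrections can be sketched as a single function. This is a hedged illustration rather than a definitive implementation: it assumes the corrected partial correlation takes the form (r33·r12 − r13·r23) / (√(r11·r33 − r13²) · √(r22·r33 − r23²)), which is what results from disattenuating each pairwise correlation before computing the partial. Setting r11 and r22 to 1 gives the covariate-only case. The function names and the input values are hypothetical.

```python
import math


def partial_r(r12: float, r13: float, r23: float) -> float:
    """Ordinary (uncorrected) partial correlation of 1 and 2 controlling for 3."""
    return (r12 - r13 * r23) / math.sqrt((1 - r13 ** 2) * (1 - r23 ** 2))


def corrected_partial_r(r12: float, r13: float, r23: float,
                        r11: float = 1.0, r22: float = 1.0,
                        r33: float = 1.0) -> float:
    """Partial correlation corrected for unreliability.

    r11, r22, r33 are reliabilities. Leaving r11 and r22 at 1.0 corrects
    for the covariate only; leaving all three at 1.0 reproduces partial_r.
    """
    a = r11 * r33 - r13 ** 2
    b = r22 * r33 - r23 ** 2
    if a <= 0 or b <= 0:
        # The inputs imply an impossible corrected value (over-correction).
        raise ValueError("reliabilities too low for these correlations")
    return (r33 * r12 - r13 * r23) / (math.sqrt(a) * math.sqrt(b))


if __name__ == "__main__":
    # Hypothetical values: a covariate strongly related to both variables.
    r12, r13, r23 = 0.30, 0.50, 0.50
    print("uncorrected:     ", round(partial_r(r12, r13, r23), 4))
    print("covariate only:  ", round(corrected_partial_r(r12, r13, r23, r33=0.80), 4))
    print("all three:       ", round(corrected_partial_r(r12, r13, r23, 0.80, 0.80, 0.80), 4))
```

With these hypothetical inputs the uncorrected partial correlation is small and positive, while both corrected versions are small and negative, illustrating how correction can change the direction as well as the magnitude of a relationship.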
Table 2 presents some of the many possible combinations of reliabilities and correlations, and the effects of correcting for low reliability in the covariate only or in all three variables. Some points of interest: (a) as in Table 1, even small correlations see substantial effect size (r^{2}) changes when corrected for low reliability, in this case often toward reduced effect sizes; (b) in some cases the corrected correlation differs substantially not only in magnitude but also in the direction of the relationship; and (c) as expected, the most dramatic changes occur when the covariate has a substantial relationship with the other variables.
Research by Bohrnstedt (1983) has argued that regression coefficients are primarily affected by reliability in the independent variable (except for the intercept, which is affected by reliability of both variables), while true correlations are affected by reliability in both variables. Thus, researchers wanting to correct multiple regression coefficients for reliability can use Formula 4, which is presented in Bohrnstedt (1983), and which takes this issue into account:
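A correction of this kind can be sketched as follows. The sketch assumes the standard two-predictor errors-in-variables result, in which the coefficient for x controlling for z is (σy/σx) · (r_zz·r_xy − r_yz·r_xz) / (r_xx·r_zz − r_xz²); the notation is an assumption and may differ from Formula 4 as printed in Bohrnstedt (1983). Note that the reliability of the dependent variable does not appear, consistent with the point that slopes are affected primarily by unreliability in the independent variables. The function name and example values are hypothetical.

```python
def corrected_slope(r_xy: float, r_yz: float, r_xz: float,
                    r_xx: float, r_zz: float,
                    sd_y: float = 1.0, sd_x: float = 1.0) -> float:
    """Regression coefficient for x (controlling for z), corrected for
    unreliability in the two predictors.

    r_xx and r_zz are the predictor reliabilities; r_xy, r_yz, r_xz are
    observed correlations. With sd_y = sd_x = 1 the result is on the
    standardized scale of the observed scores.
    """
    denominator = r_xx * r_zz - r_xz ** 2
    if denominator <= 0:
        # Correction would yield an undefined or impossible value.
        raise ValueError("reliabilities too low for the observed r_xz")
    return (sd_y / sd_x) * (r_zz * r_xy - r_yz * r_xz) / denominator


if __name__ == "__main__":
    # Hypothetical observed correlations.
    r_xy, r_yz, r_xz = 0.40, 0.30, 0.20
    print("uncorrected:", round(corrected_slope(r_xy, r_yz, r_xz, 1.0, 1.0), 4))
    print("corrected:  ", round(corrected_slope(r_xy, r_yz, r_xz, 0.80, 0.80), 4))
```

With both reliabilities set to 1 the function reduces to the ordinary standardized coefficient, (r_xy − r_yz·r_xz) / (1 − r_xz²), which gives a quick sanity check on the implementation.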
Some examples of disattenuating multiple regression coefficients are presented in Table 3. In these examples (which admittedly are a very narrow subset of the total possibilities), corrections resulting in impossible values were rare, even with strong relationships between the variables, and even when reliability
To this point the discussion has been confined to the relatively simple issue of the effects of low reliability, and correcting for low reliability, on simple correlations and higher-order main effects (partial correlations, multiple regression coefficients). However, many interesting hypotheses in the social sciences involve curvilinear or interaction effects. Of course, poor reliability of main effects is compounded dramatically when those effects are used in cross-products, such as squared or cubed terms, or interaction terms. Aiken and West (1996) present a good discussion on the issue. An illustration of this effect is presented in Table 4. As Table 4 shows, even at relatively high reliabilities, the reliability of cross-products is relatively weak. This, of course, has deleterious effects on power and inference. According to Aiken and West (1996) there are two avenues for dealing with this: correcting the correlation or covariance matrix for low reliability, and then using the corrected matrix for the subsequent regression analyses, which of course is subject to the same issues discussed above, or using SEM to model the relationships in an error-free fashion.
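The weak reliability of cross-products can be illustrated with a short sketch. It assumes a commonly cited classical-test-theory result for the product of two centered, bivariate-normal variables with uncorrelated errors: reliability of the product = (r_xx·r_zz + r_xz²) / (1 + r_xz²). The function name and the chosen values are hypothetical, and the output is not claimed to reproduce Table 4.

```python
def product_reliability(r_xx: float, r_zz: float, r_xz: float = 0.0) -> float:
    """Reliability of the cross-product x*z of two centered variables.

    r_xx and r_zz are the reliabilities of x and z; r_xz is their
    observed correlation. For a squared term, pass r_zz = r_xx, r_xz = 1.
    """
    return (r_xx * r_zz + r_xz ** 2) / (1.0 + r_xz ** 2)


if __name__ == "__main__":
    for reliability in (0.90, 0.80, 0.70):
        for correlation in (0.0, 0.3, 0.5):
            value = product_reliability(reliability, reliability, correlation)
            print(f"reliability={reliability:.2f}  r_xz={correlation:.1f}  "
                  f"product reliability={value:.3f}")
```

When the two variables are uncorrelated, the reliability of the product is simply the product of the reliabilities, so two measures with reliability .80 yield an interaction term with reliability of about .64.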
The goal of disattenuation is to be simultaneously accurate (in estimating the “true” relationships) and conservative in preventing overcorrection. Overcorrection serves to further our understanding no more than leaving relationships attenuated. There are several scenarios that might lead to inappropriate inflation of estimates, even to the point of impossible values. A substantial under-estimation of the reliability of a variable would lead to substantial over-correction, and potentially impossible values. This can happen when reliability estimates are biased downward by heterogeneous scales, for example. Researchers need to seek precision in reliability estimation in order to avoid this problem. Even given accurate reliability estimates, however, it is possible that sampling error, a well-placed outlier, or even suppressor variables could inflate relationships artificially and thus, when combined with correction for low reliability, produce inappropriately high or impossible corrected values. In light of this, I would suggest that researchers make sure they have checked for these issues prior to attempting a correction of this nature (researchers should check for these issues regularly anyway).
Fortunately, as the field of measurement and statistics advances, other options for addressing these difficult issues emerge. One obvious solution to the problem posed by measurement error is to use Structural Equation Modeling (SEM) to estimate the relationship between constructs (which can be theoretically error-free given the right conditions), rather than utilizing our traditional methods of assessing the relationship between measures. This eliminates the issues of over- or under-correction, of which estimate of reliability to use, and so on. Given the easy access to SEM software, and a proliferation of SEM manuals and texts, it is more accessible to researchers now than ever before. Having said that, SEM is still a complex process, and should not be undertaken without proper training and mentoring (of course, that is true of all statistical procedures). Another emerging technology that can potentially address this issue is the use of Rasch modeling. Rasch measurement utilizes a fundamentally different approach to measurement than classical test theory, in which many of us were trained. Use of Rasch measurement provides not only more sophisticated, and probably more accurate, measurement of constructs, but also more sophisticated information on the reliability of items and individual scores. Even an introductory treatise on Rasch measurement is outside the limits of this paper, but individuals interested in exploring more sophisticated measurement models are encouraged to refer to Bond and Fox (2001) for an excellent primer.
To give a concrete example of how important this process might be as it applies to our fields of inquiry, I will draw from a survey that a couple of graduate students and I completed of the Educational Psychology literature from 1998 to 1999. This survey consisted of recording all effects from all quantitative studies published in the […] Studies from these years indicate a mean effect size ( […] From the same review of the literature, where reliabilities (Cronbach’s α) are reported, the average reliability is α = .80, with a standard deviation of .10. Table 5 shows what the results for the field of Educational Psychology in general would be if all studies disattenuated their effects for low reliability (and if we assume reported reliabilities are accurate). For example, while the average reported effect equates to a correlation coefficient of […]
If the goal of research is to provide the best estimate of an effect within a population, and we know that many of our statistical procedures assume perfectly reliable measurement, then we must assume that we are consistently under-estimating population effect sizes, usually by a dramatic amount. Using the field of Educational Psychology as an example, and using averages across two years of high-quality studies, we can estimate that while the average reported effect size is equivalent to […]
However, there are some significant caveats to this argument. First, in order to disattenuate relationships without risking over-correction you must have a good estimate of reliability, preferably Cronbach’s alpha from a homogeneous scale. Second, when disattenuating relationships, authors should report both original and disattenuated estimates, and should explicitly explain what procedures were used in the process of disattenuation. Third, when reliability estimates drop below .70 authors should consider using different measures, or alternative analytic techniques that do not carry the risk of over-correction, such as latent variable modeling, or better measurement strategies such as Rasch modeling.
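These reporting recommendations can be turned into a small helper. The sketch below is only one possible arrangement, under stated assumptions: it applies the simple-correlation correction (observed r divided by the square root of the product of the reliabilities), returns both the original and the corrected estimate, and attaches a note when a reliability falls below .70 or when the corrected value is impossible. The function name, the dictionary keys, and the example inputs are hypothetical.

```python
import math


def disattenuation_report(r_obs: float, rel_x: float, rel_y: float,
                          min_reliability: float = 0.70) -> dict:
    """Return observed and disattenuated estimates side by side,
    with cautionary notes about low reliability or over-correction."""
    if not (0.0 < rel_x <= 1.0 and 0.0 < rel_y <= 1.0):
        raise ValueError("reliabilities must lie in (0, 1]")
    r_corrected = r_obs / math.sqrt(rel_x * rel_y)
    notes = []
    if min(rel_x, rel_y) < min_reliability:
        notes.append("reliability below threshold: consider other measures "
                     "or latent variable modeling")
    if abs(r_corrected) > 1.0:
        notes.append("corrected value exceeds 1: likely over-correction")
    return {
        "observed_r": r_obs,
        "corrected_r": r_corrected,
        "observed_r2": r_obs ** 2,
        "corrected_r2": r_corrected ** 2,
        "notes": notes,
    }


if __name__ == "__main__":
    # Hypothetical inputs: one well-measured pair, one poorly measured pair.
    for args in ((0.30, 0.80, 0.80), (0.60, 0.50, 0.60)):
        print(args, disattenuation_report(*args))
```

Returning the notes alongside the estimates, rather than raising an error, keeps both values available for reporting while making the caveat visible to whoever reads the output.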
Table 2 was published previously in Osborne and Waters (2002). I would like to acknowledge the contributions of Thomas Knapp in challenging my assumptions and thinking on this topic. I hope the paper is better because of his efforts.
Aiken, L. S., & West, S. G. (1996). Multiple regression: Testing and interpreting interactions. Thousand Oaks, CA: Sage.
Bohrnstedt, G. W. (1983).
Bond, T. G., & Fox, C. M. (2001).
Nunnally, J. C. (1978).
Osborne, J. W., Christensen, W. R., & Gunter, J. (April, 2001).
Osborne, J. W., & Waters, E. (2002). Four assumptions of multiple regression that researchers should always test.
Pedhazur, E. J. (1997).
Jason W. Osborne can be contacted via email at jason_osborne@ncsu.edu, or via mail at: North Carolina State University, Campus Box 7801, Raleigh NC 27695-7801.

Descriptors: Statistical Adjustments; Correlation; Effect Size; Regression [Statistics]