Statistical Resources

Macro programs for:

[main][tools][misc][SPSS][Stata][R][SAS]

by
John Hendrickx

Perturb


A PDF version is available of the paper presented at the RC33 conference in Amsterdam, August 17-20, 2004.

Perturb is a tool for evaluating ill-conditioning in statistical analysis. It is based on the procedures described in chapter 11 of Belsley (1991). Ill-conditioning means that estimated coefficients are unstable, i.e. small changes within the range of measurement error of the variables can lead to disproportionately large changes in the estimates. Collinearity is one type of ill-conditioning.

Perturb evaluates conditioning by adding small random changes (perturbations) to selected variables and then re-estimating the model. For categorical variables, a small percentage of misclassification is imposed instead. This process is repeated a number of times. Descriptive statistics (standard deviation, minimum and maximum) of the coefficients across the perturbed estimates show how stable the coefficients are when the data are subject to small changes.
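The procedure for continuous variables can be sketched roughly as follows. This is a minimal illustration in Python, not the code of the SPSS, Stata or R macros: the function names (ols, perturb_summary), the simulated data and the perturbation size of 0.05 are all assumptions made for the example. The regression is solved from the normal equations to keep the sketch free of dependencies; a built-in regression routine would be the more accurate choice in practice.

```python
import random
import statistics


def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X)b = X'y,
    solved by Gaussian elimination with partial pivoting."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(r[i] * r[j] for r in X) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(X, y))]
        for i in range(k)
    ]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, k):
            factor = A[r][col] / A[col][col]
            for c in range(col, k + 1):
                A[r][c] -= factor * A[col][c]
    # Back substitution
    beta = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(A[i][j] * beta[j] for j in range(i + 1, k))
        beta[i] = (A[i][k] - tail) / A[i][i]
    return beta


def perturb_summary(columns, y, noise_sd, n_iter=100, seed=0):
    """Add normal noise to each predictor column, re-estimate the model,
    repeat n_iter times, and summarise the coefficients (hypothetical helper)."""
    rng = random.Random(seed)
    n = len(y)
    estimates = []
    for _ in range(n_iter):
        noisy = [[v + rng.gauss(0.0, sd) for v in col]
                 for col, sd in zip(columns, noise_sd)]
        X = [[1.0] + [col[i] for col in noisy] for i in range(n)]
        estimates.append(ols(X, y))
    names = ["intercept"] + [f"x{j + 1}" for j in range(len(columns))]
    summary = {}
    for j, name in enumerate(names):
        vals = [b[j] for b in estimates]
        summary[name] = {"sd": statistics.stdev(vals),
                         "min": min(vals), "max": max(vals)}
    return summary


# Simulated data (an assumption for illustration): x2 is either almost a
# copy of x1 (collinear) or drawn independently of it.
rng = random.Random(1)
n = 200
x1 = [rng.gauss(0, 1) for _ in range(n)]
x2_collinear = [a + rng.gauss(0, 0.05) for a in x1]
x2_independent = [rng.gauss(0, 1) for _ in range(n)]
noise = [rng.gauss(0, 0.5) for _ in range(n)]
y_collinear = [a + b + e for a, b, e in zip(x1, x2_collinear, noise)]
y_independent = [a + b + e for a, b, e in zip(x1, x2_independent, noise)]

ill = perturb_summary([x1, x2_collinear], y_collinear, noise_sd=[0.05, 0.05])
well = perturb_summary([x1, x2_independent], y_independent, noise_sd=[0.05, 0.05])
for label, res in (("collinear", ill), ("independent", well)):
    print(label, {name: round(stats["sd"], 4) for name, stats in res.items()})
```

In the collinear case the standard deviations of the x1 and x2 coefficients across iterations should come out far larger than in the independent case, which is the kind of instability the perturbation analysis is meant to expose.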

Advantages of this method are:

  • results are less abstract than collinearity diagnostics such as condition indexes or variance inflation factors (VIF)
  • categorical as well as continuous variables can be evaluated
  • perturbation analysis can be used for models other than linear regression
  • derived terms such as interactions or non-linear transformations are taken into account
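
For categorical variables, imposing a small percentage of misclassification might look like the sketch below. It uses a deliberately simple scheme, in which each case moves to a different category with a fixed probability and the new category is chosen uniformly. That scheme is an assumption for illustration and is not necessarily how the macros reclassify cases. The function name misclassify, the example variable and the 5% rate are likewise hypothetical.

```python
import random


def misclassify(values, rate, rng):
    """Return a copy of `values` in which each case is, with probability
    `rate`, moved to a different category chosen uniformly at random."""
    categories = sorted(set(values))
    if len(categories) < 2:
        return list(values)  # nothing to reclassify into
    perturbed = []
    for v in values:
        if rng.random() < rate:
            perturbed.append(rng.choice([c for c in categories if c != v]))
        else:
            perturbed.append(v)
    return perturbed


# Illustrative data: a three-category variable with 100 cases.
rng = random.Random(42)
education = ["low"] * 40 + ["middle"] * 40 + ["high"] * 20
noisy_education = misclassify(education, rate=0.05, rng=rng)
changed = sum(a != b for a, b in zip(education, noisy_education))
print(f"{changed} of {len(education)} cases reclassified")
```

In a full perturbation analysis the reclassified variable would be recoded into dummies (and any interactions involving it recomputed) before the model is re-estimated in each iteration.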

Macro programs


SPSS
This version is the oldest and most limited. It cannot handle categorical variables and can only be used for linear regression models. Because it uses matrix commands to estimate the regression models, it can run into problems when collinearity is high, since the matrix commands may then produce less accurate results than a built-in regression command.
Stata
This version is available via SSC. In Stata, type findit perturb and follow the instructions.
R
This version is available via CRAN. According to the R FAQ, the R version of perturb could be ported to S-PLUS fairly easily.

References

Belsley, D.A. (1991). Conditioning Diagnostics: Collinearity and Weak Data in Regression. New York: John Wiley & Sons.
