Standardization ado for Stata

I often want to standardize my variables before using them in models, especially if a variable is heavily right-skewed (then I like to log-transform it; e.g. number of protesters) or if its coefficient is hard to interpret anyway (then I tend to follow Gelman’s (2008) approach; e.g. age and age^2, which yield very small coefficients in a regression). So far I have generated such standardizations by hand, e.g.:

gen log_protesters=log(protesters+1)

However, since I tend to be lazy and try to minimize the amount of syntax I write, I wrote a small program called standard2, which creates three new variables for each variable specified after the command “standard2”:

standard2 variable1 variable2

For each variable specified, it creates three new variables, e.g. for variable1: std2_variable1 (variable1 standardized by two standard deviations), mc_variable1 (variable1 mean-centered), and log_variable1 (log(variable1+1)).
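To make the three transformations concrete, here is a minimal sketch of what such a program could look like. It is not the code of the downloadable ado: the name standard2_sketch is hypothetical, and I assume that "standardized by 2 std. dev." means Gelman's (2008) rescaling, i.e. subtracting the mean and dividing by two standard deviations.

```stata
* Hypothetical sketch; the actual standard2.ado may be implemented differently.
capture program drop standard2_sketch
program define standard2_sketch
    version 12
    syntax varlist(numeric)
    tempname m s
    foreach v of local varlist {
        * Store mean and standard deviation of the current variable
        quietly summarize `v'
        scalar `m' = r(mean)
        scalar `s' = r(sd)
        * Assumed Gelman (2008) rescaling: center, then divide by 2 SD
        generate double std2_`v' = (`v' - `m') / (2 * `s')
        * Mean centering only
        generate double mc_`v' = `v' - `m'
        * Log transform with a +1 offset so that zeros stay defined
        generate double log_`v' = log(`v' + 1)
    }
end

* Usage on Stata's bundled example data
sysuse auto, clear
standard2_sketch price mpg
summarize std2_price mc_price log_price
```

With this rescaling, each std2_ variable has a mean of about zero and a standard deviation of 0.5, which is what puts its coefficient on roughly the same scale as that of a binary predictor.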

The program can be downloaded as a Stata ado file here. You will need to unpack the .zip and copy the ado file into your Stata ado directory (a how-to can be found here).
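If you are unsure where your ado directory is, Stata can tell you. The lines below are a small sketch using built-in commands; I assume you place the file in the PERSONAL directory, which is only one of several locations Stata searches.

```stata
* List the directories Stata searches for ado files
sysdir

* Show the PERSONAL directory, a common place for user-written ado files
display "`c(sysdir_personal)'"

* After copying the file, check whether Stata finds the program
capture noisily which standard2
```

If the last command prints a path to standard2.ado, the program is installed and ready to use.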

Note: Do not “blindly” use the program. Familiarize yourself with the pros and cons of standardization in general and with which approach might be most suitable in your case.

References

Gelman, Andrew. 2008. “Scaling Regression Inputs by Dividing by Two Standard Deviations.” Statistics in Medicine 27(15): 2865–73.

King, Gary. 1986. “How Not to Lie with Statistics: Avoiding Common Mistakes in Quantitative Political Science.” American Journal of Political Science 30(3): 666–87.
