Statistics 101 courses can be a little misleading. I would say about 60% of a basic stats course resembles real-life statistics. In the real world, you can pick the correct statistical test for a particular outcome variable and still not be entitled to run it. Identifying the correct test is only part of the battle. The real battle comes when you check the assumptions behind that test. For instance, you may want to determine whether your outcome variable is normally distributed. If it does not follow a normal distribution, you may decide to use a log transformation to bring it closer to normal. There are a few typical assumptions that you need to evaluate before you conduct a linear regression. Here is a list of things to evaluate (found on the UCLA site):
- Linearity: Is the relationship between the predictors and the outcome linear? In other words, look at the scatter plot and determine whether the values resemble a straight line or a curve. A curve suggests a non-linear relationship, and you might decide to transform the data to fit this assumption.
- Normality: You want to make sure your errors are normally distributed, which is self-explanatory to an extent. There are formal tests for checking normality, such as the Shapiro-Wilk test.
- Homogeneity of variance (homoscedasticity): The error variance should be constant. If the spread of the errors changes across the range of predicted values, the errors are heteroscedastic and the assumption is violated.
- Independence: The errors associated with one observation are not correlated with the errors of any other observation.
- Model specification: Make sure all the relevant variables are in the model and that irrelevant ones are left out.
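To make the checklist a bit more concrete, here is a rough sketch in plain Python of how a few of these checks could be computed from the residuals of a one-predictor regression. The data are simulated and every function name here (`fit_line`, `skewness`, `durbin_watson`, `variance_ratio`) is my own invention for illustration, not something from the UCLA material. These are quick screening numbers, not formal tests, and model specification is not something a snippet like this can check for you.

```python
import random
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def residuals(x, y, intercept, slope):
    """Observed outcome minus the fitted value for each observation."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def skewness(values):
    """Normality screen: roughly 0 for symmetric (e.g. normal) errors."""
    m = mean(values)
    n = len(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def durbin_watson(res):
    """Independence screen: near 2 when consecutive errors are uncorrelated."""
    num = sum((res[i] - res[i - 1]) ** 2 for i in range(1, len(res)))
    den = sum(r ** 2 for r in res)
    return num / den


def variance_ratio(x, res):
    """Homoscedasticity screen: residual spread for high x over low x.

    A ratio far from 1 hints that the error variance is not constant.
    """
    ordered = [r for _, r in sorted(zip(x, res))]
    half = len(ordered) // 2
    low = sum(r ** 2 for r in ordered[:half]) / half
    high = sum(r ** 2 for r in ordered[-half:]) / half
    return high / low


# Simulated data (an assumption for illustration): a truly linear
# relationship with independent, constant-variance normal errors.
random.seed(0)
x = [i / 10 for i in range(200)]
y = [2.0 + 0.5 * xi + random.gauss(0, 1) for xi in x]

intercept, slope = fit_line(x, y)
res = residuals(x, y, intercept, slope)

print("intercept, slope:", intercept, slope)
print("residual skewness:", skewness(res))
print("Durbin-Watson:", durbin_watson(res))
print("variance ratio (high x / low x):", variance_ratio(x, res))
```

For linearity, the simplest check is still the picture: plot the residuals against the predictor and look for a curve. In real work you would probably lean on a statistics package for the formal versions of these checks rather than rolling your own.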
If you want the specifics on each assumption, more to follow!

-Amy