Improved 10-Step Modeling Process

When I'm asked to help build a linear regression model from historical data, I recommend designed experiments (DoEs) if possible. If that's not possible, I judiciously point out the pitfalls, then suggest the following:

1. Select a continuous response variable (Y) to be modeled. (We'll leave responses on other measurement scales for another column.)
2. Choose a set of candidate predictor variables: Xs that subject matter expertise suggests might cause a change in Y.
3. Using software such as Minitab, perform a best subsets regression with the candidate variables.
4. Select the best subset model using the following criteria: Cp ≤ p+1, where p is the number of predictors in the subset (minimizing Cp is another criterion that is often used); the standard error of the regression should be small; R2 (or R2 adjusted) should be large. There may be more than one model that bears further study.
5. Fit a regression model (or models) that includes all Xs in the chosen best subset or subsets. Be sure the analysis includes the Variance Inflation Factors (VIFs). A VIF measures how strongly one X is correlated with the other Xs in the model. A large VIF indicates that the standard error of that predictor's coefficient is inflated.
6. If a "best subset" by the above criteria includes Xs with VIFs greater than 10, drop that subset.
7. Apply Occam's razor, the philosophy that one should not increase, beyond what is necessary, the number of entities required to explain anything. Here it means using as few predictors as possible: don't add variables just to get a small improvement in R2 or the standard error. Adjust your model selection accordingly.
8. Assess the quality of the fitted model. Look at the residuals for patterns, lack of normality, or outliers. If justified, remove the cases responsible for the problems or apply a transformation to the data. Also use a procedure to identify influential observations; Minitab's DFITS metric is a good way to do this. DFITS represents roughly the number of estimated standard deviations by which the fitted value for the ith observation changes when that observation is removed from the data. An easy way to compare DFITS values is to graph them in a boxplot and look for extreme values. Brush these values to identify the cases responsible and consider dropping them.
9. Repeat steps 3-5 until all the criteria are met. If no acceptable model results from these steps, return to step 2.
10. If a transformation was used, convert the predictions back to the original units and compare them to the actual values.
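
For readers without Minitab, the calculations behind steps 3 through 8 can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not Minitab's implementation: the data are synthetic and invented for the example (x3 is built as a near copy of x1, and one unusual case is planted at row 0), the helper names fit_ols, best_subsets, vifs, and dffits are hypothetical, and treating x1 and x2 as the "chosen" subset is an assumption made only to show the DFITS step. Cp is computed with the usual Mallows formula, SSE_subset / MSE_full - n + 2(p + 1), and DFITS (often spelled DFFITS elsewhere) with the standard deleted-residual formula.

```python
import itertools
import numpy as np

def fit_ols(X, y):
    """Least-squares fit with an intercept. Returns (design matrix, coefficients, SSE)."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return A, beta, float(resid @ resid)

def best_subsets(X, y):
    """Score every non-empty subset of columns: (columns, Cp, S, adjusted R2)."""
    n, k = X.shape
    mse_full = fit_ols(X, y)[2] / (n - k - 1)      # error variance from the full model
    sst = float(((y - y.mean()) ** 2).sum())
    rows = []
    for p in range(1, k + 1):
        for cols in itertools.combinations(range(k), p):
            sse = fit_ols(X[:, list(cols)], y)[2]
            cp = sse / mse_full - n + 2 * (p + 1)  # Mallows' Cp; compare with p + 1
            s = (sse / (n - p - 1)) ** 0.5         # standard error of the regression
            r2_adj = 1 - (sse / (n - p - 1)) / (sst / (n - 1))
            rows.append((cols, cp, s, r2_adj))
    return rows

def vifs(X):
    """VIF for each column: regress it on the other columns, VIF = 1 / (1 - R2)."""
    out = []
    for j in range(X.shape[1]):
        xj = X[:, j]
        sse = fit_ols(np.delete(X, j, axis=1), xj)[2]
        sst = float(((xj - xj.mean()) ** 2).sum())
        out.append(sst / sse)                      # algebraically equal to 1 / (1 - R2)
    return out

def dffits(X, y):
    """DFITS for each case, from leverages and externally studentized residuals."""
    A, beta, sse = fit_ols(X, y)
    n, m = A.shape                                 # m = number of fitted parameters
    resid = y - A @ beta
    h = np.einsum("ij,ji->i", A, np.linalg.pinv(A))        # diagonal of the hat matrix
    s2_deleted = (sse - resid**2 / (1 - h)) / (n - m - 1)  # variance with case i removed
    t = resid / np.sqrt(s2_deleted * (1 - h))
    return t * np.sqrt(h / (1 - h))

# Synthetic data, invented for illustration only.
rng = np.random.default_rng(0)
n = 40
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = x1 + 0.05 * rng.normal(size=n)   # assumption: x3 nearly duplicates x1
X = np.column_stack([x1, x2, x3])
y = 2 + 3 * x1 - 2 * x2 + rng.normal(size=n)
y[0] += 15                            # assumption: one planted unusual case

rows = best_subsets(X, y)
for cols, cp, s, r2_adj in rows:
    print(cols, "Cp=%.2f" % cp, "p+1=%d" % (len(cols) + 1), "S=%.3f" % s, "R2adj=%.3f" % r2_adj)

vif_all = vifs(X)                     # VIFs when all three Xs are in the model
print("VIFs:", [round(v, 1) for v in vif_all])

chosen = [0, 1]                       # hypothetical chosen subset: x1 and x2
d = dffits(X[:, chosen], y)
print("Most influential case:", int(np.argmax(np.abs(d))))
```

The subset listing plays the role of the best subsets table in steps 3 and 4, the VIF list supports the screening in steps 5 and 6, and the DFITS values are what step 8 suggests plotting in a boxplot. Because the full model's Cp always equals its own p + 1, that row is uninformative; the comparison matters for the smaller subsets.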

Models built with this procedure won't be perfect. If you can conduct designed experiments, you can certainly do better. A trained statistician or, perhaps, a Master Black Belt could apply any of a number of advanced statistical methods, such as principal components analysis, ridge regression, or partial least squares, to name a few. But most Six Sigma Black Belts and Green Belts will find themselves forced to glean as much information as possible from historical data using the modeling tool they learned in their training: linear regression analysis. This procedure is written for this group of noble change agents. As DoE guru G.E.P. Box says, "All models are wrong. Some models are useful."