Jared Knowles
October, 2013
Applied modeling goes by many names: statistical learning, machine learning, predictive analytics, and data mining.
The key differences between applied modeling and statistical inference are:
Applied modeling and inferential statistics share many of the same concepts:
A key distinction in statistical learning is that between supervised and unsupervised techniques.
We will focus on supervised learning for the most part in this talk.
Remember that in supervised statistical modeling, we are estimating the following relationship:
\[ \hat{Y} = \hat{f}(X) \]
In this case \( \hat{f} \) represents our estimate of the function that links \( X \) and \( Y \). In traditional linear modeling, \( \hat{f} \) takes the form:
\[ \hat{Y} = \hat{\alpha} + \hat{\beta} X \]
However, there are limitless alternative forms of \( \hat{f} \) that we can explore. Applied modeling techniques help us expand the space of \( \hat{f} \) we search within.
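To make the idea of alternative \( \hat{f} \) concrete, here is a minimal sketch in Python that fits two different \( \hat{f} \) to the same data: a linear one estimated by ordinary least squares, and a more flexible k-nearest-neighbor average. The data are synthetic, and the function names (`fit_linear`, `fit_knn`, `mse`) are hypothetical helpers chosen for illustration, not anything from the talk.

```python
import random


def fit_linear(x, y):
    """Estimate alpha and beta by ordinary least squares (one predictor)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    # The fitted f-hat: maps a new x to a predicted y.
    return lambda x_new: alpha + beta * x_new


def fit_knn(x, y, k=3):
    """A more flexible f-hat: average y over the k nearest training x values."""
    pairs = list(zip(x, y))

    def predict(x_new):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x_new))[:k]
        return sum(p[1] for p in nearest) / len(nearest)

    return predict


def mse(f_hat, x, y):
    """Mean squared error of f_hat on the points (x, y)."""
    return sum((f_hat(xi) - yi) ** 2 for xi, yi in zip(x, y)) / len(x)


# Synthetic data (an assumption for illustration): a curved relationship plus noise.
rng = random.Random(42)
x = [i / 10 for i in range(50)]
y = [xi ** 2 + rng.gauss(0, 0.5) for xi in x]

f_hat_linear = fit_linear(x, y)
f_hat_knn = fit_knn(x, y, k=3)

print("linear training MSE:", round(mse(f_hat_linear, x, y), 3))
print("k-NN training MSE:  ", round(mse(f_hat_knn, x, y), 3))
```

Because the synthetic relationship is curved, the linear \( \hat{f} \) cannot follow it, while the nearest-neighbor \( \hat{f} \) can. The price is that the nearest-neighbor fit has no coefficients to interpret.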
Choosing \( \hat{f} \) involves tradeoffs. The most obvious is between flexibility and interpretability.
Applied Models:
Inferential Models:
Consider the following training data:
How does our model fit the test data?
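As a rough sketch of this question, the Python snippet below compares training and test error for a k-nearest-neighbor \( \hat{f} \) at two levels of flexibility. Everything here is an assumption for illustration: the data are synthetic (a sine curve plus noise, not the data from the talk), and `knn_predict` and `mse` are hypothetical helper names.

```python
import math
import random


def knn_predict(train, x_new, k):
    """Predict by averaging y over the k training points nearest to x_new."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x_new))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def mse(train, data, k):
    """Mean squared error of the k-NN fit (built on train) evaluated on data."""
    return sum((knn_predict(train, x, k) - y) ** 2 for x, y in data) / len(data)


# Synthetic data (an assumption): y = sin(x) + Gaussian noise.
rng = random.Random(1)
xs = [rng.uniform(0, 6) for _ in range(200)]
points = [(x, math.sin(x) + rng.gauss(0, 0.3)) for x in xs]

# Hold out half of the points as test data the model never sees while fitting.
train, test = points[:100], points[100:]

for k in (1, 10):
    train_err = mse(train, train, k)
    test_err = mse(train, test, k)
    print(f"k={k:2d}  train MSE={train_err:.3f}  test MSE={test_err:.3f}")
```

With k = 1, each training point is its own nearest neighbor, so the training error is zero even though the data are noisy. The test error is not zero, which is why performance on held-out data, rather than training fit, is the more honest measure of a flexible model.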