Definition:
Statistical learning is a field of study that uses statistical methods to analyze data and to make predictions or decisions from it. It develops models that identify patterns in data and predict future outcomes. Its main goal is to understand complex data sets well enough to make accurate predictions or decisions.
Starting point:
- Outcome measurement Y, also known as the dependent variable, response, or target.
- Vector of predictor measurements X, also known as the independent variables, inputs, regressors, covariates, or features.
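As a rough illustration of this notation, the sketch below lays out a tiny data set in plain Python. The feature names and all numbers are hypothetical, chosen only to show that each observation pairs one row of X with one value of Y.

```python
# Hypothetical toy data set: three observations, two predictors each.
feature_names = ["size_m2", "n_rooms"]  # illustrative names, not from the notes

# X: one row per observation, one column per predictor (feature).
X = [
    [50.0, 2.0],
    [80.0, 3.0],
    [120.0, 5.0],
]

# Y: one outcome (response) per observation, e.g. a price.
Y = [150.0, 240.0, 400.0]

n = len(X)      # number of observations
p = len(X[0])   # number of predictors

# Every row of X must have p entries and a matching outcome in Y.
rows_ok = all(len(row) == p for row in X) and len(Y) == n

# Pair each predictor vector with its outcome.
pairs = list(zip(X, Y))
```

Here `n` counts observations and `p` counts predictors, so X can be read as an n-by-p table and Y as a column of length n.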
Problems:
- Supervised:
  - Regression problem: Y is quantitative (e.g. price, blood pressure).
  - Classification: Y takes a value in a finite, unordered set (on|off, digit 0-9, make of a car).
  - We train the model on examples (instances) of the data for which Y is observed.
- Unsupervised:
  - Clustering: there is no outcome Y, only the predictors X; the model has to find groups in the data on its own.
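The three problem types above can be sketched on toy data in plain Python. This is a minimal illustration under stated assumptions, not a recommended implementation: the data are made up, and the specific methods (a least-squares line, a 1-nearest-neighbor rule, and plain k-means) are stand-ins chosen for brevity.

```python
from statistics import mean


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))


# Supervised, regression: Y is quantitative.
def fit_line(x, y):
    """Least-squares fit of y = a + b * x for a single predictor."""
    x_bar, y_bar = mean(x), mean(y)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
        (xi - x_bar) ** 2 for xi in x
    )
    a = y_bar - b * x_bar
    return a, b


# Supervised, classification: Y is a label from a finite set.
def nearest_neighbor(train_x, train_y, query):
    """Return the label of the training point closest to the query."""
    best = min(range(len(train_x)), key=lambda i: sq_dist(train_x[i], query))
    return train_y[best]


# Unsupervised, clustering: no Y at all, only X.
def k_means(data, init_centers, n_iter=10):
    """Plain k-means: alternate cluster assignment and center update."""
    centers = list(init_centers)
    assign = []
    for _ in range(n_iter):
        assign = [
            min(range(len(centers)), key=lambda k: sq_dist(p, centers[k]))
            for p in data
        ]
        for k in range(len(centers)):
            members = [p for p, c in zip(data, assign) if c == k]
            if members:  # keep the old center if a cluster is empty
                centers[k] = tuple(mean(coord) for coord in zip(*members))
    return assign


# Hypothetical toy data.
size = [50.0, 60.0, 80.0, 100.0]      # predictor X
price = [150.0, 180.0, 240.0, 300.0]  # quantitative outcome Y
a, b = fit_line(size, price)
predicted_price = a + b * 70.0

points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 7.5)]
labels = ["off", "off", "on", "on"]   # categorical outcome Y
predicted_label = nearest_neighbor(points, labels, (8.5, 8.0))

# Clustering sees the same points but never sees the labels.
clusters = k_means(points, [points[0], points[-1]])
```

The two supervised functions both take observed outcomes (`price`, `labels`) as input, while `k_means` receives only the points; that difference in inputs is the distinction the list draws.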