14.1 Logistic Regression

  • Logistic regression belongs to the class of generalised linear models (GLMs).

  • It is used to model data with a dichotomous (two-category) response variable.

  • Logistic regression models the conditional probability of the response variable rather than its value.

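  • A logit link function, defined as \(\operatorname{logit}\,p=\log[p/(1-p)]\), maps probabilities in \((0,1)\) onto the whole real line, so that they can be modelled with a linear predictor; its inverse maps the output of the linear model back to a valid probability.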

  • A linear model for these transformed probabilities can be set up as

\[\begin{equation} \operatorname{logit}\,p=\beta_{0}+\beta_{1}x_{1}+\beta_{2}x_{2}+\ldots+\beta_{k}x_{k} \tag{14.1} \end{equation}\]
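To make the transformation in (14.1) concrete, the following is a minimal sketch in R of the logit and its inverse. The helper names logit and inv_logit, the coefficients beta0 and beta1, and the predictor value x1 are hypothetical values chosen only for illustration; they do not come from a fitted model.

```r
# Logit: maps a probability in (0, 1) onto the whole real line.
logit <- function(p) log(p / (1 - p))

# Inverse logit (logistic function): maps a linear predictor back to (0, 1).
inv_logit <- function(eta) 1 / (1 + exp(-eta))

# Hypothetical coefficients and predictor value, for illustration only.
beta0 <- -1.5
beta1 <- 0.8
x1 <- 2

eta <- beta0 + beta1 * x1   # linear predictor, i.e. logit(p) in (14.1)
p <- inv_logit(eta)         # probability implied by the linear predictor

# Base R provides the same transformations as qlogis() and plogis().
all.equal(logit(p), qlogis(p))
all.equal(inv_logit(eta), plogis(eta))
```

Applying logit to p recovers the linear predictor eta, which is why a model that is linear on the logit scale always yields probabilities strictly between 0 and 1.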
  • R provides the glm function for fitting generalised linear models, including the logistic regression model.
  • We will use the caret package to model logistic regression later in this topic.
  • See Hastie et al. (2013) and Boehmke and Greenwell (2019) for further details on logistic regression.
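As a rough illustration of the glm function mentioned above, the sketch below fits a logistic regression with a single predictor. The built-in mtcars data set, the choice of am as the response and wt as the predictor, and the new_cars data frame are all assumptions made for this example; they are not taken from the text.

```r
# Illustrative data: the built-in mtcars data set (an assumption, not from the text).
# am is a dichotomous response: 0 = automatic, 1 = manual transmission.
fit <- glm(am ~ wt, data = mtcars, family = binomial(link = "logit"))

# Estimated coefficients are on the logit (log-odds) scale.
coef(fit)

# Hypothetical new observations: car weights in units of 1000 lbs.
new_cars <- data.frame(wt = c(2.5, 3.5))

# type = "link" returns the linear predictor (the logit scale);
# type = "response" applies the inverse logit and returns probabilities.
eta_hat <- predict(fit, newdata = new_cars, type = "link")
p_hat <- predict(fit, newdata = new_cars, type = "response")

# The two scales are connected by the inverse logit, plogis().
all.equal(unname(p_hat), unname(plogis(eta_hat)))
```

Setting family = binomial is what selects logistic regression; the logit link is the default for that family and is written out here only to make the connection to the link function explicit.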

References

Boehmke, Brad, and Brandon M Greenwell. 2019. Hands-On Machine Learning with R. CRC Press. https://bradleyboehmke.github.io/HOML/.
Hastie, Trevor, Robert Tibshirani, Gareth James, and Daniela Witten. 2013. An Introduction to Statistical Learning with Applications in R. Springer New York.