Classification (Part 1) — Intro to Logistic Regression
Marco Peixeiro · Dec 10

An introduction to classification using logistic regression.

[Photo by Kimberly Farmer on Unsplash]

Overview

Previously, we saw that linear regression assumes the response variable is quantitative. In many situations, however, the response is actually qualitative, like the color of someone's eyes. Such a response is known as categorical.

Classification is the process of predicting a qualitative response. Methods used for classification often predict the probability of each category of a qualitative variable and use those probabilities as the basis for the classification. In that sense, they behave like regression methods.

With classification, we can answer questions like:

- A person has a set of symptoms that could be attributed to one of three medical conditions. Which one?
- Is a transaction fraudulent or not?

To help answer such questions, different methods are used, like logistic regression, linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), k-nearest neighbors (KNN), and others. For now, we will focus mainly on understanding the theory of logistic regression. Let's get to it!

Logistic Regression

When it comes to classification, we determine the probability that an observation belongs to a certain class. Therefore, we wish to express that probability as a value between 0 and 1, where a probability close to 1 means the observation is very likely to belong to that category.

To generate values between 0 and 1, we express the probability using the sigmoid function:

    p(X) = e^(β0 + β1X) / (1 + e^(β0 + β1X))

Plot this equation and you will see that it always produces an S-shaped curve bounded between 0 and 1.

[Logistic regression curve]

After some manipulation of the equation above, you find that:

    p(X) / (1 − p(X)) = e^(β0 + β1X)

Take the log on both sides:

    log(p(X) / (1 − p(X))) = β0 + β1X

The left-hand side of this equation is known as the logit, or log-odds. As you can see, it is linear in X. Here, if the coefficient β1 is positive,
then an increase in X will result in a higher probability.

Estimating the coefficients

As in linear regression, we need a way to estimate the coefficients. For logistic regression, we choose the coefficients that maximize the likelihood function:

    ℓ(β0, β1) = ∏ p(x_i) × ∏ (1 − p(x_i′))

where the first product runs over observations with y_i = 1 and the second over observations with y_i′ = 0. The intuition here is that we want coefficients such that the predicted probability p(x_i) is as close as possible to the observed class of each observation.

Similarly to linear regression, we use the p-value to determine whether the null hypothesis is rejected. The z-statistic is also widely used: a large absolute z-statistic means that the null hypothesis is rejected. Remember that the null hypothesis states that there is no relationship between the features and the target.

Multiple logistic regression

Of course, logistic regression can easily be extended to accommodate more than one predictor:

    log(p(X) / (1 − p(X))) = β0 + β1X1 + … + βpXp

Note that multiple logistic regression may give better results, because it can take into account correlations among predictors, a phenomenon known as confounding.
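The sigmoid and logit relationship described above can be checked numerically. Here is a minimal NumPy sketch; the coefficient values β0 = −1 and β1 = 2 are arbitrary choices for illustration, not anything from the article:

```python
import numpy as np

def sigmoid(z):
    """Map any real value to (0, 1): p = e^z / (1 + e^z)."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical coefficients for the log-odds: logit(p) = b0 + b1 * x
b0, b1 = -1.0, 2.0
x = np.linspace(-5, 5, 11)
p = sigmoid(b0 + b1 * x)

# The probabilities stay strictly between 0 and 1 (the S-shaped curve)
assert np.all((p > 0) & (p < 1))

# Taking log-odds recovers a function that is linear in x, as claimed
logit = np.log(p / (1 - p))
assert np.allclose(logit, b0 + b1 * x)
```

Raising x increases the log-odds (since b1 is positive here), and the sigmoid turns that into a higher probability, which is exactly the interpretation of a positive coefficient given above.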
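To see maximum likelihood estimation in action, here is a sketch using scikit-learn's LogisticRegression on synthetic data. The data, the true coefficients, and the large C used to approximate an unpenalized fit are all illustrative assumptions:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic data: two correlated predictors (illustrative values throughout)
n = 2000
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)  # correlated with x1
X = np.column_stack([x1, x2])

# True model: the log-odds are linear in the predictors
b0_true, b_true = 0.5, np.array([1.5, -2.0])
p = 1.0 / (1.0 + np.exp(-(b0_true + X @ b_true)))
y = rng.binomial(1, p)

# scikit-learn fits by maximizing a regularized likelihood; a very
# large C makes the penalty negligible, approximating plain MLE
model = LogisticRegression(C=1e6, max_iter=1000).fit(X, y)

print(model.intercept_, model.coef_)  # estimates should land near 0.5 and [1.5, -2.0]
```

With 2,000 observations, the fitted intercept and coefficients land close to the values used to generate the data, which is the "predicted probability close to the observed state" intuition at work.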