# Introduction

Logistic regression is a popular supervised machine learning algorithm used to predict a categorical dependent variable from a given set of independent variables.

# Description

To get started you should know two concepts: linear regression and the sigmoid function. I wrote a blog on linear regression; click this link to read it. The sigmoid function is a simple mathematical function that maps any value between -∞ and +∞ to a value between 0 and 1.

It looks like this:

sigma(z) = 1 / (1 + e^(-z))
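A minimal sketch of the sigmoid function in Python shows the squashing behavior: large negative inputs land near 0, large positive inputs near 1, and 0 maps to exactly 0.5.

```python
import math

def sigmoid(z):
    """Map any real number to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

print(sigmoid(-10))  # very close to 0
print(sigmoid(0))    # exactly 0.5
print(sigmoid(10))   # very close to 1
```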

Since logistic regression predicts a categorical variable, the output must be a categorical value: yes or no, 1 or 0, present or absent, etc. But the model cannot produce exact values like 0 or 1 directly; it gives a probabilistic value, which lies between 0 and 1.

At the start of the article I said that you need to know linear regression, because logistic regression is very similar to it, although linear regression is used to predict a continuous variable while logistic regression is used for classification tasks. In logistic regression, instead of fitting a straight line through the data points, we fit an "S"-shaped logistic curve. It predicts the likelihood of something.

Now let's look at the model equation and its representation.

# Hypothesis:

The logistic regression equation can be obtained from the linear regression equation.

The hypothesis is

z = theta^T x = theta_0 + theta_1*x_1 + ... + theta_n*x_n

But we know that the output z can be anything from -∞ to +∞.

Since in this article we are talking about a binary classification model, our output should be a probability between 0 and 1. To get this, we pass the hypothesis through the sigmoid function.

This is the estimated probability that the predicted value is the actual value for a given input x. The output of h_theta(x) will look like 0.9 or 0.79 or something like that.

Mathematically:

h_theta(x) = 1 / (1 + e^(-theta^T x)) = P(y = 1 | x; theta)

That is, the probability that y = 1 given x, parameterized by theta.
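The hypothesis can be sketched directly from the definitions above: a dot product between the parameters and the features, pushed through the sigmoid. The theta values and feature vector here are hypothetical, just to show the shape of the computation.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def hypothesis(theta, x):
    """h_theta(x) = sigmoid(theta^T x): estimated P(y = 1 | x; theta)."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return sigmoid(z)

# Hypothetical coefficients; x starts with a constant 1 for the intercept term.
theta = [-1.0, 0.8, 0.4]
x = [1.0, 2.0, 1.5]
p = hypothesis(theta, x)
print(p)  # a probability strictly between 0 and 1
```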

# Coefficient:

The coefficients of the logistic regression algorithm are estimated from your training data using maximum likelihood estimation (MLE).

I will write another article on maximum likelihood estimation and link it here; until then, please refer to one of the hundreds of articles on Google.

Put simply, MLE is used by many ML algorithms. The intuition behind maximum likelihood for logistic regression is that a search procedure seeks the coefficient values (beta values) that minimize the error between the probabilities predicted by the model and those observed in the data.

The best parameters (beta values) would result in a model that predicts values very close to 1 for the default class (class 1), and very close to 0 for the other class.
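This search can be sketched as per-sample gradient ascent on the log-likelihood (equivalently, gradient descent on the log loss): each beta is nudged in the direction that makes the observed labels more probable. The toy one-feature dataset, learning rate, and epoch count are purely illustrative, not a production fitting routine.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Toy 1-feature dataset: (feature, label). Class 0 has small x, class 1 large x.
data = [(0.5, 0), (1.0, 0), (1.5, 0), (3.0, 1), (3.5, 1), (4.0, 1)]

b0, b1 = 0.0, 0.0   # the beta values to be estimated
lr = 0.1            # learning rate (illustrative choice)

for _ in range(5000):
    for x, y in data:
        p = sigmoid(b0 + b1 * x)
        # (y - p) is the gradient of the per-sample log-likelihood
        # with respect to z; move each beta along it.
        b0 += lr * (y - p)
        b1 += lr * (y - p) * x

# After fitting, class-0 points get probabilities near 0, class-1 near 1.
print(sigmoid(b0 + b1 * 0.5), sigmoid(b0 + b1 * 4.0))
```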

# Decision Boundary:

To predict the class of a data point, a threshold value can be set. Based on this threshold, the estimated probability is mapped to a class. Say, if the predicted value is >= 0.5, then classify the data point as class 1.
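Thresholding is a one-liner; the default cutoff of 0.5 is a common convention, and raising or lowering it trades off the two kinds of misclassification.

```python
def classify(probability, threshold=0.5):
    """Turn an estimated probability into a class label (0 or 1)."""
    return 1 if probability >= threshold else 0

print(classify(0.79))        # 1: above the default 0.5 threshold
print(classify(0.30))        # 0: below it
print(classify(0.60, 0.7))   # 0: a stricter threshold flips the decision
```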

The decision boundary can be linear, like a straight line, or non-linear, like a circle. The polynomial order of the features can be increased to obtain a more complex decision boundary.
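As a sketch of a non-linear boundary: feeding squared features into the same linear hypothesis carves out a circle. The coefficients below are hand-picked for illustration so that the h = 0.5 boundary is exactly the unit circle x1^2 + x2^2 = 1.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hand-picked coefficients for h = sigmoid(t0 + t1*x1^2 + t2*x2^2).
# With t0 = -1 and t1 = t2 = 1, z = 0 (i.e. h = 0.5) on the unit circle.
t0, t1, t2 = -1.0, 1.0, 1.0

def predict(x1, x2):
    z = t0 + t1 * x1**2 + t2 * x2**2
    return 1 if sigmoid(z) >= 0.5 else 0

print(predict(0.2, 0.3))  # inside the circle  -> class 0
print(predict(1.5, 1.5))  # outside the circle -> class 1
```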

# Cost function:

In linear regression we used mean squared error as the cost function, but here we cannot do that: with the sigmoid hypothesis it becomes a non-convex function of the parameters, and gradient descent is guaranteed to converge to the global minimum only if the function is convex.

So our cost function for logistic regression is

J(theta) = -(1/m) * sum[ y * log(h_theta(x)) + (1 - y) * log(1 - h_theta(x)) ]

If y = 0, the first term vanishes; if y = 1, the second term vanishes.

Why this is so, we will discuss in the maximum likelihood estimation article.

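The cost function above (the log loss) can be sketched directly. Note how only one of the two terms contributes for each sample, depending on its label, and how confident wrong predictions are penalized far more heavily than confident right ones.

```python
import math

def log_loss(y_true, y_prob):
    """J = -(1/m) * sum(y*log(h) + (1-y)*log(1-h))."""
    m = len(y_true)
    total = 0.0
    for y, h in zip(y_true, y_prob):
        # For y = 1 only -log(h) contributes; for y = 0 only -log(1 - h).
        total += -(y * math.log(h) + (1 - y) * math.log(1 - h))
    return total / m

good = log_loss([1, 0], [0.9, 0.1])  # confident and correct: small cost
bad = log_loss([1, 0], [0.1, 0.9])   # confident and wrong: large cost
print(good, bad)
```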

# Types of logistic regression:

1. Binary logistic regression

The categorical dependent variable has only two possible outcomes, e.g. present or absent.

2. Multinomial logistic regression

The dependent or target variable has three or more possible responses without any ordering, e.g. which social media platform users use the most (Facebook, Twitter, or Instagram).

3. Ordinal logistic regression

The target variable has three or more possible responses with an ordering, e.g. a movie rating (1 to 5).

# Note:

1. Do not use interrelated data. If some observations are related to one another, the machine will overweight their significance.

2. As mentioned earlier, avoid continuous outcomes. Temperature, time, or anything open-ended will make the model less precise.

3. Logistic regression is a linear algorithm: it assumes a linear relationship between the input variables and the log-odds of the output variable. Transforms of your input variables that better expose this linear relationship can result in a more accurate model.

4. Remove noisy instances. Consider removing outliers and misclassified instances from your training data.

5. Logistic regression uses the same concept of predictive modeling as regression, which is why it is called logistic regression; but because it is used to classify samples, it falls under classification algorithms.