Sunday, 24 January 2016


Hello once again. How has your week been? Hope it has been good. Thanks for visiting my blog once again. Today I would like to talk about one of the most popular and useful machine learning algorithms – Linear Regression.
First, what is regression? Regression basically describes the relationship between numbers. For example, there is a relationship between height (a number) and weight (another number). Generally, weight tends to increase with height. Formally, regression is concerned with identifying the relationship between a single numeric variable (called the dependent variable, response or outcome) we are interested in and one or more variables (called the independent variable or predictors). If there is only a single independent variable, this is called simple linear regression, otherwise it’s known as multiple linear regression.
What we assume in regression is that the relationship between the independent variable and the dependent variable follows a straight line. It models this relationship using the equation below:

                 y = a + bx 

y – the dependent variable
a – intercept, this is the value of y when x = 0
b – slope, this is how much y changes for an increment in the value of x

How Regression works
The goal of regression is to find a line that best fits our data. Let me illustrate with the following scatterplot showing the relationship between height (in inches) and weight (in pounds)

From the scatterplot, it can be seen that weight generally increases with height and vice-versa. Now how do we find the line that best fits this data? This is done by finding the line has the lowest sum of squared residuals. Let me explain, the equation for y shown above generates the predicted value for y which will differ from the actual value of y by some value (called residual or error). This value is squared and summed for all points in our data and a line that has the lowest sum of squared errors is chosen. This is done by adjusting the values of a and b to values such that they gives a line that fits our data. Let’s show the same data fitted with the line of best of fit.

Although the fitted line does not pass through each point in the data, it does a pretty good job of capturing the trend in our data.

How to choose a and b
Earlier on I said we choose line with a and b such that it gives the lowest sum of squared errors. How exactly do we do this? There are three ways:

1. Ordinary least squares estimation.
2. Gradient descent.
3. The normal equation.

I won’t go in depth in describing this methods but a Google search for any of these terms will give you more information if you are interested in knowing more about them.

Congratulations!!! Now you know about linear regression one of the most powerful tools in machine learning. In the next post, I will demonstrate how to perform linear regression using a popular programming language – Python. If you have a question please feel free to drop a comment.

Thanks once again for visiting my blog, hope you have a wonderful and productive week ahead. Cheers.