You take the following 3 steps to attain the Holy Grail called the best fit line
- Chose the error function.
- Use gradient descent to find out the slope and intercept for which the error is minimum.
- Plot the line with those slope and intercept.
You take the following 3 steps to attain the Holy Grail called the best fit line
Scatter Plot would be good for continuous variables. For categorical variables the box plat would be appropriate. One should plot a scatter matrix to get the feel of the data and then move forward with box plots of categorical variables.
Linear regression is used when the dependent variable shares more or less linear relationship with independent variables. In other words the change in dependent variable is directly proportional to each independent variable, provided other independent variables are kept constant. The equation of straight line, i.e. y = mx + c, represents the relationship between 2 variables, where change in y is explained by change in x. If x changes by a certain amount, say a, then change in y would be equal to m * a. The intercept c represents the value of y when x is equal to zero.
Linear Regression involves finding out the line which gives the best estimate of dependent variable, when dependent variable is plotted against independent variable. Here we don’t find the locus but we decide beforehand that the straight line describes the relationship between the dependent and each independent variable. We instead focus on finding the best fit line i.e. the optimum slope and intercept.