Why is regularization needed?

When you train a neural network on a given dataset, it learns the underlying patterns that describe the relationships among the data points. Some of these are general patterns, while others are specific to the data points of the training dataset.

General patterns are those that would still hold when new data is fed to the network, while patterns specific to the data points of the training dataset are classified as noise.

Learning this noise leads to model overfitting. Further, since our aim is to learn only the general patterns, fitting the noise also wastes time and computational resources.

This is where regularization comes in. It constrains the learning process so that the model captures the general patterns rather than the noise, i.e. it regulates the learning process and thereby keeps overfitting in check.
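As a minimal sketch of one common form of regularization, the snippet below compares plain linear regression with an L2 (Ridge) penalty using scikit-learn. The synthetic dataset and the alpha value are illustrative assumptions, not the only way to regularize a model.

```python
# Minimal sketch: L2 regularization (Ridge) vs. plain linear regression.
# The synthetic dataset and the alpha value are illustrative choices.
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))              # 100 samples, 20 mostly noisy features
y = X[:, 0] * 3.0 + rng.normal(size=100)    # only the first feature actually matters

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

plain = LinearRegression().fit(X_train, y_train)
ridge = Ridge(alpha=10.0).fit(X_train, y_train)  # alpha controls the penalty strength

print("Plain test R^2:", plain.score(X_test, y_test))
print("Ridge test R^2:", ridge.score(X_test, y_test))
```

The penalty discourages large coefficients on the noisy features, so the regularized model typically generalizes at least as well to the held-out data.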

What is the p-value? How does it help in feature selection?

The p-value measures how likely the observed result would be if there were no real relationship, i.e. if the null hypothesis were true. The lower the p-value, the stronger the evidence against the null hypothesis, and the more confident we are that the observed relationship is real rather than due to chance. If the p-value for a feature is larger than the chosen cut-off (significance level), that feature does not significantly explain the change in the dependent variable and can therefore be dropped.
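As a sketch of how this works in practice, the snippet below fits an ordinary least squares model with statsmodels and inspects the per-feature p-values. The column names, the synthetic data and the 0.05 cut-off are illustrative assumptions.

```python
# Sketch: inspecting per-feature p-values with statsmodels OLS.
# Column names, data and the 0.05 cut-off are illustrative assumptions.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(42)
df = pd.DataFrame({
    "useful": rng.normal(size=200),
    "noise":  rng.normal(size=200),
})
y = 2.0 * df["useful"] + rng.normal(size=200)

X = sm.add_constant(df)            # adds the intercept term
model = sm.OLS(y, X).fit()
print(model.pvalues)               # one p-value per coefficient

# Drop features whose p-value exceeds the chosen significance level.
pvals = model.pvalues.drop("const")
keep = pvals[pvals < 0.05].index
print("Features kept:", list(keep))
```

Here the "useful" column should survive the cut-off while the "noise" column is dropped.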

What is the null and alternate hypothesis?

The null hypothesis states that there is no relationship between the independent variable and the dependent variable; in a regression setting, it says the coefficient of the independent variable is zero. The alternate hypothesis states that there is a relationship, i.e. the coefficient is non-zero, so a change in the independent variable produces a proportional change in the dependent variable, with the coefficient acting as the proportionality constant.

What are Bias and Variance? What is Bias Variance Trade-off?

Bias is the inability of the model to fully capture the general patterns in the training dataset; high bias leads to underfitting. Variance is the model's sensitivity to the particular training data it sees; high variance leads to overfitting, i.e. the model learns not only the general patterns but also the patterns specific to the training data (and absent from the test data), which lowers overall test accuracy.

The bias-variance trade-off arises because, as model complexity grows, bias tends to fall while variance rises (and vice versa). Hence we settle for a compromise: the combination of bias and variance that results in the minimum overall error.
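The sketch below illustrates the trade-off by fitting polynomials of increasing degree to noisy data; the data-generating function and the chosen degrees are illustrative assumptions. A low degree underfits (high bias), while a very high degree overfits (high variance).

```python
# Sketch: bias-variance trade-off via polynomial degree.
# Low degree -> high bias (underfits); high degree -> high variance (overfits).
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(80, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=80)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

for degree in (1, 4, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    train_err = mean_squared_error(y_train, model.predict(X_train))
    test_err = mean_squared_error(y_test, model.predict(X_test))
    print(f"degree={degree:2d}  train MSE={train_err:.3f}  test MSE={test_err:.3f}")
```

Training error keeps falling as the degree grows, but test error eventually rises again; the sweet spot in between is the trade-off being described.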

How to interpret a Linear Regression model?

A linear regression model has two kinds of coefficients: (1) slopes (say m) and (2) an intercept (say c).
For the regression equation y = m1x1 + m2x2 + m3x3 + c, a change in x1 by some amount p, with x2 and x3 held constant, changes the value of y by m1 × p. The intercept c represents the value of y when x1, x2 and x3 are all equal to zero.
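The snippet below is a small sketch of this interpretation using scikit-learn on synthetic data where the true slopes and intercept are known; the generating coefficients (3, -2, 0.5) and intercept (7) are arbitrary illustrative choices.

```python
# Sketch: reading slopes and the intercept from a fitted linear regression.
# The generating coefficients and intercept are illustrative choices.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(7)
X = rng.normal(size=(500, 3))                    # columns play the role of x1, x2, x3
y = 3*X[:, 0] - 2*X[:, 1] + 0.5*X[:, 2] + 7      # y = m1*x1 + m2*x2 + m3*x3 + c

model = LinearRegression().fit(X, y)
print("Slopes m1, m2, m3:", model.coef_)         # ~ [3, -2, 0.5]
print("Intercept c:", model.intercept_)          # ~ 7

# Holding x2 and x3 fixed, increasing x1 by p changes the prediction by m1 * p.
p = 2.0
x = np.array([[1.0, 0.0, 0.0]])
x_shifted = x + np.array([[p, 0.0, 0.0]])
print("Change in y:", model.predict(x_shifted)[0] - model.predict(x)[0])  # ~ m1 * p = 6
```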

A brief overview of C.

C is a general-purpose programming language that is extremely popular, simple and flexible. It is a machine-independent, structured programming language that is used extensively across a wide range of applications.

It was initially developed by the American computer scientist Dennis M. Ritchie at Bell Laboratories in 1972.

The main features of the C language include low-level access to memory, a small set of keywords, and a clean style. These features make C well suited to system programming, such as operating system and compiler development.

C is an excellent language for beginners to learn programming with.