January 8, 2021
Cross validation is of three types:
- Hold Out: Here you split your original data into training and test (hold-out) sets. The model is trained on the training set and then checked for overfitting on the test set. The advantages of this method are that it is inexpensive in terms of computation time and that it can be applied to any model. The disadvantage is that, on smaller datasets, a randomly selected test set may be biased, and that bias carries over into the evaluation of the model. Thus the overfitting estimates provided by this method have high variance, depending on how the data happened to be split. (A sketch of this follows the list.)
- K Fold: This is performed in addition to the 'Hold Out' method. Here the training set is divided into k subsets; the model is trained on (k-1) of them and tested on the remaining subset. This process is repeated k times, once for each subset, and the final value is the average over the k iterations. This is essentially repeating the 'Hold Out' method k times on the training set. The disadvantage of this method is that the algorithm has to be run k times, which makes it computationally expensive. (See the second sketch below.)
- Leave One Out: This takes the K Fold method to its extreme, where k equals the number of observations. The overfitting estimate provided by this method is good, but it requires enormous computation, since the model is fit once per observation. (See the third sketch below.)
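To make the hold-out idea concrete, here is a minimal sketch using scikit-learn; the post names no library or dataset, so the choice of scikit-learn, the iris data, and logistic regression are all illustrative assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Illustrative dataset; the post does not specify one.
X, y = load_iris(return_X_y=True)

# Hold out 20% of the data as a test set. The split is random,
# so the estimate varies from run to run unless random_state is fixed,
# which is exactly the variance problem noted above.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# A large gap between the two scores suggests overfitting.
print("train accuracy:", model.score(X_train, y_train))
print("test accuracy:", model.score(X_test, y_test))
```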
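The K Fold procedure, under the same assumptions (scikit-learn, iris data, logistic regression), might look like the following sketch: train on k-1 subsets, test on the remaining one, repeat k times, and average.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# 5 folds: the data is split into 5 subsets, and the model is
# trained on 4 and tested on the fifth, 5 times over.
kf = KFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(model, X, y, cv=kf)

# The final estimate is the average of the k per-fold scores.
print("per-fold accuracy:", scores)
print("mean accuracy:", scores.mean())
```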
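Leave One Out is the same loop with k equal to the number of observations, which is where the computational cost comes from. A sketch under the same assumptions:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut, cross_val_score

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# k equals the number of observations: the model is fit len(X) times,
# each time leaving out exactly one sample as the test set.
scores = cross_val_score(model, X, y, cv=LeaveOneOut())

print("number of fits:", len(scores))  # one per observation
print("mean accuracy:", scores.mean())
```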
by: Monis Khan