
K-Fold Cross-Validation for Reducing Over-fitting in Classifiers

In k-fold cross-validation, the original sample is randomly partitioned into k equal-sized sub-samples. Of the k sub-samples, a single sub-sample is retained as the validation data for testing the model, and the remaining k − 1 sub-samples are used as training data. The cross-validation process is then repeated k times (the folds), with each of the k sub-samples used exactly once as the validation data. The k results from the folds can then be averaged (or otherwise combined) to produce a single estimate. The advantage of this method over repeated random sub-sampling is that all observations are used for both training and validation, and each observation is used for validation exactly once. The disadvantage is that the training algorithm has to be re-run from scratch k times, so the evaluation costs roughly k times as much computation. The error of the classifier is the average testing error across the k test partitions.
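
As a concrete illustration, here is a minimal sketch of 5-fold cross-validation in Python with scikit-learn. The toy dataset and the choice of a decision tree classifier are assumptions made for the example, not part of the post.

```python
# Minimal k-fold cross-validation sketch (toy data, illustrative classifier).
import numpy as np
from sklearn.model_selection import KFold
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

rng = np.random.RandomState(0)
X = rng.rand(100, 5)            # 100 samples, 5 features (synthetic)
y = rng.randint(0, 2, 100)      # binary labels (synthetic)

k = 5
kf = KFold(n_splits=k, shuffle=True, random_state=0)
errors = []
for train_idx, test_idx in kf.split(X):
    # The classifier is re-trained from scratch on each fold,
    # which is the source of the k-times computational cost.
    clf = DecisionTreeClassifier(random_state=0)
    clf.fit(X[train_idx], y[train_idx])
    preds = clf.predict(X[test_idx])
    errors.append(1.0 - accuracy_score(y[test_idx], preds))

# The reported error is the average testing error across the k folds.
print("mean test error over %d folds: %.3f" % (k, np.mean(errors)))
```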

Regression and Classification

[Image: comparison of regression and classification]
Regression and classification are two major tasks in predictive modelling in data mining. Today I heard the question: what is regression, what is classification, and when should each be used? In short, regression predicts a continuous value, while classification predicts a discrete label. The image above says more than any words I could write.
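
To make the distinction concrete, here is a small hedged sketch in Python contrasting the two tasks with scikit-learn; the synthetic data and the particular models (linear and logistic regression) are assumptions for illustration only.

```python
# Contrasting regression (continuous target) with classification (discrete target).
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.RandomState(0)
X = rng.rand(50, 1)

# Regression: the target is a continuous value (e.g. a price).
y_cont = 3.0 * X.ravel() + rng.normal(scale=0.1, size=50)
reg = LinearRegression().fit(X, y_cont)
print("regression prediction:", reg.predict([[0.5]]))      # a real number

# Classification: the target is a discrete label (e.g. spam / not spam).
y_label = (X.ravel() > 0.5).astype(int)
clf = LogisticRegression().fit(X, y_label)
print("classification prediction:", clf.predict([[0.5]]))  # a class label
```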