12-23-2022, 07:10 AM
When a model performs well on training data but poorly on unseen data, it is over-fitting.
Over-fitting is a very common problem in Machine Learning, and an extensive body of literature is dedicated to methods for preventing it.
Five simple ways to avoid over-fitting:
- Train with more data - more data can help the algorithm detect the signal better, though it won't work every time.
- Keep the model simple - take fewer variables into account, thereby removing some of the noise in the training data.
- Use cross-validation methods, such as k-fold cross-validation.
- Use regularization techniques - regularization describes a wide range of methods, such as LASSO, that forcibly simplify your model by penalizing parameters likely to cause over-fitting.
- Use ensembling - ensembles combine the predictions of several independently trained models.
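The cross-validation point above can be sketched in plain Python. This is a minimal illustration, not a production implementation: the "model" here is a hypothetical mean-predictor used only to show how each fold is held out for testing while the rest is used for training.

```python
from statistics import mean

def k_fold_splits(n, k):
    """Yield (train_idx, test_idx) index pairs for k roughly equal folds."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size

def cross_val_mse(y, k=5):
    """Average held-out MSE of a toy model that predicts the training mean."""
    scores = []
    for train, test in k_fold_splits(len(y), k):
        prediction = mean(y[i] for i in train)               # "fit" on the training fold
        mse = mean((y[i] - prediction) ** 2 for i in test)   # score on the held-out fold
        scores.append(mse)
    return mean(scores)
```

Because every observation appears in exactly one test fold, the averaged score estimates how the model will do on unseen data, which is exactly the failure mode over-fitting hides.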
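Finally, the ensembling point can be shown with the simplest ensemble there is: averaging. The base "models" below are stand-in functions chosen for illustration; in practice they would be independently trained learners whose individual errors partly cancel out when averaged.

```python
from statistics import mean

def ensemble_predict(models, x):
    """Average the predictions of several independent models for input x."""
    return mean(m(x) for m in models)

# Hypothetical base models with opposite biases around the true value:
base_models = [lambda x: x + 1.0, lambda x: x - 1.0, lambda x: x]
```

More elaborate schemes (bagging, boosting, stacking) differ in how the base models are trained and weighted, but they all rest on this same combine-several-models idea.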