Bagging (Bootstrap Aggregating)
Computer-Nerd
2023. 2. 23. 19:58
Information
- Bagging (Bootstrap Aggregating) is a machine learning technique that combines multiple models trained on different subsets of the training data.
- Bagging is often used to reduce variance and improve the stability of predictions.
- Bagging samples the training data with replacement to create multiple bootstrap samples, each of which has the same size as the original dataset.
- Bagging trains a separate model on each bootstrap sample, using the same model type and hyperparameters.
- The final prediction combines the predictions of all the models, typically by averaging for regression problems or by majority voting for classification problems.
- Bagging can be used for both classification and regression problems.
- Random Forest is a popular bagging algorithm for decision trees, which randomly selects a subset of features at each split to decorrelate the trees and improve the generalization.
- Random Forest can handle high-dimensional data and non-linear relationships between the features and the target variable.
- Random Forest can also estimate the importance of each feature based on the reduction in impurity, and thus provide insights into the underlying data.
- Bagging algorithms have achieved strong performance on many machine learning benchmarks and are widely used in applications such as bioinformatics, finance, and e-commerce.
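The bootstrap-then-vote procedure described above can be sketched from scratch. This is a minimal illustration, assuming scikit-learn and NumPy are available; the dataset, the number of models, and the choice of decision trees as the base learner are arbitrary for demonstration:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Toy binary classification dataset standing in for real training data.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

rng = np.random.default_rng(0)
n_models = 25
models = []

# Train one tree per bootstrap sample: sampled with replacement,
# same size as the original dataset, same model type and hyperparameters.
for _ in range(n_models):
    idx = rng.integers(0, len(X), size=len(X))  # bootstrap indices
    tree = DecisionTreeClassifier(random_state=0)
    tree.fit(X[idx], y[idx])
    models.append(tree)

# Aggregate by majority vote across the ensemble (labels are 0/1,
# so a mean >= 0.5 means more than half the trees voted 1).
votes = np.stack([m.predict(X) for m in models])
ensemble_pred = (votes.mean(axis=0) >= 0.5).astype(int)
accuracy = (ensemble_pred == y).mean()
```

In practice, scikit-learn's `BaggingClassifier` wraps the same loop (bootstrap sampling, parallel fitting, vote aggregation) behind a single estimator interface.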
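The impurity-based feature importances mentioned for Random Forest can be read directly off a fitted model. A short sketch, again with an arbitrary synthetic dataset; `max_features="sqrt"` enables the per-split feature subsampling that decorrelates the trees:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data where only 3 of 8 features carry signal.
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=3, random_state=0
)

forest = RandomForestClassifier(
    n_estimators=100, max_features="sqrt", random_state=0
)
forest.fit(X, y)

# Impurity-based importances are normalized to sum to 1;
# larger values mark features that contributed more to splits.
importances = forest.feature_importances_
```

The informative features should receive noticeably higher scores than the noise features, which is the "insight into the underlying data" referred to above.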