Here’s a step-by-step explanation of how Gradient Boosting Machines work:
Step 1: Understand the Data
- Features (X): The input variables or predictors.
- Target Variable (Y): The output variable or label you want to predict.
Step 2: Initialize the Model
Start with a constant base model $F_0(x)$ that makes the same initial prediction for every data point. For regression, this is often the mean of the target values; for binary classification, it is the log-odds of the positive class.
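- Initial Prediction (for regression): $F_0(x) = \frac{1}{N} \sum_{i=1}^{N} y_i$, where $N$ is the number of data points, and $y_i$ is the target value for the $i$-th data point.
- Initial Prediction (for classification): $F_0(x) = \log\frac{p}{1-p}$, where $p$ is the proportion of the positive class in the training data.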
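As a minimal sketch of the two initial predictions, the snippet below computes them for small made-up datasets. The function names and the sample values are illustrative assumptions, not part of any library.

```python
import math

def initial_prediction_regression(y):
    """Constant F_0 for regression: the mean of the target values."""
    return sum(y) / len(y)

def initial_prediction_classification(y):
    """Constant F_0 for binary classification: log-odds of the positive class.

    Assumes labels are 0/1 and that both classes occur, so 0 < p < 1.
    """
    p = sum(y) / len(y)
    return math.log(p / (1 - p))

# Made-up example data (assumption for illustration only).
y_reg = [3.0, 5.0, 7.0, 9.0]
y_clf = [1, 0, 1, 1]

f0_reg = initial_prediction_regression(y_reg)      # mean of the targets
f0_clf = initial_prediction_classification(y_clf)  # log(0.75 / 0.25)
```

Note that the classification version is undefined when the training labels are all one class, since $p$ would be 0 or 1.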
Step 3: Compute Residuals
Calculate the residuals, which are the differences between the actual target values and the predictions made by the current model.
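- Residual Calculation: $r_i = y_i - F_{m-1}(x_i)$, where $m$ is the current iteration, $r_i$ is the residual for the $i$-th data point, $y_i$ is the actual target value, and $F_{m-1}(x_i)$ is the prediction from the previous model.
- For classification, where $F_{m-1}(x_i)$ is a log-odds score, the residual is taken against the predicted probability instead: $r_i = y_i - \sigma(F_{m-1}(x_i))$, with $\sigma(z) = \frac{1}{1 + e^{-z}}$.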
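The regression residual calculation can be sketched in a few lines. The data here is made up, and the current model is assumed to be the constant initial prediction (the mean of the targets).

```python
def residuals(y, predictions):
    """Residuals r_i = y_i - F_{m-1}(x_i) for each data point."""
    return [yi - fi for yi, fi in zip(y, predictions)]

# Made-up targets; the current model predicts their mean (6.0) everywhere.
y = [3.0, 5.0, 7.0, 9.0]
f_prev = [6.0] * len(y)

r = residuals(y, f_prev)
```

Points the model over-predicts get negative residuals, and points it under-predicts get positive ones.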
Step 4: Fit a New Model
Train a new model (often a decision tree) to predict these residuals. This model learns to correct the errors made by the previous model.
- Model Fitting: Fit a model $h_m(x)$ to the residuals $r_i$.
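To make the fitting step concrete, here is a rough sketch that fits the simplest possible decision tree, a depth-1 "stump" on a single feature, to a set of residuals. Real implementations use deeper trees over many features; `fit_stump` and `predict_stump` are hypothetical helper names, and the data is made up.

```python
def fit_stump(x, r):
    """Fit a depth-1 regression tree on one feature x to targets r.

    Returns (threshold, left_value, right_value) minimising squared error.
    """
    mean = sum(r) / len(r)
    # Fallback when x has a single distinct value: predict the mean everywhere.
    best = (float("inf"), min(x), mean, mean)
    xs = sorted(set(x))
    # Candidate thresholds: midpoints between consecutive distinct values.
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2
        left = [ri for xi, ri in zip(x, r) if xi <= t]
        right = [ri for xi, ri in zip(x, r) if xi > t]
        lv = sum(left) / len(left)
        rv = sum(right) / len(right)
        sse = (sum((ri - lv) ** 2 for ri in left)
               + sum((ri - rv) ** 2 for ri in right))
        if sse < best[0]:
            best = (sse, t, lv, rv)
    return best[1:]

def predict_stump(stump, x):
    """Predict with a fitted stump: left value if x_i <= threshold, else right."""
    t, lv, rv = stump
    return [lv if xi <= t else rv for xi in x]

# Made-up feature values and residuals from a previous model.
x = [1.0, 2.0, 3.0, 4.0]
r = [-3.0, -1.0, 1.0, 3.0]

stump = fit_stump(x, r)
h = predict_stump(stump, x)
```

On this data the best split falls between 2.0 and 3.0, and each side predicts the mean residual of its points.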
Step 5: Update the Model
Update the current model by adding the predictions from the newly trained model, scaled by a learning rate (also called the shrinkage parameter). This controls how much each new model contributes to the overall prediction.
- Model Update Formula: $F_m(x) = F_{m-1}(x) + \alpha \cdot h_m(x)$, where $\alpha$ is the learning rate, and $h_m(x)$ is the prediction from the new model.
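A small worked example of the update, assuming made-up values for the previous predictions and for the new model's output, and a learning rate of 0.5 chosen only to keep the arithmetic easy to follow:

```python
def update(f_prev, h, learning_rate):
    """Apply F_m(x) = F_{m-1}(x) + learning_rate * h_m(x) to each data point."""
    return [f + learning_rate * hi for f, hi in zip(f_prev, h)]

# Made-up values: previous predictions and the new model's residual predictions.
f_prev = [6.0, 6.0, 6.0, 6.0]
h = [-2.0, -2.0, 2.0, 2.0]

f_new = update(f_prev, h, learning_rate=0.5)
```

With a learning rate of 0.5, each prediction moves only halfway toward the correction suggested by the new model. Smaller rates take smaller steps and typically need more iterations.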
Step 6: Repeat
Repeat Steps 3 to 5 for a specified number of iterations $M$ or until a stopping criterion is met (e.g., the reduction in error from one iteration to the next becomes negligible).
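Putting Steps 2 to 6 together, the loop might look like the sketch below for regression. It is a toy version under several assumptions: a single feature, depth-1 stumps as the weak learners, and a stopping rule based on the drop in training MSE. The names `fit_stump`, `train_gbm`, and `tol` are hypothetical.

```python
def fit_stump(x, r):
    """Least-squares depth-1 tree on one feature: (threshold, left, right)."""
    mean = sum(r) / len(r)
    best = (float("inf"), min(x), mean, mean)  # fallback: constant prediction
    xs = sorted(set(x))
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2
        left = [ri for xi, ri in zip(x, r) if xi <= t]
        right = [ri for xi, ri in zip(x, r) if xi > t]
        lv, rv = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((ri - lv) ** 2 for ri in left)
               + sum((ri - rv) ** 2 for ri in right))
        if sse < best[0]:
            best = (sse, t, lv, rv)
    return best[1:]

def mean_squared_error(y, preds):
    return sum((yi - fi) ** 2 for yi, fi in zip(y, preds)) / len(y)

def train_gbm(x, y, n_rounds=50, learning_rate=0.5, tol=1e-6):
    """Return (f0, stumps, final training MSE)."""
    f0 = sum(y) / len(y)                      # Step 2: initial prediction
    preds = [f0] * len(y)
    stumps = []
    mse = mean_squared_error(y, preds)
    for _ in range(n_rounds):                 # Step 6: repeat
        r = [yi - fi for yi, fi in zip(y, preds)]         # Step 3: residuals
        t, lv, rv = fit_stump(x, r)                       # Step 4: fit h_m
        preds = [fi + learning_rate * (lv if xi <= t else rv)
                 for xi, fi in zip(x, preds)]             # Step 5: update
        stumps.append((t, lv, rv))
        new_mse = mean_squared_error(y, preds)
        improvement = mse - new_mse
        mse = new_mse
        if improvement < tol:                 # stop when progress is negligible
            break
    return f0, stumps, mse

# Made-up training data.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]
f0, stumps, final_mse = train_gbm(x, y)
```

Stopping on training error alone tends to overfit. In practice the same check is usually made against a held-out validation set.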
Step 7: Make Predictions
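Once the ensemble of models is trained, use the final model $F_M(x)$ to make predictions on new data. For classification, $F_M(x)$ is a log-odds score, so pass it through the sigmoid function to obtain a probability.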
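Prediction amounts to adding the initial prediction and every model's scaled contribution. The sketch below assumes the ensemble is stored as a list of single-feature stumps in the form `(threshold, left_value, right_value)`; that representation and the stump values are hypothetical.

```python
import math

def predict(f0, stumps, learning_rate, x_new):
    """Evaluate F_M(x) = F_0 + learning_rate * sum of h_m(x) for each new point."""
    out = []
    for xi in x_new:
        score = f0
        for t, lv, rv in stumps:
            score += learning_rate * (lv if xi <= t else rv)
        out.append(score)
    return out

def to_probability(score):
    """For classification: map a log-odds score to a probability (sigmoid)."""
    return 1.0 / (1.0 + math.exp(-score))

# Hypothetical trained ensemble: an initial prediction and two stumps.
f0 = 6.0
stumps = [(2.5, -2.0, 2.0), (1.5, -2.0, 1.0)]

preds = predict(f0, stumps, learning_rate=0.5, x_new=[1.0, 3.0])
```

The learning rate used at prediction time must match the one used during training, since it is part of how each model's contribution was scaled.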
Step 8: Evaluate the Model
Assess the performance of the GBM model using appropriate metrics:
- For Regression: Metrics like Mean Squared Error (MSE), Mean Absolute Error (MAE), and R-squared.
- For Classification: Metrics like Accuracy, Precision, Recall, F1-Score, and AUC-ROC.
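Most of the metrics listed above are simple enough to compute by hand, which can help when checking a library's output. The sketch below uses made-up predictions and leaves out AUC-ROC, which needs ranked probability scores instead of hard labels. The function names are illustrative.

```python
def mse(y, p):
    return sum((yi - pi) ** 2 for yi, pi in zip(y, p)) / len(y)

def mae(y, p):
    return sum(abs(yi - pi) for yi, pi in zip(y, p)) / len(y)

def r_squared(y, p):
    mean = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, p))
    ss_tot = sum((yi - mean) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

def classification_metrics(y, p):
    """Accuracy, precision, recall and F1 for 0/1 labels (positive class = 1).

    Assumes at least one predicted positive and one actual positive.
    """
    tp = sum(1 for yi, pi in zip(y, p) if yi == 1 and pi == 1)
    fp = sum(1 for yi, pi in zip(y, p) if yi == 0 and pi == 1)
    fn = sum(1 for yi, pi in zip(y, p) if yi == 1 and pi == 0)
    accuracy = sum(1 for yi, pi in zip(y, p) if yi == pi) / len(y)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# Made-up regression targets and predictions.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [4.0, 5.0, 7.0, 8.0]
reg_scores = (mse(y_true, y_pred), mae(y_true, y_pred), r_squared(y_true, y_pred))

# Made-up class labels and predicted labels.
labels = [1, 0, 1, 1, 0]
predicted = [1, 0, 0, 1, 1]
accuracy, precision, recall, f1 = classification_metrics(labels, predicted)
```

These should be computed on data the model did not see during training, otherwise they overstate performance.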