Gradient Boosting Machines

A Gradient Boosting Machine (GBM) is a powerful ensemble learning technique used for both regression and classification problems. It builds models sequentially, with each new model correcting the errors of the previous ones.

Here’s a step-by-step explanation of how Gradient Boosting Machines work:

Step 1: Understand the Data

  • Features (X): The input variables or predictors.
  • Target Variable (Y): The output variable or label you want to predict.

Step 2: Initialize the Model

Start with a base model that makes an initial prediction. For regression, this is often the mean of the target values; for binary classification, it’s the log odds of the positive class.

  • Initial Prediction (for regression): $F_0(x) = \frac{1}{N} \sum_{i=1}^{N} y_i$

  • where $N$ is the number of data points, and $y_i$ is the target value for the $i$-th data point.

  • Initial Prediction (for classification): $F_0(x) = \log \frac{p}{1 - p}$

  • where $p$ is the proportion of the positive class in the training data.
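The two initializations above can be computed directly. This is a minimal sketch using NumPy with small hypothetical target arrays (`y_reg` and `y_clf` are made-up illustrative data, not from the original text):

```python
import numpy as np

# Hypothetical regression targets
y_reg = np.array([3.0, 5.0, 7.0, 9.0])
F0_reg = y_reg.mean()  # initial prediction: mean of the targets

# Hypothetical binary classification labels (1 = positive class)
y_clf = np.array([1, 0, 1, 1])
p = y_clf.mean()              # proportion of the positive class
F0_clf = np.log(p / (1 - p))  # initial prediction: log odds
```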

Step 3: Compute Residuals

Calculate the residuals, which are the differences between the actual target values and the predictions made by the current model. (For squared-error loss, these residuals coincide with the negative gradient of the loss with respect to the predictions, which is where gradient boosting gets its name.)

  • Residual Calculation: $r_i = y_i - F_{m-1}(x_i)$

  • where $r_i$ is the residual for the $i$-th data point, $y_i$ is the actual target value, and $F_{m-1}(x_i)$ is the prediction from the previous model.
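Continuing the regression example, the residuals against the initial mean prediction look like this (the target values are hypothetical illustrative data):

```python
import numpy as np

y = np.array([3.0, 5.0, 7.0, 9.0])
F_prev = np.full_like(y, y.mean())  # previous model's predictions (here: the initial mean, 6.0)
residuals = y - F_prev              # r_i = y_i - F_{m-1}(x_i)
```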

Step 4: Fit a New Model

Train a new model (often a decision tree) to predict these residuals. This model learns to correct the errors made by the previous model.

  • Model Fitting: Fit a model $h_m(x)$ to the residuals.
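A sketch of this fitting step, assuming scikit-learn is available and using synthetic data for illustration — the tree is trained on the residuals, not on the raw targets:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))
y = np.sin(X).ravel()

residuals = y - y.mean()  # residuals from the initial mean prediction
# h_m(x): a shallow tree fit to the residuals, not to y itself
h = DecisionTreeRegressor(max_depth=2).fit(X, residuals)
```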

Step 5: Update the Model

Update the current model by adding the predictions from the newly trained model, scaled by a learning rate (also called the shrinkage parameter). This controls how much each new model contributes to the overall prediction.

  • Model Update Formula: $F_m(x) = F_{m-1}(x) + \alpha \cdot h_m(x)$

  • where $\alpha$ is the learning rate, and $h_m(x)$ is the prediction from the new model.

Step 6: Repeat

Repeat Steps 3 to 5 for a specified number of iterations or until a stopping criterion is met (e.g., the improvement in residuals becomes minimal).
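Steps 3 through 5 can be combined into a training loop. This is a minimal from-scratch sketch on synthetic data, assuming scikit-learn's `DecisionTreeRegressor` as the base learner; the hyperparameter values (`alpha`, `n_rounds`, `max_depth`) are illustrative choices, not prescriptions:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(0, 0.1, 200)

alpha = 0.1                    # learning rate (shrinkage)
n_rounds = 50
F = np.full(len(y), y.mean())  # Step 2: initialize with the mean
trees = []

for m in range(n_rounds):
    residuals = y - F                                            # Step 3: compute residuals
    tree = DecisionTreeRegressor(max_depth=2).fit(X, residuals)  # Step 4: fit a tree to them
    F = F + alpha * tree.predict(X)                              # Step 5: shrunken update
    trees.append(tree)
```

Each pass shrinks the residuals a little further; the learning rate keeps any single tree from dominating the ensemble.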

Step 7: Make Predictions

Once the ensemble of models is trained, use the final model $F_M(x)$ to make predictions on new data.

  • Prediction Formula: $\hat{y}_i = F_M(x_i)$

  • where $\hat{y}_i$ is the predicted value for the $i$-th data point.
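In practice you would usually rely on a library implementation rather than a hand-rolled loop. A sketch with scikit-learn's `GradientBoostingRegressor` on synthetic data (the hyperparameters are illustrative):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = np.sin(X).ravel()

model = GradientBoostingRegressor(n_estimators=100, learning_rate=0.1, max_depth=2)
model.fit(X, y)

# F_M applied to new data points
y_hat = model.predict(np.array([[1.0], [2.0]]))
```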

Step 8: Evaluate the Model

Assess the performance of the GBM model using appropriate metrics:

  • For Regression: Metrics like Mean Squared Error (MSE), Mean Absolute Error (MAE), and R-squared.
  • For Classification: Metrics like Accuracy, Precision, Recall, F1-Score, and AUC-ROC.
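For the regression metrics, a small sketch using `sklearn.metrics` with hypothetical true and predicted values:

```python
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

# Hypothetical targets and model predictions
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.8, 5.1, 6.9, 9.3])

mse = mean_squared_error(y_true, y_pred)   # average squared error
mae = mean_absolute_error(y_true, y_pred)  # average absolute error
r2 = r2_score(y_true, y_pred)              # fraction of variance explained
```

The classification metrics (accuracy, precision, recall, F1, AUC-ROC) are available in the same `sklearn.metrics` module.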