Naive Bayes

Naive Bayes is a probabilistic classifier based on Bayes' Theorem, combined with the "naive" assumption that features are conditionally independent given the class. It’s often used for text classification, spam detection, and other classification problems.

Here’s a step-by-step explanation of how Naive Bayes works:

Step 1: Understand the Data

  • Features (X): The input variables or predictors.
  • Target Variable (Y): The output variable or class label you want to predict.
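For concreteness, here is a minimal Python sketch of what X and y might look like; the toy weather/tennis data is invented purely for illustration:

```python
# Hypothetical toy dataset: predict whether to play tennis (y)
# from two categorical features (X): outlook and temperature.
X = [
    ("sunny", "hot"),
    ("sunny", "mild"),
    ("rainy", "mild"),
    ("rainy", "cool"),
    ("overcast", "hot"),
]
y = ["no", "no", "yes", "yes", "yes"]  # target class labels
```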

Step 2: Define Bayes’ Theorem

Naive Bayes is based on Bayes’ Theorem, which describes the probability of a class given the features. The formula is:

P(Y \mid X) = \frac{P(X \mid Y) \cdot P(Y)}{P(X)}

Where:

  • P(Y \mid X) is the posterior probability of class Y given features X.
  • P(X \mid Y) is the likelihood of features X given class Y.
  • P(Y) is the prior probability of class Y.
  • P(X) is the marginal likelihood (evidence) of features X.
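To make the formula concrete, here is a small worked example with made-up numbers: suppose 20% of emails are spam, the word "free" appears in 60% of spam and 5% of non-spam, and we want the probability that an email containing "free" is spam:

```python
# Hypothetical numbers, for illustration only.
p_spam = 0.20             # P(Y = spam), the prior
p_free_given_spam = 0.60  # P(X = "free" | Y = spam), the likelihood
p_free_given_ham = 0.05   # P(X = "free" | Y = ham)

# Marginal likelihood P(X) via the law of total probability.
p_free = p_free_given_spam * p_spam + p_free_given_ham * (1 - p_spam)

# Posterior P(Y = spam | X = "free") = P(X | Y) * P(Y) / P(X).
p_spam_given_free = p_free_given_spam * p_spam / p_free
print(p_spam_given_free)  # 0.12 / 0.16 = 0.75
```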

Step 3: Apply the Naive Assumption

The "naive" assumption is that all features are independent given the class. This simplifies the likelihood calculation:

P(X \mid Y) = P(X_1, X_2, \ldots, X_n \mid Y) = \prod_{i=1}^{n} P(X_i \mid Y)

Here, X_i are the individual features. This assumption makes computations manageable by reducing the complexity of calculating joint probabilities.
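A minimal sketch of how the naive assumption reduces the joint likelihood to a product of per-feature terms; the probability values are placeholders:

```python
import math

# Hypothetical per-feature likelihoods P(X_i | Y = y) for one class.
feature_likelihoods = [0.6, 0.3, 0.8]

# Under the naive assumption, P(X | Y) is simply their product.
joint_likelihood = math.prod(feature_likelihoods)
print(joint_likelihood)  # 0.6 * 0.3 * 0.8 = 0.144
```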

Step 4: Calculate Prior Probabilities

Estimate the prior probability of each class, P(Y). This is the probability of each class occurring in the dataset:

P(Y = y) = \frac{\text{Number of instances in class } y}{\text{Total number of instances}}
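Estimating the priors is just counting; a sketch using the toy labels from the Step 1 example:

```python
from collections import Counter

y = ["no", "no", "yes", "yes", "yes"]  # labels from the Step 1 sketch

counts = Counter(y)
n = len(y)
priors = {cls: cnt / n for cls, cnt in counts.items()}
print(priors)  # {'no': 0.4, 'yes': 0.6}
```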

Step 5: Calculate Likelihoods

Estimate the likelihood of each feature given each class, P(X_i \mid Y). How this is done depends on the type of feature:

  • For categorical features: Use frequency counts. For feature X_i in class Y:

P(X_i = x_i \mid Y = y) = \frac{\text{Number of instances where } X_i = x_i \text{ and } Y = y}{\text{Number of instances in class } y}

  • For continuous features: Assume a distribution (often Gaussian). For feature X_i in class Y, use:

P(X_i = x_i \mid Y = y) = \frac{1}{\sqrt{2 \pi \sigma^2}} \exp\left( -\frac{(x_i - \mu)^2}{2 \sigma^2} \right)

    where \mu and \sigma^2 are the mean and variance of X_i for class Y. Both cases are sketched in code after this list.
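A sketch covering both cases: frequency counts for a categorical feature, and the Gaussian density for a continuous one. The helper names are my own, not from any library:

```python
import math

def categorical_likelihood(X, y, feature_idx, value, cls):
    """Estimate P(X_i = value | Y = cls) by frequency counts."""
    in_class = [row for row, label in zip(X, y) if label == cls]
    matches = sum(1 for row in in_class if row[feature_idx] == value)
    return matches / len(in_class)

def gaussian_likelihood(x, mu, sigma2):
    """Gaussian density for P(X_i = x | Y), given the class mean and variance."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

# Usage with the Step 1 toy data:
X = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"),
     ("rainy", "cool"), ("overcast", "hot")]
y = ["no", "no", "yes", "yes", "yes"]
print(categorical_likelihood(X, y, 0, "rainy", "yes"))  # 2/3
print(gaussian_likelihood(1.5, mu=1.0, sigma2=0.25))    # ~0.484
```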

Step 6: Make Predictions

To classify a new instance, calculate the posterior probability for each class using Bayes’ Theorem and the naive assumption:

P(Y = y \mid X) \propto P(Y = y) \cdot \prod_{i=1}^{n} P(X_i \mid Y = y)

Choose the class with the highest posterior probability:

\hat{Y} = \arg\max_{y} \left( P(Y = y) \cdot \prod_{i=1}^{n} P(X_i \mid Y = y) \right)
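A minimal prediction sketch with hypothetical learned probabilities. In practice, implementations sum log-probabilities rather than multiplying raw probabilities, to avoid numerical underflow when there are many features:

```python
import math

# Hypothetical learned parameters for a two-class problem.
priors = {"yes": 0.6, "no": 0.4}
# likelihoods[cls][i] = P(X_i = x_i | Y = cls) for the new instance's features.
likelihoods = {"yes": [0.5, 0.7], "no": [0.3, 0.2]}

def predict(priors, likelihoods):
    """Return the class maximizing log P(Y=y) + sum_i log P(X_i | Y=y)."""
    scores = {
        cls: math.log(priors[cls]) + sum(math.log(p) for p in likelihoods[cls])
        for cls in priors
    }
    return max(scores, key=scores.get)

print(predict(priors, likelihoods))  # 'yes' (log-score ~ -1.56 vs ~ -3.73 for 'no')
```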

Step 7: Evaluate the Model

Assess the performance of the Naive Bayes classifier using metrics such as:

  • Accuracy: The proportion of correctly classified instances.
  • Confusion Matrix: Provides a detailed breakdown of classification results.
  • Precision, Recall, F1-Score: Evaluate the classifier’s performance on each class individually, which is especially informative for imbalanced datasets.
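A sketch of computing these metrics with scikit-learn, using GaussianNB on the Iris dataset as a stand-in for your own data:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

# Stand-in dataset; substitute your own features and labels.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

model = GaussianNB().fit(X_train, y_train)
y_pred = model.predict(X_test)

print("Accuracy:", accuracy_score(y_test, y_pred))
print("Confusion matrix:\n", confusion_matrix(y_test, y_pred))
# Per-class precision, recall, and F1:
print(classification_report(y_test, y_pred))
```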