Neural Networks

Neural Networks (NNs) are a class of machine learning models inspired by the structure and functioning of the human brain. They are used for various tasks including classification, regression, and pattern recognition.

Here’s a step-by-step explanation of how Neural Networks work:

Step 1: Understand the Data

  • Features (X): Input variables or predictors.
  • Target Variable (Y): Output variable or label you want to predict.

Step 2: Design the Network Architecture

Decide on the structure of the neural network, including:

  • Input Layer: Contains neurons corresponding to the features of the data.

  • Hidden Layers: Intermediate layers where computations occur. A neural network can have one or more hidden layers.

  • Output Layer: Contains neurons corresponding to the target variable. For multi-class classification, the output layer usually has one neuron per class; for binary classification, a single neuron is common. For regression, it typically has one neuron.

  • Activation Functions: Functions applied to the output of each neuron to introduce non-linearity.

    • ReLU (Rectified Linear Unit): f(x) = \max(0, x)
    • Sigmoid: f(x) = \frac{1}{1 + e^{-x}}
    • Tanh: f(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}}
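These three activation functions can be sketched in a few lines of NumPy (a minimal illustration, not tied to any particular framework's API):

```python
import numpy as np

def relu(x):
    # f(x) = max(0, x), applied element-wise
    return np.maximum(0, x)

def sigmoid(x):
    # f(x) = 1 / (1 + e^{-x}), squashes values into (0, 1)
    return 1 / (1 + np.exp(-x))

def tanh(x):
    # f(x) = (e^x - e^{-x}) / (e^x + e^{-x}), squashes values into (-1, 1)
    return np.tanh(x)
```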

Step 3: Initialize Weights and Biases

Set initial weights and biases for all connections between neurons. These are typically initialized randomly or with specific methods (e.g., Xavier initialization).

  • Weights: Determine the strength of connections between neurons.
  • Biases: Allow the activation function to be shifted.
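As a sketch, Xavier (Glorot) uniform initialization for one fully connected layer could look like this in NumPy; the function name `init_layer` and the fixed seed are illustrative choices, not part of any standard API:

```python
import numpy as np

def init_layer(n_in, n_out, seed=0):
    # Xavier/Glorot uniform: scale the range by fan-in and fan-out
    # so that activation variance stays roughly constant across layers.
    rng = np.random.default_rng(seed)
    limit = np.sqrt(6.0 / (n_in + n_out))
    W = rng.uniform(-limit, limit, size=(n_out, n_in))  # weights
    b = np.zeros(n_out)                                 # biases start at zero
    return W, b
```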

Step 4: Forward Propagation

Compute the output of the network by passing the input data through the layers:

  1. Calculate Weighted Sum: For each neuron in a layer, compute the weighted sum of inputs plus the bias:

     z_j = \sum_{i} w_{ji} \cdot x_i + b_j

     where z_j is the weighted sum for neuron j, w_{ji} are the weights, x_i are the inputs, and b_j is the bias.

  2. Apply Activation Function: Pass the weighted sum through the activation function to get the neuron's output:

     a_j = f(z_j)

     where a_j is the activation output for neuron j, and f is the activation function.

  3. Pass Output to Next Layer: The output of one layer becomes the input to the next layer, continuing until the output layer is reached.
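The steps above can be sketched as a loop over layers; this assumes sigmoid activations throughout and a hypothetical `layers` list of (W, b) pairs:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def forward(x, layers):
    # layers: list of (W, b) pairs, one per layer
    a = x
    for W, b in layers:
        z = W @ a + b   # weighted sum: z_j = sum_i w_ji * x_i + b_j
        a = sigmoid(z)  # activation: a_j = f(z_j)
    return a            # output of the final layer
```

Each iteration implements one layer: the matrix product computes every neuron's weighted sum at once, and the activation output becomes the next layer's input.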

Step 5: Compute Loss

Calculate the loss (or error) by comparing the network's output to the actual target values. Common loss functions include:

  • Mean Squared Error (MSE) for regression:

    \text{MSE} = \frac{1}{N} \sum_{i=1}^N (y_i - \hat{y}_i)^2

    where \hat{y}_i is the predicted value, y_i is the actual value, and N is the number of samples.

  • Cross-Entropy Loss for classification:

    \text{Cross-Entropy} = -\sum_{i} y_i \cdot \log(\hat{y}_i)

    where \hat{y}_i is the predicted probability of class i and y_i is the actual class label.
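Both loss functions are short to write out; a minimal NumPy sketch (the small `eps` term is an assumption added here to avoid log(0), not part of the formula itself):

```python
import numpy as np

def mse(y, y_hat):
    # mean squared error over N samples
    return np.mean((y - y_hat) ** 2)

def cross_entropy(y, y_hat, eps=1e-12):
    # y: one-hot actual labels, y_hat: predicted class probabilities
    return -np.sum(y * np.log(y_hat + eps))
```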

Step 6: Backward Propagation

Adjust weights and biases based on the loss using gradient descent:

  1. Compute Gradients: Calculate the gradients of the loss function with respect to weights and biases. This involves:

    • Gradient of Loss with Respect to Output: Compute how the loss changes with respect to changes in the output.
    • Gradient of Output with Respect to Weights and Biases: Use the chain rule to find gradients for each layer.
  2. Update Weights and Biases: Adjust the weights and biases to minimize the loss using an optimization algorithm. Common algorithms include:

    • Gradient Descent:

      w = w - \eta \cdot \frac{\partial \text{Loss}}{\partial w}

      where \eta is the learning rate.

    • Adam Optimizer: An adaptive learning rate method that combines the advantages of two other extensions of stochastic gradient descent (momentum and RMSProp).
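A worked single-neuron example ties the pieces together: a forward pass, chain-rule gradients, and the update rule w = w - η · ∂Loss/∂w. The function name `train_step`, the sigmoid activation, and the squared-error loss are illustrative choices for this sketch:

```python
import numpy as np

def train_step(x, y, w, b, lr=0.1):
    # forward pass
    z = w @ x + b
    a = 1 / (1 + np.exp(-z))       # sigmoid output
    loss = (a - y) ** 2            # squared-error loss
    # backward pass: chain rule from loss back to parameters
    dloss_da = 2 * (a - y)         # gradient of loss w.r.t. output
    da_dz = a * (1 - a)            # derivative of sigmoid
    grad_w = dloss_da * da_dz * x  # dLoss/dw
    grad_b = dloss_da * da_dz      # dLoss/db
    # gradient descent update: w = w - eta * dLoss/dw
    return w - lr * grad_w, b - lr * grad_b, loss
```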

Step 7: Iterate

Repeat Steps 4 to 6 for multiple epochs (full passes over the training data) until the loss converges or reaches an acceptable level.
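The full loop of Steps 4 to 6 can be sketched on a toy problem; the data, learning rate, and epoch count below are arbitrary illustrations, and the model is the simplest possible network (a single sigmoid neuron trained with cross-entropy):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))                  # toy features (50 samples)
y = (X.sum(axis=1) > 0).astype(float)         # toy binary labels

w, b, lr = np.zeros(3), 0.0, 0.5
losses = []
for epoch in range(200):                      # repeat Steps 4-6
    z = X @ w + b                             # Step 4: forward propagation
    a = 1 / (1 + np.exp(-z))
    eps = 1e-12                               # Step 5: compute loss
    loss = -np.mean(y * np.log(a + eps) + (1 - y) * np.log(1 - a + eps))
    losses.append(loss)
    grad_z = a - y                            # Step 6: backward propagation
    w -= lr * (X.T @ grad_z) / len(y)         # gradient descent update
    b -= lr * grad_z.mean()
```

If training works, the recorded loss should shrink from its initial value as the epochs pass.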

Step 8: Evaluate the Model

Assess the performance of the trained neural network using evaluation metrics appropriate to the task:

  • Regression: Metrics like MSE, MAE, and R-squared.
  • Classification: Metrics like accuracy, precision, recall, F1-score, and ROC-AUC.
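Several of the classification metrics are straightforward to compute by hand; a minimal NumPy sketch for binary labels (the function names here are illustrative, not a library API):

```python
import numpy as np

def accuracy(y_true, y_pred):
    # fraction of predictions that match the true labels
    return np.mean(y_true == y_pred)

def precision_recall_f1(y_true, y_pred):
    # counts over the binary confusion matrix
    tp = np.sum((y_pred == 1) & (y_true == 1))  # true positives
    fp = np.sum((y_pred == 1) & (y_true == 0))  # false positives
    fn = np.sum((y_pred == 0) & (y_true == 1))  # false negatives
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```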