Some of the most common reasons for the loss not decreasing during training include the following:
  • The optimization is stuck at a local minimum
  • The learning rate is set too low
  • The regularization parameter is set too high
Some of the most common reasons for the loss becoming NaN during training include the following:
  • The learning rate is set too high
  • The gradient blows up (exploding gradients)
  • An improper or poorly chosen loss function
Some of the hyperparameters of the network include the following:
  • Number of neurons in the hidden layer
  • Number of hidden layers
  • The activation function in each layer
  • Weight initialization
  • Learning rate
  • Number of epochs
  • Batch size
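As a sketch (the names and values here are our own, not from the text), the hyperparameters listed above can be collected in one place and used to derive the weight shapes of a small fully connected network:

```python
import random

# Hypothetical hyperparameter settings for a small fully connected network.
hparams = {
    "hidden_layers": 2,       # number of hidden layers
    "hidden_neurons": 16,     # neurons in each hidden layer
    "activation": "relu",     # activation function in each layer
    "init_scale": 0.01,       # weight initialization scale
    "learning_rate": 0.01,
    "epochs": 10,
    "batch_size": 32,
}

def init_weights(n_inputs, n_outputs, hp):
    """Random Gaussian weight matrices for input -> hidden ... -> output."""
    sizes = [n_inputs] + [hp["hidden_neurons"]] * hp["hidden_layers"] + [n_outputs]
    return [
        [[random.gauss(0.0, hp["init_scale"]) for _ in range(fan_out)]
         for _ in range(fan_in)]
        for fan_in, fan_out in zip(sizes, sizes[1:])
    ]

weights = init_weights(n_inputs=4, n_outputs=3, hp=hparams)
print([(len(w), len(w[0])) for w in weights])   # [(4, 16), (16, 16), (16, 3)]
```

Changing any entry of the dictionary changes the architecture or the training procedure, which is exactly what makes these values hyperparameters rather than learned parameters.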
We train the network by performing backpropagation. During backpropagation, we apply an optimization method to find the optimal weights. Gradient descent is the most commonly used optimization method for training the network during backpropagation.
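The training loop described above can be sketched in a few lines for the simplest possible model, y = w·x with a mean squared error loss; backpropagation here reduces to the chain rule, and each epoch applies one gradient descent update (a minimal illustration, not a full network):

```python
def train(xs, ys, lr=0.05, epochs=100):
    """Fit y = w * x by gradient descent on the mean squared error."""
    w = 0.0
    for _ in range(epochs):
        # backward pass: gradient of the MSE loss w.r.t. w via the chain rule
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad        # gradient descent update
    return w

# Data generated from y = 3x, so the optimal weight is 3.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 6.0, 9.0, 12.0]
print(round(train(xs, ys), 3))   # 3.0
```

A real network repeats the same pattern layer by layer: the forward pass computes activations, the backward pass propagates gradients through each layer, and the optimizer updates every weight.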
Some of the methods used for preventing overfitting in neural networks include the following:
  • Dropouts
  • Early stopping
  • Regularization
  • Data augmentation
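One of the techniques listed above, dropout, can be sketched in a few lines. This is an illustrative implementation of inverted dropout (our own, not from the text): during training each activation is zeroed with probability p and the survivors are scaled by 1/(1 − p), so the expected activation is unchanged and no rescaling is needed at inference time.

```python
import random

def dropout(activations, p=0.5, training=True):
    """Inverted dropout: zero each unit with probability p during training."""
    if not training or p == 0.0:
        return list(activations)      # inference: pass activations through unchanged
    keep = 1.0 - p
    return [a / keep if random.random() < keep else 0.0 for a in activations]

acts = [0.5, 1.2, -0.3, 0.8]
print(dropout(acts, p=0.5))                  # roughly half the units zeroed, rest doubled
print(dropout(acts, p=0.5, training=False))  # unchanged at inference
```

Because a different random subset of units is dropped on every batch, no single neuron can dominate, which is why dropout reduces overfitting.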