Fundamentals of Deep Learning
##### What is a deep neural network?
An artificial neural network consists of one input layer, N hidden layers, and one output layer. When the network has a large number of hidden layers, it is often called a deep neural network.
##### What is the need for a transfer function in deep learning?
The transfer function is more commonly known as the activation function. Its main purpose is to introduce non-linearity into the neural network: by applying a non-linear transformation, it lets the network learn intricate patterns in the data.
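One way to see why the non-linearity matters is to look at what happens without it. The sketch below uses a toy scalar "network" (the function names and the parameters `w1`, `b1`, `w2`, `b2` are illustrative assumptions, not from any library). Two stacked layers with no activation collapse into a single linear layer, while placing a tanh between them does not.

```python
import math


def two_linear_layers(x, w1, b1, w2, b2):
    """Two stacked scalar layers with no activation in between."""
    return w2 * (w1 * x + b1) + b2


def collapsed_layer(x, w1, b1, w2, b2):
    """The single linear layer that the stack above is equivalent to."""
    return (w2 * w1) * x + (w2 * b1 + b2)


def two_layers_with_tanh(x, w1, b1, w2, b2):
    """Same stack, but with a tanh activation after the first layer."""
    return w2 * math.tanh(w1 * x + b1) + b2


# Hypothetical parameter values chosen only for illustration.
params = dict(w1=1.0, b1=0.0, w2=2.0, b2=0.5)

for x in (-2.0, 0.0, 1.0, 3.0):
    stacked = two_linear_layers(x, **params)
    single = collapsed_layer(x, **params)
    print(x, stacked, single)  # the two values agree for every x
```

With the tanh version, equal steps in `x` no longer produce equal steps in the output, which is exactly what a purely linear model cannot express.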
##### Explain the difference between the sigmoid and tanh activation functions.
The sigmoid activation function scales a value to the range 0 to 1 and is centered at 0.5, whereas the tanh activation function scales a value to the range -1 to 1 and is centered at 0.
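The two ranges and centers can be checked with a minimal sketch in plain Python. The `sigmoid` helper here is written by hand for illustration (the branch on the sign of `x` is only there to avoid overflow for large negative inputs), and tanh comes from the standard `math` module.

```python
import math


def sigmoid(x):
    """Logistic sigmoid, 1 / (1 + e^(-x)), computed in an overflow-safe way."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)  # safe: x is negative, so exp(x) is at most 1
    return z / (1.0 + z)


def tanh(x):
    """Hyperbolic tangent, (e^x - e^(-x)) / (e^x + e^(-x))."""
    return math.tanh(x)


for x in (-5.0, -1.0, 0.0, 1.0, 5.0):
    print(f"x={x:+.1f}  sigmoid={sigmoid(x):.4f}  tanh={tanh(x):+.4f}")
```

At `x = 0` the sigmoid returns 0.5 and tanh returns 0, which is what "centered at" refers to. The two functions are also closely related: tanh(x) equals 2 * sigmoid(2x) - 1, so tanh can be read as a rescaled and shifted sigmoid.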
##### Explain the dying ReLU problem.
Suppose x is given as input to the ReLU function. If x is less than 0, ReLU returns 0; if x is greater than or equal to 0, it returns x. Because the output is a constant 0 for every negative input, the gradient there is also 0. A neuron whose weighted input is negative for all training examples therefore receives no gradient, its weights stop updating, and it keeps returning 0. This is often referred to as the dying ReLU problem.
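The consequence for training shows up in the gradient. The following is a minimal sketch for a single scalar neuron; `weight_gradient`, the `upstream` argument, and the chosen weight and bias values are hypothetical and used only for illustration. The gradient at exactly 0 is taken to be 0 here, which is one common convention.

```python
def relu(x):
    """ReLU: returns x when x >= 0, otherwise 0."""
    return x if x >= 0 else 0.0


def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0


def weight_gradient(w, b, x, upstream=1.0):
    """Gradient of the loss with respect to w for the neuron relu(w * x + b).

    `upstream` stands in for the gradient arriving from later layers.
    """
    z = w * x + b
    return upstream * relu_grad(z) * x


inputs = [0.5, 1.0, 2.0]

# A neuron whose bias has drifted far negative: w * x + b < 0 for every input.
dead = [weight_gradient(w=1.0, b=-10.0, x=x) for x in inputs]

# A healthy neuron: w * x + b > 0 for every input.
alive = [weight_gradient(w=1.0, b=0.0, x=x) for x in inputs]

print(dead)   # every gradient is zero, so w would never be updated
print(alive)  # gradients are non-zero, so w can still learn
```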
##### How to combat the dying ReLU problem?
To combat the dying ReLU problem, we can use the leaky ReLU activation function, which introduces a small slope for negative values. Instead of returning 0 whenever x is negative, leaky ReLU returns x multiplied by a small number called alpha, which is usually set to 0.01. Thus, leaky ReLU returns x when x is greater than or equal to 0 and alpha times x when x is less than 0.
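As a rough sketch of the definition above in plain Python (the function names are illustrative; `alpha` defaults to the 0.01 mentioned in the text, and the gradient at exactly 0 is taken to be `alpha` as an assumed convention):

```python
def leaky_relu(x, alpha=0.01):
    """Leaky ReLU: x when x >= 0, otherwise alpha * x."""
    return x if x >= 0 else alpha * x


def leaky_relu_grad(x, alpha=0.01):
    """Derivative of leaky ReLU: 1 for positive inputs, alpha otherwise."""
    return 1.0 if x > 0 else alpha


for x in (-2.0, -0.5, 0.0, 3.0):
    print(x, leaky_relu(x), leaky_relu_grad(x))
```

Because the gradient for negative inputs is `alpha` rather than 0, a neuron that ends up in the negative region still receives a small update and can recover.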

