Tuesday, 9 January 2018

What is an Activation Function in Neural Networks

Activation functions play a very important role in neural networks, so understanding them is key to getting a clearer understanding of how neural networks work.

What is an Activation Function?
An activation function in a neural network is a function that takes the input, or the set of inputs, to a node and produces an output within a particular range, say between 0 and 1 (in the case of a binary activation function). The input to an activation function could be a formula, such as a weighted sum of the node's inputs, or even the output of another function.
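As a rough illustration of where an activation function sits, here is a minimal Python sketch of a single node. The names `node_output`, `binary_step`, `weights` and `bias` are assumptions made for this example and do not come from any particular library. The node computes a weighted sum of its inputs and passes the result through the activation function it is given.

```python
def binary_step(z):
    """Binary activation: 1 when the input is non-negative, otherwise 0."""
    return 1.0 if z >= 0 else 0.0

def node_output(inputs, weights, bias, activation):
    """Output of one node: activation applied to the weighted sum of the inputs."""
    # The weighted sum is the "formula" that feeds the activation function.
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)

# Illustrative values only; they are not taken from a trained network.
inputs = [0.5, -1.0, 2.0]
weights = [0.4, 0.3, -0.2]
bias = 0.1

# The weighted sum here is about -0.4, so the binary step outputs 0.
print(node_output(inputs, weights, bias, binary_step))
```

Passing a different function as `activation` changes the output range without changing the rest of the node.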

The Logistic Function
The logistic function is a type of sigmoid function that takes an input and produces an output between 0 and 1. The formula for the logistic function is given below:

f(x) = 1 / (1 + e^(-x))

One problem with the logistic function is the vanishing gradient problem. This means that when a neuron's activation approaches either limit, 0 or 1, the gradient at that point gets very close to 0, so very little learning signal passes back through that neuron during training.
Other problems with the logistic function are that its output is not zero-centered and that it tends to converge slowly.
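To make the vanishing gradient concrete, here is a small Python sketch that evaluates the logistic function and its derivative at a few points. The function names are chosen for this example. The derivative uses the standard identity f'(x) = f(x) * (1 - f(x)).

```python
import math

def logistic(x):
    """Logistic function: 1 / (1 + e^(-x)). Fine for moderate x; very large negative x would overflow."""
    return 1.0 / (1.0 + math.exp(-x))

def logistic_gradient(x):
    """Derivative of the logistic function, f(x) * (1 - f(x))."""
    fx = logistic(x)
    return fx * (1.0 - fx)

# The gradient peaks at 0.25 when x = 0 and shrinks toward 0 as the
# output approaches either limit (0 or 1).
for x in (-10.0, -2.0, 0.0, 2.0, 10.0):
    print(f"x={x:6.1f}  f(x)={logistic(x):.5f}  gradient={logistic_gradient(x):.5f}")
```

Every output is positive, which is what "not zero-centered" refers to, and the gradient at x = 10 or x = -10 is already several thousand times smaller than at x = 0.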

The Hyperbolic Tangent Function
The hyperbolic tangent is another type of sigmoid function that takes an input and produces an output between -1 and +1. Since the output of the hyperbolic tangent function is zero-centered, it is preferred to the logistic function. But just like the logistic function, it also suffers from the vanishing gradient problem. The formula for the hyperbolic tangent function is given below:

tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x))
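A minimal Python sketch of the hyperbolic tangent follows, using the standard library's `math.tanh` alongside a direct version written from the exponential definition. The helper names `tanh_from_exponentials` and `tanh_gradient` are assumptions made for this example. The derivative is the standard identity 1 - tanh(x)^2.

```python
import math

def tanh_from_exponentials(x):
    """tanh written out as (e^x - e^(-x)) / (e^x + e^(-x))."""
    return (math.exp(x) - math.exp(-x)) / (math.exp(x) + math.exp(-x))

def tanh_gradient(x):
    """Derivative of tanh: 1 - tanh(x)^2."""
    return 1.0 - math.tanh(x) ** 2

# Outputs are symmetric around 0 (zero-centered), but the gradient still
# shrinks toward 0 as the output approaches -1 or +1.
for x in (-5.0, -1.0, 0.0, 1.0, 5.0):
    print(f"x={x:5.1f}  tanh(x)={math.tanh(x):+.5f}  gradient={tanh_gradient(x):.5f}")
```

The gradient is 1 at x = 0 and falls off quickly on both sides, which is the same vanishing gradient behaviour seen with the logistic function.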

The Rectified Linear Unit
The rectified linear unit (ReLU) is an activation function in neural networks defined as the positive part of its argument. The ReLU is defined as

R(x) = max(0,x)
This means that the value is zero when x is less than zero and linear with a slope of 1 when x is greater than zero. In the ImageNet classification paper by Alex Krizhevsky and co-authors, a network with ReLUs was reported to train about six times faster than an equivalent network with hyperbolic tangent units.
The formula for the rectified linear function can also be written piecewise as:

R(x) = 0 if x < 0, and R(x) = x if x >= 0
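Here is a short Python sketch of the ReLU and its slope. The function names are chosen for this example, and returning a slope of 0 at exactly x = 0 is an assumed convention, since the derivative is not defined at that point.

```python
def relu(x):
    """Rectified linear unit: the positive part of x."""
    return max(0.0, x)

def relu_gradient(x):
    """Slope of the ReLU: 0 for negative inputs, 1 for positive inputs."""
    # At exactly x = 0 the derivative is undefined; 0 is used here by convention.
    return 1.0 if x > 0 else 0.0

for x in (-3.0, -0.5, 0.0, 0.5, 3.0):
    print(f"x={x:5.1f}  relu(x)={relu(x):.1f}  gradient={relu_gradient(x):.1f}")
```

Unlike the logistic and hyperbolic tangent functions, the slope stays at 1 for every positive input, however large, so the gradient does not shrink on that side.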

Other variants of the ReLU function are the Leaky Rectified Linear Unit (Leaky ReLU), the Parametric ReLU (PReLU) and the Exponential Linear Unit (ELU). I would recommend you do some personal research on these and other activation functions.
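As a starting point for that research, here is a hedged Python sketch of the three variants in their commonly stated forms. The function names and the default values of `alpha` (0.01 for Leaky ReLU, 1.0 for ELU) are assumptions for illustration; libraries and papers may use different defaults.

```python
import math

def leaky_relu(x, alpha=0.01):
    """Leaky ReLU: a small fixed slope alpha for negative inputs instead of 0."""
    return x if x > 0 else alpha * x

def parametric_relu(x, alpha):
    """Parametric ReLU: same form as Leaky ReLU, but alpha is learned during training."""
    return x if x > 0 else alpha * x

def elu(x, alpha=1.0):
    """Exponential linear unit: alpha * (e^x - 1) for negative inputs, x otherwise."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)

for x in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(f"x={x:5.1f}  leaky={leaky_relu(x):+.4f}  "
          f"prelu={parametric_relu(x, 0.25):+.4f}  elu={elu(x):+.4f}")
```

All three behave like the plain ReLU for positive inputs and differ only in how they treat negative inputs, where each keeps a non-zero output.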