Sunday, 7 January 2018

Basics of Multilayer Perceptron - A Simple Explanation of Multilayer Perceptron

Today we will understand the concept of Multilayer Perceptron.

Recap of Perceptron
You already know that the basic unit of a neural network is a single node (also called a neuron), and a network made up of just one such node is referred to as a perceptron.
The perceptron takes inputs x1, x2, ..., xn and their corresponding weights w1, w2, ..., wn. The inputs are multiplied by their weights and summed, and a function known as the activation function maps this weighted sum to the output y.

Figure 1: A Perceptron ('single-layer' perceptron)
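To make this concrete, here is a minimal sketch of a perceptron in Python with NumPy. The input values, weights and the step activation (output 1 if the weighted sum is above 0, else 0) are illustrative assumptions, not taken from the figure:

import numpy as np

def perceptron(x, w):
    # Multiply each input by its weight and sum: w1*x1 + w2*x2 + ... + wn*xn
    weighted_sum = np.dot(w, x)
    # Step activation: fire (1) if the weighted sum is above the threshold 0
    return 1 if weighted_sum > 0 else 0

x = np.array([1.0, 0.5, -0.2])   # inputs x1, x2, x3
w = np.array([0.4, -0.6, 0.9])   # weights w1, w2, w3
print(perceptron(x, w))          # -> 0 or 1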

What is Multilayer Perceptron?
A multilayer perceptron (MLP) is a class of neural network that is made up of at least three layers of nodes. So now you can see the difference. Also, each node of the multilayer perceptron, except the input nodes, is a neuron that uses a non-linear activation function.
The nodes of the multilayer perceptron are arranged in layers.
  • The input layer
  • The output layer
  • Hidden layers: layers between the input and the output
Also note that the learning algorithm for the multilayer perceptron is known as backpropagation (explained here).

How the Multilayer Perceptron Works
In an MLP, the neurons use non-linear activation functions that are designed to model the behavior of the neurons in the human brain. Every neuron in the network (apart from the input nodes) applies such a non-linear activation function, and the network uses backpropagation for its training.

About Activation Functions

The neuron combines its inputs with their weights and then adds a bias to form a weighted sum. The activation function then maps this weighted sum to the output of the neuron.
One such activation function is the sigmoid function, which is used to determine the output of the neuron. An example of a sigmoid function is the logistic function, which is shown below:

f(x) = 1 / (1 + e^(-x))
Another example of a sigmoid function is the hyperbolic tangent activation function, shown below, which produces an output ranging between -1 and 1:

tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x))

Yet another type of activation function that can be used is the Rectified Linear Unit or ReLU, defined as f(x) = max(0, x), which is said to give better performance than the logistic function and the hyperbolic tangent function.
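As an illustrative sketch (in Python with NumPy, which the article does not prescribe), the three activation functions above can be written as:

import numpy as np

def logistic(x):
    # Logistic sigmoid: squashes any input into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # Hyperbolic tangent: squashes any input into the range (-1, 1)
    return np.tanh(x)

def relu(x):
    # Rectified Linear Unit: zero for negative inputs, identity otherwise
    return np.maximum(0.0, x)

z = np.array([-2.0, 0.0, 2.0])
print(logistic(z))   # approximately [0.119, 0.5, 0.881]
print(tanh(z))       # approximately [-0.964, 0.0, 0.964]
print(relu(z))       # [0.0, 0.0, 2.0]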

Applying Activation Function to MLP
With an activation function, we can calculate the output of any neuron in the MLP. Assuming w denotes the vector of weights, x is the vector of inputs, b is the bias and ϕ is the activation function, then the output y of a neuron would be given by:

y = ϕ(w · x + b) = ϕ(w1x1 + w2x2 + ... + wnxn + b)
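As a small sketch of this formula in code (the values of w, x and b below are illustrative assumptions):

import numpy as np

def logistic(z):
    return 1.0 / (1.0 + np.exp(-z))

def neuron_output(x, w, b, phi):
    # y = phi(w . x + b): the weighted sum of the inputs plus a bias,
    # passed through the activation function phi
    return phi(np.dot(w, x) + b)

x = np.array([0.5, -1.0, 2.0])   # inputs
w = np.array([0.1, 0.8, -0.3])   # weights
b = 0.05                         # bias
print(neuron_output(x, w, b, logistic))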
An MLP is made up of a set of nodes that form the input layer, one or more hidden layers, and an output layer.

Layers of Multilayer Perceptron (Hidden Layers)
Remember that from the definition of the multilayer perceptron, there must be one or more hidden layers. This means that in general, an MLP consists of a minimum of three layers, since we also have the input layer and the output layer.

Also note that the function activating these hidden layers has to be a non-linear function (an activation function), as discussed in the previous section.
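Below is a minimal sketch of a forward pass through such a three-layer network in Python. The layer sizes, the random weight initialization and the use of the logistic function in every layer are illustrative assumptions:

import numpy as np

rng = np.random.default_rng(0)

def logistic(z):
    return 1.0 / (1.0 + np.exp(-z))

n_in, n_hidden, n_out = 3, 4, 2          # illustrative layer sizes
W1 = rng.normal(size=(n_hidden, n_in))   # input -> hidden weights
b1 = np.zeros(n_hidden)
W2 = rng.normal(size=(n_out, n_hidden))  # hidden -> output weights
b2 = np.zeros(n_out)

def forward(x):
    h = logistic(W1 @ x + b1)  # hidden layer with non-linear activation
    y = logistic(W2 @ h + b2)  # output layer
    return y

print(forward(np.array([0.5, -1.0, 2.0])))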

Training/Learning in Multilayer Perceptrons
The training process of the MLP occurs by continuously adjusting the weights of the connections after each piece of data is processed. This adjustment is based on the error in the output (which is the difference between the expected result and the actual output). This continuous adjustment of the weights is a supervised learning process called 'backpropagation'.
The backpropagation algorithm consists of two parts:
  • forward pass
  • backward pass
In the forward pass, the outputs of the network corresponding to the given inputs are evaluated, and the error is obtained by comparing them with the expected outputs.
In the backward pass, partial derivatives of the cost function with respect to the different parameters are propagated back through the network.
The process continues until the error is reduced to its lowest value.
(A detailed lesson on backpropagation is found here.)
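As a toy sketch of this training process in Python (the XOR task, network size, learning rate and number of epochs are all illustrative assumptions, not from the article), a full-batch version of backpropagation could look like this:

import numpy as np

rng = np.random.default_rng(1)

def logistic(z):
    return 1.0 / (1.0 + np.exp(-z))

# XOR dataset: inputs and their expected outputs (illustrative task)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)

# 2 inputs -> 4 hidden neurons -> 1 output (illustrative sizes)
W1 = rng.normal(size=(2, 4))
b1 = np.zeros(4)
W2 = rng.normal(size=(4, 1))
b2 = np.zeros(1)
lr = 0.5  # learning rate (illustrative)

for epoch in range(10000):
    # Forward pass: evaluate the network's outputs for the given inputs
    H = logistic(X @ W1 + b1)
    Y = logistic(H @ W2 + b2)

    # Error: difference between the actual and the expected outputs
    E = Y - T

    # Backward pass: propagate partial derivatives of the squared error
    dY = E * Y * (1 - Y)            # gradient at the output layer
    dH = (dY @ W2.T) * H * (1 - H)  # gradient at the hidden layer

    # Adjust the weights and biases to reduce the error
    W2 -= lr * H.T @ dY
    b2 -= lr * dY.sum(axis=0)
    W1 -= lr * X.T @ dH
    b1 -= lr * dH.sum(axis=0)

# The outputs should now be close to the XOR targets [0, 1, 1, 0]
print(logistic(logistic(X @ W1 + b1) @ W2 + b2).round(3))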

Another method used in training the multilayer perceptron is Stochastic Gradient Descent (SGD), which is explained in detail here.
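To sketch just the stochastic idea, the example below updates the weights from one randomly chosen training example at a time, rather than from the whole dataset as in the batch version above. For simplicity it fits a plain linear model instead of a full MLP; the data and learning rate are illustrative assumptions:

import numpy as np

rng = np.random.default_rng(2)

# Synthetic data: y = x . true_w, with made-up true weights
X = rng.normal(size=(100, 3))
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w

w = np.zeros(3)
lr = 0.05
for step in range(1000):
    i = rng.integers(len(X))   # pick one training example at random
    err = X[i] @ w - y[i]      # prediction error on that example
    w -= lr * err * X[i]       # gradient step for the squared error
print(w.round(2))              # should approach [1.5, -2.0, 0.5]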
Thanks for reading, and if you have any questions, do drop them in the comment box below.