
Simple ConvNet with NumPy
In this post, we build a simple convolutional neural network (SimpleConvNet) using only NumPy and use it to classify MNIST images. The code comes from the book ‘Deep Learning from Scratch’. Let’s first look at the architecture of SimpleConvNet and the notation.
Architecture
N: the number of images (or the mini-batch size)
H: the height of images
W: the width of images
FN: the number of filters
FH: the height of filters
FW: the width of filters
C: the number of channels
OH: the height of outputs
OW: the width of outputs
The architecture of the simple convolutional network runs from the convolution layer to the softmax-with-loss layer. You can find the source code here and explanations in my earlier posts: Convolution, Pool, Affine, Relu, Softmax with Loss.
The forward pass of the convolutional layer produces a four-dimensional tensor of shape [N, FN, OH, OW], where FN corresponds to the number of filters applied in the layer. The output size of the convolution is (28 + 2×0 − 5) / 1 + 1 = 24.
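As a minimal sketch of that formula, the spatial output size is (input + 2·pad − filter) / stride + 1; the helper name conv_output_size below is just for illustration:

```python
def conv_output_size(input_size, filter_size, stride=1, pad=0):
    # (input + 2*pad - filter) / stride + 1
    return (input_size + 2 * pad - filter_size) // stride + 1

# 28x28 MNIST image, 5x5 filter, stride 1, no padding -> 24x24 output
print(conv_output_size(28, 5, stride=1, pad=0))  # 24
```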

The pooling layer transforms the tensor from its original shape [N, C, H, W] to [N, C, OH, OW], where the ratio between H and OH is determined by the stride and pool_size hyperparameters. The flattened output size of the pooling layer is 30 × (24/2) × (24/2) = 4320.
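A quick sanity check of that shape bookkeeping (the numbers assume 30 filters and 2×2 pooling with stride 2, as used later in the post):

```python
N, FN, OH, OW = 1, 30, 24, 24                              # conv output for one MNIST image
pool_size, stride = 2, 2
pooled_shape = (N, FN, OH // pool_size, OW // pool_size)   # (1, 30, 12, 12)
flattened = pooled_shape[1] * pooled_shape[2] * pooled_shape[3]
print(pooled_shape, flattened)                             # (1, 30, 12, 12) 4320
```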
Simple ConvNet
You will need to visit GitHub for the imports, auxiliary functions and the MNIST dataset. There are seven layers: Conv1, Relu1, Pool1, Affine1, Relu2, Affine2 and Softmax with Loss.
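A minimal sketch of how the network could be assembled, assuming the layer classes (Convolution, Relu, Pooling, Affine, SoftmaxWithLoss) from the accompanying repository are importable; this is not the post’s exact code:

```python
from collections import OrderedDict
import numpy as np
from common.layers import Convolution, Relu, Pooling, Affine, SoftmaxWithLoss  # from the repo

class SimpleConvNet:
    def __init__(self, input_dim=(1, 28, 28),
                 conv_param={'filter_num': 30, 'filter_size': 5, 'pad': 0, 'stride': 1},
                 hidden_size=100, output_size=10, weight_init_std=0.01):
        FN, FS = conv_param['filter_num'], conv_param['filter_size']
        pad, stride = conv_param['pad'], conv_param['stride']
        C, H, W = input_dim
        conv_out = (H + 2 * pad - FS) // stride + 1          # 24
        pool_out = FN * (conv_out // 2) * (conv_out // 2)    # 4320

        # Randomly initialized weights and zero biases
        self.params = {
            'W1': weight_init_std * np.random.randn(FN, C, FS, FS), 'b1': np.zeros(FN),
            'W2': weight_init_std * np.random.randn(pool_out, hidden_size), 'b2': np.zeros(hidden_size),
            'W3': weight_init_std * np.random.randn(hidden_size, output_size), 'b3': np.zeros(output_size),
        }

        # The seven layers: Conv1, Relu1, Pool1, Affine1, Relu2, Affine2, SoftmaxWithLoss
        self.layers = OrderedDict()
        self.layers['Conv1'] = Convolution(self.params['W1'], self.params['b1'], stride, pad)
        self.layers['Relu1'] = Relu()
        self.layers['Pool1'] = Pooling(pool_h=2, pool_w=2, stride=2)
        self.layers['Affine1'] = Affine(self.params['W2'], self.params['b2'])
        self.layers['Relu2'] = Relu()
        self.layers['Affine2'] = Affine(self.params['W3'], self.params['b3'])
        self.last_layer = SoftmaxWithLoss()
```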
Before we compute the loss, we first need to perform a forward pass. It takes the input through the layers (conv → relu → pool → affine → relu → affine) and forwards it. Since we defined the last layer as the softmax-with-loss layer, we use the cross-entropy loss.
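A sketch of the forward pass and the loss, written here as free functions operating on the network sketched above (in the actual code they are methods of the class):

```python
def predict(net, x):
    # Push the input through conv -> relu -> pool -> affine -> relu -> affine
    for layer in net.layers.values():
        x = layer.forward(x)
    return x

def loss(net, x, t):
    # The last layer applies softmax and returns the cross-entropy loss
    y = predict(net, x)
    return net.last_layer.forward(y, t)
```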
The accuracy is calculated by dividing the number of correct predictions by the number of samples.
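As a sketch, assuming the predict function above and mini-batched evaluation:

```python
import numpy as np

def accuracy(net, x, t, batch_size=100):
    if t.ndim != 1:
        t = np.argmax(t, axis=1)          # one-hot labels -> class indices
    correct = 0
    for i in range(0, x.shape[0], batch_size):
        y = predict(net, x[i:i + batch_size])
        correct += np.sum(np.argmax(y, axis=1) == t[i:i + batch_size])
    return correct / x.shape[0]
```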
It starts by running the forward pass over all the layers, then computes the error between the output of the network and the target value. We can then build a grads dictionary by calling the backward function of each layer in reverse order, from the last layer back to the first.
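A sketch of that backward pass, assuming each parameterised layer stores its gradients in dW and db as the repository’s layer classes do:

```python
def gradient(net, x, t):
    # Forward pass: computes the loss and caches the intermediate values
    loss(net, x, t)

    # Backward pass: start at the loss layer, then walk the layers in reverse
    dout = net.last_layer.backward(1)
    for layer in reversed(list(net.layers.values())):
        dout = layer.backward(dout)

    # Collect the gradients of the layers that hold parameters
    grads = {
        'W1': net.layers['Conv1'].dW,   'b1': net.layers['Conv1'].db,
        'W2': net.layers['Affine1'].dW, 'b2': net.layers['Affine1'].db,
        'W3': net.layers['Affine2'].dW, 'b3': net.layers['Affine2'].db,
    }
    return grads
```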
This can be used when you want to reuse the trained parameters.
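A rough sketch of saving and loading the parameters with pickle (the file name ‘params.pk’ follows the post; the exact functions in the repository may differ):

```python
import pickle

def save_params(net, file_name='params.pk'):
    # Persist the learned weights and biases
    with open(file_name, 'wb') as f:
        pickle.dump(dict(net.params), f)

def load_params(net, file_name='params.pk'):
    with open(file_name, 'rb') as f:
        net.params.update(pickle.load(f))
    # Push the loaded weights back into the layers that use them
    for w_key, b_key, name in [('W1', 'b1', 'Conv1'),
                               ('W2', 'b2', 'Affine1'),
                               ('W3', 'b3', 'Affine2')]:
        net.layers[name].W = net.params[w_key]
        net.layers[name].b = net.params[b_key]
```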
Train Simple ConvNet
After loading the data, we reduce the number of samples, since training on the full set might take too long, and then set the parameters. For training, we use a trainer whose code you can find here, and we can change the optimizer. The options are ‘sgd’, ‘momentum’, ‘nesterov’, ‘adagrad’, ‘rmsprop’ and ‘adam’. You can read an explanation of some of them here.
Below is a graph of the accuracy history during training.
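A rough training script, assuming the repository’s load_mnist and Trainer are available and that the network exposes the usual loss/gradient/accuracy interface; the hyperparameters are only illustrative:

```python
from dataset.mnist import load_mnist    # from the repository
from common.trainer import Trainer      # from the repository

(x_train, t_train), (x_test, t_test) = load_mnist(flatten=False)

# Use a subset of the data so training does not take too long
x_train, t_train = x_train[:5000], t_train[:5000]
x_test, t_test = x_test[:1000], t_test[:1000]

network = SimpleConvNet(input_dim=(1, 28, 28),
                        conv_param={'filter_num': 30, 'filter_size': 5, 'pad': 0, 'stride': 1},
                        hidden_size=100, output_size=10)

trainer = Trainer(network, x_train, t_train, x_test, t_test,
                  epochs=20, mini_batch_size=100,
                  optimizer='adam', optimizer_param={'lr': 0.001})
trainer.train()

save_params(network, 'params.pk')        # keep the trained filters for later
```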

Apply Filters to MNIST
We can load the ‘params.pk’ file with the load_params function and then visualize the trained filters. Since we set the number of filters to 30, we can apply 30 filters to each image. Each filter captures local features.
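A sketch of how the learned filters could be drawn with matplotlib, assuming a trained network (or one whose parameters were loaded from ‘params.pk’); W1 has shape (30, 1, 5, 5):

```python
import numpy as np
import matplotlib.pyplot as plt

def show_filters(filters, ncols=6):
    # Draw each 5x5 filter as a small grayscale image
    FN = filters.shape[0]
    nrows = int(np.ceil(FN / ncols))
    fig = plt.figure()
    for i in range(FN):
        ax = fig.add_subplot(nrows, ncols, i + 1, xticks=[], yticks=[])
        ax.imshow(filters[i, 0], cmap='gray', interpolation='nearest')
    plt.show()

load_params(network, 'params.pk')        # reuse the trained parameters
show_filters(network.params['W1'])       # 30 filters of shape (1, 5, 5)
```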

I hope this post has broadened your horizons and deepened your understanding of how a convolutional network operates. Thank you for reading :)