This is a convolutional neural network trained to classify images of Rock Paper Scissors hand gestures. It was trained on a fairly small dataset of 2,188 RGB images. Below you can find information about its creation and evaluation.
To maximize the network's performance, the training and test images were preprocessed to retain only useful information. This was a three-step process:
The resulting images show only the outline of the gesture, which is the information that matters most for classification.
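The exact preprocessing steps are not reproduced here, but a pipeline of this kind typically converts to grayscale, blurs away noise, and keeps only strong edges. The sketch below illustrates the idea with plain numpy; the function name and thresholds are illustrative, not the project's actual code.

```python
import numpy as np

def extract_outline(image, threshold=30.0):
    """Reduce an RGB image to a binary edge map of the gesture.

    Hypothetical sketch of outline-style preprocessing; the model's
    real pipeline may use different filters and thresholds.
    """
    # Convert RGB to grayscale using the usual luminosity weights.
    gray = image[..., 0] * 0.299 + image[..., 1] * 0.587 + image[..., 2] * 0.114

    # Smooth with a 3x3 box blur to suppress pixel noise.
    padded = np.pad(gray, 1, mode="edge")
    blurred = sum(
        padded[dy:dy + gray.shape[0], dx:dx + gray.shape[1]]
        for dy in range(3) for dx in range(3)
    ) / 9.0

    # Keep only pixels where the local gradient is strong,
    # which leaves the outline of the hand.
    gy, gx = np.gradient(blurred)
    magnitude = np.hypot(gx, gy)
    return (magnitude > threshold).astype(np.uint8)
```

On a picture with a uniform background this yields a clean silhouette; as noted later in this document, it degrades when the background is busy or close to the hand's color.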
The model is composed of the following layers:
To reduce overfitting, an image data generator was used: it applied small random transformations (rotation, shift, shear, and zoom) to the training images. With early stopping applied, training took 15 epochs.
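As a rough illustration of what such random transformations do, here is a minimal numpy sketch of shift and zoom augmentation (rotation and shear require interpolation and are omitted for brevity). The function names are illustrative; the project presumably used a library generator such as Keras's `ImageDataGenerator` rather than hand-rolled code like this.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_shift(image, max_shift=4):
    """Translate the image by a few pixels; vacated edges become black."""
    h, w = image.shape[:2]
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    pad = ((max_shift, max_shift), (max_shift, max_shift)) + ((0, 0),) * (image.ndim - 2)
    padded = np.pad(image, pad, mode="constant")
    return padded[max_shift + dy:max_shift + dy + h,
                  max_shift + dx:max_shift + dx + w]

def random_zoom(image, max_crop=4):
    """Zoom in by cropping the borders, then resize back (nearest neighbor)."""
    h, w = image.shape[:2]
    c = int(rng.integers(0, max_crop + 1))
    cropped = image[c:h - c, c:w - c] if c else image
    idx_y = np.arange(h) * cropped.shape[0] // h
    idx_x = np.arange(w) * cropped.shape[1] // w
    return cropped[idx_y[:, None], idx_x[None, :]]

def augment(image):
    """Apply one random shift and one random zoom to a training image."""
    return random_zoom(random_shift(image))
```

Each call produces a slightly different variant of the same image, so the network never sees exactly the same input twice across epochs.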
Here are the classification results for images from outside of the train/test set:
In some cases, the model struggles to make a correct classification. This is caused by image preprocessing issues: when the input picture has a non-uniform background, or the background shares the hand's color tone, the edges of the gesture are not recognized correctly.
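One way to flag such problem inputs before classification is a simple contrast heuristic: compare the brightness of the image border (assumed to be background) with the central region (assumed to contain the hand). This check is a hypothetical addition, not part of the described pipeline.

```python
import numpy as np

def background_contrast(image, border=4):
    """Rough heuristic for preprocessing suitability.

    Returns the absolute difference in mean brightness between the
    border frame and the central region. A small value suggests the
    hand blends into the background and edge detection may fail.
    """
    gray = image.mean(axis=-1)
    h, w = gray.shape
    mask = np.zeros((h, w), dtype=bool)
    mask[:border] = True
    mask[-border:] = True
    mask[:, :border] = True
    mask[:, -border:] = True
    return abs(gray[mask].mean() - gray[~mask].mean())
```

Inputs scoring below some empirically chosen threshold could be rejected with a request to retake the photo against a plainer background.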
To improve classification results, the following advice should be taken into account:
The CNN model was trained on a dataset available on Kaggle under the CC BY-SA 4.0 license.