The Semantic "MNIST" - Counting the Number of Dots in an Image

SMNIST

March 10, 2019 - nb

Have you seen the movie Rain Man? Do you remember the scene in which Dustin Hoffman's character counts the exact number of toothpicks on the floor in the blink of an eye? This scene gave us the idea to implement it as a machine learning example. We do not count toothpicks, though, but points in an image.

In the classic MNIST task, a typical classifier program takes images of handwritten digits and recognizes these digits.

The MNIST classifier says it's 8

The "SMNIST" program does not take images of digits but images that contain fewer than 10 points, and it should tell how many points are in the image.

The "SMNIST" classifier should say it's 8

Experiment 1, the generator program

We have written a program (called smnistg.cpp) that generates images that contain fewer than 10 points. Its output is binary compatible with the format of the original MNIST training and test data, so we can immediately start the first experiments using the former MNIST programs.

-rw-r--r-- 1 batfai batfai 146931 márc 10 16:10 t10k-images-idx3-ubyte.gz
-rw-r--r-- 1 batfai batfai 785756 márc 10 16:11 t10k-images-rgb-png.tgz
-rw-r--r-- 1 batfai batfai 5120 márc 10 16:10 t10k-labels-idx1-ubyte.gz
-rw-r--r-- 1 batfai batfai 876709 márc 10 16:07 train-images-idx3-ubyte.gz
-rw-r--r-- 1 batfai batfai 4677440 márc 10 16:10 train-images-rgb-png.tgz
-rw-r--r-- 1 batfai batfai 29480 márc 10 16:07 train-labels-idx1-ubyte.gz

These files can be downloaded at http://smartcity.inf.unideb.hu/~norbi/SMNIST/

The sources can be found at https://gitlab.com/nbatfai/smnist

Introductory video

(Accuracy: TF/softmax MNIST ~0.6; Keras/MNIST CNN ~0.8; with 50 epochs and batch size 256, ~0.9)

There was a mistake in our previous video: we printed the labels instead of the classifications of the 10 selected test images. Here comes the correction: