
MNIST Live Detection using OpenCV, Tensorflow Lite and AutoKeras

License: MIT License

Python 100.00%
mnist mnist-classification mnist-handwriting-recognition opencv tensorflowlite autokeras machine-learning deep-learning computer-vision image-classification image-recognition

MNIST Still/Live Detection using OpenCV, Tensorflow Lite and AutoKeras

(Screenshot: mnist_live_screenshot)

This example uses AutoKeras to train a CNN model on the MNIST handwriting dataset, converts it to a Tensorflow Lite model, and uses OpenCV to detect and classify multiple digits in either a still image or live video. Tested on PC and Raspberry Pi 3B+/4B.

Note that the training dataset consists of handwritten digits with certain features, so it's best to write with a sharpie on white paper and make the digits as square and clear as possible; long, thin digits are likely to be misclassified. The scripts also ignore anything touching the border of the image/video, as well as digits that are too big or too small (these limits can be adjusted in the code).

Testing environment

  • Python 3.9.9 (PC) / 3.7.3 (Raspberry Pis)
  • AutoKeras 1.0.16.post1 (installs NumPy, Pandas, scikit-learn, Tensorflow, etc. if you don't have them)
  • Tensorflow 2.5.2
  • TF Lite runtime 2.5.0.post1 (on both PC and RPis)
  • OpenCV 4.5.5
  • USB webcam

If you have a GPU and CUDA installed, AutoKeras will use it for training.

Files

mnist_tflite_trainer.py is the model trainer, and mnist.tflite is the model it generated for me. The script prints the model summary and test prediction results, then saves the model in TF Lite format.
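
The trainer boils down to something like the following sketch (assuming the AutoKeras ImageClassifier API and the Keras built-in MNIST loader; the actual script may organize this differently):

import autokeras as ak
import tensorflow as tf
from tensorflow.keras.datasets import mnist

# Load MNIST: 60,000 training and 10,000 test images, 28x28 grayscale
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Let AutoKeras search for a CNN classifier (a single trial keeps the search short)
clf = ak.ImageClassifier(max_trials=1, overwrite=True)
clf.fit(x_train, y_train)

# Evaluate on the test set, then export the best Keras model found
print(clf.evaluate(x_test, y_test))
model = clf.export_model()
model.summary()

# Convert the exported model to Tensorflow Lite and save it
converter = tf.lite.TFLiteConverter.from_keras_model(model)
with open('./mnist.tflite', 'wb') as f:
    f.write(converter.convert())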

The output of the trainer is as follows (with GPU):

Trial 1 Complete [00h 04m 14s]
val_loss: 0.03911824896931648

Best val_loss So Far: 0.03911824896931648
Total elapsed time: 00h 04m 14s
Epoch 1/21
1875/1875 [==============================] - 8s 4ms/step - loss: 0.1584 - accuracy: 0.9513
Epoch 2/21
1875/1875 [==============================] - 8s 4ms/step - loss: 0.0735 - accuracy: 0.9778
Epoch 3/21
1875/1875 [==============================] - 8s 4ms/step - loss: 0.0616 - accuracy: 0.9809
Epoch 4/21
1875/1875 [==============================] - 8s 4ms/step - loss: 0.0503 - accuracy: 0.9837
Epoch 5/21
1875/1875 [==============================] - 8s 4ms/step - loss: 0.0441 - accuracy: 0.9860
Epoch 6/21
1875/1875 [==============================] - 8s 4ms/step - loss: 0.0414 - accuracy: 0.9864
Epoch 7/21
1875/1875 [==============================] - 8s 4ms/step - loss: 0.0383 - accuracy: 0.9872
Epoch 8/21
1875/1875 [==============================] - 8s 4ms/step - loss: 0.0331 - accuracy: 0.9893
Epoch 9/21
1875/1875 [==============================] - 8s 4ms/step - loss: 0.0325 - accuracy: 0.9893
Epoch 10/21
1875/1875 [==============================] - 8s 4ms/step - loss: 0.0307 - accuracy: 0.9901 
Epoch 11/21
1875/1875 [==============================] - 8s 4ms/step - loss: 0.0305 - accuracy: 0.9899
Epoch 12/21
1875/1875 [==============================] - 8s 4ms/step - loss: 0.0287 - accuracy: 0.9906
Epoch 13/21
1875/1875 [==============================] - 8s 4ms/step - loss: 0.0258 - accuracy: 0.9917
Epoch 14/21
1875/1875 [==============================] - 8s 4ms/step - loss: 0.0243 - accuracy: 0.9920
Epoch 15/21
1875/1875 [==============================] - 8s 4ms/step - loss: 0.0254 - accuracy: 0.9915
Epoch 16/21
1875/1875 [==============================] - 9s 5ms/step - loss: 0.0243 - accuracy: 0.9920 
Epoch 17/21
1875/1875 [==============================] - 8s 4ms/step - loss: 0.0231 - accuracy: 0.9922 
Epoch 18/21
1875/1875 [==============================] - 8s 4ms/step - loss: 0.0218 - accuracy: 0.9924 
Epoch 19/21
1875/1875 [==============================] - 8s 4ms/step - loss: 0.0213 - accuracy: 0.9932
Epoch 20/21
1875/1875 [==============================] - 9s 5ms/step - loss: 0.0226 - accuracy: 0.9927 
Epoch 21/21
1875/1875 [==============================] - 8s 4ms/step - loss: 0.0197 - accuracy: 0.9938

313/313 [==============================] - 1s 3ms/step - loss: 0.0387 - accuracy: 0.9897

Prediction loss: 0.039, accuracy: 98.970%

Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         [(None, 28, 28)]          0         
_________________________________________________________________
cast_to_float32 (CastToFloat (None, 28, 28)            0         
_________________________________________________________________
expand_last_dim (ExpandLastD (None, 28, 28, 1)         0         
_________________________________________________________________
normalization (Normalization (None, 28, 28, 1)         3         
_________________________________________________________________
conv2d (Conv2D)              (None, 26, 26, 32)        320       
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 24, 24, 64)        18496     
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 12, 12, 64)        0         
_________________________________________________________________
dropout (Dropout)            (None, 12, 12, 64)        0         
_________________________________________________________________
flatten (Flatten)            (None, 9216)              0         
_________________________________________________________________
dropout_1 (Dropout)          (None, 9216)              0         
_________________________________________________________________
dense (Dense)                (None, 10)                92170     
_________________________________________________________________
classification_head_1 (Softm (None, 10)                0         
=================================================================
Total params: 110,989
Trainable params: 110,986
Non-trainable params: 3
_________________________________________________________________

My model is 611 KB and its Lite version is 432 KB.

Model test

mnist_tflite_model_test.py can be used to test the TF Lite model (again using the original MNIST test dataset):

test image shape: (10000, 28, 28)
test label shape: (10000,)

Loading ./mnist.tflite ...

input details:
[{'dtype': <class 'numpy.uint8'>,
  'index': 0,
  'name': 'input_1',
  'quantization': (0.0, 0),
  'quantization_parameters': {'quantized_dimension': 0,
                              'scales': array([], dtype=float32),
                              'zero_points': array([], dtype=int32)},
  'shape': array([ 1, 28, 28]),
  'shape_signature': array([-1, 28, 28]),
  'sparsity_parameters': {}}]

output details:
[{'dtype': <class 'numpy.float32'>,
  'index': 16,
  'name': 'Identity',
  'quantization': (0.0, 0),
  'quantization_parameters': {'quantized_dimension': 0,
                              'scales': array([], dtype=float32),
                              'zero_points': array([], dtype=int32)},
  'shape': array([ 1, 10]),
  'shape_signature': array([-1, 10]),
  'sparsity_parameters': {}}]

new input shape: [10000    28    28]
new output shape: [10000    10]

Predicting...
Prediction accuracy: 0.9897
Prediction MSE: 0.1737

              precision    recall  f1-score   support

           0       0.99      1.00      0.99       980
           1       0.99      1.00      0.99      1135
           2       0.99      0.98      0.99      1032
           3       0.99      0.99      0.99      1010
           4       0.98      0.99      0.99       982
           5       0.99      0.99      0.99       892
           6       0.99      0.99      0.99       958
           7       0.98      0.99      0.99      1028
           8       0.99      0.99      0.99       974
           9       1.00      0.98      0.99      1009

    accuracy                           0.99     10000
   macro avg       0.99      0.99      0.99     10000
weighted avg       0.99      0.99      0.99     10000

It also shows the first 40 test digits with their predicted labels:

(Image: mnist-model-test)
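
For reference, the core of the test script looks roughly like this (a minimal sketch using tf.lite.Interpreter; the batch-resizing step mirrors the "new input shape" line in the output above, but the exact code is an assumption):

import numpy as np
import tensorflow as tf
from tensorflow.keras.datasets import mnist

# Load the original MNIST test set
(_, _), (x_test, y_test) = mnist.load_data()

# Load the TF Lite model and inspect its input/output tensors
interpreter = tf.lite.Interpreter(model_path='./mnist.tflite')
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Resize the input tensor so the whole test set runs as a single batch
interpreter.resize_tensor_input(input_details[0]['index'], [len(x_test), 28, 28])
interpreter.allocate_tensors()

# Run inference and take the class with the highest softmax score
interpreter.set_tensor(input_details[0]['index'], x_test.astype(np.uint8))
interpreter.invoke()
predictions = np.argmax(interpreter.get_tensor(output_details[0]['index']), axis=1)

print('Prediction accuracy:', np.mean(predictions == y_test))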

Digit detection

mnist_tflite_detection.py works on a single still photo: it saves images to demonstrate the different preprocessing steps, prints out the digit labels/positions and shows the final black-and-white result:

Detected digit: [5] at x=71, y=292, w=43, h=54 (100.000%)
Detected digit: [6] at x=176, y=288, w=32, h=47 (100.000%)
Detected digit: [8] at x=373, y=282, w=42, h=43 (99.861%)
Detected digit: [7] at x=267, y=282, w=36, h=52 (99.974%)
Detected digit: [9] at x=473, y=271, w=32, h=57 (99.852%)
Detected digit: [2] at x=279, y=133, w=38, h=52 (99.997%)
Detected digit: [1] at x=186, y=130, w=29, h=60 (99.874%)
Detected digit: [4] at x=471, y=129, w=52, h=55 (100.000%)
Detected digit: [3] at x=378, y=126, w=29, h=55 (100.000%)
Detected digit: [0] at x=79, y=125, w=56, h=56 (100.000%)

(Image: 05-mnist-detection)
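
The detection logic is roughly as follows (a simplified sketch; the file name, size limits and padding details are assumptions, not the exact values used in the script):

import cv2
import numpy as np

img = cv2.imread('./digits.jpg')  # hypothetical input image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Threshold to white digits on a black background (see the thresholding note below)
_, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

# Find the outer contour of every candidate digit
contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

for contour in contours:
    x, y, w, h = cv2.boundingRect(contour)
    # Skip blobs touching the border and blobs that are too big or too small
    if x == 0 or y == 0 or x + w >= binary.shape[1] or y + h >= binary.shape[0]:
        continue
    if not (20 < w < 120 and 20 < h < 120):
        continue
    # Pad the crop to a square, then scale it down to the 28x28 MNIST input size
    size = max(w, h)
    square = np.zeros((size, size), dtype=np.uint8)
    square[(size - h) // 2:(size - h) // 2 + h, (size - w) // 2:(size - w) // 2 + w] = binary[y:y + h, x:x + w]
    sample = cv2.resize(square, (28, 28), interpolation=cv2.INTER_AREA)
    # sample (reshaped to (1, 28, 28), dtype uint8) is then fed to the TF Lite interpreter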

mnist_tflite_live_detection.py is the live video version using a webcam; it draws the results directly on the original frames. You can check out the video demo in live-demo.mp4.

Both scripts can use either Tensorflow Lite from the standard Tensorflow package or the standalone TF Lite runtime.
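
The usual pattern is an import fallback like this (a sketch; only the package names are assumed):

try:
    # Standalone TF Lite runtime (lightweight, good for the Raspberry Pi)
    from tflite_runtime.interpreter import Interpreter
except ImportError:
    # Fall back to the interpreter bundled with the full Tensorflow package
    import tensorflow as tf
    Interpreter = tf.lite.Interpreter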

Note on image thresholding

In the detection scripts, OpenCV applies automatic image thresholding to convert the frame to black and white in order to get clean digit images. Most of the time this works well, as long as you provide a bright and evenly lit surface, but you can control the threshold manually when needed:

_, frame_binary = cv2.threshold(frame_gray, 160, 255, cv2.THRESH_BINARY_INV)
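
If you prefer to keep the thresholding automatic, Otsu's method and adaptive thresholding are the usual OpenCV options (a sketch, not necessarily the exact flags the script uses):

# Let OpenCV pick a global threshold automatically (Otsu's method)
_, frame_binary = cv2.threshold(frame_gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

# Or use adaptive thresholding, which copes better with uneven lighting
frame_binary = cv2.adaptiveThreshold(frame_gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                     cv2.THRESH_BINARY_INV, 11, 2)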
