img-rec's Introduction

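Basic features

  • Take a photo with the camera and recognize the objects in it
  • Load a local image, or a folder of images, and recognize them one by one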
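The "single image or folder" behaviour can be sketched as a small path helper. This is only an illustration using the standard library: the function name `list_images` and the set of accepted extensions are assumptions, not taken from the project code.

```python
from pathlib import Path

# Assumed set of extensions; the project may accept a different set.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def list_images(path):
    """Return the image files to recognize for a file or folder path."""
    target = Path(path)
    if target.is_file():
        # A single image: keep it only if it looks like an image file.
        return [target] if target.suffix.lower() in IMAGE_EXTENSIONS else []
    if target.is_dir():
        # A folder: collect its images in a stable order, to process one by one.
        return sorted(
            p for p in target.iterdir()
            if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
        )
    # The path does not exist.
    return []


if __name__ == "__main__":
    for image_path in list_images("Images"):
        print(image_path)
```

Returning a list in both cases lets the caller use one loop for a single file and for a folder.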

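Environment and dependencies

The project was developed with Python 3.6, and opening it in PyCharm is recommended. The venv directory of installed packages is not committed to git, so please install the dependencies yourself.

  • To install the dependencies, run pip install -r requirements.txt in the project root

The main modules are:

  • opencv (imported as cv2)
  • PySide2
  • keras
  • tensorflow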

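Recognition models

There are currently two model scripts, both written as Jupyter Notebooks. Keras_Cifar_CNN_Introduce.ipynb is a CNN that we designed and built ourselves; its structure and performance are shown below.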

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_1 (Conv2D)            (None, 32, 32, 32)        896       
_________________________________________________________________
dropout_1 (Dropout)          (None, 32, 32, 32)        0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 32, 32, 32)        9248      
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 16, 16, 32)        0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 16, 16, 64)        18496     
_________________________________________________________________
dropout_2 (Dropout)          (None, 16, 16, 64)        0         
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 16, 16, 64)        36928     
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 8, 8, 64)          0         
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 8, 8, 128)         73856     
_________________________________________________________________
dropout_3 (Dropout)          (None, 8, 8, 128)         0         
_________________________________________________________________
conv2d_6 (Conv2D)            (None, 8, 8, 128)         147584    
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 4, 4, 128)         0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 2048)              0         
_________________________________________________________________
dropout_4 (Dropout)          (None, 2048)              0         
_________________________________________________________________
dense_1 (Dense)              (None, 2500)              5122500   
_________________________________________________________________
dropout_5 (Dropout)          (None, 2500)              0         
_________________________________________________________________
dense_2 (Dense)              (None, 1500)              3751500   
_________________________________________________________________
dropout_6 (Dropout)          (None, 1500)              0         
_________________________________________________________________
dense_3 (Dense)              (None, 200)               300200    
_________________________________________________________________
dropout_7 (Dropout)          (None, 200)               0         
_________________________________________________________________
dense_4 (Dense)              (None, 10)                2010      
=================================================================
Total params: 9,463,218
Trainable params: 9,463,218
Non-trainable params: 0
_________________________________________________________________
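The parameter counts in the summary can be checked by hand. The sketch below assumes 3x3 convolution kernels with biases; the kernel size is not stated in the text and is inferred from the counts (for example, 896 = (3·3·3 + 1)·32). The helper names are illustrative.

```python
def conv2d_params(kernel_size, in_channels, filters):
    """Weights plus one bias per filter for a 2D convolution."""
    return (kernel_size * kernel_size * in_channels + 1) * filters


def dense_params(in_units, out_units):
    """Weights plus one bias per output unit for a fully connected layer."""
    return (in_units + 1) * out_units


# (in_channels, filters) for conv2d_1 .. conv2d_6; the 3x3 kernel is an
# assumption inferred from the parameter counts in the summary.
conv_layers = [(3, 32), (32, 32), (32, 64), (64, 64), (64, 128), (128, 128)]
# (in_units, out_units) for dense_1 .. dense_4; 2048 = 4 * 4 * 128 after Flatten.
dense_layers = [(2048, 2500), (2500, 1500), (1500, 200), (200, 10)]

conv_counts = [conv2d_params(3, c_in, f) for c_in, f in conv_layers]
dense_counts = [dense_params(n_in, n_out) for n_in, n_out in dense_layers]
total = sum(conv_counts) + sum(dense_counts)
dense_share = sum(dense_counts) / total

print(conv_counts)           # [896, 9248, 18496, 36928, 73856, 147584]
print(dense_counts)          # [5122500, 3751500, 300200, 2010]
print(total)                 # 9463218
print(f"{dense_share:.1%}")  # 97.0%
```

The total matches the 9,463,218 reported above, and about 97% of the parameters sit in the four Dense layers.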

Keras_Cifar_ResNet.ipynb is the ResNet program provided on the official Keras website, which includes both the V1 and V2 versions.

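Because the self-designed model has many fully connected layers, the trained model file is about 100 MB, so it is not included in the repo. Training for 200 epochs (batch_size=500) takes about 10 minutes, with the following results: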

val_acc: 0.8215

scores: 0.8128

crosstab (rows: true label, columns: predicted label):

label\pred 0 1 2 3 4 5 6 7 8 9
0 797 8 62 18 9 5 13 3 72 13
1 7 933 4 5 3 1 4 0 7 36
2 29 0 775 34 58 34 51 9 10 0
3 13 2 69 693 35 112 51 16 5 4
4 9 1 57 57 792 23 39 21 1 0
5 1 45 184 25 697 13 26 2 1
6 3 3 36 51 19 15 869 0 4 0
7 10 2 26 49 57 35 2 813 1 5
8 32 13 13 17 8 2 6 3 898 8
9 18 60 6 23 4 1 10 4 13 861
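As a consistency check, the overall accuracy can be recomputed from the diagonal of the crosstab. The sketch assumes the model was evaluated on the standard CIFAR-10 test set (1,000 images per class, which matches the row sums). The row for class 5 has only nine values, so taking 697 as its diagonal entry is an assumption; it is consistent with the reported score.

```python
# Diagonal of the crosstab: correctly classified images for classes 0..9.
# The 697 for class 5 comes from an incomplete row and is an assumption.
correct_per_class = [797, 933, 775, 693, 792, 697, 869, 813, 898, 861]
# The CIFAR-10 test set has 1,000 images per class.
images_per_class = 1000

accuracy = sum(correct_per_class) / (images_per_class * len(correct_per_class))
recall = [c / images_per_class for c in correct_per_class]
# Indices of the two classes with the lowest recall.
hardest = sorted(range(len(recall)), key=lambda k: recall[k])[:2]

print(accuracy)  # 0.8128
print(hardest)   # [3, 5]
```

The result agrees with scores: 0.8128. In the standard CIFAR-10 label order, classes 3 and 5 are cat and dog, which are the two classes the model confuses most.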

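The best-performing model file so far, cifar10_ResNet29v2_model.068.h5, was produced by training the Keras_Cifar_ResNet.ipynb script on the Cifar10 dataset on Google Colab (Python 3 with GPU acceleration). Training ran for 70 epochs and took about 1 hour; the checkpoint with the best validation accuracy was kept (epoch 68, val_acc=85.65%).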
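The file name encodes the dataset, the network and the epoch. The sketch below assumes the notebook keeps the naming and depth rule of the upstream Keras CIFAR-10 ResNet example (depth = 6n + 2 for V1 and 9n + 2 for V2); the function names are illustrative.

```python
def resnet_depth(n, version):
    """Depth rule used by the Keras CIFAR-10 ResNet example."""
    if version == 1:
        return n * 6 + 2
    if version == 2:
        return n * 9 + 2
    raise ValueError("version must be 1 or 2")


def checkpoint_name(dataset, n, version, epoch):
    """Build a checkpoint file name in the dataset_network_epoch pattern."""
    depth = resnet_depth(n, version)
    return f"{dataset}_ResNet{depth}v{version}_model.{epoch:03d}.h5"


print(checkpoint_name("cifar10", 3, 2, 68))    # cifar10_ResNet29v2_model.068.h5
print(checkpoint_name("cifar100", 3, 1, 155))  # cifar100_ResNet20v1_model.155.h5
```

Under this assumption, both model files shipped in Models correspond to n = 3, one with the V2 network and one with the V1 network.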

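Usage

  • Install the dependencies
  • Run main.py in the project root

By default the project opens images or image folders from the Images folder (a relative path), and loads Models/cifar10_ResNet29v2_model.068.h5 as the model used at startup. To change these defaults, edit Core/CameraMainWin.py and Core/ImageRecognition.py as needed.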
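For orientation, loading the default model and classifying one image might look roughly like the sketch below. It is not the code in Core/ImageRecognition.py: `classify` and `top_prediction` are hypothetical helpers, and the preprocessing (RGB, 32x32, scaled to [0, 1]) is an assumption that has to match whatever the training notebook did.

```python
# Class names in the standard CIFAR-10 label order.
CIFAR10_LABELS = ["airplane", "automobile", "bird", "cat", "deer",
                  "dog", "frog", "horse", "ship", "truck"]


def top_prediction(probabilities):
    """Return (label, probability) for the highest-scoring class."""
    best = max(range(len(probabilities)), key=lambda k: probabilities[k])
    return CIFAR10_LABELS[best], float(probabilities[best])


def classify(image_path, model_path="Models/cifar10_ResNet29v2_model.068.h5"):
    """Hypothetical helper: classify one image with the default model."""
    # Imported here so the pure-Python helper above works without these packages.
    import cv2
    import numpy as np
    from keras.models import load_model

    image = cv2.imread(str(image_path))             # BGR, uint8
    if image is None:
        raise FileNotFoundError(image_path)
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # CIFAR-10 images are RGB
    image = cv2.resize(image, (32, 32))             # network input is 32x32x3
    # Assumption: scaling to [0, 1]. The training notebook may also subtract
    # the training-set pixel mean, in which case the same step is needed here.
    batch = np.expand_dims(image.astype("float32") / 255.0, axis=0)
    model = load_model(model_path)
    return top_prediction(model.predict(batch)[0])
```

A call such as `classify("Images/myset/example.jpg")` (the file name is made up) would return a pair like `("cat", 0.93)`. In a GUI, the model would normally be loaded once at startup rather than on every call.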

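Project structure

IMG-REC
├── Core
│   ├── CameraMainWin.py                            #  main window logic
│   ├── ImageRecognition.py                         #  image recognition class
│   ├── ResultWid.py                                #  results page logic for multiple images
│   └── SingleResultWid.py                          #  results page logic for a single image
├── Images                                          #  image sets; this folder is opened by default
│   ├── dataset3                                    #  image set collected from the internet
│   └── myset                                       #  image set photographed by the authors
├── Models                                          #  trained models
│   ├── cifar100_ResNet20v1_model.155.h5            #  naming: dataset_network_epoch
│   └── cifar10_ResNet29v2_model.068.h5             #  model loaded by default
├── NetJupyterNotes
│   ├── cifar10_model.ipynb                         #  self-designed model script
│   └── Keras_Cifar_ResNet.ipynb                    #  ResNet example script from the Keras website
├── Qt_Ui                                           #  layout files generated by Qt Designer and layout scripts converted by py-uic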
│   ├── cameraWin.ui
│   ├── resultWid.ui
│   ├── singleResultWid.ui
│   ├── ui_cameraWin.py
│   ├── ui_resultWid.py
│   └── ui_singleResultWid.py
├── LICENSE
├── main.py
├── README.md
└── requirement.txt

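Authors and license

This project is released under the MIT license. The main program logic, packaging and release, and ResNet training were done by @Jyunmau. The design and training of the self-built model were done by @cagaha.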
