Coder Social home page Coder Social logo

letter_digit_classification's Introduction

In this section, as the IEEE Firat Computer Society Image Processing Team, We have learned image processing and deep learning. We used them in the project.

Letter and Digit Detection

The repository is basically about detection of letters and digits in an image with deep learning. We Learned how to do, and we applied what is learned.


The Team


Project Structure

The project have two steps.

  1. Step One
    • In this step, main purpose is to classify single letter or digit in the input image.
    • The model is trained in this step.

  1. Step Two
    • In this step, main purpose is detect multiple letter and digit in the input image.
    • The model, that was trained previous step, is used for classifying the detections


1) Step One: Classify Single letter in the image

This step is main structure of the project. Because, The model was trained in this step, and it uses step two.
First of all, The model was needed data for training. Data collection and preparing are time consuming process, and require different skills. For those reasons, we as a team have decided to use a ready dataset. We used balanced emnist data set. There are digits and letters in the dataset.

There are some problems in the dataset. If the dataset is checked, the problems can be seen.
Some of them;

  • Too smiler different labeled data
  • Incorrectly labeled data
  • Out of label images

Those problems affect the model accuracy in a bad way.
The model can be observed with a confusion matrix.

Prediction bar plots are helped us for more clear observation.

2) Step Two: Classify Multiple letter in an image

If we want to classify multiple areas in an image, we should find the areas. Because we want to use the model. We used image processing technique for find the areas. You can use just a model that is trained for this purpose without image proccesing techniques, but you cannot use only image classification model for this. For that reason, we used image processing techniques in this step.
Every detected area passing the model, and the model predict the input. For every area, a rectangle and the prediction letter is drawn.


The Model

We trained two model. One of them is using 1D convolution layers(1dmodel), other is using 2D convolution layers(2dmodel).

Accuracy of 1dmodel is worse then 2dmodel. Probobly, 1dmodel can preform better with better hyperparameter tuning. The project is using 2dmodel because of accuracy issue, and lesser image process steps.

You can find two of them in models folder.


How to use?

First of all, you should clone the repository. After that, you should install requirements. You can use the CLI commend.

pip install -r requirements.txt

Now you can run the project

Single letter or digit

python main.py -s <image_path>

Multiple letter or digit

python main.py -m <image_path>

Developed by the Computer Society Image Processing Team.

letter_digit_classification's People

Contributors

burakblm avatar cankarademir avatar umutguzel avatar vac10 avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.