
Proper Documentation about bhdd HOT 5 OPEN

baseresearch avatar baseresearch commented on June 26, 2024
Proper Documentation

from bhdd.

Comments (5)

hectorhui avatar hectorhui commented on June 26, 2024

Can you share your training results? This looks amazing. I am trying to build a Burmese digit recognition system (for research purposes only) with your data. I just want to compare results, and we can share some insights after I've finished training my model.


swanhtet1992 avatar swanhtet1992 commented on June 26, 2024

@hectorhui
We plan to add a benchmarks page to this repo. All of us are quite busy at the moment, so we can't get to it yet. You are welcome to contribute the benchmarks page.

A few people have built example apps and written articles using this data. You could probably reference their work.

We also plan to collect additional raw data for the test set. If you are interested, let me know.


If you use BHDD or parts of BHDD for research, please cite this repo, even though we haven't published our own paper on the data processing.


hectorhui avatar hectorhui commented on June 26, 2024

@swanhtet1992
Thank you for the resources. I am building a CNN for the BHDD dataset at the moment, and I will share my results and findings with you. May I ask how you split the train set and test set?
Across training and testing, I could get up to 99.8% accuracy, but this sounds too good to be true. I have not tested the model on any other handwritten digits yet. I will try that and let you know how it goes.


swanhtet1992 avatar swanhtet1992 commented on June 26, 2024

> May I ask how do you split the train set and test set?

We had 100+ contributors. We took the test data from approximately 30 different writers.
As I said in the comment above, we are planning to collect more data for the test set (and probably for the training set too).
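A writer-based split like the one described above could be sketched as follows. This is only an illustration, not the script actually used for BHDD: the `(writer_id, image, label)` sample layout and the `split_by_writer` helper are hypothetical, and the published dataset may not expose writer IDs at all.

```python
import random
from collections import defaultdict


def split_by_writer(samples, n_test_writers, seed=0):
    """Hold out every sample from a random subset of writers.

    `samples` is assumed to be a list of (writer_id, image, label) tuples.
    This layout is hypothetical and chosen only for illustration.
    """
    # Group samples by the person who wrote them.
    by_writer = defaultdict(list)
    for sample in samples:
        by_writer[sample[0]].append(sample)

    # Shuffle writers reproducibly and pick the held-out group.
    writers = sorted(by_writer)
    random.Random(seed).shuffle(writers)
    test_writers = set(writers[:n_test_writers])

    # No writer contributes to both sets, so the test set measures
    # generalisation to unseen handwriting.
    train = [s for w in writers if w not in test_writers for s in by_writer[w]]
    test = [s for w in writers if w in test_writers for s in by_writer[w]]
    return train, test


# Toy usage: 10 writers with 3 samples each, 3 writers held out.
toy = [(w, [0.0] * 4, d) for w in range(10) for d in range(3)]
train_set, test_set = split_by_writer(toy, n_test_writers=3)
```

Splitting by writer rather than by individual image avoids leaking a person's handwriting style from the training set into the test set.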

The weakness of this dataset is that almost all contributors are about the same age. We noticed that handwriting can be very similar within an age group. We are planning to add contributors under 10 years old and over 40 years old. If you know of sources where we can find contributors from those age groups, do let me know.

> From training result and testing result, i could get up to 99.8% accuracy, but anyway this sound too good to be true as well.

That depends on your architecture and how deep you go. Since you are using a CNN, it's kind of overkill for datasets like this. Even on MNIST, LeNet-5 could achieve a 0.8% test error rate. So it's normal to have such high accuracy.

Maybe you could try a linear classifier, KNN, and SVM too.
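As a rough idea of how small such a baseline can be, here is a minimal k-nearest-neighbours sketch in plain Python. It assumes each image has already been flattened into a list of pixel values; the toy vectors below are made up for illustration and are not BHDD data, and the `knn_predict` and `accuracy` helpers are hypothetical names.

```python
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Predict a label by majority vote among the k closest training vectors."""
    # Squared Euclidean distance is enough for ranking neighbours.
    distances = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, query)), y)
        for x, y in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


def accuracy(train_x, train_y, test_x, test_y, k=3):
    """Fraction of test vectors whose predicted label matches the true one."""
    hits = sum(
        knn_predict(train_x, train_y, x, k) == y for x, y in zip(test_x, test_y)
    )
    return hits / len(test_y)


# Made-up 2-D points standing in for flattened images of two classes.
toy_x = [[0, 0], [0, 1], [1, 0], [9, 9], [9, 8], [8, 9]]
toy_y = [0, 0, 0, 1, 1, 1]
toy_acc = accuracy(toy_x, toy_y, [[1, 1], [8, 8]], [0, 1])
```

On real image data a library implementation would be much faster, but a baseline like this helps judge how much a CNN actually adds.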


ThuraAung1601 avatar ThuraAung1601 commented on June 26, 2024

I think the pickle file is corrupted. I got this error while trying to load it to continue my experiments.
UnpicklingError Traceback (most recent call last)
in ()
5
6 with open("/content/BHDD/data.pkl",'rb') as file:
----> 7 dataset = pickle.load(file)
8
9 trainDataset = dataset["trainDataset"]

UnpicklingError: invalid load key, 'v'.
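One common cause of `invalid load key, 'v'` is that the file on disk is a Git LFS pointer (a small text file beginning with `version https://git-lfs.github.com/spec/v1`) rather than the real binary. Whether that is what happened here is an assumption; the thread does not confirm how `data.pkl` is stored. A minimal check, with the hypothetical helper names `looks_like_lfs_pointer` and `load_dataset`, might look like this:

```python
import pickle

# First line of every Git LFS pointer file.
LFS_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def looks_like_lfs_pointer(path):
    """Return True if the file starts with the Git LFS pointer header."""
    with open(path, "rb") as f:
        return f.read(len(LFS_PREFIX)) == LFS_PREFIX


def load_dataset(path):
    """Unpickle `path`, failing with a clearer message for LFS pointers."""
    if looks_like_lfs_pointer(path):
        raise RuntimeError(
            f"{path} looks like a Git LFS pointer, not the real data; "
            "try fetching it with `git lfs pull`."
        )
    with open(path, "rb") as f:
        return pickle.load(f)
```

If the check returns False and the error persists, the download itself may be truncated or damaged, and downloading the file again is the next thing to try.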

