ceciljoseph97 / caltech101histogramclassification

Image classification performed on histogram data

License: MIT License



Image Classification of the Caltech101 Dataset (Histogram Data)

Caltech-101 contains a total of 9,146 images, split between 101 distinct object categories (faces, watches, ants, pianos, etc.) and a background category.

Dataset Distribution

[Figure: number of images per category]

The Faces category has the most images (870), while the smallest category has only 31. This class imbalance is one of the major reasons deep neural networks and other general-purpose classifiers perform poorly on this dataset.
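As a rough illustration of how the imbalance can be quantified, the sketch below counts labels and reports the ratio between the largest and smallest class. Only the counts 870 and 31 come from the text above; the label list, the `mid_class` and `smallest_class` names, and the count of 100 are hypothetical stand-ins for the real class column.

```python
from collections import Counter

# Hypothetical labels: only the 870 (Faces) and 31 (smallest class) counts
# come from the README; "mid_class" and its size are made up for illustration.
labels = ["Faces"] * 870 + ["mid_class"] * 100 + ["smallest_class"] * 31

counts = Counter(labels)
largest, n_max = counts.most_common(1)[0]
smallest, n_min = min(counts.items(), key=lambda kv: kv[1])

# Ratio between the most and least populated classes.
imbalance_ratio = n_max / n_min
print(f"{largest}: {n_max}, {smallest}: {n_min}, ratio: {imbalance_ratio:.1f}")
```

A ratio this far from 1 is one reason to look at per-class metrics rather than overall accuracy alone.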

Run Strategies:

Options

  1. Run classifiers with a K-fold strategy (Training|Testing).
  2. Run classifiers with a Stratified K-fold strategy (Training|Testing).
  3. Run SVM with [3, 5, 10, 15, 20, 25, 30] images per class for training.
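The difference between options 1 and 2 can be seen on a toy example. The sketch below assumes scikit-learn is used for the splits (the notebook's actual splitting code is not shown here) and uses made-up labels with 90 samples of one class and 10 of another.

```python
import numpy as np
from sklearn.model_selection import KFold, StratifiedKFold

# Toy imbalanced labels (illustrative only): 90 samples of class 0, 10 of class 1.
y = np.array([0] * 90 + [1] * 10)
X = np.zeros((len(y), 1))  # feature values do not matter for inspecting the split


def minority_counts(splitter):
    """Return the number of class-1 samples that land in each test fold."""
    return [int(y[test_idx].sum()) for _, test_idx in splitter.split(X, y)]


plain = minority_counts(KFold(n_splits=5, shuffle=False))
stratified = minority_counts(StratifiedKFold(n_splits=5, shuffle=False))

print("KFold:          ", plain)
print("StratifiedKFold:", stratified)
```

Without shuffling, plain K-fold puts every minority sample in the last fold, while the stratified variant keeps the class proportions the same in every fold.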

For options 1 and 2, the following classifiers are used and their performance is observed:

  1. Multi-layer Perceptron (MLP) Classifier.
  2. SVM Classifier.
  3. Random Forest Classifier.
  4. KNN Classifier.
  5. Logistic Regression Classifier.
  6. LightGBM Classifier.
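A minimal sketch of such a comparison, assuming scikit-learn estimators with default-like settings and synthetic 80-dimensional data in place of the real features. The hyperparameters, the scaling step, and the data are assumptions, not the notebook's configuration. LightGBM is left out to keep the example dependency-light; its `LGBMClassifier` follows the same estimator interface and could be added to the dictionary.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the 80-dimensional edge histogram features.
X, y = make_classification(n_samples=300, n_features=80, n_informative=20,
                           n_classes=3, random_state=0)

classifiers = {
    "MLP": MLPClassifier(max_iter=500, random_state=0),
    "SVM": SVC(),
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "KNN": KNeighborsClassifier(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
}

# Stratified folds keep class proportions similar across splits.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

scores = {}
for name, clf in classifiers.items():
    # Scale features inside the pipeline so each fold is scaled on its own training part.
    model = make_pipeline(StandardScaler(), clf)
    scores[name] = cross_val_score(model, X, y, cv=cv).mean()

for name, acc in scores.items():
    print(f"{name}: {acc:.3f}")
```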

Dataset consideration

  1. Original Dataset
  2. Dataset after removing the BACKGROUND_Google class (noise class).
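Dropping the noise class amounts to a simple filter on the class label. The sketch below uses hypothetical `(image_id, class_name)` rows; in practice the rows would come from Images.csv, whose exact column layout is not shown in this README.

```python
# Hypothetical (image_id, class_name) rows standing in for Images.csv.
rows = [
    (1, "Faces"),
    (2, "BACKGROUND_Google"),
    (3, "watch"),
    (4, "BACKGROUND_Google"),
]

NOISE_CLASS = "BACKGROUND_Google"

# Keep every row whose class is not the background/noise class.
cleaned = [(image_id, name) for image_id, name in rows if name != NOISE_CLASS]
print(cleaned)
```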

Some General Observations

  1. The Training|Testing strategy has a high impact on classifier performance.
  2. random_state has a considerable impact on classifier performance.
  3. Training with [3, 5, 10, 15, 20, 25, 30, ...] images per class shows accuracy increasing with the number of training images, but beyond 30 images per class the classifier appears to overfit and accuracy flatlines. This is attributed to the imbalanced dataset: in the frequency plot, the smallest class has 31 images and the largest has 870, which considerably affects classifier performance.
  4. The SVM classifier shows good promise in terms of both computational cost and accuracy, which is the reason for choosing SVM over the others for the train-split scenario.
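The images-per-class experiment in observation 3 can be sketched as follows, assuming scikit-learn's `SVC` with default settings and synthetic data. The helper `split_per_class` is a hypothetical name introduced here, and the resulting accuracies say nothing about the real dataset.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC


def split_per_class(y, n_train, rng):
    """Mark n_train random samples of every class for training; the rest are for testing."""
    train_idx = []
    for label in np.unique(y):
        members = np.flatnonzero(y == label)
        train_idx.extend(rng.choice(members, size=n_train, replace=False))
    train_mask = np.zeros(len(y), dtype=bool)
    train_mask[train_idx] = True
    return train_mask


# Synthetic stand-in for the 80-dimensional features (4 classes, roughly 100 samples each).
X, y = make_classification(n_samples=400, n_features=80, n_informative=20,
                           n_classes=4, random_state=0)
rng = np.random.default_rng(0)

accuracy = {}
for n_train in [3, 5, 10, 15, 20, 25, 30]:
    mask = split_per_class(y, n_train, rng)
    model = SVC().fit(X[mask], y[mask])
    # Evaluate on everything that was not selected for training.
    accuracy[n_train] = model.score(X[~mask], y[~mask])

for n_train, acc in accuracy.items():
    print(f"{n_train:>2} images per class: accuracy {acc:.3f}")
```

Because the smallest class in the real dataset has only 31 images, 30 training images per class leaves a single test image for that class, which is worth keeping in mind when reading the accuracy curve.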

Requirements:

Images.csv: Contains the images which are represented by an image ID and the corresponding class.
EdgeHistogram.csv: Contains the feature data, Edge Histogram feature data for the images (Dimension of 80).
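One way the two files could be combined into a feature matrix and a label list is sketched below. The column names `ID` and `Class`, the comma delimiter, and the three-value feature rows are assumptions made for illustration (the real histogram has 80 values per image); the in-memory strings stand in for the actual files.

```python
import csv
import io

# In-memory stand-ins for Images.csv and EdgeHistogram.csv (assumed layout).
images_csv = io.StringIO("ID,Class\n1,Faces\n2,watch\n")
features_csv = io.StringIO("ID,f1,f2,f3\n2,0,4,1\n1,3,0,2\n")

# Map each image ID to its class label.
labels_by_id = {row["ID"]: row["Class"] for row in csv.DictReader(images_csv)}

X, y = [], []
for row in csv.DictReader(features_csv):
    image_id = row.pop("ID")  # the remaining columns are the histogram values
    X.append([float(value) for value in row.values()])
    y.append(labels_by_id[image_id])

print(X)
print(y)
```

With the real files, the `io.StringIO` objects would be replaced by `open("Images.csv", newline="")` and `open("EdgeHistogram.csv", newline="")`, adjusted to the actual header names and delimiter.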

Installation and Deployment

For a local installation, make sure you have pip installed and run:

pip install notebook

Note: Using a conda environment simplifies this setup and future environment setups.

conda create --name <envname> --file requirements.txt
conda activate <envname>

Install the necessary dependencies:

python -m pip install -r requirements.txt

In a local installation, launch with:

jupyter notebook

If using Google Colab, just run the cells.
