wii-herbarium-plant-classification-using-machine-learning's Introduction

WII Herbarium Plant Classification using Machine Learning

As the name says, this is a supervised machine learning classification project. It follows these steps:
1-> Data acquisition
2-> Data cleaning and preprocessing
3-> Feature extraction
4-> Model training and selection
5-> Evaluation/accuracy
6-> User-input based classification and occurrence

Dataset and Goal of the project

The dataset used here is the Wildlife Institute of India Herbarium Dataset, obtained through the GBIF network. The goal for now is to build a classification model.

Cleaning Raw Dataset

To get a clean dataset suited to this goal, I handled null values and removed duplicate records (this was done in Excel, so it is not shown in the code). To get better insight into the dataset, I then carried out exploratory data analysis (EDA) and cleaned the data accordingly, using statistical methods such as mean, mode, median and interpolation.
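The cleaning steps described above can be sketched with pandas. This is a minimal illustration and not the project's code: the column names (`family`, `state`, `elevation`) and the toy records are assumptions, and the choice of which statistic fills which column is only one plausible arrangement.

```python
import numpy as np
import pandas as pd

# Made-up records; the column names are hypothetical, not the real GBIF fields.
raw = pd.DataFrame({
    "family": ["Fabaceae", "Fabaceae", "Poaceae", "Poaceae", None, "Poaceae"],
    "state": ["Uttarakhand", "Uttarakhand", None, "Sikkim", "Sikkim", "Sikkim"],
    "elevation": [450.0, 450.0, np.nan, 1200.0, 900.0, 1500.0],
})


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Drop duplicates, drop unlabeled rows, and fill the remaining gaps."""
    out = df.drop_duplicates().reset_index(drop=True)
    # A record without a class label cannot be used for supervised training.
    out = out.dropna(subset=["family"]).reset_index(drop=True)
    # Categorical gap: fill with the most frequent value (mode).
    out["state"] = out["state"].fillna(out["state"].mode().iloc[0])
    # Numeric gap: linear interpolation between neighbours, with the median
    # as a fallback for gaps that interpolation cannot reach (leading rows).
    filled = out["elevation"].interpolate()
    out["elevation"] = filled.fillna(out["elevation"].median())
    return out


cleaned = clean(raw)
```

Interpolation only makes sense when neighbouring rows are related (for example, records sorted by date or location); for unordered records the mean or median is usually the safer fill.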

Analysis and Feature Extraction

Most of the variables were categorical string data, so I used the Dython library to analyse them, plotted the result as a heat map, and then extracted the features used for classification.
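For categorical columns, a heat map of this kind is typically built from a pairwise association measure such as Cramér's V. The sketch below computes an uncorrected Cramér's V by hand with pandas and NumPy so the idea is visible; it is a stand-in written for illustration, not Dython's implementation and not the project's code, and the column names and data are made up.

```python
import numpy as np
import pandas as pd


def cramers_v(x: pd.Series, y: pd.Series) -> float:
    """Uncorrected Cramér's V between two categorical series (0 = none, 1 = perfect)."""
    table = pd.crosstab(x, y).to_numpy(dtype=float)
    n = table.sum()
    # Expected counts under independence of the two variables.
    expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / n
    chi2 = ((table - expected) ** 2 / expected).sum()
    k = min(table.shape) - 1
    return float(np.sqrt(chi2 / (n * k))) if k > 0 else 0.0


def association_matrix(df: pd.DataFrame) -> pd.DataFrame:
    """Symmetric matrix of pairwise Cramér's V values, with 1.0 on the diagonal."""
    cols = list(df.columns)
    mat = pd.DataFrame(np.eye(len(cols)), index=cols, columns=cols)
    for i, a in enumerate(cols):
        for b in cols[i + 1:]:
            mat.loc[a, b] = mat.loc[b, a] = cramers_v(df[a], df[b])
    return mat


# Hypothetical columns: "genus" tracks "family" exactly, "collector" does not.
toy = pd.DataFrame({
    "family": ["A", "A", "B", "B", "C", "C"],
    "genus": ["a1", "a1", "b1", "b1", "c1", "c1"],
    "collector": ["x", "y", "x", "y", "x", "y"],
})
mat = association_matrix(toy)
```

The resulting matrix can be passed to a plotting function such as seaborn's `heatmap`. Columns that are strongly associated with the target are candidate features; columns that are near-duplicates of each other are candidates for dropping.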

Model and Classifier

Because this is a classification model, it falls under supervised machine learning. The classifiers used here are Random Forest, Decision Tree, SVM, KNN and Logistic Regression. Before the classifiers were fitted, all the extracted categorical features were encoded so that the algorithms could work with them.
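Encoding followed by fitting the five classifiers can be sketched with scikit-learn as follows. The README does not say which encoding was used, so one-hot encoding is an assumption here, as are the column names, the toy records, the split ratio and every hyperparameter.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Made-up records; "genus", "state" and "family" are hypothetical column names.
rows = []
for family, genera, states in [
    ("Fabaceae", ["Acacia", "Bauhinia"], ["Uttarakhand", "Sikkim"]),
    ("Poaceae", ["Poa", "Bambusa"], ["Sikkim", "Assam"]),
    ("Rosaceae", ["Rosa", "Rubus"], ["Himachal Pradesh", "Uttarakhand"]),
]:
    for genus in genera:
        for state in states:
            rows.extend([{"genus": genus, "state": state, "family": family}] * 5)
data = pd.DataFrame(rows)

X, y = data[["genus", "state"]], data["family"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

classifiers = {
    "Random Forest": RandomForestClassifier(random_state=0),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "SVM": SVC(),
    "KNN": KNeighborsClassifier(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
}


def fit_all(models, X_fit, y_fit):
    """Fit each classifier behind its own one-hot encoder."""
    fitted = {}
    for name, clf in models.items():
        # The encoder sits inside the pipeline so it is learned from the
        # training split only; unseen categories are encoded as all zeros.
        pipe = make_pipeline(OneHotEncoder(handle_unknown="ignore"), clf)
        fitted[name] = pipe.fit(X_fit, y_fit)
    return fitted


fitted = fit_all(classifiers, X_train, y_train)
predictions = {name: pipe.predict(X_test) for name, pipe in fitted.items()}
```

Keeping the encoder inside each pipeline avoids leaking information from the test split into the encoding step.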

Evaluation and more results

The models were evaluated with metrics such as accuracy, recall and F1 score. Of the five classifiers, the one with high accuracy that did not appear to be overfitting was the decision tree, so the decision tree based model gives the best result. Other insights were also derived from the dataset through visualization.
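The three metrics named above can be computed with scikit-learn as in the sketch below. The labels are invented, macro averaging is an assumption (the README does not state how per-class scores were combined), and the `max_gap` threshold in the overfitting check is an arbitrary illustrative value.

```python
from sklearn.metrics import accuracy_score, f1_score, recall_score


def summarize(y_true, y_pred):
    """Accuracy plus macro-averaged recall and F1 for a multi-class problem."""
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "recall": recall_score(y_true, y_pred, average="macro", zero_division=0),
        "f1": f1_score(y_true, y_pred, average="macro", zero_division=0),
    }


def looks_overfit(train_accuracy, test_accuracy, max_gap=0.10):
    """Flag a model whose training accuracy exceeds test accuracy by more than max_gap."""
    return (train_accuracy - test_accuracy) > max_gap


# Invented labels: one Poaceae specimen is misclassified as Rosaceae.
y_true = ["Fabaceae", "Fabaceae", "Poaceae", "Poaceae", "Rosaceae", "Rosaceae"]
y_pred = ["Fabaceae", "Fabaceae", "Poaceae", "Rosaceae", "Rosaceae", "Rosaceae"]
report = summarize(y_true, y_pred)
```

Accuracy alone can hide poor performance on rare classes, which is why recall and F1 are reported alongside it. Comparing the score on the training split with the score on held-out data is a simple first check for overfitting.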

User-Input based classification and occurrence

As the final result, classification is done on user input. The occurrence (state/province and locality) is also extracted on the basis of the predicted class.
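A minimal sketch of this step: classify one user-supplied record, then look up where specimens of the predicted class were recorded. The input features (`genus`, `habitat`), the target (`family`) and the records are all hypothetical; `stateProvince` and `locality` are used as column names on the assumption that the dataset follows the usual GBIF field naming.

```python
import pandas as pd
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder
from sklearn.tree import DecisionTreeClassifier

FEATURES = ["genus", "habitat"]  # hypothetical input features
TARGET = "family"                # hypothetical class label

# Made-up specimen records.
records = pd.DataFrame({
    "genus": ["Acacia", "Acacia", "Poa", "Poa", "Rosa"],
    "habitat": ["forest", "scrub", "grassland", "grassland", "forest"],
    "family": ["Fabaceae", "Fabaceae", "Poaceae", "Poaceae", "Rosaceae"],
    "stateProvince": ["Uttarakhand", "Rajasthan", "Sikkim", "Sikkim", "Himachal Pradesh"],
    "locality": ["Dehradun", "Jodhpur", "Gangtok", "Gangtok", "Shimla"],
})

model = make_pipeline(
    OneHotEncoder(handle_unknown="ignore"),
    DecisionTreeClassifier(random_state=0),
).fit(records[FEATURES], records[TARGET])


def classify_and_locate(user_input: dict):
    """Return the predicted class and the distinct places where it occurs."""
    query = pd.DataFrame([user_input], columns=FEATURES)
    predicted = str(model.predict(query)[0])
    places = (
        records.loc[records[TARGET] == predicted, ["stateProvince", "locality"]]
        .drop_duplicates()
        .itertuples(index=False, name=None)
    )
    return predicted, list(places)


label, places = classify_and_locate({"genus": "Poa", "habitat": "grassland"})
```

The occurrence lookup is a plain filter on the cleaned records, so it reports where the predicted class was collected in the dataset, not where it might occur in general.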

Future work

There is scope for improvement in the model: I think it is overfitting, because the Random Forest based model achieves around 100% accuracy. The model could also be deployed in future, since the Herbarium dataset holds scientific value for research.
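One way to investigate the suspected overfitting is to compare training accuracy with cross-validated accuracy, and to see how a more constrained forest behaves. The sketch below uses synthetic data from scikit-learn's `make_classification` in place of the herbarium features; the hyperparameter values are illustrative assumptions, not tuned settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data with some label noise (flip_y), not the real dataset.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=4, flip_y=0.2, random_state=0
)


def train_and_cv_accuracy(model):
    """Return (accuracy on the data the model was fitted on, mean 5-fold CV accuracy)."""
    cv_accuracy = cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    train_accuracy = model.fit(X, y).score(X, y)
    return train_accuracy, cv_accuracy


# Fully grown trees can memorize the training data.
full_train, full_cv = train_and_cv_accuracy(RandomForestClassifier(random_state=0))

# Shallower trees with larger leaves trade training accuracy for a smaller gap.
small_train, small_cv = train_and_cv_accuracy(
    RandomForestClassifier(max_depth=3, min_samples_leaf=5, random_state=0)
)
```

A large gap between the training score and the cross-validated score points to memorization. If near-100% accuracy persists under cross-validation, it is also worth checking whether one of the features encodes the target directly (data leakage).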

wii-herbarium-plant-classification-using-machine-learning's Issues

Overfitting of the model

The accuracy of the model with the Random Forest algorithm is around 100%, which is a sign of overfitting. This needs to be fixed.
