sampling-assignment-102017017's Introduction

Sampling

by Amrita Bhatia (Roll no. 102017017)

Sampling is a statistical technique used to select a representative subset of a population. An ideal sample has all the characteristics of the population it is derived from. While an ideal sample is not feasible to derive, different sampling techniques can be used to obtain samples close to an ideal sample.

This project applies various sampling techniques to different machine learning models and evaluates their performance using the accuracy metric.

Sampling techniques used:

S. No. Technique
S1 Simple random sampling (without replacement)
S2 Simple random sampling (with replacement)
S3 Systematic sampling
S4 Stratified sampling
S5 Cluster sampling
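
As a rough illustration of how the five techniques differ, the sketch below implements each one with Python's standard `random` module on a toy population. It is not the project's code: the helper names (`simple_random`, `systematic`, `stratified`, `cluster`), the toy rows, and the `label` and `cluster` fields are all assumptions made for this example.

```python
import random
from collections import defaultdict

def simple_random(rows, n, rng, replace=False):
    """S1 (replace=False) or S2 (replace=True)."""
    if replace:
        return [rng.choice(rows) for _ in range(n)]
    return rng.sample(rows, n)

def systematic(rows, n, rng):
    """S3: every k-th row after a random start, with k = len(rows) // n."""
    k = len(rows) // n
    start = rng.randrange(k)
    return rows[start::k][:n]

def stratified(rows, n, rng, key):
    """S4: sample each stratum in proportion to its share of the population."""
    strata = defaultdict(list)
    for row in rows:
        strata[key(row)].append(row)
    sample = []
    for members in strata.values():
        share = round(n * len(members) / len(rows))
        sample.extend(rng.sample(members, min(share, len(members))))
    return sample

def cluster(rows, n_clusters, rng, key):
    """S5: pick whole clusters at random and keep every row inside them."""
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row)
    chosen = rng.sample(sorted(groups), n_clusters)
    return [row for c in chosen for row in groups[c]]

# Hypothetical population: 100 rows, a binary label, and cluster ids 0..9.
rng = random.Random(42)
population = [{"id": i, "label": i % 2, "cluster": i // 10} for i in range(100)]

s1 = simple_random(population, 20, rng)
s2 = simple_random(population, 20, rng, replace=True)
s3 = systematic(population, 20, rng)
s4 = stratified(population, 20, rng, key=lambda r: r["label"])
s5 = cluster(population, 2, rng, key=lambda r: r["cluster"])
```

Note that cluster sampling keeps or drops whole groups, so the sample only resembles the population if each cluster does.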

Models used:

S. No. Model
M1 Logistic regression
M2 Naive Bayes classifier
M3 Support vector classifier
M4 Decision tree
M5 Random forest

Methodology

  • The data was highly imbalanced, with the majority class making up 98.8% of the records. SMOTE (Synthetic Minority Over-sampling Technique) was therefore used to balance the class distribution.
  • The five sampling techniques were then applied to select samples from the data.
  • These samples were then used to train the machine learning models. The models were tested on the remaining data not included in the sample.
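
The three steps above can be sketched end to end as follows. This is a minimal illustration under several assumptions, not the project's pipeline: `smote_like` is a simplified pure-Python version of the SMOTE interpolation idea rather than a library implementation, the nearest-centroid classifier is a stand-in for models M1 to M5, and the two-dimensional toy data is invented for the example.

```python
import random

def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def smote_like(minority, n_new, k, rng):
    """Create n_new synthetic points, each interpolated between a minority
    point and one of its k nearest minority neighbours (the SMOTE idea)."""
    synthetic = []
    for _ in range(n_new):
        p = rng.choice(minority)
        neighbours = sorted((q for q in minority if q is not p),
                            key=lambda q: sq_dist(p, q))[:k]
        q = rng.choice(neighbours)
        t = rng.random()
        synthetic.append(tuple(a + t * (b - a) for a, b in zip(p, q)))
    return synthetic

def fit(train):
    """Stand-in model: store one centroid per class."""
    by_class = {}
    for x, y in train:
        by_class.setdefault(y, []).append(x)
    return {y: tuple(sum(c) / len(xs) for c in zip(*xs))
            for y, xs in by_class.items()}

def predict(model, x):
    """Predict the class whose centroid is closest to x."""
    return min(model, key=lambda y: sq_dist(x, model[y]))

def accuracy(model, test):
    return sum(predict(model, x) == y for x, y in test) / len(test)

rng = random.Random(0)
# Hypothetical imbalanced data: 200 majority points, 10 minority points.
majority = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
minority = [(rng.gauss(4, 1), rng.gauss(4, 1)) for _ in range(10)]

# Step 1: oversample the minority class until the classes are balanced.
synthetic = smote_like(minority, len(majority) - len(minority), k=3, rng=rng)
data = [(x, 0) for x in majority] + [(x, 1) for x in minority + synthetic]

# Step 2: draw a sample (here: simple random sampling without replacement).
sample_idx = set(rng.sample(range(len(data)), 100))
train = [data[i] for i in range(len(data)) if i in sample_idx]

# Step 3: train on the sample, test on the rows left out of it.
test = [data[i] for i in range(len(data)) if i not in sample_idx]
model = fit(train)
acc = accuracy(model, test)
```

Sampling by index rather than by value keeps the train and test sets disjoint even if two rows happen to be identical.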

Results

The accuracies of the models for each sampling technique are as follows:

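Technique  logreg    nb        svc       dt        rf
S1         0.818182  0.792208  0.935065  0.948052  1.000000
S2         0.917866  0.897544  0.911092  0.970364  0.990686
S3         0.906557  0.796721  0.906557  0.947541  0.991803
S4         0.922162  0.752432  0.915676  0.966486  0.992432
S5         0.458849  0.541151  0.490168  0.446468  0.448653

Column abbreviations: logreg = logistic regression (M1), nb = naive Bayes classifier (M2), svc = support vector classifier (M3), dt = decision tree (M4), rf = random forest (M5).
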
  • The highest accuracy (100%) is achieved by the random forest model with simple random sampling without replacement (S1).
  • The lowest accuracy (44.65%) is produced by the decision tree with cluster sampling (S5). All five models score between 44.6% and 54.1% under cluster sampling, well below their results with the other four techniques.
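
The best and worst combinations can be read off the results table programmatically. The snippet below is a small sketch that copies the reported accuracies into a dictionary; the variable names are chosen for this example only.

```python
# Accuracies copied from the results table (rows: techniques, columns: models).
accuracies = {
    "S1": {"logreg": 0.818182, "nb": 0.792208, "svc": 0.935065, "dt": 0.948052, "rf": 1.000000},
    "S2": {"logreg": 0.917866, "nb": 0.897544, "svc": 0.911092, "dt": 0.970364, "rf": 0.990686},
    "S3": {"logreg": 0.906557, "nb": 0.796721, "svc": 0.906557, "dt": 0.947541, "rf": 0.991803},
    "S4": {"logreg": 0.922162, "nb": 0.752432, "svc": 0.915676, "dt": 0.966486, "rf": 0.992432},
    "S5": {"logreg": 0.458849, "nb": 0.541151, "svc": 0.490168, "dt": 0.446468, "rf": 0.448653},
}

# Flatten to (accuracy, technique, model) so max/min compare on accuracy first.
cells = [(acc, tech, model)
         for tech, row in accuracies.items()
         for model, acc in row.items()]

best_acc, best_tech, best_model = max(cells)
worst_acc, worst_tech, worst_model = min(cells)

# Best-scoring model for each sampling technique.
best_per_technique = {tech: max(row, key=row.get) for tech, row in accuracies.items()}

print(f"best:  {best_model} with {best_tech} ({best_acc:.2%})")
print(f"worst: {worst_model} with {worst_tech} ({worst_acc:.2%})")
print(best_per_technique)
```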
