1136623363 / advanced-statistics-hypothesis-analysis

This project forked from kavitha-kothandaraman/advanced-statistics-hypothesis-analysis

To perform basic exploratory data analysis (EDA), compute descriptive statistics, and run hypothesis tests on the given datasets.

Jupyter Notebook 100.00%

Advanced-Statistics-Hypothesis-Analysis

Objective

To perform basic exploratory data analysis (EDA), compute descriptive statistics, and run hypothesis tests on the given datasets.

About Data

The insurance.csv dataset contains 1338 observations and 7 attributes.
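
The stated dimensions can be checked right after loading. The following is a minimal, dependency-free sketch using only the standard library; the helper names `read_rows` and `shape` are hypothetical, the column order is taken from the attribute list in this README, and the two sample rows are made-up values standing in for insurance.csv. In a notebook the same check would usually be a pandas `read_csv` call followed by `.shape`.

```python
import csv
import io

# Column names as listed under "Attributes" in this README.
EXPECTED_COLUMNS = ["age", "sex", "bmi", "children", "smoker", "region", "charges"]


def read_rows(handle):
    """Read CSV rows from an open text handle into a list of dicts."""
    return list(csv.DictReader(handle))


def shape(rows):
    """Return (number of observations, number of attributes)."""
    n_cols = len(rows[0]) if rows else 0
    return len(rows), n_cols


# Tiny in-memory sample with made-up values (not real data from insurance.csv).
sample = io.StringIO(
    "age,sex,bmi,children,smoker,region,charges\n"
    "30,female,22.5,0,no,northeast,4000.0\n"
    "45,male,31.2,2,yes,southwest,21000.0\n"
)
rows = read_rows(sample)
print(shape(rows))                        # (2, 7)
print(list(rows[0]) == EXPECTED_COLUMNS)  # True
```

For the real file, `with open("insurance.csv", newline="") as f: rows = read_rows(f)` should yield a shape of (1338, 7) if the description above holds.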

Context

The data contains medical costs of people characterized by certain attributes. Let’s see if we can dive deep into this data to find some valuable insights.

Attributes

age: age of the primary beneficiary
sex: gender of the insurance contractor (female, male)
bmi: body mass index, an objective index of body weight relative to height, computed as weight divided by height squared (kg/m^2); it indicates whether weight is relatively high or low for a given height, ideally 18.5 to 24.9
children: number of children covered by health insurance (number of dependents)
smoker: smoking status of the beneficiary
region: the beneficiary's residential area in the US (northeast, southeast, southwest, northwest)
charges: individual medical costs billed by health insurance
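
To make the bmi definition concrete, here is a small sketch of the calculation (weight in kilograms divided by the square of height in metres) together with a check against the 18.5 to 24.9 range mentioned above. The function names and the example weight and height are illustrative assumptions; the dataset itself already ships bmi as a column and contains no weight or height fields.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index in kg/m^2: weight divided by height squared."""
    if height_m <= 0:
        raise ValueError("height_m must be positive")
    return weight_kg / height_m ** 2


def in_ideal_range(value: float, low: float = 18.5, high: float = 24.9) -> bool:
    """True if a BMI value lies within the range this README calls ideal."""
    return low <= value <= high


# Hypothetical person: 70 kg, 1.75 m tall.
value = bmi(70.0, 1.75)
print(round(value, 2))        # 22.86
print(in_ideal_range(value))  # True
```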

Tasks to perform

1. Import the necessary libraries
2. Read the data as a data frame
3. Perform basic EDA, which should include the following, and print out your insights at every step.
  a. Shape of the data
  b. Data type of each attribute
  c. Checking the presence of missing values
  d. 5 point summary of numerical attributes
  e. Distribution of ‘bmi’, ‘age’ and ‘charges’ columns
  f. Measure of skewness of ‘bmi’, ‘age’ and ‘charges’ columns
  g. Checking the presence of outliers in ‘bmi’, ‘age’ and ‘charges’ columns
  h. Distribution of categorical columns (include children)
  i. Pair plot that includes all the columns of the data frame
4. Answer the following questions with statistical evidence
  a. Do charges of people who smoke differ significantly from those of people who don't?
  b. Does bmi of males differ significantly from that of females?
  c. Is the proportion of smokers significantly different in different genders?
  d. Is the distribution of bmi across women with no children, one child and two children the same?
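
The numeric parts of task 3 (the 5 point summary, skewness and outlier checks) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the notebook's actual code: the function names are hypothetical, the `charges` list is made-up data, quartiles use the "inclusive" method of `statistics.quantiles`, skewness is the plain population moment ratio, and outliers follow the common 1.5 x IQR rule. Library routines such as pandas `Series.skew` apply a bias correction, so their values may differ slightly.

```python
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using inclusive quartiles."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


def skewness(values):
    """Population skewness m3 / m2**1.5 (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5


def iqr_outliers(values, k=1.5):
    """Values lying more than k * IQR below Q1 or above Q3."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x for x in values if x < low or x > high]


# Made-up charges with one extreme value (not taken from insurance.csv).
charges = [1200, 1500, 1800, 2100, 2500, 3000, 3600, 4200, 48000]

print(five_number_summary(charges))  # min, Q1, median, Q3, max
print(skewness(charges) > 0)         # True: long right tail
print(iqr_outliers(charges))         # [48000]
```

A positive skewness together with outliers only on the high side is the pattern one would look for in a right-skewed cost column; whether the real ‘charges’ column behaves this way has to be confirmed on the data.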
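
The questions in task 4 map onto standard tests. The sketch below is one possible choice, not necessarily what the notebook uses: a Welch two-sample t statistic for 4a and 4b (with a large-sample normal approximation for the p-value instead of the t distribution), a Pearson chi-square test on a 2x2 table for 4c (no continuity correction), and a one-way ANOVA F statistic for 4d (the p-value is left to a statistics library such as `scipy.stats.f_oneway`). All function names and all sample numbers are hypothetical.

```python
import math
from statistics import NormalDist, mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic and a two-sided p-value from a normal approximation."""
    na, nb = len(sample_a), len(sample_b)
    se = math.sqrt(variance(sample_a) / na + variance(sample_b) / nb)
    t = (mean(sample_a) - mean(sample_b)) / se
    p = 2 * (1 - NormalDist().cdf(abs(t)))  # reasonable only for large samples
    return t, p


def chi2_2x2(table):
    """Pearson chi-square statistic and p-value (1 degree of freedom) for a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, observed_row in enumerate(table):
        for j, observed in enumerate(observed_row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    # For 1 degree of freedom, P(chi2 > x) equals erfc(sqrt(x / 2)).
    return stat, math.erfc(math.sqrt(stat / 2))


def anova_f(*groups):
    """One-way ANOVA F statistic with (between, within) degrees of freedom."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k)), k - 1, n - k


# Made-up numbers for illustration only (not results from insurance.csv).
smoker_charges = [32000, 28000, 41000, 36000, 30000]
nonsmoker_charges = [8000, 9500, 7000, 12000, 6500]
print(welch_t(smoker_charges, nonsmoker_charges))

# Rows: female, male; columns: smoker, non-smoker (hypothetical counts).
print(chi2_2x2([[10, 20], [20, 10]]))

# bmi of women with 0, 1 and 2 children (hypothetical values).
print(anova_f([27.1, 30.2, 25.4], [29.0, 31.5, 26.8], [28.3, 33.0, 30.1]))
```

In each case the null hypothesis is "no difference" (equal means, independent proportions, equal group means), and a p-value below the chosen significance level, commonly 0.05, would count as statistical evidence against it.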

Contributors

kavitha-kothandaraman

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.