
cv2-mod3-sec20-hypothesis-testing-lesson's Introduction

Questions

  • How do we decide which test to use?
  • What is the workflow like?
  • What about non-normal populations?

Hypothesis -> Frequentist Approach

Bayesian Approach

Objectives

YWBAT (you will be able to)

  • apply hypothesis testing to groups
  • check that a test's criteria are met

Scenarios

  • Medical Research
  • Sports - does height affect release point from the plate?
  • Carbon pricing - does putting a price on carbon affect emissions?
  • Serving a landing page
  • Ad campaigns - which ads drive clicks
  • Insurance - does this population present more/less of a risk

Outline

import pandas as pd
import numpy as np

import scipy.stats as scs

import matplotlib.pyplot as plt
mu0 = 54.0
population1 = np.random.randint(10, 100, 2000)
population2 = np.random.randint(20, 80, 2000)
# how can we compare the means of these populations?
population1.mean(), population2.mean()
(54.8145, 48.396)
# sampling distributions
means1 = []
means2 = []

for i in range(30):
    means1.append(np.random.choice(population1, size=50, replace=False).mean())
    means2.append(np.random.choice(population2, size=50, replace=False).mean())
    
    
# based on the CLT, the sample means (means1, means2) are approximately normally distributed
# step 1: pick your test
# step 2: do we meet the criteria of the test?
# test for equal variances: LEVENE TEST
# h0: var1 = var2
# ha: var1 != var2

scs.levene(means1, means2)

# p = 0.06 > 0.05 -> fail to reject null: no evidence the variances differ, so we treat them as equal
LeveneResult(statistic=3.587887243695722, pvalue=0.06319220302449272)
# Which test do we use?
# pick our test: ttest_ind
# what are the assumptions:
# a, b have to be normal
# need to check for equal_variances


# h0: mu1 = mu2
# ha: mu1 != mu2

scs.ttest_ind(means1, means2, equal_var=True)


# pvalue ~ 4e-10 (effectively 0) -> reject the null, so the means are different
Ttest_indResult(statistic=7.51778742021142, pvalue=3.951152287595308e-10)
np.mean(means1), np.mean(means2)
(53.97533333333333, 47.931333333333335)
# Shapiro test
# h0: x is normal
# ha: x is not normal
scs.shapiro(means1), scs.shapiro(means2)

# each result is (statistic, pvalue)
# large pvalues (0.71, 0.92) -> fail to reject null -> no evidence against normality
((0.9760878086090088, 0.7147558331489563),
 (0.984048068523407, 0.9198938608169556))
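# h0: mu1 = mu2
# ha: mu1 != mu2
# both samples are drawn from population1, so h0 is true here and we expect a large pvalue
# note: ttest_rel is a paired test, so it pairs the two arrays element by element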
scs.ttest_rel(np.random.choice(population1, size=30), np.random.choice(population1, size=30))
Ttest_relResult(statistic=-0.20328351760585, pvalue=0.8403331042360073)
# set up your null/alternative hypothesis
# get normal data through sampling distribution(s)
# pick test to run
# meet assumptions/requirements
# run test
# make conclusion
# dig deeper
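The workflow steps above can be sketched as one small function. This is a minimal sketch, not the lesson's own code: the helper names `sample_means` and `compare_means`, the `alpha=0.05` cutoff, and the fixed seed are assumptions added for illustration. It switches `equal_var` off when the Levene test rejects, in which case `ttest_ind` runs Welch's t-test.

```python
import numpy as np
import scipy.stats as scs

rng = np.random.default_rng(0)  # assumption: fixed seed so the run is repeatable

population1 = rng.integers(10, 100, 2000)
population2 = rng.integers(20, 80, 2000)


def sample_means(population, n_samples=30, sample_size=50):
    """Hypothetical helper: build a sampling distribution of the mean."""
    return np.array([
        rng.choice(population, size=sample_size, replace=False).mean()
        for _ in range(n_samples)
    ])


def compare_means(a, b, alpha=0.05):
    """Hypothetical helper: check assumptions, then run a two-sample t-test.

    h0: mean(a) = mean(b)
    ha: mean(a) != mean(b)
    """
    # normality check on each group (h0: the data is normal)
    _, p_normal_a = scs.shapiro(a)
    _, p_normal_b = scs.shapiro(b)

    # equal-variance check (h0: var(a) = var(b))
    _, p_levene = scs.levene(a, b)
    equal_var = bool(p_levene > alpha)

    # equal_var=False makes ttest_ind run Welch's t-test
    t_stat, p_value = scs.ttest_ind(a, b, equal_var=equal_var)

    return {
        "normal_a": bool(p_normal_a > alpha),
        "normal_b": bool(p_normal_b > alpha),
        "equal_var": equal_var,
        "statistic": float(t_stat),
        "pvalue": float(p_value),
        "reject_null": bool(p_value < alpha),
    }


means1 = sample_means(population1)
means2 = sample_means(population2)
result = compare_means(means1, means2)
print(result)
```

If either normality flag comes back False, the t-test result should be read with caution; that is the "dig deeper" step.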

ttest_1samp

  • When

    • Checking whether a population statistic (e.g. the mean) equals a given number
      • comparing an array to a number
  • Assumptions

    • a known population mean to compare against
    • normality -> shapiro test
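As a hedged illustration of the one-sample case, the sketch below compares an array of sample means against the fixed value `mu0 = 54.0` defined earlier in the lesson. The seed and the regenerated population are assumptions made so the block runs on its own.

```python
import numpy as np
import scipy.stats as scs

rng = np.random.default_rng(1)  # assumption: fixed seed for repeatability

mu0 = 54.0  # the number we compare the array against
population1 = rng.integers(10, 100, 2000)

# sampling distribution of the mean: 30 samples of size 50
means1 = np.array([
    rng.choice(population1, size=50, replace=False).mean()
    for _ in range(30)
])

# assumption check: normality (h0: means1 is normal)
_, p_normal = scs.shapiro(means1)

# h0: mean(means1) = mu0
# ha: mean(means1) != mu0
t_stat, p_value = scs.ttest_1samp(means1, popmean=mu0)

print(f"shapiro p = {p_normal:.3f}")
print(f"t = {t_stat:.3f}, p = {p_value:.3f}")
```

A p-value below the chosen significance level would lead us to reject h0 and conclude the mean differs from `mu0`.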

ttest_ind

  • When

    • Comparing 2 populations (arrays)
  • Assumptions

    • normality -> shapiro test
    • equal variance -> levene test

Testing for multiple groups (>2)
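The lesson does not say which test it uses for more than two groups. One common choice is a one-way ANOVA via `scs.f_oneway`, sketched below under that assumption. The three groups, their means and spread, and the seed are made up for illustration.

```python
import numpy as np
import scipy.stats as scs

rng = np.random.default_rng(2)  # assumption: fixed seed for repeatability

# hypothetical data: three normal groups, the third with a shifted mean
groups = [rng.normal(loc=m, scale=5.0, size=30) for m in (50.0, 50.0, 60.0)]

# assumption checks, as for ttest_ind: normality per group, then equal variances
normality_pvalues = [scs.shapiro(g)[1] for g in groups]
_, p_levene = scs.levene(*groups)

# h0: all group means are equal
# ha: at least one group mean differs
f_stat, p_value = scs.f_oneway(*groups)

print("shapiro pvalues:", normality_pvalues)
print(f"levene p = {p_levene:.3f}")
print(f"F = {f_stat:.3f}, p = {p_value:.3g}")
```

Rejecting h0 here only says that some mean differs; finding which one takes a follow-up comparison between pairs of groups.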

Assessment
