dsc-0-09-09-distributions-cdf-lab-ds-onboarding's Introduction

The Cumulative Distribution Function (CDF) - Lab

Introduction

In the previous lesson, we saw how a discrete random variable can model a fair die, with uniform probabilities for all possible values. In this lab, we shall calculate a cdf for this variable and visualize it for inspection.

Objectives

You will be able to:

  • Calculate the cdf for a given discrete random variable
  • Visualize the cdf using matplotlib

Calculating the cdf in Python

Recall the formula for calculating the cumulative probability from the previous lesson, written here with x as the random variable and X as the value of interest:

F(X) = P(x ≤ X)

So, given a list of all possible values of x, we can easily calculate the cdf for a given possible value (X) by performing the following steps:

  • Build a function calculate_cdf(lst,X), where lst is a list of all possible values in a discrete variable x (6 values for a die roll), and X is the value for which we want to calculate the cumulative distribution function.
  • Initialize a count variable
  • For each value in lst, if the value is less than or equal to X, add one to count; otherwise do nothing. (This gives the total number of values less than or equal to X, which is what the expected output of 0.667 for the test data below requires.)
  • Calculate the cumulative probability of X by dividing count by the total number of possible values
  • Round the result to 3 decimal places and return the cumulative probability of X.
def calculate_cdf(lst, X):
    
    pass

# test data
test_lst = [1,2,3]
test_X = 2

calculate_cdf(test_lst, test_X)

# 0.667
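
One possible way to fill in the stub above is sketched here. It assumes every value in lst is equally likely (as for a fair die) and counts values less than or equal to X, which is what the expected test output of 0.667 implies. Treat it as an illustration, not the lab's official solution.

```python
def calculate_cdf(lst, X):
    """Return P(x <= X) for a discrete variable whose equally likely values are in lst."""
    count = 0
    for value in lst:
        # Count every possible value that does not exceed X
        if value <= X:
            count += 1
    # Cumulative probability = favourable values / all possible values
    cum_prob = count / len(lst)
    return round(cum_prob, 3)


# test data
test_lst = [1, 2, 3]
test_X = 2

print(calculate_cdf(test_lst, test_X))  # 0.667
```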

Let's now use the above function to calculate the cdf for each value of a die roll, with the intention of plotting it.

Perform the following steps in the cell below:

  • Create a list die_lst with all possible values of a fair die
  • Initialize an empty list die_cum for storing cumulative probabilities for these values.
  • For each value in die_lst, calculate its cumulative probability using the function above and store it in the die_cum list.
die_lst = None
die_cum = None

die_cum

# [0.167, 0.333, 0.5, 0.667, 0.833, 1.0]
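
The steps above could be carried out roughly as follows. The compact calculate_cdf here is a stand-in written for this sketch (again assuming equally likely values and a "less than or equal to" comparison), so that the example runs on its own.

```python
def calculate_cdf(lst, X):
    # Fraction of values in lst that are <= X, rounded to 3 decimal places
    count = sum(1 for value in lst if value <= X)
    return round(count / len(lst), 3)


# All possible outcomes of a fair six-sided die
die_lst = [1, 2, 3, 4, 5, 6]

# Cumulative probability for each outcome, in the same order as die_lst
die_cum = []
for value in die_lst:
    die_cum.append(calculate_cdf(die_lst, value))

print(die_cum)  # [0.167, 0.333, 0.5, 0.667, 0.833, 1.0]
```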

A cdf can be implemented with two sorted lists: xs, which contains the values, and ps, which contains the cumulative probabilities for xs. Here, die_lst plays the role of xs and die_cum the role of ps.

Following this, we now have a list of possible values and a second list containing the cumulative probability for each value. Let's go ahead and plot these values in matplotlib using a stem plot.

  • Use die_lst for x-axis and die_cum for y-axis
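
A minimal sketch of such a stem plot is shown here. The y-axis label matches the notebook output in this lab; the title and x-axis label are assumptions made for illustration, and the two lists are hard-coded so the example runs on its own.

```python
import matplotlib.pyplot as plt

# Values and cumulative probabilities from the previous step
die_lst = [1, 2, 3, 4, 5, 6]
die_cum = [0.167, 0.333, 0.5, 0.667, 0.833, 1.0]

fig, ax = plt.subplots()
# One vertical stem per die value, with height equal to its cumulative probability
ax.stem(die_lst, die_cum)
ax.set_title("Die Roll - Cumulative Distribution Function")  # assumed title
ax.set_xlabel("Die values")  # assumed label
ax.set_ylabel("Cumulative Probabilities")
# In a notebook the figure renders automatically; in a script, call plt.show()
```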
[Figure: stem plot of die_cum against die_lst, with the y-axis labeled "Cumulative Probabilities"]

Level Up (optional)

cdfs (and pmfs) can be calculated using built-in numpy and matplotlib methods, so we don't have to create custom functions to calculate them. We can draw a histogram-styled cdf as shown below using the following methods.

You would need to perform these steps

[Figure: histogram-styled cdf of the die roll]
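
The original steps for this section are not reproduced on this page. As one possible approach, and not necessarily the one the lab intended, the sketch below uses np.histogram with np.cumsum to compute the cdf and the cumulative option of matplotlib's hist to draw it. The bin edges and the step-style histogram are choices made for this example.

```python
import numpy as np
import matplotlib.pyplot as plt

die_lst = [1, 2, 3, 4, 5, 6]

# Bin edges centered on each die face: 0.5, 1.5, ..., 6.5
bins = np.arange(0.5, 7.5, 1)

# Relative frequency per face (the pmf), then its running total (the cdf)
counts, _ = np.histogram(die_lst, bins=bins)
pmf = counts / counts.sum()
cdf = np.cumsum(pmf)
print(np.round(cdf, 3))

# Histogram-styled cdf: cumulative=True accumulates the normalized bin heights
fig, ax = plt.subplots()
ax.hist(die_lst, bins=bins, density=True, cumulative=True, histtype="step")
ax.set_xlabel("Die values")  # assumed label
ax.set_ylabel("Cumulative Probabilities")
# In a notebook the figure renders automatically; in a script, call plt.show()
```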

Summary

In this lesson, we developed a cdf (a percentile function) for a discrete random variable, and we looked at how to calculate and visualize it. This technique can also be applied to continuous random variables, which we shall see later in this section.

dsc-0-09-09-distributions-cdf-lab-ds-onboarding's People

Contributors

loredirick, shakeelraja
