rkucar / dsc-cumulative-distribution-function-lab-online-ds-sp-000

This project forked from learn-co-students/dsc-cumulative-distribution-function-lab-online-ds-sp-000

License: Other

The Cumulative Distribution Function - Lab

Introduction

In the previous lesson, you learned how you can create a cumulative distribution function for discrete and continuous random variables. In this lab, you'll try to calculate a CDF for a dice roll yourself, and visualize it.

Objectives

You will be able to:

  • Calculate CDF in Python for a given discrete variable with a limited set of possible values
  • Visualize and inspect a CDF in order to make assumptions about the underlying data

Calculating CDF in Python

Recall the formula to calculate the cumulative probability from the previous lesson:

$$\Large F(x)= P(X \leq x)$$
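
As a worked case, assume a discrete variable whose $n$ possible values are all equally likely (true of a fair die, which is the case this lab works with). Under that assumption, the formula reduces to counting:

$$\Large F(x) = \frac{\text{number of possible values} \leq x}{n}$$

With possible values 1, 2, 3 and $x = 2$, two of the three values are at most 2, so $F(2) = 2/3 \approx 0.667$.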

Given a list of all possible values of a discrete variable, each equally likely, we can calculate the CDF at a given value $X$ with the following steps:

  • Build a function calculate_cdf(lst,X), where lst is a list of all possible values in a discrete variable $x$ (6 values for a dice roll), and $X$ is the value for which we want to calculate the cumulative distribution function
  • Initialize a variable called count to 0
  • For all values in lst, if a value is less than or equal to $X$, add one to count; do nothing otherwise. (This gives the total number of values less than or equal to $X$.)
  • Calculate the cumulative probability of $X$ by dividing count by the total number of possible values
  • Round to 3 decimal places and return the cumulative probability of $X$
def calculate_cdf(lst, X):
    
    pass

# test data
test_lst = [1,2,3]
test_X = 2

calculate_cdf(test_lst, test_X)

# 0.667
0.667
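
The steps above can be sketched as follows. This is one possible solution rather than the lab's official one, and it assumes every value in lst is equally likely, as with a fair die.

```python
def calculate_cdf(lst, X):
    """Return P(value <= X) for equally likely values in lst, rounded to 3 decimals."""
    count = 0
    for value in lst:
        # Count every possible value that is less than or equal to X
        if value <= X:
            count += 1
    # Cumulative probability = matching count / total number of possible values
    return round(count / len(lst), 3)


# Same test data as in the cell above
test_lst = [1, 2, 3]
test_X = 2
print(calculate_cdf(test_lst, test_X))  # 0.667
```

Because the comparison is `<=`, the largest value in lst always maps to 1.0, and any X below the smallest value maps to 0.0.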

Now, use this function to calculate a CDF for each value in a dice roll so you can plot it later on.

Perform the following steps in the cell below:

  • Create a list dice_lst with all possible values of a fair die
  • Initialize an empty list dice_cum for storing the cumulative probabilities of these values
  • For each value in dice_lst, calculate its cumulative probability using the function above and store it in the dice_cum list
dice_lst = None
dice_cum = None

dice_cum

# [0.167, 0.333, 0.5, 0.667, 0.833, 1.0]
[0.167, 0.333, 0.5, 0.667, 0.833, 1.0]
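
A minimal sketch of these steps, assuming a six-sided fair die. The helper is repeated in compact form so that the block runs on its own; it is one possible implementation of calculate_cdf, not necessarily the intended one.

```python
def calculate_cdf(lst, X):
    # Share of equally likely values in lst that are less than or equal to X
    count = sum(1 for value in lst if value <= X)
    return round(count / len(lst), 3)


# All possible outcomes of a fair six-sided die
dice_lst = [1, 2, 3, 4, 5, 6]

# Cumulative probability for each outcome, in the same order as dice_lst
dice_cum = []
for value in dice_lst:
    dice_cum.append(calculate_cdf(dice_lst, value))

print(dice_cum)  # [0.167, 0.333, 0.5, 0.667, 0.833, 1.0]
```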

CDFs are implemented with two sorted lists: one list which contains the potential outcome values of your discrete distribution, and another list which contains cumulative probabilities.

Following this, we now have a list of possible values and a second list containing cumulative probabilities for each value. Let's go ahead and plot these values in matplotlib using a bar plot.

  • Use dice_lst for x-axis and dice_cum for y-axis
# Your code here
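
One way to draw this plot is sketched below. The title, axis labels, and bar width are illustrative choices, not taken from the lab, and the two lists are written out so that the block runs on its own.

```python
import matplotlib.pyplot as plt

# Possible outcomes and their cumulative probabilities (values from the step above)
dice_lst = [1, 2, 3, 4, 5, 6]
dice_cum = [0.167, 0.333, 0.5, 0.667, 0.833, 1.0]

fig, ax = plt.subplots()
# One bar per outcome; bar height is the cumulative probability
bars = ax.bar(dice_lst, dice_cum, width=0.3)
ax.set_title("Dice Roll - Cumulative Distribution Function")  # assumed title
ax.set_xlabel("Dice values")                                   # assumed label
ax.set_ylabel("Cumulative probability")                        # assumed label
plt.show()
```

The bars should rise in equal steps of about 1/6 and end at 1.0, which is what a uniform distribution over six outcomes looks like in a CDF.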

[Figure: bar plot of the cumulative probabilities in dice_cum for each value in dice_lst]

Level Up (optional)

CDFs (and PMFs) can be calculated using built-in NumPy and matplotlib methods, so we don't have to create custom functions to calculate them. We can draw a histogram-styled CDF like the one shown below.

# Your code here
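
The lab's own steps for this part are not listed here, so the following is only one possible approach. It assumes a sample in which each face of a fair die appears exactly once, uses np.histogram and np.cumsum for the numbers, and uses the density and cumulative options of matplotlib's hist for the plot. The bin edges and axis labels are illustrative choices.

```python
import numpy as np
import matplotlib.pyplot as plt

# Assumed sample: each face of a fair die observed exactly once
dice_lst = [1, 2, 3, 4, 5, 6]

# Bin edges centred on the integer faces: 0.5, 1.5, ..., 6.5
bin_edges = np.arange(1, 8) - 0.5

# NumPy route: counts per bin -> PMF -> CDF as a running sum of the PMF
counts, _ = np.histogram(dice_lst, bins=bin_edges)
pmf = counts / counts.sum()
cdf = np.cumsum(pmf)
print(np.round(cdf, 3))

# matplotlib route: density=True with cumulative=True scales the last bin to 1
fig, ax = plt.subplots()
heights, _, _ = ax.hist(dice_lst, bins=bin_edges, density=True,
                        cumulative=True, rwidth=0.8)
ax.set_xlabel("Dice values")             # assumed label
ax.set_ylabel("Cumulative probability")  # assumed label
plt.show()
```

Both routes should agree: the bar heights returned by hist match the cumulative sum of the PMF computed with NumPy.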

[Figure: histogram-styled CDF of the dice roll]

Summary

In this lab, we developed a CDF for a discrete random variable: a function that gives the probability that the variable takes a value less than or equal to a given value. We looked at how to calculate and visualize a CDF. This technique can also be applied to continuous random variables, which we shall see later in this section.
