Now that you understand the basics of sets, you'll learn how this knowledge can be used to calculate your first probabilities! In this section, you'll learn how to use sets to create probabilities and you'll learn about the foundations of probability through the three probability axioms.
You will be able to:
- Compare experiments, outcomes, and the event space
- Calculate probabilities by using relative frequency of outcomes to event space
- Describe the three axioms of probability
- Describe the addition law of probability
For the following examples, we will consider what happens when throwing a single 6-sided die ("die" is the singular of "dice").
When you throw a die once, you can consider this a random experiment. The result of this "experiment" is the outcome. So, for example, the outcome could be a 2.
An event is the outcome of a particular random experiment. So, for example, "rolling the die and getting a 5" is an event.
The sample space represents the universe of all possible outcomes. With our die rolling example, that space includes the values of 1, 2, 3, 4, 5, and 6.
The event space is a subset of the sample space. You can think of it as the collection of outcomes we "care about" out of all possible outcomes. For example, our event space could be "rolling a number higher than a 4", which would include the values of 5 and 6.
As you may have noticed, we used the term "subset" to describe event spaces. Sets are a very useful way to represent the probability concepts described above. Let's expand on that now.
Let's define the sample space for rolling a single die as the set $S$. $S$ looks like this:
$$S = \{1, 2, 3, 4, 5, 6\}$$
You can then say that:
Other examples of sample spaces:
- Number of text messages sent in a day: in this case, each outcome in $S$ is some number $x$, with $x$ being a non-negative integer
- Mathematically, $x$ being an integer looks like $x \in \mathbb{Z}$. $\mathbb{Z}$ is a "special" set, containing all integers.
- $x$ being non-negative looks like $x \geq 0$
- To represent the set of $x$ values that meet both requirements, we'll use an additional way of defining a set, called "set builder" notation. In set builder notation, the vertical bar means "such that", and the conditions are separated by commas
- Our overall definition of $S$ is $S = \{x \mid x \in \mathbb{Z}, x \geq 0\}$. In other words, $S$ contains all instances of $x$ such that $x$ is an integer and $x$ is greater than or equal to zero.
- Number of hours watched in a day: in this case, let's say that $x$ is a real number between 0 and 24
- Mathematically, $x$ being a real number (roughly equivalent to a floating point value, although there are subtle differences) looks like $x \in \mathbb{R}$. $\mathbb{R}$ is another special set, the set of all real numbers.
- $x$ being between 0 and 24 looks like $0 \leq x \leq 24$
- Putting that all together, we get $S = \{x \mid x \in \mathbb{R}, 0 \leq x \leq 24\}$. In other words, $S$ contains all instances of $x$ such that $x$ is a real number and $x$ is between 0 and 24.
Let's define the event space as $E$. As noted previously, $E \subseteq S$, i.e. $E$ is a subset of $S$.
An example of $E$ could be "rolling a number higher than 4". This would be written $E = \{5, 6\}$.
Or if $E$ were "rolling an odd number", that would be written $E = \{1, 3, 5\}$.
Once $E$ is defined, we can say that event $E$ happened if the actual outcome after rolling the die belongs to the predefined event space $E$.
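The die definitions above map directly onto Python's built-in `set` type. The following is a minimal sketch; the names `sample_space`, `event_space`, and `event_happened` are illustrative choices, not part of the lesson.

```python
# Sample space for one roll of a 6-sided die
sample_space = {1, 2, 3, 4, 5, 6}

# Event space: "rolling a number higher than 4"
event_space = {x for x in sample_space if x > 4}


def event_happened(outcome, event_space):
    """Return True if the observed outcome belongs to the event space."""
    return outcome in event_space


# The event space is a subset of the sample space
print(event_space <= sample_space)

# Rolling a 5 means the event happened; rolling a 2 means it did not
print(event_happened(5, event_space))
print(event_happened(2, event_space))
```

The `<=` operator on sets is Python's subset test, so it plays the role of the subset relation between the event space and the sample space.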
Other examples of event spaces based on previously defined sample spaces:
- Text messages sent in a day: we can define this as 20 or fewer text messages
- The event space still includes only non-negative integers, but this time there is also an upper bound of 20
- $E = \{x \mid x \in \mathbb{Z}, 0 \leq x \leq 20\}$. In other words, $E$ contains all instances of $x$ such that $x$ is an integer between 0 and 20.
- Hours watched in a day: we can define this as 6 or more hours watched
- The event space still includes only real numbers up to 24, but now the lower bound is 6 rather than 0
- $E = \{x \mid x \in \mathbb{R}, 6 \leq x \leq 24\}$. In other words, $E$ contains all instances of $x$ such that $x$ is a real number between 6 and 24.
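Sets like these cannot be listed element by element (the non-negative integers are infinite, and so are the real numbers between 0 and 24), so one way to sketch set builder notation in Python is as a membership test: a function that returns `True` when a value satisfies every condition after the vertical bar. The function names below are hypothetical, and Python floats only approximate real numbers.

```python
def in_messages_sample_space(x):
    """x is an integer and x >= 0."""
    return isinstance(x, int) and x >= 0


def in_messages_event_space(x):
    """x is in the sample space and x <= 20 (20 or fewer messages)."""
    return in_messages_sample_space(x) and x <= 20


def in_hours_sample_space(x):
    """x is a real number (int or float here) with 0 <= x <= 24."""
    return isinstance(x, (int, float)) and 0 <= x <= 24


def in_hours_event_space(x):
    """x is in the sample space and x >= 6 (6 or more hours)."""
    return in_hours_sample_space(x) and x >= 6


print(in_messages_event_space(12))   # 12 messages: inside the event space
print(in_messages_event_space(35))   # 35 messages: valid outcome, outside the event space
print(in_hours_event_space(7.5))     # 7.5 hours: inside the event space
```

Each event-space test reuses the sample-space test, which mirrors the fact that an event space is always a subset of its sample space.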
Once you understand sample spaces and event spaces, you understand the foundational concepts of probability.
As an experiment is repeated more and more times, the relative frequency with which an event happens settles toward a fixed number.
Let's denote an event by $E$, and the probability of the event occurring by $P(E)$. Next, let $n$ be the number of conducted experiments, and $n_E$ the count of "successful" experiments (i.e. the times that event $E$ happened). The formal definition of probability as a relative frequency is given by:
$$P(E) = \lim_{n \rightarrow \infty} \frac{n_E}{n}$$
In other words, the probability of the event ($P(E)$) equals the limit as the number of experiments $n$ goes to infinity of the count of successful experiments $n_E$ divided by the number of experiments $n$.
This is the basis of a frequentist statistical interpretation: an event's probability is the ratio of the positive trials to the total number of trials as we repeat the process infinitely.
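We cannot run infinitely many experiments, but a simulation can suggest how the relative frequency behaves as the number of trials grows. The sketch below assumes a fair die (simulated with `random.Random.randint`) and uses a fixed seed so runs are repeatable; the function name `relative_frequency` is illustrative.

```python
import random


def relative_frequency(event_space, n, seed=0):
    """Roll a simulated fair die n times and return successes / n."""
    rng = random.Random(seed)  # fixed seed: the same rolls on every run
    successes = sum(1 for _ in range(n) if rng.randint(1, 6) in event_space)
    return successes / n


# Event space: "rolling a number higher than 4"
event_space = {5, 6}

# The ratio should drift toward 2/6 as the number of rolls increases
for n in (10, 1_000, 100_000):
    print(n, relative_frequency(event_space, n))
```

With only a handful of rolls the ratio can be far from 1/3; with many rolls it typically lands close to it, which is the behavior the limit describes.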
In the early 20th century, Kolmogorov and Von Mises came up with three axioms that further expand on the idea of probability. The three axioms are:
- A probability is always greater than or equal to 0, or $P(E) \geq 0$
- If the event of interest is the sample space ($E = S$), we say that the outcome is a certain event, or $P(S) = 1$
- Additivity: the probability of the union of two exclusive events is equal to the sum of the probabilities of the individual events happening. If $A \cap B = \emptyset$, then $P(A \cup B) = P(A) + P(B)$
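As a quick sanity check, the three axioms can be verified for a fair die, where the probability of an event space is the number of outcomes it contains divided by the size of the sample space. This is a sketch under that fair-die assumption; the helper `prob` and the example events `a` and `b` are made up for illustration.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}


def prob(event_space):
    """P(E) for a fair die: outcomes in E divided by all possible outcomes."""
    return Fraction(len(event_space & sample_space), len(sample_space))


a = {1, 2}  # illustrative event: rolling a 1 or a 2
b = {5, 6}  # illustrative event: rolling a 5 or a 6

# Axiom 1: a probability is never negative
assert prob(a) >= 0

# Axiom 2: the sample space is a certain event
assert prob(sample_space) == 1

# Axiom 3: a and b share no outcomes, so their probabilities add
assert a & b == set()
assert prob(a | b) == prob(a) + prob(b)

print(prob(a), prob(b), prob(a | b))
```

Using `Fraction` rather than floats keeps the arithmetic exact, so the equality checks cannot fail because of rounding.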
Remember that the inclusion-exclusion principle states that:
$$|A \cup B| = |A| + |B| - |A \cap B|$$
If we know that $A \cap B = \emptyset$ (that there is no intersection between $A$ and $B$, so the set formed by their intersection is the empty set $\emptyset$), then we can skip that part of the formula, so the cardinality of $A \cup B$ is simply:
$$|A \cup B| = |A| + |B|$$
The same logic works for the probability of events in two event space sets. If the events are exclusive (they never happen at the same time, so the intersection between them is empty), you can simply add the two probabilities together.
The additivity axiom is great, but most of the time events are not exclusive. Then we need to bring in the rest of the inclusion-exclusion principle (subtracting the intersection), which is referred to as the addition law of probability or the sum rule when we are talking about probabilities:
$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$
Put in words, the probability that $A$ or $B$ will happen is the sum of the probabilities that $A$ will happen and that $B$ will happen, minus the probability that both $A$ and $B$ will happen.
Let's revisit the die example to illustrate these rules:
Let's consider two events: event $A$ means throwing a 6, event $B$ means that you throw an odd number ($B = \{1, 3, 5\}$). These events are exclusive, so you can use the additivity rule if you want to know the answer to the question:
"what is the probability that your outcome will be a 6, or an odd number?"
$$P(A \cup B) = P(A) + P(B) = \frac{1}{6} + \frac{3}{6} = \frac{4}{6} = \frac{2}{3}$$
There is a 2/3 probability that the outcome will be a 6 or an odd number.
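Now, let's consider the same event $B$ (rolling an odd number, $B = \{1, 3, 5\}$) and another event $C$ that contains three outcomes, one of which is 5 (for instance, $C = \{4, 5, 6\}$). These events are not mutually exclusive, so if you want to know the probability that $B$ or $C$ will happen, you need to use the addition law of probability.
Note that $B \cap C$ is equal to getting an outcome of 5, as that is the "common" element in the respective event spaces of $B$ and $C$. This means that
$$P(B \cup C) = P(B) + P(C) - P(B \cap C) = \frac{3}{6} + \frac{3}{6} - \frac{1}{6} = \frac{5}{6}$$
There is a 5/6 probability that the outcome will be in $B$ or $C$.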
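Both die calculations can be checked with a few lines of Python. This is a sketch assuming a fair die; `prob` and `prob_union` are illustrative helper names, and the set `{4, 5, 6}` for the second event is an assumption: it is one choice that shares only the outcome 5 with the odd numbers and gives the stated 5/6.

```python
from fractions import Fraction

SAMPLE_SPACE = {1, 2, 3, 4, 5, 6}


def prob(event_space):
    """P(E) for a fair die: outcomes in E divided by all possible outcomes."""
    return Fraction(len(event_space), len(SAMPLE_SPACE))


def prob_union(a, b):
    """Addition law: P(A or B) = P(A) + P(B) - P(A and B)."""
    return prob(a) + prob(b) - prob(a & b)


six = {6}           # throwing a 6
odd = {1, 3, 5}     # throwing an odd number
high = {4, 5, 6}    # ASSUMED second event; overlaps with odd only at 5

# Exclusive events: the intersection is empty, so nothing is subtracted
print(prob_union(six, odd))   # 2/3

# Overlapping events: the shared outcome 5 is subtracted once
print(prob_union(odd, high))  # 5/6
```

Because `prob_union` always subtracts the intersection, it covers the exclusive case too: when the two sets share no outcomes, the subtracted term is simply 0.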
In the previous examples, you saw that for our die example, these fairly straightforward probability formulas make it easy to calculate the probabilities of certain outcomes.
However, if you think about our text message example, things are less straightforward, e.g.:
"What is the probability of sending fewer than 20 text messages in a day?"
This is where the probability concepts introduced here fall short. The probability of throwing any particular number between 1 and 6 with a die is always exactly $\frac{1}{6}$, but we can't simply count the outcomes in our messages event space. In other words, the probability of sending 20 messages is likely different from the probability of sending, say, 5 messages, and may differ for every number of messages sent. You'll learn about tools to solve problems like these later on.
Well done! In this section, you learned how to use sets to get to probabilities. You learned about experiments, event spaces, and outcomes. Next, you learned about the law of relative frequency and how it can be used to calculate probabilities, along with the three probability axioms.