Birthday Paradox by Example – it is not a Paradox

The Birthday Paradox is presented as follows.

…in a random group of 23 people, there is about a 50 percent chance that two people have the same birthday

Birthday Paradox

This is also referred to as the Birthday Problem in probability theory.

First question: What is a paradox?

…is a logically self-contradictory statement or a statement that runs contrary to one’s expectation

Wikipedia

What does that mean? A logically self-contradictory statement would mean that there is a contradiction somewhere in the Birthday Paradox. This is not the case.

Whether it is a statement that runs contrary to one’s expectation is open for discussion. As we will see by example in this post, it is not contrary to the expectation of an informed person.

Step 1: Run some examples

The assumption is that we have 23 random people. This further assumes that the birthday of each of these people is random, with each of the 365 days of the year being equally likely.

To check the claim, let’s simulate it in Python.

import random

stat = {'Collision': 0, 'No-collision': 0}

for _ in range(10000):
    days = []
    for _ in range(23):
        day = random.randint(1, 365)  # randint includes both ends, so this gives 365 possible days
        days.append(day)

    if len(days) == len(set(days)):
        stat['No-collision'] += 1
    else:
        stat['Collision'] += 1

print("Probability for at least 2 with same birthday in a group of 23")
print("P(A) =", stat['Collision']/(stat['Collision'] + stat['No-collision']))

This will output different results from run to run, but something around 0.507.

Probability for at least 2 with same birthday in a group of 23
P(A) = 0.5026

A few comments on the code. It records how often a choice of 23 random birthdays ends with at least two of them on the same day. We run the experiment 10,000 times so that the result is not just down to luck.

The check if len(days) == len(set(days)) tests whether all the birthdays are different. The function set(…) keeps only the unique days in the list. Hence, if two people share the same day of the year, the set is shorter than the list, and the len (length) of the two will differ.
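To see the list-versus-set comparison in isolation, here is a minimal sketch. The helper name has_collision and the sample days are made up for illustration; they are not part of the simulation above.

```python
def has_collision(days):
    """Return True if at least one day occurs more than once in the list."""
    # A set keeps only unique values, so it is shorter than the list
    # exactly when some day is repeated.
    return len(days) != len(set(days))


print(has_collision([12, 200, 364]))  # all days different -> False
print(has_collision([12, 200, 12]))   # day 12 appears twice -> True
```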

Step 2: The probability theory behind it

This is where it becomes a bit more technical. The simulation above shows that the claim holds: if we take a group of 23 random people, then with a probability of about 50%, two of them will have the same birthday.

Is this contrary to one’s expectations? Hence, is it a paradox?

Before we answer that, let’s see if we can nail the probability theory behind this.

Let’s do it step by step.

If we have 1 person, what is the probability that anyone in this group of 1 person has the same birthday? Yes, it sounds strange. The probability is obviously 0.

If we have 2 people, what is the probability that they have the same birthday? The second person needs to be born on the same day as the first. Hence, the probability becomes 1/365.

How do you write that as an equation?

What we often do in probability theory is calculate the opposite probability.

Hence, we calculate the probability of not having two identical birthdays in a group. This is easier to calculate. In the first case, we have all possibilities open.

P(1) = 1

Here P(1) is the probability that, in a group of one person, nobody shares a birthday with anyone else in the group.

P(2) = 1 x (364 / 365)

This is because the first birthday can be any day, while the second has only 364 of the 365 possible birthdays left.

This continues.

P(n) = 1 x (364 / 365) x (363 / 365) x … x ((365 - n + 1) / 365)

This makes the probability that no two of 23 random people have the same birthday as follows.

P(23) = 1 x (364 / 365) x (363 / 365) x … x (343 / 365) = 0.493

Or calculated in Python.

def prop(n):
    if n == 1:
        return 1
    else:
        return (365 - n + 1) / 365 * prop(n - 1)

print("Probability for no-one with same birthday in a group of 23")
print("P(A') =",  prop(23))

Which results in.

Probability for no-one with same birthday in a group of 23
P(A') = 0.4927027656760144

This formula can be rewritten (see Wikipedia), but for our purpose the above is fine.
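One common rewriting is the closed form P(n) = 365! / (365^n x (365 - n)!). As a rough sketch, it can be computed with math.perm from the standard library (Python 3.8 or newer). The function name no_collision_closed_form is my own choice for this illustration.

```python
import math

DAYS = 365  # assumes no leap day and equally likely birthdays


def no_collision_closed_form(n):
    """Probability that n people all have different birthdays."""
    # math.perm(DAYS, n) equals DAYS! / (DAYS - n)!, the number of ordered
    # ways to give n people n different days. DAYS ** n counts all ways.
    return math.perm(DAYS, n) / DAYS ** n


print(no_collision_closed_form(23))
```

This should agree with the step-by-step product, giving roughly 0.4927 for a group of 23.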

The probability we look for is given by.

P(A) = 1 - P(A')
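To tie the theory back to the experiment, here is a small sketch that puts the simulated and the calculated value of P(A) side by side. The function names simulate and theory are hypothetical, and the simulation assumes 365 equally likely days.

```python
import random


def simulate(n, trials=10000, days=365):
    """Estimate P(A) by drawing n random birthdays, trials times."""
    collisions = 0
    for _ in range(trials):
        birthdays = [random.randrange(days) for _ in range(n)]
        # A repeated day makes the set shorter than the list.
        if len(birthdays) != len(set(birthdays)):
            collisions += 1
    return collisions / trials


def theory(n, days=365):
    """P(A) = 1 - P(A'), with P(A') built up as a product."""
    no_collision = 1.0
    for k in range(n):
        no_collision *= (days - k) / days
    return 1 - no_collision


print("Simulated  P(A) =", simulate(23))
print("Calculated P(A) =", theory(23))
```

The simulated number changes from run to run, but it should land close to the calculated one.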

Step 3: Make a graph of how likely a collision is based on a group size

This is great news. We can now calculate the theoretical probability of two people having the same birthday in a group of n random people.

This can be achieved by the following code.

from matplotlib import pyplot as plt

def prop(n):
    if n == 1:
        return 1
    else:
        return (365 - n + 1) / 365 * prop(n - 1)

X = []
Y = []
for i in range(1, 90):
    X.append(i)
    Y.append(1 - prop(i))

plt.scatter(X, Y, color='blue')
plt.xlabel("Number of people")
plt.ylabel("Probability of collision")
plt.axis([0, 90, 0, 1])
plt.show()

Which results in the following plot.

Here you can see that at about 23 people, we have a 50% chance of having a pair with the same birthday (called a collision).
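If you prefer a number over reading it off a plot, a short loop can find the smallest group size where the chance of a collision reaches 50%. This is only a sketch; the function name collision_probability is made up for this example.

```python
def collision_probability(n, days=365):
    """Probability that at least two of n people share a birthday."""
    no_collision = 1.0
    for k in range(n):
        # Person k + 1 must avoid the k days that are already taken.
        no_collision *= (days - k) / days
    return 1 - no_collision


# Increase the group size until the probability reaches 50%.
n = 1
while collision_probability(n) < 0.5:
    n += 1

print("Smallest group size:", n)
print("P(A) =", collision_probability(n))
```

The loop stops at 23, where the probability is about 0.507.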

Conclusion

Is it a paradox? Well, there is no magic in it. You can see the above are just simple calculations. But is the following contrary to one’s expectation?

6 weeks are 6*7*24*60*60 seconds = 3,628,800 seconds.

And 10! = 10*9*8*7*6*5*4*3*2*1 = 3,628,800.

Well, the first time you calculate it, it might be. But does that make it a paradox?

No, it is just a surprising fact the first time you see it. Does it mean that seconds are related to factorials? No, of course not. It is just a strange coincidence.

The same goes for the Birthday Paradox: it is just surprising the first time you see it.

It seems surprising to people that you only need 23 people to have a 50% chance of a pair with the same birthday, but it is not a paradox for people who work with numbers.
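The coincidence is easy to check yourself. A minimal sketch, where the variable name seconds_in_six_weeks is my own:

```python
import math

# 6 weeks x 7 days x 24 hours x 60 minutes x 60 seconds
seconds_in_six_weeks = 6 * 7 * 24 * 60 * 60

print(seconds_in_six_weeks)                        # 3628800
print(math.factorial(10))                          # 3628800
print(seconds_in_six_weeks == math.factorial(10))  # True
```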
