Vitalii Tymchyshyn and Andrii Khlevniuk posted a new paper here entitled “On the average number of birthdays in birthday paradox setup.

” This paper gives a different perspective on the birthday problem and a new proof.

In the usual formation of the birthday problem, one asks the probability that everyone in a group of size n has a different birthday, or one asks how big n needs to be for the probability to be approximately 1/2 that two people share a birthday.

Tymchyshyn and Khlevniuk ask about the expected number of unique birthdays in a group of size n and show that it equals (1 – αn) / (1 – α) where α = 364/365.

Variations on the birthday problem often come up in practice.

For example, you often need to consider how likely it is that a hash function will map set of inputs to unique values.

There’s a rule of thumb that if there are N possible hash values, then there’s a 50-50 chance of a collision if you hash √N values.

Let α = (N-1)/N) and p = 1/N.

If we take the first three terms in the Taylor series for (1 – p)n we see that the expected number of unique hash values after hashing n inputs is approximately n – n(n – 1) p/2.

If we choose n = √N, then the expected number of unique hash values is approximately n – 1/2, or in other words, there’s about a 0.

5 probability of a collision.

Related posts Serious applications of a party trick Random sample overlap.