Vitalii Tymchyshyn and Andrii Khlevniuk posted a new paper here entitled “On the average number of birthdays in birthday paradox setup.
” This paper gives a different perspective on the birthday problem and a new proof.
In the usual formation of the birthday problem, one asks the probability that everyone in a group of size n has a different birthday, or one asks how big n needs to be for the probability to be approximately 1/2 that two people share a birthday.
Tymchyshyn and Khlevniuk ask about the expected number of unique birthdays in a group of size n and show that it equals (1 – αn) / (1 – α) where α = 364/365.
Variations on the birthday problem often come up in practice.
For example, you often need to consider how likely it is that a hash function will map set of inputs to unique values.
There’s a rule of thumb that if there are N possible hash values, then there’s a 50-50 chance of a collision if you hash √N values.
Let α = (N-1)/N) and p = 1/N.
If we take the first three terms in the Taylor series for (1 – p)n we see that the expected number of unique hash values after hashing n inputs is approximately n – n(n – 1) p/2.
If we choose n = √N, then the expected number of unique hash values is approximately n – 1/2, or in other words, there’s about a 0.
5 probability of a collision.
Related posts Serious applications of a party trick Random sample overlap.