The Poisson Distribution

Also, as per condition 4, the number of trials should be sufficiently larger than the number of successes. This is violated here as well, because we have 147 trials (the number of years in the data set) and successes on the order of ~1000 or more (the total number of goals per year). Logically, too, more matches in a year means more total goals in that year, which violates condition 3.

Based on the above, we can expect that option 2 (total number of goals in 1 day), although closer to a Poisson distribution than option 1, still won't be one: more matches in a day means more goals, which violates condition 3 that the rate at which events occur must be constant. Let's visualize this for option 2.

2. k is total number of goals and interval is 1 day

So, even though the number of successes is fairly low compared to the number of trials (condition 4 satisfied), the rate at which events occur is not constant and depends on the number of matches played that day. Therefore, we reject option 2 as a Poisson distribution as well. Let's finally explore option 3.

3. k is total number of goals and interval is 1 match

Eureka! We have a constant rate of goals per match, with a peak at around 3 goals and a mean of 2.935642 goals per match. The number of goals scored (‘the event’ being a goal being scored) is an integer, one goal is independent of another, and the number of matches (trials) is far higher than the number of goals (successes) per match. Therefore, we have found our Poisson distribution!

Probability of events for a Poisson Distribution

Now that we have our Poisson distribution, we can calculate the probability of k events happening in an interval using the following:

P(k events in an interval) = e^{-λ} * λ^{k} / k!

where λ is the mean number of events per interval (here, the mean number of goals per match), k is the number of events whose probability we want (here, the number of goals), e is Euler's number, and k! is the factorial of k.

From our exploration above, the mean number of goals is λ = 2.935642. We can plug this value into the formula to calculate the probability of any number of goals being scored in a match. For example,

P(5 goals scored in a match) = e^{-2.935642} * 2.935642^{5} / 5!
P(5 goals scored in a match) = 0.09647195841

Let's use R to calculate the above.

## [1] 0.09647199

And we see essentially the same value as calculated above. We can also see how the probability varies as we increase the number of events.
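As a cross-check on the formula, here is a minimal Python sketch (not the R code used in the article) that evaluates e^{-λ} * λ^{k} / k! directly. The function name `poisson_pmf` and the range of 0 to 10 goals are assumptions made for illustration; only λ = 2.935642 comes from the text above.

```python
import math

# Mean number of goals per match, taken from the article.
LAMBDA = 2.935642


def poisson_pmf(k: int, lam: float) -> float:
    """Return P(k events in an interval) = e^(-lam) * lam^k / k!."""
    return math.exp(-lam) * lam**k / math.factorial(k)


# Probability of exactly 5 goals being scored in a match.
p5 = poisson_pmf(5, LAMBDA)
print(f"P(5 goals) = {p5:.8f}")

# How the probability varies with the number of goals.
# The upper bound of 10 goals is an arbitrary choice for illustration.
pmf_table = {k: poisson_pmf(k, LAMBDA) for k in range(11)}
for k, p in pmf_table.items():
    print(f"k = {k:2d}  P = {p:.4f}")
```

The printed value for 5 goals should agree with the hand calculation and the R output above to several decimal places, and the table shows the probabilities rising to a maximum near λ and then falling away as k grows.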
