7.3. Poissonizing the Multinomial
This is just an extension of Poissonizing the binomial.
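Suppose you have a sequence of i.i.d. multinomial trials. For example, suppose you are drawing at random with replacement from a population that has $k$ classes of individuals in proportions $p_1, p_2, \ldots, p_k$, where $\sum_{i=1}^k p_i = 1$.
If you draw a fixed number of times $n$, then the counts $N_1, N_2, \ldots, N_k$ of individuals in the $k$ classes have the multinomial distribution with parameters $n$ and $p_1, p_2, \ldots, p_k$.
If you replace the fixed number $n$ of draws by a Poisson $(\mu)$ random number of draws, then the multinomial gets Poissonized as follows:
For each $i = 1, 2, \ldots, k$, the distribution of $N_i$ is Poisson $(\mu p_i)$.
The counts $N_1, N_2, \ldots, N_k$ in the different categories are mutually independent.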
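The two claims above can be checked empirically. The following is a minimal simulation sketch, not part of the original text: the values of `mu`, `p`, and `reps` are illustrative assumptions, and the variable names are hypothetical. It draws a Poisson number of trials for each experiment, tabulates the class counts, and compares each count's mean and variance with $\mu p_i$ (a Poisson distribution has equal mean and variance). Near-zero correlations are consistent with independence, though they do not prove it.

```python
import numpy as np

rng = np.random.default_rng(0)

mu = 20.0                        # illustrative Poisson mean for the number of trials
p = np.array([0.2, 0.3, 0.5])    # illustrative class proportions; must sum to 1
reps = 50_000                    # number of simulated experiments
k = len(p)

# Each experiment gets its own Poisson(mu) number of trials.
totals = rng.poisson(mu, size=reps)

# Simulate every trial of every experiment at once: a class label for each
# trial, and the index of the experiment that the trial belongs to.
labels = rng.choice(k, size=totals.sum(), p=p)
experiment = np.repeat(np.arange(reps), totals)

# Tabulate the trials into a (reps, k) array of class counts.
counts = np.bincount(experiment * k + labels, minlength=reps * k).reshape(reps, k)

print("mu * p          :", mu * p)
print("simulated means :", counts.mean(axis=0))
print("simulated vars  :", counts.var(axis=0))
print("correlations:\n", np.corrcoef(counts, rowvar=False).round(3))
```

The means and variances should both land close to `mu * p`, and the off-diagonal correlations should be close to 0.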
We won’t go through the proof, which is a straightforward extension of the proof in the binomial case of two categories.
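When the number of trials is fixed, the counts in the different categories are dependent on each other, because they must add up to the number of trials. When the number of trials is a Poisson random variable, the counts are independent, and that makes calculations much simpler.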
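To see the dependence in the fixed-sample-size case concretely, here is a small sketch under assumed values (`n = 20` and the proportions below are illustrative choices, not taken from the text). It uses the standard multinomial fact that $Cov(N_i, N_j) = -n p_i p_j$ for $i \ne j$, and compares the implied correlations with simulated ones.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 20                           # fixed number of draws (illustrative)
p = np.array([0.2, 0.3, 0.5])    # illustrative class proportions
reps = 50_000

# Fixed-n sampling: each row is one multinomial vector of class counts.
counts = rng.multinomial(n, p, size=reps)

# Every row sums to n, so a large count in one class forces the others down.
print("all rows sum to n:", bool(np.all(counts.sum(axis=1) == n)))

# Theory: Cov(N_i, N_j) = -n p_i p_j for i != j, so the correlation is
# -sqrt(p_i p_j / ((1 - p_i)(1 - p_j))), which does not depend on n.
theory = -np.sqrt(np.outer(p, p) / np.outer(1 - p, 1 - p))
np.fill_diagonal(theory, 1.0)

print("theoretical correlations:\n", theory.round(3))
print("simulated correlations:\n", np.corrcoef(counts, rowvar=False).round(3))
```

The off-diagonal correlations are clearly negative here, in contrast with the Poissonized setting where the counts are independent.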
For example, suppose that in your population the distribution of classes is as follows:
Class A: 20%
Class B: 30%
Class C: 50%
Now suppose you draw $N$ times, where $N$ has the Poisson $(20)$ distribution. Then:
the number of Class A individuals has the Poisson distribution with parameter 20% of 20, that is, Poisson $(4)$,
the number of Class B individuals has the Poisson $(6)$ distribution,
the number of Class C individuals has the Poisson $(10)$ distribution,
and these three counts are independent.
Note that the Poisson parameters of the three Class counts must add up to the original Poisson parameter, that is, the parameter of the distribution of the total number of draws: $4 + 6 + 10 = 20$.
For example, the chance that you will get at least 3 individuals in Class A, at least 5 in Class B, and at least 8 in Class C is about 42.5%.
from scipy import stats
(1 - stats.poisson.cdf(2, 4)) * (1 - stats.poisson.cdf(4, 6)) * (1 - stats.poisson.cdf(7, 10))
0.42475602042528027
The number of factors in the answer is equal to the number of classes. Compare this with the inclusion-exclusion formula, in which the amount of work grows much faster with each additional class, as you have seen in exercises.
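As a sanity check on the product above, the same chance can be estimated by simulating the sampling scheme directly, without assuming independence of the counts. This is a sketch only: the helper `poissonized_counts` is a hypothetical name introduced here, and the number of repetitions is an arbitrary choice.

```python
import numpy as np

def poissonized_counts(mu, p, reps, rng):
    """Simulate `reps` experiments, each with a Poisson(mu) number of draws.

    Returns an array of shape (reps, len(p)) holding the class counts.
    """
    p = np.asarray(p, dtype=float)
    k = len(p)
    totals = rng.poisson(mu, size=reps)              # random sample sizes
    labels = rng.choice(k, size=totals.sum(), p=p)   # class of every draw
    experiment = np.repeat(np.arange(reps), totals)  # experiment of every draw
    flat = np.bincount(experiment * k + labels, minlength=reps * k)
    return flat.reshape(reps, k)

rng = np.random.default_rng(0)
counts = poissonized_counts(20, [0.2, 0.3, 0.5], reps=100_000, rng=rng)

# At least 3 in Class A, at least 5 in Class B, and at least 8 in Class C.
minimums = np.array([3, 5, 8])
estimate = np.mean(np.all(counts >= minimums, axis=1))
print(estimate)
```

The estimate should be close to the exact product of about 0.425, up to simulation error.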
Quick Check
Suppose I roll a Poisson
Answer
Poissonization helps data scientists tackle questions like, “How many times must I sample so that my chance of seeing at least one individual of each class exceeds a given threshold?” The answer depends on the distribution of classes in the population, of course, but allowing the sample size to be a Poisson random variable can make calculations much more tractable. For applications, see for example the Abstract and References of this paper.
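A rough sketch of how such a calculation might go under Poissonization: if the sample size is Poisson $(\mu)$, the count in class $i$ is Poisson $(\mu p_i)$, so the chance of seeing at least one individual of that class is $1 - e^{-\mu p_i}$, and by independence the chance of seeing every class is the product of these terms. The function names, the restriction to integer values of $\mu$, and the 95% threshold below are assumptions made for illustration. Note that the result is an expected sample size $\mu$, not a fixed number of draws.

```python
import numpy as np

def chance_all_classes_seen(mu, p):
    """P(every class appears at least once) when the sample size is Poisson(mu)."""
    p = np.asarray(p, dtype=float)
    # 1 - exp(-mu * p_i) for each class, multiplied together by independence.
    return float(np.prod(-np.expm1(-mu * p)))

def smallest_poisson_mean(p, threshold, max_mu=10_000):
    """Smallest integer mu for which the chance of seeing all classes exceeds threshold."""
    for mu in range(1, max_mu + 1):
        if chance_all_classes_seen(mu, p) > threshold:
            return mu
    raise ValueError("threshold not reached for any mu up to max_mu")

p = [0.2, 0.3, 0.5]              # illustrative class proportions
mu_needed = smallest_poisson_mean(p, threshold=0.95)
print(mu_needed, chance_all_classes_seen(mu_needed, p))
```

The search is a simple linear scan, which is adequate because the chance is increasing in $\mu$; a bisection or root finder would work as well for non-integer $\mu$.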