Poisson approximation in practice

February 3, 2020

We know from basic statistics textbook that the distribution of a Binomial random variate (with probbaility of success $p$) can be approximated using a Poisson distribution (of parameter $\lambda = np$), provided certain conditions are met (usually, small $p$ and large $n$).¹ An easy to remember application is that the sequence of $\text{Bin}(n,\frac{1}{n})$ distributions converges in law to the Poisson distribution with mean 1. We can see $\text{Bin}(n,\frac{1}{n})$ either as the sum of $n$ independent Bernoulli trials with small probability of success, dependent on $n$, or as the count of the total number of occurrences among $n$ independent rare events. It turns out that the later has many useful applications.

Here are two illustrations, taken from DasGupta.² In the matching problem, cards are drawn one at a time from a well-shuffled deck containing $N$ cards, and a match occurs if the card bearing the number $j$ is drawn at precisely the $j$-th draw from the deck. Let $S_N$ be the total number of matches. We will need a little theorem, which happens to be useful when we want to prove that a Poisson limit is still applicable for the sum of dependent Bernoulli trials

Theorem: For $N\ge 1$, let $X_i=1,2,\dots,n$, $n=n(N)$ be a triangular array of Bernoulli random variables, and let $A_i$ denotes the event for which $X_i=1$. For a given $k$, let $M_k$ be the $k$-th binomial moment of $S_n$; i.e., $M_k=\sum_{j=k}^n{j\choose k}P(S_n=j)$. If there exists $0<\lambda<\infty$ such that, for every fixed $k$, $M_k\rightarrow \frac{\lambda^k}{k!}$ as $N\rightarrow\infty$, then $S_n \rightarrow_{\mathcal{L}}\text{Poi}(\lambda)$.

In the matching problem, the binomial moment $M_k$ can be shown to be $M_k = {N\choose k}\frac{1}{N(N-1)\dots (N-k+1)}$. Using Stirling’s approximation, for every fixed $k$, $M_k\rightarrow\frac{1}{k!}$; in other words, the total number of matches converges to a Poisson distribution with mean 1 as the deck size $N\rightarrow\infty$. Convergence is very fast. See also More about the matching problem.³

In the birthday problem, we are interested in the probability that two randomly chosen persons were born the same day. More formally, suppose each person in a group of $n$ people has, mutually independently, a probability $\frac{1}{N}$ of being born on any given day of a year with $N$ calendar days. Let $S_n$ be the total number of pairs of people $(i, j)$ such that they have the same birthday. Then $P(S_n > 0)$ is the probability that there is at least one pair of people in the group who share the same birthday. It turns out that if $n$ and $N$ can be expressed as $n^2=N\lambda+\mathcal{o}(N)$, for some $0<\lambda <\infty$, then $S_n\rightarrow_\mathcal{L}\text{Poi}(\lambda)$. If $N=365$, $n=30$, then $S_n\approx\text{Poi}(1.233)$.

LeCam’s theorem on total variation is also useful. It states, in part, that $d_\text{TV}\left(\text{Bin}(n,\lambda/n),\text{Poi}(\lambda)\right)\le 8\lambda/n$. Further discussion can be found on maths.SE. ↩︎
DasGupta, A. Asymptotic theory of statistics and probability. Springer, 2008. ↩︎
Interestingly, it is possible to derive a recurrence relation for the PDF of $S_n$: (proof here) $$ \begin{array}{rcl} \Pr(S_1 = 1) &=& 1\cr \Pr(S_n = k) &=& (k+1)\Pr(S_{n+1} = k+1)\quad \text{for}\: k \in \{0,1,\dots,n\} \end{array} $$ This allows to obtain the probability density function of $S_n$ recursively for any $n$. ↩︎

aliquote.org

Poisson approximation in practice

See Also