By mathvault | Combinatorics

Carl asked the following *discrete-math-style* probability question:

## $N$ balls are randomly dropped into $k$ boxes $ (k \le N)$. What is the probability that all boxes are filled?

Want to know the answer? Here it is!

Of course, when there is only 1 box, the probability of filling the box with $N$ balls is just $1$. No mystery here.

(Remember, we are assuming that each ball needs to go into *exactly* one of the boxes — nothing more, nothing less)

So what if $k=2$ (i.e., we have 2 boxes) instead? Well, the probability of filling all the boxes seems intractable, so let’s calculate the probability of *not* filling all the boxes instead! Let $E_i$ be the event that the i^{th} box is empty, then we have that:

$P(\text{not all boxes are filled}) = P(E_1 \cup E_2)$

Fortunately, due to the simplicity of the case, we also have that:

$P(E_1 \cup E_2)$ = $P(E_1) + P(E_2)$

Now, how can $E_1$ happen? Exactly when all $N$ balls go into box 2 of course! So $P(E_1)=(\frac{1}{2})^N = \frac{1}{2^N}$.

Similarly, $P(E_2)=\frac{1}{2^N}$ as well, which means that:

$P$ (not all boxes are filled) = $P(E_1) + P(E_2) = \frac{2}{2^N} =\frac{1}{2^{N-1}}$

So to wrap up, we have that:

$P$ (all boxes are filled) = $ 1 – P$(not all boxes are filled) = $1 – \frac{1}{2^{N-1}}$

In practice, what that means is that when $N=2$ (i.e., when we have 2 balls):

$P$ (all boxes are filled) = $1 – \frac{1}{2^{2-1}} = \frac{1}{2}$

With 3 balls, the same probability becomes $\frac{3}{4}$. 4 balls? $\frac{7}{8}$. 5 balls? $\frac{15}{16}$.

So as is to be expected, the chance of filling all the boxes increases exponentially towards $1$ as the number of balls increases.

Let’s say that we have $3$ boxes and that $E_i$ denotes the event that the i^{th} box is empty, then, proceeding as before, we have that:

$P(\text{not all boxes are filled}) = P(E_1 \cup E_2 \cup E_3 )$

Except that there is no cheating now — meaning that we have to appeal to the so-called **inclusion-exclusion principle***:*

\begin{multline*} P(E_1 \cup E_2 \cup E_3) = P(E_1) + P(E_2) + P(E_3)-P(E_1 \cap E_2)-P(E_1 \cap E_3) \\ -P(E_2 \cap E_3) + P( E_1 \cap E_2 \cap E_3) \end{multline*}

Now, $E_1$ happens precisely when all balls go into box 2 or box 3, so $P(E_1)=(\frac{2}{3})^N = \frac{2^N}{3^N}$.

All other probabilities can be calculated fairly easily, for example, $E_1 \cap E_2$ happens when all balls go to box 3, so the probability of it would be $(\frac{1}{3})^N = \frac{1}{3^N}$.

Putting everything together, we now have that:

$P(E_1 \cup E_2 \cup E_3)$= $\frac{2^N}{3^N}+\frac{2^N}{3^N}+\frac{2^N}{3^N}-\frac{1}{3^N}-\frac{1}{3^N}-\frac{1}{3^N}$ = $\frac {2^N – 1}{3^{N-1}}$

And now to wrap up:

$P$ (all boxes are filled) = $1 – P$ (not all boxes are filled) = $1-\frac {2^N – 1}{3^{N-1}}$

All right. Let’s put that to test again! When $N=3$ (e.g., we have 3 balls):

$P$ (all boxes are filled) = $1-\frac {2^3 – 1}{3^{3-1}} = \frac{2}{9}$

5 balls? $\frac{50}{81}$. 7 balls? $\frac{602}{729}$. 10 balls? $\frac{18660}{19683}$.

See the same trend? While one might have a hard time filling the boxes with only 3 balls, increasing the number of balls (by perhaps just a bit more) will greatly facilitate this process.

Surprise! For the general case, the formula *will* look a bit nastier with some binomial coefficient and some $(-1)^{i+1}$. All right. Enough said. Here we go!

$P$ (not all boxes are filled) = $P(E_1 \cup \ldots \cup E_k)$ =

(invoking the *inclusion-exclusion principle*…)

\begin{multline*}P(E_1)+\dots+P(E_k)-P(E_1 \cap E_2) -\ldots-P(E_{k-1} \cap E_k) \\ + P(E_1 \cap E_2 \cap E_3)+ \ldots+P(E_{k-2} \cap E_{k-1} \cap E_k) \\ + \ldots + (-1)^{k+1}P(E_1 \cap \ldots \cap E_k) \end{multline*}

(and after some lengthy number crunching…)

(and some intense parsing…)

$\displaystyle k(\frac{k-1}{k})^N-\binom{k}{2}(\frac{k-2}{k})^N+\binom{k}{3}(\frac{k-3}{k})^N-\ldots+(-1) ^k \binom{k}{k-1}(\frac{1}{k})^N$

(converting into *sigma notatio*n now…)

\[ \sum_{i=1}^{k-1} (-1)^{i+1}\binom{k}{i}(\frac{k-i}{k})^N \]

Lastly, putting everything together, we get that:

\begin{align*} P (\text{all boxes are filled}) & = 1-P(\text{not all boxes are filled}) \\ & = \displaystyle 1-\sum_{i=1}^{k-1} (-1)^{i+1}\binom{k}{i}(\frac{k-i}{k})^N \end{align*}

And that’s about it! Finito! 🙂

**Math Vault and its Redditbots** has the singular goal of advocating for education in higher mathematics through *digital publishing* and the *uncanny* use of technologies. Head to the **Vault** for more math cookies. :)

(Contributor Post) A Primer on Statistical Significance — Grade Comparison, Fertilizer Selection, Dice Rolling and Tire Lifespan Evaluation and More!

Desmos: A Definitive Guide (How to Perform Cool Computations and Creating Great Graphs Using Online Graphing Calculator)

Biostatistics For Health Professionals (McGill EPIB 507) — A Review

Upcoming Last-Minute McGill MATH 323 Q&A Session

Add Your Reply