Stars and Bars Theorem

Problem

Given K variables, $a_1, a_2, a_3 \dots a_K$ and a value $N$, how many ways can we write $a_1 + a_2 + a_3 \dots a_K = N$, where $a_1, a_2, a_3 \dots a_K$ are non-negative integers?

For example, if $K = 3$ and $N=2$, there are 6 solutions.

a b c
-----
2 0 0
0 2 0
0 0 2
1 1 0
1 0 1
0 1 1

Solution

The solution is very simple. In order to understand it, let us visualize the problem from a different angle.

Another way to represent the solutions visually is to use “Stars and Bars”. In the visualization below, we imagine we have $K$ partitions and we need to distribute $N$ stars in them. Like before, if we have $K = 3$ and $N=2$, then:

a  b  c
--------
**|  |
  |**| 
  |  |**
* |* |
* |  |*
  |* |*

This visualization gives us the foundation of the solution. If we have $K$ partitions and $N$ stars to distribute in them, we can imagine that we have $N+K-1$ blanks. For example, when $K=3$ and $N=2$, we have $N+K-1=2+3-1=4$ blanks.

- - - - 4 Blanks

So, if we have $N+K-1$ blanks, how many of them will be stars? $N$ of the blanks need to be stars. And how many of the blanks need to be bars? Remember, if we have $K$ partition, then we just need $K-1$ bars to create $K$ partitions. So there will be $K-1$ bars.

So, the problem boils down to which blanks get to become stars and which blanks become bars. If we have $N+K-1$ blanks, how many ways are there to select $N$ of them to become stars? $\binom{N+K-1}{N}$ or $\binom{N+K-1}{K-1}$

For our example this equals to $\binom{4}{2} = 6$.

Stars and Bars Theorem

Theorem: For any pair of positive integers $N$ and $K$, the number of $K$-tuples of non-negative integers whose sum is $N$ is equal to the binomial coefficient $\binom{N+K-1}{N}$ or $\binom{N+K-1}{K-1}$.

Or we can rephrase “Stars and Bars Theorem” as “Ball and Urn Theorem”:

Theorem: If we have K distinguishable containers and N indistinguishable balls, then we can distribute them in $\binom{N+K-1}{N}$ ways.

The “Stars and Bars” theorem is also known as “Ball and Urn” theorem. Don’t get confused. They are the same thing.

Extensions

Positive Number of Stars in Each Partition

What if every partition needs to have at least one star? Simple. Place a star in each partition and subtract those placed stars from total stars. Then apply the Stars and Bars theorem normally.

For example, find the number of solution to $a + b + c = 10$, where $a,b,c > 0$.

Let $a = 1 + a’$, $b = 1 + b’$ and $c = 1 + c’$, where $a’,b’,c’ \geq 0$
$a+b+c = 10$
$(1+a’)+(1+b’)+(1+c’) = 10$
$a’+b’+c’ = 10 – 3$
$a’+b’+c’ = 7$, where $a’,b’,c’ \geq 0$

This can now be solved using normal “Stars and Barsh” theorem, which equals to $\binom{9}{2}$.

Theorem: For any pair of positive integers $N$ and $K$, the number of $K$-tuples of positive integers whose sum is $N$ is equal to the binomial coefficient $\binom{N-1}{K-1}$.

Lower Bound with Each Partition

The extension above can be further generalized. For each partition, if we are given a lower bound $x$, then we simply place $x$ stars in that partition and subtract with from total stars $K$. Finally, we apply “Stars and Bars” theorem normally.

So, for $a_1 + a_2 + a_3 \dots a_K = N$, where $a_1 \geq L_1, a_2 \geq L_2, a_3 \geq L_3 \dots a_K \geq L_n$ is same as $a’_1 + a’_2 + a’_3 \dots a’_K = N – L_1 – L_2 – L_3 \dots L_K$, where $a’_1, a’_2, a’_3 \dots a’_K \geq 0$.

Equation with Inequality

What if we are asked to find:

$$a_1 + a_2 + a_3 \dots a_K \leq N$$

This is a very interesting extension. There are two ways to think about the solution to this problem.

Garbage Partition

Imagine that you have an extra partition which acts like a garbage bin. Then if you try to apply normal “Stars and Bars Theorem”, what happens?

So the equation now becomes: $a_1 + a_2 + a_3 \dots a_K + a_{K+1} = N$. What is the solution to this problem? Ans: $\binom{N+K}{K}$.

Now, simply imagine, anything that gets assigned to the extra garbage partition gets thrown out. So the sum of remaining partitions will be: $a_1 + a_2 + a_3 \dots a_K \leq N$

Theorem: For any pair of positive integers $N$ and $K$, the number of $K$-tuples of non-negative integers whose sum is less than or equal to $N$ is $\binom{N+K}{K}$.

Vertical Sum of Pascal Triangle

Another way to solve the problem is to simply loop over the solution from 0 to $N$.

$$a_1 + a_2 + a_3 \dots a_K = 0 \\
a_1 + a_2 + a_3 \dots a_K = 1 \\
a_1 + a_2 + a_3 \dots a_K = 2 \\
a_1 + a_2 + a_3 \dots a_K = 3 \\
\dots \\
a_1 + a_2 + a_3 \dots a_K = N \\
$$

If we find the sum of all the results above, then we will get result for $a_1 + a_2 + a_3 \dots a_K \leq N$. So what is the sum?

$$\binom{N+K-1}{K-1} + \binom{N+K-2}{K-1} + \binom{N+K-3}{K-1} \dots + \binom{K-1}{K-1}$$

Do you know what this is? This is the vertical sum of a column in pascal triangle. I will discuss the properties of Pascal Triangle someday else. For today, understanding the “Garbage Partition” logic will be enough.

So vertical sum of pascal triangle, or $\sum_{i=k}^{n}\binom{i}{k}$, is equal to $\binom{n+1}{k+1}$.

Upper Bound with Each Partition

Now, this is tricky. This time we are given the upper limit for each partition. So the number of stars we place must be smaller or equal to the upper limit for that partition.

One possible solution is Dynamic Programming. We iterate over each partition and keep track of remaining stars in our hand. When at each partition, we run a loop from 0 to upper limit and try placing the different amount of stars in each container.

It will be something similar to this ( not exactly this ):

int dp ( int partition, int stars ) {
    int res = 0;
    for ( int i = 0; i < upperLimit[pos]; i++ ) {
        res += dp ( partition + 1, stars - i );
    }
}

Now, the above DP solution will work only if we are lucky to get constraints that fit the complexity of the DP. When we are unlucky, we will need to be more creative.

Related Problems

  1. UVa 10541 – Stripe
  2. SPOJ MARBELS – Marbles
  3. UVa 11480 – Jimmy’s Balls
  4. UVa 13035 – Another Combination Problem
  5. UVa 10532 – Combination! Once Again

Application of Prufer Code: Random Tree Generation, Cayley’s Formula and Building Tree from Degree Count

On the last post (Prufer Code: Linear Representation of a Labeled Tree), we discussed how to convert a labeled tree into Prufer Code and vice versa. On this post, we will look into some of its applications in problem-solving.

Generating Random Tree

This one is the simplest. Though problem-solving may not require generating random trees, problem setting might. In order to generate a random tree with $N$ nodes, all we need to do is generate a Prufer Code of length $N-2$ with each element being in the range of $1$ to $N$. Next, we just convert the Prufer Code into a tree.

The randomness of the tree depends on the randomness of the sequence generated. Using rand() function is not good since it’s not truly random. How we can generate a random sequence requires a blog post of its own. Maybe I will write a post on it in near future.

Cayley’s Formula

Given $N$ labeled nodes, how many spanning trees can we build?

For example, how many different spanning trees can we build with $3$ nodes?

According to Cayley’s Formula, the number of spanning trees we can build with $N$ labeled node is $N^{N-2}$.

Cayley’s Formula can be proved by Prufer Code easily. Since there are $N$ nodes, what will be the length of the Prufer Code of the spanning trees? Ans: $N-2$. Next, at each position of the Prufer Code, how many different values can we place? Ans: $N$. Therefore, we have $N^{N-2}$ combinations possible for Prufer Code.

Building Tree from Degree Count

Given $N$ nodes and for each node its degree count, i.e, number of other nodes that are its neighbor, how many different spanning trees can we build while maintaining the degree count of each node?

For example, let $N=5$ and the degree count be $[3, 2, 1, 1, 1]$. How many spanning trees respect these degree constraints? Ans: $3$

So how do we find the answer using Prufer Code? We use the following two information:

  • A labled tree with $N$ nodes has a Prufer Code of length $N-2$.
  • If a node has degree count $d$, then it appears exactly $d-1$ times in the Prufer Code.

So for each node, we know how many times it appears in Prufer Code from its degree count. If the degree count for $i_{th}$ node is $d_i$, then it appears $d_i-1$ times. We also know the total number of openings we have in our Prufer Code ($N-2$ openings). So the number of possible spanning tree while maintaining the degree constraint is:

$$\binom{N-2}{d_1-1} \binom{N-2-(d_1-1)}{d_2-1}…\binom{N-2-(d_1-1)-…-(d_{N-1}-1)}{d_N-1} \ =\binom{N-2}{d_1-1,d_2-1,…,d_N-1}$$

For example, when $N=5$ and degree count is $[3, 2, 1, 1, 1]$, the length of Prufer Code will be $5-2=3$ and the nodes will appear $[3-1,2-1,1-1,1-1,1-1] = [2,1,0,0,0]$ times. So answer will be $\binom{3}{2,1,0,0,0} = \binom{3}{2}\binom{1}{1}=3$.

Simple right? See if you can solve the following problem now: UVa 12956 – Curious Guardians. It’s going to require DP along with what we have learned so far.

Conclusion

I know about one more application of Prufer Code, but I did not write about it here because its proof is a bit tricky. So I will probably write a separate post for that. If you are feeling curious about it then here is a spoiler: UVa 11719 – Gridland Airports.

If you have enjoyed learning Prufer Code, then don’t forget to share it.

Resources

  1. Wiki – Caley’s Formula
  2. forthright48 – Prufer Code: Linear Representation of a Labeled Tree

Related Problems

  1. UVa 12956 – Curious Guardians

Prufer Code: Linear Representation of a Labeled Tree

I guess this is going to be my first post (apart from the contest analysis’) which is not about Number Theory! It’s not about graph either, even though the title has “Tree” in it. This post is actually about Combinatorics.

Prufer code, in my opinion, is one of the most underrated algorithms I have learned. It has a wide range of applications, yet, very few people seem to know about it. It’s not even a hard algorithm! It’s very easy and you should understand it intuitively.

In this post, we will be discussing what is Prufer Code and its conversion to and from a tree. On next post, we will look into some of its application.

Prufer Code: What is it?

Every labeled tree of $N$ nodes can be uniquely represented as an array of $N-2$ integers and vice versa

The linear array representation of a labeled tree is called Prufer Code. In case you are wondering what’s a labeled tree, it’s just a tree with distinguishable nodes, i.e, each node is marked such that they can be distinguished from one another.

Okay, before we move on to how to covert a labeled tree into Prufer Code and vice versa, let me tell you how I came to know about Prufer Code. If you are a problem setter, the story will be helpful to you.

How I came to know about Prufer Code

So a few years ago, I used to create challenges for HackerRank as a “Challenge Creator”. So one day, I created a challenge involving tree. Since the problem had a tree in it, I had to create some random trees for the data set. I used the following algorithm to create the random trees:

1. Start with node 1. Initially, there is only node 1 by itself. Totally independent.
2. Now, for each node x from 2 to N:
  a. Node 1 to x-1 has already been processed. They are now connected as a tree.
  b. Choose a node randomly from node 1 to x-1 and connect node x with it.
3. Your tree is ready.

I thought this was a pretty reasonable process for getting myself a tree. So I submitted the challenge for testing. Back then, Wanbo used to test all my challenges. He no longer works at HR now. Anyways, as a tester, it was Wanbo’s duty to write bad solutions to test my challenge. He looked into the dataset and realized that my data set wasn’t truly random. He quickly figured out a pattern and wrote a heuristical solution that got AC. So he sent my challenge back to me and asked to improve the dataset.

I started to analyze my tree generator again. Why is it not random? What’s wrong with it? After some thinking, I realized that in the pseudocode above, node number 1 has a higher chance of getting a child. As a result, trees created by that generator had higher degrees for nodes with a lower index. This can’t be called a random tree generator.

Once I realized my mistake, I notified Wanbo about it. He then suggested me to google Prufer Code.

And that’s how I came to know about Prufer code. After learning the Prufer Code, creating random trees became trivial. All I had to do is create a random array of length $N-2$ and then convert it to a labeled tree. It was truly random without any pattern.

Creating random trees is just one of the application of Prufer Code. There is more.

Let us see how to find the Prufer Code of a given tree.

Convert a Labeled Tree into Prufer Code

The process of converting a labeled tree into its Prufer Code is very simple. The pseudocode is given below:

1. Find the smallest leaf node of the given tree. Let it be x. Let the neighbor of node x be y.
2. Write down the value of y on a piece of paper.
3. Remove node x from the tree.
4. Repeat step 1 to 3 until there are only 2 nodes remaining.
5. Once you are done, there will be a sequence of numbers on the piece of paper. That's your Prufer Code.

In order to help you understand, I even drew a picture!

Here, the white nodes are non-leaf nodes, the green nodes are leaf nodes and the blue nodes are the smallest leaf node for that tree.

So for the tree given on picture, the Prufer Code is $1, 7, 6, 6, 1$. It’s simple right?

Before we move on, we will also note down few properties of Prufer Code:

  1. If a node has degree $d$, then that node will appear in prufer code exactly $d-1$ times.
  2. Leaves never appear in Prufer Code.

If you are still having difficulty in understanding, have a look at this video.

Code for Converting Tree into Prufer Code

We are just going to convert the idea we discussed above into code. Nothing fancy.

Complexity: $O(VlogV)$

I have tested the above code with some hand generated test cases. I would have felt much more confident if I could have gotten AC in some problem from an online judge. But unfortunately, I never found any such problem. The other way around is more common.

Convert a Prufer Code into Labeled Tree

So given a Prufer Code, how do we convert it into a tree? The algorithm is just reverse of what we have done before.

1. Find the node numbers who are missing in the Prufer Code. They are the leaves of the tree since leaves never make it to Prufer Code. Let L be the set of leaf nodes.
2. Select the smallest leaf from L and connect it to the first element of Prufer Code. 
3. Remove the first element from Prufer Code. Let the removed element be X.
4. Check if X is still present in the Prufer code or not. If it has disappeared, then X has become a leaf itself. Add it in the set L.
5. Repeat step 2 to 4 until the full sequence disappears.
6. Once the process ends, you will have a connected labeled tree. 

I am not going to draw a picture again. Too much effort. Just watch the following video if you are still confused.

Code for Converting Prufer Code to Tree

We will convert the process we discussed above into code.

Complexity: $O(VlogV)$

Converting Prufer Code to Tree is much more common than the other way around. There is even a problem on Timus on this conversion: Timus 1069. Prufer Code. I believe you now have enough knowledge to solve Timus 1069.

Conclusion

We now know how to convert a labeled tree into Prufer Code and vice versa. Not only we know the process, we even know how to code them.

On next post (Application of Prufer Code: Random Tree Generation, Cayley’s Formula and Building Tree from Degree Count), we will look into some of its common applications and solve some interesting problems related to Prufer Code. Make sure you understand the process clearly before proceeding.

If you enjoyed reading the post, then don’t forget to share it with your friends.

Resources

  1. Wiki – Prüfer sequence
  2. Youtube – Graph Theory: 40. Cayley’s Formula and Prufer Seqences part 1/2
  3. Youtube – Graph Theory: 41. Cayley’s Formula and Prufer Seqences part 2/2

Related Problems

  1. Timus 1069 – Prufer Code

Next Posts

  1. Application of Prufer Code: Random Tree Generation, Cayley’s Formula and Building Tree from Degree Count