Leading Digits of Factorial

Problem

Given an integer $N$, find the first $K$ leading digits of $N!$.

For example, for $N=10$ and $K=3$, then first $3$ leading digits of $10! = 3628800$ is $362$.

Finding leading digits uses concepts similar to [ref1] Number of Trailing Zeroes of Factorial.

Brute Force Solution

Finding the value of $N!$ and then printing the first $K$ digits is a simple but slow solution. Using long long we can calculate value $N!$ up to $N \leq 20$ and using Big Integer we can calculate arbitrary $N!$ but with complexity worse than $O(N^2)$.

Solution Using Logarithm

In [ref1], we say that a logarithm of value $x$ is $y$ such that $x = 10^y$. For now let us find out leading digits of a value $x$ instead of $N!$. We will extend it to cover factorials later.

So, we know that $log_{10}(x) = y$, where $y$ will be some fraction. Let us separate $y$ into its integer and decimal part and call them $p,q$. For example, if $y = 123.456$, then $p=123$ and $q=0.456$.

Therefore, we can say that $log_{10}(x) = p + q$. Which means, $x = 10^y = 10^{p+q}=10^p \times 10^q$.

Now expand the values of $10^p$ and $10^q$. If $A=10^p$, then $A$ will simply be a power of $10$ since $p$ is an integer. To be more exact, $A$ will be $1$ with $p$ trailing zeroes. For example, $A=10^3 = 1000$. What about $B=10^q$?

Since $q$ is a fraction which is $0 \leq q < 1$, value of $B$ will be between $10^0 \leq B < 10^1$, that is, $1 \leq B < 10$.

Okay, we got the value of $A$ and $B$, what now? We know that if we multiply $A$ and $B$ we will get $x$. But don’t multiply them just yet. Think for a bit what will happen when we multiply a decimal number with $10$. If it is integer, it will get a trailing zero, e.g, $3 \times 10 = 30$. But if it is a fraction, its decimal point will shift to right, e.g $23.65 \times 10 = 236.5$. Actually, decimal points shifts for integer numbers too, since integer numbers are real numbers with $0$ as fraction, e.g $3 = 3.00$. So in either case multiplying $10$ shifts decimal point to the right.

So what happens if we multiply, $A$, which is just $10^p$ to $B$? Since $A$ has $10$ in it $p$ times, the decimal point of $B$ will shift to right $p$ times. That is all $A$ does to $B$ is change its decimal point. It does not change the digits of $B$ in any way. Thus, $B$ contains all the leading digits of $x$.

For example,
$log_{10}(5420) = 3.7339993 = 3 + 0.7339993$.
$\therefore B = 10^0.7339993 = 5.4200$.

So, if we need first $K$ leading digits of $x$, we just need to multiply $B$ with $10^{K-1}$ and then throw away the fraction part. That is $res = \lfloor B \times 10^{K-1} \rfloor$. Why $10^{K-1}$ not just $10^K$? That’s because we already have $1$ leading digit present in $10^q$ before shifting it.

Extending to Factorial

It is easy to extend the idea above to $N!$. First we need to find out the value of $y=log_{10}(N!)$.

$y= log_{10}(N!)$
$y= log_{10}(N \times (N-1) \times (N-2) \times… \times 3 \times 2 \times 1 )$
$y= log_{10}(N) + log_{10}(N-1) + log_{10}(N-2) + … + log_{10}(2) + log_{10}(1)$

So we can simply find out the value of $y$ by running a loop from $1$ to $N$ and taking its log value.

After that we decompose $y$ into $p$, integer part and $q$, fraction part. The answer will be $\lfloor 10^q \times 10^{K-1}\rfloor$.

Code

const double eps = 1e-9;

// Find the first K digits of N!
int leadingDigitFact ( int n, int k ) {
    double fact = 0;

    // Find log(N!)
    for ( int i = 1; i <= n; i++ ) {
        fact += log10 ( i );
    }

    // Find the value of q
    double q = fact - floor ( fact+eps );

    double B = pow ( 10, q );

    // Shift decimal point k-1 \times
    for ( int i = 0; i < k - 1; i++ ) {
        B *= 10;
    }

    // Don't forget to floor it
    return floor(B+eps);
}

The code does exactly what we discussed before. But note the $eps$ that we added when flooring value in line $12$ and $22$. This due to precision error when dealing with real numbers in C++. For example, due to precision error sometimes a value which is supposed to be $1$, becomes $0.9999999999999$. The difference between these two values is very small, but if we floor them both, the first one becomes $1$ whereas the second one becomes $0$. So in order to avoid this error, when flooring a positive value we add a small number ( epsilon = $0.000000001$ ) to the number.

Summary

We need to execute the following steps to find the first $K$ leading digits of a number $x$ ( in our problem $x=N!$ ):

  1. Find the log value of the number whose leading digits we are seeking. $y=log_{10}(x)$.
  2. Decompose $y$ into two parts. Integer part $p$ and fraction part $q$.
  3. The answer is $\lfloor 10^q \times 10^{K-1} \rfloor$.

Resource

  1. forthright48 - Number of Trailing Zeroes of Factorial

Number of Trailing Zeroes of Factorial

Problem

Given a positive integer $N$, find the number of trailing zero $N!$ has.

For example, $5! = 120$ has $1$ trailing zero and $10! = 3628800$ has $2$.

Brute Force Solution

Brute Force solution for this problem involves calculating the value of $N!$, which is overkill in this situation. Read the section on Brute Force on this article before you continue – [ref1] Number of Digits of Factorial.

The brute force solution for this problem faces the same limitation as [ref1]. We cannot calculate $N!$ for $N > 20$ using a long long variable and using Big Integer is too slow, $O(N^2)$, if we assume the number of digits in $N!$ increase by $1$ in each step.

Solution Using Factorization

Suppose a there is a number $x$, with $y$ trailing zeroes at its end, then can’t we write that number as $x = z \times 10^y$?

For example, $x = 12300$ and it has $2$ trailing zeroes. Then we can write $x = 123 \times 10^2$.

Therefore, for every factor of $10$, $x$ gets a trailing zero. If we can find out the number of $10$ in $N!$ then we can easily count its number of trailing zeroes.

How do we form a $10$? We form a $10$ by multiplying $2$ and $5$. So we need to find out how many $2$ and $5$ $N!$ has. This can be found using the idea for factorizing $N!$. The idea is discussed in [ref2] Prime Factorization of Factorial. We will be using $\text{factorialPrimePower}()$ function from that post.

So all we need to do is call $x = \text{factorialPrimePower}(2)$ and $y = \text{factorialPrimePower}(5)$ to find out the frequency of those primes. For every pair of $(2,5)$ we get one $10$ factor. So how many pairs can we make if we have $x$ number of $2$’s and $y$ number of $5$’s? We can make $MIN(x,y)$ pairs.

For example, for $10!$ we have $x = \frac{10}{2} + \frac{10}{4} + \frac{10}{8}= 5 + 2 + 1 = 8$ and $y = \frac{10}{5} = 2$. Therefore number of trailing zero is $MIN(x,y) = MIN(8,2) = 2$.

Trailing Zeroes in Different Base

We solved the problem for the decimal base. Now, what if we want to know how many trailing zero does $N!$ has if we convert $N!$ to base $B$?

For example, how many trailing zeroes does $10!$ has in base $16$? In order to solve this, we need to know how the number system works. Read the post [ref3] Introduction to Number System to learn more.

Previously we said that for decimal numbers, multiplying them with $10$ increase their trailing zeroes. Can something similar be said for the base $16$? Yes. Look at the following values:

$(1)_{16} = 1 \times 16^0$
$(10)_{16} = 1 \times 16^1$
$(100)_{16} = 1 \times 16^2$

Every time we multiply a number with its base, all its digits shift a place to left and at the end, a $0$ is added. So for a number in base $B$, if we multiply it by $B$ it gets a trailing zero at its end.

So all we need to do is find out how many $B$ does $N!$ has in it. For base $16$, we need to find out how many \times $16 = 2^4$ occurs in $N!$. For that we find out how many \times $2$ occurs using $x = \text{factorialPrimePower}(2)$ and since we get $16$ for each $2^4$, we conclude that $N!$ has $\frac{x}{4}$ trailing zeroes.

This is extended to any base. Factorize the base $B$ and find out the occurrence of each prime. Then figure out how many combinations we can make.

For example if base is $60$, then $60 = 2^2 \times 3 \times 5$. So we find out $x = \text{factorialPrimePower}(2)$, $y = \text{factorialPrimePower}(3)$ and $z = \text{factorialPrimePower}(5)$. But then, we see that we need $2^2$ in $60$, so we can’t use all the $x$ we calculated, we need to pair them. So it becomes $x = \frac{x}{2}$. Now, using $x,y,z$ we can make $60$ only $MIN(x,y,z)$ times. So this will be number of trailing zero.

Summary

We find number of trailing zero using the following steps:

  1. Factorize the base $B$
  2. If $B = p_1^{a_1} \times p_2^{a_2}… \times p_k^{a_k}$, then find out occurance of $x_i = \text{factorialPrimePower} ( p_i)$.
  3. But we can’t use $x_i$ directly. In order to create $B$ we will need to combine each $p_i$ into $p_i^{a_i}$. So we divide each $x_i$ by $a_i$.
  4. Number of trailing zero is $MIN(x_1,x_2,…,x_k)$.

Resources

  1. forthright48 – Number of Digits of Factorial
  2. forthright48 – Prime Factorization of Factorial
  3. forthright48 – Introduction to Number System
  4. Wikipedia – Trailing Zero

Prime Factorization of Factorial

Problem

Given a positive integer $N$, find the prime factorization of $N!$.

For example, $5! = 5 \times 4 \times 3 \times 2 \times 1 = 120 = 2^3 \times 3 \times 5$.

Brute Force Solution

A possible solution is to calculate the value of $x = N!$ and then prime factorize $x$. But calculating $N!$ is tedious. We cannot fit $N!$ where $N > 20$ in a long long variable. We will need to use the Big Integer class and that would make things slow. I will soon write a blog post on Big Integer until then know that using Big Integer it would take more than $N^2$ steps to calculate $N!$.

Is there a better way?

Limits on Prime

Before we move on to the solution, let us first decide the limit on prime. In order to factorize $x = N!$, we have to generate prime numbers. But up to which value? Should we generate all primes less than $\sqrt{x}$?

Even for a small value of $N$ like $100$, $x$ can be huge with over $100$ digits in it, thus, $\sqrt{x}$ will also be huge. Generating so many primes is not feasible. Using Sieve of Eratosthenes we could generate primes around $10^8$, which is nowhere near $\sqrt{100!}$.

Note that $N! = N \times (N-1) \times (N-2) \times … \times 2 \times 1$. That is, $N!$ is a product of numbers less than $N$ only. Now, can there be any prime greater than $N$ that can divide $N!$?

Suppose there is a number $A$ and we factorized it. It is trivial to realize that all its prime factors will be less than or equal to $A$. So in $N!$, which is the product of numbers less than $N$, if we decompose all those numbers to their prime factors, then they will reduce to primes less than or equal to $N$.

For example, $6! = 6 \times 5 \times 4 \times 3 \times 2 \times 1 = (2 \times 3) \times 5 \times 2^2 \times 3 \times 2 = 2^4 \times 3^2 \times 5$.

So the prime factors of $N!$ will be less than or equal to $N$. Generating primes till $\sqrt{N!}$ is not necessary. We just need to generate all primes less than or equal to $N$.

Prime Factorization

Now that we know the limit for primes, we are ready to begin factorizing the factorial. There is more than one way to achieve this. We will see three of them and discuss which one is best.

First – Linear Loop From $1$ to $N$

We know that $N! = N \times (N-1) \times (N-2) \times … \times 2 \times 1$. So we could simply factorize every number from $1$ to $N$ and add to a global array that tracks the frequency of primes. Using the code for $factorize()$ from here, we could write a solution like below.

vector<int> prime;
int primeFactor[SIZE]; // Size should be as big as N

void factorize( int n ) {
    int sqrtn = sqrt ( n );
    for ( int i = 0; i < prime.size() && prime[i] <= sqrtn; i++ ) {
        if ( n % prime[i] == 0 ) {
            while ( n % prime[i] == 0 ) {
                n /= prime[i];
                primeFactor[ prime[i] ]++; // Increment global primeFactor array
            }
            sqrtn = sqrt ( n );
        }
    }
    if ( n != 1 ) {
        primeFactor[n]++;
    }
}

void factFactorize ( int n ) {
    for ( int i = 2; i <= n; i++ ) {
        factorize( i );
    }

    // Now We can print the factorization
    for ( int i = 0; i < prime.size(); i++ ) {
        printf ( "%d^%d\n", prime[i], primeFactor[ prime[i] ] );
    }
}

We pass the value of $N$ to $factFactorize()$ in line $20$, and it calculates the frequency of each prime in $N!$. It starts a loop from $2$ to $N$ and factorizes each of them. In line $4$ we have the $factorize()$ function modified a bit in line $10$ and $16$ to suit our need. When those lines find a prime factor, they increase the frequency of that prime in the $primeFactor$ array.

It is simple and straightforward but takes $O(N \times factorize() )$ amount of time. We can do better.

Second - Summation of Each Prime Frequency

Instead of factorizing from $1$ to $N$, how about we just find out how many times each prime occurs in $N!$ and list them. If $p_1$ occurs $a_1$ times, $p_2$ occurs $a_2$ times$...$ $p_x$ occurs $a_x$ times, then $N! = p_1^{a_1} \times p_2^{a_2} \times ... \times p_x^{a_x}$.

That sounds nice, but how do we find the frequency of prime factors in $N!$. Let us just focus on one prime factor, for example, $2$, and find out how many times it occurs in $N!$. We will extend this idea to other primes.

Let $N = 12$. How many times does $2$ occur in $12!$? We know that $12! = 12 \times 10 \times 9 \times ... \times 1$. How many numbers from $1$ to $12$ has $2$ as their prime factors? $\frac{12}{2} = 6$ numbers do and they are ${ 2, 4, 6, 8, 10, 12 }$. So we can say that at least $2^6$ is a factor of $12!$. But is there more?

Yes. Notice that $4 = 2^2$, so it has an extra $2$ in it that we did not count. That means for each number that has $2^2$ in them as a factor, we need to add $1$ to our result. How many numbers are there which has $2^2$ as their factor? $\frac{12}{4} = 3$ numbers, which are ${ 4, 8, 12 }$. So we increase our frequency to $6 + 3 = 9$ and say we have at least $2^9$ in $12!$. But is that it?

No. $8 = 2^3$ and for each number with $2^3$ as factor we add $1$ to result. So our result is $9 + \frac{12}{8} = 9 + 1 = 10$.

Do we try with $16 = 2^4$ now? No. $12!$ cannot have any number with factor $2^4$ since $\frac{12}{16} = 0$. So we conclude that $12!$ has $2^{10}$ as its factor and no more.

Now, we extend this idea to other primes. What is the frequency of prime factor $3$ in $12!$? $\frac{12}{3} + \frac{12}{9} + \frac{12}{27} = 4 + 1 + 0 = 5$. We repeat this for all primes less than equal to $12$.

Therefore, we can say that for a given prime $p$, $N!$ will have $p^x$ as its prime factor where $x = \frac{N}{p} + \frac{N}{p^2} + \frac{N}{p^3} + ... \text{ Until it becomes 0 }$.

So, using this idea our code will look as the following.

void factFactorize ( int n ) {
    for ( int i = 0; i < prime.size() && prime[i] <= n; i++ ) {
        int p = prime[i];
        int freq = 0;

        while ( n / p ) {
            freq += n / p;
            p *= prime[i];
        }

        printf ( "%d^%d\n", prime[i], freq ); // Printing prime^freq which is factor of N!
    }
}

This code factorizes $N!$ as long as we can generate all primes less than or equal to $N!$. The loop in line $6$ runs until $\frac{n}{p}$ becomes 0.

This code has 3 advantages over the "First" code.

  • We don't have to write $factorize()$ code.
  • Using this code, we can find how many \times a specific prime $p$ occurs in $N!$ in $O(log_p (N))$ time. In the "First" code, we will need to run $O(N)$ loop and add occurrences of $p$ in each number.
  • It has a better complexity for Factorization. Assuming the loop in line $6$ runs $log_2 (N)$ \times, this code has a complexity of $O(N log_2 (N))$. The code runs faster than this since we only loop over primes less than $N$ and at each prime the loop runs only $O(log_p (N))$ \times. The "First" code ran with $O(N \times factorize() )$ complexity, where $factorize()$ has complexity of $O(\frac{ \sqrt{N} }{ ln ( \sqrt{N} ) } + log_2(N) : )$.

This idea still has a small flaw. So the next one is better than this one.

Third - Better Code than Two

Suppose, we want to find out how many times $1009$ occurs in $9 \times 10^{18}!$. Let us modify the "Second" code to write another function that will count the result for us.

long long factorialPrimePower ( long long n, long long p ) {
    long long freq = 0;
    long long cur = p;

    while ( n / cur ) {
        freq += n / cur;
        cur *= p;
    }
    return freq;
}

If we pass $n = 9 \times 10^{18}$ and $p = 1009$, it will return us $8928571428571439$. But this is wrong. The line $7$ in the code overflows resulting in a wrong answer. Try it yourself. Print out the value of $cur$ in each step and see when it overflows.

We could change the condition in line $5$ into something like this to solve the situation:

while ( n / cur > 0 )

But this remedy won't work if $cur \times p$ overflows into a positive number. If we want we could use techniques that avoid multiplying two numbers if it crosses a limit, but there is a simpler way.

Note that $\frac{N}{p^3}$ is same as $\frac{ \frac{N}{p^2} }{p}$. So instead of saying that $res = \frac{N}{p} + \frac{N}{p^2}...$, we could rewrite it as

$x = N; \ res = res + \frac{x}{p}$.
$x = \frac{N}{p} = \frac{x}{p}; \ res = res + \frac{x}{p}$.
$x = \frac{N}{p^2} = \frac{ \frac{N}{p} } {p} = \frac{x}{p}; \ res = res + \frac{x}{p}$.
$...$
$x = 0$

Instead of raising the power of $p$, we divide the value of $N$ by $p$ at each step. This has the same effect.

So our code for finding the frequency of specific prime should look like the following:

long long factorialPrimePower ( long long n, long long p ) {
    long long freq = 0;
    long long x = n;

    while ( x ) {
        freq += x / p;
        x = x / p;
    }

    return freq;
}

There might still be inputs for which this code will overflow, but chances for that is now lower. Now if we send in $n = 9 \times 10^{18}$ and $p = 1009$, then this time we get $8928571428571425$ as our result.

If we apply this improvement in our $factFactorize()$ function, then it will become:

void factFactorize ( int n ) {
    for ( int i = 0; i < prime.size() && prime[i] <= n; i++ ) {
        int x = n;
        int freq = 0;

        while ( x / prime[i] ) {
            freq += x / prime[i];
            x = x / prime[i];
        }

        printf ( "%d^%d\n", prime[i], freq );
    }
}

This code has less chance to overflow so it is better.

Resource