Prime Factorization of an Integer

Problem

Given an integer N, find its prime factorization.

For example, $12 = 2 \times 2 \times 3 = 2^2 \times 3$.

It is somewhat difficult to factorize an integer. If $N$ is small enough, then it is possible to factorize it using prime generate by Sieve of Eratosthenes.

Trial Division Method

Suppose we are trying to factorize $980$. How would we attempt to factorize it with pen and paper? We will try to extract the smallest possible prime factors from it and make it smaller. That is we will try to divide the number with prime numbers and keep track which is capable of dividing it.

The first prime we have is $2$. Can we divide $980$ with $2$? Yes. $980=2 \times 490$. Now we need to factorize $490$ which is smaller and thus easier.

Again, let’s try to factorize $490$ with $2$. It is possible, so $980=2 \times 2 \times 245= 2^2 \times 245$. It is not possible to extract another $2$ from $245$ so we move on to next prime.

$3$ fails to divide $245$, so we move on to $5$. By repeating this process, we find out that $980=2^2 \times 5 \times 7^2$.

This process is kind of laborious, but we have computers to do our bidding. So for this trial division to work, all we need is a list of primes so that we can try dividing $N$ one by one from that list. How will we get that list? Using Sieve of Eratosthenes.

But before we start generating prime, we need to answer another question. Sieve of Eratosthenes can generate prime up to a certain number. So up to which value should, we generate prime numbers to factorize N?

Limit for Sieve of Eratosthenes

Well, if we are trying to factorize $N$, there is no point in generating primes that are larger than $N$. So we can generate up to $N$ and factorize using the list with primes less than or equal to $N$. But we can do even better by observing a special property of prime factors of $N$.

$N$ can have many prime factors. Some of the prime numbers will be $< \sqrt{N}$ and some $\ge \sqrt{N}$. Exactly how many primes factors can $N$ have that are greater than $\sqrt{N}$?

Notice that if $N$ can never have two prime factors $>\sqrt{N}$ since, $if : A > \sqrt{N}, then : A \times A > N$.

So if we generate primes up to $\sqrt{N}$ and use that list to factorize $N$ then at the end, we will either have 1 or a number greater than $\sqrt{N}$ which will be a prime for sure.

For example, $N=42$ and $\sqrt{N}=6.4 \approx 6$. So let’s try to factorize using primes less than or equal to $6$ only, that is using only $[2,3,5]$. What do we get? $N = 42 = 2 \times 3 \times 7$. Since $7$ can no longer be divided with any prime from our list $[2,3,5]$ we stop. $7$ is our last missing prime factor.

So, in order to factorize $N$, we need to generate primes up to $\sqrt{N}$ only.

Code for Sieve of Eratosthenes

vector <int> factors;
void factorize( int n ) {
    int sqrtn = sqrt ( n );
    for ( int i = 0; i < prime.size() && prime[i] <= sqrtn; i++ ) {
        if ( n % prime[i] == 0 ) {
            while ( n % prime[i] == 0 ) {
                n /= prime[i];
                factors.push_back(prime[i]);
            }
            sqrtn = sqrt ( n );
        }
    }
    if ( n != 1 ) {
        factors.push_back(n);
    }
}

Let me explain a few things about the code. In line 4, inside the for loop I wrote the following condition, i < prime.size() && prime[i] <= sqrtn. The first condition ensures that we don't go array over-bound. In order to understand it, imagine what will happen if we try to factorize a prime number $>\sqrt{N}$. Without that condition, you might get RTE.

The second condition ensures we are always trying to factorize using prime less than $\sqrt{N}$.

In line 5, we check if it is possible to divide the current $N$ with prime[i]. If so we start a loop in line 6 and keep on extracting until we cannot anymore.

In line 10, we are optimizing our code. Since, after extraction of prime factors, $N$ has become smaller, we now need to only factorize with a smaller amount of prime numbers. Due to this single line, the code converges really quickly.

In line 13, we are checking if the residual number after factorization is 1 or not. If not, it is the only prime factor which is greater than $\sqrt{N}$. So we add it to our prime factor list.

Time Complexity

There are two loops in the code for factorize(). The first one is at line $4$. This loop runs once for every prime $\leq \sqrt{N}$. From "Prime Number Theorem", $\pi (N) \approx \frac{N}{ln(N)}$. So there are approximately $\frac{ \sqrt{N} }{ ln ( \sqrt{N} ) }$ primes $\leq \sqrt{N}$.

The second loop at line $6$ runs once for every prime factor $N$ has. So it cannot run more than $log_2(N)$ \times.

So the complexity is $O(\frac{ \sqrt{N} }{ ln ( \sqrt{N} ) } + log_2(N) : )$. In practice, if $N$ is not a prime then due to line $10$, $N$ becomes small quickly. So it is much faster than what we calculated.

Optional Optimization

There is an optional optimization we can perform only when it is possible to generate sieve up to N. Just insert a single statement in line 5.

for ( int i = 0; i < prime.size() && prime[i] <= sqrtn; i++ ) {
        if ( sieve[n] == 0 ) break; /*Checks if n is prime or not*/
        if ( n % prime[i] == 0 ) {

The new line added simply checks if the number we are trying to factorize is a prime or not. If it is prime, then obviously we cannot factorize it anymore. This optimization is highly useful when we already have sieve generated up to $N$.

Reference

  1. forthright48 - Prime Number Theorem - https://forthright48.com/2015/07/prime-number-theorem.html

Sieve of Eratosthenes – Generating Primes

Problem

Given an integer N, generate all primes less than or equal to N.

Sieve of Eratosthenes – Explanation

Sieve of Eratosthenes is an algorithm that generates all prime up to N. Read this article written by Jane Alam Jan on Generating Primes – LightOJ Tutorial. The pdf contains almost all the optimizations of the Sieve of Eratosthenes.

Code

vector <int> prime; // Stores generated primes
char sieve[SIZE]; // 0 means prime
void primeSieve ( int n ) {
    sieve[0] = sieve[1] = 1; // 0 and 1 are not prime

    prime.push_back(2); // Only Even Prime
    for ( int i = 4; i <= n; i += 2 ) sieve[i] = 1; // Remove multiples of 2

    int sqrtn = sqrt ( n );
    for ( int i = 3; i <= sqrtn; i += 2 ) {
        if ( sieve[i] == 0 ) {
            for ( int j = i * i; j <= n; j += 2 * i ) sieve[j] = 1;
        }
    }

    for ( int i = 3; i <= n; i += 2 ) if ( sieve[i] == 0 ) prime.push_back(i);
}

Some lines from the code above can be omitted depending on the situation. But as a whole, the above code gives us two products, a vector<int> prime which contains all generated primes and a char[] sieve that indicates whether a particular value is prime or not. We will need the sieve array for optimizations in other algorithms.

Complexity

The algorithm has a runtime complexity of $O(N log (logN ) )$

Primality Test – Naive Methods

Problem

Given a number N, determine if it is a prime.

What is a Prime Number?

A prime number is a positive number greater than 1 which has no positive divisor except for 1 and itself. For example, 2, 3, 5, 11 are prime numbers. Whereas 4, 6 are non-prime or composite numbers.

O(N) Solution

From the definition, we can easily construct a Primality Test function that can determine if a given number N is prime or not. Let us call it isPrime() function.

bool isPrime ( int n ) {
    if ( n <= 1 ) return false; // n needs to be greater than 1
    for ( int i = 2; i < n; i++ ) {
        // If it is possible to divide n with a number other than 1 and n, then it is not prime
        if ( n % i == 0 ) return false;
    }
    return true; // Otherwise, this is prime
}

The code simply iterates over all values between 2 and N-1 and checks if it can divide N. It has a complexity of O(N).

Read More