Primes

Let's discuss primes!

A natural number is considered a prime if it is:

  • Greater than 1
  • Its divisors are only 1 and itself

Examples:

\(5\) has two divisors: \(\frac{5}{1} = 5\) and \(\frac{5}{5} = 1\)

Which means \(5\) is a prime.

\(4\) has three divisors: \(\frac{4}{1} = 4\) and \(\frac{4}{4} = 1\) and \(\frac{4}{2} = 2\)

Which means \(4\) is not a prime.


Problem

Given an integer \(n\), write a function that returns a list of all prime numbers up to \(n\) (inclusive).

Input

  • \(n\): Integer

Examples

prime_func(5) => [2, 3, 5]  
prime_func(10) => [2, 3, 5, 7]  
prime_func(0) => []  

Strategy

A quick, brute-force way to solve this is to go through all numbers from \(2\) to \(n\) and evaluate each number if it is a prime.

We know that a number is prime if it only has 2 divisors. This means that if we run into more than 2 divisors, we can immediately say that this number is not a prime.

In order for us to evaluate each number (call it \(i\)) from \(2\) to \(n\), we'll need to check from \(2\) to \(i\).

def naive_primes(n):  
    primes = []
    for i in range(2, n+1):
        is_prime = True

        for j in range(2, i):
            if i % j == 0:
                is_prime = False
                break

        if is_prime:
            primes.append(i)

    return primes

By default, we'll set the prime flag to true at the start of each iteration. The modulo operation i % j == 0 tells us that if this evaluation is true, then we have an additional divisor, implying that the number \(i\) is not prime. As soon as we know that this number isn't prime, there is no need to iterate through the rest of the numbers; we'll set the prime flag to false and break from this inner loop to save some performance.

Lastly, we only add the number \(i\) if and only if our prime flag is set to true.

This algorithm costs us \(O(n)\) in space complexity for all prime numbers \(<=n\) and quadratic time complexity.

Improved Solution

We can actually do better by realizing that prime numbers have patterns that we can take advantage of.

Consider the prime number \(2\).

We notice that all even numbers, i.e. \(4, 8, 32, 48, 102, ...\) are not prime because it is divisible by the prime number \(2\).

So, whenever we evaluate a number as prime, what if we use that number to somehow mark the future numbers as not prime?

We can implement this by having a boolean array flag, which will cost us a bit more in space complexity, but save us a good deal of performance in time complexity.

def primes(n):  
    primes = []

    is_prime = [False, False] + [True] * (n-1)

    for i in range(2, n+1):
        if not is_prime[i]:
            continue

        primes.append(i)

        for j in range(i, len(is_prime), i):
            is_prime[j] = False

    return primes

Since we want our is_prime boolean array to be aligned with \(0...n\), we'll have the first two elements be False (since \(0\) and \(1\) are not prime by definition) and set the rest of the elements to be True (temporary default value).

We do a check at the start of each number evaluation by checking against the boolean flag in if not is_prime[i]: continue. If the number isn't a prime, we skip to the next iteration.

When we start at i = 2, the boolean flag is mostly full of True values, so the is_prime check doesn't do much for us initially. However, we will soon come to realize that it becomes very useful in a short amount of time.

Once we add \(2\) to our primes list, we will clear the True flags for all numbers up to \(n\) that are divisible by \(2\). This essentially sets all the even numbers in the example above (\(4, 8, 32, 48, 102, ...\)) to the not prime flag.

Looking back now, since we completely skip numbers that correspond to False in the is_prime boolean array, as soon as we iterate to i = 3, we'll be saving a lot of performance due to skipping a lot of numbers. This effect of skipping number lowers as \(i\) gets higher and higher, but all in all, this algorithm is much more efficient than our brute-force approach.