R is considered the predominant language for data scientist and statisticians. Learn how to use R for your own data science projects.

Published on

4,593 Points

Prime numbers are natural numbers greater than 1 that have only two divisors (the number itself and 1). By “divisible” we mean dividend % divisor = 0 (% indicates MODULAR. It gives the reminder of a division operation). We’ll follow multiple approaches to find out if a number is prime. Let's find out if P is a prime number.

**Approach 1:** Divide P by all the numbers from 2 to P - 1. If any of these numbers is divided by P then P is not a prime number. The algorithm below returns false if P is divisible by any number between 2 and P - 1. Otherwise, P is a prime number and returns true.

**Approach 2:** Whenever there is a divisor greater than square-root(P), there must be another divisor less than or equal to square-root(P) but NOT greater than square-root(P). For example: Divisors of 20 are: 20 = 1, 2, 4, 5, 10 and 20. For each of these pairs there MUST be least one divisor that is less than or equal to square-root(P).

20 =**1** X 20

20 =**2** X 10

20 =**4** X 5

If we stopped finding divisors when current divisor i is 4, we could still get all the divisors. So, instead of looking for all numbers between 2 and (P - 1) we can do 2 to sqrt(P) which gives a faster solution.

**Approach 3:** The only prime number that is even is 2. Any other even number cannot be prime because all even numbers are divisible by 2. So, instead of dividing by all the numbers in range 0 to (P - 1) we’ll divide only by odd numbers up to sqrt(P).

**Approach 4:** Suppose we need to find prime numbers between 1 and 25. The algorithms described above are capable of providing a result but are not optimal. We might have to divide by the same divisor over and over again. For example, we are checking if each of the dividends below are being divided by 2 and 3.

1. Is 15 Prime? Divide 15 by 2 and 3 (2 to sqrt(15))

2. Is 20 Prime? Divide 20 by 2, 3 and 4 (2 to sqrt(20))

3. Is 25 Prime? Divide 20 by 2, 3, 4 and 5 (2 to sqrt(20))

There is another approach that finds prime numbers in a range optimally. Let's check steps for that algorithm while finding all prime numbers between 1 and 25. The first prime number is 2. Discard all multiples of 2 (except 2 itself) as they must not be prime numbers. Multiples of a prime number P or any natural number CANNOT be prime,**ignoring multiplcation by 1**.

2 3~~4~~ 5 ~~6~~ 7 ~~8~~ 9 ~~10~~ 11 ~~12~~ 13 ~~14~~ 15 ~~16~~ 17 ~~18~~ 19 ~~20~~ 21 ~~22~~ 23 ~~24~~ 25

Now discard multiples of 3

2 3~~4~~ 5 ~~6~~ 7 ~~8~~ ~~9~~ ~~10~~ 11 ~~12~~ 13 ~~14~~ ~~15~~ ~~16~~ 17 ~~18~~ 19 ~~20~~ ~~21~~ ~~22~~ 23 ~~24~~ 25

As we know that there will be not be a new prime number once we reach sqrt(n) we will continue this process until we reach sqrt(25) or 5. That is, discard all multiples of 2, 3, 4 and 5. This ancient algorithm is known as**Sieve of Eratosthenes**.

Discard multiples of 4

2 3~~4~~ 5 ~~6~~ 7 ~~8~~ ~~9~~ ~~10~~ 11 ~~12~~ 13 ~~14~~ ~~15~~ ~~16~~ 17 ~~18~~ 19 ~~20~~ ~~21~~ ~~22~~ 23 ~~24~~ 25

Discard multiples of 5

2 3~~4~~ 5 ~~6~~ 7 ~~8~~ ~~9~~ ~~10~~ 11 ~~12~~ 13 ~~14~~ ~~15~~ ~~16~~ 17 ~~18~~ 19 ~~20~~ ~~21~~ ~~22~~ 23 ~~24~~ ~~25~~

The remaining numbers are 2, 3, 7, 11, 13, 17, 19 and 23. Each of these numbers is prime. This method is very fast in cases where we need to search for prime numbers in a gigantic range.

In this algorithm below, there is a boolean array that stores prime values of numbers 0 - 50000. Initially, all of them are false. Take each of the numbers less than sqrt(50000) and mark all of their multiples as false. Then store the values of p[] that are marked as true into prime[] which is the array of all prime numbers in a range. To be clear,

p[] = array that contains boolean values for all the numbers in a range

prime[] = array that contains prime numbers in a range

In the above example, we have 2, 3, 7, 11, 13, 17, 19 and 23 that are marked as true after exclusion. So, simply copy these values to prime[] array. prime[] = {2, 3, 7, 11, 13, 17, 19, 23}

In this way, we have an array that contains ONLY prime numbers (remember, p[] had boolean values for numbers 0 - 25 but we are only interested in prime numbers. So, we copied prime numbers into another array.) Suppose if we are asked to find first eight prime numbers we could easily print the numbers in index 0 to 7 of prime[] array.

```
// This method takes one parameter P
// to check if P is prime
public boolean isPrime(int P) {
for(int i = 2; i < P; ++i) {
if(P % i == 0) return false;
}
return true;
}
```

20 =

20 =

20 =

If we stopped finding divisors when current divisor i is 4, we could still get all the divisors. So, instead of looking for all numbers between 2 and (P - 1) we can do 2 to sqrt(P) which gives a faster solution.

```
public boolean isPrime(int P) {
// i * i <= P is similar to i <= sqrt(p)
for(int i = 2; i * i <= P; ++i) {
if(P % i == 0) return false;
}
return true;
}
```

```
public boolean isPrime(int P) {
if(P == 2) return true;
else {
for(int i = 3; i * i <= P; i = i + 2) {
if(P % i == 0) return false;
}
}
return true;
}
```

1. Is 15 Prime? Divide 15 by 2 and 3 (2 to sqrt(15))

2. Is 20 Prime? Divide 20 by 2, 3 and 4 (2 to sqrt(20))

3. Is 25 Prime? Divide 20 by 2, 3, 4 and 5 (2 to sqrt(20))

There is another approach that finds prime numbers in a range optimally. Let's check steps for that algorithm while finding all prime numbers between 1 and 25. The first prime number is 2. Discard all multiples of 2 (except 2 itself) as they must not be prime numbers. Multiples of a prime number P or any natural number CANNOT be prime,

2 3

Now discard multiples of 3

2 3

As we know that there will be not be a new prime number once we reach sqrt(n) we will continue this process until we reach sqrt(25) or 5. That is, discard all multiples of 2, 3, 4 and 5. This ancient algorithm is known as

Discard multiples of 4

2 3

Discard multiples of 5

2 3

The remaining numbers are 2, 3, 7, 11, 13, 17, 19 and 23. Each of these numbers is prime. This method is very fast in cases where we need to search for prime numbers in a gigantic range.

In this algorithm below, there is a boolean array that stores prime values of numbers 0 - 50000. Initially, all of them are false. Take each of the numbers less than sqrt(50000) and mark all of their multiples as false. Then store the values of p[] that are marked as true into prime[] which is the array of all prime numbers in a range. To be clear,

p[] = array that contains boolean values for all the numbers in a range

prime[] = array that contains prime numbers in a range

In the above example, we have 2, 3, 7, 11, 13, 17, 19 and 23 that are marked as true after exclusion. So, simply copy these values to prime[] array. prime[] = {2, 3, 7, 11, 13, 17, 19, 23}

In this way, we have an array that contains ONLY prime numbers (remember, p[] had boolean values for numbers 0 - 25 but we are only interested in prime numbers. So, we copied prime numbers into another array.) Suppose if we are asked to find first eight prime numbers we could easily print the numbers in index 0 to 7 of prime[] array.

```
boolean[] p = new boolean[50000];
public int[] prime(int n) {
// Initially, mark all numbers as prime
for(int i = 0; i <= n; ++i)
p[i] = true;
// 0 and 1 are not prime numbers
p[0] = p[1] = false;
for(int i = 2; i * i <= n; ++i) {
if(p[i] == true) {
for(int j = i + i; j <= n; j = j + i) {
// mark multiples of i as non-prime
p[j] = false;
}
}
}
int k = 0;
// prime[] harvests prime numbers
int[] prime = new int[n/4];
for(int i = 0; i <= n; ++i) {
if(p[i] == true) {
prime[k++] = i;
}
}
return prime;
}
```

2 Comments

1. Instead of the for-loop condition "i*i <= n", calculate (int)sqrt((double)n), store it in an int called sqrt_n, and then use "i <= sqrt_n" -- that saves O(n) multiplications.

2. The for-loop increment step "j = j + i" is more efficiently written "j += i". In fact, since the initialization and the increment step are the same, the whole loop can be written: "for (j=i; (j+=i) <= n; )"

3. In general, unless there's a good reason, use prefix decrementation ("--i") and post-fix incrementation ("i++").

4. Why bother writing "== true"? If it's a boolean, write just "if (mybool)" rather than "if (mybool == true)"! This saves O(n) subtractions, because comparison is simply subtracting the two values, noting positive/negative/zero, and throwing out the result.

5. I've found it worthwhile, when stepping through an array from end to end, to maintain a pointer and increment it rather than use an integer as a subscript, because that involves O(n) addition operations total. For example the sieving loop in the Eratosthenes method is better written:

```
for (int i=2, boolean *pp=p+2; i <= sqrt_n /* see #1 above */; i++, pp++)
if (*pp)
for (Boolean *ppp=pp+i, int j=i; (j+=i) <= n; )
*(ppp += i) = false;
```

Of course the lower loop has two additions per iteration, so there are no savings there.
Another thing, as you said "Instead of the for-loop condition "i*i <= n", calculate (int)sqrt((double)n), store it in an int called sqrt_n, and then use "i <= sqrt_n" -- that saves O(n) multiplications."

From my understanding, i * i <= n is similar to i <=sqrt(n). Only difference is storing sqrt(n) in a variable which is a code optimization approach and good thing to do (but not the prime focus of this article). I don't understand how does i <= sqrt_n save O(n) multiplication?

Title | Views | Activity |
---|---|---|

Linear Search | 1,059 | |

Introduction to Recursion | 1,076 | |

It's Time to Rethink the Algorithm | 1,280 | |

Trends 2018 – What Will Dominate In SEO This Year? | 499 |

Learn how to collaborate with office 365 Office Online

Next Article:A Guide to Approximate String Matching