Implementing the Sieve of Eratosthenes in Delphi
The Sieve of Eratosthenes is a very fast, but memory intensive algorithm to find the primes under a given value.
I first read about this method when I was about 12 in a book called “Mathematics: a Human Edeavor” by Harold R. Jacobs. I had forgotten about the details of this method until I stumbled on it again the other day. Here is how I developed a solution in Delphi:
The algortihm
Start by creating a field of numbers from 3 up to desired maximum.
3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63
Mark multiples of 2 starting with 22
3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63
After this we repeatedly find the lowest unmarked value (which will always be prime) and mark its multiple starting with its square
Mark multiples of 3 starting with 32
3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63
The next unmarked value greater than 3 is 5. Mark multiples of 5 starting with 52
3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63
7 is the next unmarked value. So we mark multiples of 7 starting at 72
3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63
11 is the next prime, but 112 is outside our max. We are done and all unmarked values are prime. You can see how this algorithm can be very quick esp. since we only have to mark prime multiples and then only multiples greater than the square of each prime in turn.
What is the downside to this method?
We need at least one bit of storage per number that we want to mark. For simplicity if our bitfields start at 0 (and I’ll show why that is a good idea soon) and we want to mark primes up to max(uint32) 4,294,967,295. Then we would need 512 MB just for the bits. We could shrink the bitfield by only storing the odd values and starting from 3.
The algorithm is very fast, but the memory constraints make it problematic to get large primes.
Supposing that we will need to use memory for other purposes, such as storing the prime factorization of numbers we should consider building and running the application as a 64-bit binary to allow easy access to more memory.
Implementation in Delphi
The Bitfield
For the purposes of keeping the implementation simple and still gaining speed I will make the bitfield starting at 0 and a bit for each value. Index 0 represents 0, index n represents n. This keeps the logic very simple. I simply index by each value directly and set the bit. But, what if we want to store values larger than our sample bit-field of 64 bits? We could simply keep an array of 64-bit fields
for example we could have FBitFieldArray : Array[0..2] of UInt64;
to represent 0 to 191
FBitFieldArray[0]
would be 0 to 63, FBitFieldArray[1]
would be 64 to 127, FBitFieldArray[2]
would be 128 to 191. Indexing on field [0] is easy, we just use our number, but what about the other fields. In Delphi integer division is done with Div
, while the Mod
method gets the remainder (modulus) of a number after division. Both can be done at the same time with DivMod
. Since we have blocks of 64 we can divide by 64 to get the block we are in and the remainder would be the position inside that block
We will build our code based on 3 blocks and then we will generalize later
2 is the odd man out
All primes are odd except for 2, which forces us to have a check every iteration of the loop for this exceptional case. In all other cases we can check for our next prime value starting with the previous prime’s position plus 2.
To eliminate this the check for 2 each iteration (and since 2 is a trivial case) we can set all the bits in our bitfield for 2s first before repeating the same process for primes 3 and up.
But, this is not needed. We know that setting every even value will just yield a bit pattern of alternating ones and zeros 101010101. We do have a special case though; 2 is not prime and, of course, 0 and 1 are not prime numbers either. Since binary 10102 is just hexadecimal 516 we can set all blocks to $5555555555555555
except for the very first block. The only difference for the first block is that the first 4 bits would be 00112 or 316. So, we can set the first block to $5555555555555553
Now we can move on to real algorithm.
Eliminate multiples of primes
Starting with prime 3 we can now mark multiples of 3 as not prime. We have to exclude 3 itself of course, but we already covered 2*3 so we don’t need to do 3*2 as well. Therefore we can start at 3*3. This logic applies as we move to each next prime, so we can always start with multiples of our prime starting with its square
Setting the bit corresponding to a composite number
Next we need to find a way to set a specific bit by index. LBlockBitIndex
tells us the bit we need to set, but how do we actually set a bit based on that index? In numeric terms each bit n represents 2n, therefore we want the value 2LBlockBitIndex.
However, we don’t actually need to calculate a power of 2, we can use bit shifting. The shl
method in Delphi shifts bits left. For instance if we start with 100002 and shift 2 left we’d get 10000002. Using 1 shl LBlockBitIndex
we will shift the value by the number of bits. At first glance this looks OK, but what size is the 1
constant used in that expression? Its unlikely to be 64-bit, which means that shifts over the size of the 1
constant will give us incorrect results. To ensure that we shift for a 64-bit unsigned buffer value we can make it explicit by casting UInt64(1) shl LBlockBitIndex
Watch out! Don’t assume the sizes of declared constants either used in expressions or declared with a const statement. If you need a specific size, make it explicit
Now that we have a value with the bit all we need to do is to apply a bitwise or
with the correct block in the field. This way we will set the bit at that index and leave all other bits in the field as they were before
Delphi uses the or
operator for logical and bitwise or
operations. The type of operation depends on the operands, in this case we have a bitwise logical operation.
Finding the next prime
The sieve eliminates primes by marking multiples so if we start with our last prime and move bit by bit until we find an unmarked bit then that bit will represent our next prime. So given we our field how do we do this?
We know that we have checked all values up to p (our last prime). p is 3 or larger so we know we can check for even unmarked bits p+2 or higher. In a loop for n we can shift right by p+n+2 (n > 0) until the right most bit is 0.
Of course as we increment to get the next prime we may loop over to the next block
This code is not fully optimized, but shows the process of finding the next unset bit. Lastly since we convert the Block and Bit index to a number which will be our next prime (we subtract 1 because our bitindex is one out of phase).
Generalizing the sample
First lets remedy the fact that we use “magic numbers” all over the place. While it is fine to write comments to explain the use of numbers that are hard to derive from context. Instead of using 64 for our block size we can declare a constant
We make assumptions that the number of blocks are fixed. That may not be the case, we want to make the sieve more flexible and allow it to expand according to demands
Let start by defining MaxValue
; this will be the cap when searching for primes. To represent that value we would need Ceil(MaxValue/BlockSize)
blocks in our field, so that leads us to MaxBlockIndex := Floor(MaxValue/BlockSize)
.
We know that we start marking multiples of squares of primes, therefore we don’t have to check primes greater than the integer square root of MaxValue
. This gives us MaxValueISqrt := Trunc(Sqrt(MaxValue))
.
Since we can set all of these as dependency on a single value we can write something like this:
With these changes our main loop will now look more readable. Here is our main routine that creates the prime mask
We can also clean up our GetNextPrime
to add in our new variables:
Finding Primes up to a Maximum
The process would be to mark all multiples primes using the method above up to the squareroot of the maximum. And then traverse the field and return the indexes of all the zeros. Assume NumberOfBitsSet
is a method that returns the number of 1s in the field, we will cover that in a moment.
Couting bits
In order to allocate exactle enough memory to return our primes in the method above we needed to know the number of bits set. The difference between the max value and the number of bits set gives us our number of primes. I found this nugget that I converted from C to Delphi. All I do finally is looping through the blocks and accumulating the count
Conclusion
I hope that you found this useful or at least interesting. Please download, clone or fork the source code.
Leave a Comment