Tag: algorithms

Sieve of Eratosthenes Visualized in Real Time with Python

Spoiler Alert! This article reveals the answer to Project Euler problem #10.


Try it!

A few months ago I worked through the first 10 problems of Project Euler. I did OK up to problem #10, which asks for the sum of all the prime numbers below two million. There were earlier questions that involved finding primes, and I cobbled together some JavaScript with the help of underscore.js that got the job done, albeit very slowly. My homespun solution was just too durn slow for problem #10. I turned to the internet, and it provided me a blazing fast solution. See below.

var ten = function sieve(max) {
    var D = [],
        primes = [],
        sum = 0;
    for (var q = 2; q < max; q++) {
        if (D[q]) {
            for (var i = 0; i < D[q].length; i++) {
                var p = D[q][i];

                if (D[p + q])
                    D[p + q].push(p);
                else
                    D[p + q] = [p];
            }
            delete D[q];
        } else {
            sum += q;
            primes.push(q);
            if (q * q < max)
                D[q * q] = [q];
        }
    }
    return sum;
}(2000001);

So yeah, I cheated. At the time, I had no idea how this code worked; I just knew that it did. Since then, I’ve occasionally revisited this code to take another shot at figuring out what this mysterious D array is doing with its Ps and Qs. However, trace after trace and many breakpoints later, I was still scratching my head.

But just yesterday I stumbled across Python Tutor. This site is simply amazing. These geniuses have developed an interactive online code editor that spits out pictures based on what your code is doing. You even get little playhead controls to step through each line! This wizardry finally allowed me to scratch the itch that’s been bugging me for a while.

I returned to the site where I ‘borrowed’ the JavaScript algorithm for the Sieve of E., and found the Python equivalent. The syntax is a bit different, but it is essentially doing the same thing. The code looks like this:

def eratosthenes(maxnum):
    D = {}  # map composite integers to primes witnessing their compositeness
    q = 2   # first integer to test for primality
    while q <= maxnum:
        if q not in D:
            yield q       # not marked composite, must be prime
            D[q*q] = [q]  # first multiple of q not already marked
        else:
            for p in D[q]:  # move each witness to its next multiple
                D.setdefault(p+q, []).append(p)
            del D[q]        # no longer need D[q], free memory
        q += 1

for p in eratosthenes(19):
    print(p)

The comments were helpful in determining that it was doing the same thing as the JS algorithm, but when I plugged this code into Python Tutor, WHAMMO! It hit me like a ton of bricks. D is a dictionary!!! D is where non-prime numbers are stored and created from prime factors.

So we iterate over all the numbers from 2 up to whatever, with q as the loop counter. For each value of q, check the dictionary (D). If q is not a key in D, then the number is prime. For each prime number, find the square and store it in the dictionary for later. So when q = 2, the first entry in D is 4. Then when q = 3, stick 9 in the dictionary. Each entry created by squaring a prime also keeps that prime attached as a child (a one-item list).

When q is not prime, examine its dictionary entry. Take each child, add it to the dictionary key (which is q), and create a new entry in the dictionary for the sum. So when q = 4, its child value is 2. Take 2 + 4 and get 6. Create a dictionary entry for 6, and store the original prime factor (2) as its child. If an entry for that sum already exists, the prime is appended to its list of children instead. Then delete the previous dictionary entry (4). Keep this up until q is larger than your target value (2000001).
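If you don't have Python Tutor handy, here is a rough sketch that prints the same kind of step-by-step picture. The function name `trace_sieve` and the `snapshots` list are my own inventions for illustration, not part of the borrowed code; the sieve logic itself follows the generator above.

```python
def trace_sieve(maxnum):
    """Run the dictionary-based sieve and record D after each value of q."""
    D = {}           # composite number -> list of primes that divide it
    primes = []
    snapshots = []   # (q, is_prime, copy of D after handling q)
    for q in range(2, maxnum + 1):
        if q not in D:
            # q was never marked, so it is prime; mark its square for later
            primes.append(q)
            D[q * q] = [q]
            is_prime = True
        else:
            # q is composite; push each child prime on to its next multiple
            for p in D[q]:
                D.setdefault(p + q, []).append(p)
            del D[q]
            is_prime = False
        # copy the lists too, so later steps do not change earlier snapshots
        snapshots.append((q, is_prime, {k: list(v) for k, v in D.items()}))
    return primes, snapshots


primes, snapshots = trace_sieve(10)
for q, is_prime, state in snapshots:
    label = "prime" if is_prime else "composite"
    print(f"q={q:2d} {label:9s} D={state}")
```

At q = 4 the snapshot shows the entry for 4 gone and a new entry for 6 holding the child 2, which is exactly the hand-off described above.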

This part took me the longest to figure out, and it makes a lot more sense by skipping the deletion part. In the Python Tutor example, comment out the line del D[q]. When running the visualizer, the dictionary keeps growing and it is much easier to see how each entry is created, and why each entry is composite rather than prime. For each square value, its root is added over and over again until q reaches its limit. Deleting dictionary entries is not absolutely necessary with low values of q. It’s a memory saver that keeps D from getting out of hand (and crashing Chrome when trying this in JavaScript).
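The no-deletion experiment can also be sketched outside the visualizer. This is a minimal version under my own naming (`sieve_keep_all` is hypothetical); it is the same loop with the `del D[q]` line left out, so D is returned with every entry it ever held.

```python
def sieve_keep_all(maxnum):
    """Dictionary sieve with the deletion step omitted, so D keeps growing."""
    D = {}
    primes = []
    for q in range(2, maxnum + 1):
        if q not in D:
            primes.append(q)
            D[q * q] = [q]
        else:
            for p in D[q]:
                D.setdefault(p + q, []).append(p)
            # 'del D[q]' intentionally left out to keep the full history
    return primes, D


primes, D = sieve_keep_all(12)
for key in sorted(D):
    print(key, D[key])
```

Every key in D is a multiple of each prime in its list, and the keys up to maxnum are exactly the composite numbers in that range. Some keys run past maxnum (25, 49 and so on) because squares and next multiples are stored before q reaches them.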

I know I did a sub-par job of explaining this, but seriously, that’s what Python Tutor is for. Try it, you’ll like it.

Count the occurrences of an item in a List or Array

So I came across a situation where I had a whole bunch of items in a collection, and I needed to know how many of each kind there were.  In other words, I needed to know the number of each unique element in an IList, Array, ArrayCollection, or what-have-you.  I looked for some help online, and fiddled around and got it working.  Then a few weeks later I had to do the same thing, and darn near forgot how I did this.  Here is the trick – SORT!

Sort the items in the array (alphabetically, numerically, chronologically, etc…) so that unique items are grouped together.  Then simply loop over the collection counting identical items, and when a new item occurs, make a note of how many you have of the first item, and start counting again.  Below is a quick example.  Click “Generate” to create a list of 50 items, where each item is going to be a fish, cat, dog or pony.  Then click “Count” to count the number of each one.  Also notice how the original list is now sorted so all the ‘dogs’ and ‘cats’ are together.

Here’s the meat of the code:

// NOTE: the type names (Array, Object, String) are assumptions; they were missing from this listing.
sortedList = new Array();

//Sort newList alphabetically
newList.sort();

/*  Create an object to keep track of each unique item in the array. For example
*   { name: 'cat',
*     value: 7  }
*/
var countObject:Object;

//Create a var to store the previously examined item in the list.
var previousItem:String;

//Loop over all the items in newList
for each ( var currentItem:String in newList )
{
    //If the current item in the list is different from the previous item, then create a new countObject and start counting the new item.
    if (previousItem == null || currentItem != previousItem)
    {
        //Create a new object
        countObject = new Object();

        //Set the name to the current item in the loop
        countObject.name = currentItem;

        //Count this item
        countObject.value = 1;

        //Put the object in the sorted list so we can see it later
        sortedList.push(countObject);

        //Done. Remember the current item as the previous item.
        previousItem = currentItem;
    }
    else  //Otherwise, we haven't switched to a new item yet, so keep counting the current item.
    {
        /*  Find the last countObject in the sortedList, then add 1 to the value of that object.
        *   For example, if we have looped over 3 cat items in a row, and the current item is also cat, then:
        *   { name: 'cat', value: 3 + 1 }
        */
        sortedList[sortedList.length - 1].value += 1;
    }
}

and Full Source for the above example.
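For comparison, here is a rough sketch of the same sort-then-count trick in Python. The function name `count_sorted_runs` and the sample list are made up for illustration; the `name`/`value` keys simply mirror the count objects used above.

```python
def count_sorted_runs(items):
    """Sort a copy of items, then count each run of identical neighbours."""
    counts = []  # one {"name": ..., "value": ...} dict per unique item
    for current in sorted(items):
        if not counts or current != counts[-1]["name"]:
            # a new item has shown up, so start a fresh count for it
            counts.append({"name": current, "value": 1})
        else:
            # same item as the previous one, so keep counting
            counts[-1]["value"] += 1
    return counts


animals = ["dog", "cat", "fish", "cat", "pony", "dog", "cat"]
result = count_sorted_runs(animals)
for entry in result:
    print(entry["name"], entry["value"])
```

Because the list is sorted first, only the most recent count object ever needs to be compared or updated. Python's standard library also offers `collections.Counter`, which counts without sorting, but that is a different technique from the one shown here.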

© 2018 Eric Terpstra
