The Colour of the Web

Nov 22nd, 2010

What colour is the internet?

This is a difficult question even to ask: what does colour mean? How does one measure the colour of a web page? How do you quantify and categorize the results?

For an upcoming class project (for STAT 206, to be precise), @amtinits and I are trying to find the colour of the most successful websites on the web. Using the Alexa website rankings and some custom Python hackery, we’re finding the average colour of a random sample of 1,000 websites drawn from the top hundred thousand sites online.
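As a rough illustration of the sampling step (not our actual script), here is a minimal sketch. It assumes the ranking list arrives as CSV text with one `rank,domain` pair per line; the function name `sample_sites` and its parameters are made up for this example.

```python
import csv
import io
import random


def sample_sites(csv_text, top_n=100_000, k=1_000, seed=None):
    """Return k randomly chosen (rank, domain) pairs from the top_n ranks.

    Assumes each CSV line looks like "rank,domain" (a hypothetical layout).
    """
    ranked = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        rank = int(row[0])
        if rank <= top_n:
            ranked.append((rank, row[1]))

    rng = random.Random(seed)  # a seed makes the sample repeatable
    # Sample without replacement, so no site is picked twice.
    chosen = rng.sample(ranked, min(k, len(ranked)))
    return sorted(chosen)
```

Passing a fixed `seed` is handy for a stats project, since it means the same sample can be redrawn later when checking results.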

We’re making heavy use of the wonderful webkit2png Python library to capture each page, and using the extremely helpful pypng as the heart of our image manipulation scripts.
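To give a feel for the averaging, here is a sketch that assumes a screenshot has already been saved as a PNG. The helper names are hypothetical, and a plain per-channel mean is only one way to define "average colour". The pypng call used is `png.Reader(...).asRGBA8()`, which yields rows of flat `r, g, b, a` values.

```python
def average_colour(rows, channels=4):
    """Mean (r, g, b) over flat pixel rows like [r, g, b, a, r, g, b, a, ...].

    Any channels after the first three (such as alpha) are ignored.
    """
    totals = [0, 0, 0]
    count = 0
    for row in rows:
        for i in range(0, len(row), channels):
            totals[0] += row[i]
            totals[1] += row[i + 1]
            totals[2] += row[i + 2]
            count += 1
    if count == 0:
        raise ValueError("image has no pixels")
    return tuple(total / count for total in totals)


def average_colour_of_png(path):
    """Average colour of a PNG file (hypothetical helper; requires pypng)."""
    import png  # imported here so average_colour works without pypng

    width, height, rows, info = png.Reader(filename=path).asRGBA8()
    return average_colour(rows, channels=4)
```

Reading row by row keeps memory use low, which matters when a tall page screenshot is many thousands of pixels high.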

We’re then comparing each site’s popularity, as given by its Alexa ranking, with its hue, which comes from a simple calculation on its average colour value. Then we do some statistical magics, push some buttons, graph some pretty scatter plots, and see if there’s any meaning in the data.
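The post doesn't spell out the hue calculation, so treat the following as one plausible version rather than the one we used: convert the average RGB value to HSV with Python's standard `colorsys` module and read off the hue. The function name `hue_degrees` is made up for this sketch.

```python
import colorsys


def hue_degrees(r, g, b):
    """Hue in degrees, in [0, 360), for an RGB colour with 0-255 channels.

    Returns None for pure greys, where hue is undefined (saturation is zero).
    """
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    if s == 0:
        return None
    return (h * 360.0) % 360.0
```

One thing to keep in mind when plotting hue against rank is that hue is circular: 359 degrees and 1 degree are both very nearly red, even though they sit at opposite ends of a scatter plot's axis.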

At the moment, we’ve got the scripts running full-time on my server, gathering and processing data slowly. After the data is collected, the report is written (that part won’t be made public) and the marks come back, we want to write a nice front-end with some more options, a database, some crazy amounts of processing, and… do… something with it. (Is this even remotely monetizable? Who cares!)

Until the “app” eventually gets launched, you can view the results of the script in real time at http://colour.petersobot.com/100k/.

The same real-time results are also available as a CSV file at http://colour.petersobot.com/100k/output.csv. We also ran the script for a little while on the top one-million site dataset, but that list had too many malware and spam sites in it. The short list of results from that run can be found at http://colour.petersobot.com/1m/output.csv.