For some months, I have been interested in data visualization. I have been playing with Processing and Raphaël, and I read one of Edward Tufte's books.

Besides, like any motivated web developer, I am interested in the new crop of features appearing in our browsers under the HTML5 banner, especially SVG and canvas, which make it easier to draw vector graphics in the browser and to manipulate them using JavaScript.

Finally, I am fascinated by the idea of open data, that is, the public availability of data in an easily processable format. For an example of a project made possible by open data, you could check www.wheresmyvillo.be, which relies on the availability (not official in this case...) of raw data about the Brussels bike sharing scheme on the web. My interest in this subject was initially sparked by Adrian Holovaty, one of the creators of Django, who created www.everyblock.com, a site that keeps you informed about every public bit of information in your neighborhood (from meetups about programming to burglaries).

With all this in mind, in November last year, I went to the first Hacks/Hackers event in Brussels. It is an event aimed at bringing programmers and journalists together to find interesting ways to exploit open data. It was a very nice meeting, very inspiring, and it crystallized my interests. I had to do something combining it all.

So, in December, I began to look for interesting public data sets. I found out, thanks to a Wikipedia article reference, that the United Nations has made the content of its World Population Prospects available at http://esa.un.org/unpp/. After some thinking and googling, I came to the conclusion that something as basic as the set of population pyramids of all countries of the world was not directly available online. I decided to publish it, using Raphaël.js to obtain a beautiful visualization. Although they are not very original, I think those graphs are highly thought-provoking.

Here are a few interesting points:

It took me approximately two months to finish the project (in fact approximately five long evenings, but it took me two months to find those five evenings...). From the start, I had envisioned a project with no backend, only JavaScript goodness. So, yes, no Django this time. This made it possible to host the project on GitHub Pages (from the homepage: "The GitHub Pages feature allows you to publish content to the web by simply pushing content to one of your GitHub hosted repositories."), which meant free hosting and was furthermore a good fit, since I wanted to publish all the code under an open source license.

About the code

All code is available on GitHub at https://github.com/madewulf/populationpyramid.net. I do not think this code is very well suited for reuse, but it could serve as inspiration. The resulting HTML page unfortunately depends completely on JavaScript to render anything useful and will consequently probably not be indexed very well by Google. I hope to fix that later on (if I can find one or two more evenings ;-) ).

In the process, I learnt to use a few things:

  • Raphaël.js support for SVG paths. This allows you to draw complex shapes in a quite straightforward way using the SVG path syntax, and there are nice animation effects available (the "morphing" effect from one population pyramid to another is only one line of code). As a side note, Raphaël is really a great tool because it already fulfills many of the promises of HTML5 about vector graphics, even in Internet Explorer (from version 6), by using Microsoft VML where SVG is not available. The only downside is that it does not work on Android devices (but it works very well on your iThings).
  • the 960 grid system, a CSS framework that allowed me to finally fix some of my gripes about CSS (the other ones should be fixed by the use of something like SASS).
  • the use of cURL to scrape the content of web sites. I should definitely have known about this earlier.
  • the use of Google Web Fonts. I think a fair part of the reasonable visual appeal of the site is due to the typography;
  • the setup of a custom domain name on top of GitHub Pages. Super duper easy. By the way, thanks to the GitHub people for offering such awesome tools!
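
To illustrate the SVG path point from the list above, here is a minimal sketch of how one side of a pyramid could be turned into a path string. This is not the code from the repository: the function name `halfPyramidPath`, its parameters and the drawing layout (one horizontal bar per age group, stacked upwards from a central axis) are assumptions made for the example.

```typescript
// Hypothetical helper, not taken from the project's repository.
// Builds an SVG path string for one side of a population pyramid.
//   counts:    population per age group, youngest group first
//   direction: 1 draws to the right of the axis, -1 to the left
//   centerX:   x coordinate of the central axis
//   baseY:     y coordinate of the bottom of the pyramid
//   barHeight: height of one age group, in pixels
//   scale:     pixels per unit of population
function halfPyramidPath(
  counts: number[],
  direction: 1 | -1,
  centerX: number,
  baseY: number,
  barHeight: number,
  scale: number
): string {
  // "M" moves the pen to the bottom of the central axis.
  const parts: string[] = [`M${centerX},${baseY}`];
  counts.forEach((count, i) => {
    const x = centerX + direction * count * scale;
    const yBottom = baseY - i * barHeight;
    const yTop = yBottom - barHeight;
    // "L" draws a line: first out to the bar's width, then up its height.
    parts.push(`L${x},${yBottom}`, `L${x},${yTop}`);
  });
  // Come back to the axis at the top, then "Z" closes the shape.
  parts.push(`L${centerX},${baseY - counts.length * barHeight}`, "Z");
  return parts.join(" ");
}

// Two tiny, made-up data sets with the same number of age groups.
const before = halfPyramidPath([10, 5], 1, 100, 200, 20, 2);
const after = halfPyramidPath([6, 8], 1, 100, 200, 20, 2);
console.log(before);
console.log(after);

// With Raphaël loaded in a page, the strings could be used like this:
//   const shape = paper.path(before);
//   shape.animate({ path: after }, 500); // the one-line "morphing"
```

Since both strings contain the same sequence of commands and differ only in their coordinates, animating from one to the other gives a smooth transition between two pyramids.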

Publication

Once I had a sufficiently good version of the site, I published it at http://populationpyramid.net. I wanted some people to see it and maybe give me some feedback. After one Facebook wall post and a tweet that did not bring much traffic, I posted the link to Hacker News. I am a daily reader of this site about tech news and the startup world, and I thought that by emphasizing the technological side of my endeavour, I could get people there interested. I consequently chose the title "Population Pyramids of the World in SVG using Raphaël.js" and it worked quite well. Two hours after publication, PopulationPyramid.net was featured on the front page of Hacker News (here is the comment page). It reached around the 15th spot, which was much more than I had hoped for. Over two days, Hacker News brought around 1200 direct referral visits and around 1000 more through tweets from the different Hacker News feeds.

Next, although I am not a reader of this site, I posted the link to the programming section of Reddit, thinking that it could interest people there (I realized later on that I maybe did not pick the right section, which could explain some downvoting there, although it seems to be a common practice on Reddit). There again, PopulationPyramid.net reached the front page, where it remained for around 24 hours, which brought around 1800 direct visits (here is the comment page). I am quite happy with those numbers, even if, honestly, I thought that those sites were able to bring more visits (but I only have a single data point here). That said, even if the traffic spike was nice, I mainly hope that the site will be used as a reference later on. My dream is to have a school use it as material for some homework for teenagers. If you know somebody who could be interested...

Conclusion

I loved doing this side project, and I definitely recommend that any programmer wanting to improve take on this kind of one-shot project to pick up a bunch of new technologies. Furthermore, I hope that this project can bring some value to people around the world by helping them better understand the social problems we are currently facing. I chose not to add any comments to the figures because I lacked expertise and time, but also because I hope that people will take time to think about those images and come up with their own interpretations.

New Feature Ideas

As a final note, here is a list of features I would like to add someday:

  • offering the ability to compare two pyramids;
  • allowing a pyramid to be embedded in another site (so that people could blog about some pyramids);
  • adding a curve showing the population size evolution;
  • exploiting the wealth of other data available on the UN website. There is for sure something interesting to do with the available migration rates for example;
  • adding coordinate information when hovering over a data point;
  • allowing the different images to be downloaded as separate files (currently, they are generated in your browser on the fly). This would probably involve using a Raphaël serializer and would make it possible to build a site with progressive enhancement, removing the dependency on JavaScript.