What I have been up to

The last 11 months have been quite hectic. Mainly, I bought a house and moved into it with my family, I got a new job as the IT manager of a small web agency and, finally, I sneaked in some side projects that were important to me.

This blog post is the first part, from January to July, of an almost exhaustive recap of what I have been programming this year.

First, in January and already detailed on this site, there was populationpyramid.net, a website allowing you to browse the population pyramids of all the countries in the world, based on United Nations data. This was my first incursion into the open data world, and a first attempt at data visualization in HTML5.

Then, at the beginning of February, I became the IT manager of a small web agency in Brussels: "my media is rich". My duties include system administration, technical architecture, lots of development and, more generally, technical leadership in all software matters. The first thing they had me do was an iPad application aimed at the Brussels car show. A six-day deadline for a commercial project on a platform I had never programmed on before: that is the kind of challenge I like to take on (at some expense to my sleep and family, I must admit). I finally got that one off the ground and learned quite a lot in the process about how web agencies work (in a very industrialized way, with well-defined positions: the designer, the motion designer, the project manager, the programmer, the salesperson). It was quite interesting. You can see the app here.

One point was obvious from the start: I enjoyed the change to the small, well-defined projects that those companies churn out, as opposed to the never-ending projects I had been working on before, either at the university for my research or in my previous company. I had the feeling of getting things done, without too many meetings or email exchanges.

Next up, I had to build some websites for big clients of ours, like http://www.bfoskodarally.be (my first use of Django in a team, with a customized admin meant to be used by a client) and http://www.generousdays.be, which is now offline. I learnt quite a lot about JavaScript with those projects.

Then came a very nice challenge. I had to build one of the first viral marketing campaigns on LinkedIn, for Audi. The result was the Audi A6 Challenge. The concept was quite simple: we had to create a "Hot or Not" on LinkedIn, meaning that each user would repeatedly be shown two of his/her LinkedIn connections and asked to vote for the one he/she considered the better professional.
There is a lot to be said about this project. We got quite a bit of attention, even from the people at LinkedIn (if only to inform us about some rules that we had unwittingly been breaking). I liked this project because it was the first time that I got to really work with the API of a social network (the kind you keep hearing about all the time). I was happy that we managed to create a system where the contest was still winnable by lots of people by the end. There is indeed a risk in such contests that a few people take the lead in the beginning and then reach unattainable scores, discouraging other people from taking part. I used a variation of the Elo rating used in chess, where people lose more points when losing against somebody with a lower score. This prevented the leaders from pulling away too fast. Furthermore, we managed to shut down cheating quite efficiently, which is a tough problem for every contest on the web, as I have discovered. There are lots of sites for exchanging cheats in such contests (like this one). Finally, the user interface and design were quite polished, which again was a big change of focus from my previous jobs.
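
The scoring idea can be illustrated with the standard Elo update. This is only a minimal sketch: the exact variation used in the campaign is not described here, so the K-factor of 32, the 400-point scale and the function names are assumptions made for illustration.

```python
from typing import Tuple


def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(winner: float, loser: float, k: float = 32.0) -> Tuple[float, float]:
    """Return the new (winner, loser) ratings after one vote.

    k is an assumed K-factor; the more expected the win was,
    the fewer points change hands.
    """
    delta = k * (1.0 - expected_score(winner, loser))
    return winner + delta, loser - delta


# A leader rated 1600 who loses to a lower-rated newcomer (1400)
# drops more points than when losing to an equally rated opponent.
_, leader_after_upset = elo_update(winner=1400.0, loser=1600.0)
_, leader_after_even = elo_update(winner=1600.0, loser=1600.0)
upset_loss = 1600.0 - leader_after_upset
even_loss = 1600.0 - leader_after_even
```

With such a rule, an early leader has a lot to lose in every vote against a lower-ranked contact, which keeps the scores from drifting out of reach.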

Following that, I had to implement a small iOS application for the stands of the Carrefour Running Tour. Except for the tight deadline (again) and the fact that I learned to hate iTunes Connect, there is not much to say about this.

During those months, from February to July, there were three side projects:

  • In April, I helped a bit to enhance the Django development dashboard, by proposing to add some sparklines (the yellow lines on the site) and implementing them using raphael.js. This forced me to delve into the code of Jacob Kaplan-Moss (one of the creators of Django), where I could pick up a few interesting tricks. In the end, I submitted a pull request that got accepted, which was important to me as my first contribution to a preexisting open source project, even if Jacob did not keep much of my code anyway (for good reasons). I was very enthusiastic about this, and I have to participate in such projects again.

  • At the end of May, I was contacted by Jonathan Van Parijs, who organizes the Hack Democracy meetups in Brussels, to take part, on the computer front, in a participatory democracy experiment, the g1000. I will not delve into the ideas of the initiative, which are well explained on the site. They asked me to help create their website using a yet to be defined Photoshop design… In two weeks… In my spare time…
    Let's just say that I did not get enough sleep again during those weeks, but I fortunately had help from Jonathan and my awesome colleague, Christophe Gérard (he's freelance now, hire him). This project, not excessively complex, was built using Django again, but in the process, I was forced to improve my Photoshop slicing and CSS skills a lot.
    I keep lots of good memories from this experience.
    I will especially remember the long hacking nights with a sense of purpose and, in the end, the launch of the site at midnight on the 10th of June, from my nephew's room (we were invited to his mother's birthday), with loud bass coming from the party downstairs, using the tethered connection of my Android phone because nothing else was available. The opening of the site at midnight had been announced in newspapers and through a press conference, so we had an expectant audience that brought us some hundreds of visits in the first two hours. A nice, memorable launch.
    Later on, the site was updated a lot and, now that the main event is over, it has already received around 150 000 visits. Not too shabby for such a subject, but I must say that the event got a lot of press coverage, even internationally. Technically, the site has been hosted on a 10€/month cloud server at Rackspace, using nginx, MySQL and memcached, and we never had any performance hiccups at any time.

  • The third one is not a programming project. At the end of June, I took two evenings to write down what happened to my significant other when her Gmail account was hacked. This became the previous blog post. If I wrote it, it is obviously because I thought it could be of some interest, but I was definitely overwhelmed by the reaction. I submitted it to Hacker News before going to sleep and it rose to the top spot on the front page, where it remained for almost a day, bringing around 35000 visits in one day. There were lots of interesting comments (notably the top one, from the well-known Matt Cutts). On the technical side, this blog is hosted on a shared server at djangohosting.ch and it never became unresponsive. This leads me to wonder what kind of traffic you need on a rather static site to have performance issues. I still do not have an answer to this…

In part two of this recap, I will talk about the rather big projects that happened in the second part of the year, and about what's next (with a surprise!).

(To Be Continued...)

Hacked Gmail Account

On May 17th, in the evening, I received an email from the Gmail account of Charlotte, my significant other. It was written in French (which is normal for her) and looked like this:

How are you? Would you have time to spend by email on a peculiar situation about me? I am in deep problems and couldn’t cope without your support.

Hoping to hear from you really soon.

Best, Charlotte

You’ll find the original French text below (so that people can find it on Google).

I was quite busy and so immediately dismissed this as spam, and did not bother to check where this email had been sent from. Faking email addresses is way too easy for me to investigate every suspicious email. Like many people with a public email address, I often receive fake emails from myself.

But this time, the problem was deeper, as I learnt when Charlotte, the real one, called to warn me that she could not access her Gmail account anymore and that her phone was constantly ringing because of people worried about her. She also told me about a popup that she had seen in the morning about suspicious access to her account from the Ivory Coast. At the time, she was quite busy, clicked on some option that looked reassuring and went on with her day. Damn, that was bad.

I immediately tried to access the account (I know her password), to discover that her password had been changed and much worse, that the recovery email address had been changed too (it read something like xxxx1@live.fr where it should have read something like xxxx@ucl.ac.be). I also soon realized that the security question had been changed too. Really, really bad.

Without much hope, I began to fill in Google's “last chance form”. Let it be clear that if you use a free Google account (and even a paying one, in my opinion), you have very little chance of getting real people at Google to look at your problem. It is quite logical if you think of the very high ratio of users to Google employees: to keep things manageable, they automate administration tasks as much as possible and so, you are in fact only interacting with their programs, never with them directly.

So, when you fill in Google's “last chance form”, you know that you have to fill it in as precisely as possible, so that your request passes their automated tests. To examine your request, Google asks for as much information as possible, notably:

  • the date of your Google account registration and the verification code you received at that moment (happily, both pieces of information were available in my Gmail account);
  • the names of the people to whom you write emails the most;
  • the names of the tags you use;

and so on and so forth.

With one phone call to Charlotte, I was able to get almost all of this information and finally submitted the form, hoping to get an answer by the next day, if I ever got one.

In fact, I got an answer in under 15 minutes. I was really happy about this. I immediately logged into Charlotte’s account and went to the bottom of the Gmail home page to view the little snippet of text showing whether the account was open anywhere else (more info here). By clicking on "details" on the right of this snippet, you can also sign out all other sessions. There was indeed a session open in the Ivory Coast (while we are living in Belgium…), so I closed the sessions and changed the password, security question and backup email address. I finally examined all the settings of the Gmail account and found out that all emails were being forwarded to the following address: tatianalabelle1@live.fr (I don’t see any reason to keep this secret…). Now I felt almost safe, but still a bit nervous.

Time now for some damage evaluation. I immediately saw that all contacts had been deleted (annoying but not too bad) and that apparently all emails had been put in the trash (though some from the last few days were missing too), except for the responses to the fake “emergency” email. There were already around 10 responses. I immediately answered those people, telling them to disregard that email. This was a bit embarrassing for Charlotte, since lots of her colleagues and her whole family and friends had been contacted. This took me around 10 minutes.

That’s when I lost the connection to the account again.

I immediately tried to login again, to find out that passwords, security questions and backup email addresses had been changed again. It was time for heavy swearing on my part, with some punching of the table.

I then did everything again: filling in the form, waiting 15 minutes to get an answer, signing out sessions, changing the password and security question, and checking again whether any other session had been opened in the meantime. According to Google, it was not the case. As a side note, I was very glad that the “last chance form” did work twice. I really thought that it could be blocked after a first attempt, to allow further investigation. But I guess that since a human investigation is not really an option, Google chose to let the system run as many times as needed. I will probably never know… So I began responding to emails again.

That’s when I lost the connection again…. Password, security question and backup email changed again.

At this stage, I was getting a bit desperate. I called Charlotte, asking her if any of our computers was on with an open session (this may not sound very rational, but you never know). It turned out that the Windows XP machine that I keep for gaming was on. I told her to turn it off (I have not had time to inspect it since). Charlotte also told me to delete all her emails if I could get into the account again, to limit the privacy intrusion (even if, at this stage, it could have been too late). In despair, I wrote an email to tatianalabelle1@live.fr asking them to stop hijacking my girlfriend's email account.

I tried the “last chance form” again and it did work a third time. I did everything I could to secure the account and began to respond to emails again. It turned out that this time, I did not lose the connection again. I have been checking this regularly for more than a month now, and there has been no more suspicious activity.

This whole story did leave a sour taste though. Charlotte lost all her contacts and past emails. This is not too bad for her, as she is not a heavy Gmail user (she has a professional email address too), but I definitely am. I use it for all my emails, my calendar, and lots of other Google tools (Analytics for websites, for example). If I ever lost my Google account, I could lose a great deal of time and value. This is accentuated by the fact that I have used my Gmail address to register with lots of other services too (eBay, PayPal, Facebook, Twitter, Apple Dev Center, iTunes, some server hosting and so on). For almost all those services, if you have access to my email, you can get access to the service (by filling in the “lost password” form). They will send a link allowing you to change the password and log in again.

In the precise case of Charlotte, exploiting this kind of account did not seem to be the plan. Some people who responded to the fake email got responses asking them to send money using Western Union to help Charlotte in Africa, where she had allegedly almost been raped and had no access to a phone. Not very subtle, but I still felt the bullet passing much too close…

To mitigate the risk, Google recently launched two-factor authentication, a mechanism that requires you to input, on top of your password, a code generated by an application installed on your phone (iPhone, Android and maybe some others). I have activated this today. You can find more information about this here: advanced sign in security for your google account. This indeed increases security, but tends to be a bit cumbersome (I often have a depleted battery, for example, which could prevent access to my emails from a computer) and does not cover other cases (like somebody stealing my laptop and using an already open session).
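
To give an idea of how such a phone application can produce a code without contacting the server, here is a minimal sketch of the time-based one-time password scheme standardized in RFC 6238 (built on RFC 4226). It assumes that scheme, with a 30-second step, 6 digits and HMAC-SHA1; the function names are mine, and the secret below is the public RFC test key, not anything tied to a real account.

```python
import base64
import hashlib
import hmac
import struct
import time
from typing import Optional


def hotp(secret_b32: str, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226)."""
    key = base64.b32decode(secret_b32, casefold=True)
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation: low 4 bits pick the offset
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret_b32: str, now: Optional[float] = None,
         step: int = 30, digits: int = 6) -> str:
    """Time-based one-time password (RFC 6238): HOTP over the time step index."""
    if now is None:
        now = time.time()
    return hotp(secret_b32, int(now // step), digits)


# Base32 encoding of the ASCII test key "12345678901234567890" from the RFCs.
RFC_TEST_SECRET = "GEZDGNBVGY3TQOJQGEZDGNBVGY3TQOJQ"

first_code = hotp(RFC_TEST_SECRET, 0)         # RFC 4226 test vector: "755224"
code_at_59s = totp(RFC_TEST_SECRET, now=59)   # counter 1: "287082"
```

The phone and the server share the secret once at setup time; afterwards both sides compute the same code from the current time, which is why a stolen password alone is no longer enough.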

In the end, I am still feeling a tad insecure about using Gmail as my main account. I think it’s way too difficult to get somebody on the line to help you in case of problems. I am nevertheless quite addicted to their interface and consequently do not want to give it up, but I will probably switch my main account to a domain I own, so that I could at least shut down the email address if needed, while still using the Gmail tools through Google Apps.

I decided to write this all down so that it could serve as a cautionary tale. You should keep in mind the limitations of your email provider and, if you decide to use Gmail, you should keep as much information about your Gmail registration as possible (try to dig up your first registration confirmation). If the option is offered to you, which seems to be the case only for heavy users (probably the people using some paying services from Google), you should also activate two-factor authentication. As a side note, I would definitely be interested in buying from Google a gadget similar to the Blizzard Battle.net authenticators to use with two-factor authentication. I would then be able to keep my phone as a backup option only.

Finally, I am still wondering how those guys in the Ivory Coast got access to Charlotte’s email account. My main hypothesis is that she must have accessed her account on a PC infected by a keylogger (in her defense, she works in a big organization, she is not in IT and could not possibly control every machine). That said, I found the operation quite thoroughly organized (notably, forwarding emails to an external email address is a step that could easily be missed during the recovery of an account), and what distresses me most is that I am still unable to explain how those guys were able to get access to the account twice after I changed the password, security question and backup email address from my Mac, which does not seem to be compromised.

If you have plausible explanations, I am definitely interested in hearing them.

Original French Text of the fake email :

Comment vas tu ? Aurais tu du temps à consacrer à une situation particulière me concernant discrètement et par mail ? Je suis dans des difficultés telles que je ne saurai que faire sans ton soutien et apport.

Je reste dans l’attente urgente de te lire.

Cordialement,

Charlotte

About PopulationPyramid.net

For some months, I have been interested in data visualization. I have been playing with Processing and Raphaël, and I read one of Edward Tufte's books.

Besides, like any motivated web developer, I am interested in the new crop of features appearing in our browsers under the HTML5 banner, especially SVG and canvas, which allow you to draw vector graphics in your browser more easily and to manipulate them using JavaScript.

Finally, I am fascinated by the idea of open data, that is, the public availability of data in an easily processable format. For an example of a project made possible by open data, you could check www.wheresmyvillo.be, which relies on the availability on the web (not official in this case...) of raw data about the bike sharing scheme of Brussels. My interest in this subject was initially sparked by Adrian Holovaty, one of the creators of Django, who created www.everyblock.com, a site allowing you to be informed about every public bit of information in your neighborhood (from meetups about programming to burglaries).

With all this in mind, in November last year, I went to the first Hacks/Hackers event in Brussels. It is an event aimed at bringing programmers and journalists together to find out interesting ways to exploit open data. It was a very nice meeting, very inspiring, and it crystalized my interests. I had to do something combining it all.

So, in December, I began to look for interesting public data sets. I found out, thanks to a reference in a Wikipedia article, that the United Nations has made available the content of its World Population Prospects at http://esa.un.org/unpp/. After some thinking and googling, I came to the conclusion that something as basic as the set of population pyramids of all countries of the world was not directly available online. I decided to publish it, using Raphaël.js to obtain a beautiful visualization. Although not very original, I think those graphs are highly thought-provoking.

Here are a few interesting points:

It took me approximately two months to finish the project (in fact approximately five long evenings, but it took me two months to find those five evenings...). From the start, I had envisioned a project with no backend, only JavaScript goodness. So, yes, no Django this time. This made it possible to host the project on GitHub Pages (from the homepage: "The GitHub Pages feature allows you to publish content to the web by simply pushing content to one of your GitHub hosted repositories."), which meant free hosting and was furthermore a good fit, since I wanted to publish all the code under an open source license.

About the code

All code is available on GitHub at: https://github.com/madewulf/populationpyramid.net. I do not think this code is very well suited for reuse, but it could serve as an inspiration. The resulting HTML page is unfortunately completely dependent on the availability of JavaScript to render anything useful and will consequently probably not be indexed very efficiently by Google. I hope to fix that later on (if I can find one or two more evenings ;-) )

In the process, I learnt to use a few things:

  • Raphaël.js support for SVG paths. This allows you to draw complex shapes in a quite straightforward way using the SVG path syntax, and there are nice animation effects available (the "morphing" effect from one population pyramid to another is only one line of code). As a side note, Raphaël is really a great tool because it fulfills today lots of the promises of HTML5 about vector graphics, even in Internet Explorer (from version 6), by using Microsoft VML where SVG is not available. The only downside is that it does not work on Android devices (but it works very well on your iThings).
  • the 960 grid system, a CSS framework that allowed me to finally fix some of my gripes about CSS (the other ones should be fixed by the use of something like SASS).
  • the use of cURL to scrape the content of web sites. I should definitely have known about this before.
  • the use of Google Web Fonts. I think a fair part of the reasonable visual appeal of the site is due to the typography.
  • the setup of a custom domain name on top of GitHub Pages. Super duper easy. By the way, thanks to the GitHub people for offering such awesome tools!
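
To illustrate the SVG path syntax mentioned in the first point, here is a minimal sketch that builds the path string for one side of a population pyramid. It is not the code of the site (which is JavaScript): the function name, the parameters and the layout are hypothetical, and Python is used only to produce the string, which could then be handed to something like Raphaël's paper.path().

```python
def pyramid_side_path(counts, bar_height=10, scale=1.0, x0=200, y0=0, direction=1):
    """Build an SVG path outlining one side of a population pyramid.

    counts: population per age group, youngest first (drawn at the bottom).
    direction: 1 to draw to the right of the central axis, -1 to the left.
    All names and defaults here are illustrative assumptions.
    """
    bottom = y0 + len(counts) * bar_height
    parts = [f"M{x0},{bottom}"]        # start at the bottom of the central axis
    y = bottom
    for count in counts:
        x = x0 + direction * count * scale
        parts.append(f"L{x:g},{y}")    # horizontal line out to the bar length
        y -= bar_height
        parts.append(f"L{x:g},{y}")    # vertical line up along the bar edge
    parts.append(f"L{x0},{y}")         # back to the central axis at the top
    parts.append("Z")                  # close the shape
    return " ".join(parts)


example_path = pyramid_side_path([30, 20], x0=100)
```

Because two pyramids with the same number of age groups yield paths with the same sequence of commands, animating from one path string to the other is a natural fit for a morphing effect.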

Publication

Once I had a sufficiently good version of the site, I published it at http://populationpyramid.net. I wanted some people to see it and maybe give me some feedback. After one Facebook wall post and a tweet that did not bring much traffic, I posted the link to Hacker News. I am a daily reader of this site about tech news and the startup world, and I thought that by emphasizing the technological side of my endeavour, I would get people there interested. I consequently chose the title "Population Pyramids of the World in SVG using Raphaël.js" and it worked quite well. Two hours after publication, PopulationPyramid.net was featured on the front page of Hacker News (here is the comment page). It reached around the 15th spot, which was much more than I had hoped for. Over two days, Hacker News brought around 1200 direct visits and around 1000 more through tweets of the different Hacker News feeds.

Next, although I am not a reader of this site, I posted the link to the programming section of Reddit thinking that it could interest people there (I realized later on that I maybe did not pick the right section, which could explain some downvoting there, although it seems to be a common practice on Reddit). There again, PopulationPyramid.net reached the front page, where it remained for around 24 hours, which brought around 1800 direct visits (here is the comment page ). I am quite happy with those numbers, even if honestly, I thought that those sites were able to bring more visits (but I only have a single data point here). That said, even if the traffic spike was nice, I mainly hope that the site will be used as a reference later on. My dream is to have a school using it as material for some homework for teenagers. If you know somebody that could be interested...

Conclusion

I loved doing this side project and I definitely recommend this kind of one-shot project to any programmer wanting to improve, as a way to pick up a bunch of new technologies. Furthermore, I hope that this project can bring some value to people around the world by helping them better understand the social problems we are currently facing. I chose not to add any comments to the figures because I lacked expertise and time, but also because I hope that people will take the time to think about those images and come up with their own interpretations.

New Feature Ideas

As a final note, here is a list of features I would like to add someday:

  • offering the ability to compare two pyramids;
  • allowing a pyramid to be embedded in another site (so that people could blog about some pyramids);
  • adding a curve showing the evolution of the population size;
  • exploiting the wealth of other data available on the UN website. There is for sure something interesting to do with the available migration rates, for example;
  • adding coordinate information when hovering over a data point;
  • allowing the different images to be downloaded as separate files (currently, they are generated in your browser on the fly). This would probably involve using a Raphaël serializer and would make it possible to build the site with progressive enhancement, solving the dependency on JavaScript.

Population Pyramid of Japan in 2010 on PopulationPyramid.net