Personal

A small family

Sunday, September 28th, 2008 | Personal | 1 Comment

A few weeks ago our little daughter was born. She is healthy and doing very well and we are all very happy. In the image she is two days old.

Hannah

This is also, if there should be any doubt, why I have been a tad quiet as of late.

A new time…

Saturday, February 23rd, 2008 | Personal | 1 Comment

Scan

The Children of Húrin

Tuesday, January 8th, 2008 | Personal | No Comments

As a Christmas gift, I got Tolkien’s The Children of Húrin, since my parents have by now realised that I am partial to fantasy novels. What I enjoy most is reading about unlikely heroes who seek the good in themselves and try to persevere in the face of overwhelming foes. This is what true heroes, the ones of epic proportions, would do. However, The Children of Húrin is not like this at all.

The most positive surprise about the book is its physical quality. The paper is not thin and cheap as in most works of fiction, but dense with good texture. The story, however, is a whole other matter. It is the story of the two children of Húrin, Niënor and Túrin, upon whom the evil Vala Morgoth has set his will. Perhaps their fates can be explained by this, but there is no striving for a greater goodness in them. They are haughty, vain, and egotistical, and hardly once do they do what one would think to be right.

Perhaps these negative tales need to be spun from time to time so that the tales of goodness may shine the brighter, but I must admit that I am somewhat dismayed by the bleakness of the story. At the very least, it is not what I seek in a novel.

Ihre Papiere, bitte

Sunday, December 23rd, 2007 | Personal | No Comments

The surveillance society is upon us, whether we like it or not, all in the name of fighting serious crime such as ‘terrorism’. The digital realm in particular is being monitored: phone calls, text messages and communications on the internet. Based on the EU data retention law, which I have written about here and here, my dear country, Denmark, enacted its surveillance rules a few months ago: Bekendtgørelse om udbydere af elektroniske kommunikationsnets og elektroniske kommunikationstjenesters registrering og opbevaring af oplysninger om teletrafik (logningsbekendtgørelsen), roughly ‘Executive order on the registration and retention of information about telecommunications traffic by providers of electronic communications networks and services’; in short, the logging proclamation. According to it, the following items must be logged for an internet session:

  1. Transmitter’s IP address
  2. Receiver’s IP address
  3. Transport protocol
  4. Transmitter’s port number
  5. Receiver’s port number
  6. Time for the start and end of the communication
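
As a rough illustration, the six items above fit in a very small record. The sketch below is only an assumption about how such a record could be modelled; the field names, the placeholder source address and the date are made up and do not come from the proclamation.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class SessionRecord:
    """One logged internet session; all field names are illustrative."""
    src_ip: str        # 1. transmitter's IP address
    dst_ip: str        # 2. receiver's IP address
    protocol: str      # 3. transport protocol, e.g. "tcp"
    src_port: int      # 4. transmitter's port number
    dst_port: int      # 5. receiver's port number
    started: datetime  # 6. start of the communication
    ended: datetime    #    end of the communication

    def duration_seconds(self) -> float:
        """Length of the session in seconds."""
        return (self.ended - self.started).total_seconds()


record = SessionRecord(
    src_ip="192.0.2.10",    # placeholder from the documentation address range
    dst_ip="81.19.246.12",
    protocol="tcp",
    src_port=49152,         # made-up ephemeral port
    dst_port=80,
    started=datetime(2007, 12, 23, 11, 43, 11),  # date chosen arbitrarily
    ended=datetime(2007, 12, 23, 11, 43, 16),
)
print(record.duration_seconds())  # 5.0
```

Note what is absent from the record: nothing in it says which site, page or content was requested.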

So what does this leave us with? Sure, we can see what machine you connect to and how long your connection lasts. For the fun of it, and because this is about as ridiculous as it gets, I decided to log all my TCP connects and disconnects for an entire afternoon and evening to see what that would reveal about me. Since the originating IP is irrelevant in this instance, let us focus on the receiver’s IP address and port number.

A day’s worth of log information takes up a good many lines, so instead of going through all of it, I will go through just enough to illustrate the pointlessness of the entire thing. The connections shown here were all opened within a span of less than eleven minutes.

11:43:11 - 11:43:13: 194.126.131.130:www (RDNS adserver2.adtech.de)
11:43:11 - 11:43:16: 81.19.246.12:www (RDNS N/A)
11:43:11 - 11:43:20: 81.19.246.12:www (RDNS N/A)
11:43:12 - 11:46:41: 193.88.32.86:www (RDNS N/A)
11:43:13 - 11:43:14: 194.126.131.130:www (RDNS adserver2.adtech.de)
11:43:13 - 11:43:20: 81.19.246.12:www (RDNS N/A)
11:43:15 - 11:43:16: 194.126.131.130:www (RDNS adserver2.adtech.de)
11:43:15 - 11:43:16: 64.158.223.144:www (RDNS img.snv.mediaplex.com)
11:43:20 - 11:43:27: 81.19.246.12:www (RDNS N/A)
11:43:29 - 11:43:31: 194.126.131.130:www (RDNS adserver2.adtech.de)
11:43:31 - 11:43:55: 81.19.246.12:www (RDNS N/A)
11:43:32 - 11:43:33: 194.126.131.130:www (RDNS adserver2.adtech.de)
11:43:34 - 11:43:35: 194.126.131.130:www (RDNS adserver2.adtech.de)
11:43:35 - 11:43:36: 194.126.131.130:www (RDNS adserver2.adtech.de)
11:43:35 - 11:43:36: 194.126.131.130:www (RDNS adserver2.adtech.de)
11:43:37 - 11:43:40: 194.126.131.130:www (RDNS adserver2.adtech.de)
11:43:37 - 11:43:43: 81.19.246.12:www (RDNS N/A)
11:43:38 - 11:48:51: 80.167.236.88:www (RDNS a80-167-236-88.deploy.akamaitechnologies.com)
11:43:38 - 11:49:19: 80.167.236.88:www (RDNS a80-167-236-88.deploy.akamaitechnologies.com)
11:43:39 - 11:43:45: 81.19.246.96:www (RDNS N/A)
11:43:49 - 11:44:14: 128.242.125.13:www (RDNS N/A)
11:43:51 - 11:43:53: 194.126.131.130:www (RDNS adserver2.adtech.de)
11:43:51 - 11:43:52: 194.126.131.130:www (RDNS adserver2.adtech.de)
11:43:51 - 11:43:55: 81.19.246.12:www (RDNS N/A)
11:43:54 - 11:43:55: 194.126.131.130:www (RDNS adserver2.adtech.de)
11:43:54 - 14:20:33: 194.126.131.130:www (RDNS adserver2.adtech.de)
11:43:55 - 11:43:56: 194.126.131.130:www (RDNS adserver2.adtech.de)
11:43:55 - 14:20:35: 194.126.131.130:www (RDNS adserver2.adtech.de)
11:43:55 - 11:44:04: 81.19.246.12:www (RDNS N/A)
11:44:00 - 11:44:01: 194.126.131.130:www (RDNS adserver2.adtech.de)
11:44:00 - 11:44:01: 194.126.131.130:www (RDNS adserver2.adtech.de)
11:44:02 - 11:44:03: 194.126.131.130:www (RDNS adserver2.adtech.de)
11:44:02 - 11:47:13: 64.158.223.128:www (RDNS ad.snv.mediaplex.com)
11:44:02 - 11:44:16: 83.133.64.252:www (RDNS N/A)
11:44:03 - 11:44:05: 194.126.131.130:www (RDNS adserver2.adtech.de)
11:44:03 - 11:44:05: 194.126.131.130:www (RDNS adserver2.adtech.de)
11:44:03 - 11:46:36: 193.88.32.86:www (RDNS N/A)
11:44:04 - 11:44:09: 81.19.246.12:www (RDNS N/A)
11:44:06 - 11:44:07: 194.126.131.130:www (RDNS adserver2.adtech.de)
11:44:07 - 11:44:08: 194.126.131.130:www (RDNS adserver2.adtech.de)
11:44:09 - 11:44:14: 81.19.246.12:www (RDNS N/A)
11:44:10 - 11:44:12: 194.126.131.130:www (RDNS adserver2.adtech.de)
11:44:14 - 11:44:17: 194.126.131.130:www (RDNS adserver2.adtech.de)
11:44:15 - 11:44:20: 81.19.246.12:www (RDNS N/A)
11:44:19 - 11:44:20: 194.126.131.130:www (RDNS adserver2.adtech.de)
11:44:20 - 11:44:21: 194.126.131.130:www (RDNS adserver2.adtech.de)
11:44:20 - 11:44:23: 194.126.131.130:www (RDNS adserver2.adtech.de)
11:44:20 - 11:44:47: 128.242.125.13:www (RDNS N/A)
11:44:20 - 11:44:32: 83.133.64.252:www (RDNS N/A)
11:44:22 - 11:44:23: 193.88.71.163:www (RDNS N/A)
11:44:24 - 11:44:26: 194.126.131.130:www (RDNS adserver2.adtech.de)
11:44:27 - 11:44:28: 194.126.131.130:www (RDNS adserver2.adtech.de)
11:44:28 - 11:44:29: 194.126.131.130:www (RDNS adserver2.adtech.de)
11:44:29 - 11:44:32: 194.126.131.130:www (RDNS adserver2.adtech.de)
11:44:30 - 11:44:32: 194.126.131.130:www (RDNS adserver2.adtech.de)
11:44:37 - 11:44:38: 77.79.194.194:www (RDNS 77.79.194.194.adocean.pl)

To people who have spent some time looking into DNS, it should come as no surprise that reverse DNS is shaky at best, since most companies either have incorrect PTR records or none at all. So what triggered all these calls to adtech? Well, that’s fairly easy: I visited pol.dk, which is the 81.19.246.12 entry above without an available reverse DNS. Pol.dk is the online version of the Danish newspaper Politiken, which is slightly to the left on the political spectrum, so if I consistently use it as my primary source of news, people watching the logs could probably peg me as being on the left of the political spectrum as well.
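
For anyone who wants to poke at a log like the one above, here is a minimal sketch of how the lines could be tallied per receiver. The regular expression is my own guess at the line format as printed in this post, not something tcpspy defines, and the function names are made up for the example.

```python
import re
from collections import Counter

# Assumed format: "HH:MM:SS - HH:MM:SS: a.b.c.d:port (RDNS something)"
LINE = re.compile(
    r"^(?P<start>\d{2}:\d{2}:\d{2}) - (?P<end>\d{2}:\d{2}:\d{2}): "
    r"(?P<ip>\d+\.\d+\.\d+\.\d+):(?P<port>\S+) \(RDNS (?P<rdns>.*)\)$"
)


def parse(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LINE.match(line.strip())
    return match.groupdict() if match else None


def tally(lines):
    """Count connections per (receiver IP, port), skipping unparseable lines."""
    counts = Counter()
    for line in lines:
        rec = parse(line)
        if rec is not None:
            counts[(rec["ip"], rec["port"])] += 1
    return counts


sample = [
    "11:43:11 - 11:43:13: 194.126.131.130:www (RDNS adserver2.adtech.de)",
    "11:43:11 - 11:43:16: 81.19.246.12:www (RDNS N/A)",
    "11:43:11 - 11:43:20: 81.19.246.12:www (RDNS N/A)",
    "11:43:15 - 11:43:16: 64.158.223.144:www (RDNS img.snv.mediaplex.com)",
]

for (ip, port), n in tally(sample).most_common():
    print(ip, port, n)
```

Run over a whole day of lines, a tally like this shows at a glance which hosts dominate, which is about all the logged fields can tell anyone.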

11:52:16 - 11:52:17: 66.35.250.150:www (RDNS slashdot.org)
11:52:17 - 11:52:18: 216.73.86.153:www (RDNS annymegaadvip3.doubleclick.net)
11:52:18 - 11:52:22: 69.28.241.125:www (RDNS static-vip.srv.jobthread.com)
11:52:19 - 11:52:29: 66.35.250.55:www (RDNS images.slashdot.org)

Next is a trip around Slashdot to check for the latest geekish news. A huge portion of its readers are strong privacy advocates, and for the most part they think copyright is too far-reaching in its current form and refer to the MPAA and RIAA as the MAFIAA. At least the vocal part of the readership seems to hold these opinions. If I follow a lot of the yro.slashdot.org stories (Your Rights Online), then odds are that I am also interested in these things and hold these views; however, from these log entries we can only tell that I’ve visited the main Slashdot site.

11:52:23 - 11:52:24: 212.187.213.175:www (RDNS uk-pix05.quantserve.com)
11:52:56 - 11:53:00: 66.96.26.214:www (RDNS uf.ServerNorth.net)
11:52:56 - 11:53:17: 82.165.177.183:www (RDNS u15185240.onlinehome-server.com)
11:52:57 - 11:53:05: 209.172.63.166:www (RDNS iw-fb-apache-2.zeservers.com)
11:52:58 - 11:53:00: 66.96.26.214:www (RDNS uf.ServerNorth.net)
11:52:58 - 11:53:01: 66.207.163.2:www (RDNS N/A)
11:52:58 - 11:53:01: 64.131.83.210:www (RDNS princess.questionablecontent.net)
11:52:59 - 11:53:00: 64.4.241.33:https (RDNS www.paypal.com)
11:52:59 - 11:53:00: 64.4.241.33:https (RDNS www.paypal.com)
11:52:59 - 11:53:04: 209.172.63.166:www (RDNS iw-fb-apache-2.zeservers.com)
11:52:59 - 11:53:10: 66.96.26.211:www (RDNS uf2.ServerNorth.net)
11:52:59 - 11:53:09: 66.96.26.211:www (RDNS uf2.ServerNorth.net)
11:52:59 - 11:53:10: 66.220.2.5:www (RDNS ['ns1.keenspot.com', 'ns1.keenspace.com', 'binky.keenspace.com'])
11:53:00 - 11:53:10: 208.122.4.178:www (RDNS N/A)
11:53:00 - 11:53:01: 207.7.147.85:www (RDNS optimize.indieclick.com)
11:53:00 - 11:53:01: 64.4.241.33:https (RDNS www.paypal.com)
11:53:00 - 11:53:01: 204.11.109.21:www (RDNS a.tribalfusion.com)
11:53:01 - 11:53:08: 208.122.4.178:www (RDNS N/A)
11:53:01 - 11:53:05: 74.208.78.7:www (RDNS s214871675.onlinehome.us)
11:53:01 - 11:53:27: 66.220.2.5:www (RDNS ['ns1.keenspot.com', 'ns1.keenspace.com', 'binky.keenspace.com'])
11:53:02 - 11:53:05: 66.220.2.19:www (RDNS nineteen.keenspot.com)
11:53:02 - 11:53:09: 69.17.116.124:www (RDNS webhosting.speakeasy.net)
11:53:02 - 11:53:05: 66.220.2.25:www (RDNS twentyfive.keenspot.com)
11:53:03 - 11:53:13: 69.17.116.124:www (RDNS webhosting.speakeasy.net)
11:53:04 - 11:53:05: 66.220.2.25:www (RDNS twentyfive.keenspot.com)
11:53:04 - 11:53:14: 66.96.26.211:www (RDNS uf2.ServerNorth.net)
11:53:04 - 11:53:14: 66.96.26.211:www (RDNS uf2.ServerNorth.net)
11:53:05 - 11:53:06: 67.15.50.37:www (RDNS ev1s-67-15-50-37.ev1servers.net)
11:53:05 - 11:53:13: 66.249.93.166:www (RDNS ug-in-f166.google.com)
11:53:05 - 11:53:09: 69.17.116.124:www (RDNS webhosting.speakeasy.net)
11:53:05 - 11:53:11: 66.220.2.25:www (RDNS twentyfive.keenspot.com)
11:53:06 - 11:53:13: 66.249.93.166:www (RDNS ug-in-f166.google.com)
11:53:06 - 11:53:11: 66.207.163.2:www (RDNS N/A)
11:53:07 - 11:53:25: 12.18.170.211:www (RDNS frost.mtaonline.net)
11:53:08 - 11:53:13: 216.197.119.157:www (RDNS N/A)
11:53:08 - 11:53:11: 66.220.2.25:www (RDNS twentyfive.keenspot.com)
11:53:09 - 11:53:10: 207.7.147.85:www (RDNS optimize.indieclick.com)
11:53:09 - 11:53:11: 66.207.163.2:www (RDNS N/A)
11:53:09 - 11:53:10: 195.78.94.245:www (RDNS N/A)
11:53:10 - 11:53:25: 66.220.2.19:www (RDNS nineteen.keenspot.com)
11:53:10 - 11:53:11: 8.7.217.43:www (RDNS N/A)
11:53:10 - 11:53:11: 204.11.109.24:www (RDNS a.tribalfusion.com)
11:53:11 - 11:55:28: 209.101.90.33:www (RDNS dndorks.com)
11:53:11 - 11:53:13: 66.33.217.213:www (RDNS basic-kant.dawber.dreamhost.com)
11:53:11 - 11:53:12: 80.252.93.102:www (RDNS N/A)
11:53:11 - 11:53:13: 195.78.94.245:www (RDNS N/A)
11:53:12 - 11:53:19: 66.207.163.2:www (RDNS N/A)
11:53:12 - 11:53:13: 66.220.2.25:www (RDNS twentyfive.keenspot.com)
11:53:12 - 11:53:15: 72.29.92.15:www (RDNS server.whiteninjacomics.com)
11:53:13 - 11:54:22: 192.217.199.107:www (RDNS N/A)
11:53:13 - 11:53:19: 66.207.163.2:www (RDNS N/A)
11:53:13 - 11:53:19: 66.33.217.213:www (RDNS basic-kant.dawber.dreamhost.com)
11:53:14 - 11:53:17: 64.131.83.210:www (RDNS princess.questionablecontent.net)
11:53:15 - 11:53:16: 216.197.119.157:www (RDNS N/A)
11:53:15 - 11:53:19: 209.101.90.33:www (RDNS dndorks.com)
11:53:16 - 11:53:17: 8.7.217.43:www (RDNS N/A)
11:53:16 - 11:53:20: 64.233.171.104:www (RDNS rn-in-f104.google.com)
11:53:16 - 11:53:20: 64.233.171.104:www (RDNS rn-in-f104.google.com)
11:53:17 - 11:53:18: 8.7.217.43:www (RDNS N/A)
11:53:18 - 11:53:24: 208.122.4.178:www (RDNS N/A)
11:53:18 - 11:53:24: 208.122.4.178:www (RDNS N/A)
11:53:18 - 11:53:29: 66.249.93.166:www (RDNS ug-in-f166.google.com)
11:53:20 - 11:53:22: 207.44.216.40:www (RDNS 1002-3.lowesthosting.com)
11:53:20 - 11:53:22: 66.228.125.212:www (RDNS server3.blibs.com)
11:53:23 - 11:53:24: 217.163.21.31:www (RDNS ad1.vip.rm.ch1.yahoo.net)
11:53:23 - 11:53:24: 217.163.21.31:www (RDNS ad1.vip.rm.ch1.yahoo.net)
11:53:24 - 11:53:42: 69.89.31.88:www (RDNS box288.bluehost.com)

This bunch of sites is the webcomics I read. There are a few of them, as you can see. Now, we don’t actually need to go any further than this in dissecting my personal browsing habits to see where this falls apart. Several of the comics are hosted on Keenspot, a shared hosting service for a bunch of webcomics. So how do we discern, given the logs, which of them I actually visited on that specific address? Well, we can’t! This has everything to do with how webservers host non-SSL webpages.

At the core level a webserver runs on a machine, typically listening on port 80 (the www port). This webserver may provide any number of sites using what in the Apache world is known as virtual hosts: if you request a page from foo.com it will serve you one set of pages, and if you request a page from bar.com it will serve you another set of pages, but all of this happens over a connection to port 80 on the same machine. Couple this with the fact that a terrorist could be running a webserver that serves two sites, a reputable site that logs visits and a shady terroristy site (advocating privacy, or what have you) that does not, and it does not require huge amounts of training in Computer Science or systems administration to quickly see zillions of ways around this.
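
To make the virtual host point concrete, here is a toy sketch. It does not talk to any real server; the address, the VHOSTS table and the function names are all made up for illustration. The only real-world fact it leans on is that an HTTP/1.1 request names the site in its Host header, which travels inside the connection and is not among the logged fields.

```python
SERVER_IP = "192.0.2.80"  # placeholder address for the shared machine
VHOSTS = {
    "foo.com": "pages for foo.com",
    "bar.com": "pages for bar.com",
}


def build_request(host, path="/"):
    """Build a minimal HTTP/1.1 request; the site is named only in the Host header."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n\r\n"
    ).encode("ascii")


def serve(request):
    """Toy virtual-host dispatch: choose the site from the Host header."""
    for line in request.decode("ascii").split("\r\n")[1:]:
        name, _, value = line.partition(":")
        if name.lower() == "host":
            return VHOSTS.get(value.strip(), "default site")
    return "default site"


def logged_fields(server_ip, port):
    """What a connection-level log sees of the receiver: address and port only."""
    return (server_ip, port)


for site in ("foo.com", "bar.com"):
    print(logged_fields(SERVER_IP, 80), "->", serve(build_request(site)))
```

Both visits produce the same logged tuple even though two different sites were served.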

Fortunately we have expert politicians dealing with these things. In fact, in Danish law we have something called §20 questions where a minister can be forced to answer some question from a member of parliament (folketinget). Here we have a question asking the justice minister’s opinion on the fact that a survey indicated that 54% of educated Engineers and Computer Scientists thought they could circumvent the legislated logging. For the non-Danish readers I will translate the minister’s answer:

I have no further knowledge of the survey referred to in the question, including how and on what grounds Computer Scientists and Engineers think they can circumvent the requirements in the logging proclamation.

The purpose of the rules on logging is to prevent and solve very serious crime and it is difficult for me to imagine that Computer Scientists and Engineers in general would have a wish to try to circumvent the rules in this area.

It should be noted that it will, in itself, draw increased attention to a person if the police, in the course of an investigation of that person, discover that he has tried to circumvent the logging proclamation.

In other words, it is suspicious to circumvent the logging, even though over half of the higher-educated IT workforce believe they can circumvent it without issues. I guess the criminals are extra fearful on account of this; it’s not as if they are breaking a bunch of other laws already. Since I prefer not to be a suspect, I will not regale you with the ways this can be circumvented, but suffice it to say that the law is a joke, and the justice minister’s understanding of the implications is a joke. If it weren’t so very sad, I’d probably be laughing my ass off.

If you wish to redo this experiment, or if you just want to see exactly how much information is logged about what you are doing online, grab a copy of tcpspy and leave it running for a while. If you are in Denmark, then all of this is logged and tied to you personally (another requirement of the proclamation), or rather to the account holder of the internet connection you are using, because there is no way to distinguish between the individuals using a connection. It is saved for a year and made available for all investigations into ‘serious crime’. Welcome to the surveillance society; your privacy is gone.

Il Cucchiaio d’Argento

Friday, December 14th, 2007 | Personal | No Comments

My wife and I both appreciate good food, be it Danish, French, Chinese, Thai, Italian, or what have you. Our bookcase bears witness to this, as we have an entire shelf dedicated to cookbooks. A few days ago, when we were passing the time until my wife’s boots were fixed, we inevitably found our way into a bookshop and, of course, to the cooking section (I had to drag my wife away from the gardening section, but that’s a whole other matter).

In Denmark we have some solid classics like God Mad: Let at lave (Good food, easy to cook) and Frk. Jensens kogebog (Miss Jensen’s cookbook), and both are fairly complete with lots of information about cooking. However, the book I discovered in the bookshop put both to shame: Il Cucchiaio d’Argento (Sølvskeen in Danish, The Silver Spoon in English) is the traditional Italian cookbook, with more than 2000 recipes from all the Italian regions. As a testament to its completeness, the book weighs in at over a thousand pages and includes more than 20 recipes with rabbit, as well as themed recipes ranging from antipasto through pesci (fish) and carni (meat) to dolci (desserts).

The cookbook is entirely unwieldy and extremely lovely. While there are altogether too few pictures of the food, the book promises to turn you into a real connoisseur of traditional Italian food, and with its thorough descriptions of which kitchen equipment serves which purpose and its illustrations of how animals are cut and what each piece of meat is good for, it will, in my opinion, be a priceless reference when cooking. Lastly, as the book itself mentions, nothing goes to waste in the Italian kitchen, so it also features recipes for brain, ox jaws and calf’s head. Intriguing!

Buon appetito.

Clothes

Thursday, November 22nd, 2007 | Personal | No Comments

Is it just me, or are a lot of clothing brand websites absolutely abysmally designed, usability-wise? Flash is required, it is mostly impossible to link to images, and some annoying ‘music’ blares out of your speakers if you happen to find your way onto one of these sites. Hugo Boss and Falbe are two of the more egregious examples of this annoying behaviour. Bertoni, Eton and many others follow closely by featuring only a bunch of ‘fancy’ photos in a Flash application. Honestly, what were they thinking?! Slightly better is the Burberry website, but that was the only half-decent brand site I found in over an hour of searching!

Another problem is the Danish outlets (I hope for your sake that things are better in your country!). They have been designed with mostly the same mindset, except they are even worse: several of them show nothing but a page of the brand names they sell! Kaufmann and Din Tøjmand are among the worst; Tøjeksperten barely climbs above them by actually having a PDF version of its catalogue available online.

Please, design shops and clothes outlets, the web is not another TV station showing commercials 24/7 where consumers passively gawk at your magnificent creations. We want instant feedback. We want to be able to link easily to apparel so we can show it to friends and get opinions, and to find stores that carry the clothes so we can go there and try them on to see whether they fit us. It would be absolutely perfect if we could see whether the apparel is in stock in a specific store, so we don’t have to wade all the way across town to find another outlet that might or might not happen to have it. It is time you joined the digital decade; it’s the thing in fashion.

Life is good

Monday, November 19th, 2007 | Personal | No Comments

White tea

Japanese tea cup. White tea. Life is good.

Version Control Systems

Sunday, November 18th, 2007 | Personal | No Comments

Much as the topic of editors can spawn a religious war among developers, so can the topic of version control systems (or VCS for short). The old paragon of the development community, CVS, has felt less loved as of late, and CVS does, indeed, suffer from many issues: commits are not atomic, which can break the build if there is an inadvertent conflict somewhere in the middle of a commit; branching, or rather merging, is a lot more difficult than it need be; and the system is generally slow. Its main advantage is that it is old and understood by many. But, seeing as this is more of a bad excuse, like continuing to use Microsoft products instead of trying other things, I would like to go on a small literary journey through a few of the interesting (and not quite so interesting) alternatives to CVS that exist.

As one of the older contenders to CVS, we have Subversion, or svn for short. Its cutesy motto was ‘CVS done right’. There are still issues with it, though. Renaming isn’t implemented adequately, merging can still be troublesome, and like CVS it is centralised, requiring round trips to the server for many commands having to do with the history of the repository. Many projects have moved on to svn, but in the words of Linus Torvalds, ‘There is no way to do CVS right’.

Also in the centrally controlled group, although proprietary and costing you a fortune, lies Microsoft’s newest wonder technology, Team Foundation Version Control (TFVC). Although it is nicer than Visual SourceSafe, the ridiculous piece of software that they pushed at you before, it is still incredibly annoying to use. Granted, I prefer to run my source control at the command line rather than having all sorts of strange things happen inside my IDE, but what happens when you run ‘tf get /recursive’ to get everything recursively from the VCS? It pops up a GUI when there are conflicts! Running ‘tf help’ starts a graphical help browser… after a minute or two! Another interesting issue has been that it thinks some files are the most recent locally, even if there are no files locally at all. And, probably due to Visual Studio, when you get the latest version of a solution, it has a tendency to occasionally check out some project files to make changes to their configuration. All in all, this creates one very annoying user experience. Now, on the bright side, merges between branches are fairly easy to do, and branching is an easy operation as well, so kudos to Microsoft for at least getting one thing right. I cannot, though, recommend it to anyone, given that much better solutions exist, at least for source control management. (The case for Team Foundation Server really lies in the nice integration for project managers as well, but that is fairly irrelevant when it comes to the quality of the VCS.)

But, enough of the centralised version control systems. They are a thing of yesteryear. Come to replace them are the distributed version control systems, or DVCS’s for short. These systems largely work in the same fashion, but with some differences in implementation. Their main benefit, though, is that they allow much more diverse workflows and can be used in just the way you like to work. So, if you prefer to work with a central repository, you can do that without any issues, just like with CVS, svn or TFVC, but if you prefer to work in a more disconnected model, or utilise some of DVCS’s staging workflows, they do not keep you from doing just that.

Some of the more popular contenders in the DVCS world are git, Mercurial (or hg for short), and Bazaar (bzr). Before we look at each of these, though, it would perhaps be nice to look at some of the workflows that can be useful with a distributed model. Throughout this, it is important to note that in a distributed model, when you do a ‘checkout’ from a source tree, your copy is also a full repository. We’ll get back to this in a bit.

Centralised with staging

In a centralised model, everything works (more or less) as you know it from CVS. The main difference, however, is that since each branch is a repository in its own right, we can do some interesting release staging. If we designate the main development repository as ‘mainline’, this will be where the developers share their finished features. Once a development team signs off on a feature being completely implemented and ready for testing, they tell the test team to pull it into their ‘test’ repository. Once the test team thinks a feature is ready, they will tell the QA team to pull it into their ‘QA’ repository, and finally when QA signs off on everything being perfectly swell, you can pull the feature into the ‘production’ repository that is used to create the official build of whatever it is that you’re developing. This is all illustrated in the drawing below.

Central staging workflow

What is interesting to remember here is that history is complete at each step. You know exactly who did what, when, and where for each step in the software release flow. The test team can easily look in their own history to see which developer introduced a fault, as each tester will have the entire history right there on his own machine. There is no easy way to do this kind of staging with centralised VCS’s. Also note that this staged release can be accomplished without centralised repositories, but a central model more accurately reflects how most businesses work, and how most managers prefer things to work.
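
The staging flow can be mimicked with a small toy model. This is not how any real DVCS stores history (real systems keep a graph of revisions, not a flat list), and the class, revision ids and commit messages are invented for the example. It only shows why every stage ends up with the complete history.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Revision:
    rev_id: str
    author: str
    message: str


@dataclass
class Repository:
    """Toy repository: a name plus an ordered list of revisions."""
    name: str
    history: list = field(default_factory=list)

    def commit(self, rev_id, author, message):
        self.history.append(Revision(rev_id, author, message))

    def pull(self, other):
        """Copy every revision from `other` that is missing here, keeping order."""
        known = {rev.rev_id for rev in self.history}
        for rev in other.history:
            if rev.rev_id not in known:
                self.history.append(rev)
                known.add(rev.rev_id)


mainline = Repository("mainline")
test = Repository("test")
qa = Repository("QA")
production = Repository("production")

mainline.commit("r1", "alice", "Add export feature")
mainline.commit("r2", "bob", "Fix export encoding")

# Each team pulls from the stage before it once that stage has signed off.
test.pull(mainline)
qa.pull(test)
production.pull(qa)

for rev in production.history:
    print(rev.rev_id, rev.author, rev.message)
```

Even at the production end, the author and message of each change are still there, which is what lets a tester trace a fault back to the developer who introduced it.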

Small, completely distributed project

Let us imagine that Alice and Bob are working on a system for visualising changes across branches in a version control system. Mostly this system has a unified functionality, but since each of the developers uses it in a different context as well, they each require some custom features to ease their own workflows. Since they both really like DVCS’s, they decide to just pull the changes they each find interesting from each other, thus each creating a fully custom visualiser that helps them in their workflow. As it will probably help to understand why people may find it interesting to have different setups for this, let us consider each of them as a persona of their own:

Alice is a maintainer on a large open source project, where her role is to merge the patches submitted by everyone. Most of the patches are sent to her by mail, so for each mail she needs to review the patch and, if it seems sensible, merge it into the official repository. It therefore makes sense that her main view of the visualisation can take a selected mail, perform the merge listed in it and let Alice review all the code quickly and easily, speeding up the general process of improving the software.

Bob is a code reviewer at a middle-sized software company where he spends about half of his workday reviewing other people’s code against the product mainline. To cut down on the time that he has to spend figuring out what is new and old since the last time he has reviewed some code, he wants to merge the recent changes into his own branch and then quickly be able to look at the code diff for each change since his last merge. Thus it makes sense for Bob to get a good view of the different branches in the history and to quickly be able to get a diff view for each of these changes, so he can quickly address his concerns to the correct developers on the project.

With this in mind, it is now possible for each of them to develop any number of features, while they can cherry-pick (selectively choose) the changes they are interested in, from each other. So rather than over-engineer the application to do all sorts of things, they just adapt the source code to their own need, without any one of them having the ‘wrong’ version. This workflow, while simple, is illustrated below for the sake of completeness.

Alice and Bob's workflow

Tracking an upstream CVS

As part of my thesis work, I am using a CVS repository with a large framework for doing code transformations that my advisors and some other people have written. Since they aren’t necessarily interested in all the code I need for my thesis, it is easier for me to keep the history of their repository while adding my changes locally. This might’ve been solved by using a CVS branch, but CVS branches are notoriously sad to work with. Instead, I have used a DVCS to import the entire CVS history into a local branch, and then I branch from this (to keep a pristine copy of the CVS repository) for my different features. As an added benefit, I can just rerun the import whenever there are changes in the CVS repository, and these changes are reflected in my DVCS branch as if they were made natively in the DVCS. This would never have worked with a centralised system. In essence, my advisors can keep on working on their system without my code causing them any worries, and I can keep introducing their changes to my code as they commit them. A win-win situation, really. The flow, also fairly simple, is illustrated below.

Thesis workflow

DVCS – a redux

Looking at the workflows above (and there are countless more), we see that DVCS is able to easily support not only large corporations, but also small groups of developers who want to share work.

We still need to look a bit at the three contenders: git, hg and bzr. They pretty much all have the same features, so it is perhaps most interesting to look at the differences.

git was created by Linus Torvalds for use with the Linux kernel after a larger controversy with the original DVCS provider, BitKeeper. Being an operating systems guy, Linus has ensured that git is lightning fast, but being an operating systems guy, Linus has also caused the general usability of the system to be abysmal, unless you are ready to devote a non-trivial amount of time to learning how it all works (it is improving with each new version, though, so not all is abysmal any longer). Another unique feature of git is that it tracks your content, not your files, but I will not go into what implications this has. Since git is used with the Linux kernel, it should be proven beyond doubt that it works on large projects. git owes a lot of its speed to some very specific features of Linux file system handling, so it works rather abysmally on Windows, making it a non-contender if you do a lot of cross-platform work.

hg was also created as a replacement for BitKeeper for the Linux kernel, but once Linus got underway with his own project he did, of course, not see much incentive for using hg. However, development on hg has continued and it has proliferated. It is implemented in Python, with a few modules in C, and is generally slower than git, but not abysmally so. One of the larger users of Mercurial is the Mozilla Corporation, with Firefox and friends. Thanks to Bryan O’Sullivan, there is also a very thorough book on using hg, and on using DVCS’s in general. It is very good reading if you are interested in the topic.

bzr is backed by Canonical, the people behind Ubuntu, and it is written in Python under the motto ‘Correct first. Fast later’. This is one of the reasons why there has never been a working tree corruption in bzr, unlike in most other systems. In reality there are few differences between bzr and hg, but in general there are more plugins available for bzr, and in my very subjective opinion, bzr feels more polished and has very nice usability. It is slower, due to the general development philosophy, but the speed issues are being addressed and it doesn’t feel terribly slow to work with any longer.

I have, in case anyone was wondering, chosen to use bzr for my personal projects, but no matter your taste, I would encourage you to take a look at distributed version control systems and see how they might help with your workflow. And stay away from CVS and TFVC.

Programming

Tuesday, November 13th, 2007 | Personal | No Comments

When did the fun go –
and all the joy?
Whereto, I have forgotten.

The human(e) operating system

Tuesday, October 16th, 2007 | Personal | No Comments

A lot has happened in the OS world over the past year or so. Microsoft has released their newest and greatest OS, Vista, which is riddled with ‘difficulties’ and ‘issues’, such as reportedly throttling your network performance to 10% of full capacity while you are listening to media files, due to DRM checking. In fact, DRM is a cornerstone of Vista’s construction. Coupled with Microsoft’s ever-increasing arrogance and refusal to play nice with others (yes, I have been rather slow to fully realise the scope), this made me decide that it was about time to try something else when my computer died last December (blown PSU and motherboard and damage to the CPU, in case anyone was wondering).

Now, I have written about trying Ubuntu before, here, and, to my own discredit, the remarks that flowed from my keyboard about it were somewhat scathing. The truth is that I probably expected it to work like Windows, with the same mannerisms, but, naturally, GNU/Linux does not work like Windows. So, armed with a decent amount of experience from running GNU/Linux now and then since the fairly early days, a good deal of being fed up with Microsoft, a dislike of Apple’s tricks for vendor lock-in, and a general idea that it was time to start acting according to my own ideals, I once again went looking at the GNU/Linux landscape.

One of the principal things to remember about GNU/Linux is that it is founded on openness, collaboration, and… choice. This means that there are a lot of different distributions: Ubuntu, Fedora, Debian, Slackware, and many, many others. Each of these distributions serves a particular segment of users and (mostly) has its own way of organising things, so where do you start? One option is to investigate them all at Distrowatch, or for a slightly more limited number of distributions you can check out Zegenie Studio’s Linux Chooser. Since I have run GNU/Linux several times before, I already knew which distribution I would like to use: Ubuntu.

So with a lot of new shiny computer parts assembled, I booted the Ubuntu 7.04 (Feisty) LiveCD, clicked install, waited some time, and then I had a fully functioning system, except for the integrated network chipset, which was unknown to Linux. The vendor had thoughtfully supplied a network driver that only worked on an older kernel version, and it generally wasn’t behaving very well after I tried to port it to a newer one (once a developer, always a developer, eh?). Fortunately, I had an old PCI network card lying around, which I installed in the computer, and then everything ran wonderfully.

The system isn’t too unfamiliar for a Windows user, but it has its small quirks here and there. One of the things I have been doing a lot is .NET development, in particular in C#, so after having played around a bit with configuring themes, setting up power management and things like that, I had my first panic attack: ‘there is no Visual Studio here, what do I do?!’ Carefully repeating the breathing exercises I had prepared in advance, I quickly relaxed. There was bound to be something around for working with .NET programs, given Mono’s track record of keeping up with the .NET framework. Sure enough, there is MonoDevelop, which began as a port of the open-source Windows IDE SharpDevelop. At the time I used it, there was still no debugger support and the auto-completion only sort of worked. I hear it has gotten better since, but unfortunately C# started to annoy me around the time of the switch anyway (I shall try to remember to write more about my issues with it in another blog post), so I haven’t really done much personal development for .NET since then.

All in all, I am glad that I chose to make the effort of getting used to GNU/Linux and its way of doing things, as it frees me from the vendor lock-in that Microsoft and Apple dish out to their customers. Once you get used to how things work in this different world, they actually just work. As an added bonus, several of the distributions come with a centrally managed software repository that can be checked for updates to all your applications at once. As an added added bonus, they never install updates without telling you, unlike some other companies.

Colour me convinced for a brighter future with GNU/Linux and open standards.
