Gmail: Associate Addresses to Interlocutors

I have a number of different email addresses that all wind up in my Gmail account. Some of them are personal, some business. Gmail is smart enough to use a particular email address in your reply when someone writes to you with it, but it would be nice if it went a step further and looked at what address you normally use with a given person, and defaulted to that, so that whenever I write to person A, it uses the business address, and switches to the personal address when I write new emails to person B.

I guess that’s one of the problems with relying on software you don’t have the source code to.
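For what it’s worth, the rule I have in mind is simple enough to sketch in a few lines of shell. This is only an illustration, not anything Gmail exposes: the log format (one "recipient sender-address" pair per line), the addresses, and the default_from function name are all made up for the example.

```shell
#!/bin/sh
# Hypothetical sent-mail log: one "recipient sender-address" pair per line.
log=$(mktemp)
cat > "$log" <<'EOF'
alice@example.com me@business.example
alice@example.com me@business.example
alice@example.com me@personal.example
bob@example.org me@personal.example
EOF

# Print the sender address used most often when writing to the given recipient.
# Ties are broken arbitrarily by sort.
default_from() {
    awk -v to="$1" '$1 == to { print $2 }' "$log" \
        | sort | uniq -c | sort -rn | head -n 1 | awk '{ print $2 }'
}

default_from alice@example.com   # prints me@business.example
default_from bob@example.org     # prints me@personal.example
```

A real implementation would presumably weight recent messages more heavily, but counting is enough to show the idea.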

On Debian, but not on Time

It was kind of interesting to see this the other day, on the train from Monselice to Padova:

[Photo: Debian Trenitalia]

That’s a Debian system, kernel 2.6.8.1, which appears to have some problems booting. That’s not the only problem with the train system in Italy, unfortunately:

[Photo: Debian Trenitalia]

My Padova->Monselice train, roughly a 20 minute ride, was 65 minutes late, and would end up being nearly an hour and a half late. Ouch! To be fair, the train system in the US is actually even worse, if that can be believed. And in Austria, while the trains run on time, they’re so expensive and run so infrequently that it’s often much more convenient to drive.

Github Part III

Since this seems to be fairly constructive, I’ll keep going. Those not interested in Github, or in the dynamics of open source code creation, can safely tune out. Chris, the Github dude, has some more to say:

http://ozmm.org/posts/forking_continued.html

I mentioned the Network Graph and Fork Queue but David mentioned neither. I think he doesn’t know what they are, probably because I didn’t explain what they are 🙂

I had had a look at them, and they’re handy tools (I would never accuse Github of not doing good work, that’s not the issue at all), but I would respectfully submit that perhaps Chris is looking at things from a very Github-centric perspective, in which it’s second nature to go look at those. They aren’t obvious, and certainly don’t jump out at the user to say “hey, maybe this code you’re looking at isn’t the latest and greatest!”. For someone just cruising around, perhaps it should be more evident who’s the ‘top dog’?

Chris then goes on with a very helpful and illustrative demo of how things work at Github, which is good stuff.

However, at the end, he says something that I don’t agree with:

It may seem strange, and perhaps even like a lot of work. “Why should I have to check to see which is the most current? In the old model, there’s always a canonical repository.”

That’s precisely the problem. It does seem like a lot of work, especially when your search space is not limited to Github, but may include other places like Sourceforge, RubyForge, Google Code, project specific sites, and so on.

In the old model, actionwebservice wouldn’t have made it past 1.2.6. Welcome to distributed version control.

Plenty of ‘old style’ projects have survived beyond their founders’ interest in the project. What happens is that you ask for permission to work on the project, and either:

  1. It’s given to you, in which case you can keep working on the canonical code. For instance, someone could have asked DHH to work on the RubyForge version of actionwebservice. Did they? Did he say no? At the very least, he could have been asked to point the RubyForge actionwebservice page at some other site with a more current version of the code.

  2. You have to fork the project, and in that case, sure, you might as well have been using the Github model.

I’ve often found that people are happy to let you contribute to their projects, though, and part of my original point with all of this is that if people just go spitting out forks willy-nilly, it creates a “paradox of choice” type problem, and perhaps takes something away from the community aspect of open source projects. As people are fond of saying at the Apache Software Foundation, it’s about the people, not necessarily about the code. I’m not saying that forks are always bad and that everything should be centrally done, but there’s a balance to be struck between people just working on their own, and some sort of onerous, bureaucratic Central Project Authority. It’s nice that people who want to improve the code make themselves heard on a mailing list/forum/whatever, and thus cover the “people” part of merging in new code and new ideas. More often than not, help and contributions are more than welcome. Design decisions and conversations recorded on mailing lists are available for people to peruse in the future when they have questions.

Just to repeat something that bears repeating: I am not claiming that Github will lead to social disintegration of open source projects or anything drastic like that. However, I’m a bit wary of certain patterns I’ve seen. It’s certainly possible that I’m wrong – as I mentioned, one thing I might be overlooking is that people are putting code on Github that they otherwise simply wouldn’t have shared at all, because Github makes it so easy. I do hope my worries are unfounded, and in any case, it’s a good thing that the Github guys are interested in the problem too, and will hopefully do things to alleviate it where possible.

More Github

Before I get started, I want to make it quite clear that I have no problems with Github – au contraire, they provide a very nice service. What my speculation centers around is the social dynamics of Github.

One of the Github guys writes a nice response to my original post, here:

http://ozmm.org/posts/linux_vs_classic_dev_style.html

First of all, apologies for the comment bug. In theory, it has been fixed by the Typo guys, I just need to find the time to upgrade.

Chris mentions that it’s possible to create a “SourceForge style” project on Github. I see that it’s possible to add collaborators, so I guess that’s what he means. I still find it a bit ugly to have things at …/davidw/… but that’s not very important if the URL is at least stable, rather than hopping around to whoever has the latest fork and updates of the code in question.

In my other post, I talked about people being less interested in doing what it takes to contribute to existing projects, rather than forking them, unless it’s really called for. Here’s a concrete example of how things might go wrong. At my current consulting gig, we were looking at utilizing actionwebservice to do some SOAP stuff. That code is not part of Rails anymore, but that’s not the problem – surely someone has decided to take up maintenance of the gem, right? Well, yes… but let’s look:

root@fortrock:~# gem1.8 search -r actionwebservice

*** REMOTE GEMS ***

actionwebservice (1.2.6)
datanoise-actionwebservice (2.2.2)
dougbarth-actionwebservice (2.1.1)
nmeans-actionwebservice (2.1.1)

So we have the original, clearly out of date, and three others. Yuck… I guess we’ll use the 2.2.2 one, but who knows if perhaps the others contain some good stuff too? Let’s Google it.

http://www.google.com/search?hl=en&q=actionwebservice

The first link is to a RubyForge project, which is basically dead. A few links down, there is a link talking about a more recent version, at the Datanoise site. However, it’s one of many. Confusing. If you go to the github page for the datanoise version of actionwebservice, the README has this:

The latest Action Web Service version can be downloaded from
http://rubyforge.org/projects/actionservice

Hrm. We were just there and that doesn’t seem quite right… Sure, it’s an easy enough oversight to ignore, but it adds to the confusion.

Now, this is not really a serious problem, but do you see how it could get worse over time, as various people fork the code, write blog entries, and so on? It would get harder for someone interested in utilizing the code in question to ascertain which is “the” version. It’s not hard to imagine situations where two forks add different bits of useful code. Typically, in situations like this, new users, those who don’t know the code at all, are also the people least able to look through all the diffs and changes and log entries and what not in an attempt to merge any changes that might be useful. This leads to a “paradox of choice” type of problem.

Another consideration, this one more positive, is that perhaps people are putting code in Github that, in the past, might have simply remained on their own computers, with no effort whatsoever to share it, so in that sense, Github might be doing people a service by at least making the code public.

However, even in that case, if the service fills up with code that people have simply “thrown over the wall”, rather than creating a genuine open source project, with a community, communication channels, and so on, it could get a reputation for having code that you have to handle with care, as a lot of it is just stuff slapped up there without much thought to quality or continuity.

Anyway, the Github guys have done a nice job with the service, and I’m sure they’ll continue to improve it. Who knows, perhaps they have some good ideas about mitigating some of these potential problems that they’ll be rolling out soon.

Rivet in Action

I’m terrified of flying (actually heights more than flying, but planes tend to go pretty high), and seeing all the images of Flight 1549 floating down the river didn’t help things. I’m glad everyone got out ok.

An interesting detail about the whole story though, from my point of view at least, is all the press hits that Karl Lehenbauer’s http://flightaware.com/ got – it was mentioned on CNN and a number of other places, and seemed to handle the traffic ok. A little known fact is that, as can be seen from the HTTP headers, FlightAware is built on Apache Rivet, which was one of my first serious open source projects. It’s good to see them doing well, although it’d be nice to see them help out a bit more with Rivet. It’s still good code, and does what it does quite well, but needs some love.

Developer > Project, or Project > Developer(s)?

I’ve been fooling around with “git” lately – it’s useful for local projects where I don’t want to have to set up a remote server, and it also seems to be “the way the wind is blowing” – lots of open source projects are starting to use it, so I might as well familiarize myself with how it works.

The github service seems to be very popular at the moment, and it’s not hard to see why. They’ve created a very useful service, and give it away completely free for open source use. That’s a pretty good deal! It’s also an easy to use site that’s well integrated with git, so you can see lots of stuff graphically instead of fooling around with command line tools.

However, I have a nagging doubt. Github apparently bases everything around the developer. In other words, projects hosted there are mostly attached to a developer who ‘owns’ the project, rather than existing on their own. There are a few exceptions – there seems to be a ‘rails’ role account to manage Rails, rather than having it belong to, say, DHH, so it’s not impossible to do something in a project-centric fashion, but it doesn’t seem to be very common either.

What I’m wondering is what sort of effects this might have in the long term. Github, and git itself, make it very easy to ‘fork’ projects. Could, or will, this lead to a situation where people don’t try so hard to get their code in the ‘official’ branch? What happens when someone loses interest in their project? Sure, someone else can fork their repository, but if all the links were pointing to the first guy, because he was the main hacker on a project for X years, they’re not going to switch overnight even if guy B takes over maintenance. And in the meantime, casual searchers might turn up the ‘dead’ branch, or simply be confused as to who’s doing what. This could make things confusing for someone who just wants to check out some code and start using it: trying to sort out the various revisions and ancestry of a project could be quite discouraging.

At the Apache Software Foundation, on the other hand, people are obviously very important, but projects are the organizational unit that people get involved with, and a lot of emphasis is placed on not having one big benevolent dictator, but on having a group of people who make decisions and act as arbiters of what goes into the project. People can come and go as they please, but the project, and, importantly, the links to it, remain the same over time (barring any big, unforeseen blowups).

As far as I can tell, this is not a git problem, but something related to how github organizes their site. And, of course, it’s possible that I have simply missed something due to my lack of knowledge of git, and not being more than superficially familiar with github. Even if I’m right, it shouldn’t take that much effort for github to create a system where it’s possible for multiple people to ‘own’ a project.

Git is a Pain in the Ass

Everyone’s been talking about how great and wonderful git is, and I’ve tried using it for a few projects of my own, on a local basis, sort of like an advanced form of ‘rcs’. So I thought I’d try it out in a little bit more complicated setup, for a site I host on my web server. I want to have a main repository, that I “check out” (or ‘clone’, in git terms) on my laptop. My laptop can’t be the main repository, because I also want to be able to commit from the web server (once in a while a quick live update is called for), so I need a stable address. Ok, I’ll put it in my home directory on the web server, just to see how things work.

mkdir foobar
cd foobar
git init

Ok, so far so good. Now I try and clone it on my laptop, where I will then add files, commit, then push them.

Oops, that doesn’t work. You can’t clone an empty repository.

Ok, I add something on the server where I init’ed it, an empty file, just to give it some content. Now cloning it works on the laptop. Great, we’re in business! I add a bunch of files from the project to the laptop repository, commit, then push them. Seems to be ok so far… I do a checkout on the remote/server machine, and I see my files. Good. Ok, let’s try making a change on the laptop. I remove a file, commit it, push it (which is already an extra step compared to subversion… hrmph). I do a checkout on the server, but it won’t erase the file I removed on the laptop. Weird. I google around a bit, and find that this is supposed to be for my own good, so I won’t wreck things on the server. I need to do git reset --hard (which isn’t a very reassuringly named command) there, and then a checkout, and now things work.

That, however, is a lot of work just to commit and update stuff! I ask for some help on #git, where they mention the --bare option to init. Since I’m just messing around, I go back and redo the init step, wiping the old repository. Now I try and clone that to start over again. Oops, I forgot, I can’t clone it because it’s empty. Grrrrrr…. this is getting annoying.

So, the kind folks on #git tell me I should push to the new server repository from one that’s already populated. Ok, let’s try setting up the laptop:

git init
... add/commit some files ...
git remote add origin ...myurl...

Now let’s try pushing. Nope, still doesn’t do it, I’m missing something. Frustrating. A bit more fiddling and googling, and I find:

git push origin master

Aha! It worked! Weird. And now a simple git push works too. The whole thing seems kind of shaky in the sense that it’s not very confidence inspiring: I feel like it wouldn’t take much to make a wrong turn and find all my files gone forever. I can see some of the advantages, and will likely stick with it – git is quite convenient for local files that I might not have bothered putting under version control in the past, but it’s also a bit more “bureaucratic” in that you have more steps to do, and you have to fill in the forms just so… or else!
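For anyone retracing these steps, here is a rough sketch of the sequence that ended up working, with both "machines" simulated as directories under one temporary folder. The names server.git, laptop and webcopy, the file contents, and the demo identity are all made up. The -u flag on the first push is my own addition: it records the upstream branch, which newer git versions want before a plain git push will work. The symbolic-ref lines only pin the branch name to master so the sketch behaves the same whatever the local default branch is.

```shell
#!/bin/sh
# Sketch only: "server.git", "laptop" and "webcopy" are made-up names, and both
# "machines" are just directories under one temporary folder.
set -e

demo=$(mktemp -d)
cd "$demo"

# Identity for the demo commits, so the script does not depend on global git config.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

# "Server": a bare repository has no working tree to get out of sync with pushes.
git init --quiet --bare server.git
git --git-dir=server.git symbolic-ref HEAD refs/heads/master

# "Laptop": an ordinary repository that already has content.
git init --quiet laptop
cd laptop
git symbolic-ref HEAD refs/heads/master
echo "hello" > index.html
git add index.html
git commit --quiet -m "First commit"

# The first push names the remote and the branch; -u records the upstream.
git remote add origin "$demo/server.git"
git push --quiet -u origin master

# After that, a plain "git push" is enough.
echo "more" >> index.html
git commit --quiet -am "Second commit"
git push --quiet

# Any other checkout (the live site, say) can now clone the populated repository.
cd "$demo"
git clone --quiet server.git webcopy
```

On a real setup the remote URL would be whatever address the server repository lives at, rather than a local path.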

Twitter? Hrm. Ok…

Yoav’s post about twitter was interesting:

http://yoavs.blogspot.com/2009/01/one-curious-thing-about-twitter.html

Personally, I don’t really see the appeal. So far there are only two uses I have for it:

  • Lance Armstrong writes a fair bit, and even posts some pictures, and since I am a huge cycling fan, that’s kind of fun to read.

  • I use the search API to keep track of people talking about my stuff… Hecl, LangPop.com, and so on. That’s kind of handy, but it feels more like a big wiretap than a “conversation”.

Other than that, it’s another firehose of data that I really don’t need – I probably read too much junk on the internet as it is.

Rant: Ubuntu, Google, J2ME

Long day, lots of broken stuff. Rant time:

  • Ubuntu’s Intrepid Ibex has way too many regressions. I’ve mentioned this before when talking about wireless, but also on the sidelines are bluetooth and my laptop’s multimedia keys. Other things are probably slipping my mind, but that’s what’s bugging me today. I’ve used Linux for a while, and am used to not always having everything working just right, but “it’s never worked” is less annoying than “it used to work but this release broke it”. Also, I bought this computer from Dell because it shipped with Ubuntu. I would have expected the Ubuntu guys to have a few around themselves to test on prior to release.

  • J2ME marketing: http://blogs.sun.com/hinkmond/entry/getjar_3rd_annual_mobile_awards GetJar is a nice service, and I put my own ShopList app there, but you can’t seriously compare it to the Apple or Android stores without insulting my intelligence. What percentage of people with J2ME capable phones actually use GetJar? Is it on their phone when they turn it on? Can developers actually sell stuff there?

  • Google has started dumping lots of my mail in the spam folder since I switched the domain over to a new server. I can’t believe they’re dumping so much non spam mail, and yet can’t figure out that I have never, not once, received an email written in Chinese characters that I actually wanted to read.

Add in some bugs and broken stuff of my own, and it’s made for a frustrating day.

Woops – email outage

I should have spent Sunday taking care of the flu that I managed to catch from my daughter rather than working, but instead I was finishing up my move from Slicehost to Linode. So it’s no surprise that I made a mistake and didn’t get postfix configured quite right, and lost a few days of email to @dedasys.com. Argh! Hopefully nothing too important.