Since this seems to be fairly constructive, I’ll keep going. Those not interested in Github or the dynamics of open source code creation can safely tune out. Chris, the Github dude, has some more to say:
I mentioned the Network Graph and Fork Queue but David mentioned neither. I think he doesn’t know what they are, probably because I didn’t explain what they are 🙂
I had had a look at them, and they’re handy tools (I would never accuse Github of not doing good work, that’s not the issue at all), but I would respectfully submit that perhaps Chris is looking at things from a very Github-centric perspective, in which it’s second nature to go look at those. They aren’t obvious, and certainly don’t jump out at the user to say “hey, maybe this code you’re looking at isn’t the latest and greatest!” For someone just cruising around, perhaps it should be more evident who’s the ‘top dog’?
Chris then goes on with a very helpful and illustrative demo of how things work at Github, which is good stuff.
However, at the end, he says something that I don’t agree with:
It may seem strange, and perhaps even like a lot of work. “Why should I have to check to see which is the most current? In the old model, there’s always a canonical repository.”
That’s precisely the problem. It does seem like a lot of work, especially when your search space is not limited to Github, but may include other places like Sourceforge, RubyForge, Google Code, project-specific sites, and so on.
In the old model, actionwebservice wouldn’t have made it past 1.2.6. Welcome to distributed version control.
Plenty of ‘old style’ projects have survived beyond their founders’ interest in the project. What happens is that you ask for permission to work on the project, and either:
It’s given to you, in which case you can keep working on the canonical code. For instance, someone could have asked DHH to work on the RubyForge version of actionwebservice. Did they? Did he say no? At the very least, he could have been asked to point the RubyForge actionwebservice page at some other site with a more current version of the code.
It isn’t, and you have to fork the project, in which case, sure, you might as well have been using the Github model.
I’ve often found that people are happy to let you contribute to their projects, though, and part of my original point with all of this is that if people just go spitting out forks willy-nilly, it creates a “paradox of choice” type problem, and perhaps takes something away from the community aspect of open source projects. As people are fond of saying at the Apache Software Foundation, it’s about the people, not necessarily about the code. I’m not saying that forks are always bad and that everything should be centrally done, but there’s a balance to be struck between people just working on their own, and some sort of onerous, bureaucratic Central Project Authority. It’s nice that people who want to improve the code make themselves heard on a mailing list/forum/whatever, and thus cover the “people” part of merging in new code and new ideas. More often than not, help and contributions are more than welcome. Design decisions and conversations recorded on mailing lists are available for people to peruse in the future when they have questions.
Just to repeat something that bears repeating: I am not claiming that Github will lead to social disintegration of open source projects or anything drastic like that. However, I’m a bit wary of certain patterns I’ve seen. It’s certainly possible that I’m wrong – as I mentioned, one error I might be making is that people are dumping code on Github that they simply wouldn’t have shared otherwise, because Github makes it so easy. I do hope my worries are unfounded, and in any case, it’s a good thing that the Github guys are interested in the problem too, and will hopefully do things to alleviate it where possible.