Github Part III

Posted by David N. Welton Thu, 22 Jan 2009 15:55:00 GMT

Since this seems to be fairly constructive, I'll keep going. Those not interested in Github or the dynamics of open source code creation can safely tune out. Chris, the Github dude, has some more to say:

http://ozmm.org/posts/forking_continued.html

I mentioned the Network Graph and Fork Queue but David mentioned neither. I think he doesn’t know what they are, probably because I didn’t explain what they are :)

I had had a look at them, and they're handy tools (I would never accuse Github of not doing good work, that's not the issue at all), but I would respectfully submit that perhaps Chris is looking at things from a very Github-centric perspective, in which it's second nature to go look at those. They aren't obvious, and certainly don't jump out at the user to say "hey, maybe this code you're looking at isn't the latest and greatest!". For someone just cruising around, perhaps it should be more evident who's the 'top dog'?

Chris then goes on with a very helpful and illustrative demo of how things work at Github, which is good stuff.

However, at the end, he says something that I don't agree with:

It may seem strange, and perhaps even like a lot of work. “Why should I have to check to see which is the most current? In the old model, there’s always a canonical repository.”

That's precisely the problem. It does seem like a lot of work, especially when your search space is not limited to Github, but may include other places like Sourceforge, RubyForge, Google Code, project-specific sites, and so on.

In the old model, actionwebservice wouldn’t have made it past 1.2.6. Welcome to distributed version control.

Plenty of 'old style' projects have survived beyond their founders' interest in the project. What happens is that you ask for permission to work on the project, and either:

  1. It's given to you, in which case you can keep working on the canonical code. For instance, someone could have asked DHH to work on the RubyForge version of actionwebservice. Did they? Did he say no? At the very least, he could have been asked to point the RubyForge actionwebservice page at some other site with a more current version of the code.

  2. You have to fork the project, and in that case, sure, you might as well have been using the Github model.

I've often found that people are happy to let you contribute to their projects, though, and part of my original point with all of this is that if people just go spitting out forks willy-nilly, it creates a "paradox of choice" type problem, and perhaps takes something away from the community aspect of open source projects. As people are fond of saying at the Apache Software Foundation, it's about the people, not necessarily about the code. I'm not saying that forks are always bad and that everything should be centrally done, but there's a balance to be struck between people just working on their own, and some sort of onerous, bureaucratic Central Project Authority. It's nice that people who want to improve the code make themselves heard on a mailing list/forum/whatever, and thus cover the "people" part of merging in new code and new ideas. More often than not, help and contributions are more than welcome. Design decisions and conversations recorded on mailing lists are available for people to peruse in the future when they have questions.

Just to repeat something that bears repeating: I am not claiming that Github will lead to social disintegration of open source projects or anything drastic like that. However, I'm a bit wary of certain patterns I've seen. It's certainly possible that I'm wrong - as I mentioned, one error I might be making is that people are dumping code on Github that they simply wouldn't have shared otherwise, because Github makes it so easy. I do hope my worries are unfounded, and in any case, it's a good thing that the Github guys are interested in the problem too, and will hopefully do things to alleviate it where possible.

3 comments

  1. aleco
    about 4 hours later:

    What if you could do the following?

    root@fortrock:~# gem1.8 search -r actionwebservice

    REMOTE GEMS

    actionwebservice (v1.2.6, 0 commits in the past 90 days)
    datanoise-actionwebservice (v2.2.2, 73 commits in the past 90 days)
    dougbarth-actionwebservice (v2.1.1, 24 commits in the past 90 days)
    nmeans-actionwebservice (v2.1.1, 7 commits in the past 90 days)

  2. David Welton
    about 5 hours later:

    Aleco, that would help some, I suppose, but doesn't fix the Google issue. I think it's more of a people issue than one that can be solved by technology. People need to agree to have a canonical site for a project.

  3. gwoo
    7 days later:

    David, this thread is very interesting to me because I had a similar impression of the social dynamics of putting "me" before the "project". For this reason, I built http://thechaw.com to try to balance the DVCS with a traditional project-centric model. I have been very happy with the result so far. In Chaw, the original project is generally started by an individual who quickly adds contributors. A new developer may fork the original project to show they have something of value that can be merged back. The result seems to be less disintegration and more collaboration between forks and the original project. My suspicion as we continue this little experiment is that the original project founders will ask quality "fork" developers to become part of the main project. At the very least, the original project can maintain the code by merging in from the forks every once in a while. Ideally, this results in exactly what you are looking for, with less reason to abandon a project and a reduction in search costs for the one true source.

    thanks @doener in #git on irc.freenode.net for pointing me to this thread.