JavaScript Charts

I’ve been doing some work on LangPop.com, and one of the things I’d like to do is update the chart software. What I have now, Plotr, seems to work ok, but being the tinkering type, I want to fix things, even if they’re not broke. Truth be told, my worry is that Plotr isn’t maintained any more, so it would be a good idea to find something that’s receiving a bit of attention.

The candidates: Flot, Flotr, and ProtoChart.

These seem to be fairly up to date in the sense that someone has worked on them recently. There are some older ones, like PlotKit, that do not appear to be maintained any more. It is entirely possible that I missed a good one. Another possibility would be to use Google’s chart API, but I’d rather be a little more in control of things than to farm that out, and I’m also planning on adding some interactive features in the near(ish) future.

So let’s have a look:

Flot

Based on jQuery, this one looks fairly complete, and has a lot of different, nice looking charts. Since I don’t really care what library it’s based on, jQuery seems as good as any other, being quite popular these days. Installation is pretty simple, and, with good defaults, it’s easy to get nice looking data up on the screen in short order.
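
To give an idea, here’s a rough sketch of the sort of code involved (the element id and the data are invented for illustration; the $.plot call and the bars option reflect my understanding of Flot’s API, so double check before copying):

```javascript
// Hypothetical language popularity numbers, just for the example.
var languages = ["C", "Java", "PHP"];
var counts = [120, 95, 80];

// Flot wants each series as an array of [x, y] points, so build
// one point per language, using the index as the x position.
var series = [];
for (var i = 0; i < languages.length; i++) {
    series.push([i, counts[i]]);
}

// Draw a simple vertical bar chart into a placeholder div.
// Guarded so the data-prep above can also run outside a browser.
if (typeof jQuery !== "undefined") {
    jQuery.plot(jQuery("#placeholder"), [{ data: series, bars: { show: true } }]);
}
```

One gotcha: the div you plot into needs an explicit width and height set in CSS, or Flot has no canvas area to draw on.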

Flotr

Modeled after Flot, Flotr uses Prototype (the Rails default) instead of jQuery, and is the follow-up to Plotr. I’m not quite sure what the motivation was behind the name/project change, but this seems to be where the guy is spending his time. Since he did a good job with Plotr, this one seemed worth a look too.

ProtoChart

This one is based on Prototype too. It immediately annoyed me by uncompressing from the .zip file into the current directory, scattering files around. I’ve always found this behavior a bit antisocial. This code claims to be “motivated by Flot, Flotr and PlotKit libraries”, which indicates that it’s fairly recent. However, my feeling is that the problem is not necessarily how old the code in these projects is, but how quickly they spring up, bloom, and then stagnate. I’d like to use something that’s got some staying power… But anyway, ProtoChart looks like pretty good code, even if the distribution is a bit minimal, and doesn’t come with examples.

If you look at LangPop.com, you’ll notice something that’s very important for my choice: I need a barchart that has horizontal bars, rather than the more traditional vertical bars. This is because it would be quite difficult to squeeze so many languages across the screen horizontally. This ended up being the deciding factor: like Plotr, Flotr supports horizontally oriented bar charts, making it the obvious choice. The other two libraries looked pretty good too – Flot, in particular, looks like quite solid code.
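
For the curious, here’s roughly what that looks like with Flotr (again, the data and element id are made up; the Flotr.draw(element, series, options) signature and the bars.horizontal option are from my reading of the docs, so treat this as a sketch):

```javascript
// Hypothetical counts, as [label, value] pairs.
var counts = [["C", 120], ["Java", 95], ["PHP", 80]];

// For horizontal bars the value goes on the x axis and the
// category index on the y axis, so each point is [value, row].
var points = [];
for (var i = 0; i < counts.length; i++) {
    points.push([counts[i][1], i]);
}

// Draw into a container div, Prototype-style. Guarded so the
// data-prep above can also be exercised outside a browser.
if (typeof Flotr !== "undefined") {
    Flotr.draw($("chart"), [points], {
        bars: { show: true, horizontal: true }
    });
}
```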

So, there you have it – very brief reviews of an incomplete selection of libraries! In my defense, the goal is to attempt to dominate my maximizing nature, pick one, and get on with doing some cool stuff with LangPop. I do, however, welcome comments and suggestions.

Langpop.com – now with IRC

Someone suggested the clever idea of counting IRC channel users as a metric to use for langpop.com, so I went ahead and gathered that data. Since this is the first month, there are a few glitches (I missed #fortran – sorry guys), and a thing or two to iron out, but I thought I’d go ahead and publish the new results. IRC counts towards the “discussion” portion of the stats, rather than the main part. Like everything else, it has its biases – I used the Freenode network, which is definitely more about free software than other places might be. One number that really jumps out is just how popular the Haskell channel is – it’s right up there with all the most popular languages. It will be interesting to see if, over several years, all the “talk” about Haskell starts to translate into people using it for fun and for their jobs.

I welcome comments on this post, but there’s also a Google group I’ve set up for discussion of the results, data sources, languages, and so on:

http://groups.google.com/group/langpop/

Another idea I’m kicking around is to create a separate page for “up and coming” languages, stuff like Scala, Clojure, F#… that kind of thing. It might have to use different metrics, because the newer languages often don’t show up on many of the ones I use for the languages we currently have.

On Debian, but not on Time

It was kind of interesting to see this the other day, on the train from Monselice to Padova:

Debian Trenitalia

That’s a Debian system, kernel 2.6.8.1, which appears to have some problems booting. That’s not the only problem with the train system in Italy, unfortunately:

Debian Trenitalia

My Padova->Monselice train, roughly a 20 minute ride, was announced as 65 minutes late, and would end up being nearly an hour and a half late. Ouch! To be fair, the train system in the US is actually even worse, if that can be believed. And in Austria, while the trains run on time, they’re expensive and run infrequently enough that it’s often much more convenient to drive.

Github Part III

Since this seems to be fairly constructive, I’ll keep going. Those not interested in Github or the dynamics of open source code creation can safely tune out. Chris, the Github dude, has some more to say:

http://ozmm.org/posts/forking_continued.html

I mentioned the Network Graph and Fork Queue but David mentioned neither. I think he doesn’t know what they are, probably because I didn’t explain what they are 🙂

I had had a look at them, and they’re handy tools (I would never accuse Github of not doing good work, that’s not the issue at all), but I would respectfully submit that perhaps Chris is looking at things from a very Github-centric perspective, in which it’s second nature to go look at those. They aren’t obvious, and certainly don’t jump out to tell the user “hey, maybe this code you’re looking at isn’t the latest and greatest!”. For someone just cruising around, perhaps it should be more evident who’s the ‘top dog’?

Chris then goes on with a very helpful and illustrative demo of how things work at Github, which is good stuff.

However, at the end, he says something that I don’t agree with:

It may seem strange, and perhaps even like a lot of work. “Why should I have to check to see which is the most current? In the old model, there’s always a canonical repository.”

That’s precisely the problem. It does seem like a lot of work, especially when your search space is not limited to Github, but may include other places like Sourceforge, RubyForge, Google Code, project specific sites, and so on.

In the old model, actionwebservice wouldn’t have made it past 1.2.6. Welcome to distributed version control.

Plenty of ‘old style’ projects have survived beyond their founders’ interest in the project. What happens is that you ask for permission to work on the project, and either:

  1. It’s given to you, in which case you can keep working on the canonical code. For instance, someone could have asked DHH to work on the RubyForge version of actionwebservice. Did they? Did he say no? At the very least, he could have been asked to point the RubyForge actionwebservice page at some other site with a more current version of the code.

  2. You have to fork the project, and in that case, sure, you might as well have been using the Github model.

I’ve often found that people are happy to let you contribute to their projects, though, and part of my original point with all of this is that if people just go spitting out forks willy-nilly, it creates a “paradox of choice” type problem, and perhaps takes something away from the community aspect of open source projects. As people are fond of saying at the Apache Software Foundation, it’s about the people, not necessarily about the code. I’m not saying that forks are always bad and that everything should be centrally done, but there’s a balance to be struck between people just working on their own, and some sort of onerous, bureaucratic Central Project Authority. It’s nice that people who want to improve the code make themselves heard on a mailing list/forum/whatever, and thus cover the “people” part of merging in new code and new ideas. More often than not, help and contributions are more than welcome. Design decisions and conversations recorded on mailing lists are available for people to peruse in the future when they have questions.

Just to repeat something that bears repeating: I am not claiming that Github will lead to social disintegration of open source projects or anything drastic like that. However, I’m a bit wary of certain patterns I’ve seen. It’s certainly possible that I’m wrong – as I mentioned, one error I might be making is that people are dumping code they simply wouldn’t have shared on Github, because Github makes it so easy. I do hope my worries are unfounded, and in any case, it’s a good thing that the Github guys are interested in the problem too, and will hopefully do things to alleviate it where possible.

More Github

Before I get started, I want to make it quite clear that I have no problems with Github – au contraire, they provide a very nice service. What my speculation centers around is the social dynamics of Github.

One of the Github guys writes a nice response to my original post, here:

http://ozmm.org/posts/linux_vs_classic_dev_style.html

First of all, apologies for the comment bug. In theory, it has been fixed by the Typo guys, I just need to find the time to upgrade.

Chris mentions that it’s possible to create a “SourceForge style” project on Github. I see that it’s possible to add collaborators, so I guess that’s what he means. I still find it a bit ugly to have things at …/davidw/… but that’s not very important if the URL is at least stable, rather than hopping around to whoever has the latest fork and updates of the code in question.

In my other post, I talked about people being less inclined to do what it takes to contribute to existing projects, forking them instead, even when a fork isn’t really called for. Here’s a concrete example of how things might go wrong. At my current consulting gig, we were looking at utilizing actionwebservice to do some SOAP stuff. That code is not part of Rails anymore, but that’s not the problem – surely someone has decided to take up maintenance of the gem, right? Well, yes… but let’s look:

root@fortrock:~# gem1.8 search -r actionwebservice

*** REMOTE GEMS ***

actionwebservice (1.2.6)
datanoise-actionwebservice (2.2.2)
dougbarth-actionwebservice (2.1.1)
nmeans-actionwebservice (2.1.1)

So we have the original, clearly out of date, and three others. Yuck… I guess we’ll use the 2.2.2 one, but who knows if perhaps the others contain some good stuff too? Let’s Google it.

http://www.google.com/search?hl=en&q=actionwebservice

The first link is to a RubyForge project, which is basically dead. A few links down, there is a link talking about a more recent version, at the Datanoise site. However, it’s one of many. Confusing. If you go to the github page for the datanoise version of actionwebservice, the README has this:

The latest Action Web Service version can be downloaded from
http://rubyforge.org/projects/actionservice

Hrm. We were just there and that doesn’t seem quite right… Sure, it’s an easy enough oversight, but it adds to the confusion.

Now, this is not really a serious problem, but do you see how it could get worse over time, as various people fork the code, write blog entries, and so on? It would get harder for someone interested in utilizing the code in question to ascertain which is “the” version. It’s not hard to imagine situations where two forks add different bits of useful code. Typically, in situations like this, new users, those who don’t know the code at all, are also the people least able to look through all the diffs and changes and log entries and what not in an attempt to merge any changes that might be useful. This leads to a “paradox of choice” type of problem.

Another consideration, this one more positive, is that perhaps people are putting code in Github that, in the past, might have simply remained on their own computers, with no effort whatsoever to share it, so in that sense, Github might be doing people a service by at least making the code public.

However, even in that case, if the service fills up with code that people have simply “thrown over the wall”, rather than creating a genuine open source project, with a community, communication channels, and so on, it could get a reputation for having code that you have to handle with care, as a lot of it is just stuff slapped up there without much thought to quality or continuity.

Anyway, the Github guys have done a nice job with the service, and I’m sure they’ll continue to improve it. Who knows, perhaps they have some good ideas about mitigating some of these potential problems that they’ll be rolling out soon.

Rivet in Action

I’m terrified of flying (actually heights more than flying, but planes tend to go pretty high), and seeing all the images of Flight 1549 floating down the river didn’t help things. I’m glad everyone got out ok.

An interesting detail about the whole story though, from my point of view at least, is all the press hits that Karl Lehenbauer’s http://flightaware.com/ got – it was mentioned on CNN and a number of other places, and seemed to handle the traffic ok. A little known fact is that, as can be seen from the HTTP headers, FlightAware is built on Apache Rivet, which was one of my first serious open source projects. It’s good to see them doing well, although it’d be nice to see them help out a bit more with Rivet. It’s still good code, and does what it does quite well, but needs some love.

The “Gig” Economy?

This article talks about the rise of people working part-time, doing various ‘gigs’:

http://www.thedailybeast.com/blogs-and-stories/2009-01-12/the-gig-economy/full/

Perhaps her intended audience is not interested in “dull” things like economics, but even someone like me with just a smattering of reading in economics can’t help but think of Coase and his Theory of the Firm when reading it.

The question is: why do firms exist? Why isn’t the economy composed of a huge network of independent contractors? If markets and prices and such work so well, why do these big, monolithic companies, which internally are not ‘free markets’, exist?

His answer is, greatly simplified, “transaction costs”. From the article:

Every time the boss turns around asking for a key member of staff to join today’s frantically convened cost-cutting strategy meeting the reply comes back, “It’s not Sam’s day to come in and he’s the one working on it. Julia can come, though.” “Julia? What she got to do with it?” “Yeah, well, we’ll have to bring her up to speed.”

In other words, while there is a savings from not having Sam present every day, there is also a cost associated with it, which includes having had to look for and interview Sam to do some work, and bargaining over a price for Sam’s services, and then “bring him up to speed”. You have to do those things with regular employees too, but you have to do them less often with people who stick around for years.

So if one wanted to look, in a less anecdotal and slightly more scientific way, at whether the US, or other countries, are turning into “gig economies”, one could do worse than to look at the transaction costs associated with utilizing contractors as opposed to permanent employees. If those costs have fallen (it may be easier to find people, thanks to the internet, for example), perhaps it is more efficient to have more contractors and fewer full-time employees, and so the equilibrium will tilt in that direction. Of course, another explanation might simply be that the economy is bad, and companies don’t have the budget to take on more people, so they get by with what they can in the short term.

In terms of the social costs and benefits to a “gig economy”… well, that discussion is best left for other sites, as it’s a big, long, complex one with lots of politics and economics and eventually boils down to everyone’s own view of how the society they live in ought to look, which is very much outside the scope of this journal.

Hosting, Commodities, and “The Cloud”

Anyone connected to the world of IT has been bludgeoned over the head with “cloud” news lately, to the point where the term has become vague and mostly a buzzword. There is, however, something behind the phenomenon, described in Nick Carr’s book The Big Switch, with more computing power being centralized at large data centers where economies of scale come into play.

I had an interesting chat with Marco D’Itri a while back, about hosting and commodities. It’s clear that there is some commoditization going on, but he maintains that hosting is not a commodity. I’ve been thinking about it and the conclusion I’ve come to is that some parts of the business are certainly commodities: disk space, memory, bandwidth, and processor cycles. Those things are, ultimately, what we want to buy when we buy ‘hosting’. However, the bits and pieces between that, and the people who build on top of those services (i.e. someone who runs a web site), are not really a commodity, yet. Customer service, for instance, might vary a great deal between providers. What will happen in that space – will we see a baseline price for the commodities, with hosting resellers built on top of that offering different levels of service? I do think that the “mom and pop” hosting type of situations will gradually disappear in favor of larger data centers that can take advantage of economies of scale, though.

Also, as I’ve written about in the past, in Web Hosting – A Market for Lemons, there are some serious information asymmetry issues – how do you know the people providing your service are serious? How do you know they’re using good components that won’t break often, rather than cheap junk that will lead to frequent outages? If you have the resources, you can build a system like Google’s, where it doesn’t matter what fails, it just works around it, but the basic tools that most of us are working with right now aren’t that high level.

I was reminded of the information asymmetry issue by this article, written by the Dreamhost folks, Web Hosting’s Dirty Laundry, which describes how they caught a ‘review’ site trying to get money from Dreamhost for positive reviews, which is interesting in light of the lemon problem. Wikipedia has some criteria for a lemon market here: http://en.wikipedia.org/wiki/The_Market_for_Lemons#Criteria, which include the following:

Deficiency of effective public quality assurances (by reputation or regulation) and/or of effective guarantees / warranties

If it’s difficult to get real, honest, impartial reviews of hosting services, that is a push in the direction of ‘lemons’. Of course, it’s not impossible to get this information, but it seems that a lot of us still go by “hearsay” – what others we know use and report to be ok. To compare it with another product, I’d probably ask around to friends prior to purchasing a new car, but whatever I get is likely to work ok, even if it’s not the absolute best. On the other hand, the wrong hosting provider might be very much a “fly by night” operation that leads to a lot of downtime, so I’m far more likely to listen to what other people have to say, and be far more cautious about buying “any old thing”.

Opinions? Comments? Thoughts? Where do you see this industry going?

Developer > Project, or Project > Developer(s)?

I’ve been fooling around with “git” lately – it’s useful for local projects where I don’t want to have to set up a remote server, and it also seems to be “the way the wind is blowing” – lots of open source projects are starting to use it, so I might as well familiarize myself with how it works.

The github service seems to be very popular at the moment, and it’s not hard to see why. They’ve created a very useful service, and give it away completely free for open source use. That’s a pretty good deal! It’s also an easy to use site that’s well integrated with git, so you can see lots of stuff graphically instead of fooling around with command line tools.

However, I have a nagging doubt. Github apparently bases everything around the developer. In other words, projects hosted there are mostly attached to a developer who ‘owns’ the project, rather than existing on their own. There are a few exceptions – there seems to be a ‘rails’ role account to manage Rails, rather than having it belong to, say, DHH, so it’s not impossible to do something in a project-centric fashion, but it doesn’t seem to be very common either.

What I’m wondering is what sort of effects this might have in the long term. Github, and git itself, make it very easy to ‘fork’ projects. Could, or will, this lead to a situation where people don’t try so hard to get their code in the ‘official’ branch? What happens when someone loses interest in their project? Sure, someone else can fork their repository, but if all the links were pointing to the first guy, because he was the main hacker on a project for X years, they’re not going to switch overnight even if guy B takes over maintenance. And in the meantime, casual searchers might turn up the ‘dead’ branch, or simply be confused as to who’s doing what. This could make things confusing for someone who just wants to check out some code and start using it… trying to sort out various revisions and the ancestry of a project could be quite discouraging.

At the Apache Software Foundation, on the other hand, people are obviously very important, but projects are the organizational unit that people get involved with, and a lot of emphasis is placed on not having one big benevolent dictator, but on having a group of people who make decisions and act as arbiters of what goes into the project. People can come and go as they please, but the project, and, importantly, the links to it, remain the same over time (barring any big, unforeseen blowups).

As far as I can tell, this is not a git problem, but something related to how github organizes their site. And, of course, it’s possible that I have simply missed something due to my lack of knowledge of git, and not being more than superficially familiar with github. Even if I’m right, it shouldn’t take that much effort for github to create a system where it’s possible for multiple people to ‘own’ a project.

Angry Perl Users

I work on LangPop.com for fun. I make a tiny bit of money from the adsense there, but nothing much to speak of. Unlike the TIOBE folks, I don’t sell my data to anyone. So it can be a bit frustrating when people get all bent out of shape about the results. A case in point was my recent article here, entitled Python “Surpasses” Perl?.

In case the title wasn’t a dead giveaway, I’ll spell things out:

  • I was poking a bit of fun at the idea that any statistics of this nature could pinpoint, to a specific month, the moment when one language “passes” another. They’re way too fuzzy for that!

  • I point out that Perl is still quite popular. According to the Freshmeat metric I was reporting, Perl actually still has more code out there than Python.

And just to be clear: I don’t use Perl or Python or really care which one is more popular. I mostly use Ruby these days, with Tcl, Java, and bits of C thrown in for fun. In other words: langpop.com is not an evil conspiracy to discredit Perl.

However, I also had the temerity to suggest that Python might be a little bit more popular in terms of new things happening. Shock! Horror! Well, that certainly irritated some Perl users. They called my statistics lies, they misspelled my name (as well as the word “you”, which is a lot simpler than my name by any standard), and some of them went off on rants that were pretty badly off target (SourceForge is never mentioned in Langpop or this online journal!). However, not once did they really get around to answering my points, or proposing any sort of valid alternative metrics.

One of my favorite Linux/Free Software news sources, lwn.net published their 2009 predictions, and had this to say about Perl:

It will be a make-or-break year for Perl. If the Perl developers cannot either bring new life to Perl 5 or turn Perl 6 into something real, this language will, by the end of the year, have moved well down the road to “legacy” status.

As can be imagined, that set them off all over again. The mental image I get is a big angry bull, frothing at the mouth, and ready to stomp that damn red cape.

Here’s the problem, though, guys: the cape isn’t what’s holding you back. Trust me, I know.

Once upon a time, when people spoke of scripting languages, they talked about “Perl, Python and Tcl”. No Ruby, no PHP, although both existed. I happen to like Tcl a lot, and it measures much worse than Perl on all the popularity charts… worse than plenty of other languages, these days. Lately, I don’t use it so much, but once upon a time I dedicated a lot of mental energy to wondering why things weren’t going better. And in that community too, there were people who consistently held their fingers in their ears and yelled “LALALALA there is no problem!”. In the long term, it’s not an effective strategy: it makes everyone but the other True Believers think you’re going off the rails, because the evidence is plain to see that things aren’t quite like they used to be, that there are problems. And it doesn’t really address the problem in a constructive way, either. Long term, what works is to produce good code, and get it in people’s hands.

The most effective way of dealing with problems of this nature includes these points:

  1. Don’t froth at the mouth any time you perceive a slight to your technology of choice.
  2. Talk up the cool things you’re working on. Be positive, explain what’s new and exciting.
  3. Acknowledge reality. For instance with Perl (or Tcl), it’s plain to anyone that they aren’t as popular as they once were. Don’t deny it, but focus on the fact that they’re still widely used, and lots of new code is still written in them, and that they’re still the focus of a lot of work to improve them.
  4. Actually follow through. Jonathan Corbet’s (Linux Weekly News) point is not a bad one – you can have all the internal releases and milestones and general fun hackery that you want, but at a certain point, you need to get working code out in front of people. This is something Ruby seems to be suffering from these days too (as observed from afar) – lots of next generation fiddling, and tons of good ideas and energy, but to me there is a lack of technology actually flowing my way in a timely manner. With the possible exception of JRuby, which is making lots of progress and putting good code out there for people to use.

To conclude things, since the Perl guys were so unhappy with Freshmeat, here are some statistics from Amazon and Craigslist, comparing about a year ago with the most recent numbers:

As can be seen, both Perl and Python are dropping in Craigslist postings, but Perl is falling farther.

As a final footnote, I am sorry about this journal not accepting comments past a certain point (10 days is the cutoff) – otherwise I get hundreds of spam postings a day. Unfortunately, the software I’m using didn’t deal well with alerting people to the fact that comments were closed. I filed a bug, and they’re working on it.