Python “Surpasses” Perl?

There has been some buzz going around about Python “surpassing” Perl in terms of “popularity”:

http://mail.python.org/pipermail/python-list/2008-November/517191.html

However, they’re basing their results on the TIOBE survey, which is, in my opinion, even dodgier than my own, at http://www.langpop.com . It’s really difficult to pin this sort of thing down, but I think that by utilizing more data sources, my own numbers bear some relationship to what my “ear to the ground” tells me about what’s going on, and certainly more than TIOBE’s, which place D, Logo and Lua above things like Tcl/Tk. I know Tcl is not the hottest thing out there these days, but…Logo? D is gaining in popularity, but how many companies are hiring D programmers, how much code is out there written in D? I’m not saying that D is “bad” or “not worthy” because of that – quite the contrary, I think it’s on its way up, but there are a lot of things you have to look at in order to take a stab at creating some meaningful numbers, and even then it’s important to remember that popularity isn’t everything! In any case, picking a precise moment in time when one language “passes” another is also a bit more theater than science, in my opinion.

Getting back to Perl and Python, I thought I’d look at some numbers of my own. Unfortunately they only go back a year, but they already give us an idea of what’s going on. In order to concentrate on something fairly concrete, let’s look at Freshmeat projects:

Perl is still clearly more widely used. However, I think there’s an important distinction to be made in terms of popularity. Since many systems that function ok stay around for years, there’s a big difference between what’s being used for new systems, and what’s out there already. Perl has a lot of code out there already, but how fast is it growing in comparison to Python?

Python’s definitely got the edge – it’s growing, whereas Perl was very nearly static. To me that does indicate that Python has got momentum right now, where Perl is sort of coasting.

Job search sites, Java and Ruby

I occasionally like to fool around with statistics gathered from the Internet. Sometimes the result is something like langpop.com, which, even if it’s unscientific, I feel is useful and more or less reflects my gut feeling about which way the wind is blowing. Other times, it’s fun just to grab some numbers, add them up, and not worry too much if they really have any validity or meaning. In this case, I suspect there’s something to them. What do you think?

The technique: take different job search sites, like monster.com, craigslist, and so on, search them for various languages (with Yahoo’s search API), count the hits, and then look at the ratios. For instance, Java jobs to Ruby jobs, with the idea that, painting with very broad brush strokes, the Java jobs are going to be more “enterprisey”, and the Ruby jobs hipper, cooler, and maybe gone six months from now as the economy tanks and funding dries up. Rough techniques notwithstanding, there do seem to be two distinct groups of sites, one with lower ratios, the other with higher ratios (more Ruby jobs compared to Java jobs). For fun, I also included Python and Erlang, although there are very few Erlang jobs out there.
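To make the arithmetic concrete, here is a minimal sketch of the ratio step in Python. The site names and hit counts are made-up placeholders, not the numbers behind the charts, and splitting the sites at the largest gap between sorted ratios is just one simple way to look for the "two groups" effect, not necessarily how it was done for this post. Fetching the counts from a search API is left out entirely.

```python
# Hypothetical hit counts per job site. These are placeholders for
# illustration only, not the figures gathered for this post.
HITS = {
    "site_a": {"java": 12000, "ruby": 300},
    "site_b": {"java": 8000, "ruby": 240},
    "site_c": {"java": 400, "ruby": 100},
    "site_d": {"java": 150, "ruby": 60},
}


def ratio(hits, numerator="ruby", denominator="java"):
    """Return {site: numerator hits / denominator hits} for every site."""
    return {
        site: counts[numerator] / counts[denominator]
        for site, counts in hits.items()
    }


def split_at_largest_gap(ratios):
    """Sort sites by ratio and cut the list where consecutive ratios differ most.

    Needs at least two sites. Returns (lower_group, higher_group).
    """
    ordered = sorted(ratios, key=ratios.get)
    gaps = [ratios[b] - ratios[a] for a, b in zip(ordered, ordered[1:])]
    cut = gaps.index(max(gaps)) + 1
    return ordered[:cut], ordered[cut:]


ratios = ratio(HITS)
low, high = split_at_largest_gap(ratios)
print("lower Ruby/Java ratios: ", low)
print("higher Ruby/Java ratios:", high)
```

With these placeholder counts, the two big sites land in the lower group and the two small ones in the higher group, which is the shape of the result described above.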

Ruby/Java ratios

Of course, it’s also true that the bigger sites, like dice.com, had more jobs total. Indeed, dice.com has more hits for Erlang than CrunchBoard does for Java!

Totals

langpop.com in Tim Bray’s OSCON keynote

It was neat to see “langpop.com” on the screen during Tim Bray’s talk at OSCON (contains a link to the video):

http://www.tbray.org/ongoing/When/200x/2008/08/05/Annotated-OSCON-Keynote

The talk itself was an overview of the state of programming languages. Fifteen minutes is not enough time to do the topic justice, but if you’re not a language geek, it’s not a bad survey, and I really like his style: he’s fair when he points out the good and the bad. Like him, I am sick of PHP and do not care to use it any time in the near future, but that doesn’t mean there aren’t a lot of good things to be said about it. In any case, I’m honored that Tim used langpop.com as a source for his talk.

LangPop.com – programming language popularity – update

These few days when Ilenia and Helen are still in the hospital are the eye of the storm for me. It’s quiet at home and I actually have a few free hours when I’m not allowed to be in the hospital, or when they need to get some rest.

One of the things I managed to do recently was some Javascript hacking in order to create a timeline for LangPop.com: http://www.langpop.com/timeline.html. It was fun, because most of the “heavy lifting” is done by Timeplot, and I just had to push the data into place. Of course, there isn’t much interesting there because the site is relatively new, but it should be interesting to see how languages fare over time.

I did some hacking on Timeplot to make it easier to host it on my own server, and to load a bit faster by stuffing it into one big ugly blob of Javascript. When I get a bit of time, I’ll make my changes public, as I think they’re fairly useful for anyone who wants to fiddle around with Timeplot some, and thus host it themselves.

The other thing I did with the site was switch the X and Y axes of the charts, because that works out better in terms of screen space for the labels, with so many languages to keep track of.

Why “dying” is an inappropriate term for programming languages

If you follow online discussions of programming languages, you’ll occasionally see articles about language X “dying”. It is, however, a bad metaphor, because the “life cycle” of a programming language is different from that of a single living creature, of which you can definitely say at some point “He’s not pining … this is an ex-parrot!”.

Once a programming language is widely deployed, it simply won’t up and disappear from one day to the next. Too many people and companies depend on it, and replacing it is difficult enough that the “switching costs” will keep it in place for many years. The obvious example is Cobol. This effect works in favor of languages that are not “the new thing”, and means they will probably not truly cease to be used for many, many years.

On the other hand, it’s also pretty easy to tell which way the wind is blowing and get an idea of what’s hot and what’s not. Java got hyped a lot in the past, now it’s Ruby. Scala is beginning to get hyped some, so is Erlang. Careful observers can spot the rise of languages as they hit the early stages of the technology adoption lifecycle.

By paying attention to what’s going on in the wider world, it’s also possible to see when languages are used less and less for new projects, and solely kept on for the legacy systems. I think this is what most people mean by “dying”, but of course it creates confusion, because it’s clear that the language is still utilized. In some cases, I think it may be possible that a language continues to be more widely adopted, yet still seems to lag behind, simply because other languages are being adopted at a significantly higher rate.

So what it’s really about is two factors: how widely deployed is the language in any given moment, and, perhaps more importantly, the rate of adoption – is it accelerating or slowing down? Are more people jumping on the bandwagon, or is the language being used less by innovators, early adopters, and new projects?

You need both those coordinates to understand where a language is at – it may be huge but declining, small but on its way up, or a niche language that never achieved widespread usage that isn’t catching on any more. The last one is the one you’d probably want to be wary of. The first two will probably be ok as long as you understand what you’re getting, and the type of user or organization you are matches where the language is at in its lifecycle.
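As a rough illustration of reading both coordinates at once, here is a small Python sketch. The function name, the thresholds, and the sample counts are all hypothetical; "count" could be anything you track over time, such as projects or job listings, and where you draw the line between "big" and "small" is a judgment call.

```python
def describe(count_then, count_now, big_threshold=1000, flat_band=0.02):
    """Label a language by how widely it is used and which way it is heading.

    count_then and count_now are the same measurement taken at two points
    in time (count_then must be non-zero). big_threshold and flat_band are
    arbitrary cut-offs chosen for this example.
    """
    growth = (count_now - count_then) / count_then
    size = "big" if count_now >= big_threshold else "small"
    if growth > flat_band:
        trend = "growing"
    elif growth < -flat_band:
        trend = "declining"
    else:
        trend = "flat"
    return size, trend, growth


# Placeholder numbers, not real measurements.
for name, then, now in [
    ("huge but declining", 5000, 4500),
    ("small but on its way up", 200, 260),
    ("niche and not catching on", 100, 101),
]:
    size, trend, growth = describe(then, now)
    print(f"{name}: {size}, {trend} ({growth:+.0%})")
```

Neither number means much alone: the first example is still far larger than the second, even though only the second is growing.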

I think that perhaps those interested in discussing these things seriously ought to find some good terms to use. “Forth’s rate of adoption is declining” doesn’t sound as exciting as “Java is dying!” though. “Fortran’s diffusion is decelerating” doesn’t have much of a ring to it.

Any other ideas?

No next big language?

Ola Bini, one of the JRuby hackers, and a very bright guy, posits that there “won’t be a next big language”:

http://ola-bini.blogspot.com/2008/01/language-explorations.html

There might be some that are more popular than others, but the way development will happen will be much more divided into using different languages in the same project, where the different languages are suited for different things. This is the whole Polyglot idea.

I’m dubious, and wonder what he would consider to be the underlying sociological and economic factors driving this change. Programming languages are, in the end, about people in all their weirdness, so to understand where languages are going to go, you have to consider those human factors, as I’ve attempted to do here.

One trend that points to a slow proliferation of languages is of course the lock-in cited in my article. Today’s big languages (Java, and on the web, PHP) won’t just go away from one day to the next, just as C, Cobol and Fortran have not disappeared with the advent of Java. That process will continue, making it likely that new languages will carve out new territory for themselves rather than exclusively cannibalizing existing installations from older languages. This naturally leads (slowly) to more languages, even if the next generation has a Next Big Language.

And why shouldn’t there be one? Ola talks about a situation where various languages run and interact on top of a runtime (JVM). Isn’t that similar to what we’ve had with C, though? Perl, Tcl, Python, etc… all run on top of C. Sure, the JVM is a step up from that in some ways, most notably GC, being a bit more consistently cross-platform, and having a wider array of libraries, but in the end, it still comes down to the network effects of being able to read and write a common language, whatever it runs on. Obviously, the network externalities of programming languages are not so strong that they hit a tipping point after which one language crushes all the others, but they are strong enough to consolidate leadership in one or a few languages. Programmatically, Jython, JRuby, (and Hecl?:-) may even find it easier to interact on the same platform, but the humans writing the code will still push for consistency and the minimum set of common tools in order to aid the sharing and review of code.

Another way to look at it might be from an organizational point of view. Today’s biggest fish in the pond, Google, only allows four languages for their production systems. What would a “no big language future” organization look like? I can’t believe they’d welcome a big hodge podge of things.

In conclusion, as computers are ever more widely used, it’s certain that more languages will be utilized. However, it’s also likely that from time to time a few languages, with one in the lead, will emerge as the leaders.

Programming languages are not like hand tools

I have seen the “right tool for the job” comment one too many times, and felt like writing something about it.

Real tools are pretty simple, really. Things like saws and hammers have been around for thousands of years, and don’t require a great deal of thinking to understand. Some people are masters with hand tools and create wonderful works of art with them, but the tools themselves are not complex.

My programming language, Hecl, is a pretty simple one in a lot of ways (it has to be, to fit on cell phones), and at last count the core had roughly 7670 lines, comments included. That entails quite a few “moving parts” compared to even something like an electric drill. Programming languages are complex. Certainly to create, but even to use well; to understand the nooks and crannies and ramifications of the language’s design is not something that is accomplished in a few days for most people.

Your average programmer really doesn’t know all that many languages, and even those who know many probably don’t use most of them on a day to day basis. I like learning new languages, yet still wouldn’t relish the idea of going through the mental task-switch required to move between too many in one day.

What this means is that when there is a job to be done, most programmers will reach for what they already know, rather than the best possible system, and that makes sense. If you know Python, why not write web software in Python, if you get asked to create something for the web? Or if you know Tcl because you create GUIs with Tk, you’re still likely to pick it when you need to create a little server. There are reasons why something like Erlang might make for a better server platform than Tcl, but is it so much better that it justifies dropping everything, learning a new language, committing to a new set of infrastructure requirements, and adding another thing to ask of people joining your organization? Most likely, it is not, unless the server you’re writing is for a telephony switch that has to be on all the time, in which case, yes, maybe you do need Erlang. If, as is often the case, the server doesn’t have to be the best in the world, all that extra time spent learning a new language would be mostly “wasted” (ignoring, of course, the fact that for individuals, learning new things is almost always good, but that’s not what we’re talking about). Even the smart folks who hack away in languages like Haskell are fond of using what it has to offer, rather than jumping ship at the first hint of something else being better than their language of choice for a given task.

So both Tcl and Erlang were up to the task of creating a little server. You could have used Ruby or Python as well. Java might prove a bit more verbose, but it’s certainly up to the task. Any number of languages would be, of course. So it’s not like there is a “right tool” and a “wrong tool”.

Maybe a better analogy is that languages are like toolboxes, or even whole workshops, full of tools. Most of them have most of the common tools like hammers and saws: some sort of web libraries, or the means to create TCP/IP servers. Where they differ is in things like the number of tools available, the specialized tools that you get for free, the craftsmanship and ease of use of the tools in the hands of both experts and beginners… or in some cases (Lisp, say), some fancy tools and supplies that make it possible to build yourself some new tools if you need to (that only you know how to use, but that’s another topic). Some sets of tools really aren’t very good for some jobs – you don’t want to build blazing fast 3-d shooter games in PHP, just as you wouldn’t want to use a regular set of screwdrivers on a delicate watch. However, even if you want a toolset that has some specialized tools for working on the watch, you’d rather own a toolbox that comes with the watch tools as well as the regular Phillips screwdrivers, because those are pretty handy around the house, and it wouldn’t do to have only the watch tools for fixing common household items.

The point being, languages are big investments, so those that do more, even if they’re not the absolute greatest or first choice for a given task, are generally going to be more useful to more people, especially those just starting out who want a “safe” choice, a language they know will let them do a variety of things, rather than only one or two things (pound nails, say) really well.