Fast, Light and Asynchronous

I am a big fan of Ruby on Rails: it does a lot of things, and it does most of them pretty well.  When starting a new web project, it's the first thing I would reach for: most of the time, your problem is going to be figuring out a good product/market fit, or whipping up some internal tool without wasting a lot of programmer time.  Once you've got a firm grasp of the problem, then maybe you can consider optimizing. Who cares if you do something that no one wants really really fast?

However, Ruby on Rails is not particularly fast or lightweight.  No complaints from me: most of the time, I'm happy to have something that does so much for me, leaving me to work on the actual problem at hand.  Once in a while, though, you do need fast and relatively lightweight, and that space has been getting more interesting over the past few years, at least in terms of the web.

First of all, technologies like "Comet" and web sockets, which rely on an always-on connection, are becoming more common: a socket with the server remains open in order to quickly exchange data from the server to the client – and back.  That seems to be a poor fit for something like Rails, where each open connection can tie up a lot of resources if one isn't careful.  And while computing costs continue to decline, no one minds getting more for less in terms of what their server can do.  Furthermore, with frameworks like Backbone.js pushing more and more code to the client, the server can afford to be a bit simpler and do less, so it may as well be snappy to boot.

Java has long been fairly popular for "heavy lifting" types of applications, partially because it does end up being reasonably fast.  But it's not something I've ever had much fun using: it's usually kind of wordy, and makes you feel like you need a crew of people in Palo Alto, another in Bangalore, and another in Stockholm just to churn out all the code.  And it certainly is no lightweight in terms of memory either.  So… while it can certainly do pretty much anything you need, I don't see it as being the strongest player when someone needs "real time web" code, and needs it to be reasonably lightweight.

Ruby, outside of standard Rails stuff like Passenger, seems to offer some interesting possibilities for this kind of work, like Reel, but they don't seem to have the traction other solutions do.  Python is in a similar situation: the Twisted framework has been around for a while, and while it has some success stories, it never seems to have really 'caught fire'.  Neither of these languages was built for "concurrency" from the ground up, and that seems to have, to date, inhibited people from using them extensively for this kind of job.

My gut feeling is that the need for speed and "concurrency" (or at least handling a lot of concurrent users) will drive adoption of languages heretofore not so popular on the server.  Let's have a look at them.

Erlang: This is by no means a new language, having been developed at Ericsson in the late 1980s, and it implements several interesting concepts.  First and foremost, concurrency is handled in the form of many small "processes", which are not actually Unix processes at all, but processes internal to the Erlang VM.  The Erlang system contains a scheduler that allocates resources to all of these processes, so even if one of them dies or behaves badly, it's not a problem: the system as a whole can continue to function well.  The way Erlang is built, the scheduler is preemptive: the internal "processes" don't need to yield to let other processes run.  Beyond the scheduler and simple processes, Erlang gives you the tools to create elaborate trees of supervisors and workers that are quite robust to failures in any one portion of the system, as well as giving you the tools to set up a system to run on multiple, distributed computers.  This kind of thinking is necessary when you write applications, such as phone switches, where downtime is really, really not ok.  Erlang processes communicate almost exclusively via message passing, meaning that state is not shared.  Altogether, this makes for a fast, rock solid system that can easily handle thousands of concurrent connections without breaking a sweat.

The computing world being what it is though, Erlang is likely to be more of a Lisp or Smalltalk: it's a trailblazer that did many things years before their time in other languages, but I don't see it as ever quite catching on amongst 'the masses'.  It has a wonky syntax, it's a functional programming language, and because it is used in environments where too much experimentation is not good, it does not have a lot of room to break with its own past and innovate in terms of the language itself: it's slow to change and improve, even where the need to is clearly perceived.

Node.js: this is the opposite, in some ways: being based on JavaScript, it draws on a huge number of potential programmers – orders of magnitude more than Erlang.  And thanks to design decisions enforcing the use of asynchronous code and callbacks for anything that could block, it deals quite well with concurrent connections, even if the language and libraries don't really give you much in terms of "true" concurrency.  This is a simple model that works pretty well, even if, theoretically, all it would take is one "while (1)" loop in a callback to block the entire system.  In practice, this doesn't seem to be a big problem, though.  More of a problem is writing maintainable code when everything is a callback.  Keeping the network of callbacks straight can be a bit of a chore, and is probably not an optimal model in terms of programmer productivity.  That said, people seem to make do with it, although Node is young enough that we haven't seen projects that are 5 or 6 years old and maintained by people who didn't write them.  One of the advantages of being such a popular language is that lots of people have a strong interest in seeing JavaScript being very, very performant.  That need begat the V8 JavaScript engine, from Google, which they were kind enough to release as free software.  So one of the advantages that Node has is that the underlying implementation is extremely fast for a dynamic language.  For many people who know JavaScript, picking up Node.js is also an easy choice, even though people used to browser-side programming with JavaScript will have to adjust their way of thinking to successfully tackle server projects.
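To make the "one while (1) loop blocks everything" point a bit more concrete, here is a minimal sketch of the same single-threaded, event-loop model.  It uses Python's asyncio rather than Node.js itself, so treat it as an analogue only; the names heartbeat, handler and longest_gap are made up for the illustration.

```python
import asyncio
import time


async def heartbeat(ticks, count=10, interval=0.02):
    """Record a timestamp every `interval` seconds, like a small periodic task."""
    for _ in range(count):
        ticks.append(time.monotonic())
        await asyncio.sleep(interval)  # hands control back to the event loop


async def handler(blocking, duration=0.3):
    """Stand-in for a request handler that takes `duration` seconds."""
    await asyncio.sleep(0)  # yield once so the heartbeat gets to start first
    if blocking:
        time.sleep(duration)  # never yields: nothing else can run meanwhile
    else:
        await asyncio.sleep(duration)  # yields to the loop while it waits


async def _run(blocking):
    ticks = []
    await asyncio.gather(heartbeat(ticks), handler(blocking))
    # Longest pause between two consecutive heartbeats.
    return max(later - earlier for earlier, later in zip(ticks, ticks[1:]))


def longest_gap(blocking):
    """Return the longest pause between two heartbeats, in seconds."""
    return asyncio.run(_run(blocking))
```

Calling longest_gap(True) should report a pause of roughly the full 0.3 seconds, because the blocking handler starves the heartbeat, while longest_gap(False) should stay much closer to the 0.02 second interval.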

Go language: this is an interesting one, written by some luminaries who work at Google.  It can best be described as something akin to C with some updated features, such as garbage collection, that make it more suitable for working on large projects, where things like memory leaks are going to make life very frustrating.  It has the feel of a "real language, for real programmers" in that it doesn't stray too far from what people are used to – it's not going to cause the "what the hell is this?!" reactions that Erlang might in more close-minded circles.  Between the big company backing it, and the approachable syntax and concepts, Go looks like it has a good shot at the mainstream.  Where it gets interesting is their concurrency model, which is apparently based on something called "communicating sequential processes", which, superficially at least, looks like it has some things in common with Erlang's "actor model" of concurrency.

Under the hood, Go apparently hives off its "goroutines" to different OS level threads, but does not have a preemptive scheduler like Erlang: http://code.google.com/p/go/issues/detail?id=543 – although according to this, they may change that in the future.

I'm not enough of a computer science guy to comment much on the details of CSP vs Actors, but both seem like valid models with strengths compared to trying to keep threads straight, which always seems to be a source of problems for programmers.
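As a rough illustration of what the two models have in common, the sketch below has one worker that owns its state outright and can only be reached through messages on a queue.  It is written in Python with threads and queue.Queue, so it is an analogue rather than real Erlang processes or goroutines, and the counter worker and its message format are invented for the example.

```python
import queue
import threading


def counter(inbox):
    """Worker that owns `total`; other threads can only send it messages."""
    total = 0
    while True:
        kind, payload = inbox.get()
        if kind == "add":
            total += payload
        elif kind == "get":
            payload.put(total)  # payload is the caller's reply queue
        elif kind == "stop":
            return


def send_adds(inbox, how_many):
    """Sender: posts `how_many` increment messages and touches no shared state."""
    for _ in range(how_many):
        inbox.put(("add", 1))


def demo(senders=4, per_sender=1000):
    inbox = queue.Queue()
    worker = threading.Thread(target=counter, args=(inbox,))
    worker.start()

    threads = [
        threading.Thread(target=send_adds, args=(inbox, per_sender))
        for _ in range(senders)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    # Ask for the total; the queue is FIFO, so every "add" is handled first.
    reply = queue.Queue()
    inbox.put(("get", reply))
    result = reply.get()

    inbox.put(("stop", None))
    worker.join()
    return result
```

No lock appears anywhere in the senders or the worker logic: the only thing the threads share is the queue, which is the part that would be a mailbox in Erlang or a channel in Go.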

Conclusions

So, what's actually going to happen?  I see Node.js as the clear front runner.  It takes a worse-is-better approach that seems to work well enough as people get started.  If they encounter difficulties later, they can always rewrite in something else, if needs be, but by "luring people in", Node.js has gathered a large group of users who continue, in turn, to churn out more code for use with the system, making it more attractive to new users.

Programming languages are not winner-take-all markets though, so perhaps there is room for a few more languages to have decent followings in this space.  Hopefully the competition will lead to ever better tools for those of us utilizing them!

What do you think?

Small Town Google Pollution

There are so many sites trying to vie for Google results for any and every town in existence in the United States that they are crowding out useful information.  They get their list of towns from census data or similar sources, and generate pages for every single entry they find, no matter how small.

During the weekend, I was poking around, looking for information on a not-quite-ghost town called Lonerock, in Oregon:


http://en.wikipedia.org/wiki/Lonerock,_Oregon


I love to visit out of the way places like that when I'm back in Oregon, which we hope to do next summer.  So I was curious about it – it appears quite isolated, but seems like it might be worth a look.  According to the Wikipedia article it is a very small town, with a population of 24 people.  And yet, if you look at the Google results for it, you find a few good links at first, and then:

  • Any number of results with "city information", which is just census data with a bit of fiddling.
  • Current local time.  Gee, that's great to know.
  • Various 'homes for sale' sites: none listed.
  • Horses for sale "in Lonerock", but it turns out they're all in Heppner (30 miles away) or farther.
  • Climate statistics.
  • Bicycles for sale in Lonerock.  Surprisingly, none of those, either.
  • Attorneys at law in Lonerock. I think they ran them all out of town: there are none.
  • Truck scales and weigh stations in Lonerock.  You have to drive at least 50 miles to get to the nearest one.
  • Hotels in Lonerock: nope, none of those either.
  • Various Cable and Internet offerings (I think in reality they use this: http://en.wikipedia.org/wiki/IP_over_Avian_Carriers )
  • Doctors in Lonerock: Neurologists, Gastroenterologists and even Nephrologists.  Turns out there aren't any in town.
  • Car Rentals in Lonerock.   Just in case you fly in to any of the Airports in Lonerock, and want to leave rather than staying in the Hotels in Lonerock.
  • Apartments for rent in Lonerock.  A town of 24 people in the middle of the wide open west is not the kind of place you're likely to live in an apartment…
  • Relocation guides for Lonerock.

And so on and so forth.  Scattered in the middle, you can even find a few articles and photo pages by people who actually had something to say or show about the place, but finding them amidst all the crud is not an easy task.  To me this seems surprising – it doesn't seem like it'd be that hard to make sites that just happen to have entries for every town in the entire nation rank a bit lower than things written by human beings.

Google – have a look at Lonerock, and see if you can use it as a way to separate the wheat from the chaff!

Simple phone alerts with Gmail and Android

I have some long running server processes that I am going to launch soonish, and I want to be alerted when they're done.  Using Gmail and Android, it's pretty easy:

  1. In Gmail, set up a new label, "Alerts" or something like that.
  2. Set up a filter in Gmail that matches messages addressed to something along the lines of codered@example.com, adds the Alerts label, does not skip the inbox, and always marks them as important.
  3. Now, on your Android phone, go to Gmail -> Settings -> Account settings (select the account) -> Sync inboxes and labels, and select your Alerts label.
  4. Then, go back and select Labels to Notify in settings, select "Alerts", and add it to "Notify in Status bar", select an appropriate ringtone, set vibrate to 'always', and deselect 'Notify once' – because we want to be notified any time an alert comes in.

That should do it – on your server, set up an alias in something like /etc/aliases that redirects email sent to the alert alias – codered in this case – to your own email address.

Now, you can script alerts that will get their own ringtone on your phone like so:

  echo "Dave, I can't let you do that" | mail -s "Warning, computer malfunction" codered@example.com
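For the long running processes themselves, something like the following sketch could wrap a job and send the alert when it exits.  It assumes the same mail command and codered alias as above are available on the server; the function names build_alert and run_and_alert are hypothetical, chosen just for this example.

```python
import subprocess

# The alert alias from the steps above; adjust to your own setup.
ALERT_ADDRESS = "codered@example.com"


def build_alert(job, returncode, recipient=ALERT_ADDRESS):
    """Return (argv, body) for piping a status message to the mail command."""
    if returncode == 0:
        outcome = "finished"
    else:
        outcome = f"failed (exit code {returncode})"
    subject = f"Job {outcome}: {job[0]}"
    body = "Command: " + " ".join(job) + "\n"
    return ["mail", "-s", subject, recipient], body


def run_and_alert(job, run=subprocess.run):
    """Run `job` (a list of arguments), then mail its outcome to the alias.

    `run` is injectable so the function can be tried out without
    actually launching anything or sending mail.
    """
    finished = run(job)
    argv, body = build_alert(job, finished.returncode)
    run(argv, input=body, text=True)  # same effect as: echo body | mail -s ...
    return finished.returncode
```

With that in place, a call along the lines of run_and_alert(["./reindex.sh", "--all"]) (a made-up script name) would block until the job exits and then trigger the phone notification.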

Pretty simple and effective if you need a simple way for your computer to let you know that it needs attention right now.

Up for Auction: LinuxSi.com

A number of years back, I read yet another complaint about someone having trouble finding a computer with Linux preinstalled.

So I did something about it: I created LinuxSi.com, where it is possible to register computer stores in Italy (this was an Italian Linux mailing list) that are helpful towards people wishing to buy a Linux machine.

Fast forward past getting married, having kids and buying a house, and LinuxSi.com is not something I have much time to run any more.  I still think it's a useful service, even if the site itself is a bit creaky.

In any event, I've put it up for auction with Flippa.com, and there's one week left on the auction.  Right now, it's going for just $10, which you'd make back pretty quickly, even with the low amount of AdSense income it brings in.

I hope that it goes to someone who cares about promoting Linux in Italy – if nothing else, the domain name is a good one that could be employed for many things.

Mr. Blank, we’re outside the building, and we want eBooks!

Steve Blank is known for his teachings on the Silicon Valley type of entrepreneurship, with his ideas forming the basis for the "lean startup movement" amongst other things.  He writes frequently on entrepreneurship, and with a great deal of credibility, having been involved in various startups in a number of roles.  He has, without a doubt, walked the walk in terms of startups, and now seems to be spending his time helping other people learn how to walk the same path.  That's a noble thing to be doing when, with the money he's made, he could probably be off doing pretty much whatever he wants.

If you've heard of Steve Blank, you've probably also heard his famous phrase: "get out of the building", an admonition to startup founders to get out and talk with their customers to validate their ideas, rather than huddling in their offices building something that may or may not have a market.

With that in mind, when I saw he had a new book out, The Startup Owner's Manual, I thought "great, that's one I'll get without hesitating!".  Unfortunately, though, an eBook won't be out until "2nd half of 2012"!  Ouch.

To me, his ignoring eBooks is indicative of a need to get a bit further outside the building, though.  "I want an eBook" was probably the biggest request on his blog post announcing the new book, alongside messages of thanks for writing the book.

After reading on Blank's blog about the availability of the book from BookDepository Ltd, who offer free worldwide shipping, I went ahead and ordered it, even though I would have preferred the eBook.  Since they're in the UK, and I'm in Italy, I figure it can't take that long, right?

Wrong.  I ordered on March 15th, and as of April 13th, it still isn't here.

Compare and contrast with the other books I'm currently reading, which I was able to order and start looking at in just a few minutes on my Kindle.

Granted, Steve Blank surely isn't doing this for the money, and from that point of view has little real need to listen to his customers – it's not wrong to say he's doing the world a favor by writing the book in the first place.  If he thinks a paper version is far superior, that's his prerogative.  However, I think he's doing a lot of his readers a disservice by not making the eBook available sooner.  I know I would have liked to start reading what he had to say last month, rather than waiting for a paper book to make its way (by mule train?) down here to Italy.  The crux of the matter is that while he may well be right in thinking a paper book is "better", for some people, an eBook is the only option, and for them, an "inferior" eBook is a heck of a lot better than no book at all.

Also, on a more constructive note, with eBooks, you can get pretty creative.  For instance, if you have a tabular worksheet, you can simply hyperlink to it in, say, Google Docs, so that those with more advanced devices like iPads can open up the link and start working with a real, live spreadsheet immediately, rather than a chart in a printed book.  Granted, that means 'giving away' the worksheet, but presumably it's not that valuable on its own, and makes for great advertising if it gets a lot of attention.

Finally, since I actually run a business that does eBook conversions, I offered, on the blog post announcing the book, to donate our services so he'd get his book done for free.  So you can't accuse me of just complaining!

Mr. Blank, get out of that building and make an eBook available, please!

I’m not good enough to work on open source software

Actually, that's not true – I've produced plenty of open source software over the years.  However, in a sense, it is true: only the very best actually get paid to work on open source software full time, and I'm not one of them.  People like Linus Torvalds.  People like Guido van Rossum, although even he supposedly divides his time, and does not work on Python full-time.

Think about that.  Python is a hugely popular programming language used by many companies and individuals who get a lot of value out of it. But the creator doesn't even work on it full-time. Now, that's just one example – perhaps Guido enjoys the things that Google sends his way to work on outside of Python in any event – but I think it's representative of open source in general.

Back to me: I've produced some small bits of open source code that other people find useful.  Several people have even built products on Hecl that make money. But I'm not good enough to work on open source full time – I'm not one of those famous, brilliant coders who is so good that someone will find a way to pay them to work on stuff that gets given away for free.  I am, however, a good enough programmer to work on people's proprietary code, and have never had too much trouble finding someone with a project they're happy to pay me for.  Why is that?  Because it's so much easier to funnel money back into a proprietary project.  If people like the product and buy it, the company gets money, which can be used to pay the developers.  With open source, millions of people might use it and get a lot of value from it, but the developer has no right to receive any of that back as cash, which he or she can use to pay for things like food and rent.

So, I can code tolerably well, and I could conceivably contribute more open source code, but instead I spend my time working on proprietary code because it pays the bills.  Clearly, when I can, I use open source software in these endeavors, and contribute back whenever I can, but the "secret sauce" remains closed.  That's a pair of hands lost to the creation of more open source.

I know I'm not alone in this, either – tons of people work on mostly proprietary projects the world over, but relatively few people get paid to work on open source code full time.

So when people debating the utility of copyright bring up the existence of open source as some sort of counterexample, it irritates me a bit.  The right level of protection and enforcement of copyright is a complex debate that I'm not going to get into here.  What I want to point out is "that which is not seen".  Sure, open source exists.  But how much more of it would exist if there were more money to fund work on it?  How much open source software remains an idea in the developer's head that does not get realized for lack of time?  People often criticize the "Linux desktop" despite its extraordinary strides in recent years.  Well, how much farther along would it be if there were more people paid to work on the 'boring stuff', like usability testing?  Ubuntu and Red Hat pay a few people to do that kind of stuff, but how many more people do Microsoft and Apple have for that kind of work?

That's not to say that open source "doesn't work" or some such nonsense.  It obviously works quite well, but it really shines where the currency is code, not money.  Developers can and do give back lots, in terms of code, bug reports, suggestions, documentation, and so on to open source projects, which makes them better for all involved.  Where open source doesn't seem to work quite as well is in small, fast-moving, consumer-oriented products.  My guess is that 99% of iPhone users couldn't care less about the source code for their apps, but on the other hand, a large portion of the Emacs user base more than likely has written at least a few lines of Elisp.

In any event, the point isn't to beat up on open source software, but to counter this idea that "intellectual property" is in no way shape or form necessary because the existence of open source software somehow "proves" that "things will get made just the same".  Yes, maybe they will, but in lower quantities than consumers might find desirable.  After all, most of us aren't good enough to work on open source software.

BikeChatter.com for sale

What with two kids, a new house, and LiberWriter getting some good traction, I've been looking around for things to give to a good home so as to have less stuff to deal with.

So, on the auction block goes BikeChatter.com : https://flippa.com/2696023-professional-cyclists-on-twitter-plus-2-years-of-history

BikeChatter.com is the place to go on the web to follow professional cyclists on Twitter.  With 500+ racers, and nearly half a million status updates from racers like Lance Armstrong, Alberto Contador, Mark Cavendish, and many, many more, this site is the best place to find out what's going on in the world of professional cycling, directly from the participants.

Since I like following the site myself, I really want to see it go to people who will take it and make it even better.