Make Life Easier for European Startups: Simpler/Cheaper Limited Liability Companies

As I've mentioned here before, one of the differences between "Europe" and the US is just how cheap it is to start a company in the US.  Before we go any further, I'll take a moment to add the standard "yes, I know that Europe is not one country" disclaimer, and specify that I'm mostly talking about continental Europe.  Starting a company in the UK or Ireland isn't nearly as bad.

In Oregon, I spent $55 to create DedaSys LLC.  If I'd created it with one or more partners, I would have spent something on a lawyer to draw up a solid agreement, but that is of course a voluntary expenditure, one we would pay for because it provided us with some value.  In Italy, it costs thousands of Euros just to get started with a company, before you've made any money at all.  And, while there are gradual improvements, it's still a bureaucratic process that pretty much has to involve at least an accountant and a notary public.  You also have to put up the very arbitrary sum of 10,000 Euros of capital in the company, supposedly there as a guarantee for the people you're doing business with.  But 10,000 is not nearly enough to cover some kinds of problems you might cause, and way more than, say, a web startup with a couple of guys and their laptops needs.  My friend Salvatore says it's possible to sort of "get around" sinking the full 10K into your company, but in any case, the principle of "caveat emptor" is a more sensible one.  At most, impose a transparency requirement so that people dealing with a company can tell exactly how much it has in reserves.

During a recent bout of cold/flu, compliments of our daughter's nursery school, I had some extra time on my hands and decided to do something about this, however much it may amount to pissing into the wind.  I set up a site:

http://www.srlfacile.org (warning: Italian only)

As well as a Google Group and a petition for people to sign, in an attempt to make a little bit of noise about the problem here in Italy.

While it's likely that the actual bureaucratic mechanisms are more smoothly oiled in other European countries, I have my doubts as to whether the amount of paperwork can compete with the very minimal page or two required by many US states.  In any case, the costs are still quite high, and while we all have different ideas about the role of government and the ideal level of taxation, I think we can agree that it's sensible to levy taxes only after a company has begun to make money!

So – how about it?  Anyone want to create sister initiatives in other countries in Europe where things aren't as simple and easy as they should be?  Anyone care to tell how this problem has been fixed elsewhere?  I've heard tell that in Germany, there is now a simpler/cheaper way to create limited liability companies.

US Exports: “The Cloud”?

An Economist special report in last week's print edition talks about how the US will need to focus more on savings and exports:

A special report on America's economy: Time to rebalance

I've been thinking about that for a while too, especially after the dollar's recent weakness, although it has been strengthening somewhat lately, apparently due to the state of Greece's finances…

I think that the computing industry is, in general, well poised to take advantage of that.  For instance, what could be easier to export than computing power or "Software as a Service"?  All it takes is a few minutes for someone in Europe to sign up to a US-based service with a credit card.

For instance, compare Linode's prices and good service with most of their European competitors (gandi.net for instance, who are good people, and you have to love that they put "no bullshit" right on their front page).  It's not that European providers don't offer good service, but it's very difficult to compete on price when the dollar is significantly cheaper.  With the dollar where it is right now, gandi is almost, but not quite, competitive with Linode, if you don't include taxes.  If the dollar weakens again, though, things could easily tilt far in Linode's favor.

Besides a weak dollar, I think it will be important for companies in a position to do so in the US to focus on "the rest of the world".  The US is a big, populous country where it's very easy to forget about far-off lands.  Compare my home town of Eugene, Oregon, with Padova, where I live now.  Google Maps says that it takes 7+ hours to drive to Vancouver, Canada (which, to tell the truth, isn't all that foreign in that they speak English with an accent much closer to mine than, say, Alabama or Maine).  Going south, Google says it's 15+ hours just to San Diego, although I think that's optimistic myself, given traffic in California.  From Padova, I can be in France in 5 hours, according to Google, 3 hours to Switzerland, 4 to Innsbruck, in Austria, less than 3 hours to the capital of Slovenia, Ljubljana, and around 3 hours to Croatia, too.  And if you wanted to throw in another country, the Republic of San Marino is also less than 3 hours away, according to Google's driving time estimates.  You could probably live your entire life in a place like Eugene and never really deal much with foreigners, whereas here, nearby borders are both a historic and an ever-present fact.

The outcome of this is that, to some degree, people in the US have traditionally focused their businesses "inwards" until they got to a certain size.  Which is, of course, a natural thing to do when you have such a big, homogeneous market to deal with before you even start thinking about foreign languages, different laws, exchange rates and all the hassle those things entail.

However, if exchange rates hold steady or favor the US further, and internal spending remains weaker, it appears as if it may be sensible for companies to invest some time and energy to attract clients in "the rest of the world".

"Cloud" (anyone got a better term? this one's awfully vague, but I want to encompass both "computing power" like Linode or Amazon's EC2, as well as "software as a service") companies likely will have a much easier time of things: for many services, it's easy to just keep running things in the US for a while, and worry about having physical or legal infrastructure abroad later.  Your service might not be quite as snappy as it may be with a local server, but it'll do, if it performs a useful function.  Compare that with a more traditional business where you might have to do something like open a factory abroad, or at the very least figure out the details of how to ship physical products abroad and sell them, and do so in a way that you're somewhat insured against the large array of things that could go wrong between sending your products on their merry way, and someone buying them in Oslo, Lisbon or Prague.

Since this barrier to entry is lower, it makes more sense to climb over it earlier on.  As an example, Linode recently did a deal to provide VPS services from a London data center, to make their service more attractive to European customers. 

However, they still don't appear to have marketing materials translated into various languages, and presumably they don't have support staff capable of speaking languages like Chinese, German or Russian either (well, at least not in an official capacity).  This isn't to pick on them; they may have considered those things and found them too much of an expense/distraction/hassle for the time being – they certainly know their business better than I do – and are simply content to make do with English.  Other businesses, however, may decide that a local touch is important to attracting clients.

What do you need to look at to make your service more attractive to people in other countries?  In no particular order:

  • Internationalization and localization.  Most computer people can "get by" in English, but perhaps their boss doing the purchasing doesn't.  If research shows that you are in a market where people value being able to obtain information in their own language, or interact with a site or service in it, make an effort to equip your code with the ability to handle languages other than English (there's a small sketch of one way to do that after this list), and then pay to have your content translated.  Good, professional translations are not easy: for instance, when I translate to English from Italian (you always translate from the foreign language to your native language – anyone who doesn't isn't doing quality work) I read the Italian text, digest it, and then spit out an English version.  This doesn't mean just filling in English words for Italian ones, but looking at sentence length and structure, as well as translating idioms and cultural references into something that makes sense.  Basically, you read and understand the concepts and then rewrite the text, rather than simply taking each sentence and translating it.  Also, knowledge of the domain in question is important, so that you don't translate "mouse" to "topo", but leave it as "mouse", as is proper in Italian.
  • Part of internationalization is considering problems like time zones, currency, and names, which can vary a great deal from culture to culture.
  • Going a step further, you might consider hiring, or outsourcing, staff that is fluent in other languages to provide first-level support.  Reading English is one thing for many people; they can take the time to work out what one or two unfamiliar words mean.  However, if you have a problem with your server over the weekend, and you don't feel comfortable writing or calling someone to deal with a problem in English, you might consider purchasing a local service even if it's more expensive, because you can deal with people in your own language if the need should arise.  These people might either be local or remote, depending on what their role is.  For instance, Silicon Valley is 9 hours behind Central European Time, so when it's 9 AM here, and the business day is just getting started, everyone but the late-night coders in California is headed for bed, which means that it would be difficult to provide timely support unless you have someone working the late shift.  It may be easier to hire someone in Poland to support your Polish users than to find a Polish speaker in Silicon Valley who is willing to work from midnight to 9 AM.
  • Legal issues are not something I can give much advice on, but things like the privacy of people's data certainly bear considering.  If you don't actually have offices abroad, though, it's less likely that anything untoward will happen to you if users understand that they're using the service in question according to the laws and regulations of the jurisdiction your business resides in.  Once again though: I'm not a lawyer.
  • Even something as basic as your name needs to be carefully thought through.  "Sega" for instance, has a potentially rude meaning in Italian.  These guys are visible from a major road near Treviso: http://www.fartneongroup.com/ – doubtless the company was founded locally and subsequently grew to the point where they then learned of their unfortunate name in English (admittedly though, it does make them potentially more memorable than their competitors).
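
As a concrete illustration of the internationalization point above, here's a minimal sketch using Tcl's standard msgcat package; the strings and locales are made up for illustration, and a real site would of course load its catalogs from translator-maintained files.

    package require msgcat

    # Register translations for a couple of locales (illustrative strings only;
    # real catalogs would normally live in separate .msg files).
    ::msgcat::mcset it "Sign up" "Registrati"
    ::msgcat::mcset de "Sign up" "Anmelden"

    # Pick the locale; in a real service this would come from the user's
    # preferences or the Accept-Language header.
    ::msgcat::mclocale it
    puts [::msgcat::mc "Sign up"]   ;# prints "Registrati"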

There's certainly no lack of work there, but on the other hand, it's possible to do almost all of it from wherever you happen to be located, rather than spending lots of money and time flying around to remote corners of the globe, as is still common practice in many industries.

Where Tcl and Tk Went Wrong

I’ve been pondering this subject for a while, and I think I’m finally ready to write about it.
Tcl was, for many years, my go-to language. It’s still near and dear to me in many ways, and I even worked on a portion of a book about it ( https://journal.dedasys.com/2009/09/15/tcl-and-the-tk-toolkit-2nd-edition ).

However, examining what “went wrong” is quite interesting, if one attempts, as much as possible, a dispassionate, analytical approach that aims to gain knowledge, rather than assign blame or paper over real defects with a rose-colored vision of things. It has made me consider, and learn, about a variety of aspects of the software industry, such as economics and marketing, that I had not previously been interested in. Indeed, my thesis is that Tcl and Tk’s problems primarily stem from economic and marketing (human) factors, rather than any serious defects with the technology itself.

Before we go further, I want to say that Tcl is not “dying”. It is still a very widely used language, with a lot of code in production, and, importantly, a healthy, diverse, and highly talented core team that is dedicated to maintaining and improving the code. That said, since its “heyday” in the late ’90s, it has not … “thrived”, I guess we can say. I would also like to state that “hindsight is 20-20” – it’s easy to criticize after the fact, and not nearly so easy to do the right thing at the right moment. This was one reason why I was reluctant to write this article. Let me repeat that I am writing it not out of malice or frustration (I went through a “frustrated” phase, but that’s in the past), but because at this point I think it’s a genuinely interesting “case history” of the rise and gentle decline of a widely used software system, and that there is a lot to be learned.

At the height of its popularity, Tcl was up there with Perl, which was the scripting language in those days. Perl, Tcl, and Python were often mentioned together. Ruby existed, but was virtually unknown outside of Japan. PHP was on the rise, but still hadn’t really come into its own. Lua hadn’t really carved out a niche for itself yet, either. Tcl is no longer one of the popular languages these days, so to say it hasn’t had problems would be to bury one’s head in the sand: it has clearly fallen in popularity.

To examine what went wrong, we should probably start off with what went right:

  • Tk. This was probably the biggest draw. Easy, cross-platform GUIs were, and are, a huge reason for the interest in Tcl (there’s a small example after this list). Tk is actually a separate bit of code, but since many of the widgets are scripted in Tcl, the two are joined at the hip. Still, though, Tk is compelling enough that it’s utilized as the default GUI library for half a dozen other languages.
  • A simple, powerful language. Tcl is easy to understand and get started with. It borrows from many languages, but is not an esoteric creation from the CS department that is inaccessible to average programmers.
  • Easily embeddable/extendable. Remember that Tcl was created in the late ’80s, when computers were orders of magnitude less powerful than today. This meant that fewer tasks could be accomplished via scripting languages, but a scripting language that let you write routines in C, or, conversely, let the main C program execute bits of script code from time to time, was a very sensible idea. Tcl still has one of the best and most extensive C APIs in the game.
  • An event loop. Lately, systems like Python’s Twisted and node.js have made event-driven programming popular again, but Tcl has had an event loop for years.
  • BSD license. This meant that you could integrate Tcl in your proprietary code without worrying about the GPL or any other legal issues.
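
To give a feel for the first point (and the event loop), here’s a tiny, self-contained sketch: a complete cross-platform GUI plus a timer driven by the built-in event loop. It isn’t meant to show good GUI design, just how little ceremony is involved.

    # Run with: wish hello.tcl
    package require Tk

    # A complete, cross-platform GUI: one label and one button.
    label  .msg -text "Hello from Tk"
    button .go  -text "Click me" -command {.msg configure -text "Clicked!"}
    pack .msg .go -padx 10 -pady 5

    # The event loop also handles timers: update the window title once a second.
    proc tick {} {
        wm title . [clock format [clock seconds] -format %H:%M:%S]
        after 1000 tick
    }
    tick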

These features led to Tcl being widely used, from Cisco routers to advanced television graphics generation programs to the AOLserver web server, which was busy serving out large quantities of dynamic web pages when many of us were still fiddling around with comparatively slow and clunky CGI programs in Perl. Note also that a lot of cool things have gone into Tcl in the meantime. It has a lot of impressive features, many more than most people realize, and has had many of them for a while. Check out http://www.tcl.tk to learn more about the “good stuff”. But that’s not the point of this article…

There was a healthy, active community of developers producing lots of interesting add-ons for the language, and working on the language itself. This culminated in its adoption by Sun Microsystems, which hired the language’s creator, Dr. John Ousterhout, and a team of people, who added a lot of great features to the language. Quoting from Ousterhout’s history of Tcl page:

The additional resources provided by Sun allowed us to make major improvements to Tcl and Tk. Scott Stanton and Ray Johnson ported Tcl and Tk to Windows and the Macintosh, so that Tcl became an outstanding cross-platform development environment; Windows quickly came to be the most common platform. Jacob Levy and Scott Stanton overhauled the I/O system and added socket support, so that Tcl could easily be used for a variety of network applications. Brian Lewis built a bytecode compiler for Tcl scripts, which provided speedups of as much as a factor of 10x. Jacob Levy implemented Safe-Tcl, a powerful security model that allows untrusted scripts to be evaluated safely. Jacob Levy and Laurent Demailly built a Tcl plugin, so that Tcl scripts can be evaluated in a Web browser, and we created Jacl and TclBlend, which allow Tcl and Java to work closely together. We added many other smaller improvements, such as dynamic loading, namespaces, time and date support, binary I/O, additional file manipulation commands, and an improved font mechanism.

Unfortunately, after several years, Sun decided that they wanted to promote one and only one language. And that language was Java. So Ousterhout and many people from his team decamped to a startup that Ousterhout founded, called Scriptics, where the Tcl and Tk innovations continued:

In 1998, Scriptics made several patch releases for Tcl 8.0 to fix bugs and add small new features, such as a better support for the [incr Tcl] extension. In April 1999, Scriptics made its first major open source release, Tcl/Tk 8.1. This release added Unicode support (for internationalization), thread safety (for multi-threaded server applications), and an all-new regular expression package by Henry Spencer that included many new features as well as Unicode support.

However, as many a company based around open source was to find later, it’s a tough space to be in. Scriptics changed its name to Ajuba, and was eventually sold (at a healthy profit, apparently, making it a relative dot-com success story, all in all) to Interwoven, for the “B2B” technology that Ajuba had developed. Interwoven was not particularly interested in Tcl, so the “Tcl Core Team” was formed to provide for the ongoing development and governance of the language.

This was something of a blow to Tcl, but certainly not fatal: Perl, Python, Ruby, PHP, Lua have all had some paid corporate support, but it has by no means been constant, or included large teams.

At the same time, in the late ’90s, open source was really starting to take off in general. Programmers were making all kinds of progress: they had begun to turn Linux into what is today the world’s most widely used server platform, and were laying the groundwork for the KDE and Gnome desktops. While these may still not be widely used, they are for the most part very polished systems, and leaps and bounds better than what passed for the ‘Unix desktop’ experience in the ’90s.

One of the key pieces of work that went into Tk was making it look pretty good on Microsoft Windows systems. This was at a time when the “enterprisey” folks were turning away from Unix in the form of AIX, Solaris, HPUX, et al. and taking up NT as the platform of choice, so it was in some ways sensible to make Tk work well there; besides, as a cross-platform GUI toolkit, it ought to work well on Windows in any case.

And, on the Unix side, Tk emulated the expensive, professional Motif look and feel that serious Unix programmers used. What could go wrong?

As Gnome and KDE continued to mature, though, what would become one of Tk’s major (in my opinion) marketing blunders took root. I have it on good authority, from someone who was there in the office, that the Scriptics guys working on Tcl and Tk viewed Gnome and KDE (and the Gtk and Qt toolkits) as not really worth their while. To be fair, since Tk has always been under a liberal BSD-style license, the Qt toolkit has always been “off limits”. Still, though, the attitude was that Tk was a standalone system, and since it ran on pretty much any Unix system, it didn’t need to bother with Gnome or KDE. Gradually, though, as more and more people used Gnome and KDE exclusively on Linux, the Tk look and feel began to look more and more antiquated, a relic from the 1990s when Motif (which has since pretty much disappeared) was king. Tk applications started to really stand out by not looking at all like the rest of the operating system. And, while Linux may not be responsible for a vast portion of the world’s desktops, it is widely used by developers, who were turned off by the increasingly creaky look of the Tk applications they saw.

Tk is and was actually a fairly flexible system, and it would have been possible to tweak the look and feel to make it look a bit better on Linux, without even doing any major work. Maybe not perfect, but certainly better looking. Nothing happened, though.

Another problem was that Tk and Tcl made it so easy to create GUIs that anyone could, and did, despite, in many cases, a complete lack of design skills. You can’t particularly blame the tools for how they’re used, but there was certainly a cultural problem: if you read most of the early Tcl and Tk books, and even many of the modern ones, there are hundreds of pages dedicated to exactly how to use Tk, but few to none explaining even basic user interface concepts, or even admonitions to the user to seek out that knowledge prior to attempting a serious GUI program.

The end result is that a lot of Tk programs, besides just looking “old fashioned”, had fairly poor user interfaces, because they were made by programmers who did not have a UI/design culture.

Contrast that with Gnome and KDE, which have made a point of focusing on how to make a good GUI for their systems, complete with guidelines about how applications should behave. It may have taken them some time to get things right, but they have done a lot to try and instill a culture of creating high-quality, well-integrated GUIs that are consistent with the system where they run.

Lately, there has been a lot of work to update the Tk look and feel, and it has finally started to bear fruit. However, in terms of marketing, the damage has already been done: the image of “old, crufty Tk” has been firmly planted in countless developers’ minds, and no amount of facts are going to displace it in the near future.

Another problem Tcl faced, as it grew, was the tug-of-war between those wishing to see it small, light, and easy to distribute embedded within some other program, and those wishing it to become a “full-fledged” programming language, with lots of tools for solving everyday problems. Unfortunately, that tug of war seems to have left it somewhere in the middle. Lua is probably more popular these days as an embedded language, because it is very small and very fast, and doesn’t have as much “baggage” as Tcl. Meaning, of course, that it doesn’t do as much as Tcl either, but for a system where one merely wishes to embed a scripting language, without much ‘extra stuff’, Tcl’s extra functionality is perhaps a burden rather than a bonus. Meanwhile, Perl was happily chugging along with its CPAN system for distributing code, giving users easy access to a huge array of add-on functionality, and Python was building up a “batteries included” distribution that shipped a lot of very useful software straight out of the box. Tcl, on the other hand, chose to keep the core distribution smallish, and only lately has gotten some semblance of a packaging and distribution system, which is, however, run by ActiveState, and (at least according to a cursory glance at the Tcl’ers wiki) not even fully open source. The lack of a good distribution mechanism, combined with eschewing a larger, batteries-included main distribution, left Tcl users with a language that, out of the box, did significantly less than the competition. Technically, a Python-style “big” distribution would not have been all that difficult, so once again, I think this is a marketing problem: a failure of the Tcl Core Team to observe the “market”, assess what users needed, and act on it in a timely manner.

Somewhat related to the large Tcl vs small Tcl issue was one particular kind of extension to the language that was noticeably absent: a system for writing “object oriented” code. Tcl, at heart, will never be an OO language through and through, like Ruby or Smalltalk, but that doesn’t mean that an OO system for it is not a useful way of organizing larger Tcl systems. Indeed, Tcl’s syntax is flexible enough that it’s possible to write an OO system in Tcl itself, or, optimizing for speed, utilizing the extensive C API in order to create new commands. Over the years, a number of such systems have arisen, the most well-known being “Incr Tcl” (a play on the incr command, which is akin to += 1 in languages like C). However, none of these extensions was ever included with the standard Tcl distribution or somehow “blessed” as the official OO system for Tcl. This meant that a newcomer to Tcl wishing to organize their code according to OO principles had to pick a system to use from several competing options. And of course, newcomers are the least able to judge a complex feature like that in a language, making it a doubly stressful choice. Furthermore, even experienced Tcl programmers who wanted to share their code could not utilize an OO system if they wanted their code to work with just standard Tcl. Also, if their code had a dependency on some OO system, it would require the user to download not only the extension in question, but the OO system it was built on, which, naturally, might conflict with whatever OO system the user had already selected! As of Tcl 8.6, thanks to the work of Donal Fellows, Tcl is finally integrating the building blocks of an OO system in the core itself, but this effort is probably something like 10 years too late.
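
For reference, here is roughly what the 8.6-era TclOO building blocks look like; this is just an illustrative sketch, not a recommendation of one OO style over another.

    # Requires Tcl 8.6, where the TclOO building blocks ship with the core.
    package require TclOO

    oo::class create Counter {
        variable count
        constructor {{start 0}} { set count $start }
        method bump  {} { incr count }
        method value {} { return $count }
    }

    set c [Counter new]
    $c bump
    $c bump
    puts [$c value]   ;# -> 2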

Some other more or less minor things that have taken too long to get integrated into the Tcl or Tk distributions include the PNG image format (still not there in a production release) and a readline-style command line (Tkcon is nice, but it’s not a replacement for simply being able to type “tclsh” and get a nice, functional shell, like Python, Ruby and most other languages have; this could easily lead to a bad first experience for someone trying out Tcl). Tcl also took too long to integrate a first-class hash type (accessed with the ‘dict’ command), which only appeared in 8.5. Its “arrays” aren’t bad, but don’t quite have the full power of a hash value as dict implements it. Once again, the code to do these things was/is out there; it has just been a matter of integrating it into Tcl and Tk, which has been a slow process.
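
A quick sketch of the dict/array difference, for readers who haven’t seen it: a dict is a first-class value you can nest, copy, and pass around, while an array is a collection of variables.

    # A dict (8.5 and later) is a first-class value.
    set user [dict create name "David" city "Padova"]
    dict set user langs {tcl ruby}
    puts [dict get $user name]          ;# -> David
    dict for {key value} $user {
        puts "$key = $value"
    }

    # An array, by contrast, is a collection of variables, not a value:
    array set colors {fg black bg white}
    puts $colors(fg)                    ;# -> black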

One actual technical problem that Tcl faces is the concept that all values must be representable as strings. This is more or less OK for things like lists, hash tables or numbers, but it is problematic when the user wishes to represent a value that simply isn’t a string. A basic example is a file handle, which at the C API level is a FILE*. How does Tcl get around this? It keeps an internal hash table mapping a Tcl-script-accessible string, such as “file5”, to the FILE* value that is used internally by the file commands. This works pretty well, but there is a big “but”: since a string like “file5” can be composed at any time and must still resolve to the actual file pointer, you can’t do any sort of “garbage collection” to determine when to clean things up automatically. Other languages have explicit references to resources, so the program “knows” when a resource is no longer referenced by the rest of the program, and can clean it up. In Tcl, the programmer must explicitly free any resource referenced this way. This explanation simplifies things somewhat, but it is something I view as a genuine technical problem with Tcl.
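
A small sketch of what that looks like in practice (the handle name and the /tmp path are just examples):

    # The channel handle is just a string (e.g. "file6"), so Tcl has no way of
    # knowing when the last "reference" to it disappears; cleanup is manual.
    set fh [open /tmp/example.txt w]
    puts "handle is: $fh"
    puts $fh "hello"
    close $fh    ;# forget this, and the underlying FILE*/descriptor stays open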

If you’ve been following along, you’ve noticed a lot of “these days” and “recently”. That’s because Tcl is still very actively developed, with a lot of new ideas going into it. However, if you look at the release dates, it seems that after Ajuba was sold off, and Ousterhout pretty much abandoned an active role in the language for good, placing it in the hands of the Tcl Core Team, there was a lull in the momentum, with Tcl 8.5 taking 5 years to be released.

This is actually, in my opinion, an interesting phenomenon in languages: you risk hitting some kind of local maximum when your language is popular enough to have a lot of users who will be angry if things are changed or accidentally broken in the course of big upheavals. So you have to slow down, go carefully, and not rock the boat too much. On the other hand, there is an opportunity cost in that newer languages with less to lose can race ahead of you, adding all kinds of cool and handy new things, or simply fix and remove “broken” features. Erlang is another system that has, in my opinion, suffered from this problem to some degree, but this article is long enough already! Once again, though, not really a technical issue, but a problem with how the code was managed (and not an easy one to solve, at that).

A Tcl failure that I was personally involved with was the web infrastructure. What went on to become Apache Rivet was one of the first open source Apache/Tcl projects, and was actually quite a nice system: it was significantly faster than PHP, and of course made use of Tcl, which at the time had a large library of existing code, and could be easily repurposed for projects outside the web (or the other way around: non-web code could be brought into a web-based project). One thing I ought to have done differently with the Apache Rivet project was listen to the wise advice of Damon Courtney, my “partner in crime” on the project, who wanted to see Apache Rivet have a fairly “fat” distribution with lots of useful goodies. Rails and Django, these days, have shown that that’s a sensible approach, rather than relying on lots of little extensions that the user has to go around and collect. The code was out there; I should have helped make Rivet do more “out of the box”.

A problem that both is and isn’t one: the syntax. Tcl’s syntax is amazingly flexible. Since everything is a command, you can write new commands in Tcl itself – and that goes for control structures, too! For instance, Tcl has a “while” command, but no “do … while”. It’s very easy to implement that in Tcl itself, as the sketch below shows. You simply can’t do that in most “everyday” languages. However, this comes at something of a “cost”: the syntax, for your average programmer who doesn’t want to go too far out of their comfort zone, is perhaps a little bit further afield from the C family of languages than they would prefer. Still, though, it’s a “human” problem, rather than a technical one. Perhaps, sadly, the message is that you’d better not “scare” people when introducing a new language by showing them something that doesn’t look at least a little bit familiar.
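
Here’s a minimal sketch of a do … while written in plain Tcl (ignoring details like break/continue handling):

    # A minimal "do ... while" control structure, written in Tcl itself.
    proc do {body keyword condition} {
        if {$keyword ne "while"} {
            error "usage: do body while condition"
        }
        uplevel 1 $body
        while {[uplevel 1 [list expr $condition]]} {
            uplevel 1 $body
        }
    }

    # The body always runs at least once:
    set i 0
    do {
        puts "i is $i"
        incr i
    } while {$i < 3}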

Conclusions? First and foremost, Tcl and Tk continue to be gradually improved. However, if one visits developer forums, there are also a lot of negatives associated with the Tcl and Tk “brands”, and I am not sure whether it will be possible to rectify that. So what can we learn from the rise and subsequent “stagnation” of Tcl?

  • Success in the first place came from doing one particular thing very well, and making it a lot easier than other existing systems at the time. That’s a good strategy.
  • Not staying up to date can be deadly. Of course it can be hard to know what’s a genuine trend and what’s just a fad in this industry, and picking the right things to stay up to date with is not easy. That said, there are a number of areas where Tcl and Tk failed to follow what became very clear directions until way too late.
  • Do your best not to get held back by existing users who don’t want the boat rocked, and don’t lose your agility and ability to iterate on the system you’re developing. Long delays between releases can be deadly.
  • Don’t lose touch with your “roots”. In this case, that means the open source community, which is a “breeding ground” for new developers and projects. Tcl and Tk became passé in that environment, which led to a lack of adoption for new projects not only in the world of open source, but in businesses as well.
  • Don’t isolate yourself: Tcl and Tk stopped appearing at a lot of the open source conferences and events and in magazines/books/articles online, either because with no figurehead/leader to invite there was less interest in speakers and authors, or because the rest of the Tcl Core Team wasn’t particularly engaged, or for whatever other reason. This created something of a negative feedback loop where Tcl and Tk were things associated with the past, rather than something currently talked about and discussed.

Books vs "e-books"?

I've been thinking about something for a while, and to be honest, still haven't reached any firm conclusions: what to think about self-published "e-books"?  I'm curious to hear your opinions.

For instance:

These are all electronic, in that they aren't distributed as real, paper books, have no ISBN, and are generally only available via the author's web site (you won't find them on Amazon.com).  They aren't simply on-line PDF versions of real books.

They're certainly a departure from the traditional model of having a publisher, editor, and real, physical books that could be purchased from book stores.  They don't appear to have been through any kind of formal editing or quality control process.  The prices seem to differ quite a bit; the first one is $19, the second one is $12, and the last one is $30.77.

For the authors, the appeal is obvious: they get to keep all of the money, and don't have to fool around with a lot of "process".

Consumers, on the other hand, have to consider different aspects: with a "real book", the bureaucracy and process exist to guarantee some minimum standards of quality.  If you buy an O'Reilly book, you know that it's probably like many of the other books they sell: perhaps it won't stand the test of time like something written by Knuth, but it'll be a pretty good guide to the technology it describes, most likely written by someone who is indeed an expert in that area.  If I buy some random PDF on the internet, it may come from someone who really knows their stuff, or it may be junk.  On the other hand, were this market to grow, prices could theoretically come down.  Since the people authoring the book don't have to fool around with editing, printing, and so on, and get to keep all the money themselves, they could in theory keep their prices significantly lower than someone creating a more 'traditional' book with a lot of overhead.  That is, of course, if the book is one where there is competition in its niche.  Right now a lot of the books that pop up on my radar are written by domain experts.  However, what's to prevent a lot of people from jumping in and attempting to make a quick buck with a flashy looking web site?  And if people end up buying books based only on the author's reputation, that might lead to people who are really good authors, but perhaps not well known as "doers" (they didn't invent the technology in question), being left out in the cold.  Also, there is something of an unknown quantity about "pdf books".  For instance, after raking in a bunch of cash with theirs, 37signals put it on their web site, completely for free.  That had to leave the guy who bought it the day before it went free feeling like a bit of a chump.  At least with a 'real book', even if the contents are posted on the internet, you have a physical object that belongs to you.  I wonder how bad piracy is, and how bad it might be were these to become more popular?  Another thing worth noting is that, via services like Lulu.com, it *is* possible to print these out.

In any case, I think things are likely to change with time, as we aren't dealing with a static situation, but rather one where a changing landscape may lead to different outcomes, as the key variables… vary.

I am honestly unsure of what to make of this development.  How do you see the market for "home brewed" pdf ebooks evolving?

Nickels, dimes, pennies, and Italian regulations

I have recently gone about opening a "Partita IVA" ( http://it.wikipedia.org/wiki/Partita_IVA ) so I can act as an independent consultant here.  Like everything here, it's a pain in the neck, but opening it wasn't all that bad, compared to other close encounters of the bureaucratic kind that I've had.

When it came time to send out my first bill, of course I had to get the accountant to help me put it together (simply sending a bill with the amount to be paid would be way too easy).  The crowning touch, though, was that I had to go to the "tabaccheria" and purchase a €1.81 (one Euro, eighty-one cents) "marca da bollo" ( http://en.wikipedia.org/wiki/Italian_revenue_stamp ) to affix to the aforementioned bill.  This is only necessary, however, in cases where the bill exceeds €77.47 (seventy-seven Euro, forty-seven cents).  The end result was that between asking the accountant for help, going over to the store to get the stamp, and so on, I probably wasted in excess of half an hour of my life on something that really isn't that complicated.

Who dreams this bullshit up, anyway?

Google execs convicted

In an update to an earlier article I posted, it appears that the Google executives in question have been convicted:

http://www.corriere.it/salute/disabilita/10_febbraio_24/dirigenti-google-condannati_29ebaefe-2122-11df-940a-00144f02aabe.shtml (in Italian)

They were convicted for having failed to block the publication of a video showing some teenagers picking on and hitting another minor with Down's syndrome.

It will be interesting to see how Google reacts.  Apparently, the court believes that Google is criminally responsible for videos its users happen to post, which would mean that they have, in theory, to personally review every video submitted to determine whether its content infringes on someone's rights.

Update:

Here's a New York Times link:

http://www.nytimes.com/aponline/2010/02/24/business/AP-EU-Italy-GoogleTrial.html


Update 2:

"cate" posted a link to Google's official response: http://googleblog.blogspot.com/2010/02/serious-threat-to-web-in-italy.html

Also, it's really incredible to read the comments here (in Italian): http://vitadigitale.corriere.it/2010/02/processo_vivi_down_google_cond.html

Most of them are against this ruling, but a significant number think it's a good thing, which just goes to show that you can't put all the blame on politicians for Italy's woes: someone is voting for them, after all.

Italy vs Google

I'm starting to notice a pattern here:

  • Google executives are on trial because some sorry excuses for human beings picked on a boy with Down's syndrome and posted the video to YouTube: http://news.bbc.co.uk/2/hi/technology/8115572.stm – this one is simply preposterous.  Going after the execs of a company who did nothing to aid, abet, condone or in any way facilitate the abuse in question is absurd, and if extended to other industries would mean that you could pretty much attack any company whose products happened to figure in a crime somehow.  Kitchen knives, hunting rifles, golf clubs, even automobiles would seem fair game.
  • Italy is going after "user generated content" sites like Youtube and wants to force them to register with the government if they wish to operate: http://arstechnica.com/tech-policy/news/2010/02/italy-preparing-to-hold-youtube-others-liable-for-uploads.ars
  • And last but not least, this hit piece in the normally respectable Corriere della Sera: http://www.corriere.it/economia/10_gennaio_28/mucchetti_4de4be8a-0be8-11df-bc70-00144f02aabe.shtml
     – it's in Italian, but the gist of it is that Mr Mucchetti really has it in for Google because they operate out of Ireland in the EU, whereas he believes they should be registered in Italy as a publisher, and subject to Italy's myriad rules, regulations, and, of course, taxes regarding publishing.  Despite, well, not really publishing much of anything themselves.  He mentions "tax evasion" charges that had been considered, because the Italian division of Google is not where the AdSense revenue in Europe goes.  I suppose he figures that since the ads are bought by residents of Italy, the money should somehow stay in Italy?  He also huffs and puffs about Italy's antitrust laws, which, in the same piece, he admits were created with the express purpose of not touching existing companies (the market share limit was set higher than the share of the largest existing company).  Perhaps he would do well to reflect on political schemes and carve-ups like that and think about why companies like Google go to Ireland, rather than Italy.  He also makes some quick mentions of network neutrality, and rambles on a bit about how it's a battle between the "Obamanian, Californian, search engines" versus the telecommunications industry, in "the rest of the world and above all in Europe".  And of course he uses a liberal sprinkling of keywords like "globalization", "multinational corporations", and "deregulated" to attempt to paint Google in terms of being a big, evil company throwing its weight around.  One wonders if there aren't more pressing problems with the Italian media industry, such as the prime minister owning a large chunk of it?

One way of seeing things is that politicians and businessmen in Italy noticed Google was actually making quite a bit of money, and even if they don't quite understand this internet thing, they want some of the loot.

And while Google certainly is becoming big enough to be cause for worry and discussion, the moves against them in Italy do not seem anything like a rational response calculated to offset severe failures in the market.

In any case, it will be interesting to see what happens.  Maybe, after China, we'll see Google quit Italy as well?

Flippa experiment

I decided to try a little experiment with Flippa.com, a site where you can auction off domains or web sites.

I put http://www.innsbruck-apartments.com up for auction:


http://flippa.com/auctions/83341/Innsbruck-Austria-rental-listing-site—Ski-Season

We'll see how it goes and whether the site is worth using for other sites that I'd like to sell.

It's a good test case, because it's a site I threw together years ago simply to aid our search for a new apartment in Innsbruck, and then kept up at the request of friends.

Rough Estimates of the Dollar Cost of Scaling Web Platforms – Part I

I have been pondering the idea behind this article for a while, and finally had a bit of time to implement it.

The basic idea is this: certain platforms have higher costs in terms of memory per concurrent connection. Those translate into increased costs in dollar terms.

Nota Bene: Having run LangPop.com for some time, I'm used to people getting hot and bothered about this or that aspect of statistics that are rough in nature, so I'm going to try and address those issues from the start, with more detail below.

  • Constructive criticism is welcome. I expect to utilize it to revisit these results and improve them. Frothing at the mouth is not welcome.
  • There is something of a "comparing apples and oranges" problem inherent in doing these sorts of comparisons. As an example, Rails gives you a huge amount of functionality "out of the box", whereas Mochiweb does much less. More on that below.
  • I am not familiar with all of these systems: meaning that I may not have configured them as I should have. Helpful suggestions are, of course, welcome. Links to source code are provided below.
  • You can likely handle many more 'users' than concurrent connections – 'concurrent connections' here means multiple browsers actively connected to the site at the same moment.
  • Programmer costs are probably higher than anything else, so more productive platforms can save a great deal of money, which more than makes up for the cost of extra memory.  There's a reason that most people, outside of Google and Yahoo and sites like that, don't use much C for their web applications.  Indeed, I use Rails myself, even though it uses a lot of memory and isn't terribly fast: I'd rather get sites out there, see how they do, and then worry about optimizing them (which is of course quite possible in Rails).

Methodology

All tests were run like so: my new laptop with two cores and four gigs of memory was used as a server, and my older laptop was used to run the ab (Apache Benchmark) program – they're connected via ethernet. I built up to successive levels of concurrency, running first 1 concurrent connection, 2, 10, and so on and so forth. The "server" computer is running Ubuntu 9.10, "karmic".
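
For the record, this is roughly the shape such a run can take when scripted; the following is a hedged sketch with a placeholder URL and server process name, not the exact procedure used to produce the numbers below.

    #!/usr/bin/env tclsh
    # Rough sketch of a driver: hammer the server at increasing concurrency
    # levels and record the total VSZ of its processes, as reported by ps.
    set url      "http://192.168.1.10:8080/"
    set procname "nginx"

    foreach c {1 2 10 25 50 100} {
        # Run ab at this concurrency level (its report is discarded here).
        exec ab -q -n [expr {$c * 200}] -c $c $url 2>/dev/null

        # Sum the VSZ (in KB) of all matching server processes.
        set total 0
        foreach pid [exec pgrep $procname] {
            set total [expr {$total + [string trim [exec ps -o vsz= -p $pid]]}]
        }
        puts "$c concurrent connections -> total VSZ: $total KB"
    }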

Platforms

The platforms I tested:

  • Apache 2.2, running the worker MPM, serving static files.
  • Nginx 0.7.62, serving static files.
  • Mochiweb from svn (revision 125), serving static files.
  • Jetty 6.1.20-2, serving static files.
  • Rails 2.3.5, serving up a simple template with the current date and time.
  • PHP 5.2.10.dfsg.1-2ubuntu6.3, serving up a single php file that prints the current date and time.
  • Django 1.1.1-1ubuntu1, serving up a template with the date and time.
  • Mochiweb, serving a simple template (erltl) with the date and time.
  • Jetty, serving a simple .war file containing a JSP file, with, as clever observers will have surmised, the date and time.

As stated above, it's pretty obvious that using Rails or Django for something so simple is overkill.

Better Tests for the Future

I would like to run similar tests with a more realistic application, but I simply don't have the time or expertise to sit down and write a blog, say, for all of the above platforms. If I can find a few volunteers, I'd be happy to discuss some rough ideas about what those tests ought to look like. Some ideas:

  • They should test the application framework with a realistic, real world type of example.
  • The data store should figure as little as possible – I want to concentrate on testing the application platform for the time being, rather than Postgres vs Sqlite vs Redis. Sqlite would probably be a good choice to utilize for the data store.
  • Since this first test is so minimalistic, I think a second one ought to be fairly inclusive, making use of a fair amount of what the larger systems like Rails, Django and PHP offer.
  • I'd also be interested in seeing other languages/platforms.
  • The Holy Grail would be to script all these tests so that they're very easy to run repeatably.

Results

With that out of the way, I do think the results are meaningful, and reflect something of what I've seen on various platforms in the real world.

First of all, here we look at the total "VSZ" (as ps puts it), or Virtual Size, of the process(es) in memory. Much of this might be shared between processes, via shared libraries and "copy on write" where applicable.

The results are impressive: Rails, followed by Django and PHP, eats up a lot of memory for each new concurrent connection. Rails, which I know fairly well, most likely suffers from several problems: 1) it includes a lot of code, which is actually a good thing if you're building a reasonably sized app that makes use of all it has to offer; and 2) its garbage collector doesn't play well with "copy on write", which is what "Enterprise Ruby" aims to fix. Django and PHP are also fairly large, capable platforms when compared to something small and light like Mochiweb.

That said, excuses aside, Erlang and Mochiweb are very impressive in how little additional memory they utilize when additional concurrent connections are thrown at them. I was also impressed with Jetty. I don't have a lot of experience with Java on the web (I work more with J2ME for mobile phones), so I expected something a bit more "bloated", which is the reputation Java has. As we'll see below, Jetty does take up a lot of initial memory, but subsequent concurrent connections appear to not take up much.  Of course, this is also likely another 'apples and oranges' comparison and it would be good to utilize a complete Java framework, rather than just a tiny web app with one JSP file.

So what's this mean in real world terms of dollars and cents? As your Rails application gets more popular, you're going to have to invest relatively more money to make it scale, in terms of memory.

For this comparison, I utilized the bytes/dollar that I'm getting for my Linode, which works out to 18,889,040.85 bytes per dollar ($79.95 for 1440 MB a month).
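
To make the arithmetic concrete, here's a back-of-the-envelope sketch using that figure; the per-connection memory number in it is an illustrative assumption, not one of the measured values from the spreadsheet.

    # Back-of-the-envelope version of the cost calculation, using the figure
    # above of 18,889,040.85 bytes per dollar ($79.95 for 1440 MB a month).
    set bytes_per_dollar 18889040.85

    # Suppose a platform adds roughly 40 MB of VSZ per extra concurrent connection:
    set bytes_per_connection [expr {40 * 1024.0 * 1024.0}]

    puts [format "dollars/month per additional concurrent connection: %.2f" \
              [expr {$bytes_per_connection / $bytes_per_dollar}]]
    # -> about 2.22 dollars a month for each additional concurrent connection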

As we can see, handling a similar number of concurrent users is essentially free for Mochiweb, whereas with Rails it has a significant cost.  This information is particularly relevant when deciding how to monetize a site: with Erlang and Jetty it would appear that scaling up to lots of users is relatively cheap, so even a small amount of revenue per user per month is going to be a profit, whereas with Rails, scaling up to huge numbers of users is going to be more expensive, so revenue streams such as advertising may not be as viable.  It's worth noting that 37signals, the company where Rails was created, is a vocal supporter of charging money for products.

There's another interesting statistic that I wanted to include as well.  The previous graph shows the average cost per additional concurrent user, but this one shows how much the platform costs when there is just one user, so it acts as a sort of baseline:

As we can see, Jetty is particularly expensive from this point of view.  The default settings (on Ubuntu) seem to indicate that, for instance, the basic $20 a month Linode package would not be sufficient to run Jetty, plus a database, plus other software.  I think that the Apache Worker number is off a bit, and may reflect settings made to handle a large number of connections, or perhaps a different MPM would make sense.

Source Code / Spreadsheet

The spreadsheet I put together is here: http://spreadsheets.google.com/ccc?key=0An76R90VwaRodElEYjVYQXpFRmtreGV3MEtsaWYzbXc&hl=en

And the source code (admittedly not terribly well organized) is here: http://github.com/davidw/marginalmemory/