Hosting, Commodities, and “The Cloud”

Anyone connected to the world of IT has been bludgeoned over the head with “cloud” news lately, to the point where the term has become vague and mostly a buzzword. There is, however, something behind the phenomenon, described in Nick Carr’s book The Big Switch: more computing power is being centralized in large data centers, where economies of scale come into play.

I had an interesting chat with Marco D’Itri a while back, about hosting and commodities. It’s clear that there is some commoditization going on, but he maintains that hosting is not a commodity. I’ve been thinking about it, and the conclusion I’ve come to is that some parts of the business are certainly commodities: disk space, memory, bandwidth, and processor cycles. Those things are, ultimately, what we want to buy when we buy ‘hosting’. However, the bits and pieces between those raw resources and the people who build on top of them (i.e. someone who runs a web site) are not really a commodity, yet. Customer service, for instance, might vary a great deal between providers. What will happen in that space? Will we see a baseline price for the commodities, with hosting resellers built on top of that offering different levels of service? I do think that the “mom and pop” hosting operations will gradually disappear in favor of larger data centers that can take advantage of economies of scale, though.

Also, as I’ve written about in the past, in Web Hosting – A Market for Lemons, there are some serious information asymmetry issues – how do you know the people providing your service are serious? How do you know they’re using good components that won’t break often, rather than cheap junk that will lead to frequent outages? If you have the resources, you can build a system like Google’s, where it doesn’t matter what fails, because the system just works around it, but the basic tools most of us are working with right now aren’t at that level.

I was reminded of the information asymmetry issue by an article written by the Dreamhost folks, Web Hosting’s Dirty Laundry, which describes how they caught a ‘review’ site trying to get money from Dreamhost in exchange for positive reviews – interesting in light of the lemon problem. Wikipedia lists some criteria for a lemon market (http://en.wikipedia.org/wiki/The_Market_for_Lemons#Criteria), including the following:

Deficiency of effective public quality assurances (by reputation or regulation) and/or of effective guarantees / warranties

If it’s difficult to get real, honest, impartial reviews of hosting services, that is a push in the direction of ‘lemons’. Of course, it’s not impossible to get this information, but it seems that a lot of us still go by “hearsay” – what others we know use and report to be ok. To compare it with another product: I’d probably ask around among friends prior to purchasing a new car, but whatever I got would likely work ok, even if it wasn’t the absolute best. On the other hand, the wrong hosting provider might very well be a “fly by night” operation that leads to a lot of downtime, so I’m far more likely to listen to what other people have to say, and to be far more cautious about buying “any old thing”.

Opinions? Comments? Thoughts? Where do you see this industry going?

Slicehost vs Linode

Update: since I gathered these statistics, Linode has upgraded their memory offering even more, which means their offer is even better than what I show here. As of summer 2010, Slicehost's offering really isn't very competitive, unfortunately. I continue to hear that they provide great service, but so does Linode.

Since hosting is *the* major expense for http://www.dedasys.com, and obviously a critical part of much of what I do, getting the right one is very important. Naturally, "the right one" for me may not be the right one for everyone. I am a fair sysadmin for small numbers of machines, so I don't mind doing that myself – I don't need, nor want to pay for hand holding.

Since this post has become fairly popular, I am going to link to my Linode affiliate URL if you'd like to sign up with Linode – thanks! Other than running my sites there, I have no other affiliation with Linode. And, if you're more a fan of Slicehost, here is my referral page on Slicehost.

I recently moved everything over to [Slicehost](http://www.slicehost.com) on the recommendation of friends, and so far I'm fairly happy with the experience. It's cheaper, simpler, and more flexible than Layered Tech was. I also like the fact that Slicehost is run out of St. Louis, Missouri, rather than some really expensive tech center like San Francisco or Boston. Hosting is basically a commodity, and much like you wouldn't want to put a factory in San Francisco, you don't want your hosting there either.

However, I discovered something annoying about Slicehost: they use x86_64 servers, which, per se, doesn't really matter to me – I use open source code that can run on any number of architectures. The problem is that this particular architecture uses more memory than plain old x86. Significantly more. My Layered Tech server with 1 gig of memory was hitting the swap space a bit, but the same code on a slice was swapping quite heavily, despite the fact that I'd moved PostgreSQL off to its own slice. Since I pretty much exclusively run Rails, I decided to look into Phusion's "Ruby Enterprise Edition", which is basically some nice hacking on Ruby's garbage collection mechanism. What they've done is nice, and I may end up using it, but ultimately it buys me space that I would also gain from simply moving back to x86.

With that in mind, I decided to take another look at what appears to be emerging as one of Slicehost's main competitors, [Linode](http://www.linode.net), who *do* use x86 servers. Here are my results, which are admittedly not all that scientific, but what the heck – you're getting them for free, and they were pretty quick to whip up.

### Service, Support, Setup

Before we begin with the numbers, I want to add a few words of caution. One of the big things about hosting, to me at least, is how a provider deals with unexpected, random, negative events: connections going down, disk breakage, DoS attacks, and so on.
It's really hard to get an idea of just what sort of people you're dealing with until you've "ridden the river together". I don't really know what sort of response times, uptime, and so on either Slicehost or Linode have, so there are some potentially big intangibles that are not as easy to draw pretty charts for.

In terms of the console/setup/management tools, I liked Slicehost's simplicity more – Linode gives you more options, but they feel a bit fiddly and cluttered at times. For instance, Linode lets you pick how much swap space to give your disk image, but doesn't include a bit of javascript to balance out the swap and regular partitions, so if you type in a larger swap number, you have to do the math and subtract it from the other number yourself. Annoying, but not a big deal. Linode lets you pick between various data centers in the US (no Europe, yet). Slicehost gives you the option to do backups of your disk images, which is nice, and something that Linode lacks.

### Comparing plans

Since I'm interested in comparing hardware and machines I actually have access to, here are the plans I am currently signed up with, and the salient numbers:

|                     | Slicehost 1024 | Slicehost 512 | Linode 720 |
|---------------------|---------------:|--------------:|-----------:|
| Memory (MB)         | 1024           | 512           | 720        |
| Bandwidth (GB)      | 400            | 200           | 400        |
| Disk (GB)           | 40             | 20            | 24         |
| Cost ($/month)      | 70.00          | 38.00         | 39.95      |
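For concreteness, here's a small Ruby sketch that turns the plan table into per-dollar figures (the numbers come straight from the table; the formatting is my own):

```ruby
# Rough per-dollar numbers for the plans in the table above.
# Memory in MB, bandwidth and disk in GB, cost in dollars per month.
plans = {
  "Slicehost 1024" => { :memory => 1024, :bandwidth => 400, :disk => 40, :cost => 70.00 },
  "Slicehost 512"  => { :memory => 512,  :bandwidth => 200, :disk => 20, :cost => 38.00 },
  "Linode 720"     => { :memory => 720,  :bandwidth => 400, :disk => 24, :cost => 39.95 },
}

plans.each do |name, p|
  printf("%-15s %6.2f MB/$  %5.2f GB bandwidth/$  %4.2f GB disk/$\n",
         name,
         p[:memory] / p[:cost],
         p[:bandwidth] / p[:cost],
         p[:disk] / p[:cost])
end
```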

Memory per dollar, bandwidth per dollar, disk per dollar: Linode mostly comes out ahead, but not by that much, except for bandwidth. However, let's take a crude look at x86_64 vs x86 memory usage. First, `./script/console` (a Rails console) for the same code base on two different machines, showing the virtual memory size (VSZ) and resident set size (RSS). And just to look at something a bit smaller, I ran the following script:

```ruby
puts File.open("/proc/self/status", "r").read
```

Being charitable with the numbers I got back, x86 takes up 77% of the memory that x86_64 does. So let's add that factor to our comparison of how much memory you're getting per dollar on Slicehost and Linode: wow! That's a fairly significant difference.

### Performance

That's most of what we can glean from published numbers and a few simple experiments. However, there's another important factor: performance. "Virtual Private Server" systems, or simply VPSs, got a bad name in the past because they were "overbooked" – too many virtual servers competing for the same CPU resources on one machine. Slicehost and Linode both look like they want to avoid that kind of bad reputation, and so far all the systems I have used have felt snappy and responsive.

Now here's where we get really unscientific… I decided to try and do *something*, even if it wasn't much, to get some kind of objective measurement of what kind of CPU I was getting on the machines. This is really unscientific, because the machines have other duties (I don't have the cash, time, or inclination to set up servers just for testing), and of course, who knows what else is sharing the computer with my systems: if you land on a relatively unused computer, you can apparently pick up extra cycles for yourself, over and above your guaranteed minimum. But c'est la vie, and I decided to get some numbers anyway.
I picked a C implementation of mandelbrot from the [language shootout](http://shootout.alioth.debian.org/u32q/benchmark.php?test=mandelbrot) more or less "for the hell of it" (I'm going to keep repeating "unscientific", and invite someone to do better tests than I have). I ran this code every 20 minutes for a couple of days, and then averaged out the run times. As expected, the higher-powered Slicehost 1024 machine wins out in terms of raw speed (less time is better), but not by that much over the Linode system. Indeed, when we factor in the relative prices, Linode comes out ahead again. So it looks like we're not paying a CPU penalty for having more memory, bandwidth, disk, and so on.

### Notes

* Nothing in these statistics is indicative of future directions that Linode and Slicehost might take. Slicehost was recently acquired by Rackspace, and that could affect the service they provide. Maybe one or both will come under financial pressure and start "overbooking" their servers.
* The spreadsheet I used is available via Google Docs, for the curious. I did most of the work in OpenOffice and then uploaded it, because working with the web spreadsheet was kind of painful. Unfortunately, you can't select non-contiguous data ranges in the Google spreadsheet, so the labels are interspersed with the data. Yuck. Also, Google Docs doesn't handle 'time' values nicely, which is what the performance data originally was. The spreadsheet is here:
* The standard deviation of the processing times is higher for both slices than for the Linode system. I'm not quite sure what that means in terms of what's going on under the hood, but I thought it was interesting.
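For reference, the every-20-minutes timing loop described above is simple to reproduce. Here's a rough Ruby sketch of the kind of harness I mean; the `./mandelbrot` binary name, its size argument, and the log path are placeholders, and the scheduling is left to cron:

```ruby
# Time one run of a benchmark binary and append the wall-clock seconds
# to a log file; schedule it with a crontab entry such as:
#   */20 * * * * ruby /home/user/bench/time_mandelbrot.rb
# BENCH, SIZE, and the log name are placeholders for your own setup.
bench = ENV.fetch("BENCH", "./mandelbrot")
size  = ENV.fetch("SIZE", "4000")

start   = Time.now
system("#{bench} #{size} > /dev/null 2>&1")
elapsed = Time.now - start

File.open("mandelbrot-times.log", "a") do |f|
  f.puts(format("%s %.2f", Time.now.strftime("%Y-%m-%d %H:%M:%S"), elapsed))
end
```

Averaging the logged times afterward is a one-liner in any spreadsheet, which is what I did.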

Slicehost Migration

Phew… that was a lot of work. I have a bunch of small sites that I migrated to slicehost, and at the same time upgraded to Rails 2.1. Here are my notes:

  • x86_64 sucks up memory in a bad way. Compare the `ps aux` output (the fifth and sixth columns are VSZ and RSS, in kilobytes) for the same Ruby process on a 32-bit x86 system:

    davidw 20089 0.0 0.1 3320 1428 pts/1 S+ 11:41 0:00 ruby

    and on x86_64:

    davidw 24197 0.0 0.1 16756 1968 pts/0 S+ 18:33 0:00 ruby

    Ouch! That means you’ll need more memory – and thus more money – to run code that might have fit decently in a similar amount of memory on an x86 machine. For instance, my old server had a gig and was running low on memory, so on Slicehost I set up one 512 meg slice to handle PostgreSQL, and a gig slice to handle Apache and Rails. The database slice is performing ok, but the ‘web’ one is kind of iffy in terms of memory. Part of the problem, I’ll admit, is that Rails is a memory hog, but still: I had hoped to come out ahead by moving from a machine with 1 gig to two machines with 1.5 gigs between them, and instead I’m more or less where I was before.

  • I got caught up in a bit of a mess with Rails and the Ubuntu Intrepid update. I prefer to manage the Ruby stuff with gems rather than traditional .deb package management, as I’ve had better luck with that in the past. I do use some of the system packages, though, such as libpgsql-ruby1.8. However, the maintainers did something kind of ugly with that package: they swapped out the actual code from one code base to another – “Switched upstream source from ruby-postgres to ruby-pg”. That changes the way you load it, and perhaps a few other things as well. I think I would have been happier seeing the package renamed or something, as all of a sudden, programs stopped working.

  • I really like the ability to resize slices. I started both my slices at the 256 size, and then made them bigger when I was ready to put them into production. The process is not instantaneous, but it’s pretty handy if you need more capacity in a hurry. Adding new slices is also fast.
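The ruby-postgres to ruby-pg swap in the second bullet bit me at `require` time. A defensive load along these lines copes with either packaging (a sketch; adjust to whatever your code actually needs from the driver):

```ruby
# Cope with the ruby-postgres -> ruby-pg switch: try the new library
# first, and fall back to the old one if it isn't installed.
begin
  require "pg"          # ruby-pg, the newer upstream
rescue LoadError
  begin
    require "postgres"  # ruby-postgres, the older upstream
  rescue LoadError
    warn "no PostgreSQL driver found; install the pg gem or libpgsql-ruby"
  end
end
```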

So, all in all, I’m happy with the move, although the memory issue is irritating.
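As an aside, you can check memory numbers like the ones above from inside a Ruby process itself, using the same `/proc/self/status` file I poked at in the Linode comparison. A small, Linux-only sketch:

```ruby
# Print this process's virtual memory size and resident set size,
# as reported by the Linux /proc filesystem (VmSize and VmRSS, in kB).
File.read("/proc/self/status").each_line do |line|
  puts line if line.start_with?("VmSize", "VmRSS")
end
```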

Slicehost vs Layered Tech?

I’ve been a happy Layered Tech customer for a number of years. After several terrible experiences with hosting companies that didn’t charge much, which were the inspiration for Web Hosting – A Market for Lemons, I found that LT offered good, basic service at reasonable prices. My first server there cost $70 a month, and handled what I needed it for with aplomb.

Fast forward to now: LT no longer has servers under $150 a month, and while they’re nice machines, I miss being able to get something a little bit cheaper, and am considering Slicehost.

The real distinction between the two is real, physical machines vs VPSs (Virtual Private Servers). The latter earned a bad reputation in the past, because many providers ‘overbooked’ the machines that their clients’ VPSs ran on. I had some negative experiences with that myself, prior to seeking out a ‘real’ machine to run my web sites on. However, I’ve heard that people are reasonably content with Slicehost, so perhaps they’re running a tight ship. For those who have tried them: how is the speed/latency of their offerings compared to a more or less ‘equivalent’ real machine? The positive side of a well-planned VPS is that you can quickly switch between configurations, giving you a bit of room to grow if you plan things right, which might let me save some money.

Incidentally, something that I like about both LT and SH is that they’re not in the California Bay Area, which is a really expensive place to run what isn’t exactly a “rocket science” business. Sure, you want good, solid, smart people, but there’s no reason to be in such an expensive part of the country.

Thoughts? Opinions?