Alarming number of spam false positives in Gmail

This morning, I was checking into some unfinished work on my side project, LiberWriter, and noticed that I hadn’t seen several emails.  I checked Gmail’s spam folder and found around 15 emails that had been dumped there, without anything obviously wrong with them.  Several were from customers wondering why they hadn’t heard anything from us.

Worried, I started digging further.  Here’s what I have found so far:

  • An email from the people at work regarding payments to me.  This is important stuff!
  • An important personal email.
  • Emails from JIRA at work.  I get a bunch of these every day, and while they are not high priority, not seeing them is potentially damaging for me and the company.
  • Several emails to the erlang-questions mailing list that were fairly typical and topical.
  • Update: I just found a large number (tens) of emails from the Apache Software Foundation.
  • Update 2: Another personal email, not that important, but not at all spammy.
  • Update 3: Numerous emails from the tcl-core and Postgres ‘general’ mailing lists.

Several of these things were pretty easy to search on, but I am now very worried that there may be many others which I am missing.  I do receive a lot of spam, and in the past, the Gmail team has done an excellent job of filtering things, which is why I have happily recommended it.  Google employs a lot of extremely intelligent and capable people, so the thought that they’re hard at work on the spam problem has always been a good one.

I don’t know what has changed, but seeing all those emails discarded in spam gave me a bad scare and has got me really worried.  I live in email… and if I can’t trust it, I’m in big trouble.

Ironically, I am a paying customer of Google as of a few days ago, in order to have extra storage space.

Update 4: I got in touch with someone from Google, who said that there was indeed a problem and they are working to push a fix soon.  I’m really glad to hear that!

I Default to Postgres

With a startup or new project of any kind, you’ll face many uncertainties.  One thing I always count on is the Postgres database.

I like to experiment with new things.  I have a particular weakness for programming languages, but over the years, I’ve always come back to Postgres for data storage.  It’s solid, reliable, flexible, and I know I can take it in whichever direction I need.

I think you have to pick your battles – no one can do it all. You need to strike a careful balance between trying out new stuff, so that you don’t miss out on new and improved ways of doing things, and avoiding churn generated by blindly jumping to new technologies.

I’ve always been happy to try out new programming languages or frameworks, but I like to be able to count on my data, so I’ve always come back to Postgres.

You can do a lot with Postgres, and its capabilities slowly but surely expand, year after year.  One of the nice things is that new capabilities are pretty solid when they come out, rather than feeling like rough concepts waiting to be polished.

Postgres also has a great community, of thoughtful, helpful, talented people, which makes dealing with it that much more pleasant.

Erlang and unreliable resources

Here’s something you should be aware of if you’re using Erlang in a large system.  It’s a pity the core team hasn’t included something like it in the Erlang distribution itself, or even documented it, because it’s something I think you’ll eventually need if you build a big enough system.

In a large system, you’ll have components that should be up and running – but at some point might not be.  For instance, a web site with a database.  If you write your Erlang code to just “let it crash” because the web server can’t connect to the database – just like you read in the books – then your whole site will fall over, web server and all.  At this point, you can’t even show your visitors a message explaining that “we’re experiencing problems, please try again later”, because the problem has propagated throughout the system, bringing it all down.

Another scenario might be a machine with a user interface and some fragile hardware.  The machine can’t do its job without the hardware, but if the hardware is broken, the software should not crash completely!  The user interface needs to stay active, letting users know that the machine is broken, and perhaps offering some diagnostics or some steps to try to correct the problem.

What these have in common is that there are some components of the system that should always try to be available even if they are not 100% functional, and there are other bits and pieces that may be necessary for the correct functionality of the system, but should not cause it to become entirely unavailable when something goes wrong.

The term for this is “circuit breaker”, because it breaks the chain that is part of a normal OTP system, where enough failures of a worker lead to a supervisor crashing, and its supervisors crashing in turn, after enough failures, and so on up the chain.  One high-quality implementation is located here: – and politely points to some other, similar implementations.  I’m a bit frustrated that I found this only recently, because it’s a nice bit of code that strikes me as quite likely to be used somewhere within any sufficiently large Erlang system.

As a footnote, I also created something similar, although it’s not battle tested (I just released it, actually!) and operates at a different level: – but it might be useful for some people.
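
To make the circuit breaker idea described above a little more concrete, here is a minimal sketch.  It is written in Python rather than Erlang, and it is not the API of either library mentioned in this post: the names CircuitBreaker, CircuitOpenError, render_page, max_failures and reset_after are all made up for illustration.  It only shows the basic idea: after enough consecutive failures, stop calling the fragile resource for a while and fail fast, so the part of the system that talks to visitors can stay up.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is skipped."""


class CircuitBreaker:
    """Hypothetical, minimal circuit breaker (illustration only)."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # seconds to wait before a trial call
        self.clock = clock                # injectable, which makes testing easy
        self.failures = 0
        self.opened_at = None             # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                # Fail fast instead of hammering the broken resource.
                raise CircuitOpenError("resource unavailable")
            # Cool-down elapsed: let a single trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.max_failures:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            raise
        # Success: close the circuit and forget past failures.
        self.failures = 0
        self.opened_at = None
        return result


def render_page(breaker, query_database):
    """The web tier degrades gracefully instead of crashing with the database."""
    try:
        rows = breaker.call(query_database)
    except Exception:
        # Breaker open, or the query itself failed.
        return "We're experiencing problems, please try again later."
    return f"Found {len(rows)} rows."
```

In an OTP system the equivalent logic would sit between the always-available part (the web server, the user interface) and the unreliable part, so that repeated failures of the latter never exhaust a supervisor's restart limit.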

When to use AWS – and when not to

Amazon’s “web services” – AWS for short – have made quite a splash in the developer community.  It’s pretty impressive how much horsepower you can harness in a very short amount of time.

However, it’s not the most economical solution, nor the simplest.  I think many startups are blinded to these facts by dreams of scaling up (“to infinity and beyond!”), when they should be concentrating on growing their business a bit at a time, and would be better off with something else, like Linode (yes folks, that’s a referral link!), which I use, or Digital Ocean, Hetzner, or one of the many others.  If you’re a Silicon Valley company fueled by venture capital and need to get really big really fast, this advice might not apply to you.  Otherwise, read on.

Here’s where I would consider using AWS from the beginning of a new project:

  • You absolutely positively know you’re going to have to scale horizontally from the start.
  • You need massive amounts of storage and want to stay close to Amazon’s S3.  Companies that deal in a lot of images or videos or something might need this.
  • You know you are going to need to start and stop services as demand rises and falls.  This sounds cool, but it’s not easy, so if you aren’t positive you need it, it’s premature optimization and you want to avoid it.  An example of this might be a company that gets requests to process large amounts of data: the request comes in, they start up an EC2 instance, farm out the job to it, and then shut it down when the results are ready.  This isn’t really something you can do in the time frame of an HTTP request though: the time to start a new instance is most likely measured in minutes.

There are probably other cases where some of Amazon’s services make sense, but where I come from, in the land of smaller, bootstrapped companies, worrying about “web scale” is very likely the wrong thing to be doing.  You want to get your infrastructure set up and running, get paying customers, figure out your market, understand which features they need, and so on.  If you’re making money by charging customers for your product, you can probably scale vertically by moving to larger instances, or do simple things like splitting the DB server from the web server, before you have to worry too much about scaling problems.

The actual dollar difference in what you spend is something to keep in mind if you’re watching your expenses carefully, but there’s also a complexity factor: AWS instances can be kind of complex to set up and run in terms of all the bits and pieces and understanding how they fit together.  Something more basic is just a regular old Linux server that you ssh into and run.  It’s pretty straightforward, meaning it’s less of a hassle.  As a small or bootstrapped company, or individual, you want to avoid spending too much time on complex stuff that is not likely to pay off for a long time.

Amazon is a Big Company and is not going to pay much attention to a small fish like you or me – you’ll likely get better support from someone who makes their money helping smaller companies and individuals.

You don’t want the absolute cheapest hosting service, because you do get what you pay for, and what you pay for is not for when things are running smoothly, but when things go bad.  I’ve always had good service and support from Linode, but I’m sure others are good too.  Ask around to see what people are saying.

So: if you’re a small company, don’t know how much you need to scale, or how you might need to scale, do not worry about it or try to “think ahead” too much: save the money and get something better suited to your needs.  When you do need to scale – if you reach that point – you can do so with a much better idea of exactly how to scale your own business in a way that makes sense.

Testing is a Pareto thing

If you work on rocket engine software or some kind of life-critical software, you can ignore this post.

Still here? I wanted to sum up my thoughts on testing, as someone who works on fairly ordinary software from day to day.

I think testing is very beneficial, and software without tests is not much fun to work on, because you’re afraid of breaking it, and might not even know that you’ve broken it!

However, I also agree with DHH’s post on test driven development: that the rhetoric has gone too far in some circles.

I think it’s best viewed as a graph:

Initially, adding a little bit of testing gives you big benefits: first and foremost because now you have a test framework to hang new tests on, which is a great way to grow it.  As you discover bugs, add tests to ensure they never happen again!  This keeps the test suite growing organically, and gives the tests a sense of purpose, rather than just slogging through the whole codebase, adding tests for everything.

As you add more and more tests though, the returns on time/coverage/effort probably begin to flatten.

This is – very roughly – in line with the Pareto principle, also known as the 80/20 rule.  In the case of testing though, my feeling is that it’s often better to skip that difficult 20% part – unless you have a lot of extra time on your hands.
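
Here is a rough sketch of the kind of curve I have in mind.  The formula is purely an assumption chosen for illustration, not a measurement of anything: benefit is modeled as 1 - exp(-k * effort), with k picked so that 20% of the effort yields 80% of the benefit.  The names K, benefit and text_plot are made up for this example.

```python
import math

# Assumption: benefit follows 1 - exp(-K * effort), with K chosen so that
# 20% of the effort yields 80% of the benefit.  Illustrative only.
K = -math.log(0.2) / 0.2


def benefit(effort):
    """Fraction of total benefit (0..1) for a fraction of total effort (0..1)."""
    return 1.0 - math.exp(-K * effort)


def text_plot(steps=10, width=40):
    """Return one line per effort level, with a bar proportional to the benefit."""
    lines = []
    for i in range(steps + 1):
        effort = i / steps
        bar = "#" * round(benefit(effort) * width)
        lines.append(f"{effort:4.0%} effort | {bar}")
    return lines


if __name__ == "__main__":
    print("\n".join(text_plot()))
```

Under this assumed model the bars grow quickly at first and then barely move, which is the flattening described above.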

Like most everything else about software development, it’s driven by the economic necessities of the people and businesses involved, and therefore there are compromises involved.  Aiming for the maximum amount of benefits with the least amount of coverage is the sweet spot for testing.

People, Places and Jobs

Paul Graham has caused a stir with a recent essay that favors immigration of high-skilled programmers:

I don’t think the essay itself was one of Graham’s better efforts, but mostly because it is not bold enough in making the case for immigration. However, the reactions to it on places like Hacker News are dismaying on a variety of levels, both intellectual and moral.


Many of those railing against the idea of letting in more programmers (or workers in general) have a simplified view of the economy, in which there are a fixed number of jobs to go around.  10,272, say.  And if you let in more people, there will be more competition for those jobs, and therefore lower wages.  The people arguing against immigration are afraid that they won’t make as much money.

This is known as the lump of labor fallacy.  It is zero-sum thinking in which for one to gain, another must lose.  It is wrong, with the simplest example being that there are clearly more jobs and more money in the US economy now than 100 years ago.  Currently, there are approximately 320 million people in the United States.  100 years ago, there were roughly 100 million.  Clearly, something happened for the country to have gained people, jobs, and money: the economy grew.  Graham points out that the economy and business are generally a positive-sum game in one of his earlier essays:

Suppose you own a beat-up old car. Instead of sitting on your butt next summer, you could spend the time restoring your car to pristine condition. In doing so you create wealth. The world is– and you specifically are– one pristine old car the richer. And not just in some metaphorical way. If you sell your car, you’ll get more for it.

In restoring your old car you have made yourself richer. You haven’t made anyone else poorer. So there is obviously not a fixed pie. And in fact, when you look at it this way, you wonder why anyone would think there was.

Immigration is also not a zero-sum game – everyone who comes to a country and works also spends money that then goes back into the local economy, in addition to paying taxes, and most likely contributing in other ways.  Many immigrants go on to create companies themselves, which in turn hire more people.  Companies founded by first or second generation immigrants include Google, Apple, AT&T, eBay, IBM, and countless others.  People with the drive to get up and go someplace new tend to be, on average, a bit more entrepreneurial than those who stay put.

Most economic research points to the fact that immigration does not depress wages, indicating that the fear of adverse economic effects of immigration is not well-founded.  The article is worth reading carefully, because it goes into the details and links to the actual research.

If you’re a programmer, you’ve probably heard of PHP, Ruby on Rails, Java, C#, C++ and Python, just to grab a few projects at random.  These were developed by people who are or were, at some point, immigrants (they did not necessarily create these things while in the US).  We would all clearly be worse off without these people and their contributions.  Or maybe they would have moved somewhere else.  The sun does not revolve around the United States, and given enough bad policy decisions, people can and will go elsewhere to found their companies.  Looking at open source software shows that Graham is on to something with regard to the distribution of talented programmers: many, if not most, open source contributions and projects originated outside the US.  Making it difficult and unpleasant for those people to move to the US does no one any good.

Even more average programmers are a positive thing, though: not all of us are brilliant inventors, and there is plenty of room in the economy for those of us who work on all kinds of projects utilizing the tools created by the most exceptional among us.  Having more people available lets entrepreneurs try more new, inventive and creative businesses.  It’s beneficial for everyone.

If you gave the government the job of trying to determine who the “great” programmers are, and who are merely normal ones, how would that work out in practice?  Keep in mind the immigration folks don’t just deal with programmers, so theoretically they’d have to be responsible for judging “great” people in every field out there, which sounds like a very difficult task under the best of conditions, let alone for harried, overworked bureaucrats.

There are something like 11,000 babies born each and every day in the United States.  With 135,991 H1B visas granted in 2012, that’s only 12 days of babies. Granted, those children won’t compete for jobs right away, but eventually they will.  And yet very, very few people loudly complain about other people having children, because children, just like immigration, do not harm the economy.
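
For anyone who wants to check the arithmetic, here is a small sketch using only the two figures quoted above.  The constant and function names are made up for this example, and the 365-day year used for the annual share is my own simplifying assumption.

```python
BIRTHS_PER_DAY = 11_000    # "something like 11,000", as quoted above
H1B_VISAS_2012 = 135_991   # figure quoted above


def days_of_births(visas=H1B_VISAS_2012, births_per_day=BIRTHS_PER_DAY):
    """How many days of US births equal one year of H1B visas."""
    return visas / births_per_day


def share_of_yearly_births(visas=H1B_VISAS_2012, births_per_day=BIRTHS_PER_DAY):
    """Visas as a fraction of a year of births (assumes a 365-day year)."""
    return visas / (births_per_day * 365)


if __name__ == "__main__":
    print(f"{days_of_births():.1f} days of births")
    print(f"{share_of_yearly_births():.1%} of a year of births")
```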

Another idea I see often in these discussions is: “we should auction H1B visas to the companies willing to pay the highest wages”.  Auctions can be a sensible way of allocating resources, but with a purely arbitrary number of visa slots to auction off (is there any statistical or research-based reason why the numbers are what they are?), you’re not really improving things a great deal, due to the artificial limit (in programming, we often call numbers like these “magic numbers”, and not in a good way).  Because some professions simply pay more than others, you also risk starving some professions of talented immigrants entirely.

A few people make the argument that “brain drain” harms the countries people immigrate from by depriving them of their best and brightest.  In this day and age, however, it’s easy for people to make some money, gain some experience, and then return to their countries of origin, not to mention the huge remittance economy that is a lifeline for many families “back home”.  Everyone is better off.  Some workers are simply not valuable at home, so moving allows them to be more productive and make more money.  Consider a fashion designer from a rural town in Oklahoma: forcing that person to stay where they were born is not going to create a fashion industry in that town, so letting that person go to Milan, Paris or New York is going to make everyone better off.

One sensible point people make in criticizing the H1B program is that it does not offer immigrants a lot of bargaining power.  If they were a bit less beholden to their employers, they could more confidently bargain for higher wages.  It is possible for someone with an H1B to change jobs, but you really need to “grasp the next vine before you let go of the previous one”, or you might get in trouble with your visa status.


People living in a country founded by immigrants trying to keep out other people hoping for a better life for themselves and their families is questionable, at best, and very ugly at its worst.  In reading the reactions to Graham’s essay, I would not ascribe blatant, outright racism to most people commenting, but there’s certainly an ugly undercurrent of “us” and “them”.

My own background certainly colors my views, and makes “us” and “them” far more nebulous concepts than they appear to be for some: I was born in the US, and live in Italy.  My wife is Italian, as are many of my friends.  I work at an Italian company.  I do not see these people as inherently different than people who happen to have been born in the same country I was.  They’re people, just like people anywhere.  If they wanted to go to the US, work, and obey the laws, I don’t see any reason for them not to.  I don’t see these people as a “them”: they’re my friends, family and colleagues.  I can’t support policies that would keep out my Italian friends in order to “protect” people who happen to be born in the same country as me, with whom I may have absolutely nothing in common other than a passport.

If someone can, just by going to a different place, make more money and live a better life, there’s a strong moral argument in favor of letting them do so.  To consign someone to a life less fulfilling than it could be because they happen to be born on the wrong side of a line seems unjust to my way of thinking.

This is especially true because none of us has any say in the matter of where we are born: it’s a genetic lottery that those of us born in wealthy countries were lucky enough to win.  It’s the least we can do to let people from elsewhere have a shot at a better life.  And those from other “1st world” countries?  What’s the harm in letting them in if they follow the rules?  If they don’t like it, they’ll probably go home where there’s more of a social safety net than in the US.

There’s also the matter of freedom: If people agree to abide by the laws of a country, and do so, why should we tell them ‘no’?  Believers in liberty should support others’ liberty to live and work where they wish.

Because of the positive-sum economics of immigration, it doesn’t cost us anything to let people in, and it might make the immigrants’ lives a lot better.  Best of all, this helping hand for our fellow human beings is not an act of charity: it’s through their own work that these people will be better off in addition to contributing to the local economy.  Everyone wins.

I recently read a book first published in the 1960s, called “Business Adventures” (my review is here), with one quote from the boss of Xerox about unions and their relationship to minorities that struck me as relevant to immigration as well:

For example, we’ve tried, without much fanfare, to equip some Negro youths to take jobs beyond sweeping the floor and so on. The program required complete coöperation from our union, and we got it. But I’ve learned that, in subtle ways, the honeymoon is over. There’s an undercurrent of opposition. Here’s something started, then, that if it grows could confront us with a real business problem. If it becomes a few hundred objectors instead of a few dozen, things might even come to a strike, and in such a case I hope we and the union leadership would stand up and fight. But I don’t really know. You can’t honestly predict what you’d do in a case like that. I think I know what we’d do.

At one time, it was considered acceptable to keep black people out of ‘good jobs’.  Attempting to keep foreigners out of jobs seems broadly similar to me: it’s immoral and based on bad economics.

If someone is willing to obey the law and work, they ought to be welcome in a country of immigrants.

Further Reading

The optimal number of immigrants

Homelands: The Case for Open Immigration

Perfect software versus economic reality

“Why, oh why, is the state of software so bad, so buggy?” reads a common lament on forums frequented by programmers.  As programmers, we’re a perfectionist bunch, and it bothers those of us who care about quality to see things work poorly.

Why indeed?  The answer is almost entirely dictated by economics.

Imagine that you’re a customer looking to have a custom web site / application built to help manage your small local bookstore.  It’d be bad if the system lost all your data, but other than that, minor bugs are tolerable, and your business just doesn’t make a ton of money in the first place.

You talk to two local shops that might be able to code up what you need:

  • One offers to write code they guarantee to be free of bugs.  It doesn’t have many features, but would probably get the job done.
  • The other company can get it done for a bit less, with many more features than the first one.

You ask the first vendor about the extra features, and they say they might be able to get to them over time.

Since those extra features would be handy, you opt for the second vendor.  Once in a while, it does have an annoying bug or two, but the programmers do try hard, and manage to fix most of them in a few weeks, and none of them cause you any major problems.

The above anecdote is how much of the market for software works: everyone says they want high quality, defect free code, but how many people are willing to pay for it with either much higher prices, or a reduction in features?  It depends on what the software is used for, but by and large: not many.  They’d rather have the features, or pay less, and tolerate the odd bug here and there.

Everyone’s heard the old adage, “good, fast or cheap, pick any two”, which is very closely related to another triad: “lots of features, few bugs, cheap”.  You simply can’t develop something quickly or on the cheap, and have both lots of features and very few bugs.  You trade off features for time spent on ensuring there are no bugs, and since eliminating bugs is something that seems to asymptotically approach, but never quite achieve, perfection, the more time you spend on that, the fewer features you have.

This suggests that if you want to work where people pay a lot of attention to quality, you should work where even small bugs cost a lot of money, or where they put people’s lives in danger.  Those are the sorts of environments where quality will count.