Protecting your backend from jargon-laced error messages
Code that goes wrong generates errors. It’s typical for developers to spend their time on the code, not on the error messages. This is a plea to put a little more effort into those messages, even if you think no one will see them.
Unknown attackers ambushed the Boston Globe site with a DDoS (distributed denial of service) attack yesterday. I was checking a link to an article there when my browser displayed this instead:
At the time, I had no idea about the attack; I just wondered what was going on. I smiled and posted it on Facebook, where it received any number of unhelpful and smartass remarks. I also checked bostonglobe.com on isitup.com and confirmed that the problem was with the Globe site, not with my computer, my browser, or my connection.
Deconstructing an error I never should have seen
How did I end up seeing this dog’s breakfast of jargon? And what does it mean?
First off, this error is generated by the site’s server. We see site errors all the time, mostly 404 errors that indicate that a Web address references a page that doesn’t exist, typically the result of a broken link or a typing error. Sites know that you’ll see these pages from time to time, so they generate error pages that range from informative to humorous. Here’s a good one:
Anyone browsing the Web has seen 404 pages before. But what’s a 503? Here’s the official description from the HTTP/1.1 specification (RFC 2616), which the W3C hosts on its site:
The server is currently unable to handle the request due to a temporary overloading or maintenance of the server. The implication is that this is a temporary condition which will be alleviated after some delay.
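For the curious, here is a minimal sketch in Python of what the start of such a response can look like on the wire. It uses the standard library's `http.HTTPStatus` for the code and its reason phrase. The helper name `unavailable_response_head` and the 120-second `Retry-After` value are assumptions made up for illustration; they are not taken from the Globe's site or from Varnish.

```python
from http import HTTPStatus


def unavailable_response_head(retry_after_seconds):
    """Return the status line and a Retry-After header for a 503 response.

    Hypothetical helper for illustration only.
    """
    status = HTTPStatus.SERVICE_UNAVAILABLE
    return [
        f"HTTP/1.1 {status.value} {status.phrase}",
        # Retry-After is optional; it hints at how long the "temporary condition" may last.
        f"Retry-After: {retry_after_seconds}",
    ]


for line in unavailable_response_head(120):
    print(line)
```

The `Retry-After` header is the machine-readable version of "try again later," which is the one useful thing a 503 can tell a visitor.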
OK, so my error message from the Globe means the Web server is not responding. But what about the rest of that incomprehensible text in the message?
Some amateur sleuthing pointed me at the last line of type on the page, “Varnish cache server.” What is Varnish? It’s a tool that sits between a site’s content and the browser and rapidly serves pages. It’s supposed to be invisible. And it is, except when it fails.
I’m guessing here, but “Backend is unhealthy” probably indicates that this cache server was confused by what it found on the server as it was trying to cache and serve pages. From the caching tool’s perspective, the rest of the site is the “back end” (and presumably the browser is the “front end”).
What about Guru Mediation? That’s an in-joke. Early software on the Commodore Amiga displayed an odd error message including the words “Guru Meditation” (as opposed to “mediation”). This error sometimes appeared visibly to mystified users when, for example, cable companies served up television content from Amiga machines.
I think the folks at Varnish hijacked this message as an homage to Amiga, because caching servers mediate between browsers and Web sites. It’s an obscure pun.
People will see the errors you write. Here’s what to do about that.
The job of the Varnish caching tool is to deliver Web pages. That means that whoever wrote that error message knew that it might appear in place of a requested Web page on an end user’s screen. As a result, they have a responsibility to create a message that’s comprehensible.
On the other hand, they also have a responsibility to write a message that carries technical information that is helpful to people debugging the site (for example, the letters and numbers after the “Details” in this error message).
Here’s some good advice from the MacOS interface guidelines on error messages:
Write an alert message that describes the alert situation clearly and succinctly. An alert message such as “An error occurred” is mystifying to all users and is likely to annoy experienced users. (…) Write informative text that elaborates on the consequences and suggests a solution or alternative. Give as much information as necessary to explain why the user should care about the situation. (…) Informative text is best when it includes a suggestion for fixing the problem. (…) Express everything in the user’s vocabulary. An alert is an especially bad place to be cryptic or to use esoteric language, because the arrival of an alert can be very unsettling. (…) It’s a good idea to avoid using OK for the default button. The meaning of OK can be unclear even in alerts that ask if users are sure they want to do something. For example, does OK mean “OK, I want to complete the action” or “OK, I now understand the negative results my action would have caused”?
“Express everything in the user’s vocabulary” is basically the opposite of jargon. It’s hard to write error messages because they are typically the result of something failing in an unknown place, increasingly between elements of systems that are trying to communicate and failing (like a caching server in the midst of a denial of service attack). But remember that the poor user is the one who has landed in this unknown place. Throw them some sort of information and a suggestion on what to do next.
In the case of messages like the one I saw, all this would take is a little extra effort to put the error in reader-comprehensible terms. So you’d end up with something like this:
Error 503. Server unable to respond.
We are unable to display this page due to errors in our site.
Technical details
Varnish cache server error: Difficulty in mediation with site back end.
Details: cache-ewr18133-EWR 1510248182 1203912970
Less amusing, yes. But a message like this shows the reader that the problem is not their fault, and shows the site owner the details of the problem with Varnish.
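A message like the one above has two parts: plain-language text for the reader, then a details block for whoever debugs the site. Here is a minimal sketch in Python of how a developer might assemble it. The function name `render_error_page` and its parameters are hypothetical, invented for this example; they are not part of Varnish or any real library, and the wording is simply the sample message from this post.

```python
def render_error_page(status, summary, explanation, component, cause, details):
    """Build a plain-text error page: reader-facing text first, debugging details last.

    Hypothetical helper for illustration; not a Varnish API.
    """
    lines = [
        # Part 1: what happened, in the reader's vocabulary.
        f"Error {status}. {summary}",
        explanation,
        "",
        # Part 2: what the site owner needs in order to debug.
        "Technical details",
        f"{component} error: {cause}",
        f"Details: {details}",
    ]
    return "\n".join(lines)


page = render_error_page(
    status=503,
    summary="Server unable to respond.",
    explanation="We are unable to display this page due to errors in our site.",
    component="Varnish cache server",
    cause="Difficulty in mediation with site back end.",
    details="cache-ewr18133-EWR 1510248182 1203912970",
)
print(page)
```

Keeping the two parts as separate arguments means a writer can revise the reader-facing sentences without touching the diagnostic fields.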
Developers should ask for writing help
Back in the early days of software, I worked with a developer who was writing a version of VisiCalc, at the time the dominant spreadsheet software, for the first model of the Macintosh. Pop-up boxes were a new interface feature of this first-ever graphical user interface system. They required developers to write a slew of text messages in situations where they had never before been necessary.
In the first internal test version of the software, when you selected “Quit,” you saw this message:
Do you really want to start a global thermonuclear war?
[Yes] [No]
Until you selected Yes, you could not exit the program.
The developer knew that every single person testing the program would see this message. He intentionally made it silly and outrageous because he knew that someone would then rewrite it before the product shipped.
While I don’t recommend this method, I applaud the sentiment. Developers should know how to write comprehensible and helpful error messages. And if they don’t, they should do whatever they must to get competent writers to help them.