February 28, 2021
DevRelCon founder and CEO of Hoopy, the content agency for the developer economy.
How often do you write code that works first time? How often does it happen when trying a new framework or SDK? And how often will it happen to developers trying out your product?
The chances are that most people attempting to code with your SDK will hit at least one error message. Probably several. When they hit those errors, you’re not going to be there to help them. All you can give them is the message they see. When they see that message, will they keep trying, or give up?
Here, LaunchDarkly’s Yoz Grahame shares how you can make your errors even more helpful to new developers than having you sitting next to them.
He argues that there is such a thing as a beautiful error message, and when you see one, he thinks you’ll agree with him. And then you’ll want to make your errors look like that. Watch this talk to find out how.
Hello. My name is Yoz Grahame. I’m a developer, evangelist, advocate… developer advocate for LaunchDarkly and I’m here to talk about error messages because yes, your error messages can be beautiful, and developers can love them, or at least not entirely hate them.
So, opening question, and this seems fairly obvious: would you want to show your product’s error messages during a demo? The answer is usually no, and here we see an example of why: the famous video of Windows 98 being demoed before it was ready. It demonstrates that it’s not ready with that blue screen right there, to Bill Gates’ and everybody else’s embarrassment.
You don’t want that kind of scenario, and that’s what we think about when we think about showing errors in a demo. But what if you wanted to show your product’s error messages on purpose? Keep that thought. Now, since I’m talking about error messages, you may be wondering: are you seriously asking me to reprioritize my work in favour of error messages? On top of all the other DevRel work we’re doing, like blogging, tweeting, podcasting, making demos, giving talks, and documenting, we’re somehow meant to find the time for error messages?
Well, I think you should possibly care more about error messages than you currently do. Unfortunately, it’s pretty easy to demonstrate whether you should. Have you ever heard of user testing? Of course, you’ve heard of user testing, right? It’s simple. You just get a user, ideal customer, in front of your product and you give them a task to perform and then you stand back and you’re not allowed to interact with them while you watch them try to do it.
Now, imagine you’re stuck on this side of the glass, and the user is trying to use your product and trying and failing and trying and failing and they’re getting error messages and you can’t help them. You’re stuck watching them, you know what…but you can’t help them and they’re failing. And this is a nightmare I wake up from repeatedly. It’s the kind of nightmare, frankly, we should have if that’s the kind of nightmare we’re giving to our users with bad developer experience.
So before I go further, let’s just get some confusion out of the way. When I say product and your product, I’m talking about any kind of product used to create or deliver software, whether it’s programming languages, frameworks, web service APIs, which can give error messages over HTTP or in JSON and HTML, or SaaS products for software teams, like the company I work for, LaunchDarkly.
We are a software as a service but we also provide libraries and modules that developers will use and obviously generate errors if they need to. And by you, I’m saying you as the creator of the software or part of the organization that is responsible for delivering this software but, you know, you have more experience whether you’re a DevRel or an interested developer or just somebody who wants to know more about the topic.
I’m betting you’ve been on the other side of that. You’ve had situations where you’re eager to use some code or write some new code, you’ve got a project in mind, you’ve got momentum. And unfortunately, we’ve all been in the situation of error messages destroying that momentum, right? An error can block you and then stop you getting past it and that’s your work for the evening done.
You’ve just got a pile of bugs. This is… I have a separate talk I gave all about debugging. I’m fascinated by debugging in particular because I hate seeing people’s momentum destroyed by a simple error but that’s what happens so often. And in developer relations, you know, this is an emotional response that we need to identify with and respond to.
We connect with how people relate. As my colleague, Heidi Waterhouse said to me, literally, the other day, “We’re paid to care about people’s feelings about code.” That’s what we do. And people’s feelings about error messages are usually that they’re bad, right? But no, errors are bad. Bugs are bad. Error messages should be the way past that.
They should provide a solution. So error messages are a vital part of your customer success experience. If you want your customers to be successful, then error messages are the prime placement for how and when you can give them support right in the moment when they’re experiencing pain. And it’s important to think about this also in terms of how people are going to judge your product as potential customers, right, people trying it out.
Error messages probably show your product at its worst. I’m assuming that. I’m assuming that your error messages are generally as bad as the ones that we’ve come to accept from most products. And this matters to developers at all experience levels. If you’re a beginner, then an error message is a slap in the face.
It’s yet more evidence to feed your imposter syndrome, to feed that voice that says, “I’m not good enough for this. I can’t do this and I should just give up and become a farmer.” But if you’re an experienced developer, then you get much more wary of new tools, because you realize that every shiny new product is actually a liability, okay?
You care about maintainability, debuggability. You’ve been promised the earth, but you’ve reached that situation of having to dig something out of a hole, usually while your boss is watching, usually at the worst possible time. So experienced developers judge a tool by its errors, and unfortunately, most of the time, there’s not much to judge there.
Bad error messages cost you customers. They cost customer support time, right? You’ve got tech support people who are spending a lot of their time explaining a message that someone hardcoded into your product. And finally, they cost you champions. Champions, the people who should evangelize your product, get their feelings about it from what they experience with it.
So if you can give them positive feelings with good error messages, then you can earn these things. You can earn back tech support time. You can earn champions who have been delighted by your error messages. Now, I know the idea of delightful error messages sounds implausible if you don’t know what I’m talking about, but let’s go through some examples of bad errors, and I’m not talking about this kind of thing, right?
This is a bug. This is obviously the kind of software gore thing that we see in The Daily WTF. I’m not talking about that, right? I’m going to talk mainly about programming languages, but as I said, these lessons apply to all kinds of programming products. It’s just that programming languages are the prime examples, because they give the most raw kind of output and they have a huge amount of context that they really should be working better with.
They fixed it in the past year or two, finally. So this habit of concealing information is a very common one. At least that one, you could go back and look at the code. Worse is when you’re dropping transient information in an error. Here, we see an HTTP request has failed.
We don’t know why. It could be a 404, it could be a 500, it could be that the request never left the machine in the first place. You don’t know. You can’t debug. So how about bad remedial advice? Sometimes compilers give you advice on what to fix. So here, I’m actually running this live, taking a risk.
Here, the error is saying that a semicolon is expected between n1 and n2. We’ve got two variables that we’re going to multiply and print out but this is the wrong syntax. So it says it’s expecting a semicolon, right? So I’m putting a semicolon in there, following its advice, and let’s see what happens. It goes, “Well, no, that’s not a statement either.” And if you go back and look at what the original two errors were, because it gave us two, it actually knew in advance that wasn’t going to work.
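Going back to the failed HTTP request from a moment ago, here is a minimal sketch of an error that keeps the transient details instead of dropping them. The `RequestFailed` class and the `api.example.com` URL are made up for illustration and do not come from any real SDK; the point is only that the message distinguishes “the server answered with a status” from “no response ever arrived”.

```python
class RequestFailed(Exception):
    """Hypothetical SDK error that keeps the transient details of a failed request."""

    def __init__(self, method, url, status=None, reason=None):
        self.method = method
        self.url = url
        self.status = status  # None means no response ever arrived
        self.reason = reason
        if status is None:
            detail = f"no response was received ({reason})"
        else:
            detail = f"the server answered {status} {reason}"
        super().__init__(f"{method} {url} failed: {detail}")


# Placeholder URL; both errors describe the same call failing in different ways.
not_found = RequestFailed("GET", "https://api.example.com/flags", 404, "Not Found")
never_sent = RequestFailed("GET", "https://api.example.com/flags", reason="connection refused")

print(not_found)
print(never_sent)
```

With the status and reason kept on the exception, a developer can tell a 404 from a connection problem without rerunning the request.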
Python, right, that’s what we hold up as a really great beginner language. And yet its errors look like this. Now, this is obviously another situation of concealing useful information. It’s not telling me what the tuple is or what the index is but I actually want to focus on something different, which is the tone of this, right, because it doesn’t sound like something that was written by a human to be read by another human, and yet it absolutely was.
It sounds like the kind of thing you’d hear from a bad ’60s sci-fi robot, you know, like a Dalek. “Tuple index out of range, Doctor,” is what it sounds like. It sounds like someone was trying to use as few characters as possible to write out the error rather than communicate effectively, which is tragic.
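As a rough illustration of the alternative, the sketch below wraps an index lookup so that the error names the collection, the index that was asked for, and the range that would have worked. The `item_at` helper and its wording are assumptions made for this example; this is not how Python itself reports the error.

```python
def item_at(items, index, name="items"):
    """Look up items[index], raising an IndexError that says what went wrong."""
    try:
        return items[index]
    except IndexError:
        size = len(items)
        if size == 0:
            hint = f"'{name}' is empty, so there is nothing to look up."
        else:
            hint = f"Valid indexes run from 0 to {size - 1} (or -{size} to -1 from the end)."
        # 'from None' hides the terse original error so only the helpful one is shown.
        raise IndexError(
            f"I could not read index {index} of '{name}', "
            f"which only has {size} item(s). {hint}"
        ) from None


try:
    item_at(("red", "green", "blue"), 3, name="colours")
except IndexError as err:
    message = str(err)
    print(message)
```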
It’s a perfect example of a systemic problem. I think the primary reason error messages look like this is that they’re written by engineers who have grown up with error messages that look like this. When it comes time to write an error message, they just think back to the error messages they’ve seen in the past and they perpetuate the problem. There are other reasons why this gets perpetuated.
Usually, the fact that it’s been left up to an engineer to write this error message in the first place and they have to do it while fulfilling another task is why they don’t tend to put much thought into it. And that’s a problem of the engineering culture and the engineering organization as a whole, the whole product delivery organization. So this whole time, I’ve been saying error messages can be beautiful.
This Perl error has been in the Perl codebase for over 25 years. It’s in Perl 5. And it’s an error where you can see that it’s telling us where the problem is but it’s also giving us advice in a human tone about what the problem might be. It’s not even sure but it knows enough to say, “You know what, it’s worth a look at this.”
This is already more helpful than any error I’ve ever seen out of either Python or Java, but there is actually much better, more advanced work happening in the error messages field. That’s a field, as you can imagine. And let’s look at a more modern language. So, everybody who bet on Elm showing up in this talk, congratulations, you win. Elm has become a poster child for good error messages.
Elm is a lesser-known functional language, but it’s hugely loved, and this is the reason why. The errors are gorgeous, right? It’s got so much goodness. It has got, you know, you’ve got the type of error highlighted at the top in a different colour so it’s easy to see when scrolling through your logs. It’s telling you very clearly what the problem is in your code.
It’s actually pointing out the segment of your code causing the problem, the part of the line, which very few error messages do. It’s telling you what it should be. It’s giving you some advice. And the thing is that the tone is human, right? It totally reads like it’s being carefully explained to another human being. It’s also quite long, and you might think that’s more of a problem than a solution.
You didn’t become an engineer just to read error messages, although, you know, you end up spending a huge amount of time doing that anyway. But in this case, you don’t even mind, because you know that it’s going to be valuable. So this is not coming out of a huge engineering organization. This is coming out of one person, Evan Czaplicki.
He wrote an amazing blog post about how he literally just spent two weeks focusing entirely on error messages and came up with something so good that it’s one of the main things you hear about Elm when you meet an Elm evangelist. There are lots of Elm evangelists, because they love the language because it treats them well.
And they’ll mention the error messages within the first few seconds of talking to you about why Elm is so good. Fortunately, Evan Czaplicki’s approach is now being copied by other languages such as Rust, the relatively new and increasingly popular language that’s come out of Mozilla. If you read this blog post, they have directly gone to Elm for their model of what error messages should look like.
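To make those ingredients concrete, here is a small sketch of a formatter in that general style: a title line, a plain-language explanation, the offending source line, carets under the exact segment, and a hint. The `render_error` function, the file name, and the sample messages are all invented for illustration; this is not Elm’s or Rust’s actual implementation, and colouring is left out to keep it short.

```python
def render_error(title, filename, line_no, source_line, start, end, explanation, hint):
    """Format a diagnostic with a header, the offending line, and an underlined span."""
    header = f"-- {title} " + "-" * max(3, 50 - len(title)) + f" {filename}"
    gutter = f"{line_no}| "
    # Carets sit under source_line[start:end], shifted right by the gutter width.
    underline = " " * (len(gutter) + start) + "^" * (end - start)
    return "\n".join([
        header,
        "",
        explanation,
        "",
        gutter + source_line,
        underline,
        "",
        f"Hint: {hint}",
    ])


source_line = "total = price * quantty"
start = source_line.index("quantty")
report = render_error(
    title="NAMING ERROR",
    filename="checkout.py",  # hypothetical file
    line_no=7,
    source_line=source_line,
    start=start,
    end=start + len("quantty"),
    explanation="I cannot find a variable named `quantty`.",
    hint="Did you mean `quantity`?",
)
print(report)
```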
So I’ve mentioned context a couple of times: taking context in error messages and being able to use that effectively. There are a couple of types of context that you should be aware of when designing error messages. The first is invocation context, right? What caused the problem? In what context was the error invoked? So it could be, you know, your customer’s code. It could be running while being watched, with the developer running it in front of them, or more likely, it’s running on a server.
It could be explored through an API explorer, or possibly even a GUI, right, because a lot of these tools are being used through GUIs. The second is the output context: how it’s displayed. So you’ve got the shell, you’ve got the GUIs, as mentioned, you’ve got the HTTP response if you’ve got a web API, so this could be part of a JSON or an HTML error, and log files, which are really the most common places for most errors to end up in, certainly by quantity, but they’re also an important place that might not get checked properly.
So as with all communication, you need to think about the audience. Who’s going to see the error and how? I don’t have a slide for this, unfortunately, but it’s a really good point that a lot of people make when talking about log files: make sure that the log level for your error message is set correctly, okay? If it’s an error that you really want somebody to see, make sure it’s got the error log level and not warning, because so many engineers filter out the warning log level because of all the noise it generates.
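Here is a minimal sketch of that effect using Python’s standard `logging` module. The logger name `example_sdk` and the message text are made up; the handler threshold stands in for an engineer who has filtered out everything below the error level.

```python
import io
import logging

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setLevel(logging.ERROR)  # stands in for "warnings are filtered out as noise"
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

log = logging.getLogger("example_sdk")  # hypothetical logger name
log.setLevel(logging.DEBUG)
log.propagate = False  # keep the demo output confined to our own handler
log.addHandler(handler)

# The same message at two levels: only the second reaches the reader.
log.warning("SDK key rejected; flags will use default values")
log.error("SDK key rejected; flags will use default values")

output = stream.getvalue()
print(output)
```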
So make sure your levels are set accordingly. Well, let’s think about how developers interact with errors when they see them. So most errors have these things: line number, error type, a failing operation, some description of that. I’ve paired with developers over the years and it’s so common that engineers will ignore at least one of these things that is printed out in an error.
I’m sure anybody who’s spent a long time programming knows this feeling, of being in that very fast code-run-debug loop and trying to get through quickly and missing a vital part of the error that you only spot about five minutes later when you actually slow down, breathe, and look at it again.
But the problem is not just with the engineers. The problem is how you scan the error message, right? If the error message is just blatted out as a uniform piece of text, it’s very easy to miss the relevant bit. That’s why you should think about things like layout and colouring: you want to call attention to the right things in the error message, because engineers have very little time.
They’re busy, they’re stressed. Errors are documentation. In fact, they’re more important than documentation. Documentation is for educating your customers, and error messages should be doing that too. But they’re more important because they are the thing that is most likely to get attention, and to get it at exactly the time when you really want to educate your customers about what they should be doing.
So treat them like documentation. Have them reviewed. Ideally, give them their own home in the codebase that your technical writers can get to to actually improve them. This is a UX problem. This is a technical writing problem, okay, and it needs to be treated like that. So when engineers see errors what’s one of the most common methods of debugging them? Well, they’ll jump to the world’s favourite debugging tool, right?
A search engine, Google. In my debugging talk, I talk about how to search for errors, but it’s really hard. It’s a sense you develop over the years for which parts of an error message are actually going to turn up useful results when you Google for them.
Well, you could help people out by, for example, giving your error some unique ID. The one good thing about this Python error is that, at least, it’s fairly obvious what to Google for here, right? And when you Google that, the first result, as you’d expect, is Stack Overflow, right? You have no control over the Stack Overflow [inaudible]. You have no control over what the first result is going to be when people Google your error message, how accurate it’s going to be, or how well it’s going to paint your company or your product.
So here’s an idea for getting control of that. Why not cut Google and Stack Overflow out of the flow entirely, right? If you add a URL as part of the error message, then you can provide something longer, readable, link to all kinds of things, even provide tools in the page that can help people out.
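A minimal sketch of what that could look like in code, assuming a made-up `SdkError` class, a made-up error code `EX1042`, and a placeholder documentation site at `docs.example.com`. The code gives people something unique to search for, and the URL points them at a page you control.

```python
DOCS_BASE = "https://docs.example.com/errors"  # placeholder, not a real site


class SdkError(Exception):
    """Hypothetical error type carrying a stable, searchable code and a help URL."""

    def __init__(self, code, summary):
        self.code = code
        self.url = f"{DOCS_BASE}/{code}"
        super().__init__(f"[{code}] {summary}\n  More help: {self.url}")


err = SdkError("EX1042", "The SDK key was rejected by the server.")
print(err)
```

Keeping the code stable across releases matters more than its format, since old logs and old search results will keep pointing at it.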
Developers are not always going to click on that link. They’ll probably click on it the second time they see it, or the third time when stuck in a loop. So coming to the end here, you’re going, “Okay, right, we need to fix error messages but we’ve got loads of them. Which ones should we focus on?” So there are a few ways to work it out. I would say, firstly, that the errors at the beginning and the end of the user experience flow are vital, because the ones at the beginning are when people are trying to get set up, right?
That’s when they can be most thrown off. You want to give them success before they fail. Secondly, towards the end, the errors are most likely to get trapped in logs. So I’m really running out of time. So, metrics and feedback. If you can get metrics and feedback from your errors, if you can get them through a portal automatically, that’ll be great, but there are privacy problems with doing that, and you also need to look at the distribution across customers.
If somebody has an error showing up in a script and that script is run thousands of times a day, that’s going to imbalance your metrics. You could look at support tickets, but the trouble is that regularly searching them is a chore, and most customers who have problems won’t even bother reaching out to support most of the time. But you can turn this around. Firstly, if you can do automatic reporting, it’s still useful, but even better, if you have those error URLs, how about doing analytics on them, right?
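One way to keep a single noisy script from skewing the picture is to count distinct customers per error rather than raw hits. The sketch below assumes each hit on an error URL has already been reduced to an (error code, customer identifier) pair; the codes and customer names are invented sample data.

```python
from collections import Counter, defaultdict

# Invented sample data: one customer's script hits EX1042 over and over,
# while EX2001 is hit once each by three different customers.
hits = [
    ("EX1042", "acme"), ("EX1042", "acme"), ("EX1042", "acme"),
    ("EX1042", "acme"), ("EX1042", "acme"),
    ("EX2001", "acme"), ("EX2001", "globex"), ("EX2001", "initech"),
]

raw_hits = Counter(code for code, _ in hits)

customers_by_code = defaultdict(set)
for code, customer in hits:
    customers_by_code[code].add(customer)
affected_customers = {code: len(ids) for code, ids in customers_by_code.items()}

for code in sorted(raw_hits):
    print(code, "hits:", raw_hits[code], "customers:", affected_customers[code])
```

By raw hits EX1042 looks like the bigger problem, but by customers affected EX2001 is the one to fix first.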
If customers are logged in, then you can try and grab the login ID and work out which customers are hitting the problems. With support tickets, help your support team, right? Help them report what the problems are and actually talk to them about it. They don’t want to be fixing the same errors over and over again. So actually talk about what’s important. Have regular meetings with them. So in summary, errors are more important than documentation because they’ll be there at exactly the moment when people need help.
You don’t have to fix them all at once. You can prioritize, and even small changes can make a huge difference. And if you make that difference, then developers will love you. They’ll remember how well you treated them, because they are so used to being treated badly by almost all the other errors they get. Now, please go and design some error messages to make me say thank you very much.