Solving internal technical documentation at Spotify

Author

Gary Niemen

In this session from DevRelCon London 2019, Gary Niemen shares the story of how the approach to documentation has changed at Spotify, drawing parallels with Joseph Campbell’s Hero’s Journey framework.

Transcript

Gary: I’m Gary Niemen. I’ve been a technical writer for 25 years and I’m now a product manager at Spotify, where I work in a team that is working on technical documentation and solving the problems around it. I work in a group, a tribe, that works with developer experience.

The talk is called The Hero’s Journey and how we are solving technical documentation at Spotify. Spoiler alert, by treating docs like code. And in fact, it’s treating docs like code plus more things that will become clear later. The alternative name for this talk could have been “We told you so” signed Engineers of the World. Okay, for a technical writer, that makes sense.

The Hero’s Journey, what’s that? In the 1940s, this guy called Joseph Campbell analyzed all the myths and stories of the world, not all of them, probably, but most of them, and he came out with a template for all stories based on this one story. It’s a departure, an initiation, and a return. That’s how it’s broken up. It was 17 steps originally. I’m not gonna do all 17, I’m gonna go for the 12-step version. And you’ll find these underlying themes in all of your favorite films, things like The Matrix, and Lord of the Rings, and Star Wars. And you find it even in the Technical Writer’s Saga, that very famous film.

I’m gonna be the hero in this story, because I’m giving the presentation, so I get to be the hero. But it’s about the team, and this is the team at Spotify where I work. Continuing the film theme, we are called Pulp Fiction: two engineers, two technical writers, an engineering manager, who is here, and myself.

Step one is the ordinary world, and that’s what the situation was like before the adventure. At Spotify, we have about 1,500 engineers and a lot of the engineers are making tools or platforms for other engineers at Spotify. There’s a great need for technical documentation. This is about that type of documentation, it’s about internal technical documentation. And because of Spotify’s culture, where teams can work in the way they want, it happened over time that when teams realized they wanted to do some technical documentation, they would do it sometimes in Confluence, they would do it sometimes in Google Docs, they would sometimes knock up a website, they would sometimes do a readme file in GitHub, et cetera, et cetera, et cetera. There was tons of technical documentation spread over everywhere. And of course then other engineers looking for this documentation couldn’t find it when they needed it. We had a search engine, it didn’t work that well. And it broke after a while, so it was a sort of chaos situation.

Is this a problem worth solving? And it became apparent that the answer to that question was yes, because in a productivity survey that we ran at Spotify about a year and a half ago, not finding the information that you need turned out to be the number three blocker. So yes, it’s a problem worth solving. And coupled with that situation, we had about three technical writers that were spread around the infrastructure teams, and they were solving smaller problems within the individual groups. They weren’t solving the big problem for Spotify at all, they were solving individual problems. This was the situation about a year ago.

The call to adventure, that’s the next step. And I would say, interestingly, the call to adventure started when one of our writers spotted a talk that Google had done about three years ago, at the docs conference. And I was quite astonished when I saw it because they had exactly the same problems that we had spotted in our organization, and they’d started to solve them, and even solved them using a Docs Like Code solution. So we watched this video and we thought, “My God, that is it. We should be doing something like that.” And then coincidentally, we were about a week away from Hack Week at Spotify. At Hack Week, you can do what you want for a week. And with a couple of colleagues, we decided to try and just do something like what we saw Google had done. And we did that and it worked out quite well, and we showed it off to people and we said, “Oh, look what we’ve done, this is gonna solve documentation,” so we really kinda flirted whenever we saw a manager, we were like, “Oh look at this!” And within a couple of weeks of that, somebody in a higher position than us, who could make some changes, started to think about this and said, like, what happens if we put writers in a team and get some engineers and, you know, what would it take to do this really well here at Spotify?

And the next step is the refusal of the call. Yes, I’ve been a technical writer, as I said, for 25 years and there’s that like, “Oh my God, this is completely different.” In the past, I’ve worked with markdown-type, sort of tagging-type documentation solutions, so that was okay. And in the past I had implemented engineer workflows using Microsoft Word, and those didn’t really work. But I knew this was completely different, a Docs Like Code approach. Could I give up my craft, in a way, that I’ve been working with for years? And of course the answer is yes, otherwise I wouldn’t be here today.

The next step is meeting the mentor. What’s that about? At this stage, we were just three technical writers in a team. And we had a hypothesis, we’d seen this video, and that’s about all. So we needed to get some more meat on this, and so one thing we did was some user research to validate the hypothesis that we had, not only for the solution, but for mainly the problem. And we were fortunate that we got a very experienced product manager to help us get started, so that’s a bit like meeting a mentor and we got an engineer, who, bonus, was really enthusiastic about this topic. And so that was a big moment.

Crossing the threshold. We got started, we had our team, and we were gonna build something like what we had done in Hack Week and something like the Google thing. And so we spent weeks and weeks with complex designs and processes, complicated architectural design and this was it. That’s the Spotify equivalent of a napkin. Within a couple of weeks, we got out an alpha version of what we called TechDocs. So how did we do it in two weeks? We’re very fortunate in that we have a lot of internal stuff that we could rely on. First of all we work in GitHub, that’s a big plus, and we have a CI/CD system there, so we could use that. And we quickly picked MkDocs as our static site generator. Somebody had worked with that before and said they thought it was good, so we tested a few and then we went for MkDocs. And storage is in GCS, and we have this thing in Spotify that’s an internal developer platform. And so we had that whole ID where we could expose the websites. We managed to knock this up fairly quickly. And that’s what it looks like.
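As a rough illustration of the pipeline described here (Markdown in GitHub, a CI step that runs MkDocs, output stored in GCS), the sketch below assembles the commands such a CI step might run. The repository name, the bucket name, the per-repo bucket layout, and the use of `gsutil rsync` for the upload are all assumptions made for illustration; the talk does not say how the upload was actually done.

```python
from typing import List


def docs_publish_commands(repo: str, bucket: str, site_dir: str = "site") -> List[List[str]]:
    """Return the commands a CI job could run to publish one repo's docs.

    Hypothetical helper: the names and the bucket layout are illustrative only.
    """
    return [
        # Build the static site from the repo's mkdocs.yml and Markdown files.
        ["mkdocs", "build", "--site-dir", site_dir],
        # Copy the generated HTML to a per-repo prefix in a GCS bucket.
        ["gsutil", "-m", "rsync", "-r", site_dir, f"gs://{bucket}/{repo}"],
    ]


if __name__ == "__main__":
    for command in docs_publish_commands("my-service", "example-techdocs-bucket"):
        print(" ".join(command))
```

Returning the commands as lists rather than running them keeps the sketch free of side effects; a real CI step would hand each list to something like `subprocess.run`.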

Steps 6, 7, and 8 are all about challenges in the Hero’s Journey. As a technical writer for many years I’ve noticed and experienced many many problems of technical documentation that need to be solved. And these are some of them, maybe it’s all of them. I don’t know. I’m gonna flip through them quite quickly, maybe not in this order.

Technical documentation at scale, that’s what I mentioned. You know, we’ve got 1,500 engineers, we can’t have 100 technical writers, so how do we solve this?

Discoverability and findability. Findability, you know what you’re looking for, that’s search. Discoverability, you don’t know what you’re looking for, but you want to discover it. That’s a problem of technical documentation. Trust: how can you trust what you find?

Maintenance: it’s very easy to create documents, or documentation, but how do you maintain it afterwards? That often gets forgotten. Feedback loop: that is about somebody making comments on a piece of documentation, and those comments getting to the one who, or the team who, owns that documentation, and they will correct it, and then you get a feedback loop.

Hidden knowledge, that’s many engineers having lots of knowledge that’s in their heads and it’s not shared. Ownership is a bit connected to no handover process. Documentation has to be owned. If it’s not owned, it will absolutely decay over time. And then a handover process needs to be in place otherwise you won’t have ownership eventually. Used and useful, I think teams who create documentation really wanna know that it’s used and that it’s useful for their own, you know, they put a lot of work into putting some documentation up and they need to know.

Adoption and engagement, that’s about there often being different use cases and different needs. You have to think about that, you can’t just ignore it. And then I left the one at the top left to last, because at the bottom of all this is the key problem: when engineers get stuck and they want to use documentation to get unstuck, you wanna help them fast. And they want to get unstuck fast, so that really is the key problem underneath all this. Those are some or all of the problems with technical documentation, and what I think is that with Docs Like Code and Docs Like Code Plus, that I’ll show in a second, we’re solving all of these, almost. I don’t wanna say all, but we’re basically solving pretty much all of this. So, I’m just going to show four or five examples, I’m not gonna go through everything.

This is the basic how to do tech documentation at scale. I mean, this is really enabling the engineers, so just make it super easy. The documentation is as close to the code as possible. It’s in markdown, so it’s super easy, and we provide a navigation. You don’t have to have the navigation; if you want, you can have navigation on the left. And then it produces a doc site and the doc site looks pretty good. Super easy for engineers, in line with their workflow, completely in the tools that they’re using every day. We made this information card, as it’s called at the moment, but it will be a trust card eventually. For each documentation site, you’ve got the owner, we get that from GitHub; when the doc was last updated, we get that from GitHub; the open issues, again, I’ll talk on that in a second on the next slide; top contributors; and we have some support help as well. And in the future we’re gonna make this into more of a trust card, maybe you get a percentage showing how reliable this piece of documentation is. Of course, we need to work out what variables go into that, but we think it’s doable.
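To make the trust-card idea concrete, here is a minimal sketch of how a percentage could be derived from the fields on the information card (owner, last updated, open issues). The talk explicitly says the variables have not been worked out yet, so the `DocSiteInfo` type, the `trust_score` function, and every weight and threshold below are invented purely for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class DocSiteInfo:
    owner: Optional[str]   # owning team, e.g. as read from GitHub
    last_updated: date     # date of the most recent docs change
    open_issues: int       # open GitHub issues filed against the docs


def trust_score(info: DocSiteInfo, today: date) -> int:
    """Return a 0-100 score. All weights are hypothetical."""
    score = 100
    if info.owner is None:
        score -= 40  # unowned documentation tends to decay
    age_days = (today - info.last_updated).days
    # No penalty for the first 90 days, then 1 point per 9 days, capped at 40.
    score -= min(40, max(0, age_days - 90) // 9)
    # 4 points per open issue, capped at 20.
    score -= min(20, info.open_issues * 4)
    return max(0, score)
```

Under these made-up weights, an owned site updated last week with no open issues scores 100, while an unowned site untouched for two years with many open issues scores 0.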

This is feedback loop, so you’re in a piece of documentation so a user is in a piece of documentation, and they have the option of either… They click that and they go directly into GitHub and they can submit a PR, or they can highlight a piece of text and select that for opening a GitHub issue. And that’s the number I showed before. That’s one doc site per repo and it can be a repository with code or it can be a repository that’s just the documentation… repository. That’s the feedback loop. And this is an example of Docs Like Code Plus, I mean this is not classic Docs Like Code, but we’ve added on to it to give extra functionality. This is our doc home page.
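The highlight-to-issue step can be sketched with GitHub’s pre-filled “new issue” links, which accept `title` and `body` query parameters. The helper name, the repository, and the wording of the title and body are assumptions; the talk does not describe how TechDocs builds the link.

```python
from urllib.parse import urlencode


def feedback_issue_url(repo: str, page_path: str, selected_text: str) -> str:
    """Build a GitHub 'new issue' link pre-filled with the highlighted text.

    Hypothetical helper: `repo` is "owner/name" of the repo that owns the doc site.
    """
    params = {
        "title": f"Docs feedback: {page_path}",
        "body": (
            f"Page: {page_path}\n\n"
            f"Highlighted text:\n> {selected_text}\n\n"
            "Comment:\n"
        ),
    }
    # urlencode percent-encodes the values so they are safe in a query string.
    return f"https://github.com/{repo}/issues/new?{urlencode(params)}"


if __name__ == "__main__":
    print(feedback_issue_url("example-org/example-repo", "docs/index.md", "Run the installer"))
```

Because each doc site maps to one repository, the issue lands directly with the team that owns the documentation, which is what closes the loop.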

This is addressing discoverability and findability. We’ve implemented a pretty decent search at the top there, and then we provide links for useful Spotify-specific documents, for example the engineering handbook and such. And then in the future we’ll develop some more lists, for example top ten doc sites and stuff. There’s a lot we can do with generating automatic information. Again, this is Docs Like Code Plus. We’re building on the Docs Like Code and then we’re sort of plussing it. And then finally, in this small piece, we’ve implemented code-based graphics as well. A lot of engineers were saying, “Yeah, we don’t want to sort of do our graphics in this and that, why can’t I implement them in code?” And at first I thought, “Wow, does anyone really want to make a graphic using code?” but yes, engineers want that. So we implemented support for PlantUML and Graphviz. An interesting story with this is that we didn’t even implement this inside our team. We’ve had such good adoption for TechDocs, and people really like it, so this functionality actually came from contributors. They asked if we supported it, we said no, and they did it. Perfect.
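For readers unfamiliar with code-based graphics, the sketch below shows the general idea: the diagram is plain text (here Graphviz DOT source) that can live in the repository and be reviewed in a pull request like any other change. The `to_dot` helper and the example graph are illustrative only and say nothing about how the TechDocs integration itself works.

```python
from typing import Iterable, Tuple


def to_dot(name: str, edges: Iterable[Tuple[str, str]]) -> str:
    """Render directed edges as Graphviz DOT source text."""
    lines = [f"digraph {name} {{"]
    for source, target in edges:
        # Quote node names so that labels containing spaces are valid DOT.
        lines.append(f'    "{source}" -> "{target}";')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_dot("techdocs", [("Markdown in GitHub", "CI build"), ("CI build", "Doc site")]))
```

The resulting text can be passed to the Graphviz `dot` tool to produce an image.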

In The Hero’s Journey we are up to nine. I wonder what time we’re up to? This is the reward. And the reward for me is the impact that we’re having in the organization and the love that we get for TechDocs. As a tech writer for 25 years, I can really say that I often felt that I didn’t have the impact that I wanted to have. I would create maybe pieces of documentation that I thought were really good, and they were much appreciated: “Yeah, this really helps, this is really good!” And then after six months, I was working on something else and that piece of documentation was not maintained and it sort of died. I really feel that with what we’re doing now, with Docs Like Code, we’re really having an impact in the organization. Lots of positive feedback and really cool. And I get to do talks like this, so that’s also a reward.

The Road Back. On The Hero’s Journey, this is the road back. I don’t know if you can visualize that but I think Luke coming back in the… I don’t know. Anyway, so Hero’s Journey and all that, but actually it has been super easy. It’s gone really quickly and it has been easy. So why is that? What are our success factors? These are them, there’s more actually, but I just stuck to these. A cross-functional team; that was absolutely essential. We were solving a real problem in the organization. The organization was hungry for this, it was almost desperate, so we really filled a hole. We fully embraced Docs Like Code. We didn’t say, “Oh yeah, we’re gonna do Docs Like Code, but if you wanna use Confluence, that’s okay.” No, we’ve been really opinionated and we’ve standardized and centralized and said, “No, it’s Docs Like Code. That’s the way we’re going.” We had a quick development of a minimum lovable product. Good adoption, that’s given us confidence going forward.

Collaboration, I mentioned, for example the graphics functionality. At Spotify, we get to own the problem and the solution. That’s meant that we can be opinionated and say that this is the way to do technical documentation at Spotify, there is no other way. We were able to build on existing technology. And the one I missed out on purpose, we had a clear and inspiring vision, which is this: Spotifiers using technical documentation go from stuck to unstuck in less than a minute. That’s quite ambitious, but it’s a vision and it was shifting the needle.

The Resurrection, this is the final battle in the Hero’s Journey. And if you imagine in The Matrix, I think it’s Reloaded, where Neo, he’s up against a thousand Agent Smiths. I remember just seeing that thinking how can he cope with all of those? This is convincing other technical writers that Docs Like Code is the way to go! Okay, no one got that. So, yes, so Return with the Elixir. So this is return, the elixir is the Holy Grail, and this is actually the concept that made me think of this theme for this talk, because for me, Docs Like Code has been the Holy Grail. For a technical writer, I have experienced this whole thing with Docs Like Code as being the Holy Grail.

What have we learned? I would say keep focused on the key problem, keep the solution simple so it just works, fiercely optimize for the engineer, enable others to build on the platform, and standardize and centralize. What’s next? We need to work more on trust and maintenance, as I mentioned a bit. It’s been a year. We’ve got a thousand doc sites. All the information is pretty new, people trust the information. I suspect that within six months, people are going to start going, “Oh, I can’t find anything,” and “Oh, I don’t know if I can trust this.” We really need to make sure that we have more things in place, with all the information that we have now, to make sure that it’s trusted and maintained as well.

And then we’ve talked about open sourcing TechDocs, so we’re gonna discuss that further, and we might do so. And then, we haven’t yet solved in-code documentation, for example for APIs. We haven’t solved that yet and there’s a big ask in the organization. So we need to get that into TechDocs as well somehow, we’re not sure how yet. So that’s gonna be a challenge. We want all of the technical documentation to be centralized, to be in one place, and then people can find it and we can do stuff if it’s all in one place. Yeah, I think that’s what I have.
