Docs at Weaveworks DX: from open source to SaaS and beyond

Author

Luke Marsden

Weaveworks’ transition from open source to open source + SaaS has had an impact on how it structures and presents its documentation. In this talk from DevXCon San Francisco 2017, Luke Marsden, then head of developer experience, discusses what has gone well and what challenges still remain.

Transcript

Lovely. Thank you. Awesome. And I just wanted to also say a huge thank you to Phil, Matthew, and Tamao. I think they deserve a round of applause for putting on an excellent conference. So cool. So I’m gonna be talking in almost exactly 17 minutes, hopefully about docs at Weaveworks and how we use docs to help improve the developer experience from open source to our SaaS product and beyond. So just the sort of obligatory 30-second slide on what we do at Weaveworks, what we’re working on is trying to make it possible for software teams to go around this loop faster and the loop that I’m pointing at is about taking an application from development into production. And when you do that, you need to go through deploying your application into production, and then sometimes you have problems with your application and that means that you need to be able to observe and monitor in order to make more changes.

And so it’s kind of a complicated product. It depends on lots of different things. And so the developer experience and the documentation for that is a little bit interesting. Just to introduce the team I work with, most of whom are here actually, apart from Elia. Anita is my colleague. Where’s Anita? At the back there. She heads up our docs. Tamao, who we’re very fortunate to work with, works on community with us. Elia, my colleague in London, mostly does code but also does talking, and I work on sort of pulling everything together where possible.

Our starting points

So just sort of to give a little background. So once upon a time, there was Docker and Docker was new and shiny and everyone was very excited about it. And the first thing that Weaveworks did was to develop a product called Weave Net. And Weave Net was the first way of connecting Docker containers to each other across multiple hosts. And so this was solving a really fundamental problem and it is super useful and we’ve got hundreds of thousands of downloads and lots and lots of users and that’s all wonderful. And Weave Net is all open source.

Then the next thing that the company did was actually to acquire a small company that I was working on a product called Scope, and Scope is a tool for visualising the way that containers are connected to each other and so on. And so this is just sort of setting the scene. So when I joined a year ago, this was pretty much the state of play. There were two repositories on GitHub, one for Net and one for Scope. Each repository had some markdown files in it and that was the documentation. And there was also a WordPress site that was the marketing site, and had a staging environment.

And so we’d previously written a WordPress plugin called WordyPress which I’m glad to say we don’t use anymore. WordyPress itself was a reasonably well-written piece of Go code but integrating with WordPress is never fun, and half the time, the documentation would just disappear and then you’d literally have to turn off the plugin and turn it on again and then the documentation would reappear. And it was something to do with mod_rewrite rules and we never figured it out.

Next steps

So that was fine. I mean, it was just about good enough and we used that to ship changes to these markdown files to WordPress. And the developers on the Net and Scope teams could preview their docs changes in the staging environment. So that’s sort of where we started out, and it was okay. We have a number of different types of documentation. There’s the open source project docs that I mentioned just now. There’s also what we call Guides, which we recently renamed to Tutorials, and there are two types of tutorials. There are step-by-step tutorials where you literally just have a list of instructions on a web page and the user follows along and runs them in their own environment. And then we also have another type called interactive labs, and these are Katacoda environments which present an online learning environment where users connect directly to instances running on a server somewhere, so they get this sort of ephemeral environment that they can use to kick the tires.

Then, of course, there’s the marketing site which, in my opinion, is also documentation. It’s all part of the developer experience of coming to our website and trying to use our stuff. And of course, blog posts. I’ll talk a little bit more about Katacoda. That’s the project that allows us to do those interactive labs. I really like Katacoda because it minimises the mean time to value. And what that means is when someone comes to our site and they want to kick the tires with our stuff, they don’t have to do a bunch of different things. They don’t have to set up a lot of complicated technology in order to actually experience the value of what we do. And this is super important especially because as I mentioned at the beginning, when you saw the diagram, there are lots of different moving parts in the product that we work on. And so just one example, probably the most complicated example is our continuous delivery product which requires a Kubernetes cluster, a version control system, and a container registry, and these all need to be wired up together.

And so being able to just drop an ephemeral environment in front of a user that has all of those things already wired up, and then they just get to click the buttons and see like, “Oh, I can deploy red buttons, and now I can deploy green buttons. And okay, so I can see how this works.” They get straight to the value. So that’s super useful. Shout out to Katacoda and my friend Ben Hall in London whose startup is Katacoda.

The problems we faced

So we had these problems with the website. WordPress was slow. I think when we finally moved off WordPress, we jumped from like five-second home page load times to like a hundred milliseconds or less, and I was like, “Yes.” And I’m really, really sorry, Anita, for the horrific manual process that I put in place at 4:00 a.m. one morning to update the guides just before KubeCon, I think it was, when we had to roll out a whole load of new documentation. The process for updating those guides was really painful: it involved manually copying and pasting markdown that had been converted into HTML into WordPress, and it was just gross. The irony of manually updating our continuous delivery guide was not lost on me.

So anyway, in the second half of last year, another thing happened which is that, like I say, Weaveworks started selling a SaaS subscription to this product, Weave Cloud. So this was sort of part of the company’s pivot from being a pure open-source company to a company that was trying to make money out of selling access to software delivered as a service. And so Weave Cloud is this combination of Flux which is our continuous delivery tool, with Scope which is the visualisation tool, Cortex which is Prometheus monitoring, that’s scalable, Weave Net which is the networking thing, plus some user management stuff and a graphical user interface like a web interface, that was nice. And it was obvious that the documentation that we had at the time needed to catch up.

And so Sonia who’s here…where’s Sonia? Oh, she’s gone to have coffee with a friend. Oh, well. So Sonia started this effort earlier this year to completely revamp the website and sort of do it from scratch. And as I said, Anita identified that we really needed documentation for our product and started working on a more comprehensive set of docs for Weave Cloud. And I’ll talk a little bit about how Sonia and I worked on the architecture for the website. Hopefully, this is useful to people. Maybe it’s obvious, but if you’re thinking about how to put together a website that has requirements for different parts of the business, then maybe some of this is useful.

Our architecture

So the marketing team really wanted a web interface for editing copy and blog posts and we have people doing SEO things and they want to be able to put keywords into our website rather than messing around with Git. Whereas us, the developer experience team, we really did want to manage our tutorials using markdown and GitHub with pull requests just because that was how we felt most comfortable managing our content. And then the engineering team were really adamant that they wanted to keep the open-source product documentation alongside the product in the open-source repos, but we didn’t want to put our website repo with…our website repo can’t be open source because there might be an announcement that’s not public yet.

So we ended up having to keep those open source product docs in separate repos but pull in those markdown files as necessary. And so we ended up with an architecture that looks a little bit like this. On the left-hand side, we use a headless CMS called built.io and that satisfies the requirement that marketing really want this nice web interface for editing copy. And you can also upload blogs and images for blog posts and things like that. I didn’t know about headless CMSes when we started this project, but apparently, they’re all the rage now. And they’re quite good today. It’s basically a database that’s accessible via an API that has a nice web interface on top of it that allows you to add like authors and collaborators and has version control on the content and so on.
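The step of pulling markdown files out of the product repos and into the website build can be pictured roughly as follows. This is a minimal sketch, not the actual Weaveworks tooling: the function name `pull_product_docs`, the `site` source directory, and the `docs/<product>` target layout are all assumptions made for illustration.

```python
import shutil
from pathlib import Path
from typing import List


def pull_product_docs(repo_checkout: Path, site_root: Path, product: str) -> List[Path]:
    """Copy Markdown files from a product repo checkout into the website tree.

    Assumed (hypothetical) layout: docs live under <repo_checkout>/site and
    are published under <site_root>/docs/<product>.
    """
    source = repo_checkout / "site"
    target = site_root / "docs" / product
    copied = []
    for md_file in sorted(source.rglob("*.md")):
        # Preserve the directory structure of the product docs.
        destination = target / md_file.relative_to(source)
        destination.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(md_file, destination)
        copied.append(destination)
    return copied
```

A build script could call this once per product repo before running the static site generator, so the product docs stay in their own repos while the private website repo only ever receives copies.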

Then we also, of course, have the website itself which fortunately is now a Jekyll site which is a lot nicer than having a WordPress site. And then we just had the tutorials as part of the website because the tutorials need to be updated and the website needs to be updated when the tutorials change. And so that just makes sense to put them in the same repo just as part of that Jekyll site. Then, of course, we need to pull in those open source product docs. And then in the middle, we ended up deciding on Netlify as a solution which is really sort of custom-built for solving this problem of, “I want to take a Jekyll site and basically do a CI pipeline on that site and then also publish it to a CDN.” And they just bundle everything as one package. And then Netlify can publish out to a staging site or a live site. And another really nice feature it has is that it can do…it can give you a preview URL for each pull request and that’s super useful.

If you’re adding a new tutorial or adding a new sort of template to the Jekyll site, then you can see these preview URLs for each pull request and you can go navigate to those and see what it’ll look like before you merge it. Of course, having the open-source Scope and Net repos in GitHub, the teams working behind those have their own CI pipelines for those things, and they want to be able to preview changes to the product docs that make up part of that website. And so there’s a bit of CI integration that allows those to trigger builds in Netlify and also generate specific preview URLs for different branches in those repos.
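The talk does not say how the product repos' CI pipelines trigger Netlify. One plausible way, assumed here, is a Netlify build hook: a URL that starts a build when it receives a POST request, with optional `trigger_branch` and `trigger_title` query parameters. The sketch below only shows the shape of such a call; the hook ID and branch names are placeholders.

```python
import urllib.parse
import urllib.request

NETLIFY_BUILD_HOOK = "https://api.netlify.com/build_hooks/"


def build_hook_request(hook_id: str, branch: str, title: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that would trigger a site build."""
    query = urllib.parse.urlencode({"trigger_branch": branch, "trigger_title": title})
    url = f"{NETLIFY_BUILD_HOOK}{hook_id}?{query}"
    return urllib.request.Request(url, data=b"{}", method="POST")


def trigger_build(hook_id: str, branch: str, title: str) -> int:
    """Send the request and return the HTTP status code."""
    request = build_hook_request(hook_id, branch, title)
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.status
```

A CI job in a product repo could call `trigger_build` after a docs change lands, passing a hook ID that is kept as a CI secret.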

Massive improvements

So we built this over the course of a few months in conjunction with a team in Brighton in the UK called The Unit who were very good to work with. And overall, it’s a massive improvement to what we had before. It seems like a pretty sort of sane architecture. There are a couple of problems with it that I think are still being ironed out. One of the problems is that the builds are really slow. Every time you do any build, it pulls down all the assets from built.io again and that’s…it’s not really the amount of data so much as the latency required to do an API request to pull down each of those assets.
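Since the cost described here is per-request latency rather than data volume, two common mitigations are to skip assets that are already in a local cache and to fetch the rest concurrently. The following is a sketch under stated assumptions, not what the team actually did: `fetch` stands in for whatever call downloads one asset, and caching by URL assumes an asset's URL changes whenever its content changes.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def fetch_assets(
    urls: Iterable[str],
    fetch: Callable[[str], bytes],
    cache: Dict[str, bytes],
    max_workers: int = 8,
) -> Dict[str, bytes]:
    """Return the body of every URL, downloading only those missing from the cache."""
    wanted = list(dict.fromkeys(urls))  # de-duplicate while keeping order
    missing = [url for url in wanted if url not in cache]
    # Overlap the network round trips instead of paying each latency in turn.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for url, body in zip(missing, pool.map(fetch, missing)):
            cache[url] = body
    return {url: cache[url] for url in wanted}
```

If the cache is persisted between builds, for example as a directory restored by the CI system, a build that changes one page would only pay for the assets that are new.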

Another problem is that because it’s pulling down…sorry, because the build is pulling down the content and the assets from built.io in a non-atomic fashion, if the content in the CMS gets modified during a build, then you can have some weird behaviour where I think, for example, we changed one of the images on our home page exactly at the same time as the build was happening and I think the content that refers to the image became out of sync with the images that were actually available in the build, and that was kind of annoying. I’m sure there are ways of solving these problems and we will work through them, but overall, it was a really big improvement.
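One generic way to guard against this kind of race is to read a revision marker before and after the download and retry if it changed in between. The sketch below assumes the CMS can report some such marker, for example the timestamp of the most recent edit. Whether built.io exposes one is not stated in the talk, so `get_revision` and `download` are hypothetical callables.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def consistent_snapshot(
    get_revision: Callable[[], object],
    download: Callable[[], T],
    max_attempts: int = 3,
) -> T:
    """Download content and assets, retrying if the CMS changed mid-download."""
    for _ in range(max_attempts):
        before = get_revision()
        snapshot = download()
        if get_revision() == before:
            # Nothing was edited while we were downloading, so the content
            # and the assets it refers to belong to the same revision.
            return snapshot
    raise RuntimeError(f"CMS content kept changing across {max_attempts} attempts")
```

This does not make the download atomic. It only detects that an edit happened during the build and starts again, which is usually enough when edits are infrequent.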

Structuring our docs

So let’s talk about the docs themselves for a minute. This was primarily Anita’s effort. I think she did an excellent job on this. We were inspired by the structure of the Kubernetes docs. And the structure is really quite sensible. The idea is that you have a concept section, a task section, a tutorial section, and a reference section. The concept section…and when I was reviewing the docs, I was really struck by, actually, how sensible this is from the perspective of a new user because in the concept section, you don’t try and get people to do anything. You just talk about the ideas and the sort of conceptual model and the different components and the architecture, and that really allows you to just focus on those things and do a good job of articulating exactly those pieces.

Then in the task section, on the assumption that you can link people back to the concepts if you need to, you can really just talk about, okay, well, if you now want to achieve this specific task, then bang-bang-bang, here are the things that you need to do. And these are tasks that are really core to the product. So things like hooking up a container image registry to the deployment tool, for example. Then tutorials are more like, well, if you have this specific X, Y, and Z combination of technologies and you want them to work together, or we’ve got this cool new open source project that is being announced and we have some demo that shows it working together, then we can put the tutorials in there. And we also have our interactive tutorials, the interactive labs, in the tutorial section.

And then the reference section contains the open source docs and also links off to things like the Prometheus query language documentation which developers who are using our product will need to understand. And I thought Anita did a really great job of introducing concepts gradually throughout these sections. If I was writing them, I would probably try and pack everything you need to know into the first paragraph, but I think that people reading the docs have a much better time if they’re introduced to concepts gradually so there’s not too much cognitive load.
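The four-section structure can be made concrete with a small check that every page sits under exactly one of the sections. This is an illustrative sketch only: the directory names and the `group_by_section` helper are assumptions, not the layout of the real Weave Cloud docs.

```python
from typing import Dict, Iterable, List

# Section names follow the structure described in the talk; the exact
# directory names are hypothetical.
SECTIONS = ("concepts", "tasks", "tutorials", "reference")


def group_by_section(pages: Iterable[str]) -> Dict[str, List[str]]:
    """Group page paths such as 'tasks/connect-registry.md' by top-level section."""
    grouped: Dict[str, List[str]] = {section: [] for section in SECTIONS}
    for page in pages:
        section = page.split("/", 1)[0]
        if section not in grouped:
            raise ValueError(f"{page!r} is not under one of {SECTIONS}")
        grouped[section].append(page)
    return grouped
```

Running a check like this in CI would flag a page that was added outside the agreed structure before it reaches the published site.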

Positive outcomes

Okay. I’ve only got a couple of minutes left. I just want to mention one other thing that we do in developer experience at Weaveworks, which is a project that Tamao has been spearheading, and that is the Weave online user group. So we kicked off having this online user group and it’s been really fun. Quite a lot of hard work, but it also has had some really positive outcomes, where we just run online talks, trainings, and meetups every week. And we’ve occasionally started doing these in person as well, and trying to run a meetup in person and stream it on the internet at the same time is fun and a little bit challenging. But this has allowed us to really sort of form a community around people who are already interested in what we were doing, who through whatever channel, noticed that we started running these events, and we’ve had lots of repeat customers. I mean, it’s free. People keep coming back for more and that’s great.

And we’ve also been running talks and trainings around topics as was mentioned earlier, topics that aren’t necessarily core to our brand but which are of interest to people and relevant to us. So we’ve been running Kubernetes trainings which has been something that I think we’ve had the most sort of positive response out of all the topics that we’ve had because Kubernetes is a hot topic, but it’s kind of hard to understand. So people really appreciate having us show them how to use it, and it gives us an opportunity to use our technology as an example.

The challenges that remain

So finally, just going back to the documentation, Tamao encouraged me to include challenges in this talk, so what isn’t working so well. And so very openly, we do have challenges with coordinating docs changes with the engineering teams. So we have this SaaS product and we have multiple teams who work in different ways, all releasing continuously to this service, and our docs need to keep up. And at the moment, we haven’t yet figured out a really, really effective, really reliable way of notifying the people working on the docs of changes that are coming into the product. And so we’re starting to experiment with just really simple things like having a mailing list where we try and get the entire engineering team to just give us some heads-up that some new feature is landing or some breaking change to the UI.

And interestingly enough, the marketing team have exactly the same challenge. So if we solve this problem, for example, with a mailing list, then we’ll be able to solve it for the entire business. So watch this space.

Okay. And the last thing I want to say is that we are hiring for DX people in the Bay Area. So if you’re interested in working on a product which makes developers more productive with containers, continuous delivery, monitoring and visualisation, please come and talk to me or send me an email, [email protected]. We offer a great work-life balance. As a European company, we believe in that sort of thing and we participate a lot in open source projects like Kubernetes, for example. Roughly, 20% of time is spent coding and also working on talks and content. So give me a shout. Thank you very much.
