Quality documentation is often one of the weakest links in a successful open source project. In this talk from DevRelCon London 2019, Jo Cook talks about how The Good Docs Project and Google’s Season of Docs are working to make it easier to create excellent open source documentation.
Jo: Hi, I’m Jo. I’m not a developer. I’m entirely new to dev rel. The first time I even heard the term was about two months ago, possibly. This is all very new to me, but I’m here to really talk to you about a problem with open source projects.
First of all, let’s acknowledge what the problem is. Projects are dying, open source projects are dying. In a recent survey of the most active projects on GitHub, 64% of those projects rely on just one or two developers to survive, and that’s a vast number. We then have to take into account that these people are very often doing day jobs and running these projects in their spare time, or in a very under-resourced way. We’ve got another problem, that they’re not very diverse! And this is obviously only one metric for diversity, but the ratio of women in IT is between 17 and 25%, whereas in open source, it’s between 0.1 and 5%. At the very least, without going into diversity and how important it is, what that means is that there’s a big pool of talent which could be mined for participation in projects.
Another issue is that documentation is often severely lacking, and it’s seen as being a big barrier to participation, to adoption, and contribution. Then you get people saying things like this, “My code is open on GitHub, but I don’t spend any effort making it easy for people to use. If you want to figure it out, figure it out yourself, have fun.” And that really isn’t a badge of honor. This is not big, this is not cool. I haven’t named the person that wrote that quote.
Then we find that getting help is tricky, and again, an open source survey from a few years ago said that “Incomplete or outdated documentation is a big problem. 93% of respondents spotted problems with documentation, and yet, 60% of those people say they rarely or never contribute.” I think that allows us to restate our problem. Open source projects, they’re not dying, but they are under-resourced, not very well-documented, and they don’t attract diverse contributions.
How do we go about fixing this problem? First of all, we’ve got to look at finding more resources somehow, and what we find, really, is that long-term support for projects is about creating more time for developers. It’s not about money. How can we give developers more time to do what they want to do, whilst also improving project documentation and, basically, stopping this problem where there’s only one or two people responsible for projects?
The GitHub Open Source Survey from 2017 identified that documentation is a really, really, really good way of getting people involved in projects, but not only that, it’s a really good way of establishing more inclusive and accessible communities. Getting people involved in documentation brings in people from different communities, and encouraging contributions to more than just the code makes the project a lot more resilient in the long run. Encouraging user contributions to documentation is massively important to projects. And obviously, I don’t know, I mean, I’m not a developer, but I know that there are a number of rude and impolite users, and users that think they’re entitled. We’re talking about the generally nice users here, you know? Some users are really not, and it’s not possible to save them. Here are some of the reasons why users are the best people to write documentation. It’s really, really easy for developers to miss out steps because they’re very, very close to their software.
You’ll all remember these famous instructions on how to build an owl. Step one, draw some circles, step two, draw the rest of the, beep, owl. And I’ve come across software where those are the instructions, basically, you know? Do this one thing, run the software. What about the 20 steps in the middle that I have to google? I was at a workshop for a mapping library recently, and it’s a fairly basic mapping library, to be honest. It’s great, don’t get me wrong, but the workshop asked people in the first instance to install Node, install NPM, work with Webpack. The people at this mapping workshop wanted to know how to draw a map on a webpage. They really didn’t give a damn about Node and NPM and Webpack, and they didn’t need them to put a map on a webpage. And I think literally 20% of the participants would’ve given up, had they been trying to run that workshop at home. That makes me cry, it makes me very, very sad.
Another thing is that developers are really, really familiar with the technical terminology that they use. This is a classic one, instructions from GitHub. This is a kind of random, made-up example, but you see similar sorts of things all the time: clone the dev branch, add the grunt tasks, recompile. I’m sorry, but if you’re an average user, that makes no sense whatsoever. I barely understand what the grunt bit means, for a start, and I’m also not that interested. I just want to do the thing! The other thing to remember is that developers, and power users as well, because I’m not just coming down on developers here, are not standard computer users. There was a Nielsen Group survey quite recently that said that only 5% of the population of the 33 richest countries in the world rated their own computer abilities as high, and only a third of them could actually complete medium-complexity tasks. When they’re faced with statements like this, there’s a very, very small number of people that are really going to understand what you mean and actually do something with it.
What else is there? It’s users that really know what their own expectations and needs are, and quite often, those don’t actually match what you as the developer or the owner of the project think. As projects become more successful, they gain a life of their own, and you find people using the software in odd ways. And really, it’s users that, not only do they become more demanding of your time, but they establish what is actually important. You’ve let that software out into the world, but it’s the users that are going to actually find things to do with it.
They’re going to find the novel use cases, for a start, and what needs love and documentation. It’s users that are actually going to be using the documentation, as well, and this quote here is actually not about a little open source project. This is about one of the four big, enormous projects, well, not really projects, but one of the four big tech teams. What people are saying is that the documentation is so terribly bad that they’re forced to search through video transcripts to find something relevant to the thing that they’re trying to work with. That’s just not accessible, but also, have you got users that have different accessibility needs? Is searching through videos really the most appropriate way for people to find out about your software? You’ve got to figure out what things put people off, really, when you’re writing documentation, or when you’re trying to get them to help you, when you’re trying to get them to contribute.
The first, and I think one of the most important ones, is high barriers to entry. What we’re talking about here is, is it easy for people to contribute to your documentation? This is a genuine documentation toolchain for a project that I work with quite a lot, whose name I shall protect. First of all, I have to install the correct versions of Java, Python, Maven, Sphinx, Sphinx Bootstrap Theme, and I have to make sure that that’s actually working. I’ve got to learn not only how to write simple text in reStructuredText format, but I’ve got to understand about refs and things like that, and all that dot, dot, colon nonsense. I’ve got to understand how to fork things on GitHub. I’ve got to understand about pull requests, reviews, and commits. And it’s only then that I can actually submit a documentation change! And the only reason I even bother going through this process is because I love this project and I want the documentation to improve, but I know that this puts people off.
To be fair, this actually applies to software in general these days. This is another quote from an article about open source system-wide activity tracing. “If you want to start building a full-stack web application in 2018, you need to install Node.js and the npm package manager, run a slew of npm commands to configure a custom toolchain with a CSS preprocessor and a JavaScript code bundler, and you’ve got to adjust your OS environment variables, find all the system dependencies,” and you know, really? Another thing to remember is that your software, your baby software, your darling, might actually only be a step towards somebody achieving their aim. They might not actually be all that bothered about using your software. It might just be something they need to do to build their web map or something like that.
Users are more likely to help you ensure that your documentation covers these early, painful setup stages, because they’re actually not all that focused on the endgame. They just want to get things done. Another thing that happens a lot is that new contributors can be made to feel a little bit unwelcome, which is terrible, really. Contributors are not an annoyance, they’re not an increased burden, they’re not only suitable for correcting spelling mistakes. And yet, again, going back to the GitHub survey, “Negative experiences have real consequences for project health. 21% of people who experienced or witnessed negative behavior said they stopped contributing to a project because of it.” Now, I do admit that we’re talking about all sorts of negative behavior here, and not just being made to feel slightly unwelcome, but you know, all of these things matter. Can we figure out a better solution to all of this?
First of all, how do we reduce the barrier to entry? And what are the ways we can make it so much easier? I’m a big fan of documentation that has an “edit this page on GitHub” link, where literally, you’re taken to the correct page in the GitHub source, and you can simply edit it in the GitHub Editor. Yes, you need a GitHub account, but quite often you don’t need to learn reStructuredText, you don’t need to do a proper pull request, and you can fix your problem as you see it. Now I notice that some of the really big projects send somebody to a form for submitting an issue. You spot an issue, you’ve then got to go and find this form, and then you’ve got to spend some time filling in that you’ve spotted a spelling mistake or an errant Oxford comma, and you know, life is so much easier if people can just fix it themselves. Preferably do it so that they don’t have to install any additional software, and don’t have to go through that awful Maven, Sphinx, Python, Java nonsense.
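As a rough illustration of the “edit this page on GitHub” idea, here is a minimal sketch of how such a link could be built for one documentation source file. It relies on GitHub’s web editor being reachable at a URL of the form github.com/OWNER/REPO/edit/BRANCH/PATH; the function name and the organisation, repository, and file names in the usage note are hypothetical placeholders, not taken from any project mentioned in the talk.

```python
from urllib.parse import quote


def edit_url(owner, repo, branch, source_path):
    """Return a link that opens one docs source file in GitHub's web editor.

    All four arguments are supplied by the caller; nothing here is looked up
    from GitHub, so the link is only as correct as the values passed in.
    """
    # Drop any leading slash so the path joins cleanly onto the branch name.
    relative_path = source_path.lstrip("/")
    # quote() percent-encodes spaces and similar characters but keeps "/" as-is.
    return "https://github.com/{}/{}/edit/{}/{}".format(
        quote(owner), quote(repo), quote(branch), quote(relative_path)
    )
```

For example, calling edit_url("example-org", "example-docs", "main", "docs/install.rst") with those made-up names gives a link that a page template could show next to the page title, so a reader who spots a typo is one click away from fixing it.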
The other important thing is to allow people to get their contributions live as quickly as possible. It’s a massive buzz if you’re a new user, to see your change that you made, even if it is just a spelling mistake, up there, live, in the documentation, so you know you’ve actually done something, you’ve contributed. These contributions, small contributions, lead to bigger contributions. I should know, I started off by contributing spelling changes to some documentation, and eventually, I got added to the project steering group, so these things do happen. And I’m glad I got added to the project steering group, by the way, that’s a good thing. Acknowledge contributions, no matter how small they are, because they are important to the people that made them, and they’re important to your project.
The next thing that you need to do is work with existing technical writers. There are lots of technical writers out there with huge amounts of subject matter expertise. They’re not secretaries. It’s not trivial to make complex information in the technical domain simple to understand and follow, so as developers, your best bet is to really make friends with those technical writing people. And then you’ve got to provide workflows that make it easier to write good documentation, and find all of those existing good practices. There are lots of them.
To summarize this part of my talk, this is important because encouraging help with documentation for your project encourages empathy and improves diversity, both of which I think are very good things. And I know what you’re going to say, “I don’t have the time! I don’t have the time for all of this! I am literally juggling fire.” That’s where Google Season of Docs comes in. Google Season of Docs matches technical writers with open source projects for three- or six-month, really quite intense writing periods.
I’ve been involved with it for OSGeo, the Open Source Geospatial Foundation. And we’re working on two aspects of the big, massive open source geospatial stack. We’re looking at OSGeoLive, which is a live DVD that basically gives you all of the software to try out, and the metadata package GeoNetwork. When we started working on this, we thought, why don’t we abstract this expertise and work to make something a little bit more generic? Good point, so then we come to building an actual solution. And I’m very pleased to introduce the Good Docs Project, and you can find stickers downstairs, if you can find them amongst all the other many stickers. I’m just going to briefly now talk about the various aims of the Good Docs Project and how this can help developers basically to go through all the stuff I’ve just talked about, but in as little of their own personal time as possible.
The first aim is to identify all of the elements of good documentation that a project needs. We’re talking really about tutorials, how-tos, reference, technical documentation. And we’re looking at providing best practice resources and templates for each of them. Doing that hard work of finding all of that best practice stuff and putting it all in one handy location for you.
The second aim is to establish a minimum viable docset, and we’re making it open source, so that anybody can really use it. And this is designed to help you create a baseline set of docs, not only for projects as you set them out, as you start them, but right through to maturity, as you go through changes in requirements and users deciding that actually, your project is really something else entirely.
And the third aim is to create a community of writers, users, and techies, somewhere that you can get practical tips and advice, and help for all parts of this project, of your process. With the general aim of increasing quality and consistency, you can see, I just learned to do pink sparkles. I love pink sparkles. They’re great. Saving time for developers, democratizing knowledge, and generally making the world a better place, which I think are all very good things to do.
If you’re interested, we’ve got a website at thegooddocsproject.dev. You can find us on Twitter @thegooddocs. We’ve got a Slack channel, and we’ve got a groups mailing list, so lots of ways to get in touch with us. And that’s me! You can find this talk on GitHub, at archaeogeek.github.io/devrelcon2019 (archaeogeek, that’s me), and you can find me on Twitter, @archaeogeek, as I said, and thank you very much!