Overcoming spam during Hacktoberfest

After a potentially damaging jump in spam, the Hacktoberfest team acted fast to resolve the problem.

Case study participant
Phoebe Quincy
Senior Community Relations Manager
Hacktoberfest began with a few hundred people and now attracts nearly 200,000 participants each year. Organised annually by DigitalOcean, the event is dedicated to encouraging people of all abilities to contribute to open source. A decade since the initial edition, Hacktoberfest is now a global phenomenon.

The Challenge

DigitalOcean has a history of making outsized contributions to developers’ lives. Their content marketing strategy saw them become the go-to resource for many Linux and other open source topics. Taking that a step further, the company’s developer relations team set out in 2013 to help give back to the open source community whilst further raising the brand’s profile with developers. The result was Hacktoberfest.

Hacktoberfest is held, as the name suggests, in October and is a month-long celebration of open source. The initial idea was to provide an easy on-ramp for people looking to get into open source, while also giving project maintainers a larger pool of potential contributors. And because the event launched during what was probably the peak of developer T-shirts, a Hacktoberfest T-shirt was a natural way to recognise participants. Anyone, no matter where they were in the world, would receive a shirt once they’d made four accepted pull requests.

That T-shirt took on a life of its own.

“An interesting thing that we observed was that people really loved the T-shirt and each year, the design changed. The excitement of getting one of these T-shirts started to gain momentum.”

By Hacktoberfest 2020, it was the T-shirts themselves, rather than the opportunity to learn how to contribute to open source, that seemed to have become the focus for some people. That year, a YouTube video showing how to make low-effort, meaningless pull requests gained traction. Thousands of people learned how to make low-quality PRs just to get the free T-shirt. The result was an influx of spam pull requests that angered open source communities and posed a threat to Hacktoberfest’s reputation.

“In 2020 we began to see people were creating low effort PRs on legitimate repos, not just their own, causing open source maintainers a lot of work. Some were also creating fake repositories and then having their friends create pull requests, which they were then approving. There was a huge influx of minor changes to projects in order to earn a reward.”

The “quantity is fun, quality is key” values of the event seemed to be at risk. Open source project maintainers complained vocally that the event was harming open source, rather than helping, due to the spike in low-quality submissions. The deluge of spam pull requests put some maintainers under huge pressure.

“Many of these spam PRs added a useless comment to a line of code. Others proposed incorrect punctuation changes in documentation. Even so, each pull request generated work for the project maintainer who had to evaluate it, close it, tag it as spam, often lock the thread to prevent spam comments, and then report the spammer to GitHub or GitLab.”

Up until that point, Hacktoberfest had spread largely through open source people sharing it with newcomers to whom they could teach the community’s expectations and culture. As such, the Hacktoberfest team realised they needed to act quickly in order to prevent harm.

“It was creating a lot of negative press for us. The very people that we were trying to help were really upset with us. We needed a quick solution. It was an emergency crisis situation.”

The Solution

The Hacktoberfest organising team set out a plan to deal with the immediate problem and to make it harder for similar situations to arise in future.

An Advisory Council

A crisis management team came together over the first few days of October 2020 (Hacktoberfest 2020). We worked all night to fix the vulnerabilities, and these measures were put in place immediately. Later we held roundtables with many community members to retroactively understand the impacts and to continue to work on improving the rules.

Having received this advice and feedback, the Hacktoberfest team decided to tighten the rules:

  • Maintainers must opt in to Hacktoberfest: The ‘hacktoberfest’ topic must be added by maintainers to their participating repository in GitHub or GitLab. Previously, people could attempt to contribute to any open source repository, even where maintainers were unwilling to participate in Hacktoberfest or were unaware of it.
  • PR/MRs only count when accepted by a maintainer: For a contribution to count towards Hacktoberfest, the repo’s maintainer must tag it as ‘hacktoberfest-accepted’.

We continue to work on improving things and in 2021 we invited six members of the open source community to sit on a Hacktoberfest Advisory Council. Through their feedback we developed these changes to the rules:

  • Review period reduced: The review period of a PR/MR was decreased from 14 days to 7 days. When a user submits a new PR (ready-to-review, not a draft), the submission is given a ‘grace period’ before it becomes a valid Hacktoberfest contribution. This allows time for maintainers to label any spam PRs as invalid.
  • Spam PR/MRs labelled: If a pull request has any label that contains the word ‘spam’, it is treated as such and is not counted for Hacktoberfest. If an individual has multiple spammy pull requests, they are banned from participating. Labelling spam was already a rule prior to 2020; however, in 2021 we started banning users who submitted multiple spammy PR/MRs.
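
Taken together, the rules above amount to a short eligibility check. The following is a minimal sketch in Python, not Hacktoberfest's actual implementation: the `PullRequest` class, its field names, and the `counts_for_hacktoberfest` function are hypothetical, and two details are assumptions made for illustration, namely that the seven-day period is measured from when the PR was opened and that the 'spam' match ignores letter case.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta

# Values taken from the rules described in the text.
OPT_IN_TOPIC = "hacktoberfest"
ACCEPTED_LABEL = "hacktoberfest-accepted"
GRACE_PERIOD = timedelta(days=7)


@dataclass
class PullRequest:
    """Hypothetical, simplified view of a PR/MR for illustration only."""
    repo_topics: set[str]
    labels: set[str]
    opened_at: datetime
    is_draft: bool = False


def is_spam(pr: PullRequest) -> bool:
    # Any label containing the word 'spam' marks the PR as spam.
    # Case-insensitive matching is an assumption of this sketch.
    return any("spam" in label.lower() for label in pr.labels)


def counts_for_hacktoberfest(pr: PullRequest, now: datetime) -> bool:
    if OPT_IN_TOPIC not in pr.repo_topics:
        return False  # the maintainer has not opted in
    if pr.is_draft:
        return False  # only ready-to-review PRs are considered
    if is_spam(pr):
        return False  # labelled as spam by a maintainer
    if ACCEPTED_LABEL not in pr.labels:
        return False  # not yet accepted by a maintainer
    # Assumption: the grace period runs from the moment the PR was opened.
    return now - pr.opened_at >= GRACE_PERIOD
```

In this sketch the checks are ordered so that the cheapest disqualifiers come first, and a PR only counts once the grace period has passed without a maintainer adding a spam label.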

The team also introduced other changes to help keep Hacktoberfest true to its founding spirit, including:

  • A new reporting tool: Now anyone can report potential abuse and cheating.
  • Reduced emphasis on the T-shirt: The event now puts community contribution and personal development at the centre of its messaging, rather than the T-shirt.
  • Improved guidance for newcomers: Many people come to Hacktoberfest with good intentions but without knowing what constitutes a good contribution. The team has provided new guidance even during registration on how to create a high-quality pull request.

As always with spam, the Hacktoberfest team recognise that this is something of an arms race. With that in mind, they are in a constant process of evaluation to ensure that they stay one step ahead, continually refining the rules based on feedback from the community. The organising team also conducts evaluation sessions, discusses any issues or potential areas of improvement that arise, and then feeds that back into planning for the following year.

Preventing spam has also become a theme of Hacktoberfest’s communications with participants. This makes it easier for newcomers to see the problems spam causes, and a dedicated Discord channel is available for discussing and reporting issues.

The Outcome

The Hacktoberfest team is confident that the changes have restored the event’s relationship with the broader open source community by minimising spam pull requests. In particular, they’ve identified three main benefits.

Contribution quality has improved overall

Following the introduction of anti-spam rules, project maintainers taking part in Hacktoberfest 2021 reported higher quality contributions, even if overall participation was lower than the previous year. Thousands of participants submitted 294,451 accepted pull requests to open source projects. These spanned a variety of programming languages and included many documentation and other non-code contributions. During Hacktoberfest 2022, 146,891 people from 194 different countries made 335,000 contributions throughout the month of October. At the time of writing, the event has shepherded 2.35 million pull requests to open source projects in total.

Reputational damage was averted

Up until the spam problems of 2020, Hacktoberfest had enjoyed a positive reputation and warm welcome from across the open source community. Although the influx of lower quality pull requests had the potential to inflict lasting damage both to the event and to its lead sponsor DigitalOcean, the team avoided the worst of it by responding rapidly. Maintainers were more satisfied because they no longer had to deal with a flood of spam, and they expressed their contentment on social media, as seen in the Hacktoberfest 2021 recap.

Contributor confidence increased

Following the changes, first-time Hacktoberfest contributors were more confident due to the extra guidance on how to make high-quality contributions.

“What’s been really great is that people are now self policing or group policing. We also have some amazing volunteers who help in running the Discord community and keeping things positive.”

Looking ahead

For a decade, Hacktoberfest has placed DigitalOcean at the heart of open source, even if there have been some bumps in the road. The focus for the immediate future is both to grow the number of participants and to continue to introduce other sponsors who can help support the event.

In her role leading the Hacktoberfest program, Phoebe Quincy looks forward to seeing how the event can help yet more people advance both personally and professionally through open source.

“The very first Hacktoberfest started with less than a few hundred people taking part. In 2020, that increased to 169,886 participants. We’re getting close to 200,000 and that’s actually the goal of Hacktoberfest: to get more people involved in open source. Hacktoberfest’s mission is to inspire more people to get involved in open source and work together to improve the software powering our world today.”