Managing a multilingual open source community

Queeny Jin
DevRelCon Earth 2020
30th June to 10th July 2020
Online

TiDB is a NewSQL database that has contributors (430+ and growing) from all over the world. How do they bridge the gaps between different cultures and languages?

Queeny discusses how they manage a multi-lingual, multi-cultural, international developer community in this talk from DevRelCon Earth 2020.

Transcript

Queeny Jin: Hello, everyone from the other side of the screen. Thanks for joining my session today. I'll talk about how we manage a bilingual open source community, and I call it dancing in shackles. And now we're moving towards a multilingual open source community and we are planning to add Japanese as the next language. So the agenda for today includes three sections. Number one is some background information about PingCAP, the company where I'm from, about me, and about the TiDB community. The next topic will be the three pillars of the TiDB ecosystem. And last but not least is how we're dancing in shackles.

So about PingCAP.

PingCAP was founded in Beijing in April 2015 and it provides enterprise-level services and technology for TiDB. And as you can see from the stats on the screen, TiDB is a pretty popular open source database project: it has 24.2K stars on GitHub and it's only about five years old. On top of that, we have more than 440 contributors in the TiDB repo on GitHub alone. And then there is TiKV, which is now a Cloud Native Computing Foundation project. It was open sourced on April 1st, 2016, and it now has 7.6K stars.

TiKV was also originally created by PingCAP. It was donated to the CNCF in 2018 and is now going through the voting process for graduation. And the last one I want to highlight that's also from PingCAP is the Chaos Mesh project. Chaos Mesh is a chaos engineering platform that aims to be the de facto standard for chaos engineering in the cloud.

It was open sourced on the last day of 2019 and it has 1.9K stars. Sorry, I might have made a typo on the number of contributors, but we can check that later on the GitHub pages. So that's about PingCAP.

And about myself: I am the head of internationalisation, I18N for short, at PingCAP. I'm also employee number 17 at PingCAP, so I'm also a citizen investor, and previously I worked at IBM and Symantec as a technical writer. So moving on to the introduction of the TiDB community and ecosystem. In this diagram here, you can see three pillars. And we believe the three pillars, which are people, communication and governance, are essentially what make up the TiDB community. And we count on these three pillars to build and run this ecosystem and community. Okay? Again, the people include not only the developers who contribute code to the projects, but also the end users who are constantly adopting the project and giving us feedback by filing issues and bugs, et cetera, to tell us whether the project is moving in the right direction.

And that's the most important asset of the ecosystem and community: the people. The next one is communication. Communication has a rather broad scope in my definition because we consider code to also be part of the communication, and I'll elaborate more on that. But essentially our code, content and channels are how we communicate with the community and the ecosystem, and how we interact and engage with each other in the community. And on top of the people and the communication, to actually run the community we need some governance in place. That includes the code of conduct, which gives some ground rules, and also the community structure, which puts everyone in a place where they can learn and grow. And also there are all kinds of interesting events going on in the community to constantly attract people and evolve into a larger community. And we have an honour system in place so that people can actually see their growth path and enjoy their growth.

So speaking of the people, as I mentioned, it includes not only developers but also users.

And we believe the TiDB ecosystem is not only made by developers and for developers, but also by end users and for end users. So these are some numbers coming from the entire TiDB ecosystem. We can see that we have more than 1000 contributors and three maintainers and committers and reviewers, et cetera. Also, we have a lot of active contributors. These are all defined in the community structure, which I will cover in the next slide. And on the bottom left, you can see the growth of the numbers. From early 2016 to now, it has grown to more than 1000 contributors. And where do these contributors come from? They come from the entire community.

Some of them are individual contributors, but many, many of them come from the organisations and companies that are currently adopting TiDB in their production system.

So when I say TiDB, it is essentially about TiDB and TiKV and Chaos Mesh and this entire ecosystem. And besides TiDB, TiKV and Chaos Mesh, PingCAP has a lot of other open source projects as well. So moving on to the governance. As I mentioned earlier, we have different roles: we have contributors, we have committers, active contributors. These are the different roles we have defined in this community structure. And for a bigger picture of the community organisation, you can see that we have a PMC, we have a developer group, a user group and an organiser committee. The roles I mentioned earlier all belong to the developer group. The PMC is the core management team that oversees the entire community.

And the developer group, which has a lot of different roles and responsibilities, is essentially the TiDB developers. We also have the user group. It's users from big companies or the adopters of TiDB. It's a nonprofit third party that is organised on a regular basis so that users can talk to users, exchange their ideas, and sometimes share complaints about the entire project. They're all welcome, and all kinds of feedback are welcome. And also we have the organiser committee, which is the committee of organisers in charge of the different events and activity operations.

I really want to dive a little further into the developer group and the user group. So basically now you can see that we have a developer group with different roles, and to help the developers in the community grow, we also have different SIGs. By SIG, I mean a special interest group.

So basically the TiDB community is organised primarily by SIGs. Each SIG is comprised of members from multiple companies and organisations. They have a common purpose, which is to advance the project with respect to specific topics and subjects. So the goal is to enable a distributed decision structure and code ownership, as well as to provide focused forums for getting the work done, making decisions and onboarding new contributors. And we also now have 21 working groups. Working groups provide a formal avenue for disparate special interest groups to collaborate around a common problem. A working group represents the interests of those groups and has to build a consensus. This is a little more about the developer group. And in the TiDB community organisation we also have the user group.

These are groups, as I mentioned earlier, where the users get together to elect a leader, co-leader and ambassador, have regular meetings and sometimes conferences, and collect feedback and exchange experiences of using the project.

So this is the community structure with which we can run the community and provide oversight over this entire community. And when we talk about governance, as you can see, we also have different events and honour systems. We have a hackathon annually in October. We also have the Usability Challenge programmes and the Performance Challenge programmes, et cetera, which run for several months and have a leaderboard for those who have participated and a credit system to challenge those participants. It's very interesting. We have also provided some database learning programmes so that new contributors to the community can get some hands-on learning and practical experience. And we have DevCon, which is a conference only for the developers. We believe in open source and we want more people to join these events as long as they are interested.

And one of the events that I really want to highlight is the Talent Plan. The Talent Plan courses are categorised into different subjects. You can learn programming languages. You can see we also designed different learning paths. Learning path one will walk the contributors, or those who are part of the Talent Plan programme, through A Tour of Go. And then when they complete learning paths one to three, they can get some hands-on experience by trying Practical Networked Applications in Rust, et cetera. So these are basically programming language courses. There are also some infrastructure systems courses such as TinySQL, TinyKV, et cetera. And we have one more course, a deep dive into the TiDB ecosystem, so that they can get to know the ecosystem and community that we are in.

And when they complete these courses, they are invited to attend the Talent Challenge programme and get a badge if they participate.

And also different programmes and bug bounty programmes are in place for them to challenge themselves and practise what they have learned in the Talent Plan. The Talent Plan is really popular in the developer community, especially among those who are interested in distributed systems. So we have covered people, which includes developers and end users. We have also covered governance, which includes different events and programmes and all the kinds of honour system that we have put in place. But to bring them all together, the most important thing is communication. And communication is where the shackles are. I mentioned earlier that we define communication with a broader scope, where code is also part of the communication, because we believe code is a way of expressing yourself in a certain form and getting yourself understood by others.

So you can code in different programming languages such as Go or Rust, which are the two languages we primarily adopt.

And you can get involved in the Golang community and the Rust community, et cetera. But you have to follow the coding style and the contribution guide we have in place so that you can get a better understanding of what's going on, and so that other people can understand what's going on as well. And in code, many of the developers write comments that other people don't understand. To make your code much easier to understand, we have a code comment style in place, which is in English. This is how you can code in a way that other people can better understand you. And on top of the code, the next very important factor that will determine whether the communication is effective is how your content is structured.

So we have RFCs, which are essentially the design documents for the major features and functions of the project.

And also we have different technical blogs that come from the developers, so essentially from the contributors and the other roles that we have defined in the developer group. These technical blogs really describe how a feature came to life. For end users, the end user documentation is also created, but essentially by my team and other engineers from PingCAP, because we would like to create documentation that's easy to use, easy to find and easy to understand so that we can easily onboard them to try out the product and the project to help them with their work. And we would also like to see content from the contributors and end users. They can share their stories with other contributors and end users so that they know better what's going on and how to avoid the pitfalls that could damage their data, et cetera.

So this is essentially the user-generated content.

But as I mentioned, the shackles are also in communication, because that contributor-generated content is sometimes in just one language and we need to move forward to make it user-friendly and reader-friendly in other languages as well. I will cover that in a moment. And the third factor that is also very important is the channel where these people can talk with each other, where these 1000 contributors can have a place to exchange their ideas and provide feedback to PingCAP and to other people as well, just to engage in the community. So we have a Slack channel, which is the TiDB Slack channel, and we have different private channels and public channels for the SIGs, the special interest groups, and the working groups, where they can discuss all those projects together and work to advance the project.

And we also have social media. For TiDB, we currently don't have a specific Twitter handle; it's under the PingCAP handle. And we're thinking about creating a separate handle for TiDB as well. But TiKV and Chaos Mesh have their own Twitter handles. So every time a contributor makes a contribution to Chaos Mesh or TiKV, he or she will get a callout from the Twitter handle.

Okay, so Hacker News and Reddit: these are the platforms where we syndicate the content. When we have some exciting technical blogs or announcements or some very technical content that we believe will be interesting to the hacker community, we post it to Hacker News. In the past six months, we got about four stories highlighted on the Hacker News front page.

So it has brought a lot of traffic to our website. And then there are the local communities and forums. As I mentioned, this is also where the shackle is: people from different regions prefer different social media platforms. For example, the contributors and users from China use WeChat a lot, and they also use the Quora of China, which is called Zhihu. But in other regions of the world, they would prefer to use Twitter. And I know in Japan you also have Qiita to share knowledge and blogs. So these are the channels we need to focus on.

If you want to build a global open source project, the local experience matters a lot.

So this is what I mean when I talk about the shackles. These are the three questions I ask myself a lot. Number one, where does the content come from? We have a lot of events, and our project is moving very fast. At the events we'll have a lot of contributors and users and also PingCAP employees to talk on the stage. And this is where we can get a lot of content.

And also the project itself evolves very fast. We can have different topics on the new features and the technical decision-making processes, et cetera. And also we have different programming languages such as Rust. We are very active in the Rust community and we write blogs about our thoughts regarding the programming languages. And there is the upstream and downstream ecosystem: TiDB is a database, and it has to be part of the bigger community and ecosystem to work with the upstream and downstream.

For example, the local engine of TiKV is RocksDB. So we need to engage with the RocksDB project, the RocksDB community, et cetera. The next one is about the language and cultural differences. As I mentioned earlier, some of the user-generated content, UGC, is in Chinese. It's original and authentic, but it is not English-audience friendly.

And the next slide will cover how we manage to make the content English-audience friendly. And there's a tricky word there. There are also some cultural differences, for which we need to update the terminology regularly, such as the recent updates on whitelist and blacklist and master/slave, et cetera. So for all those cultural differences, we need to keep sharp eyes and stay updated. Not to mention the technical and non-technical differences: sometimes we need to analyse the audience pretty carefully, because if we are not careful, we would dive into the subject matter so deep that the ordinary audience would not understand. But that's also very subjective, because even if we want to give the mass media a non-technical subject, it is very difficult to grasp the essence and work out how we can make it easier to understand and accept. So the dancing part is where we can attract more and more people.

I have mentioned a lot of events that are going on in the TiDB community, but to make some of those events applicable to the global audience, we need to do some local experiments first. So we put on some programmes, such as the first challenge programme, in a local community first and see how the local community reacts to the organisation and the effects of the events. And then we'll expand it to the global scope, for example, with the Usability Challenge programme. For another part of the dancing, I would like to cover the lifecycle of a user story. So basically, for the UGC in Chinese, we would take a user story in the source language that we select to promote abroad, and make sure that the content is really solid: that it has all the details that the global audience would like to absorb and see how we can help them.

So we would get permission from the user who has shared that story and then we'll put some effort into it and translate the piece into English.

And we'll go through different review stages such as technical review and editorial review, and also some PR department review from the user's organisation. And then we'll publish it on the target media and other channels, and then syndicate it to other media as well. So basically the transcreation process is all about taking one concept presented in one language and completely adapting it and recreating it in accordance with the nuances and cultural ideas that need to be found in the target language. I put a link here so that you can learn more about what transcreation means. And just to end my topic with one more example: we launched a programme last year which is completely community driven. It was to write a book about TiDB in 48 hours. And it has attracted 102 authors and 421 commits and almost 200 pull requests in the repo.

So we're trying to make a plan to launch the programme in the global community as well. And I'm open to ideas which can help us do that. So that's all for my topic, and thank you.

Taiji Hagino: Thank you very much. Thank you for your great session. And we would like to move to Q&A for your session. I think that we can have a couple of questions for you to dive into. Could you please pick up some questions from Slido?

Second MC: Okay. Should I share my screen, or is it okay to just read?

Taiji Hagino: Yeah, okay to just read the question. Okay.

Second MC: So we have a couple of questions. The first one is, which pillar do you think is the most important to be architected in order to succeed in an open source community?

Oh, I think you're muted.

Queeny Jin: Sorry. I think the people is the most important pillar that will hold up the entire community, because all the communication between and among those people, and all the governance, is how we and the organisers of the community help people in this community work with each other. Essentially, the people matter the most among the three pillars. Without people, nothing else will be there. Thanks for the question. That's a good one.

Second MC: I agree, people is the most important one.

The next question is related to people. So in an OSS project, is there anything you are doing to increase the number of contributors?

Queeny Jin: Yes, there are a lot of events, I should say. There are a lot of events to bring people together so that they can be part of the larger community and interact with other people. And also we are doing content transcreation so that people can understand what's going on in the community, and we have channels in place so that people can actually have a place to talk with each other. So all of these, the other two pillars, are there to enable and empower the people in the community so that they can have a sense of belonging.

Taiji Hagino: I see. By the way, how often do you have events, per month or per quarter?

Queeny Jin: It's not that regular. We have three annual events, so basically it's three events a year for now. But we are planning to do more from the PingCAP side. And as I mentioned earlier, we have user conferences and we have user groups, which have different kinds of user activities and events organised by themselves. We didn't track how often they have conferences or events, but they have regular events in their regions as well.

Taiji Hagino: I see. I got it. Thank you.

Thank you. I think we can have one more question.

Second MC: Okay. Yeah, let me choose one of them. So the question is how to control the conversation between the users and the developers. The developers want to make new things, but users want stability or no change. Yeah, it's a typical question every time.

Queeny Jin: It's a very good question.

Thank you. As I mentioned earlier, the requirements coming from the users matter a great deal to us as well. And that's why we have different special interest groups and working groups. For stability-related issues and requests, we would like to form a working group so that it can run for a longer time and work across the different special interest groups on certain specific stability issues. And for the new features and new requests that users would like, we also have different SIGs, so they can pick something new from the contributors and developers, and they can prioritise and discuss all of those exciting things with their SIG leaders.

So I think this can balance each side a little, but it will not solve the problem once and for all, because this is something that's really difficult to solve.