Opening Open Source
Some of us in the Open Source community have been so busy criticizing closed source software that we rarely take the time to make a self-critical analysis of Open Source projects. Sometimes we even refrain from self-criticism in open source communities for fear of giving arguments to the forces of Evil that hide in the shadows with their restrictive license mind-sets, ready to build up their case for why they have to control the users of their software and why they have to deprive them of their rights.
Unfortunately, this lack of self-criticism also deprives us of opportunities to improve, and in particular it prevents open source projects from reaching their full potential. So let’s swallow our pride for a minute, momentarily ignore the Evil forces of the EULA promoters (they are a decadent industry anyway, trying to survive in the information age by using the obsolete logic of the industrial revolution) and take a critical look at the state of our open source house.
If I have to pick a single defect from Open Source software projects, I will go for the one that I think is at the root of most other problems, or that, at least, gets in the way of most other solutions:
“Suboptimal Community Management”
That is:
“We don’t harvest the manpower of the community as well as we could”
In most of our projects we can count on thousands of users who are subscribed to our mailing lists. We also guess that as many as three times more people use the software, but because they are not subscribed to our lists we can’t really account for them. From this large community of users, we should be able to recruit about two hundred capable developers who could dedicate time and effort to tasks such as:
- Fixing failing tests from Dashboards
- Increasing code coverage
- Fixing Valgrind errors
- Addressing Bugs that have been logged for a while in our bug trackers
- Improving and updating documentation
- Supervising and mentoring other users
If half of these 200 hypothetical developers dedicated four hours a month to making contributions to the software, then in less than six months our Dashboards would be Green, our bug trackers would be empty of open bugs, and our code coverage would be above 90%.
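As a rough back-of-the-envelope check of those numbers, the following sketch computes the pool of volunteer hours that this scenario would produce. The function and parameter names are illustrative assumptions; only the figures come from the text, and the estimate says nothing about how many hours any particular bug or failing test would actually take to fix.

```python
# Back-of-the-envelope estimate of volunteer effort, using the figures in the text.
# All names here are hypothetical; only the numbers come from the article.

def volunteer_hours(developers: int, participation: float,
                    hours_per_month: float, months: int) -> float:
    """Total hours contributed by the active share of a developer pool."""
    active = developers * participation
    return active * hours_per_month * months


total = volunteer_hours(developers=200, participation=0.5,
                        hours_per_month=4, months=6)
print(f"{total:.0f} hours over six months")
# Assumes a 40-hour work week for the full-time comparison.
print(f"{total / 40:.0f} forty-hour weeks")
```

Under these assumptions the pool amounts to 2,400 hours, or about sixty full-time weeks of work spread across the community.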
In most projects, however, it turns out that the active team of developers who are actually committing code and addressing issues from the list above tends to number only five to ten people.
So we ask
- Where are the other 190 developers? and
- Why is it that they are not fixing our bugs?
and the answer that comes to mind is that it is our fault that they are not here, because we haven’t worked hard enough on creating the mechanisms for
- Motivating them
- Training them
- Mentoring them
- Empowering them by giving them responsibilities
In particular, we tend to make the following two mistakes
- Not providing enough motivation for new users to become developers and to be engaged in the community
- Not being able to rapidly absorb, engage and retain the motivated users that appear from time to time
Of course, this is not just a matter of gathering a crowd of two hundred people. Fred Brooks explained to us long ago, in his book The Mythical Man-Month, that simply adding more people to a project is not going to make it go any faster. Just having more people is not the answer. A social infrastructure and a software process are also required.
We have seen living demonstrations that well-managed communities can allocate most of their contributors to useful tasks and can coordinate them effectively. For example, the Wikipedians managed to coordinate six million people to write, in only six years, the largest encyclopedia in history. They did so with a core organization that for most of its history didn’t have any full-time employees and that, at its peak, had only about ten full-time people. The key to success is to put in place a distributed organization that allows those who are willing and able to make their contributions without being jammed by unnecessary protocols.
How can we replicate this in our software projects?
I don’t have the answer to that, but I can list some of the things that get in our way, and hope that a smarter person can see a solution pattern in there.
When confronted with a volunteer who offers her/his time to the project, we don’t know if she/he
- Has the required programming skills
- Has the mentality for team work
- Has the discipline to follow a software quality process
- Has the commitment to stay around for long enough to justify our time training them
and we know that it will cost us a lot of time to try to find out the answer to each one of those questions above. Unless, of course, this newcomer happens to be endorsed by someone we know and whose judgement we trust.
We have seen in the past how disruptive it can be for a project to bring someone on board who fails on any of the points above. I certainly don’t want to have to teach a newcomer what a C pointer is, or spend time explaining the reasons why some C++ classes use that funny “&” symbol. But I’d rather do that than have to deal with a good programmer who doesn’t want to work with others in the community, or who doesn’t care about the quality of the software and its long-term maintenance.
In this first stage, we are dealing with a problem of
- Lack of information about newcomers
- A need for building trust (us trusting them)
Should we pass that barrier, we must then be prepared for
- Providing them with guidance on how to contribute
- Giving them responsibilities in the community
- Giving them opportunities to build their reputation in the community
The other side and one potential solution
When teaching our Open Source Software Practices course at the Rensselaer Polytechnic Institute, we have also seen the other side of this equation. We require all students to work on an open source project of their choice. It is often the case that students attempt to join a project that interests them and get a cold reception or are simply ignored by the project developers. We advise the students to ask nicely and to ask a couple of times, but we also advise them to look somewhere else if they haven’t received a welcoming message in the first three weeks.
In that endeavor, we have seen that the approach that best suits the point of view of newcomers is the one that several communities call “junior jobs”. These are collections of bugs that are easy to fix, and that can be used as easy tasks for newcomers to prove themselves. Seen from the outside, this seems to be an effective mechanism to overcome the obstacles that we have listed. Newcomers can easily tackle one of the junior jobs; old-time developers can easily verify the correctness of the solution, build confidence in the newcomer, and engage them with more challenging tasks. In this way, newcomers can demonstrate their programming and team-work skills as they get to learn the customs of the community.
A second level is needed
A more challenging case is that of the capable and motivated developers who occasionally show up in our communities. They arrive full of energy, with new ideas and fresh perspectives. They take critical looks at our ways, and make well-intentioned calls for change. We are not quite prepared for them. We wish that they had shown up earlier, maybe years ago when many of the basic design decisions were made. We wonder if they are going to stay around long enough to help push the boat in the direction that they are advocating. More challenging “jobs” are needed to keep them interested, and a more flexible structure is needed in order to retain them and to realign their energetic path with that of the community. I don’t have an answer for this level, but I’m sure that open source projects would reap great benefits if an answer to this level of engagement were available.
The urgent need
Whatever solution we want to advocate, it is critical for Open Source projects to engage their communities in a two-way exchange and to continuously recruit new developers, maintainers and evangelists from among their ranks. It suffices to watch any of the code swarms of open source projects (e.g. ITK swarm, VTK swarm, CMake swarm) to realize that developers come and go over the years, and that a continuous replacement is necessary for maintaining the health of an Open Source project.
Hi dear Luis,
My point of view is not that of an expert, but I would like to mention 2 solutions that come to my mind when I read the difficulties you mention:
A. Not knowing newcomers, and whether they are “worth” the time spent teaching them.
I would say “open your work” and comply with the open source model, which stands on a community of developers. You have such communities on Google Code or SourceForge, where a developer creates a profile before contributing to a project and receives comments on his/her work from the project manager. You would then benefit from this feedback to easily spot brilliant newcomers. But this would mean being part of the SourceForge or Google Code projects…
B. Not knowing if you should accept new breakthrough ideas.
I would say “open your mind” (or be a woman 🙂 i.e. work at several levels). For how long has ITK been at version 3.x.x, or VTK at version 5.x.x? Does the 3 or the 5 mean anything anymore? (Sorry for the comment; I have only been using ITK and VTK for 2 years.)
Why couldn’t you have several development versions? One for bug fixing (the z in x.y.z) and new functionality (the y in x.y.z), and one for breakthrough ideas and completely new architectures (the x in x.y.z)? Such versions could be started by these newcomers, and there could be several at the same time. Only if they progress well would they become the official new major release.
OK, I’m sorry if these ideas are wrong. I’m not a professional developer, far from it. There are certainly difficulties that I don’t see.
But anyway, ITK, VTK and ParaView are great!
If I look at CMake, I see a lot of contributors but there are some problems:
a) Bugs with patches rot in the bugtracker, other bugs too (a lot of open source projects have this problem)
b) You don’t use a decentralized version control system like git
c) There is only one mailing list! This mailing list is flooded with user questions, there should be one for people who want to contribute modules and one for people who are interested in core development.
You know, developers normally prefer to send patches via email so they can be discussed.
Andreas, funny you should mention CMake and git: we are in the process of converting to git this week! We do have a cmake-developers mailing list, and it might be a good time to start using it again… The patches are often untested, and we are working on being able to test them more easily once we move to git.