Characteristics of a successful OSS Project

What are the characteristics of a "successful" OSS project?

Stefano Piazza thought it might be a useful exercise at one stage in Gnumed's development (2007) to look at the characteristics of some projects widely acknowledged as "successful" and apply those to Gnumed. He did a brief (3-hour) Google search on successful Open Source projects.

He offered some brief conclusions:

(1) Most OSS projects fail

90-95% of all startups disappear

(2) Of the projects studied, all arose out of a need to improve a current program

Apache, Mozilla and Sendmail all arose from the need to improve a current program. Hence a working "prototype" was available. It appears difficult to generate community interest without anything to work on. It could be argued that Gnumed 0.1 is a prototype, even though it has only minimal functionality.

(3) There is little literature on traditional design methods

On this quick search, I could find very little on the upfront design process we have been discussing. Feasibility and user requirements appear not to be formally considered in most OSS projects. Presumably the developers involved consider these issues informally.

(4) Most OSS developers are users

As such, they understand the "domain" the software is to work in, and have a clear idea of (at least their own) "user requirements". This appears to me to be a major problem for Gnumed: while the developers are medical, there is divergence on what functionality the program should provide as a minimum. Most potential long-term users will not contribute to development in the way that happens in many OSS projects.

(5) Project "governance" varied

Of the three projects considered, one was a spinoff from a commercial venture (Mozilla), and Netscape retained control and put considerable resources into managing and testing the codebase (6 teams). Apache was probably the most similar to Gnumed. It had a much smaller codebase than Mozilla. There were 8-25 "core" developers, with 4-8 active at any time. Most of the codebase (>85%) was contributed by 15 core developers. They had a "quorum" for voting on major issues. Most communication was via e-mail lists. Sendmail was coordinated by a single developer.

(6) Users who are knowledgeable contribute to identifying and fixing bugs

Approximately 10 times as many users as "core" developers supplied fixes, and many more again identified bugs and tested software.

What is "success"?
- Wide uptake
- Quality in use
- Bug "density"
- Maintainability
- A community of developers

What is "failure"?

Various quotes and summaries follow. I am not sure how they can be used, but they might stimulate discussion.

(from http://aseigo.blogspot.com/2005/11/writing-successful-open-source.html)

Writing successful open source software

i consider kicker to be a software failure.

ok, now that i've laid down the shock line let me back pedal a bit here. a lot of people use kicker and it works pretty well for them. there are few other desktop panel apps out there that are as flexible or offer as many features.

on the other hand, there are a number of highly intractable bugs in kicker. and very few people work on it; in fact it has gone through periods of complete neglect in its 5 years of life. why?

the code is too complex as it grew "organically", feature after feature. and for an open source app, that's all it takes to neuter development. this is because most open source apps change hands several times over their lifespan. open source code therefore needs to be highly transferable between people. how does one achieve that? i'd be lying if i said i had the answer (i have more questions than answers ;), but here are my current thoughts on the matter:

in open source we work with very small teams of developers with high turn over. these people do it because they enjoy it. our software processes then must support that. how?

clarity:

when starting the application, a clear goal must be had in mind. one needn't add in every feature necessary on day 1 (in fact, one shouldn't) but it is absolutely, positively critical to know where you're going. because unlike closed source software, where code can get uglified over time due to the shifting sands of requirements and still develop successfully simply by throwing more money and (begrudging) developers at it, open source code needs clarity. clear code makes it simple to learn and easy to debug; this makes a coder happy, and happy coders are open source coders.

simplicity:

good open source code doesn't get overly fancy in its design. it may be neat to use 10 different design patterns in an ever growing display of coding prowess, but all that tends to do is make it harder for the next person to come along and untangle your thoughts. if the design is clever, there's probably a simpler way of doing the same thing that results in more maintainable code. remember that you won't be the last person to work on this code, and as an open source developer you are likely doing this only a few hours a week so even you probably don't have time to build the eiffel tower of software.

extensibility:

if the design doesn't allow for easily adding features without touching the core code, expect the core code to become a nasty dog pile of features as features that people Must Have(tm) are added. a lack of extensibility is a killer to clarity and simplicity, and will therefore in turn damage the project's future. plugins, scripting or even just a cleanly compartmentalized design are key concepts here.
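The plugin idea above can be sketched minimally in Python. This is an illustrative sketch only; the names (`PLUGINS`, `register`, `run`) are invented for the example and are not any particular project's API:

```python
# Minimal plugin registry: new features hook in without touching core code.
PLUGINS = {}

def register(name):
    """Decorator that records a plugin under a name; the core never changes."""
    def wrap(func):
        PLUGINS[name] = func
        return func
    return wrap

# A "feature" added entirely from outside the core:
@register("greet")
def greet_plugin(target):
    return "hello, %s" % target

def run(name, *args):
    """Core dispatch: looks the feature up instead of hard-coding it."""
    return PLUGINS[name](*args)
```

The point is that adding a second plugin is another `@register(...)` function, not another edit to `run()`, which is what keeps the core from becoming the "dog pile" described above.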

documentation:

is there high level documentation for how the app works? how the various classes and components fit together? the purpose of things? if not, then the bar is raised much higher for new contributors. they end up spending a lot of time up front reverse engineering instead of contributing and this will cause most people to just walk away.

reuse vs invention:

if there's well written code out there that does what your software needs, use that code. if it isn't exactly what you need but it's close enough, let go of the perfectionism and don't reinvent the wheel. and for goddess' sake don't fork the code if you need to adjust it a bit; fix it upstream so we don't end up with N slightly different flavours of the same component. however, if the best thing out there isn't a good match, don't try and shoe-horn it in (the use of kioslaves for imap in kmail is a good example of this IMHO).

make it easy to contribute:

be willing to accept patches. write unit tests so it's easy to see if a patch breaks something. ensure the code is clear and documented. today's patcher may be tomorrow's maintainer. dole out the praise when it's due; have fun.
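The unit-test advice above can be sketched with Python's standard-library unittest module. The function under test (`parse_dose`) is a made-up toy example, not real project code:

```python
import unittest

def parse_dose(text):
    """Toy function under test: parse '5 mg' into (5.0, 'mg')."""
    amount, unit = text.split()
    return float(amount), unit

class TestParseDose(unittest.TestCase):
    # A patch that breaks parsing fails here, before it is merged.
    def test_simple(self):
        self.assertEqual(parse_dose("5 mg"), (5.0, "mg"))

    def test_decimal(self):
        self.assertEqual(parse_dose("2.5 ml"), (2.5, "ml"))

# Run the suite; a contributor can see at a glance whether their patch is safe.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestParseDose)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

A contributor who changes `parse_dose` simply reruns the suite; a green run is the "easy to see if a patch breaks something" check the quote asks for.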

The following is my (RH) take on a paper at http://www.research.avayalabs.com/techreport/ALR-2002-003-paper.pdf

Apache

Grew out of a previous project: the NCSA httpd program (1995). The initial aim was to coordinate fixes to this.

Process issues were studied first: how would the development team work?
- E-mail communication
- Minimum quorum voting to resolve conflicts
- New "core" members were welcomed after a period of contributions (~6 months)
- 8-25 "core" members, with 4-8 active at any one time

New feature/bugfix requirements came from BUGDB (the bug database). 1-2 interested developers would keep an eye on this and send significant requests to the developer mailing list. If the need was great enough, someone would take it on; there was no formal process for this. The next step was to find a solution, or decide which of several solutions to use. The coder then produced the code, tested it, and submitted the patch to the mailing list for approval.

Major releases were managed by a "senior" core developer, who would force resolution of major issues, i.e. "shaking the tree and raking up the leaves".

Numbers contributing:
- 249 contributed to the code
- many more supplied bug fixes
- bug reports leading to code changes came from 458 individuals

The majority of problem reports were submitted by non-core developers, i.e. the community did have a significant role in testing the software. The top 15 developers contributed >85% of the code changes.

Defect density was difficult to compare with commercial projects, but appeared roughly comparable.

Some conclusions/hypotheses:
- Core teams should be at most 10-15; if the project is larger, it should be broken up into subprojects.
- With a large user base, contributors will find defects and fixes, freeing the core team to continue coding new features; otherwise coders will become bogged down fixing errors.
- Coders will generally be users of the software, so they have a clear idea of "user requirements".

Mozilla

Code was released by Netscape into the OSS "space" in 1998. Netscape remained a "benevolent dictator" to oversee development. This was a much larger project, but again it began with a previous software base; development was to progress this.

12 core coders/members, partly commercially funded; 6 test teams; code inspection before release; formal test cases.

This is a quote from http://www.cmcrossroads.com/ubbthreads/showflat.php?Cat=&Number=53266

Great projects live and die by the strength of their leaders for which no amount of management can substitute. Successful project leaders keep their teams focused on the "need to have" capabilities, not the "nice to have."

Since project members come and go, it's imperative that all project knowledge is retained: not through specific efforts to generate models or heavy documentation, but through the collection of data as a by-product of actual development work. The reprocessing of that data as knowledge needs to be immediately available to new project members. I find the code of successful open source projects very readable; it attracts developers and facilitates rapid, evolutionary progress.

Short, iterative development cycles supported by continuous integration/build/test ensure that code is always stable and the nascent application demonstrable. Again, this aids the comprehension of new project members and the act of gathering feedback from the target user community early in the development cycle. Short iterative development cycles also provide a framework that allows developers to commit to projects in relatively short amounts of time, if that is all they have available.

Open source developers working on sites like SourceForge.net can also tap into resources available outside of their immediate project, elsewhere in the community. A centralized application on which all work is performed supports the rapid identification of developers with specific expertise you may be able to tap, either for brief questions or for joining your project as significant contributors (and who knows, you may also find that the project you are planning has already been done!).

"Agile", eXtreme Programming (XP) and other formalized methodologies employed by a growing number of companies have evolved from open source development experiences. Moving an organization to these new ways of working may be a daunting task. However, I can almost guarantee that some of the staff you have within your company already have many of these skills honed on the open source projects to which they have contributed and/or

The following are selected quotes from the seminal paper on OSS, "The Cathedral and the Bazaar", at

http://www.catb.org/~esr/writings/cathedral-bazaar/cathedral-bazaar/

Throwing code away and starting again

But I had a more theoretical reason to think switching might be as good an idea as well, something I learned long before Linux. 3. ``Plan to throw one away; you will, anyhow.'' (Fred Brooks, The Mythical Man-Month, Chapter 11) Or, to put it another way, you often don't really understand the problem until after the first time you implement a solution. The second time, maybe you know enough to do it right. So if you want to get it right, be ready to start over at least once [JB].

The Importance of Having Users

And so I inherited popclient. Just as importantly, I inherited popclient's user base. Users are wonderful things to have, and not just because they demonstrate that you're serving a need, that you've done something right. Properly cultivated, they can become co-developers.

Another strength of the Unix tradition, one that Linux pushes to a happy extreme, is that a lot of users are hackers too. Because source code is available, they can be effective hackers. This can be tremendously useful for shortening debugging time. Given a bit of encouragement, your users will diagnose problems, suggest fixes, and help improve the code far more quickly than you could unaided.

6. Treating your users as co-developers is your least-hassle route to rapid code improvement and effective debugging.

The power of this effect is easy to underestimate. In fact, pretty well all of us in the open-source world drastically underestimated how well it would scale up with number of users and against system complexity, until Linus Torvalds showed us differently. Maybe it shouldn't have been such a surprise, at that. Sociologists years ago discovered that the averaged opinion of a mass of equally expert (or equally ignorant) observers is quite a bit more reliable a predictor than the opinion of a single randomly-chosen one of the observers. They called this the Delphi effect. It appears that what Linus has shown is that this applies even to debugging an operating system: that the Delphi effect can tame development complexity even at the complexity level of an OS kernel. [CV]

One special feature of the Linux situation that clearly helps along the Delphi effect is the fact that the contributors for any given project are self-selected. 
An early respondent pointed out that contributions are received not from a random sample, but from people who are interested enough to use the software, learn about how it works, attempt to find solutions to problems they encounter, and actually produce an apparently reasonable fix. Anyone who passes all these filters is highly likely to have something useful to contribute. Linus's Law can be rephrased as ``Debugging is parallelizable''. Although debugging requires debuggers to communicate with some coordinating developer, it doesn't require significant coordination between debuggers. Thus it doesn't fall prey to the same quadratic complexity and management costs that make adding developers problematic.

Spend time on your data structures: in our case, SQL tables

9. Smart data structures and dumb code works a lot better than the other way around. Brooks, Chapter 9: ``Show me your flowchart and conceal your tables, and I shall continue to be mystified. Show me your tables, and I won't usually need your flowchart; it'll be obvious.'' Allowing for thirty years of terminological/cultural shift, it's the same point.
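Applied to SQL tables, "smart data structures and dumb code" can mean pushing rules into the schema itself, so that application code stays simple. A minimal sketch using Python's built-in sqlite3 module; the table and column names are invented for illustration, not taken from any real schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The rule lives in the table definition, not scattered through application code.
conn.execute("""
    CREATE TABLE allergy (
        id INTEGER PRIMARY KEY,
        substance TEXT NOT NULL,
        severity TEXT NOT NULL CHECK (severity IN ('mild', 'moderate', 'severe'))
    )
""")
conn.execute("INSERT INTO allergy (substance, severity) VALUES ('penicillin', 'severe')")

# The application code stays "dumb": a bad row is rejected by the schema itself,
# with no validation logic needed at the call site.
try:
    conn.execute("INSERT INTO allergy (substance, severity) VALUES ('latex', 'terrible')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

With the constraint in the table, every program and every future contributor gets the rule for free; the flowchart becomes obvious from the tables, as Brooks says.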

Reframing the problem

But there are two more fundamental, non-political lessons here that are general to all kinds of design.

12. Often, the most striking and innovative solutions come from realizing that your concept of the problem was wrong.

I had been trying to solve the wrong problem by continuing to develop popclient as a combined MTA/MDA with all kinds of funky local delivery modes. Fetchmail's design needed to be rethought from the ground up as a pure MTA, a part of the normal SMTP-speaking Internet mail path. When you hit a wall in development, when you find yourself hard put to think past the next patch, it's often time to ask not whether you've got the right answer, but whether you're asking the right question. Perhaps the problem needs to be reframed. There is a more general lesson in this story about how SMTP delivery came to fetchmail. It is not only debugging that is parallelizable; development and (to a perhaps surprising extent) exploration of design space is, too. When your development mode is rapidly iterative, development and enhancement may become special cases of debugging: fixing `bugs of omission' in the original capabilities or concept of the software.

The need for a working prototype to start with

It's fairly clear that one cannot code from the ground up in bazaar style [IN]. One can test, debug and improve in bazaar style, but it would be very hard to originate a project in bazaar mode. Linus didn't try it. I didn't either. Your nascent developer community needs to have something runnable and testable to play with. When you start community-building, what you need to be able to present is a plausible promise. Your program doesn't have to work particularly well. It can be crude, buggy, incomplete, and poorly documented. What it must not fail to do is (a) run, and (b) convince potential co-developers that it can be evolved into something really neat in the foreseeable future.

Commercial design doesn't always get the best results!

One of the best-known folk theorems of software engineering is that 60% to 75% of conventional software projects either are never completed or are rejected by their intended users. If that range is anywhere near true (and I've never met a manager of any experience who disputes it) then more projects than not are being aimed at goals that are either (a) not realistically attainable, or (b) just plain wrong.
Topic revision: 17 Aug 2009, JamesBusser
 