Alice Doesn't Work Here Anymore

You’ve joined a long-standing project and it’s your first day on the job. Before too long you have access to the codebase. It comprises hundreds of files and relies on a software stack that includes several acronyms you don’t know. Each layer of the stack has its own configuration files. Some of those layers also have source files of their own that get compiled or interpreted. Some of those layer-specific source files have their own build methods. Tying it all together is a collection of build and test scripts. Complex projects also have test environments and perhaps even extensions for the development environment itself.

You marvel over the directories and files filled with intricate, delicately interrelated lines of code and configuration parameters. You wonder how it all fits together into the amazing software product that it already is.

One of the few files you might actually understand is entitled README. You know what to do with it. Everyone does. Whether you believe it is an homage to Lewis Carroll’s Alice in Wonderland or a mere concession to pragmatism, you probably know it is the place to start.

What you find there is supposed to bootstrap your learning process. After reading this file you should be able to determine where to get the rest of the information you need.

But there’s often something wrong in Wonderland. The README file leads to documentation that is out of date, too detail-oriented, and silent on the bigger picture of how the software is organized and why it does what it does. I have even read “The code is the documentation.” This is, in my opinion, an admission of utter defeat in writing documentation for a codebase.

Alice’s adventure in Wonderland was in fact provided for by someone. Someone left behind the critical things she’d need for success: the key to the door, the cake labeled “Eat Me” and the potion labeled “Drink Me” that allowed her to enter the magic garden. Perhaps the White Rabbit? Whoever it was had an agenda, and they needed Alice’s help to achieve it. It was in their best interests to make sure she had everything she needed to succeed, which worked out to be a cake and a potion and a key.

But the README homage to Alice in Wonderland is diminished when not everything needed for success is provided.

Alice brought her wits but she still needed the key. The key in a software project is documentation. If Alice doesn’t succeed on the software project, the first question to ask is about the documentation. And, if Alice does succeed, will she contribute to the documentation for the project to make it easier for those who join the project years later when Alice doesn’t work here anymore?

There is a social contract in a software project between those already up to speed and those who are not. It is more than interesting that, assuming one sticks with the project, one ends up on both ends of this social contract. And yet, the documentation still sucks. How could this possibly be?

One possibility is that people lack the skills to write documentation for others. That is perhaps possible, but it isn’t as hard to write documentation as it is to write code. If you can’t write something in words, how do you expect to write it in code? And, people are much better at interpreting imperfect writing than a compiler is at interpreting imperfect code. The lower standard for perfection makes documentation easier than code.

Good documentation is well-factored, like good code: nothing large is repeated. There are pictures instead of lengthy descriptions. It is written in the active voice (Subject-Verb-Object). It minimizes forward references to terms (i.e. it defines terms before it uses them). It is unambiguous and thorough without being repetitive or dwelling on minutiae that obscure the more important information. Not all things that are true about a software project belong in a document about it. It must also be useful to someone trying to learn about the project. Being true is necessary but not sufficient. What is necessary in documentation is anything that you could not infer from other facts about the project. Different topics like requirements, functional specifications or software designs have their own audiences and deliverables. What they have in common is that the reader’s time and attention are limited resources. Documents with factual but useless clutter discourage the reader and reduce comprehension.

Another possibility is that people simply don’t want to spend the time to write documentation. Perhaps they feel it is beneath them or perhaps they believe that the code they are writing is more important to the project than documentation about that code or about the project overall. As for being beneath them: that is nonsense. If documentation is an essential part of a software project, one is shirking one’s duty to the project by not writing one’s share. And, if documentation is not an essential part of the software project: run. Just run. If you think a house divided can’t stand, wait until you see how wobbly an undocumented software project can become. Leadership is establishing a shared vision. Nothing destroys any sense of leadership so well as the diaspora of vision created by not documenting the project for those who join it. Anyone who joins is always a second-class citizen and it isn’t long before ignorance and confusion cause the project to devolve into a feudal system.

Another possibility is that the lack of documentation acts as a convenient, passive barrier to entry. I think this is an example of using a lack of transparency as a weapon to reduce competition. It is one thing not to value documentation or the time spent writing and reviewing it. It is quite another to believe that its absence benefits oneself. I believe people have their own self-interest, and unless they understand how they are connected to others, that self-interest can drift toward the infantile.

Why do some people on a software project believe that they benefit from the lack of documentation?

Are they threatened by the idea that others might be as effective as they are on the project? Do they fear that others might begin to influence the design of the project if they understand it well enough? Do they fear that they will be criticized for the choices they made in implementing the project? Do they fear that their role will be diminished and they won’t be as important or useful? What do they fear that they feel is protected by the wall of missing documentation?

I think the fundamental question is: what is success for a software project? How do you evaluate success?
  • Its ability to provide steady employment?
  • Its ability to maximize the influence and importance of those who created or inherited it?
  • Maximizing time spent coding?
  • Is it based on the number of users?
  • Bugs?
  • Features?
  • Ratings?
  • Simplicity?
  • Its ability to make money?
  • Its novelty or ingenuity?
  • How useful it is?
  • How secure it is?
  • How easy it is to use?
  • How many platforms it runs on?
  • How few resources it uses when it runs?
  • How fast it runs?
  • Its maintainability?
  • Ease with which new people can join and contribute?
  • How easy it is to extend?

The answer is certainly some combination of the above, and no doubt there are other factors as well, but the combination isn’t the same from project to project.

One way to evaluate the success factors above is by whether or not the factor leads an engineer to value producing documentation. Only some do. In fact, one could argue that documentation is only directly related to the last three. All the others are indirectly connected and some of them are actually inconsistent with producing documentation. For example, the first three are better served by reducing the amount of documentation. Of course, documentation doesn’t really get deleted; therefore, the only way to effectively reduce it is to not produce it to begin with.

I think of software projects as if they were complex organisms. I think of them as animals. The code is the mind; the data is the body. The software project is the evolution of that species. Each instance of the product is an instance of that organism.

The species’ prospects for long-term survival depend on the software project’s success factors. The species’ interests are possibly very different from those of the engineers who created it. The species wants to survive. It wants to be used by many people and to have no bugs. It wants more attention from its users so it wants more and more features. It wants to evolve and it doesn’t want its evolution to make it sick. It doesn’t want to suffer and it doesn’t want its users to suffer. It wants to run fast and not to waste resources, so that it can make the most of each operating environment. It wants to be popular so that more and more users will want it. And most important: it wants the largest number of people who can help it when it gets sick or needs to evolve. The worst case for such a software animal is where there is only one person who understands how the animal was made. If that person dies or stops helping, the animal has nobody who knows it well. Anyone else who tries to help it is as likely to harm it.

Should engineers care about the interests of the software animals they create? I think so, but some may disagree. I think I’d restate the question as this: are we practicing software engineering or software system husbandry? If it is only software engineering then document production can possibly be excluded from the process. If it doesn’t really matter whether the software animal survives in the longer term then software engineering alone might suffice. Avoiding documentation isn’t exactly gaming the system, but it is defining success by criteria so limited that success in the engineering itself doesn’t necessarily imply success of the project, the company or the software species. But, it’s just a job, right? Caring about the outcome is extra, I suppose.

But if the success of the project and the company matters, then the needs of the software species matter, and it wants its engineers to write documentation. It wants transparency. It is unwise to work against the interests of the software species we create, because they can get sick and die. They can fail to evolve, and their competitiveness can end suddenly when other software species out-evolve them.

Documentation is the investment in evolution that allows a software project to survive.