John Keklak's Coaching Comments

Wednesday, November 03, 2004

History lessons...

A huge amount of time, money and brainstrain is spent on trying to figure out existing code. It is my experience that far more is spent on this activity than on actually writing code or even fixing bugs. I'd like to share with you some experiences and a suggestion that will significantly reduce this expenditure.

I'll start with the suggestion: Make certain software developers learn the history of the code they are working on. Once they do this, they can move much more quickly. Let me explain how I came to this insight through a few of my experiences.

In several situations, I've found myself surrounded by volumes of code and a list of enhancements to make and bugs to fix. Understanding what the client wanted was the easy part. The hard part was becoming fluent in the code so I could make the appropriate changes.

In one situation, it was just me and the code -- the programmers who had written the code were long gone. To make things worse, it was pretty badly written -- Windows code written by developers fairly new to Windows.

I managed to add some instrumentation code that revealed enough of what was going on so I could fix some of the worst bugs, but I never got to the point where I really felt fluent with the code. The main reason was there were vestiges of things which didn't seem to belong, but nonetheless there they were. I could only theorize about these vestiges of code -- did they have something to do with logic that had long since been changed or removed? Or were they the beginnings of some project that had been abandoned? Were they still relevant? Since I wasn't being paid to do archaeology, I never got around to figuring out the reasons for these vestiges, but -- in the back of my mind -- they always bothered me. I always took these vestiges into account when modifying code, but I always had the feeling this was just a ritual.

For example, many objects had a member named 'm_revision', which seemed to be incremented whenever a change occurred. Clearly, the intent of this member was to give each change a revision number. The problem was that I wasn't entirely sure that these revision numbers were really used anywhere -- there was logic that depended on them, but it was not clear this logic was ever executed. I decided to dutifully make sure 'm_revision' was incremented properly just in case there were situations where this logic was executed.
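
To make the pattern concrete, here is a minimal sketch of what such a member might look like. Only the name 'm_revision' comes from the client's code; the class Shape, its setName method and the hasChangedSince function are hypothetical stand-ins I made up for illustration, not a reconstruction of the actual code.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical class for illustration; only the member name m_revision
// is taken from the story above.
class Shape {
public:
    // Every mutating method is expected to bump the revision counter.
    void setName(std::string name) {
        m_name = std::move(name);
        ++m_revision;
    }

    const std::string& name() const { return m_name; }
    unsigned revision() const { return m_revision; }

private:
    std::string m_name;
    unsigned m_revision = 0;  // incremented on each change
};

// Hypothetical consumer: logic that depends on the revision number.
// Reading the code alone does not tell you whether anything still calls it.
bool hasChangedSince(const Shape& shape, unsigned lastSeenRevision) {
    return shape.revision() != lastSeenRevision;
}

// Small usage example showing the invariant a maintainer must preserve.
void exampleUsage() {
    Shape shape;
    unsigned seen = shape.revision();
    shape.setName("circle");
    assert(hasChangedSince(shape, seen));
}
```

Anyone who adds a new mutating method to a class like this has to remember the increment, whether or not anything still reads the counter.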

Now imagine if I knew that 'm_revision' wasn't really used anywhere. The code for incrementing it and using it, then, was unnecessary. One option: I could cut corners and skip making sure my changes incremented 'm_revision' properly. However, this approach would probably make a bigger mess than continuing to make sure it was incremented. A better approach would be to remove all code which mentioned 'm_revision', which I could do with confidence if I could talk to a developer who could confirm that this was OK to do. As I mentioned earlier, the developers were long gone, so I had no practical choice but to operate on the assumption that 'm_revision' was necessary.

Some time later, I found myself in the midst of another client's body of source code, somewhat better written, but massive -- literally thousands of classes and millions of lines of code. Once again, my mission was clear, but the source code, to a great extent, was incomprehensible. I spent much more time than I would ever care to admit trying to get fluent with the code.

A key difference this time was I still had access to many of the programmers who had written this code. While many of them found my questions annoying, a few developers found a few hours for me to lead them through a long series of questions. Some of the discussions were merely about the meanings and purposes of certain classes. However, the most valuable discussions revealed how the developers came to create the classes that they did -- e.g. "First we defined these classes, but then we encountered certain problems, so we solved these problems by introducing these other classes", etc. Knowing how the code had evolved gave me a quick fluency which simply memorizing classes and purposes could never do. The explanations for why classes were created served as landmarks in the code, quickly making it familiar terrain. Although it has been a number of years since I interviewed these developers, my fluency with these classes remains.

I now think about how valuable it would have been to interview the developers who created 'm_revision'. With the history of 'm_revision' and its entire body of code, I could have developed a confident fluency, and I could have done much more for the client.

The lesson? Make it part of your software development culture to pass on the history of the code. History gives you fluency and landmarks that make code feel familiar. Familiarity is what allows software developers to make code changes confidently and quickly.

The best way to pass on the history of the code is to write it down -- in a Word file, in the source code, or on a company intranet. And write it down while the original developers can still be located. Regularly add stories and explanations from each development cycle. Who should do the writing? All developers, perhaps; a developer who writes English well is a much better choice.

Finally, the beauty of this documentation is you don't have to update it -- the code may change, but its history does not.