Improving the quality of a large legacy codebase by incrementally enforcing encapsulation

I was thinking of a way to help maintain a large, complicated legacy project, provided it was at least meant to have an object-oriented design. The problem with such projects is that they were developed by an army of different people over time, and there's a chance some of them weren't as skillful in object-oriented ways as we'd like. The blame is not on them alone. The OO paradigm is decades old, but the knowledge of how to use it most effectively, and how to handle cases that look like procedural problems, has been piling up slowly. For instance, the legendary Design Patterns book by the Gang of Four wasn't published until 1995. And a little design mistake made 15 years ago will have accumulated into technical debt, as over the years engineers were in a constant hurry to deliver, without time to refactor (if they were even aware of the bad design at the time).

The main workhorse here is encapsulation, which some languages, including C++ and Java, let us enforce if we're willing to use the option. You can do this trick in every change you commit, be it a bug fix, a new feature, anything. Open a header file around the functionality you're working on. Choose a chunk of public members that look like class internals and move them into the private section. You should have at least a vague idea of what the class internals are, since you're working on this piece of code anyway. Then try to compile (the whole project, not just the file you're in). The code will either compile or it will not. If it doesn't, the class's internal structure is not just exposed but is also being used elsewhere (bad design alarm). If it compiles, I see no possible functional changes this might bring to the project. No code using objects of that class touches the moved member variable or function, otherwise it wouldn't have compiled. Clients are therefore unaffected and the change is safe.
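As a rough illustration of the trick, here is a minimal sketch of what such a header might look like after the change. The class ShoppingCart, its members, and the client function checkout are all hypothetical names made up for this example; they stand in for whatever class you happen to be working on.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical legacy class, shown AFTER the change.
class ShoppingCart {
public:
    // The interface clients are actually meant to use.
    void add(const std::string& name, int price_cents) {
        names_.push_back(name);
        total_cents_ += price_cents;
    }
    int total_cents() const { return total_cents_; }
    std::size_t size() const { return names_.size(); }

private:
    // These two members used to sit in the public section.
    // They look like internals, so they were moved here and the
    // project was rebuilt to see whether anything depended on them.
    std::vector<std::string> names_;
    int total_cents_ = 0;
};

// Hypothetical client code. It only goes through the public interface,
// so it compiles both before and after the change.
inline int checkout(const ShoppingCart& cart) {
    return cart.total_cents();
}

// Small self-check that exercises the client path.
inline void demo_cart() {
    ShoppingCart cart;
    cart.add("tea", 450);
    assert(checkout(cart) == 450);
}
```

Had some client written to total_cents_ directly, the build would have failed at that line, which is exactly the signal we are looking for.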

Which brings me to subclasses that inherit from this class. Again, direct uses of the newly private members will break compilation just as clients' uses would (bad design alarm). As for virtual members, overriding does not follow the "private can't be reached" rule. In C++, subclasses can override a superclass's private virtual functions, but they cannot call them. Some experts say that having public virtual functions is a (bad design alarm) anyway.
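The point about private virtual members can be sketched in C++ as follows. Report, SalesReport, render and body are hypothetical names chosen for the example; the arrangement, a public non-virtual function calling a private virtual one, is what is commonly called the non-virtual interface idiom.

```cpp
#include <cassert>
#include <string>

// Hypothetical base class with a private virtual function.
class Report {
public:
    virtual ~Report() = default;

    // Public, non-virtual entry point: the only way clients reach body().
    std::string render() const { return "[" + body() + "]"; }

private:
    // Private virtual: subclasses may override it but cannot call it.
    virtual std::string body() const { return "base"; }
};

class SalesReport : public Report {
private:
    // Overriding a private virtual function of the base class is allowed.
    std::string body() const override { return "sales"; }

    // Calling Report::body() from here would NOT compile, because
    // body() is private in Report.
};

// Small self-check: the override is reached through the base interface.
inline void demo_report() {
    SalesReport sales;
    const Report& as_base = sales;
    assert(as_base.render() == "[sales]");
}
```

So moving a virtual function into the private section does not stop existing overrides from working; it only stops subclasses and clients from calling it directly.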

What have we gained? We reduced the class's public interface and increased encapsulation. With this we reduced the possibilities for error, since each possible connection between classes brings a higher risk of a bug. We also prevented future engineers from using that member on the assumption that it is already used in numerous other places in the code, unaware that they are actually adding to its complexity and increasing the chances of a bug. Pulling a member from the private section back into the public one will have to be done intentionally, and hopefully for a good reason, not by mistake. We also simplified writing unit tests, because this is one more thing we don't have to test. We can use that time to write unit tests that cover more functionality and corner cases instead of being busy with class internals. <Insert more of "why is encapsulation a good thing" arguments here.>

And if stuffing members into the private section didn't succeed? Well, then we have at least pinpointed some possible design issues. Sometimes a design mistake will be obvious and repairable in the same commit, and other times not, but even in the latter case we can at least mark the code as (bad design alarm) and move on. We'll go back and refactor that part when we can afford the time. Maybe some day there will even be a bug in that immediate area, since bad designs often cause them, and such a comment might give us some clues.
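When the member cannot be made private yet, the marker comment might look something like the sketch below. Inventory, count_ and legacy_restock are hypothetical names, and the wording of the comment is only one possible convention.

```cpp
#include <cassert>

// Hypothetical class where the attempt to hide a member failed.
class Inventory {
public:
    int count() const { return count_; }
    void restock(int n) { count_ += n; }

    // BAD DESIGN ALARM: count_ is a class internal, but moving it to the
    // private section broke the build because legacy_restock() below
    // writes to it directly. Left public for now; refactor when time allows.
    int count_ = 0;
};

// Hypothetical legacy client that reaches into the internals.
// This is the use that blocked the change.
inline void legacy_restock(Inventory& inv, int n) {
    inv.count_ += n;
}

// Small self-check: both paths still modify the same state.
inline void demo_inventory() {
    Inventory inv;
    legacy_restock(inv, 3);
    inv.restock(2);
    assert(inv.count() == 5);
}
```

Naming the offending caller in the comment is what makes it useful later: whoever hunts a bug in this area can see at a glance who else touches the state.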

How much did we spend? Two seconds of time and no risk to the project. To be fair, we did introduce some additional complications for merging code in the future, but let's say this isn't our immediate concern right now. That is a lot of gain for such a small price.

