First Write, Second Copy, Third Refactor

From Programmer 97-things


It is difficult to find the right balance between code complexity and reusability. Both under- and overengineering are always around the corner, but some symptoms can help you recognize them. Underengineering is often revealed by excessive code duplication. Overengineering is more subtle: too many abstract classes, overly deep class hierarchies, unused hook methods, and even interfaces implemented by only one class can all be signs of it, unless they exist for a good reason, such as encapsulating external dependencies.
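As a rough illustration of those symptoms, the sketch below puts an abstract class, an unused hook method, and a single concrete subclass next to a plain function with the same behavior. All names (`AbstractGreeter`, `EnglishGreeter`, `greet`) are hypothetical and chosen only for this example.

```python
from abc import ABC, abstractmethod


# Overengineered variant: an abstract base class, a hook that no subclass
# overrides, and exactly one concrete implementation.
class AbstractGreeter(ABC):
    def greet(self, name):
        self.before_greet(name)  # hook method that nobody uses
        return self.format_greeting(name)

    def before_greet(self, name):
        pass

    @abstractmethod
    def format_greeting(self, name):
        ...


class EnglishGreeter(AbstractGreeter):
    def format_greeting(self, name):
        return f"Hello, {name}!"


# Plain variant: the same behavior without the scaffolding.
def greet(name):
    return f"Hello, {name}!"
```

Both variants return the same string for the same input. The first one simply carries structure that nothing uses yet.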

It is said that late design can be difficult, error-prone, and time-consuming, and that a complete lack of design leads to messy, unreusable code. On the other hand, early engineering can introduce both under- and overengineering. Up-front engineering makes sense when all the details of the problem are well defined and stable, or when you believe you have a good reason to enforce a given design. The first condition, however, is quite rare, while the second confines your future options to a predetermined solution, often preventing you from discovering a better one.

When you are working on a problem for the very first time, it is a difficult (and perhaps even useless) exercise to try to imagine which parts of it could be generalized to allow better reuse and which could not. If you generalize too early, there is a good chance that you are jumping the gun: introducing unnecessary complexity where nobody will take advantage of it, while failing to make the code flexible and extensible at the points where that would really be useful. Moreover, you may never need that algorithm anywhere else, so why waste effort making something reusable that won't be reused?

So the first time you implement something new, write it in the most readable, plain, and effective way. It is definitely too early to put the general part of your algorithm in an abstract class and move its specialization to a concrete one, or to employ any other generalization pattern from your experience-filled programmer's toolbox. The reason is very simple: you do not yet have a clear idea of the boundary that divides the general part from the specialized one.
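A minimal sketch of such a first implementation, assuming a hypothetical task of rendering orders as CSV text. The function name `orders_to_csv` and the `id` and `total` fields are invented for illustration, and CSV quoting is deliberately ignored.

```python
def orders_to_csv(orders):
    """Render orders as CSV text: a header line, then one line per order."""
    lines = ["id,total"]
    for order in orders:
        # Totals are printed with two decimal places.
        lines.append(f"{order['id']},{order['total']:.2f}")
    return "\n".join(lines)
```

Nothing here is configurable, and that is the point: with only one use, there is no evidence yet about which parts would need to vary.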

The second time you face a problem that resembles the one you solved before, the temptation to refactor that first implementation to accommodate both needs is even stronger. But it may still be too early. It may be a better idea to resist that temptation and do the quickest, safest, and easiest thing that comes to mind: copy your first implementation, being sure to note the duplication in a TODO comment, and rewrite the parts that need to change.
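The copy step might look like the sketch below, which assumes a hypothetical first function `orders_to_csv` and a second, similar need for customers. Both function names and all field names are illustrative only.

```python
def orders_to_csv(orders):
    """First implementation: written plainly, with no thought of reuse."""
    lines = ["id,total"]
    for order in orders:
        lines.append(f"{order['id']},{order['total']:.2f}")
    return "\n".join(lines)


# TODO: near-duplicate of orders_to_csv. Generalize if a third use appears.
def customers_to_csv(customers):
    """Second implementation: a copy with only the differing parts rewritten."""
    lines = ["id,name"]
    for customer in customers:
        lines.append(f"{customer['id']},{customer['name']}")
    return "\n".join(lines)
```

The TODO comment records the debt without paying it yet. Two data points already hint at what varies (the header and the per-row formatting), but not reliably enough to design around.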

When you need that solution for the third time, even if only to satisfy a slightly different requirement, the time is right to put your brain to work and look for a general solution that elegantly solves all three of your problems. Now you are using that algorithm in three different places and for three different purposes, so you can easily isolate its core and make it usable for all three cases, and probably for many subsequent ones. And, of course, you can safely refactor the first two implementations because you have unit tests that prove you are not breaking them, don't you?
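One possible shape for that third-time refactoring is sketched below, assuming three hypothetical CSV exporters for orders, customers, and invoices. The shared core `to_csv`, its `columns` parameter, and every field name are assumptions made for this example, and CSV quoting is still ignored. The assertions at the end stand in for the unit tests that existed before the refactoring.

```python
def to_csv(rows, columns):
    """Shared core extracted from the three cases.

    columns is a list of (header, extract) pairs, where extract maps a row
    to the string that belongs in that column.
    """
    lines = [",".join(header for header, _ in columns)]
    for row in rows:
        lines.append(",".join(extract(row) for _, extract in columns))
    return "\n".join(lines)


# The two existing functions now delegate to the core.
def orders_to_csv(orders):
    return to_csv(orders, [
        ("id", lambda o: str(o["id"])),
        ("total", lambda o: f"{o['total']:.2f}"),
    ])


def customers_to_csv(customers):
    return to_csv(customers, [
        ("id", lambda c: str(c["id"])),
        ("name", lambda c: c["name"]),
    ])


# The third use that justified the refactoring.
def invoices_to_csv(invoices):
    return to_csv(invoices, [
        ("number", lambda i: i["number"]),
        ("due", lambda i: i["due"]),
    ])


# Checks written against the original implementations still pass.
assert orders_to_csv([{"id": 1, "total": 9.5}]) == "id,total\n1,9.50"
assert customers_to_csv([{"id": 7, "name": "Ada"}]) == "id,name\n7,Ada"
```

Three concrete uses made the boundary visible: the line-building loop is general, while the headers and per-column formatting are the specialized part passed in by each caller.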

By Mario Fusco

This work is licensed under a Creative Commons Attribution 3
