This is a short new book by Kent Beck, almost 100 pages long #surprising. Most of the chapters are one or two pages long.
Foreword
I liked an old quote by Benjamin Brewster in the Foreword: "In theory, there is no difference between theory and practice, while in practice, there is." According to Larry Constantine, who wrote the Foreword and coined the terms Cohesion and Coupling, cohesion and coupling are how our brains deal with complicated systems: by using relationships between pieces of code. Nice and tidy. That's the theory.
Preface
Beck calls his design "Empirical". Why? Most debates in software design are about microservices, how big repositories should be, events versus explicit calls, objects versus imperative code. These "What" debates hide a more fundamental disagreement in software design. What Beck misses in the debates is "When". From the "When" perspective we have two extremes, plus Beck's position between them:
- Speculative design - we know what we will want to do next, so we prepare for it upfront. When the software is in production, we will never have the chance to redesign it. So let's do it today.
- Reactive design - features are everything anybody cares about so let's design as little as possible upfront and get back to features. When the features are almost impossible to add again, only then turn to design again.
- Empirical design - Beck likes to think his method is somewhere in the middle of the two extremes. When a certain class of features feels hard to add, we design until the pressure is relieved. We start with just enough design to get the feedback loops going.
Introduction
Beck's personal mission is to help geeks feel safe in the world. That means doing everything, including design, in small safe steps. Beck misses "How much?" and "When?" in the descriptions of software design #interesting. He likes that design is an intellectual puzzle and that sometimes it has a positive snowball effect, when the benefits of a small redesign become bigger than originally expected.
Part I - Tidyings
Beck's general learning strategy is to go from concrete to abstract #interesting. That's why this book starts with tidyings. Tidyings are cute little fuzzy refactorings which nobody could possibly hate. The word "Refactoring" took fatal damage when people started to use it to refer to pauses in feature development. The same people took away the "don't change behavior" rule, so what "Refactoring" means to project stakeholders is: nothing to show at the end plus possible damage. No thank you. #funny
Chapter 1 - Guard Clauses
Instead of having all the method's code indented inside a single if clause at the start of the method, reverse the condition and return the trivial value immediately. We are long past the FORTRAN days and their "single return per function" recommendation. We should not overdo guard clauses, though. More than 6 guard clauses are not easier to read. We can also extract a method if that enables us to tidy to a guard clause.
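A minimal Python sketch of the tidying (not taken from the book; the function names and the shipping formula are made up for illustration). Both versions behave the same, only the nesting changes:

```python
def shipping_cost_nested(order):
    # Before: the real work is buried inside nested conditions.
    if order is not None:
        if order["items"]:
            return 5.0 + 0.5 * len(order["items"])
        else:
            return 0.0
    else:
        return 0.0


def shipping_cost(order):
    # After: guard clauses return the trivial value immediately.
    if order is None:
        return 0.0
    if not order["items"]:
        return 0.0
    # The interesting case is no longer indented.
    return 5.0 + 0.5 * len(order["items"])
```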
Chapter 2 - Dead Code
Just delete it. If you only suspect that the code is unused, pre-tidy by logging its usage, deploy that to production, and check the logs before deleting.
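The "log first" pre-tidy could look roughly like this in Python. The function `legacy_discount` and the logger name are hypothetical, and how long to wait before trusting the logs is a judgment call:

```python
import logging

logger = logging.getLogger("dead_code_probe")


def legacy_discount(price):
    # Pre-tidy: record every call so production logs reveal whether
    # this suspected dead code still runs. Behavior is unchanged.
    logger.warning("legacy_discount called; suspected dead code")
    return price * 0.9
```

If the message never shows up in production, deleting the function becomes a much safer tidying.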
Chapter 3 - Normalize Symmetries
What other people call consistency, Beck calls normalizing symmetries. Beck wants to normalize all instances of some pattern (e.g. lazy initialization) in the code so that they are written in one consistent manner. One by one.
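A small sketch of what a normalized lazy initialization could look like in Python. The `Report` class is hypothetical; the point is only that both accessors use the same shape:

```python
class Report:
    def __init__(self, raw):
        self._raw = raw
        self._rows = None
        self._total = None

    def rows(self):
        # The agreed form: check for None, compute once, return.
        if self._rows is None:
            self._rows = [int(x) for x in self._raw.split(",")]
        return self._rows

    def total(self):
        # Before tidying, imagine this accessor used a different shape,
        # e.g. an early return when the value is already cached.
        # After tidying it reads exactly like rows().
        if self._total is None:
            self._total = sum(self.rows())
        return self._total
```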
#rant I miss some discussion or at least a mention of a team agreement here. After reading this chapter one developer could start "optimizing" the whole project according to his one and only possible design. Also some discussion about automating these team-agreed tidyings is IMHO missing, e.g. eslint, checkstyle, custom rules etc.
Chapter 4 - New Interface, Old Implementation
Create an interface we wish we could have and implement it using the old interface. I like the term wishful programming for this approach, which I took from another book. The same approach applies when you:
- Code backwards - from the last line of the routine as if you have all the necessary calculations already there. #interesting
- Code test first - as if the production code already existed (and had an ideal API).
- Design helpers - if only you had this service / helper utility, the work would be so much easier.
I worry about inconsistencies if this particular tidying were overused, e.g. several people creating many new interfaces on top of one non-ideal old interface. I would prefer some discussion about the completeness of the refactoring. I would lean toward completing the transition to the new interface. #rant
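A sketch of the idea in Python. Both `old_send` (standing in for an awkward existing interface) and `Notifier` (the interface we wish we had) are invented for this example, as are the default values:

```python
def old_send(host, port, payload, retries, timeout_s):
    # Stand-in for the old interface: too many knobs at every call site.
    return f"{host}:{port} <- {payload} (retries={retries}, timeout={timeout_s})"


class Notifier:
    """The interface we wish we had, implemented on top of old_send."""

    def __init__(self, host, port=25):
        self._host = host
        self._port = port

    def notify(self, message):
        # Pass through to the old implementation with sensible defaults.
        return old_send(self._host, self._port, message, retries=3, timeout_s=10)
```

New call sites use `Notifier(...).notify(...)`. Old call sites keep working until they are migrated, which is exactly where the completeness question comes in.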
Chapter 5 - Reading Order
We should reorder the content of a file into the order in which a reader would prefer to read it.
Chapter 6 - Cohesion Order
We should reorder the code so that elements you need to change are adjacent.
Chapter 7 - Move Declaration and Initialization together
Variables and their initialization seem to drift apart sometimes. Tidy this by moving the declaration next to the initialization. If we accidentally break the code, we go back and move in smaller steps. That's the tidying way.
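Python has no separate declarations, so the sketch below uses a placeholder assignment to play the role of the declaration. The function names are made up:

```python
def summarize_before(values):
    total = 0
    count = 0  # "declared" here with a placeholder value ...
    for v in values:
        total += v
    count = len(values)  # ... but only given its real value here
    return total, count


def summarize(values):
    total = 0
    for v in values:
        total += v
    # After: the variable appears exactly where it gets its value.
    count = len(values)
    return total, count
```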
Chapter 8 - Explaining Variables
If we encounter a complex expression, extract an intermediate result into a variable named after the intent of the code.
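For illustration, a hypothetical layout calculation in Python, before and after the tidying (the dictionary keys and the 4-unit gap are assumptions of this sketch):

```python
def label_position_before(box, label):
    return (box["x"] + (box["width"] - label["width"]) / 2,
            box["y"] + box["height"] + 4)


def label_position(box, label):
    # Each intermediate result gets a name that states its intent.
    centered_x = box["x"] + (box["width"] - label["width"]) / 2
    below_box_y = box["y"] + box["height"] + 4
    return (centered_x, below_box_y)
```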
Chapter 9 - Explaining Constants
When we understand the meaning of a literal constant, extract it into a symbolic constant. There are downstream tidyings from this one, like grouping constants that change together, etc.
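A minimal sketch in Python. The handler functions and the constant name are invented for the example:

```python
def handle_before(status):
    if status == 404:  # What does 404 mean here? The reader has to know.
        return "show not-found page"
    return "render"


# After: the literal is named once, where its meaning is understood.
PAGE_NOT_FOUND = 404


def handle(status):
    if status == PAGE_NOT_FOUND:
        return "show not-found page"
    return "render"
```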
Chapter 10 - Explicit Parameters
Don't hide input or output parameters inside a map. Don't read global variables deep inside the call hierarchy. Pass them explicitly.
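One way to get there in small steps, sketched in Python with made-up names: split the routine so that a thin front part unpacks the map and the real work takes explicit parameters.

```python
def area_before(params):
    # The reader has to dig to find out which keys are actually needed.
    return params["w"] * params["h"]


def area(width, height):
    # After: the inputs are visible in the signature.
    return width * height


def area_from_params(params):
    # Thin wrapper that keeps existing callers working while they migrate.
    return area(params["w"], params["h"])
```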
Chapter 11 - Chunk Statements
If we find two groups of statements that do different things, separate them with a blank line.
Chapter 12 - Extract Helper
We can do this tidying when we see a part of a routine which has not much in common with the rest of the routine. We extract a method from it. Or we can do it when we need to change something: we first extract the code to a method, then make the change, then inline the method back. We will very often find out that the extracted method is useful and keep it. Another example is when there is a temporal coupling between two method calls. We can then extract a new method which takes care of the temporal coupling and calls the methods in the correct order.
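The temporal coupling case could be sketched like this in Python. `Connection` is a hypothetical resource whose methods must be called in order, and `send_once` is the extracted helper:

```python
class Connection:
    """Hypothetical resource: open() must come before send()."""

    def __init__(self):
        self.is_open = False
        self.sent = []

    def open(self):
        self.is_open = True

    def send(self, data):
        if not self.is_open:
            raise RuntimeError("send() called before open()")
        self.sent.append(data)

    def close(self):
        self.is_open = False


def send_once(connection, data):
    # Extracted helper: the correct order lives in exactly one place,
    # and close() runs even if send() fails.
    connection.open()
    try:
        connection.send(data)
    finally:
        connection.close()
```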
Chapter 13 - One Pile
Sometimes the code is split into many tiny pieces but it seems harder to understand than if they were inlined together. Inline them together to one big pile (a method). Then tidy from there. Some symptoms of the problem:
- Long, repeated list of arguments in methods. #interesting
- Repeated code, especially conditionals. #interesting
- Poor naming of helper methods.
- Shared mutable data structures.
Chapter 14 - Explaining Comments
When we finally understand a piece of code, that's a valuable moment. Capture it in a comment. If we find a file without a header comment, consider adding a comment about why the reader may find reading the file useful. #interesting
Chapter 15 - Delete redundant Comments
When you find a comment that says exactly what the code does, delete it. Tidyings often chain together. Previous tidying might make a comment redundant.
Part II - Managing
Get used to designing the code a little all the time #important. Tidyings are gateway refactorings. The title of the book is Tidy First? with the emphasis on the question mark. Beck acknowledges that just because you can tidy doesn't mean you always should.
Chapter 16 - Separate Tidying
Imagine we need to add a feature to the code. Usually we do a couple of tidyings in the process. If B is a behavioral change in the code and S is a structure change (a tidying), then we might make changes in the order BSSBSBBS. Now we might or might not want to split the tidyings into separate commits or pull requests. It might be split like: B,SS,B,S,BB,S or even B,S,S,B,S,BB,S. Or it might not be split at all. One big pull request could be too big for constructive review feedback. Too many PRs could be overwhelming. It also depends on code review latency. If we get feedback fast, we are encouraged to create smaller pull requests, which encourages even faster reviews. If the latency is high, we are encouraged to create one big pull request, which slows reviews further. Beck encourages us to experiment with not requiring code review for tidyings.
Chapter 17 - Chaining
Stick to tiny tidying steps. Tidyings tend to chain. You implement one and you see another. Be wary of chaining too much, too fast. A failed tidying is expensive relative to the cost of a series of small successful tidyings. This chapter provides a lot of examples of which tidying could chain with which.
Chapter 18 - Batch Sizes
How much tidying do you need to do? Tidying is not looking toward a far future #important. Tidying meets an immediate need. How much tidying will be easy to integrate and deploy? There are several disadvantages of batching tidyings together:
- Collisions - e.g. version control conflicts.
- Interactions - chance of changing the behavior rises.
- Speculation - higher chance that we tidy just because, not based on the immediate need.
That said, total review costs rise as the tidying batches get smaller, because there are more of them to review. What is the solution? Reduce the cost of review for tidyings. Allow deploying tidyings without a review #interesting. This is possible in teams with a strong culture and trust.
Chapter 19 - Rhythm
How much time should tidying take? Minutes up to an hour #important. More than an hour of tidying probably means that you lost track of the minimum set of structure changes you need to enable the desired behavior change. If the code is one big ball of mud, then it won't stay like this for long. There is a strong "pave the path" tendency: 80% of the changes occur in 20% of the files. So yes, sometimes tidying can take longer than one hour, but not for long.
Chapter 20 - Getting Untangled
Let's say you have the behavioral changes B and tidyings S mixed together in one pull request, e.g. SBSSBSSSBB. You have at least 3 options, none of them attractive:
- Ship it AS-IS. This is impolite to reviewers and prone to errors, but it's quick.
- Untangle the tidyings into separate PRs or commits. This is more work.
- Discard the work and start over, tidying first.
Beck recommends that we experiment with the third approach #interesting.
Chapter 21 - First, After, Later, Never
When to tidy?
- Never - if it ain't broke, don't fix it. Some systems are truly static and there is no need to change them. There is nothing to be learned from improving the design.
- Later - if you have a big list of tidyings without immediate payoff. Make a list of future tidyings (Beck calls this his "Fun" list because he has an odd notion of fun #funny). Then when you have an hour, you can take on tidying #4 from the list.
- After - tidy after if you are going to change the same area again soon, if it is cheaper to tidy it now, and if the cost of tidying is roughly in proportion to the cost of the behavior changes.
- First - if it will pay off immediately either in improved comprehension or cheaper behavior changes. If you know what to tidy and how.
Part III - Theory
Why tidy? Theory does not convince. When to start making design decisions? When to stop? How to make the next decision?
Chapter 22 - Beneficially Relating Elements
Beck calls software design beneficially relating elements as in biology #interesting. Substantial structures have parts, e.g. tokens -> expressions -> statements -> functions -> objects/modules -> systems. Elements have boundaries. Relationships between software elements are e.g. invoking, publishing, listening, referring (as in fetching). The benefit here is, for example, that element A profits from the existence of element B, because element B takes over something complex and hides that complexity. From this point of view the designer of a system can only:
- Create and delete elements.
- Create and delete relationships.
- Increase the benefit of a relationship.
E.g. when box.width() * box.height() is replaced with box.area(), a new element was created and the relationship between the caller and the box is improved.
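In Python the box example could look like the sketch below. The `Box` class is an assumption of this sketch; only `width()`, `height()` and `area()` come from the text:

```python
class Box:
    def __init__(self, width, height):
        self._width = width
        self._height = height

    def width(self):
        return self._width

    def height(self):
        return self._height

    def area(self):
        # New element: callers no longer need to know how area is derived.
        return self._width * self._height
```

A caller that used to write `box.width() * box.height()` now writes `box.area()`, so its relationship to the box got simpler.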
Chapter 23 - Structure and Behavior
Software creates value in two ways - what it does today and the possibility of new things we can make it do tomorrow. Behavior can be described in two ways - input/output pairs and invariants. Structure creates options, i.e. what the software could do next, so it also makes money. The structure could make it easy to add new countries to our application or it could make it hard.
Chapter 24 - Economics: Time Value and Optionality
A dollar today is worth more than a dollar tomorrow, so earn sooner and spend later. In a chaotic situation, options are better than things, so create options in the face of uncertainty. Software design has to balance these two frequently opposing forces.
Chapter 25 - A Dollar Today > A Dollar Tomorrow
This encourages us to tidy after rather than tidy first. The only exception is when tidying first gives us an immediate return on investment.
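The "dollar today" idea is the standard discounting formula, present value = amount / (1 + rate)^years. A tiny Python sketch, where the 10% rate is purely an assumption for illustration:

```python
def present_value(amount, annual_rate, years):
    # Standard discounting: money received later is worth less today.
    return amount / (1 + annual_rate) ** years


# Assumed 10% annual rate: 110 received in a year is worth about 100 today,
# while 110 spent a year from now costs only about 100 in today's money.
value_today = present_value(110, 0.10, 1)
```

Read this way, delaying the cost of tidying (spend later) and shipping the behavior change first (earn sooner) both push in the direction of tidying after.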
Chapter 26 - Options
Lessons learned from stock options:
- An option to implement behavior has value of its own, even before the behavior is implemented.
- Options to implement behavior are more valuable if there are many options.
- The options are more valuable if the behaviors are more valuable.
- We don't have to care which item is most valuable as long as we keep the option to implement it.
- The more uncertain our predictions of value are, the greater the value of the option is (versus just implementing it). So we should embrace change.
Software design is preparation for change of behavior. Design is the premium we pay for the option of buying the behavior change tomorrow. The more volatile the behavior change the better.
Chapter 27 - Options Versus Cash Flows
The only point here is that we should get used to being aware of the incentives, i.e. the opposing forces of creating options versus cash flows (dollar today > dollar tomorrow).
#rant As I understood it, the last 5 chapters were just a long story justifying software design from the economic point of view - we should motivate stakeholders to allow us to design by using the "more options" incentive.
Chapter 28 - Reversible Structure Changes
Structure changes and most design decisions are generally reversible #interesting. Even "extract as a service" can be made easily reversible by clever usage of feature flags. If we get halfway into it and realize this is one of those services that really could have been a SQL query, we can change it without too much fuss #interesting.
Chapter 29 - Coupling
Expensive software is where changing one element forces us to change many other elements #interesting. In cheap software we tend to make only localized changes. There is a 1:N relationship, where one change could make us change N other elements. Then there is a much bigger problem - cascading changes, where one change causes a ripple effect because of recursive 1:N relationships.
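The cascade can be pictured as reachability in a graph of "if X changes, Y must change too" edges. A rough Python sketch, with a hypothetical graph format (a dict from element to its directly coupled elements):

```python
from collections import deque


def elements_to_change(coupled_to, changed):
    """Return every element that a change to `changed` ripples into."""
    seen = {changed}
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        # Direct 1:N coupling of the current element.
        for dependent in coupled_to.get(current, ()):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)  # follow the cascade further
    seen.discard(changed)
    return seen


# Hypothetical coupling graph: a -> b, c; b -> d; d -> e.
graph = {"a": ["b", "c"], "b": ["d"], "d": ["e"]}
ripple = elements_to_change(graph, "a")
```

Here changing `a` directly touches only two elements, but the recursive coupling pulls in `d` and `e` as well.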
Chapter 30 - Constantine's Equivalence
cost(software) ~= cost(change), because 70+% of the software cost is the cost of changes.
cost(software) ~= cost(change) ~= cost(big changes) ~= coupling
Chapter 31 - Coupling vs Decoupling
You can face the cost of coupling or pay the price of decoupling and reap the benefits. You can also fall anywhere along this continuum. That's one of the reasons software design is hard.
Chapter 32 - Cohesion
Coupled elements should be subelements of the same container. Elements that are not coupled should go elsewhere. But make no sudden moves. Proceed in tiny steps. Maybe improve cohesion for just one element at a time. Follow the Scout rule ("leave it better than you found it").
Chapter 33 - Conclusion
We should not get carried away with tidying. When we practice tidying, we are preparing the design on behalf of others like us. It should be an ordinary part of development. This book was about design by and for individuals. The next book will be about us and our programmer colleagues, and the one after that will be about relationships with all stakeholders. Beck aspires to make software design an exercise in human relationships. So should we tidy first? Likely yes. Just enough.