Notes about A Philosophy of Software Design

My notes on the book by John Ousterhout. It is quite a short book. At first I made the mistake of reading only the Conclusions of the chapters, but a lot of the content is not covered there. After reading the whole book properly, I came to like the Red Flags and the other small hints.

Chapter 2 - The Nature of Complexity

Complexity is anything related to the structure of a system that makes it hard to understand or modify. Isolating complexity in a place where it will almost never be seen or touched is almost as good as removing it.

There are many symptoms of complexity:
  • Change amplification - when a change requires modifications in many places.
  • Cognitive load - how much a developer must know in order to make a change. Sometimes more code is simpler than less code because of cognitive load.
  • Unknown unknowns - it is not obvious which places should be modified or what knowledge the developer must have to make a change.
Complexity is caused by 2 things - dependencies and obscurity.

Complexity is incremental - it accumulates over time in small chunks. We should adopt a "zero tolerance" philosophy.

Chapter 3 - Working Code is not Enough

According to Ousterhout, there are 2 high-level approaches to a problem: tactical, focused on getting the thing done as soon as possible, and strategic, with more time spent on upfront design. Ousterhout advocates the strategic approach and says it is cheaper than the tactical one in the long run.

The best approach seems to be to continually invest approximately 10-20% of our time in design.

One motivation for startups is word of mouth - if the codebase is poor, people will talk and hiring talent will become harder. Tactical programming is all too common among startups. Facebook is the biggest example.

Chapter 4 - Modules should be deep

Modular design is about making developers face as little complexity as possible. Therefore modules should have a simple API, even if the implementation is complex. They should be "deep".

The goal of modular design is to minimize the dependencies between modules. If we think about a module in terms of interface and implementation, then the implementation should be more complex than the interface.

The interface of the module is meant also in the non-technical perspective. If a developer must have some knowledge in order to use the module, it is part of the module's interface.

Red flag - shallow modules do not hide much of the complexity behind a simple interface and they should be avoided. Small modules tend to be shallow.
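
A minimal Python sketch of the difference. Both classes are hypothetical and invented for these notes, not taken from the book. The shallow class has an interface about as complex as its implementation, while the deeper one hides parsing, comment handling and defaults behind a single method.

```python
class ShallowStore:
    """Shallow: each method is a thin wrapper, so callers gain almost nothing."""

    def __init__(self):
        self.items = {}

    def put(self, key, value):
        self.items[key] = value

    def get(self, key):
        return self.items[key]


class Settings:
    """Deeper: one small method hides parsing, comments and defaults."""

    def __init__(self, text, defaults=None):
        self._values = dict(defaults or {})
        for line in text.splitlines():
            line = line.split("#", 1)[0].strip()  # drop comments and whitespace
            if "=" in line:
                key, value = line.split("=", 1)
                self._values[key.strip()] = value.strip()

    def get(self, key):
        """Return the configured value for key, or None if it is not set."""
        return self._values.get(key)


settings = Settings("port = 8080  # dev server\n\nhost=localhost", {"debug": "off"})
```

The caller of Settings only needs to know get(); the file syntax can change without touching any caller.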

A lot of small classes do not make a system simple.

Providing choice is good, but the common case should be made very simple to use in the interface. If an interface has many features, its effective complexity is the complexity of using the most common features.

Chapter 5 - Information Hiding (and Leakage)

This chapter presents a technique for creating deep modules (from the previous chapter). When creating a new module, we should think hard about what information could be hidden inside it. If the information is reflected in the interface of a module, it has effectively leaked.

Red flag - information leakage - if the same knowledge is used in multiple places, such as two different classes both understanding the format of a particular file type.
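
A small sketch of how the leak can be closed, assuming a made-up "name;age" record format. The class name PersonRecord and the separator are assumptions for illustration only. Instead of a writer class and a reader class that both know the format, one class owns it.

```python
class PersonRecord:
    """Single owner of the (hypothetical) "name;age" line format."""

    _SEPARATOR = ";"

    @classmethod
    def encode(cls, name, age):
        """Turn a person into one line of the file."""
        return f"{name}{cls._SEPARATOR}{age}"

    @classmethod
    def decode(cls, line):
        """Turn one line of the file back into (name, age)."""
        name, age = line.rsplit(cls._SEPARATOR, 1)
        return name, int(age)


line = PersonRecord.encode("Ann", 30)
person = PersonRecord.decode(line)
```

If the separator or the field order changes, only this class has to be modified.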

When designing modules, focus on knowledge that is needed to perform the task, not the ordering of the tasks.

Red flag - temporal decomposition - when execution order is reflected in the code structure.

A common mistake is to make classes too small, which leads to information leakage. It is sometimes beneficial to make a class bigger and deeper, so that it can handle the whole problem more easily.

Red flag - overexposure - an API whose common case forces users to learn about features they usually don't need. E.g. buffering support for reading files in Java should be the default and hidden.
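
For contrast, here is a short sketch of a common case that stays simple. In Java a FileInputStream has to be wrapped in a BufferedInputStream explicitly, while Python's built-in open() returns a buffered object when no buffering argument is given. The temporary file is created only to keep the example self-contained.

```python
import io
import os
import tempfile

# Create a small throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"first line\nsecond line\n")
    path = tmp.name

# Common case: no buffering parameter, yet the returned object is buffered.
with open(path, "rb") as f:
    is_buffered = isinstance(f, io.BufferedReader)
    first = f.readline()

os.remove(path)
```

Callers who really need unbuffered access can still ask for it, but nobody has to learn about buffering just to read a file.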

Chapter 6 - General-Purpose Modules are Deeper

Ousterhout thinks that over-specialization of modules may be the single greatest cause of complexity in software. He argues that general-purpose classes are better even if you never reuse the class. The phrase "somewhat general-purpose" means that the implementation should reflect the current needs, but the API of the module should be general-purpose.

In the example of storing text for an editor, he argues that overly specialized methods such as backspace and delete are bad design, because they force developers to learn too many methods. A more general-purpose API with a small number of methods is easier to understand and use. Even when frontend programmers need the backspace functionality, they would have to read its implementation in the text class to be sure how it behaves. Hiding this information behind the API just creates obscurity.

Questions to ask ourselves:
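
A minimal sketch of the idea, not the book's actual code: the Text class and the two key handlers are hypothetical, and positions are plain integers for brevity. The text class offers only insert and delete; backspace and the delete key live in the UI layer and are expressed with those two methods.

```python
class Text:
    """General-purpose text storage: two methods cover every edit."""

    def __init__(self, content=""):
        self._content = content

    def insert(self, position, new_text):
        self._content = self._content[:position] + new_text + self._content[position:]

    def delete(self, start, end):
        """Remove the characters in the range [start, end)."""
        self._content = self._content[:start] + self._content[end:]

    def __str__(self):
        return self._content


# UI layer: key handling is built from the general-purpose methods.
def on_backspace(text, cursor):
    """Delete the character left of the cursor and return the new cursor."""
    if cursor == 0:
        return cursor
    text.delete(cursor - 1, cursor)
    return cursor - 1


def on_delete_key(text, cursor):
    """Delete the character right of the cursor; the cursor does not move."""
    text.delete(cursor, cursor + 1)
    return cursor


doc = Text("hello")
cursor = on_backspace(doc, 5)
```

What backspace means is now visible in the UI code that needs it, and the text class stays small.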
  • What is the simplest possible API to cover current needs?
  • In how many situations will the method be used?
  • Is this API easy to use for my current needs?

We can try to eliminate the special cases with the default cases, e.g. the "null object" pattern.
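
A short sketch of the "null object" pattern with hypothetical logger classes. The special case "no logger configured" disappears, because the default object has the same interface and simply does nothing.

```python
class ListLogger:
    """Collects messages in memory."""

    def __init__(self):
        self.lines = []

    def log(self, message):
        self.lines.append(message)


class NullLogger:
    """Null object: same interface as ListLogger, but does nothing."""

    def log(self, message):
        pass


NULL_LOGGER = NullLogger()


def process(order_id, logger=NULL_LOGGER):
    # No "if logger is not None" check is needed anywhere in this function.
    logger.log(f"processing {order_id}")
    return order_id * 2


quiet_result = process(21)
logger = ListLogger()
logged_result = process(1, logger)
```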

Chapter 7 - Different Layer, Different Abstraction

Red Flag - pass-through method - method which only delegates to another object. This usually means there is no clean division of responsibilities between classes.
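
A sketch of the red flag with hypothetical classes. PassThroughDocument adds nothing to Buffer, while UndoableDocument offers a different abstraction (edits that can be undone) and therefore earns its place as a separate layer.

```python
class Buffer:
    def __init__(self):
        self.text = ""

    def insert(self, index, new_text):
        self.text = self.text[:index] + new_text + self.text[index:]


class PassThroughDocument:
    """Red flag: same signature as Buffer.insert and no added behavior."""

    def __init__(self):
        self.buffer = Buffer()

    def insert(self, index, new_text):
        self.buffer.insert(index, new_text)  # pure delegation


class UndoableDocument:
    """A different abstraction on top of Buffer: edits that can be undone."""

    def __init__(self):
        self._buffer = Buffer()
        self._history = []

    def type_text(self, index, new_text):
        self._history.append(self._buffer.text)  # remember the state before the edit
        self._buffer.insert(index, new_text)

    def undo(self):
        if self._history:
            self._buffer.text = self._history.pop()

    @property
    def text(self):
        return self._buffer.text


doc = UndoableDocument()
doc.type_text(0, "abc")
doc.type_text(3, "def")
doc.undo()
```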

Before creating decorators, which have too much coupling to the decorated classes, think twice about the alternatives:
  • Wouldn't adding the functionality directly to the underlying class be simpler?
  • Could you merge the functionality with an existing decorator?
  • Could you implement it as a stand-alone class with its own responsibility instead of a decorator?
We can see the "different layer, different abstraction" pattern in the rule that the API of a deep class should be simpler than its implementation and use different abstractions.

API duplication can be caused by pass-through parameters. They can be solved with context objects.
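
A minimal sketch of a context object. RequestContext and the three functions are assumptions made up for this example. The middle layers pass one object along instead of a growing list of parameters they do not use themselves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestContext:
    """Hypothetical context object holding request-wide state."""
    user: str
    locale: str = "en"


def handle_request(ctx, path):
    # This layer does not care about user or locale; it only forwards ctx.
    return render_page(ctx, path)


def render_page(ctx, path):
    return format_greeting(ctx) + f" You are viewing {path}."


def format_greeting(ctx):
    greetings = {"en": "Hello", "de": "Hallo"}
    return f"{greetings.get(ctx.locale, 'Hello')}, {ctx.user}!"


page = handle_request(RequestContext(user="Ann", locale="de"), "/home")
```

Adding a new field to RequestContext does not change the signature of handle_request or render_page.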

Chapter 8 - Pull Complexity Downwards

As a module developer, we should strive to make life as easy as possible for the users of the module even if it means making our lives harder.

We should avoid configuration parameters as much as possible. Before exposing one, we should ask ourselves - will the users of the module have better information than we do for choosing the value of the parameter?

But pulling complexity downwards can be taken too far, as seen in the example in chapter 6, where students pulled frontend-specific logic into the backend classes, leading to information leakage.
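
One way this could look in code, as a sketch under stated assumptions. RetryPolicy, the factor of 2 and the 5 second fallback are all invented for illustration. Instead of asking the user for a retry interval, the module derives it from the response times it has observed itself.

```python
class RetryPolicy:
    """Derives the retry interval from measurements instead of configuration."""

    _FALLBACK_INTERVAL_S = 5.0   # used until the first response was measured
    _FACTOR = 2.0                # retry after twice the typical response time
    _WINDOW = 10                 # number of recent samples taken into account

    def __init__(self):
        self._samples = []

    def record_response(self, seconds):
        """Remember how long a successful request took."""
        self._samples.append(seconds)

    def retry_interval(self):
        """Return the number of seconds to wait before retrying a request."""
        if not self._samples:
            return self._FALLBACK_INTERVAL_S
        recent = self._samples[-self._WINDOW:]
        return self._FACTOR * sum(recent) / len(recent)


policy = RetryPolicy()
initial = policy.retry_interval()
policy.record_response(0.25)
policy.record_response(0.75)
adapted = policy.retry_interval()
```

The module has the measurements, so it is in a better position to pick the value than its users are.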

Chapter 9 - Better together or better apart?

Act of dividing classes adds some additional complexity:
  • Number of components.
  • Additional code to manage the components.
  • Harder reuse.
  • Duplication.
Here are a couple of reasons to put two modules together:
  • They share information.
  • They are used together.
  • They overlap conceptually.
  • It is hard to understand one without looking at the other.
We should bring classes together if that simplifies the API or eliminates duplication. We should separate general-purpose code and special-purpose code.

Red flag - repetition - if the same piece of code is used in many places, it means we probably didn't find the right abstractions.

Red flag - special-general mixture.

In general, developers tend to break up methods too much. The most important goal should be to provide clean abstractions instead. Each method should do one thing and do it completely.

Red flag - conjoined methods - if you cannot understand the method without looking into another method, it is a red flag. 

Ousterhout presents an alternative view by Robert C. Martin from Clean Code, where Martin breaks up methods again and again until they are very small, usually just one or a couple of lines long. Ousterhout asks whether breaking up methods like this really simplifies the system.

Chapter 10 - Define Errors out of Existence

Exceptions contribute disproportionately to complexity. We should reduce the number of places where exceptions have to be handled.

Exceptions thrown from a class are part of its interface. Classes with lots of exceptions have complex interfaces and are shallower than classes with fewer exceptions.

The best way to eliminate exception handling complexity is when there are no exceptions to handle. For example we don't have to throw an exception from method which deletes a file when the file does not exist.
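
A small Python sketch of the file example. The function name remove_file is made up; Path.unlink with missing_ok=True needs Python 3.8 or newer. The method is defined as "make sure the file does not exist", so a missing file is no longer an error.

```python
import tempfile
from pathlib import Path


def remove_file(path):
    """Ensure that no file exists at path. A missing file is not an error."""
    Path(path).unlink(missing_ok=True)  # requires Python 3.8+


workdir = Path(tempfile.mkdtemp())
target = workdir / "report.txt"
target.write_text("data")

remove_file(target)  # deletes the file
remove_file(target)  # already gone: still succeeds, nothing to handle

workdir.rmdir()      # clean up the temporary directory
```

Callers need no try/except block and no "does it exist?" check before deleting.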

Exception masking is an example of pulling the complexity downwards.

The third technique for reducing complexity is to handle many exceptions in a single place.

The fourth technique is to just crash. Some exceptions are not worth handling, e.g. an out-of-memory error.
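
A sketch of handling exceptions in a single place, using a hypothetical request dispatcher. All names here are invented for illustration. The handlers never catch the missing-parameter error themselves; the dispatcher deals with it once for all of them.

```python
class NoSuchParameter(Exception):
    pass


def get_parameter(params, name):
    if name not in params:
        raise NoSuchParameter(name)
    return params[name]


# Handlers contain no error handling for missing parameters.
def handle_greet(params):
    return "Hello, " + get_parameter(params, "name")


def handle_add(params):
    return str(int(get_parameter(params, "a")) + int(get_parameter(params, "b")))


HANDLERS = {"greet": handle_greet, "add": handle_add}


def dispatch(action, params):
    """The single place where missing parameters are handled."""
    try:
        return HANDLERS[action](params)
    except NoSuchParameter as exc:
        return f"error: missing parameter '{exc}'"
```

For example, dispatch("add", {"a": "1"}) returns an error string instead of raising, and adding a new handler needs no new error handling code.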

Chapter 11 - Design it twice

Consider multiple options for each major design decision - design it twice. Try to pick approaches that are radically different from each other. Make a list of pros and cons for each option. The most important consideration should be the ease of API for the higher-level code. Other factors to consider:
  • Does one alternative have a simpler API?
  • Is one interface more general-purpose?
  • Does one interface enable more efficient implementation?

Chapter 12 - Why write Comments

Comments, written correctly, improve a system's design. If users must read the code in order to use an API, then there is no abstraction. Use comments to capture what was in the mind of the designer but couldn't be expressed in the code.

Chapter 13 - Comments should describe Things that aren't obvious from the Code

See the chapter title. Developers should be able to understand the API without reading the implementation. Every class should have an API comment, every class variable should have a comment and every method should have an interface comment.

Red flag - comment repeats code - don't repeat the code in the comments. Use different words in comments than in the code.
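
A tiny sketch of the difference, using a made-up function. The docstring describes units and the behavior for the edge case, which a caller cannot see from the signature, while the comment marked as a red flag only restates the line below it.

```python
def mean_latency_ms(samples):
    """Return the average of samples, a list of latencies in milliseconds.

    An empty list yields 0.0, so callers need no special case for idle
    periods in which nothing was measured.
    """
    # Red flag (repeats the code): "if there are no samples, return 0.0".
    if not samples:
        return 0.0
    # Useful: without the guard above, the division below would raise
    # ZeroDivisionError for an empty list.
    return sum(samples) / len(samples)
```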

Red flag - implementation documentation contaminates interface - sometimes it is necessary, but usually is not.

The main goal of inside-code comments is to explain what the code is doing, not how it is doing it.

Chapter 14 - Choosing Names

Selecting names is one of the most underrated aspects of software design. Take a bit of extra time to choose great names, which are precise, unambiguous and intuitive. Good names should have two properties - precision and consistency.

Red flag - vague name - if the name could belong to a lot of things, it is more likely to be misused.

Red flag - hard to pick name - hint that the underlying object might not have a clean design.

Consistency has 3 requirements:
  • Always use the common name for the given purpose.
  • Never use the common name for any other purpose.
  • Make sure that the purpose is narrow enough that all similarly named objects have similar behavior.
Avoid extra words, e.g. the type prefixes of Hungarian notation. Don't repeat the context if it is already obvious, e.g. not fileBlock in File, but just block.

The greater the distance between name's declaration and its uses, the longer the name should be.

Chapter 15 - Write the Comments first

The best time to write the comments is at the beginning of the class design. For a new class, start by writing the API comment. Next, write comments for the most important interface methods. If a method or a variable needs a long comment, that's a red flag that it is not a good abstraction. Give writing the comments first a try.

Chapter 16 - Modifying existing Code

Unfortunately a typical mindset is "what is the smallest possible change to get it done?". Resist this temptation to do a quick fix. Think strategically. Modify the design so that the change will be easy to make. If you are not making the design better, you are probably making it worse.

The best way for the comments to stay updated is to put them close to the code described. Comments belong in the code, not the commit log. If the documentation is external, don't repeat it, just reference it. Before committing, verify that all comments are updated as well. Comments are easier to maintain if they are higher-level.

Chapter 17 - Consistency

Consistency allows developers to work more quickly with fewer mistakes. It can be applied at many levels:
  • Names.
  • Coding styles.
  • Interfaces.
  • Design patterns.
  • Invariants.
To ensure consistency we should:
  • Document it.
  • Enforce it. E.g. with automated checkers.
When in Rome, do as the Romans do. Don't change the existing conventions. Having a "better idea" is not a sufficient excuse to introduce inconsistencies. The value of consistency over inconsistency is almost always higher than any difference between two ideas.

Chapter 18 - Code should be obvious

The best way to determine the obviousness of code is through code reviews.

Event-driven programming makes it hard to follow the flow of control. The event handler functions are never invoked directly, they are invoked indirectly by the event module.

Red flag - nonobvious code - if the meaning or behavior of code cannot be understood from a quick read, it is a red flag.

Software should be designed for ease of reading, not writing.

Chapter 19 - Software Trends

Interface inheritance is a good thing. But class hierarchies which use implementation inheritance heavily tend to have high complexity. Composition can provide the same benefits.
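
A short sketch with two hypothetical stack classes. The inherited version reuses the implementation of list but also exposes every list method, while the composed version keeps the list as a hidden detail.

```python
class InheritedStack(list):
    """Implementation inheritance: every list method leaks into the interface."""

    def push(self, item):
        self.append(item)


class Stack:
    """Composition: the list is hidden; only stack operations are exposed."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()

    def __len__(self):
        return len(self._items)


leaky = InheritedStack()
leaky.push(1)
leaky.push(2)
leaky.insert(0, 99)  # allowed by the parent class, but breaks the stack discipline

stack = Stack()
stack.push(1)
stack.push(2)
```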

One of the most important elements of agile development is the notion that development should be incremental and iterative. One of the risks of agile development is that it can lead to tactical programming. Developing incrementally is generally a good idea, but the increments should be abstractions, not features. Once in a need for a new abstraction, design it cleanly and somewhat general-purpose.

Unit tests have become widespread. But TDD focuses too much on getting specific features working rather than on finding the best design. This is tactical programming. There is no obvious time to design.
I strongly disagree with the author here. I think TDD improves my API design (e.g. with wishful programming of APIs for the tests) and also IMHO there is an obvious time to design in TDD, in the refactoring phase.

Design patterns are good ideas, but there is a risk of over-application.

Whenever we encounter a new trend in programming, we should challenge it from the complexity point of view.

Chapter 20 - Designing for Performance

High performance can be achieved without sacrificing good design.

During development, the best approach is to apply a reasonable amount of performance awareness, so that we don't bloat the whole code base with slow algorithms.

We should measure the performance before and after optimizing.
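
A minimal sketch of measuring before and after with the standard timeit module. The two functions and the data sizes are made up for illustration, and the timings depend on the machine, so no expected numbers are shown.

```python
import timeit


def count_hits_list(items, probes):
    """Before: membership test scans the list for every probe."""
    return sum(1 for p in probes if p in items)


def count_hits_set(items, probes):
    """After: build a set once, then each membership test is a hash lookup."""
    lookup = set(items)
    return sum(1 for p in probes if p in lookup)


items = list(range(500))
probes = list(range(0, 1000, 2))

# Take the best of several runs to reduce noise from other processes.
before = min(timeit.repeat(lambda: count_hits_list(items, probes), number=20, repeat=3))
after = min(timeit.repeat(lambda: count_hits_set(items, probes), number=20, repeat=3))
print(f"list: {before:.4f}s  set: {after:.4f}s")
```

Checking that both versions return the same result is part of the measurement; a faster but wrong version is not an optimization.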

If we have found a piece of code to optimize, the best way to do it is with a "fundamental" change, such as introducing a cache or switching to a different algorithm. Unfortunately, sometimes there is no fundamental fix to implement. The key idea then is to design the code around the critical path. There will always be code that must always run, plus lots of exceptional cases. We should try to design the code as if there were no exceptional cases, so that the critical path can run as fast as possible. Sometimes the original code has too many layers with shallow abstractions.

Chapter 21 - Decide what matters

Good software design separates what matters from what matters not. Things that matter not should be hidden as much as possible. We should try to make as little matter as possible. 

The first mistake is to treat too many things as important. Second mistake is to fail to recognize what matters. 

This is also important for technical writing, or even a life philosophy. 

Chapter 22 - Conclusion

The reward for being a good designer is that you get to spend a larger fraction of your time doing the fun part of the programming. Poor designers spend most of their time chasing bugs in complicated and brittle code.
