Tuesday, October 31, 2017

Notes about the Clean Architecture book

My first look at the contents of the book was a bit scary: some of these topics were already explained in Robert Martin's previous books. Then I found out they are explained here from a different (a software architect's) perspective. Overall, I liked this book better than Clean Coder, but it was less useful for me than the Clean Code book by the same author.

Part I - Introduction

The foreword contained two useful antipatterns:
  • Too authoritative and rigid architecture.
  • Speculative generality in software architecture.
The main point of the preface is that not much has changed in software architecture over the last 50 years. The rules of software architecture are the same regardless of any variable (time, type of project, ...).

When you get the architecture of the software right, you magically don't need a horde of programmers to maintain it.

Chapter 1 - What is Design and Architecture?

There is no difference between software design and architecture. There is a continuum of decisions from highest to lowest levels. They both are part of it.

The goal of architecture is to minimize the number of people needed to develop and maintain a software system.

A typical project where architecture doesn't matter starts out all right. The first couple of releases are fine and everybody is productive. But as requirements add up, developers try to fit new requirements into the current project, which is shaped differently. The jigsaw starts to rot and productivity falls. It is common that changes made at the end of the project cost 40 times more than changes in the first 2-3 releases.

What went wrong? Slow and steady wins the race. Developers today live with 3 big lies:
  1. Myth: We just need to get to market first and we will clean it up later. (It is not going to happen.)
  2. Myth: Messy code makes us faster today and only slows us down later. A related myth: We can switch modes from making messes to cleaning messes. (Making messes is always slower, even in the very short term, as shown in the example of a simple kata.)
  3. Myth: Starting from scratch is a solution. (The same mess will be the result.)

Chapter 2 - A Tale of Two Values

The first value of software is its behavior. Many developers think it's the only value. The second value is the architecture. It has far greater value, while being overlooked by both programmers and managers.

The behavior is urgent but not important. The architecture is important but not urgent. The dilemma of software developers is that business managers are not equipped to value architecture. That's what software engineers were hired for.

Software architects focus more on the structure of the project than on particular features/functions. They make new features easier to write, modify, extend and maintain. They should fight to keep the project clean, so that it never becomes impossible to change.

Part II - Programming Paradigms

Chapter 3 - Paradigm Overview

There are three main programming paradigms:
  • Structured programming taught us to decompose code into functions, and it is the core of our algorithms. It does so by preventing the use of goto.
  • Object-oriented programming taught us to manage dependencies between modules of code with the use of polymorphism. It does so by preventing the direct use of function pointers.
  • Functional programming taught us discipline in how we access data (immutability). It does so by preventing variable assignment.
All these paradigms were invented within the 10 years between 1958 and 1968 (BTW in reverse order), and no new paradigm has appeared since then.

Chapter 4 - Structured Programming

Dijkstra found out that certain uses of goto statements prevent the decomposition of algorithms into smaller problems. I.e. goto prevents a divide-and-conquer approach to proving the correctness of an algorithm.

Other uses of goto didn't have this problem. Dijkstra found that the non-problematic uses of goto correspond to selection and iteration statements (if and while).

Böhm and Jacopini proved that all programs can be written with just selection, iteration and sequence. This was a remarkable coincidence: all programs can be written with the same tools which enable us to prove their correctness.

Dijkstra wrote "Go To Statement Considered Harmful", his famous paper from 1968, and so structured programming was born. Today, we are all structured programmers, although not necessarily by choice. Our languages leave us no other option (no unconstrained goto jumps).

We still use the divide-and-conquer approach when decomposing complex problems into simple methods by using what is called functional decomposition.

Dijkstra's dream of provable programs never became reality. Instead, software development leaned towards the scientific method. Programs are not provable, but they are falsifiable, i.e. we can only prove that they are wrong. We do this using tests. We decompose programs into simple testable parts using functional decomposition, and then we try to falsify those parts using tests.

Chapter 5 - Object-Oriented Programming

Plugin architecture was invented to protect software from coupling to IO devices. Even though the idea is old, the programmers didn't extend it to their own programs, because using function pointers was dangerous. OOP allows plugin architecture to be used anywhere, for anything.

By using dependency inversion, we can make source code dependencies point in the opposite direction to the flow of control. This is implemented with interfaces: the higher layer owns an interface and calls it, and the lower layer implements it. Even though control flows from the higher layer into the lower layer at run time, the source code dependency points from the lower layer to the higher layer. This has a profound effect on software design. Any dependency, wherever it is, can be inverted. We can model the software dependencies however we want. E.g. the database and the UI can depend on (pure) business rules.

So the most important trait of OOP is polymorphism. It allows us to make low-level details (like the database or the UI) depend on high-level policies, like business rules. Low-level details can then be developed independently.
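
To make the direction of the arrows concrete, here is a minimal Python sketch (my own illustration, not code from the book). The names OrderRepository, PlaceOrder and InMemoryOrderRepository are made up. The business rule calls the storage at run time, yet only the storage class refers to something defined on the business side.

```python
from abc import ABC, abstractmethod


class OrderRepository(ABC):
    """Interface owned by the high-level business rules (hypothetical name)."""

    @abstractmethod
    def save(self, order_id: str, total: float) -> None: ...


class PlaceOrder:
    """High-level policy: knows only the interface, never a concrete database."""

    def __init__(self, repository: OrderRepository) -> None:
        self._repository = repository

    def execute(self, order_id: str, prices: list[float]) -> float:
        total = sum(prices)
        # Control flows from the policy into the detail at run time.
        self._repository.save(order_id, total)
        return total


class InMemoryOrderRepository(OrderRepository):
    """Low-level detail: depends on (implements) the high-level interface."""

    def __init__(self) -> None:
        self.rows: dict[str, float] = {}

    def save(self, order_id: str, total: float) -> None:
        self.rows[order_id] = total


repo = InMemoryOrderRepository()
total = PlaceOrder(repo).execute("A-1", [10.0, 5.5])
```

Swapping InMemoryOrderRepository for a real database adapter would not require touching PlaceOrder, which is the whole point.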

Chapter 6 - Functional Programming

Variables in functional programming languages do not vary. The typical problems of concurrent programming (race conditions, deadlocks, concurrent updates) cannot happen if we use immutable variables. And yes, immutability is practicable, even if we have to make a couple of compromises:

Segregation of mutability tells us to put as much processing logic as possible to immutable components, to drive code out of the components that must allow mutation. 

Event sourcing is a strategy in which we store e.g. all the bank transactions but not the account balance. When the state is required, we re-apply all the transactions and re-compute the current account balance. As a consequence, we implement just the CR part of CRUD.
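
A small Python sketch of the bank example, under my own assumptions: amounts are integer cents, and the names Transaction and Ledger are invented for illustration. The ledger only appends and reads; the balance is never stored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    """Immutable event; positive amounts are deposits, negative are withdrawals."""
    account: str
    amount_cents: int


class Ledger:
    """Append-only store: transactions are created and read, never updated or deleted."""

    def __init__(self) -> None:
        self._events: list[Transaction] = []

    def record(self, transaction: Transaction) -> None:
        self._events.append(transaction)

    def balance(self, account: str) -> int:
        # State is derived by replaying every event recorded for the account.
        return sum(t.amount_cents for t in self._events if t.account == account)


ledger = Ledger()
ledger.record(Transaction("alice", 10_000))
ledger.record(Transaction("alice", -2_550))
ledger.record(Transaction("bob", 700))
```

The only mutable thing left is the list inside Ledger, which is the segregation of mutability mentioned above.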

Part III - Design Principles

This is not an introduction to the SOLID principles, but rather an architectural review of them.

Chapter 7 - Single Responsibility Principle

A module should be responsible to one and only one actor. Symptoms of violating this rule are:
  1. Accidental duplication - when the Employee class has 3 methods which are responsible to the COO, CFO and CTO of the company respectively. The solution is to separate the code, e.g. into calculator classes, each responsible to its respective actor. We can even keep the most important computation in the Employee if that makes sense.
  2. Merges - when 2 teams have merge conflicts, it usually means they are changing one component for different reasons.
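
A possible shape of that split, as a Python sketch. The class names and the 1.5x overtime rule are my assumptions, not the book's code. Each class answers to one actor, so a change requested by the CFO cannot silently alter what the COO sees.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmployeeData:
    """Plain data shared by the calculators; it has no behavior of its own."""
    name: str
    hourly_rate: float
    regular_hours: float
    overtime_hours: float


class PayCalculator:
    """Answers to the CFO: overtime is paid at 1.5x (an assumed rule)."""

    def calculate_pay(self, e: EmployeeData) -> float:
        return e.hourly_rate * (e.regular_hours + 1.5 * e.overtime_hours)


class HourReporter:
    """Answers to the COO: reports raw hours with no overtime weighting."""

    def report_hours(self, e: EmployeeData) -> float:
        return e.regular_hours + e.overtime_hours
```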

Chapter 8 - Open-Closed Principle

If component A should be protected from changes in component B, then component B must depend on component A. This is how OCP works at the architectural level. Architects organize components into a topology where higher-level components are protected from changes in lower-level components.

Chapter 9 - Liskov Substitution Principle

LSP violation example - Square is not a good subclass of a Rectangle, because its sides don't change independently. The LSP should be applied at the architectural level, otherwise the system becomes polluted with extra mechanisms.
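
The Square/Rectangle problem in a few lines of Python (a sketch of the classic example; the stretch function is my own stand-in for client code):

```python
class Rectangle:
    def __init__(self, width: float, height: float) -> None:
        self._width = width
        self._height = height

    def set_width(self, width: float) -> None:
        self._width = width

    def set_height(self, height: float) -> None:
        self._height = height

    def area(self) -> float:
        return self._width * self._height


class Square(Rectangle):
    """Keeps its sides equal, which silently changes the inherited contract."""

    def __init__(self, side: float) -> None:
        super().__init__(side, side)

    def set_width(self, width: float) -> None:
        self._width = self._height = width

    def set_height(self, height: float) -> None:
        self._width = self._height = height


def stretch(shape: Rectangle) -> float:
    # Client code written against Rectangle expects the sides to change independently.
    shape.set_width(5)
    shape.set_height(2)
    return shape.area()
```

stretch returns 10 for a Rectangle but 4 for a Square, so the client would need a type check to behave correctly. That type check is exactly the "extra mechanism" that pollutes the system.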

Chapter 10 - Interface Segregation Principle

The ISP on the architectural level means - don't depend on modules which contain something you don't need, or you will face unexpected issues.

Chapter 11 - Dependency Inversion Principle

  • Don't refer to volatile concrete classes. Refer to abstract interfaces instead. Also don't create volatile objects by yourself, but delegate to abstract factories instead.
  • Don't derive from volatile concrete classes. Inheritance is the strongest bond and should be used with greatest care.
  • Don't override concrete functions. 
  • In fact, don't ever mention anything that is volatile and concrete.
The DIP is used in all the following chapters. It is the main mechanism for separating modules.

Part IV - Component Principles
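
The abstract factory advice can be sketched in Python like this. The names are illustrative only. Note that Application never mentions ConcreteService, not even to construct it.

```python
from abc import ABC, abstractmethod


class Service(ABC):
    @abstractmethod
    def run(self) -> str: ...


class ServiceFactory(ABC):
    @abstractmethod
    def make_service(self) -> Service: ...


class Application:
    """High-level code: mentions only the abstract Service and ServiceFactory."""

    def __init__(self, factory: ServiceFactory) -> None:
        self._factory = factory

    def start(self) -> str:
        service = self._factory.make_service()
        return service.run()


class ConcreteService(Service):
    """Volatile detail, created only behind the factory."""

    def run(self) -> str:
        return "concrete service running"


class ConcreteServiceFactory(ServiceFactory):
    def make_service(self) -> Service:
        return ConcreteService()
```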

Chapter 12 - Components

This chapter is just a history lesson on how we got to dynamically linked files, which are what enables plugin architecture.

Chapter 13 - Component Cohesion

The chapter discusses 3 principles of component cohesion.

REP - Reuse/Release Equivalence principle

The granule of reuse is the granule of release. The classes and modules which belong to a component must form a cohesive group. There must be an overarching theme that these classes share. The parts of a component should be releasable together. This is rather weak advice - the component should "make sense". However, it is still important, because violations are quite easy to detect: the component doesn't make sense.

CCP - Common Closure Principle

Those classes which change for the same reason at the same times should be in one component. Those which change for different reasons at different times should be split into separate components.

CRP - Common Reuse Principle

Don't force users of a component to depend on things they don't need. We want to make sure we put only those classes to a component, which are inseparable from each other.

These 3 principles form a triangle, similar to scope, time and money triangle. You can be really good at 2 of those principles, but then you will lack the third. When projects start, they tend to sacrifice the REP. Developability wins over reusability. Then, as projects mature, they will lean slowly towards the REP. The component cohesion changes with time, from developability towards reusability.

Chapter 14 - Component Coupling

The next 3 principles are about the tension between developability and good design.

ADP - Acyclic Dependencies Principle

Allow no cycles in the component dependency graph. If you have a cycle between, say, 3 components, these 3 components have effectively become one large component. There are at least 2 ways of breaking the cycle:
  1. Apply the dependency inversion principle, i.e. introduce interfaces to invert the dependency.
  2. Create a new component which both problematic components will depend on.
Component dependency diagrams have very little to do with the function of the application. Instead, they are mapped to the buildability and maintainability of the system. Component dependency structure grows and evolves with the logical design of the application.
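
Cycles are easy to check for mechanically. Below is a minimal Python sketch using a depth-first search. The function name find_cycle and the sample graphs are my own; the "fixed" graph assumes the cycle was broken by dependency inversion, i.e. Authorizer now implements an interface that lives in Entities.

```python
from typing import Optional


def find_cycle(dependencies: dict[str, list[str]]) -> Optional[list[str]]:
    """Return one dependency cycle as a list of component names, or None."""
    visiting: list[str] = []  # components on the current DFS path
    done: set[str] = set()    # components fully explored without finding a cycle

    def visit(component: str) -> Optional[list[str]]:
        if component in visiting:
            # We walked back into the current path: report the loop.
            return visiting[visiting.index(component):] + [component]
        if component in done:
            return None
        visiting.append(component)
        for dependency in dependencies.get(component, []):
            cycle = visit(dependency)
            if cycle:
                return cycle
        visiting.pop()
        done.add(component)
        return None

    for component in dependencies:
        cycle = visit(component)
        if cycle:
            return cycle
    return None


cyclic = {
    "Entities": ["Authorizer"],
    "Authorizer": ["Interactors"],
    "Interactors": ["Entities"],
}
fixed = {
    "Entities": [],
    "Authorizer": ["Interactors", "Entities"],
    "Interactors": ["Entities"],
}
```

A check like this could run in the build, so that a new cycle fails fast instead of being discovered months later.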

SDP - Stable Dependencies Principle

Depend in the direction of stability. The instability of a component can be measured. Fan-in counts the classes outside the component that depend on classes inside it; fan-out counts the classes inside the component that depend on classes outside it. The instability (I) of a component is then fan-out / (fan-in + fan-out). SDP says that the I metric of a component should be larger than the I metric of the components it depends on. We can fix violations of SDP in the same 2 ways as we fix violations of ADP (DIP or a new component). If we choose to create a new component, it will likely contain interfaces only. This is quite common in Java or C#.
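
The I metric and the SDP check fit into a few lines of Python. This is a simplified sketch: it counts component-level dependencies instead of individual classes, it assumes every component appears as a key in the graph, and the component names are invented.

```python
def instability(fan_in: int, fan_out: int) -> float:
    """I = fan-out / (fan-in + fan-out); 0 is maximally stable, 1 maximally unstable."""
    total = fan_in + fan_out
    # Assumption: a component with no dependencies at all is reported as 0.0.
    return fan_out / total if total else 0.0


def sdp_violations(dependencies: dict[str, list[str]]) -> list[tuple[str, str]]:
    """Return (component, dependency) pairs where the dependency is less stable."""
    fan_out = {c: len(deps) for c, deps in dependencies.items()}
    fan_in = {c: 0 for c in dependencies}
    for deps in dependencies.values():
        for d in deps:
            fan_in[d] += 1
    scores = {c: instability(fan_in[c], fan_out[c]) for c in dependencies}
    return [
        (c, d)
        for c, deps in dependencies.items()
        for d in deps
        if scores[c] < scores[d]
    ]


graph = {
    "A": ["Stable"],
    "B": ["Stable"],
    "Stable": ["Flexible"],
    "Flexible": ["X", "Y"],
    "X": [],
    "Y": [],
}
```

Here Stable has I = 1/3 and Flexible has I = 2/3, so the dependency from Stable to Flexible is the one that gets flagged.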

SAP - Stable Abstractions Principle

A component should be as abstract as it is stable. Again, some metrics. The abstractness of a component (A) is # abstract classes and interfaces / # all classes in the component. We generally want the A metric to be the opposite of the I metric: a stable component (I near 0) should be abstract (A near 1), while an unstable component (I near 1) can be concrete (A near 0). There are 2 extreme violations of this principle:
  1. Zone of Pain, where you have a stable component with concrete classes. Such components are too rigid. An example would be a database schema, which is painful because schemas are volatile. A harmless example of a component in the Zone of Pain is String, because it is not likely to change.
  2. Zone of Uselessness, where you have abstract components with no dependents. Such components are simply unused.
The place where you want your components to sit is called the Main Sequence. You can measure how far you are from it: D = |A+I-1|. Therefore a statistical analysis of a design is possible. You can plot a graph of your components and focus on those which are too far from the Main Sequence.
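
A tiny Python sketch of the two formulas, taking A as the share of abstract classes and interfaces among all classes of a component (the function names are mine):

```python
def abstractness(abstract_classes: int, total_classes: int) -> float:
    """A = abstract classes and interfaces / all classes in the component."""
    # Assumption: an empty component is reported as 0.0 to avoid dividing by zero.
    return abstract_classes / total_classes if total_classes else 0.0


def distance(a: float, i: float) -> float:
    """D = |A + I - 1|; 0 means the component sits on the Main Sequence."""
    return abs(a + i - 1)
```

A concrete and stable component (A = 0, I = 0) gets D = 1, which is the Zone of Pain. An abstract component that nothing depends on (A = 1, I = 1) also gets D = 1, which is the Zone of Uselessness.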

Part V - Architecture

Chapter 15 - What is Architecture?

First of all, a software architect is a programmer. He continues to take programming tasks while guiding the rest of the team toward a design that maximizes productivity. His design strategy is to leave as many options open as possible. Good architecture makes the application easy to develop, maintain and deploy. The ultimate goal is to minimize the total cost of the system and maximize programmer productivity.

If 5 teams are developing a system and no other factors are involved, the system will likely be split into 5 components - one for each team.

Architects should always consider deployment issues early on. E.g. it might not be wise to start with micro-services.

The impact of architecture on operations is less dramatic than on development, deployability and maintenance. Almost any operational issue can be solved by throwing more hardware at it.

Of all aspects of a software system, maintenance is the most costly. The primary costs are spelunking and risk:
  • Spelunking is the digging process, trying to find the best place and strategy to introduce even the simplest feature or fix a bug.
  • While making those changes, the likelihood of creating additional defects is always there, therefore adding the cost of risk.
What are the options architects need to keep open? They are the details that don't matter, e.g. the database, web server, REST, SOA, microservices, or dependency injection framework. But what if your company has already made a commitment to a certain database? A good architect pretends that no such decision has been made. He maximizes the number of decisions not made.

Chapter 16 - Independence

A shopping cart application with a good architecture will look like a shopping cart application. More on this will be in chapter 21.

The choice of communication protocol between the components is a decision that a good architect leaves open. An architecture that maintains proper isolation of its components and doesn't assume any particular means of communication between them will be much easier to move to a different communication stack as the operational needs of the system change.

Conway's Law: Any organization that designs a system will produce a design whose structure is a copy of the organization's communication structure.
The structure is composed of well-isolated, independently deployable components.

A good architecture enables the system to be deployed immediately after it is built, ideally with one button, or automatically.

We can decouple components in 2 ways:

  • Decoupling by horizontal layers leaves us e.g. with components for the UI, the business rules and the database.
  • Decoupling by use cases gives us vertical slices of the software, so that use cases can be developed separately.
The pattern here is to decouple things which change for different reasons. The decoupling mode we choose might also help operations. But to take advantage of the operational benefit, we must split the components into truly separate services / microservices. A good architect leaves options open, and the decoupling mode is one of those options.

As long as the layers and use cases are decoupled, the software will support teams being separated in any organizational way, e.g. feature teams, database teams, ...

Architects often fall into the trap of fearing duplication too much. When we find truly duplicated code, we are honor-bound to remove it. But there are 2 kinds of duplication. True duplication is when every change in one instance requires the same change in the other instance. But if 2 sections of code are similar, yet change for different reasons at different times, then they are not true duplicates.

Uncle Bob's preference is to decouple components enough so that micro-services could be physically separated, but still keep them in one address space as long as possible. The problem with starting with microservices is that they probably won't be fine-grained enough.

A good architecture will allow a system to be born as a monolith, then be split to multiple modules and then eventually be split into different services. The process could be even reversed in later stages of a project.

The point is that decoupling mode will probably change in project's lifetime, so the architecture's role is to be prepared for such change.

Chapter 17 - Boundaries: Drawing Lines

The point here is to draw a boundary around the low-level details of the system, such as a database. The example of the FitNesse project shows that this allowed Uncle Bob to ditch the database completely and keep everything in plain-text files instead.

If we put the database behind an interface (applying DIP), we can draw a boundary line just below the interface, so that the database is replaceable. The database (low-level detail) should indeed depend on business rules (higher-level abstractions). Both the GUI and the DB should be "plugins" to the business rules. We should recognize this as an application of DIP in support of the Stable Abstractions Principle.

Chapter 18 - Boundary Anatomy

There are 4 ways of implementing a boundary:
  1. Monolith - everything is statically linked into a single executable, but maintaining good modularization (boundaries) is still very beneficial here. Modules can be developed independently. We use DIP to make the correct dependency flow (low-level details depend on high-level abstractions). Communications between the components are very fast (they're just method calls) and they tend to be chatty. 
  2. DLLs - the same thing, but the monolith is split into multiple components / libraries (e.g. JARs).
  3. Processes - these are system processes, which is a much stronger separation than the previous two. Communication is more costly, so it is kept to a minimum.
  4. Services - communication is even more costly because it all happens over network. Lower level services are plugins to higher level services.
Most systems except monoliths use more than one strategy. E.g. each service could be a set of JARs inside one WAR. This means that a system is typically divided into boundaries which are chatty and boundaries which are more concerned with latency.

Chapter 19 - Policy and Level

A strict definition of a level is the distance from the inputs and outputs. This chapter is another restatement that high-level policies should not depend on low-level details. We can flip the dependency flow using DIP.

Chapter 20 - Business Rules

This chapter sets some of the naming concepts for the Clean Architecture (chapter 22). 

Critical Business Rules are the business rules which would exist even if there was no computer system to automate things. They usually require some data, the Critical Business Data. Together, the rules and the data are embodied in a software system as Entities.

Use Cases are software-application-specific business rules, which describe how the system is used. They should not include how the system appears to the user. This is too much detail for the use cases. Note that the Entities don't know that Use Cases exist, but Use Cases actively use Entities. The dependency flow goes from Use Cases to Entities.

We might be tempted to use Entities as our request/response objects, because they share so much data. We should avoid this temptation. They will change at different times for different reasons.
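
As a Python sketch of that separation (the loan example and all the names here are my assumptions, only loosely modeled on the chapter): the request and response objects duplicate the Entity's fields on purpose, and the Entity knows nothing about the Use Case.

```python
from dataclasses import dataclass


@dataclass
class Loan:
    """Entity: critical business data plus the critical business rule."""
    principal: float
    annual_rate: float

    def yearly_interest(self) -> float:
        return self.principal * self.annual_rate


@dataclass(frozen=True)
class InterestRequest:
    """Input data of the use case; deliberately not the Loan entity itself."""
    principal: float
    annual_rate: float


@dataclass(frozen=True)
class InterestResponse:
    """Output data of the use case; free to change independently of Loan."""
    yearly_interest: float


class CalculateInterest:
    """Use case: application-specific rule that depends on the Entity, not vice versa."""

    def execute(self, request: InterestRequest) -> InterestResponse:
        loan = Loan(request.principal, request.annual_rate)
        return InterestResponse(yearly_interest=loan.yearly_interest())
```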

The Use Cases should be independent of the UI and the database. They are the core value of our system. Use cases should be the most reusable component in the system.

Chapter 21 - Screaming Architecture

When new programmers come to see the source code repository of your health-care system, they should immediately look at it and say: "Oh, this is a health-care system." The architecture should "scream" health-care at them.

The fact that the system is delivered via the web is a detail and should be treated as such. The same applies to frameworks. They should not dominate the system. Look at frameworks with a jaded eye. Are they worth the cost of marrying them? Can you protect yourself from them?

Chapter 22 - The Clean Architecture

The best description of the Clean Architecture comes from Uncle Bob himself and is available for free on his former employer's website. He found that all modern architectures share these traits:
  • Independent of frameworks. Frameworks are just a detail, a plugin to the core of the system.
  • Testable. If the core of the system is independent, it can be fully covered by tests.
  • Independent of UI. UI is just a detail, a plugin to the core of the system.
  • Independent of DB. Database is just a detail, a plugin to the core of the system.
  • Independent of any external agency.
So he presents layers (which he calls circles), from the innermost to the outermost:
  1. Entities are enterprise-wide business rules. They are very unlikely to change.
  2. Use cases are application-specific business rules. They might change because of e.g. change requests to the application.
  3. Interface adapters convert data from one format to another. They do it in the way that is most convenient for the inner layers.
  4. Drivers and Frameworks are the details. There is usually not much code in this layer, e.g. just glue code.

Chapter 23 - Presenters and Humble Objects

The Humble Object pattern is useful for the testability of a system. We move the hard-to-test code into a separate object which contains nothing else. That's why it is called humble. The rest of the system is testable.

We want our Views to be humble. So everything that is displayed on the screen, is represented as a string or a boolean, or something similarly simple. We don't want to format dates in the views. We don't want any logic in our views.

The use of humble objects at architectural boundaries greatly increases the testability of the system.
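
A minimal Python sketch of a Presenter and a humble View. The invoice example, the names and the date format are my own choices. All decisions live in the Presenter, which can be unit-tested without any screen.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class InvoiceViewModel:
    """Only display-ready strings and booleans; nothing left for the View to decide."""
    due_date: str
    amount: str
    show_overdue_warning: bool


class InvoicePresenter:
    """Testable half: all formatting and display decisions happen here."""

    def present(self, due: date, amount_cents: int, today: date) -> InvoiceViewModel:
        return InvoiceViewModel(
            due_date=due.strftime("%Y-%m-%d"),
            amount=f"{amount_cents / 100:.2f}",
            show_overdue_warning=today > due,
        )


def render(view_model: InvoiceViewModel) -> str:
    """Humble half: moves prepared values to the screen and obeys the flag."""
    warning = " OVERDUE" if view_model.show_overdue_warning else ""
    return f"Due {view_model.due_date}: {view_model.amount}{warning}"
```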

Chapter 24 - Partial Boundaries

This chapter presents 3 strategies for implementing simpler boundaries. Each of them has less bureaucracy than the full-fledged solution. There are of course many other strategies for this. Architects decide whether a real boundary is needed or such partial solution would suffice.
  1. Skip the last step. We will carefully prepare the boundaries so that the code could be split into different components. But then we will let it live in a single component.
  2. Even lighter is one-directional boundary. We will separate dependencies using interfaces. Nothing prevents bypassing interfaces and using dependencies directly.
  3. Even lighter is a facade pattern. Client code will use a facade, which has direct dependencies on implementation classes. Therefore client code will transitively depend on implementation classes.

Chapter 25 - Layers and Boundaries

This chapter is similar to the previous one. Full-fledged boundaries come at a cost. This cost has to be weighed. Architects don't want to over-engineer, as it is often much worse than under-engineering. Sometimes, a boundary must be created. Sometimes, a partial boundary will suffice. Sometimes the need for a boundary should be ignored (as he shows in this chapter on a 200-line game). However, this is not a one-time decision. It takes a watchful eye and a repeated weighing of pros and cons.

Chapter 26 - The Main Component

The main component is the lowest-level component. The ultimate detail. Think of main as a configuration plugin to the application. It contains the basic setup and configuration. It is a plugin, so we can have one for the production environment and another for the development environment. Or different mains for the different countries we deploy to.
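
Here is how "main as a plugin" might look in Python. It is a sketch with invented names; the two main functions stand for two deployments of the same application. All wiring of concrete classes happens in the mains and nowhere else.

```python
from abc import ABC, abstractmethod


class Greeter(ABC):
    @abstractmethod
    def greet(self, name: str) -> str: ...


class Application:
    """Core of the system: it never learns which Greeter it was given."""

    def __init__(self, greeter: Greeter) -> None:
        self._greeter = greeter

    def run(self, name: str) -> str:
        return self._greeter.greet(name)


class EnglishGreeter(Greeter):
    def greet(self, name: str) -> str:
        return f"Hello, {name}!"


class GermanGreeter(Greeter):
    def greet(self, name: str) -> str:
        return f"Hallo, {name}!"


def main_uk() -> Application:
    # One Main per deployment: the only place that names concrete classes.
    return Application(EnglishGreeter())


def main_germany() -> Application:
    return Application(GermanGreeter())
```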

Chapter 27 - Services: Great and Small

Services (or microservices) which physically separate the business functions of an application are not much more than expensive function calls. They are also not necessarily architecturally important.

Is strong decoupling really an advantage of microservices? Services share no variables, that's for sure. But they are coupled by the data they share - e.g. requests and responses. If a field has to be added to one request, it might cause changes in multiple services. So this benefit is an illusion.

Is independent developability really an advantage of microservices? First, history has shown that big scalable systems can be developed either as a monolith, or as a component-based system, or as services. Second, if strong decoupling is a myth, then a simple change can cause all of the services to change, so there is no real independence of the teams.

Services are a nice mechanism for scalability and developability. However, they are not architecturally significant. They are just a communication mechanism between system boundaries. The architecture is drawn by these boundaries. In many cases, a client and a service are so coupled that there is no architectural significance whatsoever.

Chapter 28 - The Test Boundary

Tests are part of the system.

If tests depend on very volatile things, like GUIs, they tend to be fragile. Generally, the Fragile Tests Problem is when a simple change can cause hundreds or thousands of tests to fail. Such tests make the system rigid. The developers are afraid to make necessary design changes if any change breaks so many tests.

A well-designed test suite has its own API. This API contains all the code necessary to bypass e.g. security rules, or whatever else prevents testing. It will often be a superset of the interactors and interface adapters used by the UI. The purpose of the API is to decouple the structure of the tests from the structure of the system.

Imagine a system where every production class, or even method, has a corresponding test. Such a system also becomes rigid. Each refactoring breaks many tests. The testing API should hide the system structure from the tests.

Chapter 29 - Clean Embedded Architecture

We need more software and less firmware. If the only way to test your software is on the target hardware, then the target hardware will become the test bottleneck. There is nothing that keeps us from polluting all the code with hardware-specific code. Software and firmware shouldn't intermingle. It is an antipattern, and such code will resist changes. Hardware is a detail. That's why we have a HAL (hardware abstraction layer), which the software can use and which abstracts the underlying hardware. Your software has to be agnostic to the OS details as well; an OSAL (operating system abstraction layer) serves that purpose.
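
The HAL idea translates to any language. A Python sketch with made-up names (Led, Blinker, FakeLed) shows how the logic can be tested with no target hardware in sight:

```python
from abc import ABC, abstractmethod


class Led(ABC):
    """HAL interface, expressed in terms the application needs (hypothetical)."""

    @abstractmethod
    def set(self, on: bool) -> None: ...


class Blinker:
    """Software: hardware-independent logic, testable off-target."""

    def __init__(self, led: Led) -> None:
        self._led = led

    def blink(self, times: int) -> None:
        for _ in range(times):
            self._led.set(True)
            self._led.set(False)


class FakeLed(Led):
    """Test double standing in for the firmware-backed implementation."""

    def __init__(self) -> None:
        self.history: list[bool] = []

    def set(self, on: bool) -> None:
        self.history.append(on)


led = FakeLed()
Blinker(led).blink(2)
```

The firmware side would provide another Led implementation that writes to a real pin; Blinker would not change.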

The app-titude test:
  1. First make it work.
  2. Then make it right.
  3. Then make it fast.

Part VI - Details

Chapter 30 - Database is a Detail

Why do we have disks and databases? How would we programmers store our data structures if there were no disks and we had to keep them in RAM? We wouldn't organize them into database tables and access them through SQL. We would store them as objects, because that's what we do. That's why, from the architectural viewpoint, we shouldn't care about the storage details of our objects. It is just a detail.

Don't underestimate vendors' marketing. In the late 1980s, every company had to have an RDBMS. Today the words "enterprise" or "SOA" are more marketing than reality.

To sum it up, the data model is important. The database is just a technology, a mechanism, of storing it to disks. The database is a detail.

Chapter 31 - The Web is a Detail

Abstracting the GUI as a plugin is not easy. It will likely take several iterations to get it right. However, it is often worth it. The way to do it is to keep the use cases independent of the GUI.

Chapter 32 - Frameworks are Details

Framework authors created frameworks to solve their problems. Not yours. And still, they try to persuade you to couple your system to their framework as tightly as possible. It is a one-directional marriage. You take on all the risk and burden, and the framework takes on nothing at all:
  • Architecture of the framework might not be very clean. And once it is in, the framework is not going out.
  • Framework might help you boost up the start of the project. But as time passes, you might outgrow it and you start fighting the framework more than it is worth.
  • The framework might evolve in a direction you might not like.
  • There could be a better framework, which you would like to use instead, but since you married this one, you can't.
The solution? Don't marry the framework. You can still use it, just don't couple to it. E.g. you shouldn't have @Autowired annotations all over your business classes. A better place for Spring IOC is the Main component, because the wiring of classes is the lowest level detail.

Of course, you must marry some frameworks. E.g. standard library. But it still should be a decision. It is not a commitment to be entered lightly. 

Chapter 33 - Case Study: Video Sales

Uncle Bob presents the architecture of his cleancoders website for selling videos. First, he does a use case analysis. Four actors come out of it. Then he creates an architecture: for every actor, there will be corresponding Views, Presenters, Interactors (Use Cases) and Controllers. These 4 layers will be separated by architectural boundaries. There are also Data Gateways, a Revenue Gateway and a Database in a separate architectural boundary, Utilities. The point of separation by actors is the "different reasons" part of the SRP. The point of architectural boundaries is the "different rates" part of the SRP. Once you have separated the code this way, you can mix it into components or even deployables.

I remember being disappointed with this chapter. It is too short for a case study. And this architecture doesn't "scream" video sales to me, because the first thing I see are the technical boundaries.

Chapter 34 - The Missing Chapter

This chapter starts with different ways to package our software:
  1. Package by (a technical) layer. Similar to the case study, we package together things which are on the same technical layer. Martin Fowler says this is a good way to get started.
  2. Package by (a business) feature. We put everything which has something to do e.g. with orders into an orders super-package. After a simple "move" refactoring, we have package by feature. Now the top-level package architecture really screams something about the application.
  3. Ports and adapters. All the variations of hexagonal architecture, clean architecture, etc. fall into this category. You have one package with everything testable, the domain, i.e. the services and their dependencies expressed as interfaces. Everything else, the infrastructure, is separated and depends on the domain. This idea comes from DDD.
  4. Package by component. This is the way recommended by Simon Brown, the author of this chapter. It's the same as ports and adapters, but the "backend" infrastructure is packaged together with the domain. This can be considered a preparation for splitting into microservices, if necessary.
After this presentation comes the point. All 4 ways of packaging are the same dependency-flow-wise. But the visibility of the components could be different. If some of them are in the same package, then we can use the default visibility in Java. Mr. Brown suggests that we use the compiler (and not just discipline) to enforce the dependency rule. He is also enthusiastic about the new Java 9 module system, which gives us more power for encapsulation. An alternative way is to use different source code trees, using maven, gradle or another build tool.

Part VII - Appendix

Appendix A - Architecture Archaeology

My only note from this part is the experience that if you want to build a reusable framework, you have to have at least 2 reusers for it.

Saturday, September 16, 2017

Notes about the Clean Coder Book

This book's main theme is the professionalism of software developers. I like this book much less than Uncle Bob's Clean Code book (the first of the series). That said, it is still a worthy read. Robert C. Martin introduces his book as a catalog of his own errors. He says we are in dire need of professionalism in our software developer profession.

Chapter 1 - Professionalism

Professionalism is all about taking responsibility. The first rule is to do no harm. As it is virtually impossible to create bug-free code, the first thing to learn is how to apologize. The code has to be tested. Then tested again. The tests must be automated. Basically the QA should find nothing. Uncle Bob demands 100% code coverage by tests. At the very least, your automated tests should assure you that the code will most likely pass the QA.

If you find code that is hard to work with, refactor it to make the change easier next time. This is the Boy Scout rule: "Always check in a module cleaner than you checked it out."

You should plan to work 60 hours weekly. 40 hours are for your employer and 20 hours are for you. You should be reading, practicing, learning and otherwise developing your career. You should be conversant with:
  • Design patterns - all 24 of GOF patterns and a working knowledge of many of the POSA books.
  • Design principles - SOLID and component principles.
  • Methods - XP, Scrum, Lean, Kanban, Waterfall, Structured Analysis and Structured Design.
  • Disciplines - TDD, OOD, Structured Programming, Continuous Integration and Pair Programming.
  • Artifacts - UML, DFDs, Structure Charts, Petri Nets, Transition Diagrams and Tables, flow charts and decision tables.
Uncle Bob also introduces "katas" here. A kata is a 10-minute warm-up exercise in the morning and a 10-minute cool-down in the evening. You program a simple problem, which you already know the solution to, just to train your fingers.

The second best way to learn is to collaborate with other peers, e.g. during pair programming. The best way to learn is to teach.

Professionals know their domain. They identify themselves with their employers. Their employers' problems are their problems.

Professionals are also humble. They will not make fun of others for their mistakes, because they know they could be the next to make one.

Chapter 2 - Saying No

To reach the best possible outcome of an argument (e.g. about a project deadline), you have to say no sometimes and then work out a solution that is agreeable to both parties. The WHY is just a detail, but the fact that e.g. the deadline is unfeasible is important. The most important time to say no is when the stakes are highest.

There is no trying. Promising to "try" is an admission that you have been holding back, that you have a reservoir of extra effort that you can apply.

Uncle Bob advises against passive-aggressive behavior, where you, knowing that a disaster is coming, protect your back by keeping all the important memos, which clearly show what you told your manager and when. He advises talking to your manager's superior instead, even if it means going over your manager's head.

Chapter 3 - Saying Yes

There are 3 parts of a commitment:
  1. You say you'll do it.
  2. You mean it.
  3. You actually do it.
You can recognize lack of commitment by detecting these keywords:
  • Need / should
  • Hope / wish
  • Let's
The commitment looks like this - you are taking full responsibility for something in front of an audience of at least one person. If you depend on anyone else, you have to phrase the commitment in terms of more specific actions that can bring you closer to the end goal. If you find any obstacles, you should inform somebody ASAP to give them a chance to help you fulfill your commitment.

When you are out of time, it is wrong to break discipline. You won't get done faster if you don't write tests. You won't get done faster if you don't refactor. You won't get done faster if you omit the full regression suite. Breaking disciplines only slows us down.

Professionals are not required to say yes to everything. However, they should be creative in finding ways to make "yes" possible. When they say yes, they use the language of commitment so that there is no doubt what they've promised.

Chapter 4 - Coding

This chapter is a series of random pieces of advice about coding. The key to coding mastery is confidence and error-sense. If you are tired or distracted, do not code. Avoid the zone, or you might lose the big picture. Many people don't code well while listening to music. If you have writer's block, find a pair partner. TDD might reduce debugging time by a factor of 10. The trick to managing lateness is early detection and transparency. Do not be tempted to rush.

You should not work overtime unless:
  • You can personally afford it.
  • It will not take more than 2 weeks.
  • Your boss has a fallback plan if overtime doesn't work.
To avoid false delivery, we should have an independent definition of done. The best way to achieve this is by having business analysts and testers create automated acceptance tests.

You should help others. You will likely learn more than you gave. On the other hand, it is unprofessional to get stuck when help is easily accessible. Since for many of us, collaboration is not an instinct, we require disciplines that drive us to collaboration.

Chapter 5 - Test Driven Development

TDD works and everybody needs to get over it. The 3 laws of TDD:
  1. You are not allowed to write any production code until you have first written a failing unit test.
  2. You are not allowed to write more of a unit test than is sufficient to fail - and not compiling is failing.
  3. You are not allowed to write more production code than is sufficient to pass the currently failing unit test.
Benefits: Certainty, Defect reduction, Courage, Documentation (where is the first place you go for documentation? if you are a programmer, you look for examples), Design (the tests you write after the fact are defense, the tests you write before are offense).
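
To make the cycle a bit more concrete, here is a minimal sketch of where a few passes through the three laws might end up. It uses plain Java checks instead of a test framework to stay self-contained. The LeapYear class and its checks are my own hypothetical example, not code from the book.

```java
// Hypothetical production class: it grew only as far as the checks below demanded.
class LeapYear {
    static boolean isLeap(int year) {
        if (year % 400 == 0) return true;
        if (year % 100 == 0) return false;
        return year % 4 == 0;
    }
}

class LeapYearChecks {
    // A tiny stand-in for a unit-test assertion.
    static void check(boolean condition, String message) {
        if (!condition) throw new AssertionError(message);
    }

    public static void main(String[] args) {
        // Each check was (hypothetically) written first, seen failing,
        // and then made to pass with the smallest possible change.
        check(LeapYear.isLeap(2016), "2016 is divisible by 4");
        check(!LeapYear.isLeap(2017), "2017 is not divisible by 4");
        check(!LeapYear.isLeap(1900), "1900 is a century not divisible by 400");
        check(LeapYear.isLeap(2000), "2000 is divisible by 400");
        System.out.println("all checks pass");
    }
}
```

The checks double as the kind of example-based documentation mentioned among the benefits above.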

Chapter 6 - Practicing

Even in the '90s, long build times were the norm. Nowadays, programmers don't wait for compiles. 

A coding dojo is an exercise in which the leader types a simple program and the other programmers follow him keystroke by keystroke. A programming kata is a precise set of choreographed keystrokes that simulates the solving of some programming problem. You are practicing the movements and decisions involved in solving the problem. Wasa is a 2-man kata. Two partners choose a kata, one writes a test, the other one makes it pass. Then they reverse the roles.

In one way or another, professionals practice. They do this because they care about doing the best job they possibly can. Practicing is what you do when you aren't getting paid.

Chapter 7 - Acceptance Testing

The more precise you make the requirements, the less relevant they become as the system is implemented. Done means all code is written, all tests pass, QA and stakeholders have accepted. Done.

Implementation work on a feature begins when the acceptance tests for the feature are ready. It is the developer's job to connect the acceptance tests to the system, and then to make them pass. As a professional developer, it is your job to negotiate with the test author for a better test.

Unit tests and acceptance tests are documentation first and tests second.

It is hard to specify a GUI upfront. People want to fiddle with the GUI. The trick is to design the system so that you can treat the GUI as though it were an API rather than a set of buttons, sliders, grids and menus. Therefore, when writing acceptance tests for a GUI, you take advantage of the underlying abstractions that don't change very frequently. Better still is to write tests that invoke the features of the underlying system through a real API rather than through the GUI. The reason is that the GUI is likely to change, making the GUI-specific tests very fragile. So keep the GUI tests to a minimum. The more GUI tests you have, the less likely you are to keep them.

Make sure all your unit and acceptance tests are run several times per day using your continuous integration system. If they fail, then the whole team should stop whatever they are doing and focus on getting the broken tests to pass again.

The only way to effectively eliminate communication errors between programmers and stakeholders is to write automated acceptance tests. They are the perfect requirements document.

Chapter 8 - Testing Strategies

The goal of the development group should be that the QA group should find nothing. Every time QA finds something, the development team should react in horror. They should ask themselves how it happened and take steps to prevent it in the future.

In general, business people tend to write happy path scenarios and QA writes the corner, boundary and unhappy path scenarios. The other role of QA is exploratory tests.

A professional team needs a suite of automated tests which are depicted as a test automation pyramid. 
Unit tests provide as close to 100% coverage as reasonable. Generally this number should be somewhere near 90%. It should be true coverage, not some fake tests just to cover some lines. 
Component tests cover roughly 50% of the system. They are written by the QA and business group with assistance from development. They are usually written using e.g. Cucumber. They are directed more towards happy-path scenarios and very obvious corner cases. The vast majority of unhappy-path scenarios should be covered by unit tests and are meaningless at this level.
Integration tests are written by the architects or lead designers of the system. There could be performance or throughput tests at this level. They are typically not executed as part of CI after every commit, because they take a long time. Their role is to verify that the architecture of the system is sound.
System tests are ultimate integration tests against the fully integrated system. Their intent is to ensure the whole system is constructed correctly. We might see performance and throughput tests at this level.
Exploratory tests are the only manual tests from the pyramid. They are also not scripted. Their intent is the last verification of the system behavior while exploring for unexpected behavior.

Chapter 9 - Time Management

There are 2 truths about meetings:
  • Meetings are necessary.
  • They are huge time wasters.
One of the most important duties of your manager is to keep you out of meetings. Your responsibility is to your project first, so it might be wise to decline a meeting if it is not worth sacrificing your project for theirs.

When the meeting is boring, leave. Simply ask, at the opportune moment, if your presence is still necessary.

Always have an agenda and a goal.
Stand-up meetings should take 1 minute per person, 20s per each of the questions.
Iteration planning meetings should take no more than 5% of the iteration.
Allocate no more than 20 minutes for retrospective and 25 minutes for the demo.

Kent Beck: "Any argument that can't be settled in five minutes can't be settled by arguing." The only thing to do is to get some data.

Switching between physical and mental activities can improve your performance at both.

The Pomodoro technique is recommended. You won't allow anything to disturb you for 25 minutes. You defer all the interruptions until after the tomato time. Then you deal with them and take a 5-minute break. Then you continue, with a longer break after 4 or so tomatoes.

Professionals avoid getting too vested in an idea so they can easily leave it. They do this to avoid spending too much time in blind alleys.

Professionals are always on the lookout for the growing messes, and they clean them as soon as they are recognized.

Chapter 10 - Estimation

Business likes to view estimates as commitments. Developers like to view estimates as guesses.

Professionals communicate the probability distribution of an estimate to management clearly, to enable them to make correct decisions. A good rule of thumb for estimation is (O + 4N + P) / 6, where O is the optimistic estimate, P is the pessimistic one and N is the realistic one.
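
The rule of thumb is simple enough to sketch in code. The class and parameter names below are my own, and the sample numbers are made up for illustration.

```java
class ThreePointEstimate {
    // Weighted mean of the three estimates: (O + 4N + P) / 6.
    static double expected(double optimistic, double nominal, double pessimistic) {
        return (optimistic + 4 * nominal + pessimistic) / 6.0;
    }

    public static void main(String[] args) {
        // Hypothetical task: 1 day at best, 3 days most likely, 12 days at worst.
        System.out.println(expected(1, 3, 12));
    }
}
```

For the made-up task above, the result is 25 / 6, i.e. a bit over 4 days. The long pessimistic tail pulls the estimate above the "realistic" 3 days.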

There are many "games" for group estimations. Flying finger is about estimating with fingers under the table. The scale of the fingers is defined at the beginning of the meeting. Planning poker is similar, but with cards with values e.g. 0, 1, 3, 5, 10. In the end the stories can be assigned to buckets of size with Fibonacci numbers, e.g. 1, 2, 3, 5, 8.

What to do with too large numbers? Split the task into smaller ones. The sum of their estimates will be more precise than the large task estimate.

Chapter 11 - Pressure

As discussed previously we shouldn't commit to deadlines we are unsure we can meet. Quick and dirty is an oxymoron. Dirty always means slow. Don't skip your disciplines in a crunch. If they are the best way to work, they should be followed even in a crisis.

The trick to handling pressure is to avoid it when you can and weather it when you cannot. You avoid it by managing your commitments, following your disciplines and staying clean. You weather it by staying calm, communicating, following your disciplines and getting help.

Chapter 12 - Collaboration

We are happiest when alone focusing deeply on some interesting problem. One of the worst symptoms of dysfunctional team is when each programmer builds a wall around his code and refuses to let other programmers touch it. Professionals pair. 

Chapter 13 - Teams and Projects

We have an issue if projects are so small that every programmer works on a project for 50% or even 25% of his time. There is no such thing as half a person. The problem is worse if the rest of the people in the projects are different.

A nicely gelled team of 12 might have 7 programmers, 2 testers, 2 analysts and a project manager. Professional development organizations don't form teams around projects. They allocate projects to existing gelled teams. This is because teams are harder to form than projects. A gelled team can take on more than one project at a time.

Chapter 14 - Mentoring, Apprenticeship and Craftsmanship

What you learn at school and what you find in a job are often very different things. The problem with teaching in most companies is that the supervision is not technical. The responsibility of teaching the next generation of programmers falls on us, not universities.

Friday, March 17, 2017

My notes about the Clean Code book

These are my personal notes about Robert C. Martin's book Clean Code.

Chapter 2 - Meaningful names

The book starts with the advice that we should use intention-revealing names. E.g. elapsedTimeInDays vs d. Of course we should avoid disinformation and non-informative names. Uncle Bob then explains what he means by using meaningful distinctions. E.g. variables message and theMessage next to each other do not make sense. We should obviously use pronounceable names (generationTimestamp vs genymhdhms) and searchable names (e.g. MAX_CLASSES_PER_STUDENT is more searchable than 7). We shouldn't be cute (e.g. whack method). We should be consistent with names per concept. E.g. decide between fetch, retrieve and get and stick to it.
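
A small sketch of the difference such names make. Only MAX_CLASSES_PER_STUDENT comes from the notes above. The class and method names are hypothetical.

```java
class NamingExample {
    // Searchable, intention-revealing constant instead of a bare 7.
    static final int MAX_CLASSES_PER_STUDENT = 7;

    // Unclear version: what do d and the literal 7 mean?
    static boolean ok(int d) {
        return d < 7;
    }

    // Same check, but the names carry the intent.
    static boolean canEnrollInAnotherClass(int enrolledClassCount) {
        return enrolledClassCount < MAX_CLASSES_PER_STUDENT;
    }

    public static void main(String[] args) {
        System.out.println(canEnrollInAnotherClass(6));
    }
}
```

Both methods compute the same thing, but only the second one can be understood, and searched for, without reading its body.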

We should avoid encodings, as the modern programming languages have strong typing, making them unnecessary. E.g. the "I" prefix before interface names is wrong. Better is no prefix for interfaces and some suffix for implementations, e.g. Impl. The client of our API is usually not interested in the implementations, but in the interfaces.

Class names should be nouns. Words like Manager, Data or Info should be avoided. Class names should not be verbs. Method names should start with a verb.

A new piece of advice for me was that we should prefer solution domain names to problem domain names. E.g. accountVisitor (if visitor as a design pattern is used) is preferred to problem domain names.

Then the chapter talks about adding meaningful context to the names. E.g. add an "address" prefix to variables if it makes the code more readable. Even better is to extract them to an Address class. On the other hand, we shouldn't add unnecessary context, such as a "segment" prefix on all variables, if we are in the context of a "Segmentation" class.

Chapter 3 - Functions

The first rule of functions is that they should be small. Ideally less than 4 lines long. This also means that if a function contains a loop or an if, the inner body should probably be only 1 line long, and that line would probably be another function call, which also has documentary value.

The 2nd rule is that functions should do one thing only. How to validate this? There are at least 2 ways. Either try to describe the function purpose with the TO paragraph. "TO getCustomerName we ask the cache of the customer names to retrieve the name". If the function lines are an abstraction level below given TO paragraph, we're doing something wrong. Another nice heuristic is that if we can extract another meaningful method from the method body, it does more than one thing. If a function has comments about its "sections", it does more than one thing.

Third rule of functions is to have one level of abstraction only. This means e.g. not to mix file system details with domain logic in one method.

Another interesting approach is a step-down rule. Every function should be followed by functions which it uses (one level of abstraction below). We want to read the list of functions as if it was TO paragraphs narrative.

The chapter continues with more concrete examples and tricks. Switch statements are evil, because they cannot do one thing only and they tend to repeat themselves across the code base. Therefore the only place where they make sense is a factory method for a polymorphism mechanism, which replaces all their usages. One nice heuristic for evaluating function quality is command query separation. A function should either do something or return some information. It shouldn't do both. We should prefer exceptions to error codes, as error codes lead to cluttered client code. That said, we should extract try/catch blocks into their own methods, since error handling is one thing.
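
The switch-in-a-factory idea could look roughly like this. It is a minimal sketch with hypothetical employee types and made-up pay numbers, not the book's own listing.

```java
// Hypothetical type hierarchy; the names and numbers are illustrative only.
interface Employee {
    int monthlyPayInCents();
}

class SalariedEmployee implements Employee {
    public int monthlyPayInCents() { return 500_000; }
}

class HourlyEmployee implements Employee {
    private final int hoursWorked;
    HourlyEmployee(int hoursWorked) { this.hoursWorked = hoursWorked; }
    public int monthlyPayInCents() { return hoursWorked * 3_000; }
}

class EmployeeFactory {
    // The only switch on the type code lives here.
    // Everything else relies on polymorphism.
    static Employee create(String type, int hoursWorked) {
        switch (type) {
            case "SALARIED": return new SalariedEmployee();
            case "HOURLY":   return new HourlyEmployee(hoursWorked);
            default: throw new IllegalArgumentException("Unknown employee type: " + type);
        }
    }
}

class PayrollDemo {
    public static void main(String[] args) {
        Employee employee = EmployeeFactory.create("HOURLY", 160);
        System.out.println(employee.monthlyPayInCents());
    }
}
```

Adding a new employee type then means one new class and one new case in the factory, instead of a new case in every switch scattered across the code base. The factory also throws an exception for unknown types rather than returning an error code.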

A large part of the chapter deals with the number of function arguments. The ideal number is zero, then 1, then 2, and it should never be 3 or more. Why? Because arguments require a lot of conceptual power. They are also hard to test. Output arguments are even harder to understand. How can we fight multi-parameter functions? Well, there are e.g. argument objects, which wrap arguments. We can extract a wrapper class for the argument which we would pass to every function. Or we can move an argument to a field instead of passing it between methods. Or we can even extract a class which takes the argument as input in its constructor and delegate the work to this class.
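
A minimal sketch of the argument object trick, assuming a hypothetical Circle that would otherwise take x, y and radius as three separate arguments:

```java
// Hypothetical value object wrapping two arguments that belong together.
final class Point {
    final double x;
    final double y;
    Point(double x, double y) { this.x = x; this.y = y; }
}

final class Circle {
    final Point center;
    final double radius;

    // Before: Circle(double x, double y, double radius), three arguments.
    // After: the x/y pair travels as one concept, so two arguments remain.
    Circle(Point center, double radius) {
        this.center = center;
        this.radius = radius;
    }

    // One argument instead of two loose coordinates.
    boolean contains(Point p) {
        double dx = p.x - center.x;
        double dy = p.y - center.y;
        return dx * dx + dy * dy <= radius * radius;
    }
}

class CircleDemo {
    public static void main(String[] args) {
        Circle circle = new Circle(new Point(0, 0), 5);
        System.out.println(circle.contains(new Point(3, 4)));
    }
}
```

The wrapper is not cheating: x and y were already one concept that deserved a name of its own.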

Chapter 4 - Comments

The chapter starts with a surprising statement for me. Most of the comments are bad comments. They only make up for bad code. It is much better to rewrite the code to make comments unnecessary. We should really think twice before writing a comment. Could we improve the design to make the comment unneeded?

Then the chapter continues with comments which are not bad (but that's a minority):
  • Legal comments.
  • Informative comments, like a comment describing a long regular expression.
  • Explanations of intent.
  • Clarification comments, usually about the code you cannot change.
  • Warning of consequences.
  • TODO comments.
The funniest part of the chapter are the examples of bad comments:
  • Mumbling comments, where you have to examine the code to even understand what the author meant.
  • Redundant comments, like "configuration settings" on the configurationSettings field of a class.
  • Mandated comments, such as all methods must have JavaDoc, lead to the typical JavaDoc abominations.
  • Journal comments when we have versioning systems.
  • Noise comments, like "default constructor".
  • Comments instead of variables or methods. Many comments can be replaced by a simple variable or a method.
  • Obviously: Position markers, Closing bracket comments, Attributions and Bylines, Commented-out code.
  • HTML comments.
  • Too much information, e.g. historical or other non-relevant information.
  • Inobvious comments (which themselves need explanation), e.g. "start with an array that is big enough to hold all the pixels (plus filter bytes), and an extra 200 bytes for header info" from apache commons.
  • Function headers, JavaDoc for non-public stuff.

Chapter 5 - Formatting

I liked the funny explanation of the importance of formatting. Formatting is important. By now (meaning "in chapter 5") we should know that the functionality of the code is not as important as its readability. While functionality will change a lot, code readability will haunt us for a long time.

He starts with vertical formatting. The good heuristic is that the files should be around 200 lines of code long, and almost never over 500 lines of code. We should think of a class as of a newspaper article. Most important stuff should be at the top and less important at the bottom. We should immediately know if we are at the right place by just reading the class name. We should use whitespace to separate concepts visually. We should avoid whitespace to imply close association. Methods that correlate should be kept vertically together. Local variables should be declared as close to place they are first used as possible. Instance variables should be at the class beginning. Methods which are called from other methods should be below them, to be read naturally, as a newspaper article.

Then the chapter continues with horizontal formatting. Uncle Bob likes to set his personal line length limit to 120 characters. I didn't like the rest of this part, as it seemed obvious. Indentation helps reading what is more nested, and so on. I found it funny they had a team meeting, where they decided all the formatting rules in 10 minutes. Our team is certainly not so fast :)

Chapter 6 - Objects and Data Structures

Uncle Bob first explains the real purpose of encapsulation. Hiding the fields is about abstractions. A class should not publish its properties via getters and setters. It should rather offer an interface for manipulating the essence of the data without disclosing the implementation.

The chapter then compares data structures (including POJOs) with objects. Procedural code (i.e. code using data structures) makes it easy to add functions without changing the data structures. OOP makes it easy to add new objects without changing the functions. Each has a disadvantage where the other has an advantage. Then there are hybrids, when a programmer tries to add logic to a data structure. These have all the disadvantages, and should be avoided.

A lot of space is devoted to the Law of Demeter. In short, we should only talk to friends, not strangers. E.g. code context.getOptions().getScratchDir().getAbsolutePath() violates the Law of Demeter, but only if the objects returned are not all data structures, which expose their internal structure by design. It is usually better to split the chain into 3 getter calls, each on a separate line. If the options or scratchDir were not data structures, but OO objects, we should refactor the code. The first idea, to create a complex getter on context, is not very good, because it would soon be too big. We should think more and pick a correct design that simplifies the usage and hides the implementation details.
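
One possible shape of such a design is to tell the context what we actually need, instead of asking it for its parts. The sketch below uses hypothetical stand-ins for the classes in the call chain above. The scratchFileFor method is my own illustration, not the book's exact solution.

```java
import java.io.File;

// Hypothetical stand-in for the options object in the call chain.
class Options {
    private final File scratchDir;
    Options(File scratchDir) { this.scratchDir = scratchDir; }
    File getScratchDir() { return scratchDir; }
}

class Context {
    private final Options options;
    Context(Options options) { this.options = options; }

    // Exposes internals: callers end up navigating Options and File themselves.
    Options getOptions() { return options; }

    // Hides the navigation: callers say what they want, not how to get there.
    File scratchFileFor(String className) {
        return new File(options.getScratchDir(), className.replace('.', '/') + ".class");
    }
}

class DemeterDemo {
    public static void main(String[] args) {
        Context context = new Context(new Options(new File("scratch")));
        File file = context.scratchFileFor("com.example.Foo");
        System.out.println(file.getName());
    }
}
```

The caller no longer knows that a scratch directory or an absolute path exists at all, so those details can change without touching it.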

Chapter 7 - Error handling

The chapter starts with the obvious - we should use exceptions rather than error codes. Another obvious note is about checked exceptions - we basically shouldn't use them (unless we're developing special (e.g. critical) software). The first surprising thing for me was the recommendation to write unit tests for exceptional cases first. This forces us to get the transactionality of the function right from the start.

Another good piece of advice is to create just one exception class for exceptions coming from one part of the system. We can distinguish between errors by the context provided in the exception. We should create multiple types of exceptions only if we really intend to have places where we catch one kind of exception but not another.

The last good piece of advice is not to check for nulls or pass nulls, but to use the special case refactoring instead.
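
A minimal sketch of the special case idea, using a hypothetical customer lookup. All class names here are mine.

```java
import java.util.HashMap;
import java.util.Map;

interface Customer {
    String displayName();
}

class RegisteredCustomer implements Customer {
    private final String name;
    RegisteredCustomer(String name) { this.name = name; }
    public String displayName() { return name; }
}

// Special case object: stands in for "no customer found" instead of null.
class UnknownCustomer implements Customer {
    public String displayName() { return "guest"; }
}

class CustomerRepository {
    private final Map<String, Customer> customersById = new HashMap<>();

    void add(String id, String name) {
        customersById.put(id, new RegisteredCustomer(name));
    }

    // Never returns null, so callers need no null checks.
    Customer findById(String id) {
        return customersById.getOrDefault(id, new UnknownCustomer());
    }
}

class SpecialCaseDemo {
    public static void main(String[] args) {
        CustomerRepository repository = new CustomerRepository();
        repository.add("42", "Alice");
        System.out.println(repository.findById("42").displayName());
        System.out.println(repository.findById("7").displayName());
    }
}
```

The behavior for the missing customer lives in one class instead of being repeated in if (customer != null) checks at every call site.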

Chapter 8 - Boundaries

The chapter starts with the naturally different motivations of providers and consumers of 3rd party code. Providers want to offer the library to the biggest possible audience, while consumers would like the solution to be customized for their needs. Therefore 3rd party code is a problem worth noticing.

We can start exploring 3rd party code with automated tests, which exercise it the way we intend to use it. This way we will not only learn about it, but also have a suite of tests for future upgrades of the library.

The way to manage 3rd party code is to limit the dependent code to a minimum. We can use wrappers around the library. Or we can create an ideal interface for us and an adapter for the 3rd party library. The adapter is very good for yet-unknown dependencies.
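
The "ideal interface plus adapter" approach might look like the sketch below. VendorSmsClient is a made-up stand-in for a 3rd party class. Its send signature and the "+1" country code are assumptions for the sake of the example, not a real library API.

```java
import java.nio.charset.StandardCharsets;

// Made-up stand-in for a 3rd party class with an awkward API (assumption).
class VendorSmsClient {
    int sendCount = 0;

    // Returns 0 on success, as some libraries do.
    int send(String countryCode, String number, byte[] payload) {
        sendCount++;
        return 0;
    }
}

// The interface we wish we had, phrased in our own terms.
interface Notifier {
    boolean notifyUser(String phoneNumber, String message);
}

// The only class in our code base that knows about the vendor API.
class VendorSmsNotifier implements Notifier {
    private final VendorSmsClient client;

    VendorSmsNotifier(VendorSmsClient client) { this.client = client; }

    public boolean notifyUser(String phoneNumber, String message) {
        int status = client.send("+1", phoneNumber, message.getBytes(StandardCharsets.UTF_8));
        return status == 0;
    }
}

class AdapterDemo {
    public static void main(String[] args) {
        Notifier notifier = new VendorSmsNotifier(new VendorSmsClient());
        System.out.println(notifier.notifyUser("5550100", "Your order has shipped"));
    }
}
```

If the library changes or gets replaced, only the adapter has to follow. The rest of the code keeps talking to Notifier.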

Chapter 9 - Unit Tests

The chapter about unit tests is just a basic introduction to them. Author explains the three laws of TDD:
  1. Write test code first.
  2. Write just so much test code that it fails.
  3. Write just so much production code that the failing tests pass.
A big part of the chapter is about the premise that tests should be clean. They should not be second-class citizens when it comes to code quality. Dirty tests are even more expensive than no tests.

A nice rule to follow when writing unit tests is one assert per unit test. Also one concept per test.

Chapter 10 - Classes

This chapter is mostly about two things - high cohesion and low coupling. It starts with the Single Responsibility Principle. A class should have exactly one responsibility, i.e. one reason to change. We should build systems with many small classes which communicate with a small number of other classes.

A class should have a small number of instance variables. Each method should manipulate at least one of the instance variables. The more variables the method uses, the more cohesive it is to the class. When a class loses cohesion, we should split it into two classes.

We should wire classes to interfaces, not concrete implementations. This way, we not only gain low coupling between classes, but we also enable the dependency injection pattern.
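
A minimal sketch of wiring to an interface with constructor injection. RateProvider and PriceConverter are hypothetical names, and the rate of 2.0 is made up.

```java
// The abstraction the client class depends on.
interface RateProvider {
    double rateFor(String currency);
}

// One concrete implementation, here with a made-up fixed rate.
class FixedRateProvider implements RateProvider {
    public double rateFor(String currency) { return 2.0; }
}

class PriceConverter {
    // Depends on the interface only, never on a concrete class.
    private final RateProvider rates;

    // Constructor injection: whoever creates the converter picks the implementation.
    PriceConverter(RateProvider rates) { this.rates = rates; }

    double convert(double amount, String currency) {
        return amount * rates.rateFor(currency);
    }
}

class WiringDemo {
    public static void main(String[] args) {
        PriceConverter converter = new PriceConverter(new FixedRateProvider());
        System.out.println(converter.convert(10.0, "EUR"));
    }
}
```

In a unit test the same converter can be fed a stub, e.g. new PriceConverter(currency -> 0.5), without touching its code.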

Chapter 11 - Systems

This chapter was one of the most boring ones. Maybe because the fight between Spring and EJBs seems to be over and Spring has won. So the chapter only states the obvious:
  • Architecture should be non-invasive.
  • We should write POJOs only.
  • If we have cross-cutting concerns, there are aspects for that.
The whole chapter feels like just an introduction to the Spring Framework.

Chapter 12 - Emergence

One of the shortest chapters. It only presents 4 rules of simple design by Kent Beck. In order of importance a design is simple if it:
  1. Runs all tests (i.e. is covered by automated tests).
  2. Has no duplication.
  3. Expresses the intent.
  4. Minimizes the number of classes and methods.
Then the chapter goes on to explain each of the rules. I would highlight two pieces of advice. The most important way to be expressive is to try. Most people don't. A high number of classes and methods is sometimes the result of pointless dogmatic thinking, such as making an interface for every class.

Chapter 13 - Concurrency

Writing clean concurrent programs is hard. Concurrency helps us decouple what gets done from when it gets done. This can dramatically improve throughput and sometimes even structure of a system. While throughput improvement is an obvious motivation, structural improvement is not so obvious. The system will start to look like many small computers talking to each other, which is supposed to help separate concerns. My experience is that the main (and almost only) motivation for concurrent code is the throughput.

Then there are tips. 
  • We should keep concurrency-related code separate from other code (i.e. concurrency is one responsibility). 
  • We should limit access to shared data.
  • Threads should be as independent as possible. Ideally working with local variables only.
  • We should know our concurrency libraries.
  • We should understand our algorithm models, such as producer-consumer, readers-writers and dining philosophers.
  • Beware dependencies between synchronized methods.
  • Keep the synchronized sections as small as possible.
  • Think about the shut-down process early, if it is important for you to shut down correctly. It is hard to do correctly.
  • Don't ignore one-time system failures, as they can be the concurrency-related issues.
  • Make your threading code pluggable/configurable. E.g. count of threads.
  • Test your code with more threads than processors and on different platforms.
  • Use an automated tool to instrument your code with jiggles, which try to mess up the concurrency to make bugs more visible.
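
A few of these tips can be shown together in a small sketch: the threading code is kept apart from the plain logic, the tasks work on arguments and local variables only, the thread count is configurable, and the shutdown is planned. ParallelSum is my own hypothetical example built on java.util.concurrent, not code from the book.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelSum {
    // Plain single-threaded logic: reads its arguments, writes only local variables.
    static long sumRange(int[] values, int from, int to) {
        long sum = 0;
        for (int i = from; i < to; i++) sum += values[i];
        return sum;
    }

    // All concurrency-related code lives here; the thread count is a parameter.
    static long sum(int[] values, int threadCount) {
        ExecutorService pool = Executors.newFixedThreadPool(threadCount);
        try {
            int chunk = (values.length + threadCount - 1) / threadCount;
            List<Future<Long>> parts = new ArrayList<>();
            for (int start = 0; start < values.length; start += chunk) {
                final int from = start;
                final int to = Math.min(values.length, start + chunk);
                Callable<Long> task = () -> sumRange(values, from, to);
                parts.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> part : parts) total += part.get();
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown(); // release the worker threads even if a task failed
        }
    }

    public static void main(String[] args) {
        int[] values = new int[1000];
        for (int i = 0; i < values.length; i++) values[i] = i + 1;
        System.out.println(sum(values, 4));
    }
}
```

Because the thread count is just an argument, it is easy to run the same code with more threads than processors, as one of the tips above suggests.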

Chapter 14 - Successive Refinement

This chapter is one of the longest ones so far because of lots of code in it. Although the useful information is scarce here, I liked reading the example and seeing how Uncle Bob thinks. The most helpful thing I learned here was how he does refactoring. He never breaks tests. When he wants to introduce more generic parsing of boolean or string arguments, he doesn't immediately replace the old variables with new types. Instead he places the new implementation next to the old one, just to keep the tests passing all the time. Only after the whole implementation was changed to the new way did the old instances get removed.

Chapter 15 - JUnit Internals

For me, this chapter was one of the best chapters of the book. It was an example of code improvement. The source code was of JUnit by Kent Beck, which means the code was already better than most of the code we deal with each day. It had 100% code coverage, although probably line coverage only. Uncle Bob claims to have found a useless IF there, which means that branch coverage couldn't have been 100%.

Chapter 16 - Refactoring SerialDate 

In this chapter, Uncle Bob took a random open-source project and refactored it to be cleaner. Same as in chapter 14, all changes that he made were made while unit tests were passing.  There were many small refactorings done in this chapter. It should have been good, but the form was not good for me. I always had to jump back and forth to the code in the appendix, so it was too long for me. It did not read well.

Chapter 17 - Smells and Heuristics

This chapter is basically a reference of code smells and heuristics. It is partially redundant with the rest of the book, but it is still very useful.

Comments

  1. Inappropriate information should not be in the comments. This includes historical information and so on (mostly metadata info).
  2. Obsolete comments should be deleted.
  3. Redundant comments like exact repeating of what code does are useless.
  4. Poorly-written comments. We should always think how to write a comment, if we must write one.
  5. Commented-out code should be removed. We have source control systems for keeping history.

Environment

  1. A build that requires more than one step is a smell. It should be possible to do it with one command, or the equivalent in an IDE.
  2. Tests require more than one step. It should be possible to run them all in one step.

Functions

  1. Too many arguments are hard to understand.
  2. Output arguments are even harder to understand.
  3. Flag arguments usually mean that the function does not do one thing.
  4. A dead function (one that is never called) is useless and should be removed.
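A small sketch of the flag-argument and output-argument smells from the list above, next to one possible way to avoid each. The class ReportFormatter and its methods are hypothetical and only serve as an illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example of two function smells and possible alternatives.
class ReportFormatter {
    // Smell: the boolean flag reveals that this method does two things.
    static String render(String title, boolean asHtml) {
        return asHtml ? "<h1>" + title + "</h1>" : "# " + title;
    }

    // Alternative: one method per behavior, no flag needed.
    static String renderHtml(String title) { return "<h1>" + title + "</h1>"; }
    static String renderText(String title) { return "# " + title; }

    // Smell: the result is written into the argument (output argument).
    static void appendFooter(List<String> lines) {
        lines.add("-- end --");
    }

    // Alternative: leave the argument alone and return the result.
    static List<String> withFooter(List<String> lines) {
        List<String> result = new ArrayList<>(lines);
        result.add("-- end --");
        return result;
    }
}
```

At the call site, render("Title", true) says nothing about what true means, while renderHtml("Title") reads without looking up the signature.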

General

  1. Multiple languages in one source file, like HTML and Java, Javadoc, English, and so on. We should limit the count as much as possible.
  2. Obvious behavior is not implemented, like not implementing stringToDay(dayName) for "Monday". Not following the principle of least surprise.
  3. Incorrect behavior at the boundaries means that the implementation at boundaries is not done, or not tested.
  4. Overridden safeties, like turned-off warnings, are also a bad sign.
  5. Duplication is what half of this book, or half of GOF patterns are about.
  6. Code at a wrong level of abstraction is quite common and quite hard to get rid of.
  7. Base classes depending on their derivatives is obviously wrong (although it has exceptions).
  8. Too much information exposed, e.g. bloated interface for no reason.
  9. Dead code, i.e. code that can never be executed, should be removed. There are tools for finding it.
  10. Vertical separation means that e.g. related functions are not vertically close enough.
  11. Inconsistency is also about principle of least surprise. Do similar things in a similar manner.
  12. Clutter like default constructor with no body, should be removed.
  13. Artificial coupling, like general enums being inner classes of non-related classes, so other classes have to know about those classes too.
  14. Feature envy is similar to the Law of Demeter, but rephrased: if the methods of a class heavily manipulate the data of another class, the first class envies the features of the other class, and those methods belong there. Sometimes there is no other way.
  15. Selector arguments repeat the flag arguments smell, with a note that it applies not only to booleans, but also to enums, integers and so on.
  16. Obscured intent by e.g. Hungarian notation is not necessary.
  17. Misplaced responsibility like when function names do not tell what functions really do.
  18. Inappropriate static methods should sometimes be instance methods. We should only create static functions if we are sure we don't need polymorphism for them.
  19. Heuristic: Use explanatory variables is the first heuristic (in contrast with smells so far). We should use them.
  20. Heuristic: Function names should say what they do, duh.
  21. Heuristic: Understand the algorithm we should. We should not skip understanding the other people's code.
  22. Heuristic: Make logical dependencies physical, e.g. don't duplicate a constant, but create a getter for it to get it in another class.
  23. Heuristic: Prefer polymorphism to if/else or switch/case statements.
  24. Heuristic: Follow standard conventions of your platform/language/etc.
  25. Heuristic: Replace magic numbers with named constants.
  26. Heuristic: Be precise, e.g. don't use floats for money.
  27. Heuristic: Structure over convention means we should prefer ways to force structure e.g. by polymorphism over relying on other programmers copying switch/case statements the same way as we designed them.
  28. Heuristic: Encapsulate conditionals so they are better understood.
  29. Heuristic: Avoid negative conditionals as they are harder to understand.
  30. Heuristic: Functions should do one thing.
  31. Hidden temporal couplings should be made explicit, e.g. by having the first method return a value which the second method takes as an argument.
  32. Heuristic: Don't be arbitrary means we should have a reason for how we design our code.
  33. Heuristic: Encapsulate boundary conditions means that we should even encapsulate +1s if they are spread over our code.
  34. Heuristic: Functions should have statements on only 1 level of abstraction.
  35. Heuristic: Keep configurable data at high levels means that e.g. constants should be in classes high in the namespaces hierarchy.
  36. Heuristic: Avoid transitive navigation is the Law of Demeter. We should not call deep into the classes we use.
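The hidden temporal couplings item is the hardest one to picture from a single sentence, so here is a minimal sketch of the idea. The Pipeline class and its Raw and Trimmed types are hypothetical names of mine, not code from the book. Each step returns a type that only the next step accepts, so the compiler enforces the order of the calls.

```java
// Hypothetical three-step pipeline where the call order is enforced by types.
class Pipeline {
    // Intermediate results: each one can only come from the previous step.
    static final class Raw {
        final String text;
        Raw(String text) { this.text = text; }
    }

    static final class Trimmed {
        final String text;
        Trimmed(String text) { this.text = text; }
    }

    static Raw read(String input) {
        return new Raw(input);
    }

    // Cannot be called before read(), because it needs a Raw.
    static Trimmed trim(Raw raw) {
        return new Trimmed(raw.text.trim());
    }

    // Cannot be called before trim(), because it needs a Trimmed.
    static int countWords(Trimmed trimmed) {
        return trimmed.text.isEmpty() ? 0 : trimmed.text.split("\\s+").length;
    }

    static int wordCount(String input) {
        Raw raw = read(input);
        Trimmed trimmed = trim(raw);
        return countWords(trimmed);
    }
}
```

With three void methods working on shared fields, nothing would stop somebody from reordering the calls. Here the reordered version simply does not compile.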

Java

  1. Use wildcards when importing to save space.
  2. Don't inherit constants from an interface, use a static import for that.
  3. Use enums instead of some constants.
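A short sketch of the last two items, under my own assumptions: Shapes and Shape are made-up names, and Math.PI stands in for a constant that would otherwise be inherited from a constants interface.

```java
// A static import instead of "implements SomeConstants".
import static java.lang.Math.PI;

class Shapes {
    // An enum instead of int constants such as CIRCLE = 0 and SQUARE = 1.
    // Unlike an int, an enum is type-safe and can carry its own behavior.
    enum Shape {
        CIRCLE {
            double area(double size) { return PI * size * size; }
        },
        SQUARE {
            double area(double size) { return size * size; }
        };

        abstract double area(double size);
    }
}
```

A method that takes a Shape cannot be called with an arbitrary number like 42, which is the main advantage over int constants.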

Names

  1. Use descriptive names, e.g. variable name "a" is bad.
  2. Choose names at appropriate level of abstraction, e.g. a Modem interface should have a connect method instead of a dial method.
  3. Use standard nomenclature where possible, e.g. a "Decorator" in a class name which is a decorator.
  4. Unambiguous names, e.g. no rename, doRename and renameInternal.
  5. Use long names for long scopes, e.g. "i" is OK for the short scope of a for-cycle, but the longer the scope of a variable/method is, the longer its name should be.
  6. Avoid encodings, like Hungarian encoding. They are obsolete these days.
  7. Names should describe side-effects, e.g. createOrReturnObjectOutputStream.

Tests

  1. Insufficient tests means the test coverage is too low.
  2. Use a coverage tool, otherwise you never know.
  3. Don't skip trivial tests, they have documentary value.
  4. Ignored tests can be used to question requirements.
  5. Test boundary conditions, obviously.
  6. Exhaustively test near bugs, as they tend to congregate.
  7. Patterns of failure are revealing means that sometimes failures in tests lead you to a bug fix.
  8. Test coverage patterns can be revealing means that the code which is not executed by passing tests might contain the reason why the failing tests are failing.
  9. Tests should be fast, otherwise you are unlikely to run them often.

Appendix A - Concurrency II

This chapter was supposed to go deeper into concurrency problems, but I still think it just scratched the surface. He restated that concurrency can improve throughput. He also repeated that concurrency-related code should be kept separate from other code, because of the Single Responsibility Principle. Then there was a nice example, using bytecode, of code that is not atomic (e.g. ++i). We should generally be aware which code is shared, and which parts of it are atomic and which are not. Then he introduced Java libraries for concurrency, such as AtomicInteger, ConcurrentHashMap or Executor. He continued with an example of why Iterator is inherently not thread-safe (because of temporal dependencies between its methods). We must create an adapter for it if we want to share it between threads. Then he explained a deadlock and 4 possible ways to prevent it:
  1. Avoid mutual exclusion, e.g. by increasing the number of resources so there are more of them than threads. 
  2. Break Lock and Wait, so don't wait. Release all locks if you need a resource which is locked. This could lead to starvation or a livelock.
  3. Break Preemption is similar to the previous solution, but it allows some waiting. It would contain a strategy for requests for resources between threads.
  4. Breaking Circular Wait is the most common approach, which requires the resources to be locked in agreed order by each thread. This also has disadvantages, because this could lock a not-yet-needed resource too early, or it could be plain impossible if the 2nd resource to be locked depends on a result from the 1st locked resource.
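The fourth option, locking in an agreed order, can be sketched like this. The Account class is a hypothetical example of mine, and it assumes that every account has a unique id which defines the global lock order.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical account; the id exists only to define a global lock order.
class Account {
    final int id;
    private long balance;
    final ReentrantLock lock = new ReentrantLock();

    Account(int id, long balance) {
        this.id = id;
        this.balance = balance;
    }

    long balance() {
        lock.lock();
        try {
            return balance;
        } finally {
            lock.unlock();
        }
    }

    static void transfer(Account from, Account to, long amount) {
        // Always lock the account with the smaller id first,
        // no matter in which direction the money flows.
        Account first = from.id < to.id ? from : to;
        Account second = first == from ? to : from;
        first.lock.lock();
        try {
            second.lock.lock();
            try {
                from.balance -= amount;
                to.balance += amount;
            } finally {
                second.lock.unlock();
            }
        } finally {
            first.lock.unlock();
        }
    }
}
```

If transfer locked from first and to second, two threads transferring in opposite directions could each hold one lock and wait forever for the other one. With the agreed order, both threads compete for the same first lock, so the cycle cannot form.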
The last part of the chapter repeated that there are test libraries for instrumenting production code so that it fails on concurrency problems much more frequently.
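To illustrate the point about non-atomic increments and AtomicInteger from earlier in this chapter, here is a minimal sketch. CounterDemo and its methods are my own hypothetical names. The plain counter may lose updates when several threads increment it at once, while the atomic one should not.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo comparing a plain counter with an AtomicInteger.
class CounterDemo {
    private int plainCount = 0;
    private final AtomicInteger atomicCount = new AtomicInteger();

    void increment() {
        plainCount++;                   // read, add, write: not atomic
        atomicCount.incrementAndGet();  // one atomic operation
    }

    int plain() { return plainCount; }
    int atomic() { return atomicCount.get(); }

    // Starts the given number of threads and waits until all of them finish.
    static CounterDemo runConcurrently(int threads, int incrementsPerThread) {
        CounterDemo demo = new CounterDemo();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    demo.increment();
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return demo;
    }

    public static void main(String[] args) {
        CounterDemo demo = runConcurrently(4, 100_000);
        System.out.println("atomic = " + demo.atomic());
        System.out.println("plain  = " + demo.plain() + " (can be lower, updates may get lost)");
    }
}
```

The plain counter does not fail on every run, which is exactly why this kind of bug is so hard to catch with ordinary tests.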