My last post on top-down development attracted a lot of attention from the Twittersphere and lots of comments. The vast majority of these were constructive, whether they agreed with me or not. I am delighted that the post sparked such a response because we can only improve approaches through challenge and discussion. It’s well worth looking at Robert Martin’s Clean Code blog where he has taken the time to rebut the points I made (Thanks, Bob). I think he has some things wrong here but I’ll address them in a separate post.
As I make clear in Chapter 8 of my book on software engineering, I think TDD is an important step forward in software engineering. There are some classes of system where it is clearly appropriate, one of which is web-based, consumer-facing systems, and I believe that the use of TDD in such circumstances makes sense. I think that the key characteristics of ‘TDD-friendly’ systems are:
- A layered architecture. A point made by several commentators was that, even when GUIs are hard to test, a layered architecture simplifies the overall testing process. Absolutely right – when you can structure an architecture with a presentation layer, an application logic layer and a data management layer, these can be tested separately.
- Agreed success criteria. When the stakeholders in a system agree on what constitutes success, you can define a set of tests around these criteria. You don’t need a detailed specification but you do need enough information to construct such a specification (and maybe represent it as a set of tests) as you are building the system.
- A controllable operating environment. By this, I mean an environment where you don’t have to interact with other systems that you can’t control and which may, by accident or design, behave in ways that adversely affect the system you are developing. Without such an environment, the problem is designing for resilience, and deep program analysis is much better for this than (any kind of) testing.
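To illustrate the first of these points, here is a minimal sketch (all names are hypothetical, not from any real system) of how a layered design lets the application logic layer be tested directly, with no GUI involved:

```python
# Hypothetical sketch: in a layered design, the application logic
# layer depends only on the data layer's interface, so it can be
# unit-tested without any presentation code.

class InMemoryAccountStore:
    """Stand-in for the data management layer."""
    def __init__(self):
        self._balances = {}

    def get_balance(self, account_id):
        return self._balances.get(account_id, 0)

    def set_balance(self, account_id, amount):
        self._balances[account_id] = amount


class TransferService:
    """Application logic layer: no knowledge of any UI."""
    def __init__(self, store):
        self.store = store

    def transfer(self, src, dst, amount):
        # Reject invalid or unaffordable transfers.
        if amount <= 0 or self.store.get_balance(src) < amount:
            return False
        self.store.set_balance(src, self.store.get_balance(src) - amount)
        self.store.set_balance(dst, self.store.get_balance(dst) + amount)
        return True


# Test-first style: the tests exercise the logic layer directly.
store = InMemoryAccountStore()
store.set_balance("alice", 100)
service = TransferService(store)
assert service.transfer("alice", "bob", 30) is True
assert store.get_balance("alice") == 70
assert store.get_balance("bob") == 30
assert service.transfer("alice", "bob", 1000) is False  # insufficient funds
```

The in-memory store here also shows why TDD and layering reinforce each other: the data layer can be swapped for a trivial fake, so the logic tests stay fast and deterministic.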
I started with TDD on a class of system that met these criteria and I liked it. It worked well for me. Then, I moved on to a system which was concerned with visualizing complex linked structures. Now, the thing about visualization is that (a) it’s often much more difficult to have clearly separate layers – the UI is the program – and (b) it’s very hard to have pre-defined success criteria – you basically have to program by experiment and see what works and what doesn’t. TDD didn’t work. Of course, this may be due to my inexperience but I think there is more to it than that. Essentially, I think that if a system does not have the above characteristics, then TDD is inherently problematic.
It is unrealistic to think that all systems can be organised as a layered model. For example, if you are building a system from a set of external services, these are unlikely to fit neatly into layers. Different services may have different interaction models and styles and your overall UI is inevitably complex because you have to try and reconcile these. If you have a system that involves rich user interaction (e.g. a VR system), then most of the work is in the UI code. I’ll discuss the myth of a ‘thin’ UI in a separate post.
It is equally unrealistic to think that we can always have agreed success criteria for a system, just as it’s unrealistic to have complete program specifications. Sometimes, stakeholders who have significant influence choose not to engage in system development but don’t like what they see when they get it. Some problems, such as visualisation, are ones where you work by trial and error rather than from a definitive idea of what the system should do. If you are not sure what you are trying to test, then TDD is challenging. In those circumstances, you build the system planning to throw at least one version away. And, maybe, if you finally get agreement, you can use TDD for the final version.
The problem of a controllable operating environment is one that isn’t often mentioned by software engineering commentators. When you put software into a complex system with people and other hardware devices, you will get lots of unexpected inputs from various sources. The classic way of handling this is to force a UI on system users that limits the range of their interaction, so that the software doesn’t have to handle inputs it doesn’t understand. Bad data is eliminated by ignoring anything that doesn’t meet the data validation criteria defined by the system. So far, so good – except that you get frustrated users who can’t do what they want through the UI (think of the limitations of e-banking systems), and unexpected data from sensors just gets ignored.
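The ‘ignore anything that fails validation’ strategy described above can be sketched in a few lines (the range and names here are illustrative assumptions, not from any real system):

```python
# Illustrative sketch of classic input filtering: any reading that
# fails the validation rules is silently discarded rather than handled.

def valid_reading(reading):
    """Accept only numeric sensor readings in an assumed expected range."""
    return isinstance(reading, (int, float)) and -50.0 <= reading <= 150.0

def process_readings(raw_readings):
    # Malformed or out-of-range values are simply dropped - exactly
    # the behaviour that becomes dangerous when the ignored data
    # is the signal that something has gone badly wrong.
    return [r for r in raw_readings if valid_reading(r)]

print(process_readings([22.5, "error", -999, 37.0]))  # [22.5, 37.0]
```

The point of the sketch is that the filter is easy to test – but no test of it tells you anything about what the discarded values meant.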
This is all very well until you are faced with a situation where ignoring data means that your system breaks the law; where ignoring sensor data means that systems fail in catastrophic ways and kill or injure people; where stopping users interacting with the system means that they can’t respond to system failure and limit the damage caused by that failure.
So, sometimes, you simply have to deal with ‘unknown unknowns’. By definition, you can’t test for these and if you can’t test, how can you use TDD? Deep program analysis and review is the only way that you can produce convincing evidence that unexpected events won’t move the system into an unsafe state.
I don’t believe that there is such a thing as a universal software engineering method that works for all classes of system. TDD is an important development but we need to understand its limits. I may have missed it, but I have never read anything by experienced TDD practitioners that discusses the kinds of system where it’s most effective and those where it might not work so well.