
Chernobyl Driven Development: 10 lessons learnt from the miniseries

Or, how I learned to stop worrying and love failing fast.

People in the software community nowadays do Everything Driven Development. Test Driven Development. Domain Driven Design. Behavior Driven Development. But I’m here to bring you the latest fashion of x driven development: Chernobyl Driven Development! (The acronym should be ChDD, in order to distinguish it from the better known CDD, Community Driven Development). Inspired by the smashing success of the HBO series about the Chernobyl nuclear disaster, these are 10 lessons learnt from it so you too can make sure your project is a bomb.

(If you haven’t watched the series yet, do it: it’s great, and it’s only five episodes long. I’ll wait. And if you don’t, I won’t spoil too much.)

1. Test in production

In Chernobyl there were many reasons why everything failed. One of them was that they were running a test procedure in the power plant to make sure the backup system actually worked (which is not actually a bad idea). So why not test in production? It is essential to make sure things really work properly. You can break things, sure, but what’s the worst thing that could happen, besides ruining human lives? Actually, this is not a bad idea if you’re not running an actual nuclear reactor, and you limit testing to a small surface area. So if you’re deploying to production, you can do a canary deploy: release to a small fraction of users first, so you don’t affect everybody at the same time.
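For the curious, here is a minimal sketch of how a canary split might be decided. It is not taken from any real deploy tool: `bucketFor`, `pickRelease` and the `canaryPercent` parameter are names I made up for illustration, and the hash is a simple FNV-1a style one chosen only because it is short.

```typescript
// Hypothetical helpers: none of these names come from a real deploy tool.

// Map a user id to a stable bucket in the range 0..99 with a small
// FNV-1a style hash, so the same user always lands in the same bucket.
function bucketFor(userId: string): number {
  let hash = 2166136261;
  for (let i = 0; i < userId.length; i++) {
    hash ^= userId.charCodeAt(i);
    hash = Math.imul(hash, 16777619);
  }
  // Force an unsigned 32-bit value before taking the remainder.
  return (hash >>> 0) % 100;
}

// Send a user to the canary release only if their bucket falls under the
// rollout percentage; everyone else stays on the stable release.
function pickRelease(userId: string, canaryPercent: number): "canary" | "stable" {
  return bucketFor(userId) < canaryPercent ? "canary" : "stable";
}
```

With `canaryPercent` at 0 nobody gets the new release, and at 100 everybody does. Because the bucket depends only on the user id, raising the percentage adds users to the canary without reshuffling the ones already in it.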

You could also have a test environment. These are expensive and, in the case of a power plant, really impossible.

2. Productivity is king

Of course, you might wonder why they didn’t run the test before going live. You know why? Because productivity matters. We live in a cutthroat world of competition, and you need to optimize to reach the market fast and deliver value (or in the case of Chernobyl, electricity). So they did the obvious thing: they put the plant live in production, and later they ran the tests while the plant was already serving power. You see that we in the software world are already doing Chernobyl Driven Development, since we often add tests after the features have already been deployed, if we test at all. Since tests don’t deliver value by themselves, they aren’t productive, so it’s often all too tempting not to add them at all (the product might fail in the market, after all).

3. Cut costs

Another reason why Chernobyl was a glowing success: cutting costs. Some safety measures were ignored, while others were in place but didn’t work as they were supposed to. Remember: Agile organizations are customer centric, which basically means the customers like things cheap, and as long as things don’t blow up, that’s fine. If they do blow up, the customer is dead, so who cares. Don’t pay too much attention to the lack of tests.

4. Be opaque

Having a well-written plan or documentation is anti-agile. As we saw in the TV series, the workers had been handed a plan which had struck-out parts and was basically just a sketch. This is good. This is good because documentation doesn’t deliver value, and having a plan stifles creativity. It’s best to improvise because you might not always know what will come next, which makes things exciting and makes you feel alive. The problem with plans is that they try to look too far into the future: you need to open Schrödinger’s box to actually see what’s inside (spoiler alert: the cat was dead). Related to this, don’t have too many policies. Policies are bad, they’re bureaucratic, and bureaucracy is bad. It is known.

Another way to be opaque: don’t communicate, not even when you’re running tests in production. Many people in the power plant didn’t even know a test was being done. That’s ok. Communication is noise, and noise doesn’t deliver value. Value is best delivered in small teams that are isolated from each other. That’s how you isolate failure, as you can see in Chernobyl, where only two or three people got blamed, instead of figuring out that the whole organization was rotten.

5. Push the limits

Nobody likes nay-saying prophets of doom. They never actually get anything done, and Agile is about getting things done. You need to push the limits. Like they did at Chernobyl, when they pushed the system to its limits. You have to admire the revolutionary comrade Anatoly Dyatlov who, in order to run the test on the power plant and get a promotion, dared to push the system to the limit and spared no resources to do it, even pressing his workers to go ahead despite their warnings. He resorted to intimidation, sure. But I think here in the Western capitalist world we have an improved method. Veiled threats are a much more effective way to intimidate people than screaming at the top of your lungs. Or even better, cheering people on with a smile and compliments. That disarms their fears so they can get things done. Remember lesson #2.

Sure, Anatoly failed. But that’s what this is about. Failing.

6. Put untrained people in front of the controls

People always complain about how Gendo Ikari pushed Shinji to get in the Evangelion robot despite having no training and being 14 years old. In the show, Ulana Khomyuk (a fictional character who represents the scientists who worked to repair the damage done by Chernobyl) is surprised at the young age of the worker in front of the controls. That’s fine! The only way to learn how to ride a bike is getting on the production bike. Sure, you might be untrained, but see the previous lessons.

In the software world we do it all the time. You might have heard about the impostor syndrome. It is described by Wikipedia as “a psychological pattern in which an individual doubts their accomplishments and has a persistent internalized fear of being exposed as a "fraud". Despite external evidence of their competence, those experiencing this phenomenon remain convinced that they are frauds, and do not deserve all they have achieved.”

This syndrome is prevalent in our industry. Many people have offered their opinion on why that is, but I’ve never seen anybody give the obvious answer: because we’re all impostors. We aren’t adequately trained, we improvise, we’re isolated and we pose all the time. It also doesn’t help that we tend to be ignorant of the history of computer science (if it is a science), and thus we reinvent the wheel all the time, badly in most cases.

Of course, some people have tried to push back against this, by letting other people know that it is ok to not know things, and that you can get things done without knowing everything (this is actually true, as long as you don’t ask yourself whether getting things done is the sole measure of success). But usually it’s merely a chat programmers have with each other. You’re still being sold as a senior expert to the customer.

7. Minimize risks

In the show we saw how they minimized risks successfully. A great way to minimize risks is having ill-fitting measuring tools, like the instruments they used at Chernobyl that reported levels of radiation much lower than the real ones. This is good, because then you can assert things are safe even when they’re not, and we all know safety is primarily rhetorical and psychological. Metrics are essential for that in software: they are quantitative and not qualitative, which means they tell nothing of value while pretending to be objective and non-debatable. For example, you can have a test suite with no assertions and a coverage of 100%. Metrics without thoughtful reflection are great for upper management who are isolated from the work the teams are actually doing, which if you recall is good because opacity and isolation minimize communication, which is noise, which is anti-productivity.
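To make the coverage trick concrete, here is a small sketch. Everything in it is invented for illustration: `isSafe` is a hypothetical function with a deliberately inverted comparison, the readings are made-up numbers, and the two "tests" are plain functions rather than the API of any particular test framework.

```typescript
// Hypothetical function with a deliberate bug: it should report whether a
// reading is within the limit, but the comparison is inverted.
function isSafe(reading: number, limit: number): boolean {
  return reading > limit; // bug: should be reading <= limit
}

// A "test" that runs every line of isSafe and checks nothing.
// A line-coverage tool would count isSafe as fully covered.
function testWithoutAssertions(): void {
  isSafe(3.6, 5);
  isSafe(15000, 5);
}

// The same two calls, with the results actually checked.
// Returns a description of every expectation that did not hold.
function testWithAssertions(): string[] {
  const failures: string[] = [];
  if (isSafe(3.6, 5) !== true) {
    failures.push("3.6 under a limit of 5 should be safe");
  }
  if (isSafe(15000, 5) !== false) {
    failures.push("15000 over a limit of 5 should not be safe");
  }
  return failures;
}
```

Both functions exercise exactly the same lines, so the coverage number is identical. Only `testWithAssertions` notices that the instrument is lying: it comes back with two failures, while `testWithoutAssertions` finishes quietly and everybody goes home reassured.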

Another way to minimize risks is to pretend they’re not there (be opaque!), like they did at Chernobyl by not letting the workers know about the danger of letting the reactor get to the point it did and then pushing the emergency shutdown (spoiler alert: the danger was having a nuclear explosion).

You might have been misled into thinking that minimizing risks was actually about working with risks and having balancing forces that act as a counterpoint to these risks. You have an actual lesson on how to do that in the show, when Valery Legasov shows us a dynamic system of forces while explaining how the system works. The problem with this is that making a model of forces would bring us closer to engineering and to doing serious work. We’re engineers in title only. Engineering is anti-agile, we are deliverers of value, and we already learnt that models and documentation don’t deliver value.

Another way to minimize risks would be to study the history of computer science. That would help us learn from past mistakes. There is even modern tooling that has learnt from this history and aims to create safer ways to program. Good examples of this are Rust, Haskell and Erlang, programming languages that, with different strategies, aim to make programs more robust and less prone to failure. We could do that, but the problem is that JavaScript is more productive.

8. Tell, don’t ask

A great principle of object oriented design is also applied in Chernobyl: tell people what to do, don’t ask questions. Especially if they are obvious questions begging to be asked, don’t ask them. See also opacity. People don’t get fired for not asking questions, people get fired for not getting things done. When in doubt, divine your way into things. (As an aside, I would be surprised if anybody asked me anything about this piece of writing.)

9. Cheer on the greatness of the Soviet Union

This is the key ingredient. We are the best. We are rockstars. We’re all comrades. Repeat it ad nauseam. Keep calm and carry on. Nobody likes nay-sayers. A spirit of camaraderie is essential to success and productivity.

Be careful though, you don’t want to actually empower workers.

10. Learn from failures and repair this broken world

At the beginning the null filled the earth, but the Lord said, let there be Maybe. Chernobyl blew up, true. But it could have been much worse. The series is frustrating, but also inspiring. For every Anatoly there is a Legasov. We don’t need to program in JavaScript. Or if we do, we can try to clear the radioactive rubble from the roof. We can do it, little by little, replacing each piece of the callback hell with a Promise.
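If you want to see what one shovelful of rubble looks like, here is a rough sketch. The legacy API is hypothetical: `readSensor`, its error-first callback and the reading it returns are all invented for the example, with `setTimeout` standing in for real asynchronous I/O.

```typescript
// Hypothetical legacy API with an error-first callback.
type Callback = (err: Error | null, value: number) => void;

function readSensor(sensorId: string, done: Callback): void {
  // setTimeout stands in for real asynchronous I/O.
  setTimeout(() => {
    if (sensorId === "") {
      done(new Error("missing sensor id"), NaN);
    } else {
      done(null, 3.6); // made-up reading
    }
  }, 0);
}

// Before: every extra step adds a level of nesting and another error check.
function averageWithCallbacks(a: string, b: string, done: Callback): void {
  readSensor(a, (errA, first) => {
    if (errA) { done(errA, NaN); return; }
    readSensor(b, (errB, second) => {
      if (errB) { done(errB, NaN); return; }
      done(null, (first + second) / 2);
    });
  });
}

// The replacement piece: wrap the callback API in a Promise, once.
function readSensorAsync(sensorId: string): Promise<number> {
  return new Promise<number>((resolve, reject) => {
    readSensor(sensorId, (err, value) => {
      if (err) reject(err);
      else resolve(value);
    });
  });
}

// After: a flat chain with a single place for errors to surface.
// Note that Promise.all starts both reads at once instead of one after the other.
function averageWithPromises(a: string, b: string): Promise<number> {
  return Promise.all([readSensorAsync(a), readSensorAsync(b)])
    .then(([first, second]) => (first + second) / 2);
}
```

The wrapper is the only place that still knows about callbacks, so the rest of the code can move over one call site at a time. A caller would write something like `averageWithPromises("roof", "basement").then(console.log).catch(console.error)`, with the sensor names being placeholders.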