The Evil Defaults

Nahuel Garbezza

Sep 27, 2018

(This article is also available in Spanish, if you prefer: see "Los defaults malvados")

Intro

Working on a project with a lot of legacy code always teaches good lessons. This time I'll present what I like to call evil defaults: pieces of code that can do a lot of harm in any software project.

What?

An evil default is a piece of code that acts (as its name indicates) as the default option in a decision flow and that, under certain circumstances, causes problems in the system when it is evaluated.

That piece of code can be a single value, or a set of actions with effects, or even a no-op (doing nothing is also doing something!).

In a very abstract way:

if business_rule_a_happens
  do_a
elsif business_rule_b_happens
  do_b
else
  # sometimes we don’t know what to do ¯\_(ツ)_/¯
  do_something_that_might_be_ok
end

We don't like to write if statements, but it's just to show the idea :-) We prefer polymorphic objects to take care of these decisions. Either way, whether it's a branch in an if statement or a polymorphic object to represent that case, we are making decisions. And we need to choose wisely what goes into the else branch, because it is a rule that will apply to a lot of cases, even some we probably don't even imagine.

The evil default gets worse as we continue on the execution flow, because a wrong decision can put our system in an inconsistent state, and unless there’s an exception the program will continue doing things.

So, to summarize the problems:

The system seems to work, but in an incorrect way. There's a false feeling of success.
When we realize there's an error, it's probably too late. Debugging and fixing time is usually high for this kind of bug.
The code does not reflect what needs to be modeled. If we don't know what to do in a given situation (and even if the client doesn't know either), the best thing to do is just to fail.

Why?

Programmers learn something about the problem space and then try to reflect it in a computable model. Obviously, you can't have a perfect picture at every moment. It is an iterative and incremental process: as we move forward we become more expert in the subject we are dealing with, but in the meantime we have to make decisions. To the computer, there are no grey areas in the software we have to build, even though we haven't yet understood the whole problem well enough.

So lacking domain knowledge is the main reason to introduce evil defaults, and when we introduce them we don't realize we are introducing something bad. We think that it just works. Today.

Another cause of evil defaults is oversimplification: maybe we understood the problem, but we think one single default value or action will handle it well. Then we sadly realize that's not the case.

Also, we tend to have some "fear of failure" and prefer, erroneously, to pick an option (object/value/no-op/whatever) instead of raising an error that reflects the reality: we don't know what to do.

A real life example

We were building an automated order import process for external marketplaces (such as eBay or Amazon). We hit an API to get all the pending orders and stored them in our system. We needed a consistent country mapping so that, for instance, ARG gets mapped to 'Argentina' (that's how we store it in our system). Our logic had an evil default: it chose USA as the country whenever there was no entry in our country map.

We had very limited shipping options at that time, so that was not a problem for a while. But when we started to support new countries, our order importer started to behave incorrectly, and we only noticed a few days later, when customers were complaining about their items not being delivered. The cause of those problems was not obvious. The fix itself was easy, but fixing the affected orders was hard.
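To make the failure mode concrete, here is a minimal sketch of what such a mapping might look like. The original code is not shown in this article, so the constant, the method name, and the map entries (including how the USA is stored) are all hypothetical.

```ruby
# Hypothetical country map; entries are illustrative assumptions.
COUNTRY_NAMES = {
  'ARG' => 'Argentina',
  'USA' => 'United States'
}.freeze

# The evil default: any code missing from the map silently becomes the USA.
def country_name_with_evil_default(code)
  COUNTRY_NAMES.fetch(code, 'United States')
end

country_name_with_evil_default('ARG') # => "Argentina"
country_name_with_evil_default('BRA') # => "United States" (wrong, and nothing fails)
```

Nothing in this code looks broken, which is exactly the problem: the order for Brazil is stored, shipped and reported as a success.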

How to solve them?

As I mentioned before, the best way to model an unknown case is to raise an error: stop the execution flow and wait for a human to look at it and handle it.

We would like to be aware when this situation happens, and to have all the necessary information to tackle it, so we need:

Good error messages: descriptive, with additional context that could be helpful (parameters, IDs of involved objects, configuration values, etc.).
Exception reporting services, probably connected to an alert system. Something like Rollbar or Honeybadger.
Someone responsible for acting when those errors occur. For instance, in our current project we have created a rotating triage role, and the person in charge (a different one each week) is responsible for looking at those errors and reporting them to the right person/team.
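A sketch of the first point, applied to the country mapping example. The error class, its attributes, and the method name are assumptions made for illustration; the idea is only that the error carries enough context (the unknown code and the order being imported) for whoever is triaging it.

```ruby
# Hypothetical error class; the name and attributes are assumptions.
class UnknownCountryCode < StandardError
  attr_reader :code, :order_id

  def initialize(code:, order_id:)
    @code = code
    @order_id = order_id
    super("Unknown country code #{code.inspect} while importing order #{order_id}")
  end
end

# Hypothetical country map, reduced to one entry for the example.
KNOWN_COUNTRIES = { 'ARG' => 'Argentina' }.freeze

def country_name!(code, order_id:)
  # Hash#fetch runs the block only when the key is missing.
  KNOWN_COUNTRIES.fetch(code) do
    raise UnknownCountryCode.new(code: code, order_id: order_id)
  end
end

begin
  country_name!('BRA', order_id: 42)
rescue UnknownCountryCode => e
  # In a real project, this is where the error would reach the reporting service.
  warn e.message
end
```

The import of that single order stops, and the message says which code was missing and for which order, so fixing it is a matter of adding a map entry and retrying.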

Are there good defaults?

Of course! There's an interesting thread on Software Engineering's Stack Exchange that talks about it, and there's an answer with a very good example. We all know that some protocols have a default port associated with them; for instance, FTP uses port 21 by default. So if we open an FTP connection, it sounds safe to assume 21 will be the default, and nobody will question it. This is something we learned from the domain.
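As a rough illustration, a default backed by the domain could be sketched like this. The table and the method name are hypothetical; the port numbers are the well-known ones for each protocol (21 is FTP's control port).

```ruby
# Well-known ports per protocol; this small table is illustrative.
DEFAULT_PORTS = { 'ftp' => 21, 'http' => 80, 'https' => 443 }.freeze

# Hypothetical helper: an explicit port wins, otherwise use the protocol's default.
def port_for(scheme, explicit_port = nil)
  return explicit_port if explicit_port

  # The default comes from the domain; an unknown protocol still fails loudly.
  DEFAULT_PORTS.fetch(scheme) do
    raise ArgumentError, "No default port known for scheme #{scheme.inspect}"
  end
end

port_for('ftp')       # => 21
port_for('ftp', 2121) # => 2121
```

Note that the good default and failing fast coexist here: the default only covers the cases the domain actually defines.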

Conclusions

Think before putting a default. Ask yourself these questions: Am I sure that all those cases should be caught by this logic? Am I hiding potential bugs? Am I oversimplifying the problem?
Defaults should have tests. That way we'll have at least an explanation of a real case where we needed that default value/action. It is even better if we describe somewhere why it was introduced. With a test in place, it'll be easy to TDD the change and break the default once we learn something new and have to rewrite the business rules.
Fail fast! That way the feedback loop is shorter and you realize sooner that there's an action to take. Don't be afraid to raise an error. Just make sure you include all the context needed to analyze it.
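To illustrate the second conclusion, here is a small sketch of a test that documents a default, using Minitest. The helper `ftp_port` and the test names are hypothetical; the point is that the test records why the default exists.

```ruby
require 'minitest/autorun'

# Hypothetical helper: returns the explicit port, or FTP's well-known one.
def ftp_port(explicit_port = nil)
  explicit_port || 21
end

class FtpPortTest < Minitest::Test
  # Why the default exists: 21 is the well-known FTP control port,
  # so a connection opened without a port should use it.
  def test_uses_well_known_port_when_none_is_given
    assert_equal 21, ftp_port
  end

  def test_explicit_port_wins_over_the_default
    assert_equal 2121, ftp_port(2121)
  end
end
```

If the business rule changes one day, the first test is the one that breaks, and its comment tells the next person what assumption they are about to revisit.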