Dissecting a fresh Elixir application

Introduction

In this post we will explain some key concepts of the Elixir language using the code generated by the Phoenix project template and some handpicked examples.

The only prior knowledge assumed is basic Ruby. Elixir's syntax was designed to look like Ruby's, and most of Ruby's basic data types have a counterpart. Elixir adds one basic data type that Ruby lacks: tuples. A tuple is a fixed-size ordered collection of elements, written like a list but with curly braces, e.g. {:user, "Joe", "Doe"}.

The subject

This is the file we'll be dissecting today:

defmodule IncredibleApplication do
  use Application

  def start(_type, _args) do
    import Supervisor.Spec

    children = [
      supervisor(IncredibleApplication.Endpoint, []),
    ]

    opts = [
      strategy: :one_for_one,
      name: IncredibleApplication.Supervisor
    ]
    Supervisor.start_link(children, opts)
  end
end

This is the main file of the Phoenix application template. It starts the HTTP server
and configures it to our liking.

Macros and metaprogramming

use Application

This may look like a keyword or something baked into the language, but it's not: use is a macro. Macros are common in dynamic functional languages (like Erlang or Lisp). They allow the programmer to programmatically generate code at compile time.

A good example

assert 1 == 2

This code raises an ExUnit.AssertionError and the test fails, as expected, but the error message is what's interesting:

     Assertion with == failed
     code:  1 == 2
     left:  1
     right: 2

Anyone would expect a failure like "got false, expected true", but instead we get a rich error message. How could this function be aware of the code that generated the failure? As you may have guessed, assert is a macro: it receives the abstract syntax tree of its argument (in this case {:==, metadata, [1, 2]}, where metadata is a keyword list of compiler metadata) and processes it in order to produce better error messages on failed assertions.
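To make this more concrete, here is a minimal sketch of a macro that, like assert, sees the source of its argument. AstDemo, AstDemo.Usage and describe are hypothetical names made up for this illustration; this is not how ExUnit implements assert.

```elixir
# quote returns the abstract syntax tree of an expression
# as a three-element tuple: {operator, metadata, arguments}.
{operator, _metadata, arguments} = quote(do: 1 == 2)
# operator is :== and arguments is [1, 2]

defmodule AstDemo do
  # Hypothetical macro: it receives the AST of its argument, not its value.
  defmacro describe(expression) do
    # Turn the AST back into source text at compile time.
    source = Macro.to_string(expression)

    quote do
      # Return the source text together with the evaluated expression.
      {unquote(source), unquote(expression)}
    end
  end
end

defmodule AstDemo.Usage do
  # Macros must be required before they can be used.
  require AstDemo

  def run, do: AstDemo.describe(1 == 2)
end

result = AstDemo.Usage.run()
# result is {"1 == 2", false}
```

The macro runs at compile time, so the string "1 == 2" is already embedded in the compiled module by the time run is called.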

Immutability

opts = [
  strategy: :one_for_one,
  name: IncredibleApplication.Supervisor
]

This is a regular assignment, but since all values in Elixir are immutable, we can't change the value a variable is bound to. We can bind the name to another value later, but any code that referenced the variable before that point still sees the old value. This is mostly straightforward, except when it's not. Here is a good example:

(1)> my_list = [1,2,3]

(2)> function = fn() -> my_list end

(3)> function.()
#=> [1,2,3]

(4)> my_list = [1]

(5)> function.()
#=> [1,2,3]

This is a classic example. In most (OOP, mutation-based) languages, line 5 would result in [1], but in Elixir changing a variable's binding is not the same as changing its value. The function captured the value the name was bound to when it was defined, so there is no ambiguity between references and values: everything is a value, an immutable value, always.
Immutability is one of the key aspects that distinguish Elixir from most mainstream programming languages. You can be sure that passing a value to a function will never affect its state. If passing a value params to do_this works and passing params to do_that works, then passing that same value to both will work as well. Having values instead of references improves a system's composability, since a function cannot alter the data it receives and can only return new data.
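A small sketch of that guarantee follows. The bodies of do_this and do_that are invented for illustration (the post only names them), as is the ImmutabilityDemo module.

```elixir
defmodule ImmutabilityDemo do
  # Hypothetical implementations: each returns a new map
  # and leaves the map it was given untouched.
  def do_this(params), do: Map.put(params, :page, 1)
  def do_that(params), do: Map.delete(params, :user)
end

params = %{user: "Joe"}

this_result = ImmutabilityDemo.do_this(params)
that_result = ImmutabilityDemo.do_that(params)

# this_result is %{page: 1, user: "Joe"}
# that_result is %{}
# params is still %{user: "Joe"}
```

Neither call can interfere with the other, because each one only ever sees the original value.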

This principle also extends to message passing: a message is just a piece of data, immutable data. As such, there is no way two processes could become coupled by accidentally sharing a reference to some mutable thing that both of them can affect. This kind of spooky action at a distance is often the cause of heisenbugs that take many mental cycles to debug. Of course, if you really want to, you can cause them, but you must do so explicitly, and that difference alone is priceless.

Supervisors all the way down

Supervisor.start_link(children, opts)

This is just a function call, nothing out of this world, but this particular function is especially well known in Elixir (as in Erlang).

BEAM (the virtual machine Erlang runs on) is based on the idea of the actor model. Explaining the actor model is out of the scope of this post, but there are some useful laws to get the hang of it:

  • An actor can process only one message at a time
  • All messages are asynchronous (although most runtimes implement some primitives for synchronous messages, since they are so common in practice)
  • To send a message to a process, the sender must have a reference to that process (a process can register itself under a globally known name like IncredibleApplication)
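These laws can be sketched with plain processes. EchoActor and the :echo_actor name are hypothetical, chosen only for this illustration.

```elixir
defmodule EchoActor do
  # Hypothetical actor: handles one message at a time and replies to the sender.
  def start do
    pid = spawn(fn -> loop() end)
    # Register the process under a known name so senders do not need its pid.
    Process.register(pid, :echo_actor)
    pid
  end

  defp loop do
    receive do
      {:echo, from, payload} ->
        # Replying is just another asynchronous message.
        send(from, {:reply, payload})
        loop()
    end
  end
end

EchoActor.start()

# send returns immediately; the message waits in the actor's mailbox.
send(:echo_actor, {:echo, self(), "hello"})

# A synchronous call is built from two asynchronous messages:
# send the request, then block until the reply arrives.
reply =
  receive do
    {:reply, payload} -> payload
  after
    1_000 -> :timeout
  end

# reply is "hello"
```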

For example, consider this situation: my application generates a stream of events and I want to send them to another service via an HTTP API. The problem is that this API cannot handle the events if I send them one at a time, but it offers a batch API that can handle the load if I distribute it appropriately. This is a very simple use case for a long-running process that I can send events to and that uses some logic to batch events and send them together strategically.

This kind of task is not too difficult in either Elixir or Ruby: I just need a worker that accepts events and, when enough of them have accumulated, sends them together. But in Ruby you have to handle the concurrency yourself: what happens if you send two events at the same time? Can you be sure none of them will be dropped?

You could, of course, use locks or a database to ensure that every event is
processed and that there are no concurrency bugs but this is way more
complexity than I'd like to add for such a simple task.

Another problem in the Ruby world that may not be as obvious as concurrency
issues is reliability: this process needs to be always up and ready, otherwise you'd lose events. There is no reasonable circumstance in which this process should raise an exception; other than for very critical exceptions like OutOfMemory, this worker should stay alive forever. Among the approaches we could take are never raising an exception or surrounding everything with a big rescue Exception clause, but both are incredibly hard to get right.

In Elixir, as in most programming languages, an unhandled exception means the
death of that process. But since Erlang values fault tolerance and reliability above everything else, its designers needed a way to handle this kind of failure gracefully. Their solution was (and is) OTP (Open Telecom Platform): a set of libraries now integral to Erlang that provides, among other things, the concept of supervisors. Supervisors are specialized processes with only one purpose: to monitor other processes (which may themselves be supervisors). In this specific case, if one of the monitored processes crashes, our supervisor will restart it immediately (you can read more about different recovery strategies here).
This pattern forms a hierarchical structure known as the supervision tree. Elixir applications are typically organized as big supervision trees, which makes them reliable and helps when distributing them across different machines.
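The restart behaviour can be observed with a small experiment. This is a sketch under assumptions: Counter and RestartDemo are hypothetical modules, and it uses the child specification tuples of recent Elixir versions rather than the Supervisor.Spec helpers shown in the template above.

```elixir
defmodule Counter do
  # Hypothetical supervised process holding a number.
  use Agent

  def start_link(initial) do
    Agent.start_link(fn -> initial end, name: __MODULE__)
  end
end

defmodule RestartDemo do
  # Polls until a process other than old_pid is registered under name.
  def await_restart(name, old_pid, attempts \\ 100)

  def await_restart(_name, _old_pid, 0), do: :timeout

  def await_restart(name, old_pid, attempts) do
    case Process.whereis(name) do
      pid when is_pid(pid) and pid != old_pid ->
        pid

      _ ->
        Process.sleep(10)
        await_restart(name, old_pid, attempts - 1)
    end
  end
end

{:ok, _supervisor} = Supervisor.start_link([{Counter, 0}], strategy: :one_for_one)

Agent.update(Counter, fn n -> n + 5 end)
old_pid = Process.whereis(Counter)

# Simulate a crash and wait until the old process is really gone.
ref = Process.monitor(old_pid)
Process.exit(old_pid, :kill)

receive do
  {:DOWN, ^ref, :process, _pid, _reason} -> :ok
end

new_pid = RestartDemo.await_restart(Counter, old_pid)
value_after_restart = Agent.get(Counter, fn n -> n end)

# new_pid is a different process, and its state is back to the initial 0.
```

Note that the restarted process starts from its initial state: a supervisor brings the process back, not the data it held.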

That's what the Supervisor.start_link function is doing for our program: it
starts and monitors the HTTP server process (which in turn spawns more processes). If we wanted to add our BatchingForwarder, this is where it would go.
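As a rough sketch of what that could look like, here is one possible BatchingForwarder written as a GenServer. The post does not show an implementation, so everything here is an assumption: the push function, the batch_size and deliver options, and the use of child specification tuples from recent Elixir versions instead of Supervisor.Spec. The deliver function stands in for the real HTTP call to the batch API.

```elixir
defmodule BatchingForwarder do
  use GenServer

  # batch_size and deliver are hypothetical options for this sketch.
  def start_link(opts) do
    GenServer.start_link(__MODULE__, opts, name: __MODULE__)
  end

  # Asynchronous: the caller does not wait for the event to be handled.
  def push(event), do: GenServer.cast(__MODULE__, {:push, event})

  @impl true
  def init(opts) do
    state = %{
      batch_size: Keyword.get(opts, :batch_size, 3),
      deliver: Keyword.fetch!(opts, :deliver),
      events: []
    }

    {:ok, state}
  end

  @impl true
  def handle_cast({:push, event}, state) do
    # Messages are handled one at a time, so no locks are needed here.
    events = [event | state.events]

    if length(events) >= state.batch_size do
      # Events were prepended, so reverse them to restore arrival order.
      state.deliver.(Enum.reverse(events))
      {:noreply, %{state | events: []}}
    else
      {:noreply, %{state | events: events}}
    end
  end
end

# In place of an HTTP request, deliver each batch back to this process.
caller = self()
deliver = fn batch -> send(caller, {:batch, batch}) end

children = [{BatchingForwarder, [batch_size: 3, deliver: deliver]}]
{:ok, _supervisor} = Supervisor.start_link(children, strategy: :one_for_one)

Enum.each([:a, :b, :c], &BatchingForwarder.push/1)

batch =
  receive do
    {:batch, events} -> events
  after
    1_000 -> :timeout
  end

# batch is [:a, :b, :c]
```

Because the process handles one message at a time, two events pushed at the same moment are simply queued in its mailbox, and neither is dropped. A real version would also want to flush on a timer so that a partial batch does not wait forever.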

Conclusion

Although there are plenty more axes along which to evaluate a language, such as libraries, community and tools (maybe I'll write something up about those someday), I hope this brief introduction to the niceties of the Elixir language will encourage you to give it a try.