
Dissecting a fresh Elixir application

Introduction

In this post we will explain some key concepts of the Elixir language using
the code generated by the Phoenix project template and some handpicked examples.

The only previous knowledge assumed is basic Ruby. Elixir was designed to look
like Ruby, so most of Ruby's basic datatypes have a close counterpart. There is
one extra basic datatype worth knowing: the tuple, a fixed-size group of
elements written like a list but with curly braces, e.g. {:user, "Joe", "Doe"}.
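
To make the difference concrete, here is a small sketch of a tuple next to a
list. The variable names are only illustrative.

```elixir
# A tuple groups a fixed number of values; a list is a variable-length sequence.
user = {:user, "Joe", "Doe"}
names = ["Joe", "Doe"]

# Tuples are usually taken apart with pattern matching.
{:user, first_name, last_name} = user

IO.puts("#{first_name} #{last_name}")  # prints Joe Doe
IO.inspect(tuple_size(user))           # prints 3
IO.inspect(elem(user, 1))              # prints "Joe"

# Lists grow cheaply at the front.
IO.inspect(["Jane" | names])           # prints ["Jane", "Joe", "Doe"]
```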

The subject

This is the file we'll be dissecting today:

defmodule IncredibleApplication do
  use Application

  def start(_type, _args) do
    import Supervisor.Spec

    children = [
      supervisor(IncredibleApplication.Endpoint, []),
    ]

    opts = [
      strategy: :one_for_one,
      name: IncredibleApplication.Supervisor
    ]
    Supervisor.start_link(children, opts)
  end
end

This is the main file of the Phoenix application template. It starts the HTTP
server and configures it to our liking.

Macros and metaprogramming

use Application

This may look like a keyword or something baked into the language, but it's not:
use is a macro. Macros are very common in functional dynamic languages (like
Erlang or Lisp). They allow the programmer to programmatically generate code at
compile time.
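
As a rough sketch of what use does: use SomeModule requires the module and
then invokes its __using__/1 macro, which returns code to be injected into the
caller. Greeter and MyModule below are hypothetical names made up for this
illustration; they are not part of Phoenix or of the template.

```elixir
defmodule Greeter do
  # `use Greeter` calls this macro at compile time and injects the quoted
  # code into the module that called `use`.
  defmacro __using__(_opts) do
    quote do
      def hello, do: "hello from #{inspect(__MODULE__)}"
    end
  end
end

defmodule MyModule do
  # MyModule never defines hello/0 itself; the macro generates it.
  use Greeter
end

IO.puts(MyModule.hello())  # prints hello from MyModule
```

use Application works along the same lines: it injects default code into our
module so we only have to write start/2.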

A good example

assert 1 == 2

This code will raise an ExUnit.AssertionError and the test will fail, as
expected, but the error message is what's interesting:

     Assertion with == failed
     code:  1 == 2
     left:  1
     right: 2

You might expect a bare failure like "expected true, got false", but instead we
get a rich error message. How can assert be aware of the code that produced the
failure? As you may have guessed, assert is a macro: it receives the abstract
syntax tree of its argument (in this case roughly {:==, metadata, [1, 2]}) and
processes it in order to produce better error messages on failed assertions.
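
The following is a much-simplified sketch of the idea, not ExUnit's actual
implementation. TinyAssert, check and TinyAssertDemo are hypothetical names,
and this version only understands == comparisons.

```elixir
# `quote` returns the AST of an expression instead of evaluating it.
# An operator call is a three-element tuple of the shape {:==, metadata, [1, 2]}.
IO.inspect(quote do: 1 == 2)

defmodule TinyAssert do
  # The macro receives the AST of `1 == 2`, not the value `false`.
  defmacro check({:==, _meta, [left, right]} = expr) do
    # Turn the AST back into source text while we still have it.
    code = Macro.to_string(expr)

    quote do
      left = unquote(left)
      right = unquote(right)

      if left == right do
        :ok
      else
        {:failed, unquote(code), left, right}
      end
    end
  end
end

defmodule TinyAssertDemo do
  require TinyAssert

  def run, do: TinyAssert.check(1 == 2)
end

IO.inspect(TinyAssertDemo.run())  # prints {:failed, "1 == 2", 1, 2}
```

A plain function could never build that result, because by the time it ran the
expression would already have been reduced to false.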

Immutability

opts = [
  strategy: :one_for_one,
  name: IncredibleApplication.Supervisor
]

This looks like regular assignment, but in Elixir all values are immutable: we
can't change the value a variable is bound to, and any code that captured the
variable before this point keeps seeing the old value. We can, however, bind
the name to another value later. This is mostly straightforward, except when
it's not. Here is a good example:

(1)> my_list = [1,2,3]

(2)> function = fn() -> my_list end

(3)> function.()
#=> [1,2,3]

(4)> my_list = [1]

(5)> function.()
#=> [1,2,3]

This is a classic example. In most (OOP, mutation-based) languages line 5 would
result in [1], but in Elixir changing a variable's binding is not the same as
changing its value. The function captured the value [1,2,3], not the name
my_list. This means there is no ambiguity between references and values:
everything is a value, an immutable value, always.

Immutability is one of the key aspects that distinguish Elixir from most
mainstream programming languages. You can be sure that passing a value to a
function will never affect its state. If passing a value params to do_this
works and passing params to do_that works, then passing that same value to
both will work as well. Having values instead of references improves a
system's composability, since a function cannot modify its arguments, only
return new values.
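
A minimal sketch of that guarantee follows. The bodies of do_this and do_that
are invented for the example, and the Demo module is hypothetical.

```elixir
defmodule Demo do
  # Both functions build a new list; neither can alter the one it was given.
  def do_this(params), do: Enum.reverse(params)
  def do_that(params), do: [0 | params]
end

params = [1, 2, 3]

IO.inspect(Demo.do_this(params))  # prints [3, 2, 1]
IO.inspect(Demo.do_that(params))  # prints [0, 1, 2, 3]

# The original value is exactly what it was before the calls.
IO.inspect(params)                # prints [1, 2, 3]
```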

This principle also extends to message passing: a message is just a piece of
data, immutable data. As such, there is no way two processes could become
coupled by accidentally sharing a reference to some mutable thing that both of
them can affect. This kind of spooky action at a distance is the cause of many
heisenbugs that take many of our mental cycles to debug. Of course, if you
really want shared mutable state you can have it, but you must ask for it
explicitly, and that difference alone is priceless.
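
Here is a small sketch of two processes exchanging data. The message shapes
{:work, list} and {:done, list} are conventions made up for this example.

```elixir
parent = self()
data = [1, 2, 3]

# The child gets the list as a plain value and builds a new list from it.
child =
  spawn(fn ->
    receive do
      {:work, list} -> send(parent, {:done, [0 | list]})
    end
  end)

send(child, {:work, data})

result =
  receive do
    {:done, list} -> list
  after
    1_000 -> :timeout
  end

IO.inspect(result)  # prints [0, 1, 2, 3]
IO.inspect(data)    # prints [1, 2, 3]
```

Whatever the child does with the list, the parent's data stays the same.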

Supervisors all the way down

Supervisor.start_link(children, opts)

This is just a function call, nothing out of this world, but this particular
function is especially famous in Elixir (as in Erlang).

The BEAM (the virtual machine that Erlang and Elixir run on) is based on the
idea of the actor model. Explaining the actor model is out of the scope of this
post, but there are some useful rules to get the hang of it:

  • An actor can process only one message at a time
  • All messages are asynchronous (although most runtimes implement some
    primitives for synchronous messages since they are so common in practice)
  • To send a message to a process, the sender must have a reference to that
    process (a process can register itself under a globally known name like
    IncredibleApplication)
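
The three rules above can be sketched with nothing but spawn, send and
receive. Counter and the registered name :counter are hypothetical; real code
would normally reach for GenServer instead of a hand-written loop.

```elixir
defmodule Counter do
  def start do
    pid = spawn(fn -> loop(0) end)
    # Register a well-known name so senders don't need the pid.
    Process.register(pid, :counter)
    pid
  end

  defp loop(count) do
    # One message is taken from the mailbox and handled at a time.
    receive do
      :increment ->
        loop(count + 1)

      {:get, caller} ->
        send(caller, {:count, count})
        loop(count)
    end
  end
end

Counter.start()

# Asynchronous: send/2 returns immediately without waiting for the receiver.
send(:counter, :increment)
send(:counter, :increment)

# A synchronous request is just a message followed by waiting for the reply.
send(:counter, {:get, self()})

count =
  receive do
    {:count, n} -> n
  after
    1_000 -> :timeout
  end

IO.inspect(count)  # prints 2
```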

For example, consider this situation: my application generates a stream of
events and I want to send them to another service via an HTTP API. The problem
is that this API cannot handle the events if I send them one at a time, but it
offers a batch endpoint that can handle the load if I distribute it
appropriately. This is a very simple use case for a long-running process that I
can send events to, and that uses some logic to batch events and send them
together strategically.

This kind of task is not too difficult in either Elixir or Ruby: I just need a
worker that accepts events and, when enough of them have accumulated, sends
them together. But in Ruby you have to handle the concurrency yourself: what
happens if you send two events at the same time? Can you be sure none of them
will be dropped?

You could, of course, use locks or a database to ensure that every event is
processed and that there are no concurrency bugs but this is way more
complexity than I'd like to add for such a simple task.

Another problem in the Ruby world that may not be as obvious as concurrency
issues is reliability: this process needs to be always up and ready,
otherwise you'd lose events. There is no reasonable circumstance in which this
process should die from an exception; other than very critical errors like
running out of memory, this worker should stay alive forever. Among the
approaches we could take are never raising an exception or surrounding
everything with a big rescue Exception clause, but both are incredibly hard to
get right.

In Elixir, as in most programming languages, an unhandled exception means the
death of the process that raised it. But since Erlang values fault tolerance
and reliability above everything else, its designers needed a way to handle
this kind of failure gracefully. Their solution was (and is) OTP (Open Telecom
Platform): a set of libraries, now integral to Erlang, that provides, among
other things, the concept of supervisors. Supervisors are specialized processes
with only one purpose: monitoring other processes (which may also be
supervisors). In this specific case, with the :one_for_one strategy, if one of
the monitored processes crashes our supervisor restarts it immediately and
leaves its siblings alone (the Supervisor documentation describes the other
restart strategies).

This pattern forms a hierarchical structure known as the supervision tree. Most
Elixir applications are big supervision trees, which makes them reliable and
helps when distributing them across different machines.
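
To see a restart happen, here is a minimal sketch. Fragile is a hypothetical
do-nothing worker, and the child is listed using the module-based child
specification found in current Elixir releases rather than the Supervisor.Spec
helpers used in the template above.

```elixir
defmodule Fragile do
  use GenServer

  def start_link(_arg) do
    GenServer.start_link(__MODULE__, :ok, name: __MODULE__)
  end

  @impl true
  def init(:ok), do: {:ok, nil}
end

{:ok, _sup} = Supervisor.start_link([Fragile], strategy: :one_for_one)

old_pid = Process.whereis(Fragile)

# Simulate a crash by killing the worker.
Process.exit(old_pid, :kill)

# Give the supervisor a moment to notice and restart the child.
Process.sleep(100)

new_pid = Process.whereis(Fragile)

# A fresh process is now registered under the same name.
IO.inspect(is_pid(new_pid) and new_pid != old_pid)  # prints true
```

Nothing in Fragile deals with the failure; recovery is entirely the
supervisor's job.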

That's what the Supervisor.start_link function is doing for our program: it
starts and monitors the HTTP server process (which in turn spawns more
processes). If we wanted to add our BatchingForwarder, this is where it would
go.
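
As a sketch of what that could look like, here is one possible
BatchingForwarder written as a GenServer. Everything about it is an
assumption made for illustration: the batch size of 3, the push/1 function,
and the :send_batch option, which stands in for the real HTTP client and by
default just prints the batch. It also uses the tuple-style child
specification of current Elixir releases instead of Supervisor.Spec.

```elixir
defmodule BatchingForwarder do
  use GenServer

  # Assumed batch size; a real value depends on the remote API.
  @batch_size 3

  def start_link(opts) do
    GenServer.start_link(__MODULE__, opts, name: __MODULE__)
  end

  # Asynchronous: callers never wait for the batch to be sent.
  def push(event), do: GenServer.cast(__MODULE__, {:push, event})

  @impl true
  def init(opts) do
    # Hypothetical hook standing in for the HTTP call to the batch API.
    default = fn batch -> IO.inspect(batch, label: "batch") end
    {:ok, %{events: [], send_batch: Keyword.get(opts, :send_batch, default)}}
  end

  @impl true
  def handle_cast({:push, event}, state) do
    # Messages are handled one at a time, so no locks are needed here.
    events = [event | state.events]

    if length(events) >= @batch_size do
      state.send_batch.(Enum.reverse(events))
      {:noreply, %{state | events: []}}
    else
      {:noreply, %{state | events: events}}
    end
  end
end

children = [
  {BatchingForwarder, []}
]

{:ok, _sup} = Supervisor.start_link(children, strategy: :one_for_one)

Enum.each(1..7, &BatchingForwarder.push/1)

# Casts are asynchronous, so wait briefly before the script exits.
Process.sleep(100)
# prints batch: [1, 2, 3] and batch: [4, 5, 6]; event 7 waits for the next batch
```

Note that this sketch keeps pending events in memory, so a crash would still
lose the current partial batch; the supervisor only guarantees that a fresh
worker is there to accept the next events.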

Conclusion

Although there are plenty more axes along which to evaluate a language, like
libraries, community, tools, etc. (maybe I'll write something up about those
someday), I hope this brief introduction to the niceties of the Elixir language
will encourage you to give it a try.