
Testing Rake tasks

An essay on Rake tasks, their Rails dependency, the value of testing them and, most of all, how to do it. Testing Rake tasks is essential if you want your project to stay robust and reliable. Let me show you how (and why) you should do it.

Carlos Javier Perez

Jan 14, 2019 • 6 min read

So, you've come up with a pretty neat Rake task for that urgent update of all the company prices on US items online. CI passes. Deployment too. It's time. You invoke the task and, yes, you've missed a typo in one of the arguments. So you quickly make a beautiful PR changing a single character. Once again CI and deployment go well. Invoke! It starts, but some of your items are not being updated. It seems you forgot a 'where' clause in one of the queries of your task. Ahhh, if only there were a wonderful perfect world where you could test Rake tasks.

There is, of course. Some developers question the real value of testing Rake tasks, since they're only used once or on rare occasions. Others think it's enough to extract all the logic out of them into properly tested objects (we'll look into that later). Before diving into how to test them (and whether it's really worth it), we'll take a quick look at Rake tasks themselves.

Rake and Rails sitting in a tree

Rake was originally implemented as a Ruby version of GNU's Make, merely a build utility. As time went by, thanks to its power, declarative style, build routines and support for custom tasks, it became a fully-fledged automation tool. Although it is a Ruby internal DSL, it can be used in any environment. Any developer can write their own Rake tasks, specific to their application. Rails also ships with built-in Rake tasks that perform common functions, such as:

rake db:migrate

This time we're focusing on custom Rake tasks, the ones we create to automate our project, either by running them a single time or periodically through a scheduler (e.g. cron). So let's write a simple task. In Rails, you keep your custom Rake tasks inside the lib/tasks folder.

lib/tasks/example.rake

namespace :products do
  desc 'Updating products'
  task updating_products: :environment do
    # Some code for updating products
  end
end

The structure follows plain Ruby syntax. In fact, task isn't a keyword but a method that takes two arguments: a hash and a block. The first argument is a hash (like {:key_1 => value_1}). In our example, the key is updating_products and the value is the prerequisite environment. The call can take more parameters if the task itself accepts arguments (we'll see an example next). environment is a prerequisite present on every generated Rails Rake task; it loads the Rails environment, making every class defined in the application available to the task code. The second argument is a block: the body of the task, defined within the do ... end clause. So, to sum up, a Rake task is simply:

a_method a_hash a_block
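To make that concrete, here is a minimal sketch of the same task written as an explicit method call, with parentheses, hash braces and a brace block. It runs outside Rails, so I'm assuming an empty :environment task as a stand-in for the one Rails provides, and the calls array is just an illustrative way to see that the body ran.

```ruby
require 'rake'

# Stand-in for the :environment prerequisite that Rails normally defines.
task(:environment)

calls = []

namespace(:products) do
  desc('Updating products')
  # Same call as `task updating_products: :environment do ... end`,
  # written with explicit parentheses, hash braces and a brace block.
  task({ updating_products: :environment }) { calls << :updating_products }
end

updating = Rake::Task['products:updating_products']
updating.invoke
```

After the invoke, calls holds a single entry and updating.prerequisites lists the environment task, which shows that the hash really is just a task name pointing at its prerequisite.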

Testing Rake tasks? Reaaaaally?? Do I have to?

Well, they say it's not really worth it. You needn't test a Rake task. What you should do is encapsulate the business logic in one (or several) objects. Of course, such objects should be tested as usual, preferably using TDD (because, well, that's what grown-up developers do. In case you aren't used to TDD, here's a great post about it). Using TDD means that if you've come to something like this:

lib/tasks/update_item_prices_on_region.rake

namespace :products do
  desc 'update item prices on region'
  task :update_item_prices_on_region, [:region_name] => :enviromnent do |_, args|
    items = Item.where(region_name: args.region_name)
    items.each do |item|
      # Some code for building the request for the API
      # Some code for sending the request to the API
      # Some code for handling response and checking with DB
      # Some code for updating
    end
  end
end

without having written a single test, then you're way, way off.
So you start by testing and building the implementation in a separate object that you'll later call from the task. The Rake task will then look something like this:

lib/tasks/update_item_prices_on_region.rake

namespace :products do
  desc 'update item prices on region'
  task :update_item_prices_on_region, [:region_name] => :enviromnent do |_, args|
    items = Item.where(region_name: args.region_name)
    items.each do |item|
      ItemUpdater.new(item.id).execute
    end
  end
end

In this example, the object ItemUpdater is in charge of (almost) everything. This is a robust, commonplace solution, but it's surely not the full answer. You could argue that the Rake task is not being tested; you're just testing the object(s) it uses. And you would be right. Nevertheless, most developers think this is enough, that there's no room left for error. Well, they're wrong.

Yes, you do!

Rake tasks are code. Rake tasks do stuff and have a few strict rules for their implementation. Arguments must be passed in a strict syntax, and you can also include defaults for some or all of them. It is also pretty common to perform data migrations in batches, so you need to make sure everything past the first batch is processed as expected too. Plenty of errors can happen between the moment you invoke the task and the moment you reach the already tested object. For example, typos happen all the time. Did you notice the error in the previous example? I'm sure you did. Didn't you? (enviromnent != environment).
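As a rough illustration of the argument rules, here is a small sketch using Rake's own with_defaults on the task arguments. The demo:update_prices task and its batch_size argument are hypothetical, invented only to show a default; :environment is again an empty stand-in. Note that Rake runs a task at most once per process, so the sketch calls reenable before the second invocation.

```ruby
require 'rake'

# Stand-in for the :environment prerequisite that Rails normally defines.
task(:environment)

seen = []

namespace :demo do
  # :batch_size is a hypothetical second argument, here only to show a default.
  task :update_prices, [:region_name, :batch_size] => :environment do |_, args|
    args.with_defaults(batch_size: '100')
    seen << [args.region_name, args.batch_size]
  end
end

# Arguments go inside square brackets, separated by commas.
Rake.application.invoke_task('demo:update_prices[US]')
# A task that already ran is skipped unless it is re-enabled.
Rake::Task['demo:update_prices'].reenable
Rake.application.invoke_task('demo:update_prices[UK,25]')
```

Arguments always arrive as strings, which is exactly the kind of detail a test on the task itself would bring to light.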

Domain changes. This is a tricky one. Changes to the object the Rake task uses may or may not be caught by testing the task. Either way, the test adds a layer of reliability, even if its contribution to the robustness of the task is sometimes incidental. I'll expand on this later.

And to top it all: It is insanely easy and straightforward.

Aren't you convinced yet? Let's take a look at these simple mandatory rules for testing Rake tasks (I'm using RSpec for the examples):

Rake has to be required.
The tested file must be loaded manually.
environment must be defined as an empty Rake task.
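Outside of any test framework, the three rules can be sketched in plain Ruby. This is only an illustration under a few assumptions: DemoUpdater, the demo:update task and the temporary tasks/demo.rake file are all hypothetical, and the .rake file is written to a temp directory so the snippet runs without a Rails app.

```ruby
require 'rake'      # Rule 1: Rake has to be required
require 'tmpdir'
require 'fileutils'

# Hypothetical stand-in for the object the task delegates to.
class DemoUpdater
  def self.calls
    @calls ||= []
  end

  def initialize(id)
    @id = id
  end

  def execute
    self.class.calls << @id
  end
end

Dir.mktmpdir do |dir|
  FileUtils.mkdir_p(File.join(dir, 'tasks'))
  File.write(File.join(dir, 'tasks', 'demo.rake'), <<~RAKE)
    namespace :demo do
      task :update, [:id] => :environment do |_, args|
        DemoUpdater.new(args.id).execute
      end
    end
  RAKE

  # Rule 2: load the task file manually (name without ".rake", plus search paths)
  Rake.application.rake_require('tasks/demo', [dir])
  # Rule 3: define :environment as an empty task
  Rake::Task.define_task(:environment)

  Rake.application.invoke_task('demo:update[42]')
end
```

Once the task has run, DemoUpdater.calls contains the id that was passed on the command-line style string, so the whole path from invocation to object is exercised.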

That's it. We're ready to go. First, let's correct that little typo from the previous example:

lib/tasks/update_item_prices_on_region.rake

namespace :products do
  desc 'update item prices on region'
  task :update_item_prices_on_region, [:region_name] => :environment do |_, args|
    item_ids = Item.where(region_name: args.region_name).pluck(:id)
    item_ids.each do |item_id|
      ItemUpdater.new(item_id).execute
    end
  end
end

Now we can test it properly:

spec/tasks/update_item_prices_on_region_spec.rb

require 'spec_helper'
require 'rake'

describe 'update item prices on region' do
  let(:region_name) { 'US' }
  let(:task_name) { 'products:update_item_prices_on_region' }
  let(:run_rake_task) do
    Rake.application.invoke_task "#{task_name}[#{region_name}]"
  end

  before do
    Rake.application.rake_require 'tasks/update_item_prices_on_region'
    Rake::Task.define_task(:environment)
    # Rake runs a task only once per process, so re-enable it for every example
    Rake::Task[task_name].reenable
  end

  context 'when the update_item_prices_on_region task is invoked' do
    context 'and there are no items for the given region' do
      it 'does not call the item updater' do
        expect(ItemUpdater).to_not receive(:new)

        run_rake_task
      end
    end

    context 'and there is one item for that region and other for another region' do
      let!(:us_item) { FG.create(:item, region_name: 'US') }
      let!(:uk_item) { FG.create(:item, region_name: 'UK') }
      # Stand-in returned by ItemUpdater.new so that #execute can be called on it
      let(:updater) { double('item updater', execute: true) }

      it 'calls the item updater once' do
        expect(ItemUpdater).to receive(:new).once.with(us_item.id).and_return(updater)

        run_rake_task
      end
    end
  end
end

So, let's explain what we've done here. On the second line, and again in the before clause, we’re covering the three rules: first we require Rake, then we load the file and define environment as an empty Rake task. Throughout the test, we cover everything from the invocation of the task to the actual call of the object.
Let's enumerate a few of the errors we could catch here:

Typos.
Query mistakes.
Argument errors.
Some changes related to the object the task uses. For example, if in the future the object we're using changes its module or name (e.g. from ItemUpdater to Marketplaces::ItemUpdater or ItemPriceUpdater), these tests would fail, letting us keep the task from becoming obsolete.
Anyway, once tests pass we can say that we've got the Rake task tested.
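Two of the failures listed above, a misspelled prerequisite and a renamed class, only surface when the task is actually invoked. Here is a minimal sketch of both, outside Rails: the task names and the RenamedAwayUpdater constant are hypothetical, and failure_for is a small helper I made up to capture whatever the invocation raises.

```ruby
require 'rake'

# Stand-in for the :environment prerequisite that Rails normally defines.
task(:environment)

# Same misspelling as in the earlier example: the prerequisite does not exist.
task(misspelled: :enviromnent) { }

# Hypothetical constant that no longer exists after a rename.
task(stale: :environment) { RenamedAwayUpdater.new(1).execute }

# Hypothetical helper: invoke a task and return the error it raises, if any.
def failure_for(task_name)
  Rake::Task[task_name].invoke
  nil
rescue StandardError => e
  e
end

typo_error  = failure_for('misspelled')
stale_error = failure_for('stale')
```

Defining both tasks succeeds without a complaint; it is the invocation that fails, with Rake reporting that it doesn't know how to build the misspelled prerequisite and Ruby raising a NameError for the missing constant. A spec that invokes the task hits both before production does.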

Conclusion

Even though I didn't use the simplest example, since I added an argument, something not always necessary in Rake tasks, this isn't the most complex situation either. We could sprinkle a lot of stuff on our Rake tasks: logs, data migrations, default arguments, etc. But you get the gist of it. It's not that difficult, and it gives us a lot of robustness and control over our automated tasks, whether they're run only once or scheduled to happen periodically. Moreover, if you're being a little lazy or you don't want to pollute your project with objects that will be used just once, you could leave the logic in the task and test it right there, though this is something I definitely don't recommend.
All in all, since a chain is only as strong as its weakest link, testing Rake tasks adds a layer of certainty and reliability, providing the robustness every project deserves.