Expressive Tests as Documentation (Using RSpec and FactoryBot)
This post was written by Mikee Pourhadi, a Software Engineer at Fooda. Mikee has a lot of strong opinions about testing, and his instincts are right about 15% of the time. He loves food. Fooda loves food. Sometimes things work out pretty well.
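Take a spec file that looks something like this. (The original sample didn't survive here intact, so this is a representative sketch: the shared context name, the constructor arguments, and the setup bodies are assumptions drawn from the discussion that follows.)

```ruby
# spec/services/create_manifest_spec.rb
require 'rails_helper'

describe 'Building a manifest' do
  include_context 'manifest setup'

  let(:meal) { create(:meal, :purchased) }

  it_behaves_like 'a manifest builder'

  it 'works' do
    service = CreateManifest.new(meal: meal, template: manifest_template)

    expect(service.call).to be_present
  end
end
```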
What is this doing?
That’s a trick question – there’s no way for you to know. These tests do absolutely nothing to clue you in to the behavior they are evaluating, or to how any of the required setup is used. Sure, this class is being called and apparently tested, but from a readability standpoint, there’s not much here.
There’s a well-repeated phrase around unit tests: ‘tests as documentation.’ But too often tests are used only as code-coverage sanity checks, ensuring that, at the very least, if something in the code changes, a test will go red. That’s a critical part of testing, of course, but tests can be so much more than that. I would argue that a test file should be the most important resource for a developer trying to understand a class, and that comes from much more than code coverage.
A test file is a blueprint of a class. It should succinctly show all of the class’s functionality, how to invoke that behavior, and what data is needed. The goal of writing a test should be to strike as even a balance as possible between performance and readability.
So what does it mean to have an ‘expressive’ test file? Put simply, everything defined and used in the test file should have an obvious origin, and the functionality under test should be clear. If a developer needs to hunt around the test directory to figure out what is going on, something’s wrong. Also, if the behavior of the class under test is not quickly and easily deduced, the file itself becomes far less useful.
So how about spec/services/create_manifest_spec.rb defined above – does that look like an expressive test file?
Well… it’s not great. There’s a FactoryBot trait with an ambiguous purpose; a shared context with some data setup we’re unsure of, but which is clearly used (that must be where ‘manifest_template’ came from); a shared example that looks important but has an ambiguous title; and, apparently, ‘it works’ is an apt description of the behavior.
Let’s start with the description text. ‘Building a manifest’ simply repeats the name of the class – it’s not serving much of a purpose. A decent pattern is to identify the method under test in the description, so the output of running your tests is clear. This also incentivizes you to write one expectation per test (something I’ll get into later), further adding to readability. The example above should have its tests broken into multiple describes, with helper text like describe '#call' do. That’s a pretty quick refactor, and one that already yields readability benefits.
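Here’s the shape of that refactor, before and after:

```ruby
# Before: the top-level description just restates the class name.
describe 'Building a manifest' do
  it 'works' do
    # ...
  end
end

# After: name the class and the method under test.
describe CreateManifest do
  describe '#call' do
    it 'builds a manifest from the template' do
      # ...
    end
  end
end
```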
This may sound like a small part of writing tests – and I find it to be overlooked quite often – but it goes a long way in organizing your tests and helping a future developer decipher the behavior months or years after its creation.
To be frank, I’m not a big fan of shared contexts or shared examples. By definition their purpose is to abstract functionality away into reusable blocks. That’s not always a bad thing in tests (maybe you’re testing inheritance behavior), but overall I find that just writing the tests out plainly is the more beneficial trade-off.
And honestly, DRYing up test code feels somewhat at odds with the goal of tests as documentation. In order for a test file to be as clear and useful as possible, it should be plainly obvious what behavior is being tested, what data is being created, and what methods are being called.
Let’s pretend that I dug into those shared examples and found that “it_behaves_like ‘a manifest builder’” simply asks the class for details about the manifest that is built – for example, checking for a required method, like ‘#manifest_built?’, that other classes are going to call on the parent service.
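A sketch of what that shared example might contain (the file location and body are assumptions based on the behavior described above):

```ruby
# spec/support/shared_examples/manifest_builder.rb
RSpec.shared_examples 'a manifest builder' do
  it 'exposes #manifest_built? for collaborating classes' do
    service = CreateManifest.new(meal: meal, template: manifest_template)
    service.call

    # Other classes call this predicate on the parent service.
    expect(service.manifest_built?).to eq(true)
  end
end
```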
Let’s also pretend that I looked into that shared context and pulled out only the data that I needed as well. Looks like that ‘manifest_template’ method was just a call to a factory to create a template.
We can pull the shared example and the ‘manifest_template’ method out of the shared file and write them directly into our example, as they are of critical importance to using this class in our app.
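Inlined, those two pieces might look like this inside create_manifest_spec.rb (again assuming the factory and constructor shapes from above):

```ruby
# Inlined from the shared context: the template is just a factory call.
let(:manifest_template) { create(:manifest_template) }

# Inlined from 'a manifest builder': collaborators rely on this predicate.
it 'exposes #manifest_built? for collaborating classes' do
  service = CreateManifest.new(meal: meal, template: manifest_template)
  service.call

  expect(service.manifest_built?).to eq(true)
end
```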
That little FactoryBot trait up there is a bit conspicuous, isn’t it? Nothing in the test file suggests it is necessary, but if we delete it, the test fails. So now we have a test that is tightly coupled to a method inside of a factory, and nobody would have any idea why.
Data setup in a test file is very nearly as important as the tests themselves in terms of readability. If I don’t know what sort of data this class expects to interact with, I’m missing a swath of information about actually using it. I wouldn’t build a Lego set that didn’t tell me which pieces I needed and just showed me how the set was supposed to look post-assembly, and I wouldn’t be happy to see a test file with data setup that is obscured away in shared contexts.
Don’t get me wrong, traits are extremely useful things. But I find that if the class fully depends on a trait, it might be best to just clearly write it out in the test file. Let’s go back to our example, create_manifest_spec, and pretend that I did some digging around to find exactly what the class depended on in that trait.
I’ll replace the original meal let with this code – it is a little more code, but it is much easier to see that CreateManifest requires ‘purchased_by’ to be present on the meal object.
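Assuming the :purchased trait boiled down to assigning a purchaser, the swap looks like this:

```ruby
# Before: the requirement is hidden inside the :purchased trait.
let(:meal) { create(:meal, :purchased) }

# After: CreateManifest's dependency on purchased_by is spelled out.
let(:purchaser) { create(:user) }
let(:meal)      { create(:meal, purchased_by: purchaser) }
```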
And now we have to clearly identify what is under test. The above example shoves a lot into a single test, initializes the class repeatedly, and tests many different behaviors. Let me start this one by refactoring it:
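Here’s a pared-down sketch of the service under test – the constructor and the #build_for call are assumed shapes, for illustration only:

```ruby
class CreateManifest
  def initialize(meal:, template:)
    @meal = meal
    @template = template
    @manifest = nil
  end

  def call
    # Assumed implementation detail: the template knows how to build
    # a manifest for a given meal.
    @manifest = @template.build_for(@meal)
  end

  def manifest_built?
    !@manifest.nil?
  end
end
```

And the spec, labeled by class and method, with a named subject:

```ruby
describe CreateManifest do
  # Named subject: readable at the call site, memoized within an example.
  subject(:create_manifest) do
    described_class.new(meal: meal, template: manifest_template)
  end

  describe '#call' do
    it 'builds a manifest from the template' do
      expect(create_manifest.call).to be_present
    end
  end
end
```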
This is a simplification of the class, but it illustrates my point. If we clearly label the service and method under test, we can separate the examples into single-expectation blocks, which typically leans more towards readability than performance. I also defined a named subject, which helps readability and memoizes the service instance inside the block, in case you need to do more than a single expectation in a test.
Further on that note: Sometimes test setup can be very intensive and necessitate lots of database calls. In this case it’s probably a better idea to have more than one expectation per test to more evenly balance performance and readability. It’s important to balance a quick test suite with a readable one.
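If you do combine expectations, one option (not something the original file used) is RSpec’s :aggregate_failures metadata, available since RSpec 3.3, which reports every failed expectation in the block instead of stopping at the first. Reusing the named subject from above:

```ruby
describe '#call' do
  # One run of the expensive setup, several assertions about the result.
  it 'builds a complete manifest', :aggregate_failures do
    manifest = create_manifest.call

    expect(manifest).to be_present
    expect(create_manifest.manifest_built?).to eq(true)
  end
end
```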
So now let’s look at the refactored file.
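Assembling all of the pieces above (with the same assumed names and constructor), the spec might now read:

```ruby
# spec/services/create_manifest_spec.rb
require 'rails_helper'

describe CreateManifest do
  subject(:create_manifest) do
    described_class.new(meal: meal, template: manifest_template)
  end

  # Data setup is spelled out: no traits, no shared contexts.
  let(:purchaser)         { create(:user) }
  let(:meal)              { create(:meal, purchased_by: purchaser) }
  let(:manifest_template) { create(:manifest_template) }

  describe '#call' do
    it 'builds a manifest from the template' do
      expect(create_manifest.call).to be_present
    end
  end

  describe '#manifest_built?' do
    it 'is false before the manifest is built' do
      expect(create_manifest.manifest_built?).to eq(false)
    end

    it 'is true after the manifest is built' do
      create_manifest.call

      expect(create_manifest.manifest_built?).to eq(true)
    end
  end
end
```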
We can now clearly see what behavior is being tested, which methods are being used, and which models are required to execute this class. This reads much more like documentation than the previous implementation, and though we will likely take a performance hit here, the trade-off is well worth it. While we might be adding a second or two of test duration, we’re saving minutes or more of developer brain time. Future developers will be able to go straight to this file to understand the class, and modify it easily if needed.