Maintainable pyunit test suites – fixtures

So a while back I blogged about maintainable test suites. One of the things I’ve been doing since is fiddling with the heart of the fixtures concept.

To refresh your memory, I’m defining fixture as some basic state you want to reach as part of doing a test. For instance, when you’ve mocked out 2 system calls in preparation for some test code – that represent a state you want to reach. When you’ve loaded sample data into a database before running the actual code you want to make assertions about – that also represents a state you want to reach. So does simply combining three or four objects so you can run some code.

Now, there are existing frameworks in python for this sort of thing. testresources and testscenarios both go some way towards this (and I and to blame for them :)), so does the zope testrunner with layers,  and the testfixtures project has some lovely stuff as well. And this is without even mentioning py.test!

There are a few things that you need from the point of view of running a test and establishing this state:

  • You need to  be able to describe the state (e.g. using python code) that you wish to achieve.
  • The test framework needs to be able to put that state into place when running the test. (And not before because that might interfere with other tests)
  • And the state needs to be able to be cleaned up.

Large test suites or test suites dealing with various sorts of external facilities will also often want to optimise this process and put the same state into place for many tests. The (and I’m not exaggerating) terrible setUpClass and setUpModule and other similar helpers are often abused for this.

Why are they terrible? They are terrible because they are fragile; there is no (defined in the contract) way to check that the state is valid for the next test, and its common to see false passes and false failures in tests using setUpClass and similar.

So we also need some way to reuse such expensive things while still having a way to check that test isolation hasn’t been compromised.

Having looked around, I’ve come to the conclusion we’ll all benefit if there is a single core protocol for doing these things, something that can be used and built on in many different ways for many different purposes. There was nothing (that I found) that actually met all these requires and was also tasteful enough that folk might really like using it.

I give you ‘fixtures‘. Or on Launchpad. This small API is intended to be a common contract that all sorts of different higher level test libraries can build on. As such it has little to no policy or syntatic sugar.

It does have a nice core, integration with pyunit.TestCase, and I’m going to add a library of useful generic fixtures (like temporary directories, environment isolators and so on) to it. I’d be delighted to add more committers to the project, and intend to have it be both Python 2.x and 3.x compatible (if its not already – my CI machine isn’t back online after the move yet, I’m short of round tuits).

Now, if you’re writing some code like:

class MyTest(TestCase):
    def setUp(self):
        foo = Foo()
        bar = Bar()
        self.quux = Quux(Foo(), Bar())
        self.addCleanup(self.quux.done)

You can make it reusable across your code base simply by moving it into a fixture like this:

class QuuxFixture(fixtures.Fixture):
    def setUp(self):
        foo = Foo()
        bar = Bar()
        self.quux = Quux(Foo(), Bar())
        self.addCleanup(self.quux.done)

class MyTest(TestCase, fixtures.TestWithFixtures):
    def setUp(self):
        self.useFixture(QuuxFixture)

I do hope that the major frameworks (nose, py.test, unittest2, twisted) will include the useFixture glue themselves shortly; I will offer it as a patch to the code after giving it some time to settle. Further possibilities include declared fixtures for tests, and we should be able to make setUpClass better by letting fixtures installed during it get reset between tests.

What do I do @ work?

I recently moved within Canonical from being a paid developer of Bazaar to take on a larger challenge  Technical Architect for Launchpad. Its been two months now, and its time to put my head up out of the coal face, have a look around and regroup.

When I worked on Bazaar, every day when I started work got up I was working on a tool anyone can use, designed for collaboration upon sourcecode, for people writing software. This is a toolchain component right at the heart of the free software world. Bazaar and tools like it get used everyday to manage, distribute and collaborate on the sourcecode that makes up the components of Ubuntu, Debian, Fedora and so forth. Every time someone new starts using Bazaar for a new free or open source project, well I felt happy – happy that in my small part I’m helping with this revolution we’re carrying out.

Launchpad is pretty similar to Bazaar in some ways. Obviously they are both free software, both are written in Python, and both are sponsored by Canonical, my employer. And they both are designed to assist in collaboration and communication between free software developers – albeit in rather different ways.

Bazaar is a tool anyone can install locally, run as a command line, GUI, or local webserver, and share code either centrally (e.g. by pushing to Launchpad), or in a peer to peer fashion, acting as their own server.

Launchpad, by contrast is a website which (usually) folk will use as a service – in their browser, from the comand line – FTP (for package building), ssh (for Bazaar branch pushing or pulling), or even local GUI programs using the Launchpad API service. This makes it more approachable for first time collaborators, but its less able to be used offline, and it has all the usual caveats of web sites : it needs a username and password, it’s availability depends on the operators – on the team I’m part of. So there’s a lot less room for error: if we do something wrong, the system is unavailable, and users can’t just ‘apt-get install’ an older release.

With Launchpad our goal is to to get all the infrastructure that open source need out of the way, so that they can focus on their code, collaboration within their team – and almost uniquely – collaboration with other teams. As well as being open source, Launchpad is free for all open source projects to use – Ubuntu is our single biggest user – they use it for all bugtracking, translation and package building, and have a hugefraction of the total storage overhead in the database.

Launchpad is a pretty nice system, so people use it, and as a result (on a technical basis) it is suffering from its own success: small corner cases in the code turn up every day or two, code written years ago to deal with a relatively small data set now has to deal with data sets a thousand or more times larger (one table, for instance, has over 600,000,000 rows in it.

For the last two months then, I’ve been working on Launchpad. As Technical Architect, I need to ensure that the things that we (users, stakeholders and developers of Launchpad) want to do are supported by the structure of the system : the platform(s) we’re building on, the way we approach problems, coding standards and diagnostic tools. That sounds pretty dry and hands off, but I’m finding its actually very balanced. I wrote a presentation when I started the job, which encapsulated the challenges I saw in front of the team on this purely technical front, and what I thought I needed to do.

I think I was about right in my expectations: On a typical day, I’ll be hands on in a problem helping get it diagnosed, talking long term structural changes with someone around how to make things more efficient / flexible / maintainable, and writing a small patch here or there to help move things along.

In the two months since I took on this challenge, we’ve made significant headway on the problem of performance for Launchpad : many inefficient code paths have been identified and removed, some new infrastructure has been created as is being rolled out to make individual pages faster, and we’ve massively increased the diagnostic data we get when things go wrong. We’ve introduced facilities for responding more rapidly to issues in the software (but they have to be rolled out across the system) and I hope, over the next 4 months we’ll reach the first of my performance goals: for any webpage in Launchpad, it will complete rendering in 5 seconds 99% of the time. (Note that we already meet this goal if you measure the whole system, but this is biased by some pages being very frequently hit and also being very small).

[edited to correct a typo and a missing ‘5 seconds’]