bzr selftest uses testtools

the emperor has new clothes

bzr has just changed the base class for its test suite from ‘unittest.TestCase’ to ‘testtools.TestCase’. This change has cleaned up a bunch of test logic, deleted a significant amount of code (much of which was redundant with Python unittest) and added some useful and important features.

bzr  has only been able to make this change due to testtools expanding its mission from a simple ‘aggregation of proven unittest extensions’ into one where new extensions that *make unittest more extensible*. My deepest thanks to Jonathan for permitting me to use testtools as the vehicle to put these extension-enabling-extensions (and for his patience in reviewing said changes!).

The change was pretty easy: The bulk of the changes were in bzrlib.tests and bzrlib.tests.test_selftest. I chose to cleanup an ugly API at the same time which added a little scattershot across a number of tests. And there are more changes that can be done to take better advantage of testtools – the amount of deleted and cleaned up code isn’t complete. Even so, its a pretty clear win:

18 files changed, 228 insertions(+), 496 deletions(-)

What went?

bzr had an implementation of This function is the main workhorse of Python’s unittest module, and yet sadly it has to be replaced to change the exceptions that can be raised(to signal new outcomes), or to improve on test cleanup. Testtools provides an API to permit registering new exception types and handlers for them. Like python 2.7 testtools also provides the TestCase.addCleanup API, and these two things combined mean that bzr no longer needs to reimplement the run method.

For expected failures, bzr uses a helper method TestCase.expectFailure to perform an existing assertion and convert the test into an expected failure if that assertion does not trigger. This was another feature testtools already provides and thus got deleted.

All the custom code for skipping and expected failures got deleted, and the other outcomes bzr uses turned into extensions (as per the run discussion above).

In bzr test cases generate a log (because bzr generates a log) and previously the TestResult in bzrlib inspected each test that had been executed to extract the log. This was made simpler by using the details API that testtools provides (see testtools.TestCase.addDetail), which permits tests to add arbitrary data in a semi-structured fashion. This is supported by subunit and a long standing bug with bzr selftest --parallel was fixed as a result – logs from tests run in other processes are now carried across the process barrier intact and are presented cleanly.

Some other minor cleanups are in unittest compatibility code, where bzr would degrade gracefully with unittest runners, and testtools provides such logic comprehensively, so all that got deleted too.

Whats new?

I think the most significant new facility that testtools offers bzrlib is assertThat. This assertion is inspired by the very nice assertThat in JUnit (which has changed substantially since Python’s unittest was written based on it). This assertion separates the two concerns of ‘raise an exception’ and ‘decide if an exception should be raised’. The separation allows for better reuse of custom checking code, because it permits composition in a cleaner way than extra assertion methods permit. Testtools does not include many matchers as yet, but matchers are easy to write, and if one were to write a small adapter to the hamcrest library, there are a bunch of ready made matchers there (though they have a very Java feel – such as is not meaning is – which is why Testtools did not use that library).

Secondly, the addDetail API referenced above, in combination with testtools.TestCase.addOnException will permit capturing the entire working area when a test fails, something that developers currently have to fiddle about with breakpoints to achieve. This hasn’t been done, but is a straight forward patch I hope to do in the new year.

Lastly, Testtools offers testtools.TestCase.getUniqueInteger and testtools.TestCase.getUniqueString, which are not as yet used in bzr tests, but we may start using them soon.

Beyond that, the other features of testtools are already present in bzrlib, and we simply need to find and delete more duplicated code.

Various releases

Recently I’ve been working on the Python unittest API in my spare time, with a long term goal of making it possible to safely and sensibly glue many different plugins together into the core.

Two important components of that goal are being able to extend the data included in a test result, and being able to change how a test is run (such as adding new exceptions that should be treated as specific outcomes – python unittest uses exceptions to signal outcomes).

In testtools 0.9.2 we have an answer to both those issues. I’m really happy with the data included in outcomes API, ‘TestCase.addDetail’. The API for extending outcomes works, but only addresses part of that issue for now.

Subunit 0.0.4, which is available for older Ubuntu releases in the Subunit releases PPA now, and mostly built on Debian (so it will propogate through to Lucid in due course) has support for the addDetail API. Subunit now depends on testtools, reducing the non-protocol related code and generally making things simpler.

Using those two together, bzr’s parallelised test suite has been improved as well, allowing it to include the log file for tests run in separate processes (previously it was silently discarded). The branch to do this will be merged soon, its just waiting on some sysadmin love to get these new versions into its merge-test environment. This change also provides complete capturing of the log when users want to supply a subunit log containing failed tests. The python code to do this is pretty simple:

def setUp(self):
    super(TestCase, self).setUp()
    self.addDetail("log", content.Content(content.ContentType("text", "plain",
        {"charset": "utf8"}), lambda:[self._get_log(keep_log_file=True)]))

I’ve made a couple of point releases to python-junitxml recently, fixing some minor bugs. I need to figure out how to add the extra data that addDetails permits  to the xml output. I suspect its a strict superset and so I’ll have to filter stuff down. If anyone knows about similar extensions done to junit’s XML format before, please leave a comment :)

Debianising with bzr-builddeb

Bzr build-deb is very nice, but it can be very tricky to get started. I recently did a fresh debianisation of a project that is in bzr upstream, and I thought I’d record the recipe to make it work (at least until the various bugs making it hard re fixed).

Assuming that the upstream uses bzr, it goes like this:

  1. Start with a branch that is close to the code you want to Debianise. E.g. if the release was off trunk, 3 commits back: bzr branch trunk -r -3 debian
  2. Debianise as normal: put the tarball with the right name in the parent dir,  add a debian directory and fiddle until you build a package you’re happy with. Don’t commit while doing this.
  3. Build a source package- debuild -S, or bzr builddeb -S
  4. Revert your changes – bzr revert.
  5. Import the dsc – bzr import-dsc ../*.dsc
  6. Now, you may find that some dot files, such as .bzrignore have been discarded inappropriately (there is a bug open on this). If that happened, keep going. Otherwise, you’re done: you can now use merge-upstream on future upstream releases, and debcommit etc.
  7. bzr uncommit
  8. bzr revert .bzrignore (and any other files that you want to get back)
  9. debcommit
  10. All done, see point  6 for details.


Why upstreams should do distribution packaging

Software comes in many shapes and styles. One of the problems the author of software faces is distributing it to their users.

As distributors we should not discourage upstreams that wish to generate binary packages themselves, rather we should cooperate with them, and ideally they will end up maintaining their stable release packages in our distributions. Currently the Debian and Ubuntu communities have a tendancy to actively discourage this by objecting when an upstream software author includes a debian/ directory in their shipped code.  I don’t know if Redhat or Suse have similar concerns, but for the dpkg toolchain, the presence of an upstream debian directory can cause toolchain issues.

In this blog post, I hope to make a case that we should consider the toolchain issues bugs rather than just-the-way-it-is, or even features.

To start at the beginning, consider the difficulty of installing software: the harder it is to install a piece of software, the more important having it has to be for a user to jump through hoops to install it.

Thus projects which care about users will make it easy to install – and there is a spectrum of ease. At one end,

checkout from version control, install various build dependencies like autoconf, gcc and so on

through to

download and run this installer

Now, where some software authors get lucky, is when someone else makes it easy to install their software, they make binary packages, so that users can simply do

apt-get install product

Now some platforms like MacOSX and Microsoft Windows really do need an installer, but in the Unix world we generally have packaging systems that can track interdependencies between libraries, download needed dependencies automatically, perform uninstalls and so on. Binary packaging in a Linux distribution has numerous benefits including better management of security updates (because a binary package can sensibly use shared libraries that are not part of the LSB).

So given the above, its no surprise to me to see the following sort of discussion on #ubuntu-motu:

  1. upstream> Hi, I want to package product.
  2. developer> Hi, you should start by reading the packaging guide
  3. (upstream is understandably daunted – the packaging guide is a substantial amount of information, but only a small fraction is needed to package any one product.)

or (less usefully)

  1. upstream> Hi, I want to package product.
  2. developer> If you want to contribute, you should start with existing bugs
  3. upstream> But I want to package product.

Another conversation, which I think is very closely related is

  1. developer> Argh, product has a debian dir, why do they do this to me?!

The reasons for this should be pretty obvious at this point:

  • Folk want to make their product easy to install and are not themselves DD’s, DM’s or MOTU’s.
  • So they package it privately – such as in a PPA, or their own archive.
  • When they package it, they naturally put the packaging rules in their source tree.

Now, why should we encourage this, rather than ask the upstream to delete their debian directory?

Because it lets us, distributors, share the packaging effort with the upstream.

Upstreams that are making packages will likely be doing this for betas, or even daily builds. As such they will find issues related to new binaries, libraries and so on well in advance of their actual release. And if we are building on their efforts, rather than discarding them, we can spend less time repeating what they did and more packaging other things.

We can also encourage the upstream to become a maintainer in the distro and do their own uploads: many upstreams will come to this on their own, but by working with them as they take their early steps we can make this more likely and an easier path.