Subunit and subtests

Python 3 recently introduced a nice feature – subtests. When I was putting subunit version 2 together I tried to cater for this via a heuristic approach – permitting the already known requirement that some tests which are reported are not runnable be combined with substring matching to identify subtests.

However that has panned out poorly, when I went to integrate this with testr the code started to get fugly.

So, I’m going to extend the StreamResult API to know about subtests, and issue a subunit protocol bump – to 2.1 – to add a new field for labelling subtest events. My plan is to make this build a recursive tree structure – that is given test “test_foo” with subtest “i=3″ which the Python subtest code would identify as “test_foo (i=3)”, they should be identified in StreamResult as test_id “test_foo (i=3)” and parent_test_id “test_foo”. This can then nest arbitrarily deep if test runners decide to do that, and the individual runnability becomes up to the test runner, not testrepository / subunit / StreamResult.

subunit version 2 progress

Subunit V2 is coming along very well.

Current status:

  • I have a complete implementation of the StreamResult API up as a patch for testtools. Thats 2K LOC including comeprehensive tests.
  • Similarly, I have an implementation of a StreamResult parser and emitter for subunit. Thats 1K new LOC including comprehensive tests, and another 500 lines of churn where I migrate all the subunit filters to v2.
  • pdb debugging works through subunit v2, permitting dropping into a debugger to work. Yay.

Remaining things to do:

  • Update the other language bindings – the C library in particular.
  • Teach testrepository to expect v2 input (and probably still store v1 for a while)
  • Teach testrepository to use pipes for the stdin of test runner backends, and some control mechanism to switch input between different backends.
  • Discuss the in-Python API with more folk.
  • Get code merged :)

Simpler is better – a single event type for StreamResult

StreamResult, covered in my last few blog posts, has panned out pretty well.

Until that is, that I sat down to do a serialised version of it. It became fairly clear that the wire protocol can be very simple – just one event type that has a bunch of optional fields – test ids, routing code, file data, mime-type etc. It is up to the recipient at the far end of a stream to derive semantic meaning, which means that encoding a lot of rules (such as a data packet can have either a test status or file data) into the wire protocol isn’t called for.

If the wire protocol doesn’t have those rules, Python parsers that convert a bytestream into StreamResult API calls will have to manually split packets that have both status() and file() data in them… this means it would be impossible to create many legitimate bytestreams via the normal StreamResult API.

That seems to be an unnecessary restriction, and thinking about it, having a very simple ‘here is an event about a test run’ API that carries any information we have and maps down a very simple wire protocol should be about as easy to work with as the current file or status API.

Most combinations of file+status parameters is trivially interpretable, but there is one that had no prior definition – a test_status with no test id specified. Files with no testid are easily considered as ‘global scope’ for their source, so perhaps test_status should be treated the same way? [Feedback in comments or email please]. For now I’m going to leave the meaning undefined and unconstrained.

So I’m preparing a change to my patchset for StreamResult to:

  • Drop the file() method altogether.
  • Add file_bytes, mime_type and eof parameters to status().
  • Make the test_id and test_status parameters to status() optional.

This will make the API trivially serialisable (both to JSON or protobufs or whatever, or to the custom binary format I’m considering for subunit), and equally trivially parsable, which I think is a good thing.

First experience implementing StreamResult

My last two blog posts were largely about the needs of subunit, but a key result of any protocol is how easy working with it in a high level language is.

In the weekend and evenings I’ve done an implementation of a new set of classes – StreamResult and friends – that provides:

  • Adaption to and from the existing TestResult APIs (the 2.6 and below API, 2.7 API, and the testtools extended API).
  • Multiplexing multiple streams together.
  • Adding timing data to a stream if it is absent.
  • Summarising a stream.
  • Copying a stream to multiple outputs
  • A split out API for instructing a test run to stop.
  • A simple test-at-a-time stream processor that makes it easy to just deal with tests rather than the innate complexities of an event based interface.

So far the code has been uniformly simple to write. I started with an API that included an ‘estimate’ function, which I’ve since removed – I don’t believe the complexity is justified; enumeration is not significantly more expensive than counting, and runners that want to be efficient can either not enumerate or remember the enumeration from prior runs.

The documentation in the linked pull request is a good place to start to get a handle on the API; I’d love feedback.

Next steps for me are to do a subunit protocol revision that maps to the Python API, both parser and generator and see how it feels. One wrinkle there is that the reason for doing this is to fix intrinsic limits in the existing protocol – so doing forward and backward wire protocol compatibility would defeat the point. However… we can make the output side explicitly choose a protocol version, and if we can autodetect the protocol version in the parser, even if we cannot handle mixed streams we can get the benefits of the new protocol once data has been detected. That said, I think we can start without autodetection during prototyping, and add it later. Without autodetection, programs like TestRepository will need configuration options to control what protocol variant to expect. This could be done by requiring this new protocol and providing a stream filter that can be deployed when needed.

Less SPOFs: pyjunitxml, testscenarios

I’ve made the Testtools committers team own both the project and the trunk branch for both pyjunitxml and testscenarios. This removes me as a SPOF if anything needs doing in those projects – any Testtools committer can now do it. (Including code review and landing). If you are a testtools committer and need PyPI release rights, ping me and I’ll add you. (I wish PyPI had group management).

testrepository iteration for python projects

Tesetrepository has a really nice workflow for fixing a set of failing tests:

  1. Tell it about the failing tests (e.g. by doing a full test run, or running a single known failing test)
  2. Run just the known failing tests (testr run –failing)
  3. Make a change
  4. Goto step 2

As you fix up the tests testr will just give your test runner a smaller and smaller list of tests to run.

However I haven’t been able to use that feature when developing (most) Python programs.

Today though, I added the necessary support to testtools, and as a result subunit (which inherits its thin test runner shim from testtools) now supports –load-list. With this a simple .testr.conf can support this lovely workflow. This is the one used in testrepository itself: it runs the testrepository tests, which are regular unittest tests, using – this gives it subunit output, and tells testrepository how to run a subset of tests.

test_command=python -m $IDOPTION testrepository.tests.test_suite
test_id_option=--load-list $IDFILE

Subunit and nose

Looks like someone has come up with a nose plugin for subunit – excellent!

In their post the author notes that subunit is not easy_installable at the moment. It will be shortly. Thanks to Tres Seaver there is a for the python component of Subunit, and he has offered to maintain that going forward. His patch is in trunk, and the next release will include a pypi upload.

The next subunit release should be pretty soon too – the unicode support in testtools has been overhauled thanks to Martin[gz], and so we’re in much better shape on Python 2.x than we were before. Python3 for testtools is trouble free in this area because confused strings don’t exist there :)