31 Aug 2009

Hi Rich! Re hour+long unit tests

I agree that you need a comprehensive test suite, and that it should test all the dark and hidden corners of your code base.

But time is not free! A long test suite inhibits:

  • cycle time – the fastest you can release a hot fix to a customer
  • developer productivity – you can’t forget about a patch till it’s passed the regression test suite
  • community involvement – if it takes an hour to run the test suite, an opportunistic developer who wanted to tweak something in your code will have walked away long ago

    Note that these points are orthogonal to whether a developer’s edit-test cycle runs some or all of the tests, or whether you use a CI tool, a test-commit tool, or some other workflow.

    All that said though, I’m extremely interested in *why* any given test suite takes hours: does it need to? What is it doing? Can you decrease the time by 90% and coverage by 2%?
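    One way to start answering "what is it doing?" is to time every test and look at the slowest ones first. The following is a minimal sketch, assuming a Python unittest-based suite; the names `TimingResult`, `slowest` and `ExampleTests` are made up for illustration and are not taken from any particular project.

```python
import time
import unittest


class TimingResult(unittest.TestResult):
    """Collects the wall-clock duration of every test that runs."""

    def __init__(self):
        super().__init__()
        self.durations = {}  # test id -> elapsed seconds
        self._started = {}   # test id -> start timestamp

    def startTest(self, test):
        super().startTest(test)
        self._started[test.id()] = time.perf_counter()

    def stopTest(self, test):
        elapsed = time.perf_counter() - self._started.pop(test.id())
        self.durations[test.id()] = elapsed
        super().stopTest(test)


def slowest(durations, count=10):
    """Return the `count` slowest (test id, seconds) pairs, slowest first."""
    ordered = sorted(durations.items(), key=lambda item: item[1], reverse=True)
    return ordered[:count]


class ExampleTests(unittest.TestCase):
    """Hypothetical tests: one cheap, one deliberately slow."""

    def test_fast(self):
        self.assertEqual(1 + 1, 2)

    def test_slow(self):
        time.sleep(0.05)  # stands in for an expensive fixture
        self.assertTrue(True)


def run_example():
    """Run the example tests and return the populated TimingResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTests)
    result = TimingResult()
    suite.run(result)
    return result
```

    Calling `slowest(run_example().durations)` lists the tests worth investigating first; in a real suite a handful of tests often account for a large share of the run time, though that is something to measure rather than assume.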

    I got another response back, which talks about keeping the working set of tests at about 5 minutes long and splitting the rest off (via declared metadata on each test) into ‘run after commit or during CI’. This has merits for reducing the burden on a developer in their test-commit cycle, but as I claim above, I believe there is still an overhead from those other tests that are pending execution at some later time.
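    The "declared metadata on each test" split could look roughly like this. It is a sketch under assumptions: Python's unittest, a hypothetical `speed` decorator, and a rule (my choice, not the respondent's) that untagged tests count as fast.

```python
import unittest


def speed(label):
    """Attach a hypothetical 'speed' tag ('fast' or 'slow') to a test method."""
    def decorate(func):
        func.speed = label
        return func
    return decorate


def iter_tests(suite):
    """Flatten nested suites into individual test cases."""
    for item in suite:
        if isinstance(item, unittest.TestSuite):
            yield from iter_tests(item)
        else:
            yield item


def select(suite, label):
    """Build a new suite holding only the tests tagged with `label`.

    Untagged tests are treated as 'fast' (an assumption of this sketch).
    """
    chosen = unittest.TestSuite()
    for test in iter_tests(suite):
        method = getattr(test, test._testMethodName)
        if getattr(method, "speed", "fast") == label:
            chosen.addTest(test)
    return chosen


class ExampleTests(unittest.TestCase):
    """Hypothetical tests: one in the working set, one deferred."""

    def test_parse(self):
        self.assertEqual(int("2"), 2)

    @speed("slow")
    def test_full_rebuild(self):
        self.assertTrue(True)


def load_example(label):
    """Load the example tests and keep only those with the given tag."""
    everything = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTests)
    return select(everything, label)
```

    A developer would run `load_example("fast")` before committing and leave `load_example("slow")` to the later stage. The split only moves the cost of the slow tests; it doesn't remove it.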

    From a LEAN perspective, the cycle time is very important. Another important thing is handoffs. Each time we hand over something (e.g. a code change that I *think* works because it passed my local tests), there is a cost. Handing over to a machine to do CI is just as expensive as handing to a colleague. Add that contributors sending in patches from the internet may not hang around to find out that their patch *fails* in your CI build, and you can see why I think CI tools are an adjunct to keeping a clean trunk, rather than a key tool. The key tool is to not commit regressions :)

    Oh, and I certainly accept that test suites should be comprehensive… I just don’t accept that more time == more coverage, or that there isn’t a trade-off between comprehensiveness and timeliness.

    30 Aug 2009

    Made some time to hack… the results:

    config-manager 0.4 released and re-uploaded to Debian (it was removed due to some confusion a while back). This most notably drops the hard dependency on pybaz and adds specific-revision support for bzr.

    subunit snapshot packaging sorted out to work better with subunit from Ubuntu/Debian. This latest snapshot has nested progress and subunit2gtk included.

    PQM got a bit of a cleanup:

  • The status region shown during merges is now about twice as tall.
  • If the precommit_hook outputs subunit, it will be picked up automatically and shown in the status region.
  • All deprecation warnings in python2.6 are cleaned up.
  • Pending bugfixes were merged from Tim Cole and Daniel Watkins – thanks guys!

    16 Aug 2009

    Hudson seemed quite nice when I was looking at how drizzle use it.

    I proposed it to the squid project to replace a collection of cron scripts – and we’re now in the final stages of deployment: Hudson is running, doing test builds. We are now polishing the deployment, tweaking where and how often reports are made, and adding more coverage to the build farm.

    I thought I’d post a few observations about the friction involved in getting it up and running. Understand that I think it’s a lovely product – these blemishes are really quite minor.

    Installation of the master machine on Ubuntu – add an apt repository, apt-get update, apt-get install.

    Installation of the master machine on CentOS 5.2 (Now 5.3): make a user by hand, download a .war file, create an init.d script by copy-and-adjusting an example off the web. Plenty of room to improve here.

    Installing slave machines: make a user by hand, add an rc.local entry to run java on slave.jar as that new user. This could be more polished.

    Installing a FreeBSD 6.4 slave: manually download various java sources to my laptop, scp them up to the FreeBSD machine, build *java* overnight, then make a user, add an rc.local entry etc. _Painful_.

    The next thing we noticed was that the model in Hudson doesn’t really expose platforms – but we want to test on a broad array of architectures, vendors and releases. i386-Ubuntu-intrepid building doesn’t imply that i386-Debian-lenny will build. We started putting tags on the slaves that let us say ‘this build is for amd64-CentOS-5.2’, so that if we have multiple machines for a platform we’ll have some redundancy, and so that it’s easy to get a sense of what’s failing.

    This had some trouble – it’s very manual, and as it’s manually entered data it can get out of date quite easily.
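    To make the tagging scheme concrete, here is a small sketch of building architecture-vendor-release labels and grouping machines by them. It is plain Python rather than anything from Hudson's API, and the node names and the `platform_label`, `group_by_platform` and `without_redundancy` helpers are all hypothetical.

```python
from collections import defaultdict


def platform_label(arch, vendor, release):
    """Join the three facts into a label such as 'amd64-CentOS-5.2'."""
    return "-".join((arch, vendor, release))


def group_by_platform(nodes):
    """Map each platform label to the sorted names of the nodes that carry it.

    `nodes` maps a node name to an (arch, vendor, release) tuple.
    """
    groups = defaultdict(list)
    for name, facts in nodes.items():
        groups[platform_label(*facts)].append(name)
    return {label: sorted(names) for label, names in groups.items()}


def without_redundancy(groups):
    """Return the labels that are covered by only a single machine."""
    return sorted(label for label, names in groups.items() if len(names) == 1)


# Hypothetical build farm: node names are invented for illustration.
EXAMPLE_NODES = {
    "farm-1": ("amd64", "CentOS", "5.2"),
    "farm-2": ("amd64", "CentOS", "5.2"),
    "farm-3": ("i386", "Ubuntu", "intrepid"),
    "farm-4": ("i386", "Debian", "lenny"),
}
```

    With the example data, `group_by_platform(EXAMPLE_NODES)` puts both CentOS machines under one label, and `without_redundancy` points at the platforms where a single machine going offline would lose coverage. Deriving the three facts on each machine, rather than typing them in, is what would stop the data going stale.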

    So at the weekend I set out to make a plugin, and ran into some yak shaving.

    Hudson plugins use maven2 to build and deploy. So I added the maven2 plugin to my eclipse (after updating eclipse to get the shiniest bzr-eclipse), and found the first bug – issue 1580: maven2 and team plugins in eclipse 3.5 don’t play all that nicely together.

    Push.

    Removing bzr-eclipse temporarily allowed eclipse’s maven plugin to work, but for some reason many dependencies were not found, and various discussions found on the net suggest manually adding them to the CLASSPATH for the project – but not how to identify which ones they were.

    Pop.

    So, I switched to netbeans – a 200MB download, as Ubuntu only has 6.5 in the archive. netbeans has the ability to treat a maven2 project as a directly editable project. I have to say that it works beautifully.

    Push.

    I made a new plugin and looked around for an appropriate interface (DynamicLabellers, designed for exactly our intended use).

    Sadly, in my test environment, it didn’t work – the master didn’t call into the plugin at all, and no node labels were attached.

    Push.

    Grab the source for hudson itself, find the trick-for-newcomers here – do a full build outside netbeans, then in netbeans open main/war as a project, and main/core as well, not just the top level pom.xml. To run, with main/war selected in the project list, hit the debug button. However, changes made to the main/core sources are not deployed until you build them (F11) – the debug environment looks nearly identical to a real environment.

    Pop.

    There is a buglet in DynamicLabeller support in Hudson: the code for general slave support and the code for the ‘master node’ (‘Hudson.java’) are inconsistent, which causes different behaviour with dynamic labels. Specifically, the master node will never get dynamic labels. So I fixed this, cleaned up the code to remove the duplication as much as possible (there are comments in the code base saying that different synchronisation styles are needed for some reason), and submitted it upstream.

    Pop.

    I’ll make the plugin for squid pretty some evening this week, and we should be able to start asking for volunteers for the squid build farm.

    Yay!

    06 Aug 2009

    0800 Friday morning, machine is slow… why?

    Random disk I/O, evolution doing a table scan again, and popularity contest fighting with it by reading in many many inodes.

     9729 be/6 nobody    183.44 K/s    0.00 B/s  0.00 %  0.00 % perl -w /usr/sbin/popularity-contest 

    Time for pop-con to go – while I like giving statistics about use, this isn’t the first time it’s chosen to get in my way.

    06 Aug 2009

    Success! Last weekend I started working on glue to help drizzle integrate closer with hudson using subunit and pyjunitxml.

    This is now up and running – you can see test runs with details of the test output. The details are being read by hudson from the xml output, which is created by subunit2junitxml. But the drizzle test runner only needs to output subunit, which is somewhat simpler and also streams.
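    To illustrate why a line-oriented stream is the easier thing for a test runner to emit, here is a toy converter from a simplified, subunit-like text stream into JUnit-style XML. It is not subunit2junitxml and does not use the subunit or pyjunitxml libraries; it only understands the small subset of lines described in its comments, and the exact line format it accepts is an assumption of the sketch.

```python
import xml.etree.ElementTree as ET

OUTCOMES = ("success", "failure", "error")


def parse_stream(lines):
    """Yield (test name, outcome, details) from a simplified stream.

    Understood lines (an assumed subset, for illustration only):
      'test: NAME'       start marker, ignored here
      'success: NAME'    test passed
      'failure: NAME ['  test failed; detail lines follow until a lone ']'
      'error: NAME ['    test errored; detail lines follow until a lone ']'
    """
    lines = iter(lines)
    for line in lines:
        keyword, _, rest = line.rstrip("\n").partition(": ")
        if keyword not in OUTCOMES:
            continue
        details = []
        if rest.endswith(" ["):
            rest = rest[:-2]
            # Consume detail lines from the same iterator up to the closing ']'.
            for detail in lines:
                detail = detail.rstrip("\n")
                if detail == "]":
                    break
                details.append(detail)
        yield rest, keyword, "\n".join(details)


def to_junit_xml(lines, suite_name="example"):
    """Render the parsed results as a JUnit-style <testsuite> document."""
    results = list(parse_stream(lines))
    suite = ET.Element(
        "testsuite",
        name=suite_name,
        tests=str(len(results)),
        failures=str(sum(1 for r in results if r[1] == "failure")),
        errors=str(sum(1 for r in results if r[1] == "error")),
    )
    for name, outcome, details in results:
        case = ET.SubElement(suite, "testcase", name=name)
        if outcome != "success":
            ET.SubElement(case, outcome).text = details
    return ET.tostring(suite, encoding="unicode")
```

    The runner's side of the bargain is just printing one line per event as it happens; the XML document, which needs totals up front, can be assembled by a separate filter once the stream ends.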

    06 Aug 2009

    Dear evolution, why are you taking 180 seconds (+- 20) to open mails?

    Dear user, I’m reading message metadata for every message in that folder, then for all the new mail that arrived while I was doing that.

    Dear evolution, sqlite is good, not reading the entire maildirs in to filter mail, or open a folder, would be better.

    iotop says:
       TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO> COMMAND
     14797 be/4 robertc     4.30 M/s    0.00 B/s  0.00 %  0.00 % evolution

    Dear user, I know, but I’m still the program I was before sqlite… what to do? (bugs have been filed, tuits are currently lacking).

    Dear evolution, I see you have a large db, but perhaps it’s just your cache size?

    $ ls -l /home/robertc/.evolution/mail/imap/robertc@localhost/folders.db
    -rw-r--r-- 1 robertc robertc 953131008 2009-08-06 12:23 /home/robertc/.evolution/mail/imap/robertc@localhost/folders.db
    $ sqlite3 /home/robertc/.evolution/mail/imap/robertc@localhost/folders.db
    SQLite version 3.6.14.2
    Enter ".help" for instructions
    Enter SQL statements terminated with a ";"
    sqlite> pragma default_cache_size;
    2000
    sqlite> pragma default_cache_size=50000;

    Dear user, oooo, that feels good.
    2000, the default, is 2MB of cache. evolution was using 535m of actual memory before; this gives it 25 times the cache size. My theory is that the still-present read-all-message-info limitations in evolution were causing cache thrashing. (A DB doesn’t help at all unless one is actually looking at less data :).
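    The arithmetic behind those numbers can be checked with a few lines of Python. This is a sketch under assumptions: it uses an in-memory database rather than the evolution file, it assumes the 1024-byte page size that was SQLite's default at the time, and it sets the per-connection `cache_size` pragma instead of the persistent `default_cache_size` used above (which SQLite has since deprecated). The helper names are made up.

```python
import sqlite3


def cache_bytes(pages, page_size):
    """Approximate cache memory for a page count at a given page size."""
    return pages * page_size


def page_size(connection):
    """Ask SQLite for the page size of the open database, in bytes."""
    return connection.execute("PRAGMA page_size").fetchone()[0]


def set_cache_pages(connection, pages):
    """Set the page cache for this connection and return the value in effect.

    PRAGMA values cannot be bound as SQL parameters, so the page count is
    forced to an int before being formatted into the statement.
    """
    connection.execute("PRAGMA cache_size = %d" % int(pages))
    return connection.execute("PRAGMA cache_size").fetchone()[0]


def growth_factor(old_pages, new_pages):
    """How many times larger the new cache is than the old one."""
    return new_pages / old_pages
```

    With 1024-byte pages, `cache_bytes(2000, 1024)` is roughly 2MB and `cache_bytes(50000, 1024)` roughly 51MB, which is the 25x growth mentioned above. A database created with a different page size would scale both figures accordingly.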