Sanity and Testcases for pymake

Monday, February 23rd, 2009

Testcases kept me sane writing (and rewriting) pymake. This shouldn’t be a surprise to experienced developers: most developers agree that that test-driven development is good. Often, however, beginning programmers don’t know how to start a project with adequate testing. This post attempts to describe the pymake test environment and give examples of pymake tests.

I started pymake with fear and trepidation. I’ve been working extensively with makefiles for 6 years; makefile parsing and execution still occasionally surprises me. This fear was a great motivator: if I had thought this to be an easy job, I might have skipped writing tests until much later in the process. But the testsuite has been absolutely essential: I doubt I could have completed initial development in two weeks without it, and there is no way I could have refactored the code to support in-process recursion and parallel make this week without it.

Start Small

The most important hurdle in a new project is creating a framework to run tests. The requirements for a test framework are pretty simple:

  • make it easy to write new tests;
  • make it easy to run the tests;
  • don’t waste time writing fancy test apparatus.

The specifics of your test framework will depend on your project. When possible, re-use existing frameworks, or at least borrow extensively from them. For pymake, I use two basic types of test: makefile tests and python unit tests.

Makefile Tests

Because the entire purpose of pymake is to parse and execute makefiles, pymake has a test harness for parsing and executing makefiles. This test harness runs make against a testcase makefile; parsing and executing the makefile should complete successfully (with a 0 exit code) and print TEST-PASS somewhere during execution. Typically, each makefile will test a single feature or a related set of features.

This test harness is particularly important because pymake is supposed to be a mostly drop-in replacement for GNU make. This test harness can be used to test both GNU make and pymake. The harness was committed in revision 1 of the pymake repository, long before pymake could parse makefiles. The first tests were tests of GNU make behavior, in cases where that behavior was under-documented or confusing. Before I started implementing the meat of the parser, I already had discovered several interesting behaviors and written tests for them.

tchaikovsky:/builds/pymake $ python tests/runtests.py # run the testsuite using GNU make
tchaikovsky:/builds/pymake $ python tests/runtests.py -m /builds/pymake/make.py # run the testsuite using make.py

As the project became basically functional, each new feature was committed with a test. See, for instance, a fix for parsing line continuations with conditional blocks.

Initially, the makefile test harness only checked for success. But an important part of most test suites is to check for proper error handling. runtests.py grew additional features to allow a makefile to specify that it should return a failure error code, and also to specify a command line. It also ran each test in a clean directory, to avoid unexpected interactions between tests.

Writing makefile testcases often required creativity. It’s often important to check that commands are executed in a specified order, or that a particular command is only executed once. One technique is to append output to a signal file while running commands, and then test the contents of the file (tests/diamond-deps.mk):

# If the dependency graph includes a diamond dependency, we should only remake
# once!

all: depA depB
	cat testfile
	test `cat testfile` = "data";
	@echo TEST-PASS

depA: testfile
depB: testfile

testfile:
	printf "data" >>$@

This same technique is also useful to make sure that parallel execution is enabled or disabled appropriately: tests/parallel-toserial.mk.

Python Unit Tests

In the early stages of pymake, only some portions of the data model and parser were implemented: there were lots of low-level functions for location-tracking, line continuations, and tokenizing. It was important to me that these low-level functions were rock-solid before I started attempting to glue them together.

The python standard library includes the unittest module, which is a simple framework for creating and running a test suite.

import unittest
class MyTest(unittest.TestCase):
  # any function named test* will be run as a single test case
  def test_arrayindex(self):
    self.assertEqual([1, 2, 3][0], 1)

pymake uses the unittest module to test the data model and parser: tests/datatests.py and tests/parsertests.py.

One annoying limitation of the unittest module is that is difficult to construct a set of test cases that run the same test code on different input data. To solve this problem, I wrote a multitest helper function. The developer writes a class with a testdata dictionary and a runSingle method, and multitest will create a test function for each element in the test data:

def multitest(cls):
    for name in cls.testdata.iterkeys():
        def m(self, name=name):
            return self.runSingle(*self.testdata[name])

        setattr(cls, 'test_%s' % name, m)
    return cls

class TokenTest(TestBase):
    testdata = {
        'wsmatch': ('  ifdef FOO', 2, ('ifdef', 'else'), True, 'ifdef', 8),
        'wsnomatch': ('  unexpected FOO', 2, ('ifdef', 'else'), True, None, 2),
        'wsnows': ('  ifdefFOO', 2, ('ifdef', 'else'), True, None, 2),
        'paren': (' "hello"', 1, ('(', "'", '"'), False, '"', 2),
        }

    def runSingle(self, s, start, tlist, needws, etoken, eoffset):
        d = pymake.parser.Data.fromstring(s, None)
        tl = pymake.parser.TokenList.get(tlist)
        atoken, aoffset = d.findtoken(start, tl, needws)
        self.assertEqual(atoken, etoken)
        self.assertEqual(aoffset, eoffset)
multitest(TokenTest)

Tests Allow For Simple Refactoring

Every project I’ve worked on has had to refactor code after it was first written. Sometimes you know you’ll have to refactor code in the future. Other times, you discover the need to refactor code well after you’ve started writing it. In either case, the test suite can allow you to perform large-scale refactoring tasks with confidence. Two examples will help explain how refactoring was important:

Makefile Variable Value Representation

VAR = $(OTHER) $(function arg1,arg2)

Makefiles have two different “flavors” of variables, recursive and simple. When I first started pymake, I decided to parse recursive variable declarations “immediately” into an Expansion object. This worked well, and it made reporting the locations of parse errors easy.

Unfortunately, there is a case where you cannot parse a variable value immediately:

VAR = $(function
VAR += arg1,
VAR += arg2
VAR += )

In this case, VAR cannot be parsed until it has been fully constructed. Fixing this case involved changing the entire data model of variable storage:

Revision 64 (348f682e3943)

Adding a makefile test for the failing case.

Revision 67 (63531e755f52)

Refactoring variable storage to account for dynamically-composed variables.

Independent Parsing Model

Because parsing doesn’t perform very well, it’s good to optimize it away when possible. The original parsing code when through each makefile line by line and inserted rules, commands, and variables into the makefile data structure immediately. This makes it difficult or impossible to save the parsed structure and re-use it. On Friday I refactored the parser into two phases. The first phases creates a hierarchical parsing model independent of any particular makefile. The second phase executes the parsing model in the context of the variables of a particular Makefile.

After first implementing this change, I found one serious error: I was associating commands with rules without considering conditionals such as ifdefs:

all:
command1
ifdef FOO
command2
else
command3

Fortunately, tests/ifdefs.mk was already in the testsuite, and detected this error. Fixing it required reworking the parsing model with an extra execution context to correctly associate commands with their parent rules.

Secondly, after committing the parsing model, I found an additional regression when building Mozilla: the behavior of “ifndef” was reversed due to an early return statement. I was able to add an additional test and a simple fix, once I figured out what was going on.

pymake status

pymake features implemented since last week:

  • Implement $(eval):
  • 122:1995f94b1c2f: Implement the vpath directive
  • 123:17169ca68e03: Implement automatic wildcard expansion in targets and prerequisites. I hate this, but NSS uses it, and I hate NSS more.
  • 135:fcb8d4ddd21b: Run submakes within the same process if possible
  • parallel-execution branch: Parallel execution of commands (-jN)
  • 156:3ae1e58c1a25: Cache parser models (avoid reparsing rules.mk)
  • win32-msys branch: Ted has pymake working on Windows. It doesn’t build Mozilla yet because we leak MSYS paths into makefiles, but that shouldn’t be hard to fix.

pymake: A Mostly GNU-compatible `make`, Written in Python

Friday, February 13th, 2009

Mozilla has a lot of Makefiles, and although we’re not completely happy with our current recursive-make build system, converting to something else is a very difficult task. In order to consider other solutions, we have to have a seamless migration path from our existing infrastructure. Unfortunately for any potential migration, Mozilla uses pretty much every make feature and quirk imaginable. It isn’t sufficient to get by with some simplistic importer: any migration needs to understand the arcane details of makefile syntax and parsing rules.

So finally, after two years of considering whether it’s feasible and a good idea, I bit the bullet and wrote a make replacement in python. It was almost as mind-numbingly difficult as I thought it would be. But after two solid weeks, I have a tool which will mostly build Mozilla, and seems to be doing it correctly.

Why would you do such a thing?

Mark Pilgrim was right: only a crazy person says “I’m going to build my own build system”. So rather than creating a whole new build system with a new syntax, I wanted to start out replicating an existing build system exactly. Then we can make incremental changes if appropriate.

What kind of incremental changes?

First up are speed increases. Our Windows builds are currently very slow, in part due to the large number of processes that are spawned. There are several ways to attack this:

  • Each makefile command currently spawns an MSYS shell and then (usually) the actual binary. The MSYS shell is expensive and in most cases unnecessary. pymake should be able to skip the shell for simple command lines.
  • Mozilla runs `nsinstall` a lot. Axel has already implemented nsinstall as a python module: our makefile rules should be able to run commands in python and skip the external process.
  • We can use recursive make without launching additional make processes by recursing within the same pymake process.

Where can I see this monstrosity?

The pymake website. You can also just pull or browser the pymake sources directly from hg.mozilla.org using Mercurial.

Why don’t you just hack GNU make itself?

There are some improvements we’re interested in doing which just aren’t possible within a C codebase:

  • implementing rules directly in Python
  • Condensing multiple invocations of make into a single process

. Python is also about a zillion times easier to hack quickly.

pymake is mostly GNU-make compatible. What are the differences?

The known differences are documented in the README file.

Does it perform well?

Not yet. pymake currently takes more time than GNU make. Almost all of this time is spent reading and parsing makefiles. Parsing autoconf.mk/config.mk/rules.mk from the Mozilla tree takes 0.3 seconds on my Linux machine, where GNU make takes 0.03 seconds. Dependency resolution and execution is as fast or faster than GNU make. I’d love to have some help profiling and fixing the parsing. Also, pymake does not yet support parallel execution.

Next week sometime, I promise to write about some of the difficulties I encountered writing pymake, and how test-driven development saved me from terrible pain and suffering.

Parsing Compiler Errors

Wednesday, December 3rd, 2008

Long ago, Mozilla had a tinderbox which would collate every warning produced by the Mozilla build and generate statistics and reports about them. I’m trying to re-create this tool.

When building Mozilla or most other large software projects that use GNU make, compiler warnings get sent to stdout (and sometimes stderr). The messages usually look something like this:

../../../dist/include/dom/nsIDOMXULSelectCntrlEl.h:33: warning: ‘virtual nsresult nsIDOMXULSelectControlElement::GetSelectedItem(nsIDOMXULSelectControlItemElement**)’ was hidden
../../../dist/include/dom/nsIDOMXULMultSelectCntrlEl.h:73: warning:   by ‘virtual nsresult nsIDOMXULMultiSelectControlElement::GetSelectedItem(PRInt32, nsIDOMXULSelectControlItemElement**)’

All of the file paths in the warning (or set of warnings in this case) are relative to the current working directory. Because the working directory changes during the build as make recurses through subdirectories, automatic parsers need some way to know what the working directory is at any point in the build log. Make provides the -w option which will print the following output every time it recurses into or leaves a directory:

make[1]: Entering directory ‘/builds/static-analysis-buildbot/slave/full/build’
make[1]: Leaving directory ‘/builds/static-analysis-buildbot/slave/full/build’

This is fine if you are only building in one directory at a time. But with the -j option, it is likely that make will be building in multiple directories at once. This will interleave output from multiple jobs in the same log, making it difficult for an automated parser to make any sense of them.

What I’d like is a tool or technique which will save the build log for each makefile command separately and combine them all at the end of a build.

Pre-emptive snarky comment: “switch Mozilla to use {scons,waf,cmake,ant,…}”