pymake news:
- Bad news: pymake is still 5x slower than GNU make on Linux/Mac.
- Good news: pymake is 25% faster than msys make (GNU make on Windows)!
- Best news: there’s a lot of room to make performance better.
All measurements are do-nothing depend builds. Full rebuilds aren’t significantly affected because compiler speed overwhelms any time we spend in make.
Creating Windows processes is more expensive than creating processes on a unix-like operating system. Creating MSYS processes is hugely more expensive. Windows I/O in general is slow compared to Linux, at least for typical build tasks. Because pymake recurses in a single process, caches parsed makefiles such as rules.mk, and avoids many shell invocations, it can make up for slow parsing times by dramatically reducing time spent elsewhere.
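As a rough illustration of why caching parsed makefiles pays off in a single-process recursive build, here is a minimal sketch. The MakefileCache class and the parse callable are hypothetical names chosen for this example, not pymake's actual API; the only assumption is that a parsed makefile can be reused as long as the file's modification time has not changed.

```python
import os

class MakefileCache:
    """Hypothetical cache: parse each makefile at most once per modification time."""

    def __init__(self, parse):
        self._parse = parse      # callable: absolute path -> parsed representation
        self._entries = {}       # absolute path -> (mtime, parsed representation)
        self.parse_count = 0     # how many times the parser actually ran

    def get(self, path):
        key = os.path.abspath(path)
        mtime = os.path.getmtime(key)
        cached = self._entries.get(key)
        if cached is not None and cached[0] == mtime:
            return cached[1]     # cache hit: no parsing, no new process
        parsed = self._parse(key)
        self.parse_count += 1
        self._entries[key] = (mtime, parsed)
        return parsed
```

In a recursive build that includes the same rules.mk from hundreds of directories, every call after the first would return the cached result.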
How to use pymake on Windows
Don’t use pymake with client.mk on Windows, yet. pymake doesn’t understand MSYS-style paths, which is what configure substitutes for @srcdir@ and @topsrcdir@ when using client.mk. This will be fixed by the patches available from this bug tree.
Configuring manually isn’t hard: to build Firefox in c:/builds, follow this recipe:
$ mkdir /c/builds
$ hg clone http://hg.mozilla.org/mozilla-central /c/builds/mozilla-central
$ cd /c/builds/mozilla-central
$ autoconf-2.13 && (cd js/src && autoconf-2.13)
$ mkdir ff-debug
$ cd ff-debug
$ export MAKE='python -O c:/builds/mozilla-central/build/pymake/make.py'
$ ../configure --enable-application=browser --enable-debug --disable-optimize
$ python -O ../build/pymake/make.py -j4
How to use pymake on Linux/Mac
Configure manually as above, or add the following flags to your mozconfig file:
export MAKE="python -O $topsrcdir/build/pymake/make.py"
mk_add_options MAKE="python -O @TOPSRCDIR@/build/pymake/make.py"
Soon on all platforms this will be as simple as mk_add_options MOZ_ENABLE_PYMAKE=1
Thank you!
Special thanks to Arpad Borsos who wrote tests and an implementation of --keep-going for pymake.
Next plans
The immediate plans for pymake are to reduce the process count even further, especially for depend builds:
Currently every invocation of nsinstall is a separate process, and we invoke nsinstall even when all its install targets are up to date. Simple tasks like this will instead be implemented as native python commands. Ted implemented a branch to do this, but the current implementation blocks the only thread. I think we’re going to switch and use shared-nothing threads and message passing to parallelize before making this the default behavior.
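To make the idea of a native Python command concrete, here is a minimal sketch of an in-process install step that does nothing when the installed copy is already current. The function name install_if_stale and the timestamp comparison are assumptions for illustration; this is not nsinstall's real behavior or the code on Ted's branch.

```python
import os
import shutil

def install_if_stale(src, destdir):
    """Copy src into destdir unless the installed copy is already up to date.

    Returns True if a copy was made, False if the target was left alone.
    """
    dest = os.path.join(destdir, os.path.basename(src))
    if os.path.exists(dest) and os.path.getmtime(dest) >= os.path.getmtime(src):
        return False             # up to date: no copy and no child process
    os.makedirs(destdir, exist_ok=True)
    shutil.copy2(src, dest)      # copy2 also copies the modification time
    return True
```

Because this runs inside the make process, an up-to-date target costs two stat calls instead of a process launch.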
Every time Mozilla processes a makefile, the build system combines all the compiler-generated dependencies into a single .all.pp file using mddepend.pl. This allows developers to move or remove header files without breaking depend builds. Running a Perl script for every makefile invocation is silly, especially because all it does is parse and rewrite makefile syntax. I will have pymake read these dependency files directly and ignore missing files (causing a rebuild without an error) using the syntax includedeps $(INCLUDEFILES)
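The intended behavior of includedeps can be sketched as follows. This is only an illustration under simplifying assumptions: read_deps and needs_rebuild are hypothetical helpers, dependency lines are assumed to look like "target: dep dep ...", and line continuations, escapes, and Windows drive letters in paths are not handled.

```python
import os

def read_deps(lines):
    """Parse simple 'target: dep dep ...' lines into a dict of dependency lists."""
    deps = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith('#') or ':' not in line:
            continue
        target, _, rest = line.partition(':')
        deps.setdefault(target.strip(), []).extend(rest.split())
    return deps

def needs_rebuild(target, deps, exists=os.path.exists, mtime=os.path.getmtime):
    """Decide whether target is stale; a missing dependency is not an error."""
    if not exists(target):
        return True
    for dep in deps.get(target, []):
        if not exists(dep):
            return True          # header was moved or removed: just rebuild
        if mtime(dep) > mtime(target):
            return True          # dependency is newer than the target
    return False
```

The exists and mtime parameters default to the real filesystem calls and are only there so the logic can be exercised without touching disk.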
Longer-term work that would make pymake much more useful:
- Build an object graph of the entire Mozilla tree recursively. I think I know how to do this, although there will be some issues with how to deal with local versus global variables.
- Warn and eventually force a more rigorous dependency graph: warn if a dependent file ‘appears’ without having a rule to create it.
- Make parsing a lot faster using mx.TextTools instead of Python's native regular expressions. Keep the regular expressions as a slow path for developers who don’t have TextTools installed.
Python Reference Cycles and Closures
While debugging pymake performance and memory usage I found an interesting fact, which in hindsight should have been obvious: Python functions that close over themselves create reference cycles, which have to be cleaned up by the Python garbage collector:
def outerFunction(outerCallback):
    # Placeholders: real targets are objects with a build(callback) method.
    targetsToBuild = [1, 2, 3]
    def innerCallback():
        if len(targetsToBuild):
            # innerCallback closes over itself, which creates a reference cycle every time outerFunction is called.
            # If outerFunction is called 100000 times per build, these cycles add up quickly and cause long GC pauses.
            targetsToBuild.pop(0).build(innerCallback)
        else:
            outerCallback()
    innerCallback()  # start the chain
After finding this problem, I refactored the pymake code to use objects instead of closures to save asynchronous state while rebuilding. Also, OptionParser instances create cycles by default. There is a lightly documented method, OptionParser.destroy, which can be used to manually break these cycles (thanks to Ted for finding it). pymake now runs without creating any reference cycles, and I disabled the Python garbage collector.
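The difference between the two styles can be observed with the gc module. The following is a minimal sketch, not pymake's code: Target, Builder, and cyclic_garbage are hypothetical names, and the targets finish building immediately so the example stays self-contained.

```python
import gc

class Target:
    """Hypothetical target whose build() completes at once, then calls back."""
    def build(self, callback):
        callback()

def build_with_closure(done):
    targets = [Target(), Target(), Target()]
    def step():
        if targets:
            targets.pop(0).build(step)  # step reaches itself through its closure cell
        else:
            done()
    step()

class Builder:
    """Same state machine, with the state held on an object instead of a closure."""
    def __init__(self, done):
        self.targets = [Target(), Target(), Target()]
        self.done = done

    def step(self):
        if self.targets:
            self.targets.pop(0).build(self.step)  # bound method is only a temporary
        else:
            self.done()

def build_with_object(done):
    Builder(done).step()

def cyclic_garbage(build, calls=1000):
    """Count objects that only the cycle collector could free after calling build."""
    was_enabled = gc.isenabled()
    gc.collect()       # start from a clean slate
    gc.disable()       # keep automatic collections out of the measurement
    try:
        for _ in range(calls):
            build(lambda: None)
        return gc.collect()
    finally:
        if was_enabled:
            gc.enable()
```

Calling cyclic_garbage(build_with_closure) should report a positive count, while cyclic_garbage(build_with_object) should report none, because plain reference counting frees each Builder as soon as its build finishes.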
Environment Munging in MSYS
When MSYS goes from an MSYS process to a Windows process, and vice-versa, it munges certain environment variables to account for the path styles. I previously thought that it only munged PATH, but I discovered today that I was wrong: MSYS was munging the MAKEFLAGS environment variable in odd ways.
If MAKEFLAGS in the MSYS process was ‘ -j1 -- PATH=e:/builds/mozilla-central’, it would be munged into ‘ -j1 -- PATH=e;c:/mozilla-build/msys/builds/mozilla-central’ in a non-MSYS process. Without the leading space the value was not touched. I don’t know why this is, but I altered the pymake code slightly so that MAKEFLAGS would never start with a space (and would be more compatible with gmake in the process).
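The workaround amounts to never emitting a leading space when assembling the value passed to child processes. A minimal sketch follows, assuming the flags and command-line variable overrides are kept as lists; child_makeflags is a hypothetical helper, not the function pymake uses, and the output format simply mirrors the example value above.

```python
def child_makeflags(flags, overrides):
    """Build a MAKEFLAGS value that never starts with whitespace.

    flags is a list such as ['-j1']; overrides is a list of (name, value) pairs.
    """
    parts = list(flags)
    if overrides:
        parts.append('--')
        parts.extend('%s=%s' % (name, value) for name, value in overrides)
    return ' '.join(parts).strip()
```

For example, child_makeflags(['-j1'], [('PATH', 'e:/builds/mozilla-central')]) yields a value beginning with -j1 rather than a space, which in the observation above was enough for MSYS to leave it untouched.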