Introducing Jydoop: Fast and Sane Map-Reduce

Tuesday, April 9th, 2013

Analyzing large data sets is hard, but it’s often way harder than it needs to be. Today, Taras Glek and I are unveiling a new data-analysis tool called jydoop. Jydoop is designed to allow engineers without any experience in HBase/Hadoop to write data analyses on their local machine and then deploy the analysis to our production Hadoop cluster. We want to enable every Mozilla engineer to use telemetry and crash-stats data as effectively as possible.

Goals

Jydoop started with three simple goals:

Enable fast prototyping/testing

Setting up hadoop/hbase is unreasonably difficult and time-consuming, and engineers should not need a hadoop/hbase setup in order to write an analysis. Because Jydoop analyses are written in Python, they can be developed and tested on a local machine without Hadoop or even Java.

Don’t hide map/reduce

In a query language like pig, it is hard to know exactly how your query will perform: which tasks will run on the map nodes, which will run on the reducers, and which must be run on the final reduced data. Jydoop doesn’t hide those details: each analysis has simple map/combine/reduce functions so that engineers know how a clustered query is going to behave.

Be fast

Performance of clustered queries should be as good as or better than that of our existing solutions built on pig or other tools.

Prototyping

Telemetry data takes the form of a single blob of JSON which is stored in an hbase table. A telemetry ping may be larger than 100kb, and we typically receive about 2 million telemetry pings per day. In order to allow engineers to test an analysis, we save off a small sample of reports (5000 or so) in a single file. Then analysis scripts can be written and tested against the sample.

The simplest job is one which simply acts as a filter. The following script is used to select all telemetry records which have an “androidANR” key and save them to a file:

import telemetryutils

# Ask telemetryutils to set up our query by date range using the correct hbase tables
setupjob = telemetryutils.setupjob

def map(key, value, context):
    if value.find("androidANR") == -1:
        return

    context.write(key, value)

To test this script against sample data, run python FileDriver.py scripts/anr.py sampledata.txt.
To run this script against a single day of telemetry data, run make ARGS='scripts/anr.py anrresults-20130308.txt 20130308 20130308' hadoop.

More complex analyses are also possible. The following script will calculate the bucketed distribution of the size of the telemetry ping:

import telemetryutils
import jydoop
import json

setupjob = telemetryutils.setupjob

kBucketSize = float(0x400)

def map(key, value, context):
    j = json.loads(value)
    channel = j['info'].get('appUpdateChannel', None)

    # if .info.appUpdateChannel is not present, we don't care about this record
    if channel is None:
        return # Don't care about this record

    bucket = int(round(len(value) / kBucketSize))
    context.write((channel, bucket), 1)

combine = jydoop.sumreducer
reduce = jydoop.sumreducer
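
For reference, a sum reducer in this model simply totals the partial counts emitted for each key. Conceptually it looks something like the following sketch (the real jydoop.sumreducer may differ in its exact signature and details):

def sumreducer(key, values, context):
    # "values" holds the partial counts written for this (channel, bucket)
    # key by the mappers and combiners; emit their total
    context.write(key, sum(values))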

I was then able to produce this chart with the result data and a little JS:

[Chart: bucketed distribution of telemetry ping sizes by channel, 2013-03-08]

History and Alternatives

The native language for Hadoop map/reduce jobs is Java. Writing an analysis directly in Java requires careful construction of a class hierarchy and is extremely tedious. Even more annoying, it’s basically impossible to test an analysis without having access to a hadoop/hbase environment. Even if you use something like the cloudera test VMs, setting up hbase and mapreduce jobs on test data is onerous.

It is a venerable Hadoop tradition to build wrapper tools around mapreduce which run distributed queries. There are tools such as Hadoop Streaming which allow map/reduce to forward to arbitrary programs via pipes. There are at least five different tools which already combine Hadoop with python. Unfortunately, none of these tools met the performance and prototyping requirements for Mozilla’s needs.

At Mozilla, we currently mostly use pig, a query language and execution tool which transforms a series of query and filter statements into one or more map/reduce jobs. In theory, pig has an execution mode which can prototype a query on local data. In practice, this is also very difficult to use, and so most people prototype their pig scripts directly on the production cluster. It works after a while, but it’s not especially friendly.

Performance

The key difference between Jydoop and the existing Python alternatives is that we use Jython to run the python code directly within the map and reduce jobs, rather than shelling out and communicating via pipes. This allows for much tighter integration with hadoop and also allows us to control the execution environment.

For maximum performance, we use a custom key/value class which serializes and compares python objects. For performance and sanity, this class operates on a limited subset of Python types: analysis scripts are limited to using only None, integers, floats, strings, and tuples of these basic types as their key and values.
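
As a quick illustration of that restriction (example values only, not jydoop API calls):

# keys and values the custom serializer handles
ok_key = ("release", 17, 3.5)        # tuple of strings, ints, and floats
ok_value = None                      # None is allowed too

# keys and values it deliberately rejects
bad_key = {"channel": "release"}     # dicts are outside the supported subset
bad_value = ("release", [1, 2])      # so are lists, even nested in tuples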

During the process of developing Jydoop, we also discovered that JSON parsing performance was a significant factor in the overall job speed. The performance of the `json` module which is built into jython 2.7 is horrible. We got better but not great performance using jyson as a drop-in JSON replacement. Eventually we wrote a complete JSON replacement which wraps the standard Jackson streaming API. With these improvements in place, it appears that jydoop performance beats pig performance on equivalent tasks.

Conclusion

Jydoop is now available on github.

We hope that jydoop will make it possible for many more people to work with telemetry, crash, and healthreport data. If you have any issues or questions about using jydoop with Mozilla data, please feel free to ask questions in mozilla.dev.platform. If you are using jydoop for other projects, feel free to file github issues, submit github PRs, or email me directly and let me know what you’re up to!

pymake: 25% faster than msys make

Thursday, April 2nd, 2009

pymake news:

  • Bad news: pymake is still 5x slower than GNU make on Linux/Mac.
  • Good news: pymake is 25% faster than msys make (GNU make on Windows)!
  • Best news: there’s a lot of room to make performance better.

All measurements are do-nothing depend builds. Full rebuilds aren’t significantly affected because compiler speed overwhelms any time we spend in make.

Creating Windows processes is more expensive than creating processes on a unix-like operating system. Creating MSYS processes is hugely more expensive. Windows I/O in general is slow compared to Linux, at least for typical build tasks. Because pymake recurses in a single process, caches parsed makefiles such as rules.mk, and avoids many shell invocations, it can make up for slow parsing times by dramatically reducing time spent elsewhere.

How to use pymake on Windows

Don’t use pymake with client.mk on Windows, yet. pymake doesn’t understand MSYS-style paths, which is what configure substitutes for @srcdir@ and @topsrcdir@ when using client.mk. This will be fixed by the patches available from this bug tree.

Configuring manually isn’t hard: to build Firefox in c:/builds, follow this recipe:

$ mkdir /c/builds
$ hg clone http://hg.mozilla.org/mozilla-central /c/builds/mozilla-central
$ cd /c/builds/mozilla-central
$ autoconf-2.13 && (cd js/src && autoconf-2.13)
$ mkdir ff-debug
$ cd ff-debug
$ export MAKE='python -O c:/builds/mozilla-central/build/pymake/make.py'
$ ../configure --enable-application=browser --enable-debug --disable-optimize
$ python -O ../build/pymake/make.py -j4

How to use pymake on Linux/Mac

Configure manually as above, or add the following flags to your mozconfig file:

export MAKE="python -O $topsrcdir/build/pymake/make.py"
mk_add_options MAKE="python -O @TOPSRCDIR@/build/pymake/make.py"

Soon on all platforms this will be as simple as mk_add_options MOZ_ENABLE_PYMAKE=1

Thank you!

Special thanks to Arpad Borsos who wrote tests and an implementation of --keep-going for pymake.

Next plans

Immediate future plans for pymake are to reduce the process count even further, especially for depend builds:

Currently every invocation of nsinstall is a separate process, and we invoke nsinstall even when all its install targets are up to date. Simple tasks like this will instead be implemented as native python commands. Ted implemented a branch to do this, but the current implementation blocks the only thread. I think we’re going to switch and use shared-nothing threads and message passing to parallelize before making this the default behavior.

Every time Mozilla processes a makefile, the build system combines all the compiler-generated dependencies into a single .all.pp file using mddepend.pl: this allows developers to move or remove header files without breaking depend builds. Running a perl script for every makefile invocation is silly, especially because all it does is parse and rewrite makefile syntax. I will have pymake read these dependency files directly and ignore missing files (causing a rebuild without an error) using a new includedeps $(INCLUDEFILES) syntax.
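
For illustration, here is a rough sketch of what reading one of those dependency files directly might look like (a hypothetical helper, not the actual pymake patch): parse the "target: prerequisites" lines and, when a recorded prerequisite no longer exists, mark the target for rebuilding instead of raising an error.

import os

def readdepfile(path, depends, dirty):
    """depends: dict mapping target -> list of existing prerequisites.
    dirty: set of targets that must be rebuilt because a recorded
    prerequisite (e.g. a removed header) has disappeared."""
    text = open(path).read().replace('\\\n', ' ')   # join line continuations
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue
        target, sep, prereqs = line.partition(':')
        if not sep:
            continue
        target = target.strip()
        for prereq in prereqs.split():
            if os.path.exists(prereq):
                depends.setdefault(target, []).append(prereq)
            else:
                # missing prerequisite: force a rebuild rather than dying
                # with "No rule to make target ..."
                dirty.add(target)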

Longer-term work that would make pymake much more useful:

  • Build an object graph of the entire Mozilla tree recursively. I think I know how to do this, although there will be some issues with how to deal with local versus global variables.
  • Warn and eventually force a more rigorous dependency graph: warn if a dependent file ‘appears’ without having a rule to create it.
  • Make parsing a lot faster using mx.TextTools instead of native python regular expressions. Keep the regular expressions as a slow path for developers who don’t have TextTools installed.

Python Reference Cycles and Closures

While debugging pymake performance and memory usage I found an interesting fact, which in hindsight should have been obvious: functions which close over themselves in Python create reference cycles which have to be cleaned up by the Python garbage collector:

class Target(object):
  def build(self, callback):
    # stand-in for an asynchronous build step that calls back when done
    callback()

def outerFunction(outerCallback):
  targetsToBuild = [Target(), Target(), Target()]
  def innerCallback():
    if len(targetsToBuild):
      # innerCallback closes over itself... this creates a reference cycle every time you call outerFunction
      # if you call outerFunction 100000 times per build, this can add up really quickly and cause large GC pauses
      targetsToBuild.pop(0).build(innerCallback)
    else:
      outerCallback()
  innerCallback()  # kick off the chain

After finding this problem, I refactored (1, 2, 3) the pymake code to use objects instead of closures to save asynchronous state while rebuilding. Also, OptionParser instances create cycles by default. There is a lightly-documented method OptionParser.destroy which can be used to manually break these cycles (thanks to Ted for finding it). pymake now runs without creating any reference cycles and I disabled the python garbage collector.
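
A quick way to see these cycles (and later to confirm they are gone) is to ask the collector directly. This is just a diagnostic sketch reusing the outerFunction example above, not anything pymake ships:

import gc

gc.disable()                  # what pymake does once it is cycle-free
gc.collect()                  # start from a clean slate
for _ in range(1000):
    outerFunction(lambda: None)
print(gc.collect())           # nonzero: each call left cyclic garbage behind

If the refactored code really creates no cycles, that final gc.collect() returns 0, which is the property that makes disabling the collector safe.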

Environment Munging in MSYS

When MSYS goes from an MSYS process to a Windows process, and vice-versa, it munges certain environment variables to account for the path styles. I previously thought that it only munged PATH, but I discovered today that I was wrong: MSYS was munging the MAKEFLAGS environment variable in odd ways.

If MAKEFLAGS in the MSYS process was ‘ -j1 -- PATH=e:/builds/mozilla-central’ it would be munged into ‘ -j1 -- PATH=e;c:/mozilla-build/msys/builds/mozilla-central’ in a non-MSYS process. Without the leading space the value was not touched. I don’t know why this is, but I altered the pymake code slightly so that MAKEFLAGS would never start with a space (and would be more compatible with gmake in the process).

Performance and Pymake

Thursday, February 26th, 2009

pymake runs correctly now. With some Mozilla patches, I can get it to recurse through most of the tree within a single process, doing parallel builds correctly, on Windows, Linux, and Mac, with Python 2.4 and 2.5.

6x as slow

Performance is bad. On a benchmark of a do-nothing rebuild of make tier_xpcom, which is about 40 recursive makes:

bsmedberg $ time make tier_xpcom
real	0m1.129s

bsmedberg $ time python -O /builds/pymake/make.py tier_xpcom
real	0m7.525s

This is a bit depressing, because improved Windows performance was one of the primary reasons for doing pymake. It’s possible that the Windows numbers are better, given that launching Windows processes is so much more expensive than on Linux: but I doubt that will make up the whole 6x performance loss.

Improved 50% today!

Earlier today, the number was about 15 seconds. Profile-guided patches helped reduce execution time significantly!

The first optimization targeted string-joining performance. str.join showed up near the top of profiles both in call counts and own-time. Expansion objects are a core data structure in pymake: the parser turns a string such as “Hello$(COMMA), $(subst a,o,warld)!” into a list of bare strings and function calls (a variable expansion is treated as a function). An expansion is “resolved” to a string by passing it variables and a makefile context. The original, naive way of resolving an Expansion used ''.join() on the elements. This was replaced, in two phases, with a system of iterators which recursively yield strings, and an itersplit method, which splits an expansion into words without even joining it into a single string. A final optimization replaced ''.join entirely: it was better, in 99% of cases, to use simple appending methods when few elements are being joined.
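
To make the shape of that change concrete, here is a toy sketch (invented names, far simpler than pymake's real Expansion) of iterator-based resolution and word splitting:

class Expansion(object):
    """A parsed value: a list of literal strings and callables, where each
    callable stands in for a variable reference or function call and returns
    an iterable of string fragments when given the current variables."""

    def __init__(self, elements):
        self.elements = elements

    def iterresolve(self, variables):
        # recursively yield fragments instead of building one big string
        for e in self.elements:
            if isinstance(e, str):
                yield e
            else:
                for fragment in e(variables):
                    yield fragment

    def resolve(self, variables):
        # the eager form is now just a join over the iterator
        return ''.join(self.iterresolve(variables))

    def itersplit(self, variables):
        # split into space-separated words as fragments stream by, without
        # ever joining the whole expansion (real makefile word-splitting
        # also handles tabs and repeated whitespace)
        pending = ''
        for fragment in self.iterresolve(variables):
            words = (pending + fragment).split(' ')
            pending = words.pop()
            for word in words:
                if word:
                    yield word
        if pending:
            yield pending

# usage sketch: the parser might turn "Hello, $(subst a,o,warld)!" into
exp = Expansion(['Hello, ', lambda v: ['world'], '!'])
print(exp.resolve({}))            # Hello, world!
print(list(exp.itersplit({})))    # ['Hello,', 'world!']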

Another optimization avoids parsing makefile syntax into expansions until it’s actually needed. In many cases, makefiles will use a small set of variables many times, and will never read the value of other variables. The first candidate optimization had pymake parse variables as they were set; a much better solution later was to lazily parse variables the first time they were read.
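
A tiny sketch of that idea (again with made-up names): remember the raw text when a variable is set, and only parse it the first time it is read.

class LazyVariables(object):
    """Toy illustration, not pymake's real variable storage."""

    def __init__(self, parsefn):
        self._parsefn = parsefn    # the real code would build an Expansion here
        self._raw = {}             # name -> unparsed makefile text
        self._parsed = {}          # name -> parsed value, filled in on demand

    def set(self, name, text):
        # setting is cheap: just remember the text
        self._raw[name] = text
        self._parsed.pop(name, None)

    def get(self, name):
        # parse on first read and cache; variables that are never read
        # never pay their parsing cost
        if name not in self._parsed:
            self._parsed[name] = self._parsefn(self._raw[name])
        return self._parsed[name]

v = LazyVariables(parsefn=lambda text: text.upper())   # stand-in for real parsing
v.set('CFLAGS', '-g $(warnings)')
v.set('UNUSED', '$(a huge $(nested) value)')           # never parsed below
print(v.get('CFLAGS'))                                 # -G $(WARNINGS)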

A grab-bag of other optimizations improved performance by a bit, but the last attempt increased code complexity far more than performance.

Hitting a Performance Barrier

At the moment I think pymake has hit a performance barrier and I’m not sure how to proceed. The current profile of pymake, generated with cProfile, is mostly unhelpful:

7228961 function calls (6902795 primitive calls) in 10.934 CPU seconds

Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    15529    0.783    0.000    2.059    0.000 /builds/pymake/pymake/parser.py:689(parsemakesyntax)
49054/35478    0.555    0.000    2.320    0.000 /builds/pymake/pymake/util.py:24(itersplit)
      128    0.396    0.003    0.396    0.003 {posix.read}
466085/222572    0.356    0.000    2.653    0.000 /builds/pymake/pymake/data.py:192(resolve)
    51384    0.289    0.000    1.491    0.000 /builds/pymake/pymake/parserdata.py:214(execute)
    13876    0.288    0.000    1.007    0.000 /builds/pymake/pymake/data.py:684(resolvevpath)
    29268    0.280    0.000    0.280    0.000 {posix.stat}
   171027    0.266    0.000    0.471    0.000 /builds/pymake/pymake/data.py:430(match)
    25700    0.246    0.000    0.327    0.000 /builds/pymake/pymake/data.py:384(__init__)
    40350    0.223    0.000    0.223    0.000 /usr/lib64/python2.5/logging/__init__.py:1158(getEffectiveLevel)
    58982    0.213    0.000    0.329    0.000 /builds/pymake/pymake/data.py:312(set)
43854/42319    0.211    0.000    1.572    0.000 /builds/pymake/pymake/functions.py:56(resolve)
131959/117714    0.207    0.000    1.343    0.000 /builds/pymake/pymake/data.py:258(get)
     2130    0.194    0.000    2.281    0.001 /builds/pymake/pymake/data.py:542(resolveimplicitrule)
    47515    0.189    0.000    0.421    0.000 /builds/pymake/pymake/data.py:1204(gettarget)
      128    0.182    0.001    0.182    0.001 {posix.fork}
     7717    0.174    0.000    1.941    0.000 /builds/pymake/pymake/parserdata.py:117(execute)
    57298    0.173    0.000    0.255    0.000 /usr/lib64/python2.5/posixpath.py:56(join)
    73798    0.165    0.000    0.628    0.000 /builds/pymake/pymake/data.py:1103(matchesfor)
    46953    0.157    0.000    0.184    0.000 /builds/pymake/pymake/parser.py:176(iterdata)
1153401/1150418    0.156    0.000    0.158    0.000 {len}
    27900    0.156    0.000    0.163    0.000 /builds/pymake/pymake/data.py:67(__init__)
11264/168    0.148    0.000    6.120    0.036 /builds/pymake/pymake/parserdata.py:431(execute)
37008/23960    0.141    0.000    0.176    0.000 /builds/pymake/pymake/parser.py:193(itermakefilechars)
   330817    0.135    0.000    0.135    0.000 {method 'startswith' of 'str' objects}

parsemakesyntax, the function which parses $(FOO) $(BAR) into an Expansion, is still the single most time-consuming function. But since I don’t have line-by-line heatmaps, it’s hard to know what parts of that function might be inefficient. The callee data is not much help:

                                                          ncalls  tottime  cumtime
/builds/pymake/pymake/parser.py:689(parsemakesyntax)  ->   27666    0.017    0.017  /builds/pymake/pymake/data.py:111(__init__)
                                                             233    0.000    0.001  /builds/pymake/pymake/data.py:116(fromstring)
                                                           33408    0.059    0.097  /builds/pymake/pymake/data.py:145(appendstr)
                                                           10219    0.014    0.020  /builds/pymake/pymake/data.py:156(appendfunc)
                                                           27666    0.084    0.212  /builds/pymake/pymake/data.py:183(finish)
                                                            1474    0.002    0.002  /builds/pymake/pymake/functions.py:24(__init__)
                                                            1271    0.002    0.002  /builds/pymake/pymake/functions.py:32(setup)
                                                            2765    0.005    0.007  /builds/pymake/pymake/functions.py:40(append)
                                                            8315    0.018    0.022  /builds/pymake/pymake/functions.py:48(__init__)
                                                             430    0.001    0.001  /builds/pymake/pymake/functions.py:70(__init__)
                                                             203    0.001    0.002  /builds/pymake/pymake/functions.py:380(setup)
                                                           25800    0.106    0.346  /builds/pymake/pymake/parser.py:73(getloc)
                                                            9986    0.068    0.072  /builds/pymake/pymake/parser.py:97(findtoken)
                                                           27239    0.035    0.072  /builds/pymake/pymake/parser.py:155(get)
                                                           46953    0.157    0.184  /builds/pymake/pymake/parser.py:176(iterdata)
                                                           15245    0.071    0.133  /builds/pymake/pymake/parser.py:193(itermakefilechars)
                                                            6440    0.026    0.033  /builds/pymake/pymake/parser.py:247(itercommandchars)
                                                           25515    0.032    0.032  /builds/pymake/pymake/parser.py:673(__init__)
                                                           15529    0.003    0.003  {callable}
                                                           28565    0.008    0.010  {len}
                                                            9986    0.003    0.003  {method 'append' of 'list' objects}
                                                               1    0.000    0.000  {method 'iterkeys' of 'dict' objects}
                                                            9986    0.005    0.005  {method 'pop' of 'list' objects}

Yes, I know getloc is inefficient, and a commenter on one of my previous posts suggested a possible solution. But that’s not going to create any significant improvement. In order to have performance parity with GNU make there has to be an algorithmic improvement.

Can I trust cProfile?

There are some confusing aspects to the cProfile output which make me suspect it. In particular, I suspect that generator functions are not being accounted for correctly: the primary work of the iterdata function is to call match on a compiled regular expression object, but that method doesn’t even show up in the callee list:

                                                   ncalls  tottime  cumtime
/builds/pymake/pymake/parser.py:176(iterdata)  ->   17815    0.004    0.004  {built-in method end}
                                                    17815    0.004    0.004  {built-in method group}
                                                    35630    0.014    0.014  {built-in method start}
                                                    25222    0.005    0.005  {len}

In any case, it’s hard to usefully analyze the profiling output. What I want is a Shark-like hierarchical profile. Apparently, dtrace can profile Python code, but I haven’t seen any useful visualization/analysis tools for that combination: if anyone knows of something, please let me know!

Help Wanted

I’ve been head-down in pymake for a few weeks now; my review queue and several other Mozilla tasks need attention. I’d really love some help from people who know Python performance (or who have no fear). If you’re interested and want some guidance, please e-mail me or leave a comment here. We can probably construct some useful pymake performance tests that are not as scary as actually building Mozilla.

Sanity and Testcases for pymake

Monday, February 23rd, 2009

Testcases kept me sane writing (and rewriting) pymake. This shouldn’t be a surprise to experienced developers: most developers agree that test-driven development is good. Often, however, beginning programmers don’t know how to start a project with adequate testing. This post attempts to describe the pymake test environment and give examples of pymake tests.

I started pymake with fear and trepidation. I’ve been working extensively with makefiles for 6 years; makefile parsing and execution still occasionally surprises me. This fear was a great motivator: if I had thought this to be an easy job, I might have skipped writing tests until much later in the process. But the testsuite has been absolutely essential: I doubt I could have completed initial development in two weeks without it, and there is no way I could have refactored the code to support in-process recursion and parallel make this week without it.

Start Small

The most important hurdle in a new project is creating a framework to run tests. The requirements for a test framework are pretty simple:

  • make it easy to write new tests;
  • make it easy to run the tests;
  • don’t waste time writing fancy test apparatus.

The specifics of your test framework will depend on your project. When possible, re-use existing frameworks, or at least borrow extensively from them. For pymake, I use two basic types of test: makefile tests and python unit tests.

Makefile Tests

Because the entire purpose of pymake is to parse and execute makefiles, pymake has a test harness for parsing and executing makefiles. This test harness runs make against a testcase makefile; parsing and executing the makefile should complete successfully (with a 0 exit code) and print TEST-PASS somewhere during execution. Typically, each makefile will test a single feature or a related set of features.

This test harness is particularly important because pymake is supposed to be a mostly drop-in replacement for GNU make. This test harness can be used to test both GNU make and pymake. The harness was committed in revision 1 of the pymake repository, long before pymake could parse makefiles. The first tests were tests of GNU make behavior, in cases where that behavior was under-documented or confusing. Before I started implementing the meat of the parser, I already had discovered several interesting behaviors and written tests for them.

tchaikovsky:/builds/pymake $ python tests/runtests.py # run the testsuite using GNU make
tchaikovsky:/builds/pymake $ python tests/runtests.py -m /builds/pymake/make.py # run the testsuite using make.py

As the project became basically functional, each new feature was committed with a test. See, for instance, a fix for parsing line continuations with conditional blocks.

Initially, the makefile test harness only checked for success. But an important part of most test suites is to check for proper error handling. runtests.py grew additional features to allow a makefile to specify that it should return a failure error code, and also to specify a command line. It also ran each test in a clean directory, to avoid unexpected interactions between tests.

Writing makefile testcases often required creativity. It’s often important to check that commands are executed in a specified order, or that a particular command is only executed once. One technique is to append output to a signal file while running commands, and then test the contents of the file (tests/diamond-deps.mk):

# If the dependency graph includes a diamond dependency, we should only remake
# once!

all: depA depB
	cat testfile
	test `cat testfile` = "data";
	@echo TEST-PASS

depA: testfile
depB: testfile

testfile:
	printf "data" >>$@

This same technique is also useful to make sure that parallel execution is enabled or disabled appropriately: tests/parallel-toserial.mk.

Python Unit Tests

In the early stages of pymake, only some portions of the data model and parser were implemented: there were lots of low-level functions for location-tracking, line continuations, and tokenizing. It was important to me that these low-level functions were rock-solid before I started attempting to glue them together.

The python standard library includes the unittest module, which is a simple framework for creating and running a test suite.

import unittest
class MyTest(unittest.TestCase):
  # any function named test* will be run as a single test case
  def test_arrayindex(self):
    self.assertEqual([1, 2, 3][0], 1)

if __name__ == '__main__':
  unittest.main()

pymake uses the unittest module to test the data model and parser: tests/datatests.py and tests/parsertests.py.

One annoying limitation of the unittest module is that it is difficult to construct a set of test cases that run the same test code on different input data. To solve this problem, I wrote a multitest helper function. The developer writes a class with a testdata dictionary and a runSingle method, and multitest will create a test function for each element in the test data:

def multitest(cls):
    for name in cls.testdata.iterkeys():
        def m(self, name=name):
            return self.runSingle(*self.testdata[name])

        setattr(cls, 'test_%s' % name, m)
    return cls

class TokenTest(TestBase):
    testdata = {
        'wsmatch': ('  ifdef FOO', 2, ('ifdef', 'else'), True, 'ifdef', 8),
        'wsnomatch': ('  unexpected FOO', 2, ('ifdef', 'else'), True, None, 2),
        'wsnows': ('  ifdefFOO', 2, ('ifdef', 'else'), True, None, 2),
        'paren': (' "hello"', 1, ('(', "'", '"'), False, '"', 2),
        }

    def runSingle(self, s, start, tlist, needws, etoken, eoffset):
        d = pymake.parser.Data.fromstring(s, None)
        tl = pymake.parser.TokenList.get(tlist)
        atoken, aoffset = d.findtoken(start, tl, needws)
        self.assertEqual(atoken, etoken)
        self.assertEqual(aoffset, eoffset)
multitest(TokenTest)

Tests Allow For Simple Refactoring

Every project I’ve worked on has had to refactor code after it was first written. Sometimes you know you’ll have to refactor code in the future. Other times, you discover the need to refactor code well after you’ve started writing it. In either case, the test suite can allow you to perform large-scale refactoring tasks with confidence. Two examples will help explain how refactoring was important:

Makefile Variable Value Representation

VAR = $(OTHER) $(function arg1,arg2)

Makefiles have two different “flavors” of variables, recursive and simple. When I first started pymake, I decided to parse recursive variable declarations “immediately” into an Expansion object. This worked well, and it made reporting the locations of parse errors easy.

Unfortunately, there is a case where you cannot parse a variable value immediately:

VAR = $(function
VAR += arg1,
VAR += arg2
VAR += )

In this case, VAR cannot be parsed until it has been fully constructed. Fixing this case involved changing the entire data model of variable storage:

Revision 64 (348f682e3943)

Adding a makefile test for the failing case.

Revision 67 (63531e755f52)

Refactoring variable storage to account for dynamically-composed variables.

Independent Parsing Model

Because parsing doesn’t perform very well, it’s good to optimize it away when possible. The original parsing code went through each makefile line by line and inserted rules, commands, and variables into the makefile data structure immediately. This makes it difficult or impossible to save the parsed structure and re-use it. On Friday I refactored the parser into two phases. The first phase creates a hierarchical parsing model independent of any particular makefile. The second phase executes the parsing model in the context of the variables of a particular Makefile.
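
To make the two phases concrete, here is a toy sketch (invented classes, much simpler than pymake's real parser data): the parsed model is a list of statement objects that can be executed repeatedly, and conditionals are only evaluated at execution time.

class SetVariable(object):
    def __init__(self, name, value):
        self.name, self.value = name, value

    def execute(self, makefile):
        makefile.variables[self.name] = self.value

class ConditionBlock(object):
    def __init__(self, condition, body):
        self.condition, self.body = condition, body

    def execute(self, makefile):
        # conditionals are resolved here, not at parse time, so one cached
        # model works for makefiles that define different variables
        if self.condition(makefile):
            for statement in self.body:
                statement.execute(makefile)

class Makefile(object):
    def __init__(self):
        self.variables = {}

# phase one: parse once into a reusable model (hand-written here)
model = [
    SetVariable('MOZ_DEBUG', '1'),
    ConditionBlock(lambda m: 'MOZ_DEBUG' in m.variables,
                   [SetVariable('CFLAGS', '-g')]),
]

# phase two: execute the model against a particular Makefile instance
mf = Makefile()
for statement in model:
    statement.execute(mf)
print(mf.variables['CFLAGS'])   # -g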

After first implementing this change, I found one serious error: I was associating commands with rules without considering conditionals such as ifdefs:

all:
	command1
ifdef FOO
	command2
else
	command3
endif

Fortunately, tests/ifdefs.mk was already in the testsuite, and detected this error. Fixing it required reworking the parsing model with an extra execution context to correctly associate commands with their parent rules.

Secondly, after committing the parsing model, I found an additional regression when building Mozilla: the behavior of “ifndef” was reversed due to an early return statement. I was able to add an additional test and a simple fix, once I figured out what was going on.

pymake status

pymake features implemented since last week:

  • Implement $(eval)
  • 122:1995f94b1c2f: Implement the vpath directive
  • 123:17169ca68e03: Implement automatic wildcard expansion in targets and prerequisites. I hate this, but NSS uses it, and I hate NSS more.
  • 135:fcb8d4ddd21b: Run submakes within the same process if possible
  • parallel-execution branch: Parallel execution of commands (-jN)
  • 156:3ae1e58c1a25: Cache parser models (avoid reparsing rules.mk)
  • win32-msys branch: Ted has pymake working on Windows. It doesn’t build Mozilla yet because we leak MSYS paths into makefiles, but that shouldn’t be hard to fix.

pymake: A Mostly GNU-compatible `make`, Written in Python

Friday, February 13th, 2009

Mozilla has a lot of Makefiles, and although we’re not completely happy with our current recursive-make build system, converting to something else is a very difficult task. In order to consider other solutions, we have to have a seamless migration path from our existing infrastructure. Unfortunately for any potential migration, Mozilla uses pretty much every make feature and quirk imaginable. It isn’t sufficient to get by with some simplistic importer: any migration needs to understand the arcane details of makefile syntax and parsing rules.

So finally, after two years of considering whether it’s feasible and a good idea, I bit the bullet and wrote a make replacement in python. It was almost as mind-numbingly difficult as I thought it would be. But after two solid weeks, I have a tool which will mostly build Mozilla, and seems to be doing it correctly.

Why would you do such a thing?

Mark Pilgrim was right: only a crazy person says “I’m going to build my own build system”. So rather than creating a whole new build system with a new syntax, I wanted to start out replicating an existing build system exactly. Then we can make incremental changes if appropriate.

What kind of incremental changes?

First up are speed increases. Our Windows builds are currently very slow, in part due to the large number of processes that are spawned. There are several ways to attack this:

  • Each makefile command currently spawns an MSYS shell and then (usually) the actual binary. The MSYS shell is expensive and in most cases unnecessary. pymake should be able to skip the shell for simple command lines.
  • Mozilla runs `nsinstall` a lot. Axel has already implemented nsinstall as a python module: our makefile rules should be able to run commands in python and skip the external process.
  • We can use recursive make without launching additional make processes by recursing within the same pymake process.

Where can I see this monstrosity?

The pymake website. You can also just pull or browse the pymake sources directly from hg.mozilla.org using Mercurial.

Why don’t you just hack GNU make itself?

There are some improvements we’re interested in doing which just aren’t possible within a C codebase:

  • implementing rules directly in Python
  • Condensing multiple invocations of make into a single process

Python is also about a zillion times easier to hack quickly.

pymake is mostly GNU-make compatible. What are the differences?

The known differences are documented in the README file.

Does it perform well?

Not yet. pymake currently takes more time than GNU make. Almost all of this time is spent reading and parsing makefiles. Parsing autoconf.mk/config.mk/rules.mk from the Mozilla tree takes 0.3 seconds on my Linux machine, where GNU make takes 0.03 seconds. Dependency resolution and execution is as fast or faster than GNU make. I’d love to have some help profiling and fixing the parsing. Also, pymake does not yet support parallel execution.

Next week sometime, I promise to write about some of the difficulties I encountered writing pymake, and how test-driven development saved me from terrible pain and suffering.

PUTting and DELETEing in python urllib2

Tuesday, October 21st, 2008

The urllib2 Python module makes it pretty simple to GET and POST data using HTTP (and other protocols). But there isn’t a good built-in way to issue HTTP PUT or DELETE requests. I ran into this limitation while working on a project to upload automatically generated documentation to the Mozilla Developer Center. The DekiWiki API for uploading a file attachment uses the HTTP PUT method.

It turns out there is an easy workaround. You can subclass the urllib2.Request class and explicitly override the method:

import urllib2

class RequestWithMethod(urllib2.Request):
  def __init__(self, method, *args, **kwargs):
    self._method = method
    # self must be passed explicitly when calling the base class this way
    urllib2.Request.__init__(self, *args, **kwargs)

  def get_method(self):
    # urllib2 calls get_method() to decide which HTTP verb to send;
    # report ours instead of the default GET/POST
    return self._method
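
Hypothetical usage (placeholder URL and payload, plus the class above and plain urllib2) would then look like:

req = RequestWithMethod('PUT', 'http://wiki.example.com/api/files/docs.zip',
                        data=open('generated-docs.zip', 'rb').read())
req.add_header('Content-Type', 'application/zip')
response = urllib2.urlopen(req)
print(response.code)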

Preview for Thursday’s post: the generated documentation is already online.

Parsing IDL in Python

Tuesday, October 7th, 2008

One of the current pain points in our build system is the xpidl compiler. This is a binary tool which is used to generate C++ headers and XPT files from XPIDL input files. The code depends on libIDL, which in turn depends on glib. Because this tool is a build-time requirement, we have to build a host version, and in most cases we also build a target version to put in the SDK package.

Getting glib and libidl on linux systems is not very difficult: all the major distros have developer packages for them. But getting libidl and glib on Windows and Mac can be quite painful. On Windows, we have had to create our own custom static library versions of this code which are compatible with all the different versions of Microsoft Visual C++. On Mac you can get them from macports, but as far as I know they are not available as universal binaries, which means that you can’t cross-compile a target xpidl.

Parsing IDL, while not trivial, is not so complicated that it requires huge binary code libraries. So a while back I reimplemented the XPIDL parser using python and the PLY (python lex-yacc) parsing library. The core parsing grammar and object model is only 1200 lines of code.

Because we don’t have any unit tests for xpidl, I chose to use A-B testing against the output of the binary xpidl: the header output of the python xpidl should match byte-for-byte the header output of the binary xpidl. I wrote a myrules.mk file which would automatically build and compare both versions during a build. This turned out to be a royal pain, because the libIDL parser is not very consistent and has bugs:

  • Some, but not all attributes are re-ordered so that they are no longer in original order, but are ordered according to an internal glib hash function.
  • The code which associates doc-comments with IDL items is buggy: in many cases the comment is associated with a later item.

I had to add some temporary nasty hacks in order to work around these issues. And finally, reproducing the wacky whitespace of the binary tool wasn’t worthwhile, so I started comparing the results using diff -w -B. But with these hacks and changes, both xpidl compilers produce identical C++ headers.

I completed the code to produce C++ headers during a couple of not-quite-vacation days, but I didn’t write any code to produce XPT files. I shelved the project as an attractive waste of time, until jorendorff needed an IDL parser to produce quick stub C++ code. Jason managed to take my existing code and hook up a quick-stub generator to it. The python xpidl parser and quick-stub generator are both used in the codebase.

Currently, we’re still using the old binary xpidl to produce C++ headers and XPT files. If somebody is interested, I’d really like help adding code to produce XPT files from the new parser, so that we can ditch the old binary code completely.

If you ever need to use python to parse some interesting grammar, I highly recommend PLY. If you turn on optimization it performs very well, and it has very good support for detailed error reporting.
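
To give a flavor of what PLY code looks like, here is a minimal, self-contained sketch for a toy interface declaration (nothing like the real xpidl grammar, just the usual PLY pattern of token rules plus docstring grammar rules):

import ply.lex as lex
import ply.yacc as yacc

tokens = ('INTERFACE', 'ATTRIBUTE', 'IDENT', 'LBRACE', 'RBRACE', 'SEMI')

reserved = {'interface': 'INTERFACE', 'attribute': 'ATTRIBUTE'}

t_LBRACE = r'\{'
t_RBRACE = r'\}'
t_SEMI = r';'
t_ignore = ' \t\n'

def t_IDENT(t):
    r'[A-Za-z_][A-Za-z0-9_]*'
    t.type = reserved.get(t.value, 'IDENT')   # promote reserved words
    return t

def t_error(t):
    raise SyntaxError("illegal character %r" % t.value[0])

def p_interface(p):
    "interface : INTERFACE IDENT LBRACE members RBRACE SEMI"
    p[0] = ('interface', p[2], p[4])

def p_members(p):
    "members : members member"
    p[0] = p[1] + [p[2]]

def p_members_empty(p):
    "members :"
    p[0] = []

def p_member(p):
    "member : ATTRIBUTE IDENT IDENT SEMI"
    p[0] = ('attribute', p[2], p[3])

def p_error(p):
    raise SyntaxError("parse error near %r" % (p,))

lexer = lex.lex()
parser = yacc.yacc()
print(parser.parse("interface nsIFoo { attribute long bar; };"))
# -> ('interface', 'nsIFoo', [('attribute', 'long', 'bar')])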

JSON serialization of interconnected object graphs

Thursday, August 21st, 2008

In its basic form, JSON cannot serialize cyclic graphs of objects, or graphs where multiple paths can lead to the same object. In a project I’m working on, I wanted to move such a graph of highly-interconnected objects from JS to python. So I have invented a format built on top of JSON that can be used to serialize/deserialize such graphs.

Basically, the JSON comes across as a large list:

[
  /* list[0] is the base object at the root of the eventual object graph. */
  {
    /* string, number, true/false, and null properties are serialized directly */
    "stringprop": "stringvalue",
    "numprop": 3.1415,
    /* but lists and objects are not serialized directly. Instead, they are represented by an index
       into the base list. "sharp" is a nod to JS sharp variables, from which this was originally inspired */
    "complexprop": {"sharp": 1}
  },
  /* list[1] is referenced from list[0].complexprop. It also references itself, see below */
  [
    "simplestring",
    3,
    {"sharp": 1}
  ]
]

You can find JS for serializing these types of graphs here, and python for deserializing them here.
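
Since the deserializer is the less obvious half, here is a rough Python sketch of the idea, written from the format description above rather than copied from the linked code (it assumes the explanatory comments in the example have been stripped, since comments are not legal JSON):

import json

def loads_sharp(text):
    objs = json.loads(text)   # the top-level list described above

    def resolve(value):
        # in this format every nested object/list is written as {"sharp": n},
        # so a dict whose only key is "sharp" is a reference into the list
        if isinstance(value, dict) and list(value.keys()) == ['sharp']:
            return objs[value['sharp']]
        return value

    # one pass is enough: every container that can be shared or cyclic is
    # itself an element of the top-level list
    for obj in objs:
        if isinstance(obj, dict):
            for k in obj:
                obj[k] = resolve(obj[k])
        elif isinstance(obj, list):
            for i, v in enumerate(obj):
                obj[i] = resolve(v)
    return objs[0]   # the root of the reconstructed graph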

It turns out that I probably don’t actually need this code: I’ve found a simpler solution for my particular problem, but I wanted to share this solution in case other people might find it useful.

Mozilla Build System: Now Requiring Python, and new Windows build package

Friday, March 2nd, 2007

Starting next week the Mozilla build system is going to require Python on all platforms. This will allow us to gradually convert various build scripts which are currently written in perl, and start hacking on an autoconf replacement written in Python. Any Python 2.3 or higher will be acceptable.

In addition, I have made a production release of MozillaBuild, version 1.0. The MozillaBuild package is now the recommended build environment for all Mozilla developers on Windows. There were a few bugs with the 1.0 release that have now been corrected in a MozillaBuild 1.1 release candidate.

After the Tinderboxes have been upgraded to use the MozillaBuild package on trunk, I am going to be discontinuing support for the cygwin build environment.

Please post questions and followups to the mozilla.dev.builds newsgroup.

Bound Functions and Function Imports in JavaScript

Wednesday, January 3rd, 2007

After playing with the code in my last post, and asking some apparently silly questions about the ES4 spec, I’ve found some techniques which can be used to implement python-style “import” using current JavaScript.

References to Bound Functions

In python, it is possible to hold a reference to an unbound method, or to a method which is bound to a particular object:

>>> class Foo:
...  def dumpMe(self, arg):
...    print "self: %s\narg: %s" % (self, arg)
...
>>> Foo.dumpMe
<unbound method Foo.dumpMe>
>>> Foo().dumpMe
<bound method Foo.dumpMe of <__main__.Foo instance at 0x2aaaaab33680>>
>>> Foo.dumpMe("something")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: unbound method dumpMe() must be called with Foo instance as first argument (got str instance instead)
>>> Foo().dumpMe("something")
self: <__main__.Foo instance at 0x2aaaaab33d40>
arg: something

The JavaScript language does not have a built-in syntax to obtain a bound method:

js> function Foo() { }
js> Foo.prototype.dumpMe = function(arg) {
  dump("this: " + this + "\narg: " + arg);
}
js> new Foo().dumpMe
function (arg) {
    dump("this: " + this + "\narg: " + arg);
}
js> f = new Foo();
js> f.dumpMe("test")
this: [object Object]
arg: test
js> unbound = new Foo().dumpMe
js> unbound("test")
this: [object BackstagePass @ 0x6214e0 (native @ 0x513fc8)]
arg: test

(The BackstagePass object is the global object in xpcshell).

However, with a little extra code it is possible to emulate bound functions in JavaScript:

js> Function.prototype.bind = function(object) {
  var func = this;
  return function() { return func.apply(object, arguments); }
}
js> boundFunc = f.dumpMe.bind(f);
js> boundFunc("test")
this: [object Object]
arg: test

We can then use bound functions to emulate the from module import x, y, z statement of python:

js> function jsimport(module) {
  // this function has two forms:
  // jsimport(module) is equivalent to python "from module import *"
  // jsimport(module, "name" [, "name2"]) is equivalent to python "from module import name, name2"

  // NOTE: "this" is the global object
  function internal_import(name) {
    if (module[name] instanceof Function) {
      this[name] = module[name].bind(module);
    }
    else {
      this[name] = module[name];
    }
  }

  if (arguments.length == 1) {
    for (var name in module) {
      if (module.hasOwnProperty(name)) {
        internal_import(name);
      }
    }
  }
  else {
    for (var i = 1; i < arguments.length; ++i) {
      internal_import(arguments[i]);
    }
  }
}

You could use this technique today to hide away the IOService or PrefService a little:

jsimport(Components.classes["@mozilla.org/preferences/pref-service;1"].getService(Components.interfaces.nsIPrefBranch),
         "getCharPref", "getIntPref", "getBoolPref");

if (getBoolPref("dom.window.dump.enabled"))
  // do something dumpy