Parsing IDL in Python

One of the current pain points in our build system is the xpidl compiler. This is a binary tool which is used to generate C++ headers and XPT files from XPIDL input files. The code depends on libIDL, which in turn depends on glib. Because this tool is a build-time requirement, we have to build a host version, and in most cases we also build a target version to put in the SDK package.

Getting glib and libIDL on Linux systems is not very difficult: all the major distros have developer packages for them. But getting libIDL and glib on Windows and Mac can be quite painful. On Windows, we have had to create our own custom static library versions of this code which are compatible with all the different versions of Microsoft Visual C++. On Mac you can get them from MacPorts, but as far as I know they are not available as universal binaries, which means that you can’t cross-compile a target xpidl.

Parsing IDL, while not trivial, is not so complicated that it requires huge binary code libraries. So a while back I reimplemented the XPIDL parser using Python and the PLY (Python Lex-Yacc) parsing library. The core parsing grammar and object model is only 1200 lines of code.
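To give a rough sense of how little machinery a parser for this kind of language needs, here is a minimal sketch that tokenizes and parses a tiny subset of IDL using only the Python standard library. It is not the PLY-based parser described above and does not reflect its grammar or object model. The subset (single-word types, no bracketed attributes such as uuid, no constants), the function names `tokenize` and `parse_interface`, and the sample interface `nsIExample` are all assumptions made for illustration.

```python
import re

# Token patterns for a tiny, hypothetical subset of IDL (illustration only).
TOKEN_RE = re.compile(r"""
    (?P<COMMENT>/\*.*?\*/|//[^\n]*)
  | (?P<IDENT>[A-Za-z_][A-Za-z0-9_]*)
  | (?P<PUNCT>[{}();:,])
  | (?P<WS>\s+)
""", re.VERBOSE | re.DOTALL)


def tokenize(text):
    """Yield (kind, value) pairs, skipping whitespace and comments."""
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError("unexpected character %r at offset %d" % (text[pos], pos))
        pos = m.end()
        if m.lastgroup not in ("WS", "COMMENT"):
            yield m.lastgroup, m.group()


def parse_interface(text):
    """Parse one 'interface Name : Base { ... };' declaration into a dict."""
    tokens = list(tokenize(text))
    pos = 0

    def peek():
        # Value of the next token, or None at end of input.
        return tokens[pos][1] if pos < len(tokens) else None

    def take(expected=None):
        # Consume one token, optionally checking its value.
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        value = tokens[pos][1]
        if expected is not None and value != expected:
            raise SyntaxError("expected %r, got %r" % (expected, value))
        pos += 1
        return value

    take("interface")
    iface = {"name": take(), "base": None, "attributes": [], "methods": []}
    if peek() == ":":
        take(":")
        iface["base"] = take()
    take("{")
    while peek() != "}":
        readonly = peek() == "readonly"
        if readonly:
            take("readonly")
        if peek() == "attribute":
            take("attribute")
            attr_type, attr_name = take(), take()
            iface["attributes"].append((attr_name, attr_type, readonly))
        else:
            ret_type, meth_name = take(), take()
            take("(")
            params = []
            while peek() != ")":
                direction, ptype, pname = take(), take(), take()
                params.append((direction, ptype, pname))
                if peek() == ",":
                    take(",")
            take(")")
            iface["methods"].append((meth_name, ret_type, params))
        take(";")
    take("}")
    take(";")
    return iface


# Hypothetical input; real XPIDL files carry more syntax than this subset handles.
SAMPLE = """
interface nsIExample : nsISupports {
  readonly attribute boolean canGoBack;   // a read-only attribute
  void goBack();
  void loadURI(in wstring uri, in long flags);
};
"""

result = parse_interface(SAMPLE)
```

A generated parser such as the one PLY builds replaces the hand-written `take`/`peek` loop with grammar rules, which scales much better once the full language (forward declarations, constants, native types, bracketed attributes) has to be covered.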

Because we don’t have any unit tests for xpidl, I chose to use A-B testing against the output of the binary xpidl: the header output of the Python xpidl should match byte-for-byte the header output of the binary xpidl. I wrote a myrules.mk file which would automatically build and compare both versions during a build. This turned out to be a royal pain, because the libIDL parser is not very consistent and has bugs.

I had to add some temporary nasty hacks in order to work around these issues. And finally, reproducing the wacky whitespace of the binary tool wasn’t worthwhile, so I started comparing the results using diff -w -B. But with these hacks and changes, both xpidl compilers produce identical C++ headers.
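The whitespace-insensitive comparison can be approximated in Python as well. The sketch below is not the myrules.mk logic and is only a rough stand-in for diff -w -B: it removes all whitespace within each line and drops lines that end up empty before comparing. The function names `normalize` and `headers_match`, the file labels, and the sample header text are assumptions for illustration.

```python
import difflib


def normalize(text):
    """Remove all whitespace from each line and drop lines that end up empty."""
    stripped = ("".join(line.split()) for line in text.splitlines())
    return [line for line in stripped if line]


def headers_match(expected, actual):
    """Return (same, diff_lines) for two header texts, ignoring whitespace."""
    a, b = normalize(expected), normalize(actual)
    diff = list(difflib.unified_diff(a, b, "binary-xpidl", "python-xpidl", lineterm=""))
    return not diff, diff


# Hypothetical outputs from the two compilers, differing only in whitespace.
binary_output = "class Foo {\n\n  NS_IMETHOD  Bar( void ) ;\n};\n"
python_output = "class Foo {\n  NS_IMETHOD Bar(void);\n};\n"

same, diff = headers_match(binary_output, python_output)
```

Because the diff is computed on the normalized lines, any reported difference shows the whitespace-stripped text rather than the original formatting, which is good enough for a pass/fail check during a build.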

I completed the code to produce C++ headers during a couple of not-quite-vacation days, but I didn’t write any code to produce XPT files. I shelved the project as an attractive waste of time, until jorendorff needed an IDL parser to produce quick stub C++ code. Jason managed to take my existing code and hook up a quick-stub generator to it. The Python xpidl parser and quick-stub generator are both used in the codebase.

Currently, we’re still using the old binary xpidl to produce C++ headers and XPT files. If somebody is interested, I’d really like help adding code to produce XPT files from the new parser, so that we can ditch the old binary code completely.

If you ever need to use Python to parse some interesting grammar, I highly recommend PLY. If you turn on optimization it performs very well, and it has very good support for detailed error reporting.

5 Responses to “Parsing IDL in Python”

  1. Justin Wood (Callek) Says:

    Is there a bug # on the “produce XPT files” aspect of this, or even just a general bug number on the dropping of xpidl binary?

  2. nossralf Says:

    I wrote an XPIDL parser using pyparsing a year or so ago. Fairly complete, though not fit for what you’re doing.

    The thing is, you can then auto-generate DevMo markup from IDL files very easily. The only real effort a developer needs to put in is actually documenting the interface, no need to write wiki markup beyond that. Quite nice.

  3. Justin Wood (Callek) Says:

    For those who may read these comments later, I found Bug 458936 as the IDL -> XPT part of all this.

  4. mak Says:

    I’m using your code to produce :) Python code which will use Mozilla’s xpidl interface – and I’m willing to see your modifications instead of main to xpidl generator
    xpcom as python actor is rather irritating and useless

    python and its code should act as a client to Mozilla – that looks correct to me – any communication should stay at the API level, but without xpcom this looks impossible
    :) – partially – we have comtypes – this is typically a Windows solution – but
    :) – with small modifications I have ported this API to a Mozilla-compatible API, and with the help of gluezilla this code works even on Linux

    my question is: are you interested in helping make your xpidl compiler produce code compatible with python xpcom – or should I do it alone?

    nsISupports as IUnknown is using IUnknown code without using VARIANT structure :)

    eg:

    class nsIWebNavigation(nsISupports):
        _iid_ = GUID('{f5d9e7b0-d930-11d3-b057-00a024ffc08c}')
        _methods_ = [
            COMMETHOD([], HRESULT, 'GetCanGoBack',
                      (['in'], POINTER(c_int), 'aCanGoBack')),
        ]

    my code uses wxPython for the GUI – but this is not a big deal – the code is universal and depends on a handle to the window; both gtk and msw have methods to get this info from the GUI

  5. Benjamin Smedberg Says:

    mak, I don’t think I’m particularly interested in helping with your effort. It is an interesting exercise, though!
