Improving XPCOM for Mozilla 2

Friday, December 22nd, 2006

XPCOM technology, based on Microsoft COM, is fundamentally structured around the concept of binary object layouts and stylized calling conventions. XPCOM was a good technique for introducing modularity and extensibility to the Mozilla codebase, but it is showing its age. One of the interesting things about Mozilla 2 is that we can breaking API and binary compatibility.

There are several ways we should improve XPCOM:

  1. Improve reference-counting (this could include universal support for cycle collection, or even allocating all XPCOM objects using MMgc; Graydon and I talked about this some at the Summit, and I’m sure he’ll take the lead determining what this means in practice.
  2. Allow throwing complicated (object-type) exceptions from any XPCOM method, and reduce the verbosity and inefficiencies of nsresult return codes. C++ exceptions, as much as I dislike them, provide the shortest path to this goal. Taras has been working with oink to provide an automated way to convert method calls automatically.
  3. Reduce the complexity and verbosity of using XPCOM. I’ve been spending a fair amount of time working in Python recently, and I’m very impressed with its use of module objects. Using XPCOM could be a lot easier from script with some very simple changes. I’ll blog about this soon!

In order to achieve these objectives, I’m convinced that Mozilla must break XPCOM binary compatibility, and should stop using XPCOM as the binary embedding solution:

  • We may want the flexibility of making GCRoot or another abstract non-interface class the root type (nsISupports) for all XPCOM objects. We at least ought to add interfacerequestor and classinfo functionality to the root object type, and perhaps weak-reference support as well.
  • C++ exceptions are very compiler dependent (and compiler-version dependent) and are not good candidates for binary freezing.

The implications of a change like this are considerable:

  • It will no longer be possible (or desirable) to write binary XPCOM components in C++ that don’t live in the monolithic platform binary (libxul). At first this seemed like a significant challenge: Firefox and Thunderbird use binary components to do OS integration (profile migration and OS integration). Various extension also use binary components to integrate with external libraries. But most of these use-cases can be solved with a good foreign-function-interface library available from script. I’ll blog about this separately; I’ve been very impressed with the expressiveness and flexibility of the python ctypes library and I think it could be ported to SpiderMonkey rather easily.
  • Binary embedders (e.g. gtkmozembed clients) will no longer be able to access DOM objects via their XPCOM interfaces.
    The simplest way to solve this problem is to extend the scriptable NPAPI object model to be accessible by binary embedders. This will give embedders access to the DOM that is straightforward and relatively complete.

Brainstorming Example

class nsISupports : virtual public RCObject
{
  inline void AddRef() { IncrementRef(); return 2; }
  inline void Release() { DecrementRef(); return 1; }

  virtual nsISupports* QueryInterface(REFNSIID aIID, PRBool aAddRef) = 0;

  /**
   * For ease of conversion, provide an old-style QI wrapper.
   */
  inline nsresult QueryInterface(REFNSIID aIID, void **aResult) {
    *aResult = QueryInterface(aIID, PR_TRUE);
    return (*aResult) ? NS_OK : NS_NOINTERFACE;
  }

  virtual nsISupports* GetInterface(REFNSIID aIID, PRBool aAddRef) = 0;

  virtual nsIClassInfo* GetClassInfo() = 0;
};

The virtual inheritance of RCObject could be a problem for xptcall. There are ways around that. I’m also a little concerned that objects won’t be storing pointers to the “root” GCObject, but rather vtables within that object. I hope that doesn’t mess up MMgc.

tinderclient.py

Thursday, December 21st, 2006

As promised, I have created a python module which can be used to implement a tinderbox client to report arbitrary. I’ve also created a driver which can pull and build a Mozilla-like application. Sources here. I’ve tested it on Windows and Linux, but I fully expect it would work on Mac as well. It requires Python 2.4 and the killableprocess module.

I know it’s not especially obvious how to actually run a build. I use the following command line:

python mozbuild.py --config=/builds/tinderclient/firefox-config.py,/builds/tinderclient/sample-config.py --private-config=/builds/bs-passwords.py

Take a look at the MozillaTest tinderbox logs from Thursday to see the results of my test builds. Please note that anyone is welcome to run a tinderbox that reports to MozillaTest; it’s a place to test tinderbox scripts! If you want to start reporting to an official tree like MozillaExperimental or SeaMonkey-Ports, please ask permission from build@mozilla.org.

For those keeping track of killableprocess, I’ve committed some changes:

  • on *nix, create and kill process groups properly;
  • on Windows, allow redirecting the standard handles to files (instead of pipes);
  • on Windows, allow passing a dictionary for the environment of the new process to create.

Right now this is a technology experiment. We’ll probably use it to drive a Tamarin tinderbox. It’s vaguely possible that Mozilla will switch away from the old-style perl tinderbox client altogether going forward, but that requires replicating a lot of logic, and might not be worth it.

Sometime after the new year, I will be adding a driver script which can perform builds in a loop, perhaps with config updating from CVS the way the current tinderbox scripts do.

killableprocess.py

Monday, December 11th, 2006

I’ve managed, at long last, to solve the problem of launching subprocesses from python. I have created a python module which can launch a subprocess, wait for the process with a timeout, and kill that process and all of its sub-subprocesses correctly, on Windows, Mac, and Linux. Source code is here. It requires python 2.4+ because it subclasses the subprocess module. On Windows, it only works on Win2k+, and it requires the ctypes module, which comes with Python 2.5+, or can be installed into earlier versions of Python.

You will be seeing a python-based tinderbox client appear on the MozillaTest tree shortly. Small projects or projects that don’t want or need the byzantine logic of the existing tinderbox client scripts can use a Python module to do tinderbox reporting using a simple object-oriented API. I’m hoping to use this to get Tamarin builds reporting to a tinderbox tree, as well as do some of the FF+XR build automation (which is significantly different from the existing build process).

Adventures in Python: Launching Subprocesses

Thursday, November 9th, 2006

I’ve been looking at python for various build automation. I had what I thought would be a simple problem:

How do I launch a process, collecting stdout/stderr, with a timeout to kill the process if it runs too long?

The python subprocess module gets about 80% there. You can launch a process, and hook up stdout/stderr/stdin. You can poll the process for completion. But subprocess doesn’t have a simple parameter for process timeout. Total time spent: 45 minutes.

So, you use a loop or a thread to wait for the process and kill it if it takes too long, right? Subprocess doesn’t have an instance method to kill the process. Answer according to #python on freenode? os.kill(theprocess.pid, signal.SIGTERM). Except that this apparently doesn’t work on Windows: you have to emulate it. Total time spent: 1.5 hours.

This works, on unixy systems. But it fails miserably on Windows. It turns out that on Windows when you kill a process, any subprocesses that were launched don’t get killed. So I went searching code that I thought must have already solved this problem: BuildBot launches processes and has to kill them, right? Well, it turns out that BuildBot uses Twisted to do the dirty work. Twisted completely ignores the problem, as far as I can tell. It doesn’t use subprocess, but instead has a file called _dumbwin32proc.py which provides the event-driven access to the process pipes and status. This file is uglier than the devil’s rear end. Total time spent: 2.5 hours.

After much pain, I found Windows documentation that might help: Windows 2000+ can put processes into jobs. Instead of killing the parent process, you can kill the entire job. As far as I can tell this should be implementable in Python, but I haven’t found anyone who’s done it yet (even better, abstracted it behind a cross-platform API). If you know of code which has this working properly, please let me know. Otherwise I will be spending another 4 hours tomorrow to get this working (I know only halting python, though I’m getting better quickly). Total time spent: 3.5 hours.

Learning new languages isn’t that hard. Learning new programming worlds, with their bugs and quirks, is really hard.

Update: Solution in my post on killableprocess.py.