Multi-Process Plugins on By Default

Wednesday, January 27th, 2010

Out-of-process plugins (OOPP) are now on by default in mozilla-central! Starting tomorrow morning, the mozilla-central nightly builds will load Flash and all other plugins in a separate process by default (on Windows and Linux). The Electrolysis team would love for people to test any plugins on their system, especially less-popular plugins.

Since we are moving relatively quickly with multi-process plugins, there are a few known issues to be aware of:

  • The plugin-crash UI is not finished. The current UI is just a non-localized dialog so that we can get crash reports from nightly testers. This will be changed soon!
  • On Windows, tearing/repainting issues when scrolling, bug 535295
  • On Linux, compiz effects and Flash don’t work together on some systems, bug 535612
  • On Windows, selecting the “Print” option in Flash may lock up Firefox, bug 538918
  • On Windows, Hulu won’t switch to full-screen mode, bug 539658
  • On Linux with GTK+-2.18 or later, GDK assertions and a fatal XError, bug 540197
  • Firefox-process crashes at NPObjWrapper_NewResolve with Silverlight and sometimes Flash, bug 542263

If you discover crashes while running nightlies, please make sure you submit them, and check about:crashes for the crash ID and signature. We could use help making sure plugin-related crashes and instability are filed and tracked: search for signatures here and file bugs in the Core:Plug-Ins component.

If your browser hangs, you can probably recover by killing the mozilla-runtime process in the Windows task manager or via `kill` on Linux. If you are a developer with a debugger, please use the Mozilla symbol server and get stacks for both the Firefox process and the mozilla-runtime process and file a bug.

In some cases, it may be useful to the Electrolysis developers if you obtain a plugin log, which is a log of calls made between the plugin and the browser. Instructions for obtaining the log are available here.

I am very excited that we’ve made it this far, and I look forward to our next milestone release, which will backport these changes to the 1.9.2 release in preparation for Firefox Lorentz.

Flash in a separate process

If for some reason you need to disable multi-process plugins, set the pref dom.ipc.plugins.enabled to false.
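If you would rather not flip the pref by hand, one option is a user.js file in the profile directory, which Firefox reads at startup. The following is a minimal sketch rather than an official tool: the helper name set_plugin_ipc_pref is hypothetical, and you have to supply the path to your own profile directory.

```python
from pathlib import Path

PREF_NAME = "dom.ipc.plugins.enabled"  # the pref named in this post


def set_plugin_ipc_pref(profile_dir, enabled):
    """Write the pref into <profile_dir>/user.js, replacing any earlier line for it.

    profile_dir is an assumption: pass the path of the profile you want to change.
    """
    user_js = Path(profile_dir) / "user.js"
    value = "true" if enabled else "false"
    new_line = f'user_pref("{PREF_NAME}", {value});'

    # Keep every existing line except an older setting of the same pref.
    old_lines = user_js.read_text().splitlines() if user_js.exists() else []
    kept = [line for line in old_lines if PREF_NAME not in line]
    kept.append(new_line)

    user_js.write_text("\n".join(kept) + "\n")
    return new_line
```

Calling set_plugin_ipc_pref(profile_dir, False) turns multi-process plugins off for that profile; the browser still needs a restart before the change takes effect.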

Error calling method on NPObject!

Monday, January 25th, 2010

When a plugin crashes, content script may still hold references to JS objects provided by that plugin. These JS objects will throw the exception “Error calling method on NPObject” when any of their properties or methods are used. Unfortunately, the same generic error message is also thrown whenever a plugin method fails for any other reason, so you can’t tell just by looking at the exception whether the plugin process crashed or some other type of failure occurred.

This is important when a test fails: there could be any number of different errors lurking under the surface with similar outward appearance. Today there was a Mochitest error with the following symptoms:

197 ERROR TEST-UNEXPECTED-FAIL | /tests/modules/plugin/test/test_painting.html | [SimpleTest/SimpleTest.js, window.onerror] An error occurred - Error calling method on NPObject! at http://localhost:8888/tests/modules/plugin/test/test_painting.html:105
PROCESS-CRASH | automation.py | application crashed (minidump found)
Thread 1 (crashed)
PROCESS-CRASH | automation.py | application crashed (minidump found)
Thread 1 (crashed)
PROCESS-CRASH | automation.py | application crashed (minidump found)
Thread 1 (crashed)
PROCESS-CRASH | automation.py | application crashed (minidump found)
Thread 1 (crashed)

Reading through the log, however, the important output is:

###!!! [RPCChannel][Child][/builds/moz2_slave/mozilla-central-linux/build/ipc/glue/RPCChannel.cpp:276] Assertion (mDeferred.empty() || 1 == mDeferred.size()) failed.  expected mDeferred to have 0 or 1 items, but it has %lu (triggered by rpc)
  local RPC stack size: 2863316886
  remote RPC stack guess: 8
  deferred stack size: 2863316886
  out-of-turn RPC replies stack size: 2863316886
  Pending queue size: 2863317142, front to back:

This assertion is immediately followed by an abort, which is also visible in the crash dump output:

Crash reason:  SIGSEGV
Crash address: 0xbdce4804

Thread 1 (crashed)
 0  libxul.so!mozilla::ipc::RPCChannel::DebugAbort(char const*, int, char const*, char const*, char const*, bool) [ipc_message.h:0235fc257969 : 97 + 0x0]

David B. mistakenly thought that this was a manifestation of Bug 541102, when in fact it is an entirely unrelated bug with similar symptoms. When in doubt about a crash, please ask a member of the Electrolysis team to help read the log and diagnose the problem.
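A first pass over a test log can be sketched in a few lines of Python. This is only an illustration: the marker strings are copied from the excerpts in this post, the function name triage_log is made up, and other logs may well use different wording. The point is that an IPC assertion line next to the generic NPObject error tells you far more than the error alone.

```python
import re

# Markers copied from the log excerpts in this post; other logs may differ.
NPOBJECT_ERROR = "Error calling method on NPObject"
PROCESS_CRASH = "PROCESS-CRASH"
IPC_ASSERTION = re.compile(
    r"^###!!! \[(\w+)\]\[(\w+)\].*Assertion \((.*?)\) failed"
)


def triage_log(lines):
    """Summarize which crash-related markers appear in a test log."""
    summary = {"npobject_errors": 0, "process_crashes": 0, "ipc_assertions": []}
    for line in lines:
        if NPOBJECT_ERROR in line:
            summary["npobject_errors"] += 1
        if line.startswith(PROCESS_CRASH):
            summary["process_crashes"] += 1
        match = IPC_ASSERTION.match(line)
        if match:
            # (channel class, side of the channel, failed condition)
            summary["ipc_assertions"].append(match.groups())
    return summary
```

If ipc_assertions is non-empty, the failed condition is a better search term for existing bugs than the NPObject message.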

Multi-Process Plugins

Tuesday, December 15th, 2009

Yesterday I landed multi-process plugin support in mozilla-central. By default, this capability is disabled, because there are still some serious bugs. But if you are willing to suffer some temporary instability, we could really use some help testing Minefield nightlies with out-of-process plugins (OOPP).

Currently only Windows and Linux support multi-process plugins; Mac support requires additional work. To turn OOPP on, visit about:config, find the pref dom.ipc.plugins.enabled, set it to true, and restart your browser. Please report any crashes or instability in Bugzilla: product “Core”, component “Plug-Ins”. Please be as detailed as possible in bug reports:

  • Operating system: please be specific about Windows versions, since Windows XP and Windows Vista deliver some Windows events differently;
  • Page visited;
  • Plugin data from about:plugins;
  • Whether turning IPC off fixes the problem (Note: flipping the pref usually requires restarting the browser to take effect).

There is one major known bug right now: any plugin which is installed in a path with spaces fails to load. On Windows, this affects almost everything except Flash. I hope to have this fixed in tomorrow’s nightly. There is a tracking bug for all the known issues which prevent us from turning on OOPP by default.

Please direct any questions about this work to the mozilla.dev.tech.plugins discussion list.

Multi-Process Fennec

Friday, October 30th, 2009

Today Joe Drew, Olli Pettay, and I got Mobile Firefox (Fennec) working with a separate process for rendering. It’s a significant achievement, because even though we had to hack out some Fennec features, it’s already a fairly functional browser. Olli made a screencast showing the browser in action:

Getting Fennec working was difficult partly because the mobile Firefox code uses a different drawing system: instead of displaying a native scrollable widget, the mobile code uses a cache of “tiles” to display the web page. This allows the mobile code to display certain kinds of content over the web page, as well as to have better control and speed when scrolling, zooming, and performing other interactions.

In order to get all this working with multiple processes, the group attacked pieces of the problem separately. Joe Drew implemented a new method on the canvas element: asyncDrawXULElement. This call, very similar to drawWindow, will asynchronously ask the content process to draw a tile (or part of a tile).
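The tile idea can be illustrated with a small sketch. This is not Fennec’s code: the function name, the 256-pixel tile size, and the coordinate convention are all assumptions made for illustration. It only shows how a visible rectangle maps onto the set of tiles that would have to be requested from the content process.

```python
def tiles_for_rect(x, y, width, height, tile_size=256):
    """Return (column, row) indices of the tiles that overlap a rectangle.

    Coordinates are page pixels; tile_size is a hypothetical value.
    """
    if width <= 0 or height <= 0:
        return []
    first_col, first_row = x // tile_size, y // tile_size
    # The last pixel inside the rectangle is at (x + width - 1, y + height - 1).
    last_col = (x + width - 1) // tile_size
    last_row = (y + height - 1) // tile_size
    return [
        (col, row)
        for row in range(first_row, last_row + 1)
        for col in range(first_col, last_col + 1)
    ]
```

Scrolling changes the rectangle, so only tiles that newly enter the result need to be drawn; the rest can be served from the cache.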

Olli implemented various interaction fixes: forwarding mouse events from the tiles to the content process, forwarding some important events such as MozAfterPaint from the content process back to the chrome process, and fixing widget focus in the embedded browser so that keystrokes are sent to it correctly.

Finally, I modified the Mozilla frame loader and subdocument frame such that “remote frames” could work correctly even without a docshell. I then hacked up the Fennec sources so that they would also work without a docshell, mainly by commenting out the security UI and zoom-to-element features which require additional information from the content process.

Now that it’s working, we hope to be able to bring additional developers in to fix up the features which we hacked around, fix DOM features which are currently broken such as link targeting, and start getting much better measurements for interactive performance and memory usage.

IPDL: The Inter-Process Protocol Definition Language

Tuesday, September 8th, 2009

IPDL is the language that Mozilla is using to describe all the messages between processes. Invented by Chris Jones, the IPDL language makes it easier for us to write type-safe and secure code by generating a lot of the basic validation code involved with messages.

When Chris was at Mozilla Headquarters a few weeks ago, he presented a tech-talk on IPDL which has been recorded and is available for download or viewing:

In the presentation, Chris explains the motivations for IPDL, demonstrates basic usage, and answers many questions about the limitations and benefits of using IPDL. I encourage anyone who is interested in Mozilla’s Multi-Process work to watch the presentation.

There are also some IPDL protocols in the tree if you’d like to read them: see the dom/ipc directory in the Electrolysis branch.
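For readers who have not looked at the protocols yet, the general idea of validation code derived from a message description can be shown with a toy example. Please note that this is neither IPDL syntax nor what the IPDL compiler emits: the message names, the dictionary layout, and the validate function are all invented for illustration.

```python
# Toy protocol description: message name -> (allowed sender, parameter types).
# All names here are made up; real IPDL protocols look different.
PROTOCOL = {
    "SetWindow": ("parent", (int, int)),
    "PaintDone": ("child", (str,)),
}


def validate(sender, name, args):
    """Raise ValueError unless the message matches the protocol description."""
    if name not in PROTOCOL:
        raise ValueError(f"unknown message {name!r}")
    allowed_sender, param_types = PROTOCOL[name]
    if sender != allowed_sender:
        raise ValueError(f"{name} may not be sent by the {sender}")
    if len(args) != len(param_types):
        raise ValueError(f"{name} expects {len(param_types)} argument(s)")
    for value, expected in zip(args, param_types):
        if not isinstance(value, expected):
            raise ValueError(f"{name}: {value!r} is not {expected.__name__}")
    return True
```

Checks like these are tedious and easy to get wrong when written by hand for every message, which is the kind of work a protocol description language can take over.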

Electrolysis: Making Mozilla Faster and More Stable Using Multiple Processes

Tuesday, June 16th, 2009

For a long while now (even before Google Chrome was announced), Mozilla has been examining ways to make Firefox better by splitting the work of displaying web pages up among multiple processes. There are several possible benefits of using multiple processes:

  • Increased stability: if a plugin or webpage tries to use all of the processor or memory, or even crashes, a separate process can isolate that bad behavior from the rest of the browser.
  • Performance: By splitting work up among multiple processes, the browser can make use of multiple processor cores available on modern desktop computers and the next generation of mobile processors. The user interface can also be more responsive because it doesn’t need to block on long-running web page activities.
  • Security: If the operating system can run a process with lower privileges, the browser can isolate web pages from the rest of the computer, making it harder for attackers to infect a computer.
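The stability benefit is easy to demonstrate outside the browser. The sketch below has nothing to do with Mozilla’s code: it uses Python’s standard subprocess module, with a hypothetical run_isolated helper, to show that a parent process survives when a child aborts and can still observe what happened.

```python
import subprocess
import sys


def run_isolated(code):
    """Run a snippet of Python in a child process and return its exit status."""
    completed = subprocess.run([sys.executable, "-c", code])
    return completed.returncode


# The child aborts abruptly, standing in for a crashing plugin or page.
crashed = run_isolated("import os; os.abort()")
healthy = run_isolated("pass")

# The parent (the "browser" in this analogy) keeps running and sees both results.
print("crashing child exit status:", crashed)
print("healthy child exit status:", healthy)
```

The exact status reported for the aborted child depends on the operating system, but it is non-zero, and the parent is free to report the failure or restart the child.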

Now that we’re basically done with Firefox 3.5, we’ve formed a project team. We’re calling the project “Electrolysis”. Because we can’t do everything at once, we are currently focusing on performance and stability; a security sandbox will be implemented after the initial release. Details of the plan are available on the Mozilla wiki, but the outline is simple:

  1. Sprint as fast as possible to get basic code working, running simple testcase plugins and content tabs in a separate process.
  2. Fix the brokenness introduced in step one: shared networking, document navigation and link targeting, context menus and other UI functions, focus, drag and drop, and probably many other aspects of the code will need modifications. Many of these tasks can be performed in parallel by multiple people.
  3. Profile for performance, and fix extension compatibility to the extent possible.
  4. Ship!

We’re currently in the middle of stage one: Ben Turner and Chris Jones have borrowed the IPC message-passing and setup code from Chromium. We even have some very simple plugins loading across the process boundary! Most of the team is in Mountain View this week and we’re sprinting to see if we can implement a very basic tab in a separate process today and tomorrow.

For the moment we’re focusing on Windows and Linux, because the team is most familiar and comfortable with these environments. I sat down with Josh Aas on Friday and we discussed some of the unknowns and difficulties we face on Mac. As soon as our initial sprint produces working code we’d love to have help from interested Mac hackers!

If you’re interested in helping, or just lurking to see what’s going on, the Electrolysis team is using the #content channel on IRC and the mozilla.dev.tech.dom newsgroup for technical discussions and progress updates. We’ll also cross-post important status updates to mozilla.dev.platform.

If you’ve emailed me volunteering to help and I haven’t gotten back to you, I apologize! Until we get the stage-one sprint done there aren’t really any self-contained tasks which can be done in parallel.