Archive for 2010

Running Extension Code In Another Process

Friday, December 3rd, 2010

To support running Jetpack-style extensions in another process, Firefox 4 can run arbitrary JavaScript code in a separate process. Although this code was designed primarily for the Jetpack SDK, Firefox itself and other extensions can use it too.

Running code in a separate process has advantages similar to running code in a separate thread. The running code will not block the main Firefox user interface. An added advantage is crash protection: if the code causes a crash, it will not take down the entire browser. There may also be some performance benefits from separating the garbage collection heaps and avoiding XPCOM overhead.

The basic steps to start a subprocess and run code in it are as follows:

var process = Components.classes["@mozilla.org/jetpack/service;1"].
  getService(Components.interfaces.nsIJetpackService).createJetpack();
process.evalScript("Put your JS here");
// When you are done with the process, you should explicitly destroy it.
process.destroy();

Of course, running a script in another process isn’t that useful unless you can communicate with it. This is accomplished by passing messages back and forth. To send a message to the remote process, use process.sendMessage:

process.sendMessage("messageName", param...)

To receive messages from the remote process, register a receiver function:

process.registerReceiver("messageName", function(messageName, argument...) { ... });

The remote process has access to a similar set of global functions, as well as the ability to create sandboxes and use ctypes. For more information about the full capabilities, see the Mozilla Developer Center documentation. Note that code running in a jetpack-style process does not have access to XPCOM, because XPCOM is not started in the jetpack process; it runs code using only the JavaScript engine.
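A minimal sketch of a full round trip, assuming the process object from the earlier snippet and assuming the remote side exposes global registerReceiver and sendMessage functions (the "similar set of global functions" mentioned above; check the documentation for the exact names). The helper startEcho and the message names "echo" and "echoReply" are made up for illustration.

```javascript
// Hypothetical helper: `process` is the object returned by createJetpack(),
// or anything else with evalScript/sendMessage/registerReceiver.
function startEcho(process, onReply) {
  // Parent side: handle replies coming back from the remote process.
  process.registerReceiver("echoReply", function (messageName, text) {
    onReply(text);
  });

  // Remote side: this source is evaluated in the other process. It assumes
  // global registerReceiver/sendMessage functions exist there.
  process.evalScript(
    'registerReceiver("echo", function (messageName, text) {' +
    '  sendMessage("echoReply", "remote saw: " + text);' +
    '});'
  );

  // Kick off the round trip.
  process.sendMessage("echo", "hello");
}
```

In real use the reply arrives asynchronously, so onReply should not assume it runs before startEcho returns.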

If an extension is using ctypes to work with third-party code or OS libraries, I strongly encourage that extension to consider running the code in a separate process for crash protection. If an extension has long-running or computationally expensive tasks, it might make sense to move those into a separate process as well. If nothing else, it will make it much easier to measure the CPU and memory usage of that code separate from the rest of Firefox.

Bank of America online “Hardware and Software Requirements”

Wednesday, December 1st, 2010

Bank of America is asking me to agree to an “Electronic Communications Disclosure” which includes the following text:

(5) Hardware and Software Requirements

While you may be able to access and retain the Communications using other hardware and software, your personal computer needs to support the following requirements:

For Online Banking:

  • An operating system, such as:
    • Windows NT, 2000, ME, XP, Vista, or Win 7; or
    • Macintosh OS 10.x
  • Access to the Internet and an Internet browser which supports HTML 4.0 and 128bit SSL encryption and Javascript, such as:
    • For PC using Windows NT, 2000, ME, XP, Vista, or Win 7
      • Microsoft Internet Explorer 7.0 and higher
      • Firefox 3 and higher
      • Chrome 3.0 and higher
    • For Macintosh using OS 10.x
      • Safari 3.0 and higher
      • Firefox 3 and higher
      • Chrome 4.0 and higher

For Merrill Lynch brokerage websites:

  • You must have access to a personal computer with browser software such as Microsoft Internet Explorer; Adobe Acrobat Reader; and Internet access (at your cost).
    • Browser and reader versions necessary to view the Merrill Lynch brokerage websites are as follows:
      • Microsoft Internet Explorer version 6.0 and later
      • Firefox version 3.5 and later
      • Safari version 3.2 and later

Most Communications provided within Online Banking, Merrill Lynch brokerage websites or at other Bank of America websites are provided either in HTML and/or PDF format. For Communications provided in PDF format, Adobe Acrobat Reader 6.0 or later versions is required – A free copy of Adobe Acrobat Reader may be obtained from the Adobe website at www.adobe.com.

In certain circumstances, some Communications may be provided by e-mail. You are responsible for providing us with a valid e-mail address to accept delivery of Communications.

To print or download Communications you must have a printer connected to your computer or sufficient hard-drive space (approximately 1 MB) to store the Communications.

Does this entire section say anything other than “You have to have an operating system and a web browser, and sometimes a way to view PDF files, and here are some programs you can use.”? Why bother writing it at all?

Software Integration Is Not Evil

Monday, November 29th, 2010

Asa is wrong, and he’s being obnoxious about it to boot. What Google, Apple, and Microsoft are doing is called software integration, and in general it’s very good that iTunes, Google Earth, and Windows Live are adding Firefox integration into their software. They are installing plugins and/or extensions using the recommended methods so that all NPAPI-capable browsers can see and load them. This is not “Google being bad”, this is Google following Mozilla’s recommendations for browser integration!

It’s true that Firefox should give users more control over integration software that is found on the system, and we’re working on prompting users whether they want to include that integration as part of Firefox. But claiming that Google, Apple, and Microsoft are somehow being evil is stupid and short-sighted. The problem, if there is one, lies entirely with Firefox, not with the software which is doing exactly what we ask of them.

Help Me With Starcraft Maps

Thursday, November 18th, 2010

I love Starcraft 2. In general, I think Blizzard has done a great job with the game, especially the multi-player mode and competitive matchmaking. But one part of the game which could use a lot of work is the system for playing and improving custom maps.

After being bitten by the Starcraft bug, I kept thinking of ways to modify or perhaps improve on the standard melee maps that are used in competitive “ladder” matches. I watched a Day9 video commentary where he analyzed a nonstandard map. So I tried my hand at creating a custom map. I really enjoyed the process, and I think I came up with a pretty good first pass at a map.

It has been almost impossible to get people to play my map, or give me feedback. The UI for trying new maps in Starcraft itself is primitive, and there isn’t a place I can browse through maps and find ones which look interesting. It feels like Starcraft needs something like the Firefox Addons site, where people could browse and rate the maps. The only remaining way to get feedback is to ask individual friends to play the map, either against me or against somebody else. This is difficult, because my friends usually have different schedules than I, and also have very different skill levels: playing against them would not be an even match.

I am therefore going to use my blog as an alternative way of soliciting feedback. If you play Starcraft (and I know many of my readers do!), please try out my maps and give me feedback on them. I now have three maps available:

Deserted Battleground (1v1)

Choose between three “natural” expansions from your starting position. One is more vulnerable but advances toward your opponent, and it’s easier to defend against airborne harassment. One in the back corner is far from your opponent. One is protected by cliffs, but very vulnerable to harassment. Xel’naga towers provide vision of the middle of the field, but side-paths can make for surprise attacks if not carefully scouted.

City Scout (1v1 or 4-player melee)

Your starting base has two ramps, and one is very close to an opponent’s starting position. Attacking the backdoor entrance involves very long distances, but good scouting is essential. Control of the center of the map makes for expansion

Butterfly Island (1v1 or 4-player melee)

A backdoor ramp on your main base is again vulnerable to harassment by speedy units (hellions/speedlings). Resources are scattered around the center of the map and there are several possible locations to plant a base and do mining.

I’d like feedback on the following areas at least:

  • Balance: Although the maps are generally mirror-images for balance, it is possible that certain features of the map may favor one race or another. Are there specific features which you think are unbalanced?
  • Play: are there any features of the map which you love or hate? I use backdoor entrances more than the standard Blizzard maps, and occasionally some unusual mineral/geyser placement.
  • Visual appeal: Do you like the general appearance of the map? Are there things you would improve?

Feel free to leave me comments here at my blog, in email, or by sending messages in-game. My Starcraft handle is Odysseus/749.

In order to play the maps, you must create a game in the Multi-Player section of Starcraft. This is true even if you just want to play the maps against the AI, because you cannot search for maps in the “Play Against AI” panel. You can then use the search function to find the maps by name. If you’re a bronze/silver-level player, I’d love to play games against you: please invite me/add me as a Starcraft buddy.

Asynchronous Plugin Layer Painting

Thursday, November 18th, 2010

Firefox 4 implements a new strategy for painting windowless plugins. This should result in improved performance and responsiveness when users are visiting sites such as Hulu and Vimeo which make use of windowless Flash to render their videos.

Background

On Windows and Linux, there are two basic modes in which plugins can render: windowless and windowed. When a windowed plugin instance is requested, Firefox creates a native widget; the operating system delivers native events, including paint requests, directly to the plugin window. This is simple, but it has a significant disadvantage: the plugin doesn’t participate in normal web layout. This means that the plugin cannot be transparent, and CSS effects such as opacity and most transformations cannot be applied to the plugin. YouTube currently mostly uses windowed plugin instances for rendering its videos.

Windowless plugins, on the other hand, do not have a native widget. Instead, events such as mouse and keyboard events, as well as requests to paint the plugin, are received by the browser and forwarded to the plugin using the NPP_HandleEvent API. Hulu and Vimeo both make use of windowless plugin instances. Any Flash plugin with the wmode="opaque" or wmode="transparent" attribute in its <embed> or <object> tag is using windowless mode.
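As a small sketch of that last rule, a page script could classify a wmode value like this. The function name is hypothetical, and lower-casing the value is a convenience assumption rather than something stated above.

```javascript
// Returns true for the two wmode values the post identifies as windowless.
// Other values (or a missing attribute) return false here; the post does not
// classify them, so treat that as an assumption of this sketch.
function isWindowlessWmode(wmode) {
  var value = String(wmode == null ? "" : wmode).toLowerCase();
  return value === "opaque" || value === "transparent";
}
```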

Asynchronous Painting

In Firefox 3.6 and earlier, every time the operating system asks the browser to paint its window, we synchronously walk the layout frames and ask each frame to paint itself. When a windowless plugin frame is asked to paint, it synthesizes a WM_PAINT event and sends it to the plugin using NPP_HandleEvent. This is straightforward, but it does involve a blocking call and process round-trip for plugins which run in a separate process.

In Firefox 4, we don’t paint the plugin directly to the screen. Instead, as soon as the plugin is visible we ask it to paint to a retained buffer (an X surface on Linux, and a shared-memory DIBSection on Windows). We retain the pixel data for the next time Firefox is asked to paint. When using D3D rendering, we can eagerly upload the plugin data to a texture, and the plugin texture is composited by the GPU.
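The difference can be illustrated with a toy model. This is not Firefox code: RetainedPluginLayer, refresh, and composite are hypothetical names, and the "pixels" are whatever the stand-in plugin's paint() returns.

```javascript
// Toy model only; the names here are hypothetical, not Firefox internals.
function RetainedPluginLayer(plugin) {
  this.plugin = plugin;
  this.buffer = null; // pixel data kept between browser paints
}

// Ask the plugin to paint into the retained buffer (for example when it
// becomes visible or reports that its contents changed).
RetainedPluginLayer.prototype.refresh = function () {
  this.buffer = this.plugin.paint();
};

// Browser paint: reuse the retained pixels without calling the plugin.
RetainedPluginLayer.prototype.composite = function () {
  return this.buffer;
};
```

In this model, repainting the window any number of times costs no extra plugin calls; only refresh crosses over to the plugin.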

A Behavior Change: Opacity on Windows

On Windows, the new asynchronous painting API has one significant side effect: plugins responding to a WM_PAINT message must be aware of opacity. The device context which is passed to the plugin is backed by a DIBSection with an opacity channel. Certain Windows drawing functions, such as the DrawText function, are not aware of opacity and will incorrectly overwrite the opacity data, leaving black splotches where transparent text was intended. Windows drawing functions such as AlphaBlend are the correct way to draw while preserving transparency information.

Most Flash and Silverlight sites work correctly with this new function, but there are a few Flash features which continue to use the old Windows APIs. This bug shows itself in current Firefox nightly builds as black splotches where text should be painted, and is being tracked by Mozilla bug 611698; we are working with Adobe to resolve this issue before Firefox 4 is released.

Testing Wanted

Although our implementation of asynchronous painting passes all of our internal tests and appears to work well in general, the web is a big place and we can’t test every page or plugin available on the web. We would really like people who develop with plugins or use plugin-intensive sites to test Firefox nightly builds and report any bugs they see! These builds are updated to our most recent Firefox each night, so you will always have the latest and greatest features (and sometimes bugs) to experiment with.

Note to Flash Authors

If your site uses wmode="transparent" but your Flash application is not actually transparent or translucent, you can get better performance in both Firefox and Chrome by switching to wmode="opaque". Please use wmode="opaque" for content which does not need actual transparent behavior.

The Firefox Plugin Hang Detector

Wednesday, June 9th, 2010

If all goes well, Firefox 3.6.4 will be released with support for out-of-process plugins on Windows and Linux next week. As part of this project, Firefox now features a “hang detector” for plugins. The hang detector helps protect users from plugins and plugin scripts which stop responding for 10 seconds.

When Firefox makes an NPAPI call to a plugin which is being run in a separate plugin process, that call is translated into a message which is posted to the process. Firefox then waits for the response. If the plugin takes longer than 10 seconds to respond to a call, Firefox performs the following actions:

  • Stop the plugin process and collect a plugin-process “crash” minidump.
  • Collect a “crash” minidump from the browser process.
  • Terminate the plugin process.
  • Display the plugin-crashed UI, giving the user the opportunity to refresh the page and try again.

There are several reasons plugins might trigger the hang detector:

  • A plugin script (such as ActionScript run inside the Flash plugin) may be in an infinite loop or performing a very long computation.
  • The plugin itself may have a bug (such as a threading deadlock) which causes it to stop responding.
  • The plugin is not deadlocked, but is not processing events quickly enough, causing the event to linger waiting to be processed.
  • The implementation of Firefox out-of-process plugins may be causing a deadlock.

It is this last possibility that is most concerning, and we have pored over our Firefox crash stats studying the hang reports that we receive, trying to categorize the reports into one of the categories above. During the long process of Firefox 3.6.4 release candidates, we have identified and fixed several “tophangs”: see bug 561817 for an example.

When hang reports were first submitted to crash-stats, it was very difficult to distinguish full crashes from hangs. It was also impossible to correlate the browser and plugin parts of a hang. Since then, the Socorro team has done an outstanding job of improving the crash-stats UI so that we can analyze hang reports. It is now possible, using the advanced query page, to search for only crash reports or only hang reports, and to limit searches to either the browser process or the plugin process. Individual hang reports are now cross-linked: see the following reports for an example:

In this particular hang-pair, it isn’t immediately clear what is causing the hang. The browser is painting, and during the process of painting a windowless Flash plugin sends the NPP_SetWindow NPAPI call. This is the hanging call. At the time the hang report was collected, the NPAPI thread of the Flash plugin is calling NPN_InvalidateRect, and inside that implementation locking a mutex. But we don’t know, looking at the stacks, whether this is a deadlock on that mutex, or whether the plugin just happened to be making that call at the time the hang detector collected its stack.

In some cases, developers may want to disable the hang detector. If you are using the Flash debugger, or if you are debugging Firefox, you can change the hang detector timeout by changing the preference dom.ipc.plugins.timeoutSecs. Setting a value of -1 disables the hang detector entirely. For more information on setting preferences, see about:config on MozillaZine.
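The timeout check and the four actions listed earlier can be sketched as a small pure function. This is only an illustration of the described behavior, not Firefox's implementation; the function name and action labels are made up, and treating every negative timeout like -1 is an assumption.

```javascript
// The four actions listed above, as labels for this sketch.
var HANG_ACTIONS = [
  "collect plugin minidump",
  "collect browser minidump",
  "terminate plugin process",
  "show plugin-crashed UI"
];

// Returns the actions to take for a call that has been pending for
// `elapsedSecs`. A timeout of -1 disables the detector, as described above;
// this sketch treats any negative value the same way (an assumption).
function actionsForPendingCall(elapsedSecs, timeoutSecs) {
  if (timeoutSecs < 0) {
    return [];
  }
  return elapsedSecs > timeoutSecs ? HANG_ACTIONS.slice() : [];
}
```

Here timeoutSecs stands in for the value of dom.ipc.plugins.timeoutSecs.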

Firefox, Safe From Plugin Crashes

Wednesday, March 3rd, 2010

Today we released the first Mozilla Developer Preview containing multi-process plugins. Firefox is now safe from plugin crashes, on Windows and Linux.

Where Do I Get It?

Here!

How Can I Help?

Warning: the developer preview is alpha-quality and may contain bugs. If you are willing to browse on the wild side, please download and use the preview build as your day-to-day browser. Visit web pages which use plugins. We have done a fair bit of testing with the most popular plugins such as Flash and Silverlight, but there are many less popular plugins which may not be tested at all. If you don’t know what plugins are installed, go to Tools / Add-Ons / Plugins.

If you encounter any crashes, please make sure you submit the crash report and try to provide a detailed comment about what you were doing at the time of the crash.

If you encounter any unexpected behavior, please file a bug, including the following information:

  • The website you were visiting
  • Plugin information copied from about:plugins
  • Expected and actual behavior
  • Whether the behavior is fixed when you change the IPC preference (see below).

How Does It Work?

The Mozilla Developer Center has an article describing the underlying architecture: there is a shim layer which acts like a plugin in the browser process and a browser in the plugin process. Function calls are translated into RPC messages passed between the two processes.

One of the key pieces of technology we’ve developed to make message passing more reliable is IPDL. IPDL is a language which precisely describes the messages that can be passed between processes, and allows developers to define a state machine and error handling conditions for messages and resources shared across processes. IPDL layers on top of an IPC stack that Mozilla copied from the Chromium codebase. For instance, this protocol describes the messages associated with a plugin instance on a web page. Each message may be delivered asynchronously, synchronously, or with RPC semantics.
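To make the shim idea concrete, here is a toy illustration of turning function calls into messages. It is not IPDL and not how Firefox is written: makeShim and the channel object with its blocking call method are hypothetical stand-ins for the real protocol machinery.

```javascript
// Toy illustration of the shim idea; this is not IPDL or Firefox code.
// `channel.call(message)` is a hypothetical blocking transport that
// delivers a message to the other process and returns its reply.
function makeShim(channel, methodNames) {
  var shim = {};
  methodNames.forEach(function (name) {
    shim[name] = function () {
      var args = Array.prototype.slice.call(arguments);
      return channel.call({ method: name, args: args });
    };
  });
  return shim;
}
```

A caller would use the shim as if it were the plugin itself, for example makeShim(channel, ["NPP_SetWindow"]).NPP_SetWindow(win), while the real work happens on the other side of the channel.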

Common Questions

What happens when a plugin crashes?

When a plugin crashes, the Mozilla crash reporter kicks in and submits the crash report to Mozilla. Then we replace the plugin display with the crashed-plugin UI. When you reload the page, we restart the plugin process.

Why don’t you reload the plugin automatically?

We thought about this: however, web page scripts often have state associated with a plugin. If we reload the plugin without reloading the entire page, those scripts will have unexpected state and can get very confused. Overall, it causes fewer problems for the user to simply refresh the page.

What’s the name of the plugin process?

The name of the plugin process in the Windows task manager is mozilla-runtime.exe (mozilla-runtime on Linux).

Why is mozilla-runtime using so much memory?

In general, our automated tests show that Mozilla actually uses less overall memory than it did previously. However, there are some measurement issues which may cause problems: memory which is shared between the two processes, such as mapped memory segments and code, may be counted twice. The Chromium project has had similar problems accurately measuring and presenting memory usage information. If there are particular pages or plugins which show a regression in memory usage, please file a bug!

What about Mac?

MacOS presents some unique challenges: the traditional drawing and interaction model for plugins is very difficult to do across processes. We are working on Mac support for multi-process plugins, and hope to have a preview of this work available soon.

Preferences

Multi-process plugin behavior can be controlled from preferences:

dom.ipc.plugins.enabled.filename

Each plugin can be controlled independently. For example, if Acrobat is causing problems, you can run it in-process by setting the pref dom.ipc.plugins.enabled.nppdf32.dll to false. The filename should be lower-case.

dom.ipc.plugins.enabled (true)

This controls the multi-process behavior for all plugins which don’t have a specific pref set as above.

After these preferences are changed the browser must be restarted.
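The way the two preferences combine can be sketched as follows. The prefs argument is a plain object standing in for the preference store, and runsOutOfProcess is a hypothetical name; this is an illustration of the rules above, not the actual lookup code.

```javascript
// `prefs` is a plain object standing in for the preference store.
function runsOutOfProcess(prefs, pluginFilename) {
  // Per-plugin pref: the filename part is lower-case.
  var specific = "dom.ipc.plugins.enabled." + pluginFilename.toLowerCase();
  if (Object.prototype.hasOwnProperty.call(prefs, specific)) {
    return prefs[specific] === true;
  }
  // Fall back to the global pref, which the post lists as defaulting to true.
  if (Object.prototype.hasOwnProperty.call(prefs, "dom.ipc.plugins.enabled")) {
    return prefs["dom.ipc.plugins.enabled"] === true;
  }
  return true;
}
```

For example, with { "dom.ipc.plugins.enabled.nppdf32.dll": false } Acrobat runs in-process while every other plugin still gets its own process.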

Adobe Symbol Server: How Adobe Could Address Crash Issues

Thursday, February 18th, 2010

Since crash bugs are a top priority within Adobe, there is one relatively simple step Adobe should take which would make it much easier for everyone else to help Adobe track and diagnose crashes: implement a symbol server.

A symbol server is a public web server from which developers can fetch debugging information (PDB files) for released binaries. The Microsoft debuggers have excellent support for automatically pulling down symbols as they are needed in the debugger. Mozilla runs a symbol server for Firefox nightlies and releases, which is invaluable for people debugging and profiling Firefox without having to do a custom build. Microsoft runs a symbol server which contains debug information for Windows and many other Microsoft products, including the Silverlight plugin.

Debug information is not simply a way to get symbolic information from Flash. It is necessary in order to get any useful stack trace of the Mozilla code which is calling Flash. A common compiler optimization called frame pointer omission (FPO) avoids storing the frame pointer in the x86 EBP register, freeing that register up for general use. In order to walk the stack of this optimized code, the debugger has to query the frame size and frame pointer information from the PDB file. When debug information is not available, stack walking doesn’t produce usable results.

As an example, take the current #3 topcrash for nightly builds of Firefox (mozilla-central). The signature for this crash is NPSWF32.dll@0x1e7fe4. The stack traces from Mozilla’s crash reporting system are completely opaque:

Frame  Signature
0      NPSWF32.dll@0x1e7fe4
1      NPSWF32.dll@0x1ff471
2      NPSWF32.dll@0x2005bd
3      NPSWF32.dll@0x1fb195
4      NPSWF32.dll@0x1e02d1
5      NPSWF32.dll@0x17c22a
6      NPSWF32.dll@0x2959d
7      NPSWF32.dll@0x30386
8      @0x63aa15f
9      NPSWF32.dll@0x5bdef

Even worse, the crash signature depends on the particular version of Flash that is installed on the user’s computer. We can’t tell if a particular crash signature is fixed by a new revision of Flash because without symbols we can’t correlate crashes between different versions.

As part of developing multi-process plugins for Firefox, we are constantly dealing with unexpected plugin behaviors. Whenever we encounter a problem which can be reproduced in both Silverlight and Flash, we’ll always test with Silverlight, simply because Microsoft makes Silverlight symbols available through their symbol server and therefore we can actually step through their code and ours in a debugger.

Adobe should set up a symbol server for their three main plugins: Flash, Shockwave, and Acrobat. By implementing this simple tool, Adobe could help all browser vendors and interested hackers identify and fix bugs. If Adobe is concerned about people using full debug information to reverse-engineer details of their code, there is a way to strip the PDB files so that only frame-pointer information and function names remain.

Multi-Process Plugins on By Default

Wednesday, January 27th, 2010

Out-of-process plugins (OOPP) are now on by default in mozilla-central! Starting tomorrow morning, the mozilla-central nightly builds will load Flash and all other plugins in a separate process by default (on Windows and Linux). The Electrolysis team would love for people to test any plugins on their system, especially less-popular plugins.

Since we are moving relatively quickly with multi-process plugins, there are a few known issues to be aware of:

  • The plugin-crash UI is not finished. The current UI is just a non-localized dialog so that we can get crash reports from nightly testers. This will be changed soon!
  • On Windows, tearing/repainting issues when scrolling, bug 535295
  • On Linux, compiz effects and Flash don’t work together on some systems, bug 535612
  • On Windows, selecting “Print” option in Flash may lock up Firefox, bug 538918
  • On Windows, Hulu won’t switch to full-screen mode, bug 539658
  • On Linux with GTK+-2.18 or later, GDK assertions and a fatal XError, bug 540197
  • Firefox-process crashes at NPObjWrapper_NewResolve with Silverlight and sometimes Flash, bug 542263

If you discover crashes while running nightlies, please make sure you submit them, and check about:crashes for the crash ID and signature. We could use help making sure plugin-related crashes and instability are filed and tracked by searching for signatures here and filing bugs in the Core:Plug-Ins component.

If your browser hangs, you can probably recover by killing the mozilla-runtime process in the Windows task manager or via `kill` on Linux. If you are a developer with a debugger, please use the Mozilla symbol server and get stacks for both the Firefox process and the mozilla-runtime process and file a bug.

In some cases, it may be useful to the Electrolysis developers if you obtain a plugin log, which is a log of calls made between the plugin and the browser. Instructions for obtaining the log are available here.

I am very excited that we’ve made it this far, and I look forward to our next milestone release, which will backport these changes to the 1.9.2 release in preparation for Firefox Lorentz.

Flash in a separate process

If for some reason you need to disable multi-process plugins, set the pref dom.ipc.plugins.enabled to false.

Error calling method on NPObject!

Monday, January 25th, 2010

When a plugin crashes, content script may still have a reference to JS objects provided by that plugin. The JS objects will throw an exception “Error calling method on NPObject” when any properties or methods are called. Unfortunately, this generic error message is also thrown whenever a plugin method fails for any reason. You can’t tell, just by looking at the exception, whether the process crashed or some other type of failure occurred.

This is important when a test fails: there could be any number of different errors lurking under the surface with similar outward appearance. Today there was a Mochitest error with the following symptoms:

197 ERROR TEST-UNEXPECTED-FAIL | /tests/modules/plugin/test/test_painting.html | [SimpleTest/SimpleTest.js, window.onerror] An error occurred - Error calling method on NPObject! at http://localhost:8888/tests/modules/plugin/test/test_painting.html:105
PROCESS-CRASH | automation.py | application crashed (minidump found)
Thread 1 (crashed)
PROCESS-CRASH | automation.py | application crashed (minidump found)
Thread 1 (crashed)
PROCESS-CRASH | automation.py | application crashed (minidump found)
Thread 1 (crashed)
PROCESS-CRASH | automation.py | application crashed (minidump found)
Thread 1 (crashed)

Reading through the log, however, the important output is:

###!!! [RPCChannel][Child][/builds/moz2_slave/mozilla-central-linux/build/ipc/glue/RPCChannel.cpp:276] Assertion (mDeferred.empty() || 1 == mDeferred.size()) failed.  expected mDeferred to have 0 or 1 items, but it has %lu (triggered by rpc)
  local RPC stack size: 2863316886
  remote RPC stack guess: 8
  deferred stack size: 2863316886
  out-of-turn RPC replies stack size: 2863316886
  Pending queue size: 2863317142, front to back:

This assertion is immediately followed by an abort, which is visible in the crash dump output also:

Crash reason:  SIGSEGV
Crash address: 0xbdce4804

Thread 1 (crashed)
 0  libxul.so!mozilla::ipc::RPCChannel::DebugAbort(char const*, int, char const*, char const*, char const*, bool) [ipc_message.h:0235fc257969 : 97 + 0x0]

David B. mistakenly thought that this was a manifestation of Bug 541102 when in fact it is an entirely unrelated bug with similar symptoms. When in doubt about a crash, please check with a member of the Electrolysis team, who can help read the log and diagnose the problem.