If all goes well, Firefox 3.6.4 will be released with support for out-of-process plugins on Windows and Linux next week. As part of this project, Firefox now features a “hang detector” for plugins. The hang detector helps protect users from plugins and plugin scripts which stop responding for 10 seconds.
When Firefox makes an NPAPI call to a plugin which is being run in a separate plugin process, that call is translated into a message which is posted to the process. Firefox then waits for the response. If the plugin takes longer than 10 seconds to respond to a call, Firefox performs the following actions:
- Stop the plugin process and collect a plugin-process “crash” minidump.
- Collect a “crash” minidump from the browser process.
- Terminate the plugin process.
- Display the plugin-crashed UI, giving the user the opportunity to refresh the page and try again.
There are several reasons plugins might trigger the hang detector:
- A plugin script (such as ActionScript run inside the Flash plugin) may be in an infinite loop or performing a very long computation.
- The plugin itself may have a bug (such as a threading deadlock) which causes it to stop responding.
- The plugin is not deadlocked, but is not processing events quickly enough, causing the event to linger waiting to be processed.
- The implementation of Firefox out-of-process plugins may be causing a deadlock.
It is this last possibility that is most concerning, and we have pored over our Firefox crash stats studying the hang reports that we receive, trying to categorize the reports into one the categories above. During the long process of Firefox 3.6.4 release candidates, we have identified and fixed several “tophangs”: see bug 561817 for an example.
When hang reports were first submitted to crash-stats, it was very difficult to distinguish full crashes from hangs. It was also impossible to correlate the browser and plugin parts of a hang. Since then, the Socorro team has done an outstanding job of improving the crash-stats UI so that we can analyze hang reports. It is now possible, using the advanced query page, to search for only crash reports, only hang reports, and to limits searches to either the browser process or the plugin process. Individual hang reports are now cross-linked: see the following reports for an example:
In this particular hang-pair, it isn’t immediately clear what is causing the hang. The browser is painting, and during the process of painting a windowless Flash plugin sends the NPP_SetWindow NPAPI call. This is the hanging call. At the time the hang report was collected, the NPAPI thread of the Flash plugin is calling NPN_InvalidateRect, and inside that implementation locking a mutex. But we don’t know, looking at the stacks, whether this is a deadlock on that mutex, or whether the plugin just happened to be making that call at the time the hang detector collected its stack.
In some cases, developers may want to disable the hang detector. If you are using the Flash debugger, or if you are debugging Firefox, you can change the hang detector timeout by changing the preference dom.ipc.plugins.timeoutSecs. Setting a value of -1 disables the hang detector entirely. For more information on setting preferences, see about:config on MozillaZine.