The Firefox Plugin Hang Detector
If all goes well, Firefox 3.6.4 will be released with support for out-of-process plugins on Windows and Linux next week. As part of this project, Firefox now features a “hang detector” for plugins. The hang detector helps protect users from plugins and plugin scripts which stop responding for 10 seconds.
When Firefox makes an NPAPI call to a plugin which is being run in a separate plugin process, that call is translated into a message which is posted to the process. Firefox then waits for the response. If the plugin takes longer than 10 seconds to respond to a call, Firefox performs the following actions:
- Stop the plugin process and collect a plugin-process “crash” minidump.
- Collect a “crash” minidump from the browser process.
- Terminate the plugin process.
- Display the plugin-crashed UI, giving the user the opportunity to refresh the page and try again.
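The wait-then-recover sequence above can be sketched in a few lines of Python. This is only an illustration, not the Firefox implementation: the child process, the `call_plugin` helper, and the action strings are hypothetical stand-ins, and real minidump collection is platform-specific and is not attempted here.

```python
import subprocess
import sys

HANG_TIMEOUT_SECS = 10  # the default timeout described in the post

# Hypothetical stand-in for a plugin process: it reads one "call" from
# stdin, sleeps for the requested number of seconds, then replies.
PLUGIN_SOURCE = """
import sys, time
delay = float(sys.stdin.readline())
time.sleep(delay)
print("ok")
"""


def call_plugin(delay_secs, timeout_secs=HANG_TIMEOUT_SECS):
    """Send one message to a fresh child process and wait for its reply.

    Returns ("reply", text) if the child answers in time, or
    ("hang", actions) listing the recovery steps otherwise.
    """
    proc = subprocess.Popen(
        [sys.executable, "-c", PLUGIN_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    try:
        out, _ = proc.communicate(f"{delay_secs}\n", timeout=timeout_secs)
        return ("reply", out.strip())
    except subprocess.TimeoutExpired:
        actions = []
        # Placeholders only: no dumps are actually written in this sketch.
        actions.append("collect plugin minidump")
        actions.append("collect browser minidump")
        proc.kill()
        proc.communicate()  # reap the child and close its pipes
        actions.append("terminate plugin process")
        actions.append("show plugin-crashed UI")
        return ("hang", actions)
```

Note one simplification: the sketch waits for the child process to exit, whereas the browser waits for a reply to an individual message while the plugin process keeps running.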
There are several reasons plugins might trigger the hang detector:
- A plugin script (such as ActionScript run inside the Flash plugin) may be in an infinite loop or performing a very long computation.
- The plugin itself may have a bug (such as a threading deadlock) which causes it to stop responding.
- The plugin is not deadlocked, but is not processing events quickly enough, causing the event to linger waiting to be processed.
- The implementation of Firefox out-of-process plugins may be causing a deadlock.
It is this last possibility that is most concerning, and we have pored over our Firefox crash stats studying the hang reports that we receive, trying to categorize the reports into one of the categories above. During the long process of Firefox 3.6.4 release candidates, we have identified and fixed several “tophangs”: see bug 561817 for an example.
When hang reports were first submitted to crash-stats, it was very difficult to distinguish full crashes from hangs. It was also impossible to correlate the browser and plugin parts of a hang. Since then, the Socorro team has done an outstanding job of improving the crash-stats UI so that we can analyze hang reports. It is now possible, using the advanced query page, to search for only crash reports or only hang reports, and to limit searches to either the browser process or the plugin process. Individual hang reports are now cross-linked: see the following reports for an example:
In this particular hang-pair, it isn’t immediately clear what is causing the hang. The browser is painting, and while painting a windowless Flash plugin it sends the NPP_SetWindow NPAPI call. This is the hanging call. At the time the hang report was collected, the NPAPI thread of the Flash plugin was calling NPN_InvalidateRect and, inside that implementation, locking a mutex. But we don’t know, looking at the stacks, whether this is a deadlock on that mutex, or whether the plugin just happened to be making that call at the time the hang detector collected its stack.
In some cases, developers may want to disable the hang detector. If you are using the Flash debugger, or if you are debugging Firefox, you can change the hang detector timeout by changing the preference dom.ipc.plugins.timeoutSecs. Setting a value of -1 disables the hang detector entirely. For more information on setting preferences, see about:config on MozillaZine.
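To make the effect of the preference concrete, here is a minimal Python sketch of how a timeout value like this one might be applied to a blocking wait. The helper names and the prefs dictionary are hypothetical, and only two behaviours come from the post: the 10-second default and -1 meaning "disabled". How Firefox treats other values (such as 0) is not covered here.

```python
import queue

PREF_NAME = "dom.ipc.plugins.timeoutSecs"
DEFAULT_TIMEOUT_SECS = 10  # default described in the post


def effective_timeout(pref_value):
    """Map the preference value to a wait timeout in seconds.

    -1 disables the hang detector, modelled here as None (wait forever).
    Any other value is passed through unchanged (an assumption).
    """
    if pref_value == -1:
        return None
    return pref_value


def wait_for_reply(replies, prefs):
    """Wait for a reply on a queue; return None if the wait times out."""
    timeout = effective_timeout(prefs.get(PREF_NAME, DEFAULT_TIMEOUT_SECS))
    try:
        return replies.get(timeout=timeout)
    except queue.Empty:
        return None
```

With the preference at -1 the wait never times out, which is what you want when a breakpoint holds the plugin for longer than the default 10 seconds.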
June 10th, 2010 at 3:50 pm
Instead of just killing it can we ask the user if they would like to kill or wait?
There is a game that I play that sometimes gets bogged down in computations (a lot going on at once) such that it will stop responding, but eventually it returns. It doesn’t happen every time I play the game. It doesn’t have a save feature so I would get pretty annoyed if the browser killed it causing me to lose my place.
June 10th, 2010 at 5:12 pm
Also, is there an idea for when we can expect out-of-process plugins on the Mac?
June 14th, 2010 at 7:51 pm
[…] case a plug happens to hang and stop responding, Firefox 3.6.4 will have a hang detector that will turn off the plugin processes that’s hanging after ten seconds. It will be possible […]
June 19th, 2010 at 12:47 pm
Mark, we decided not to. The slow script dialog (JS) is already too confusing: users don’t understand what page/tab caused the dialog, and they usually don’t have enough information to make an informed decision about whether they want to wait. The same problem happens with plugins: since we’re deep in the stack and the plugin can’t paint (it’s hung!), you can’t show the user the page, and even if you did they wouldn’t understand that other Flash instances might be the “real” cause of the hang. It’s better to just start over.
And OOPP on Mac will be available on Snow Leopard in Firefox 4.
June 23rd, 2010 at 4:30 pm
Is there a technical limitation of Leopard (the operating system) preventing OOPP from being implemented for it?
Thanks for the work on Windows and Linux. Turns out that Flash problems are most prevalent on Mac as compared to the other platforms. For a Mac user like me, this issue has made me adopt Safari…
June 23rd, 2010 at 4:38 pm
Manoj, yes. Snow Leopard introduced new Core Animation features which let you share a CALayer between processes. This is not directly available on Leopard, and the workaround uses private APIs and has lots of issues. Rather than try and solve the Leopard problems, we made an engineering decision to focus on Snow Leopard.
June 25th, 2010 at 3:55 pm
[…] This will disable the hang detector that was added. It’s the same reason I don’t do any debugging in Chrome (alongside the lack of HttpFox). Sometimes a break point needs more than 10-15 seconds for me to figure out what’s going on. Check out more details here […]
June 25th, 2010 at 9:17 pm
I’ve been putting up with the Flash debugger crashing in Firefox for weeks, I just assumed it was Flash’s fault. But now I find out it was Firefox’s fault, only it didn’t tell me, and blamed the plugin instead, because you thought I might get confused??
If you really won’t do a popup like you already do for JS, and like Chrome does for both JS and plugins, I suggest changing the wording when Firefox kills a plugin due to the hang detector, to let users know that it is a different type of failure than a normal plugin crash. Maybe even a “Read more…” link that mentions the ability to turn off the hang detector.
July 23rd, 2010 at 3:35 pm
Hey Benjamin,
Thanks for the response – I was unaware of the CALayer issue. So Safari and Chrome will not have OOPP-like functionality for Leopard either?
Thanks,
Manoj