Archive for the 'Mozilla' Category

Adobe Symbol Server: How Adobe Could Address Crash Issues

Thursday, February 18th, 2010

Since crash bugs are a top priority within Adobe, there is one relatively simple step Adobe should take which would make it much easier for everyone else to help Adobe track and diagnose crashes: implement a symbol server.

A symbol server is a public web server from which developers can fetch debugging information (PDB files) for released binaries. The Microsoft debuggers have excellent support for automatically pulling down symbols as they are needed in the debugger. Mozilla runs a symbol server for Firefox nightlies and releases, which is invaluable for people debugging and profiling Firefox without having to do a custom build. Microsoft runs a symbol server which contains debug information for Windows and many other Microsoft products, including the Silverlight plugin.
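To make the fetching step concrete, here is a minimal Python sketch of how a debugger typically forms the URL it requests for a PDB file. It assumes the usual symbol store layout (file name, then the GUID with the age appended, then the file name again). The server address, PDB name, GUID, and age are invented for illustration and do not refer to any real Adobe server.

```python
def pdb_symbol_url(server: str, pdb_name: str, guid: str, age: int) -> str:
    """Build the URL a debugger would request for one PDB file.

    Assumes the conventional symbol store layout:
    <server>/<pdb name>/<GUID + age>/<pdb name>
    """
    # The identifier is the GUID with dashes removed and upper-cased,
    # followed by the age written as unpadded hexadecimal.
    identifier = guid.replace("-", "").upper() + format(age, "X")
    return f"{server.rstrip('/')}/{pdb_name}/{identifier}/{pdb_name}"


# Hypothetical values, for illustration only.
url = pdb_symbol_url(
    "https://symbols.example.com/",
    "NPSWF32.pdb",
    "3f2504e0-4f89-11d3-9a0c-0305e82c3301",
    2,
)
print(url)
```

Because the identifier is derived from the binary itself, the debugger can find the matching PDB for whatever build the user happens to have installed.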

Debug information is not simply a way to get symbolic information from Flash. It is necessary in order to get any useful stack trace of the Mozilla code which is calling Flash. A common compiler optimization called frame pointer omission (FPO) avoids storing the frame pointer in the x86 EBP register, freeing that register up for general use. In order to walk the stack of this optimized code, the debugger has to query the frame size and frame pointer information from the PDB file. When debug information is not available, stack walking doesn’t produce usable results.
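The dependence on frame sizes can be sketched with a toy stack walker in Python. This is a simplified model, not how any real debugger is implemented: the stack is a dictionary of addresses, every function is assumed to occupy a 4 KB-aligned range, and the `FRAME_SIZES` table stands in for the per-function data a PDB file would supply. All addresses and sizes are made up.

```python
def walk_fpo_stack(memory, esp, pc, frame_size_for, depth_limit=16):
    """Walk a stack of frame-pointer-omitted code.

    `frame_size_for(pc)` plays the role of the debug information: it returns
    the number of bytes of locals for the function containing `pc`, or None
    when no debug information is available.
    """
    return_addresses = []
    for _ in range(depth_limit):
        size = frame_size_for(pc)
        if size is None:         # no debug info: we cannot find the return address
            break
        pc = memory[esp + size]  # the return address sits just past the locals
        if pc == 0:              # sentinel marking the outermost frame
            break
        return_addresses.append(pc)
        esp += size + 4          # pop the locals and the return address
    return return_addresses


# Hypothetical debug info: function start address -> bytes of locals.
FRAME_SIZES = {0x3000: 4, 0x2000: 12, 0x1000: 8}


def frame_size_for(pc):
    return FRAME_SIZES.get(pc & ~0xFFF)


# Hypothetical stack contents: only the return-address slots matter here.
STACK = {0x104: 0x2020, 0x114: 0x1030, 0x120: 0}

with_symbols = walk_fpo_stack(STACK, 0x100, 0x3010, frame_size_for)
without_symbols = walk_fpo_stack(STACK, 0x100, 0x3010, lambda pc: None)
print([hex(a) for a in with_symbols], without_symbols)
```

With the frame sizes available the walker recovers both callers; without them it cannot take even the first step, which is the situation a browser developer is in when the plugin frames sit between the crash and the browser code.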

As an example, take the current #3 topcrash for nightly builds of Firefox (mozilla-central). The signature for this crash is NPSWF32.dll@0x1e7fe4. The stack traces from Mozilla’s crash reporting system are completely opaque:

Frame  Signature
0      NPSWF32.dll@0x1e7fe4
1      NPSWF32.dll@0x1ff471
2      NPSWF32.dll@0x2005bd
3      NPSWF32.dll@0x1fb195
4      NPSWF32.dll@0x1e02d1
5      NPSWF32.dll@0x17c22a
6      NPSWF32.dll@0x2959d
7      NPSWF32.dll@0x30386
8      @0x63aa15f
9      NPSWF32.dll@0x5bdef

Even worse, the crash signature depends on the particular version of Flash that is installed on the user’s computer. We can’t tell whether a particular crash is fixed by a new revision of Flash because, without symbols, we can’t correlate crashes between different versions.
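As a rough illustration of why symbols solve the correlation problem, the Python sketch below maps a raw module offset to the function that contains it. The symbol tables and function names are entirely hypothetical (nothing is known here about Flash internals); only the offset 0x1e7fe4 comes from the crash signature above.

```python
import bisect


def symbolicate(symbols, offset):
    """Return the name of the function containing `offset`, or None.

    `symbols` is a list of (start_offset, name) pairs sorted by start offset.
    """
    starts = [start for start, _ in symbols]
    index = bisect.bisect_right(starts, offset) - 1
    return symbols[index][1] if index >= 0 else None


# Hypothetical symbol tables for two builds of the same plugin.
SYMBOLS_V1 = [(0x1000, "Player::Init"), (0x1e7f00, "Renderer::Draw"), (0x1ff400, "Player::Tick")]
SYMBOLS_V2 = [(0x1000, "Player::Init"), (0x1e9a40, "Renderer::Draw"), (0x201000, "Player::Tick")]

crash_v1 = symbolicate(SYMBOLS_V1, 0x1e7fe4)  # offset from the signature above
crash_v2 = symbolicate(SYMBOLS_V2, 0x1e9b24)  # made-up offset in a newer build
print(crash_v1, crash_v2)
```

The two raw offsets look unrelated, but once they resolve to the same function name the reports can be grouped into a single signature that survives plugin updates.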

As part of developing multi-process plugins for Firefox, we are constantly dealing with unexpected plugin behaviors. Whenever we encounter a problem which can be reproduced in both Silverlight and Flash, we always test with Silverlight, simply because Microsoft makes Silverlight symbols available through their symbol server and therefore we can actually step through their code and ours in a debugger.

Adobe should set up a symbol server for their three main plugins: Flash, Shockwave, and Acrobat. By implementing this simple tool, Adobe would make it possible for all browser vendors and interested hackers to help identify and fix bugs. If Adobe is concerned that full debug information could be used to reverse-engineer details of their code, there is a way to strip the PDB files so that only frame-pointer information and function names remain.

Multi-Process Plugins on By Default

Wednesday, January 27th, 2010

Out-of-process plugins (OOPP) are now on by default in mozilla-central! Starting tomorrow morning, the mozilla-central nightly builds will load Flash and all other plugins in a separate process by default (on Windows and Linux). The Electrolysis team would love for people to test any plugins on their system, especially less-popular plugins.

Since we are moving relatively quickly with multi-process plugins, there are a few known issues to be aware of:

  • The plugin-crash UI is not finished. The current UI is just a non-localized dialog so that we can get crash reports from nightly testers. This will be changed soon!
  • On Windows, tearing/repainting issues when scrolling, bug 535295
  • On Linux, compiz effects and Flash don’t work together on some systems, bug 535612
  • On Windows, selecting “Print” option in Flash may lock up Firefox, bug 538918
  • On Windows, Hulu won’t switch to full-screen mode, bug 539658
  • On Linux with GTK+-2.18 or later, GDK assertions and a fatal XError, bug 540197
  • Firefox-process crashes at NPObjWrapper_NewResolve with Silverlight and sometimes Flash, bug 542263

If you discover crashes while running nightlies, please make sure you submit them, and check about:crashes for the crash ID and signature. We could use help making sure plugin-related crashes and instability are filed and tracked by searching for signatures here and filing bugs in the Core:Plug-Ins component.

If your browser hangs, you can probably recover by killing the mozilla-runtime process in the Windows task manager or via `kill` on Linux. If you are a developer with a debugger, please use the Mozilla symbol server and get stacks for both the Firefox process and the mozilla-runtime process and file a bug.

In some cases, it may be useful to the Electrolysis developers if you obtain a plugin log, which is a log of calls made between the plugin and the browser. Instructions for obtaining the log are available here.

I am very excited that we’ve made it this far, and I look forward to our next milestone release, which will backport these changes to the 1.9.2 release in preparation for Firefox Lorentz.

Flash in a separate process

If for some reason you need to disable multi-process plugins, set the pref dom.ipc.plugins.enabled to false.

Error calling method on NPObject!

Monday, January 25th, 2010

When a plugin crashes, content script may still have a reference to JS objects provided by that plugin. The JS objects will throw an exception “Error calling method on NPObject” when any properties or methods are called. Unfortunately, this generic error message is also thrown whenever a plugin method fails for any reason. You can’t tell, just by looking at the exception, whether the process crashed or some other type of failure occurred.

This is important when a test fails: there could be any number of different errors lurking under the surface with similar outward appearance. Today there was a Mochitest error with the following symptoms:

197 ERROR TEST-UNEXPECTED-FAIL | /tests/modules/plugin/test/test_painting.html | [SimpleTest/SimpleTest.js, window.onerror] An error occurred - Error calling method on NPObject! at http://localhost:8888/tests/modules/plugin/test/test_painting.html:105
PROCESS-CRASH | automation.py | application crashed (minidump found)
Thread 1 (crashed)
PROCESS-CRASH | automation.py | application crashed (minidump found)
Thread 1 (crashed)
PROCESS-CRASH | automation.py | application crashed (minidump found)
Thread 1 (crashed)
PROCESS-CRASH | automation.py | application crashed (minidump found)
Thread 1 (crashed)

Reading through the log, however, the important output is:

###!!! [RPCChannel][Child][/builds/moz2_slave/mozilla-central-linux/build/ipc/glue/RPCChannel.cpp:276] Assertion (mDeferred.empty() || 1 == mDeferred.size()) failed.  expected mDeferred to have 0 or 1 items, but it has %lu (triggered by rpc)
  local RPC stack size: 2863316886
  remote RPC stack guess: 8
  deferred stack size: 2863316886
  out-of-turn RPC replies stack size: 2863316886
  Pending queue size: 2863317142, front to back:

This assertion is immediately followed by an abort, which is visible in the crash dump output also:

Crash reason:  SIGSEGV
Crash address: 0xbdce4804

Thread 1 (crashed)
 0  libxul.so!mozilla::ipc::RPCChannel::DebugAbort(char const*, int, char const*, char const*, char const*, bool) [ipc_message.h:0235fc257969 : 97 + 0x0]

David B. mistakenly thought that this was a manifestation of Bug 541102 when in fact it is an entirely unrelated bug with similar symptoms. When in doubt about a crash, please ask a member of the Electrolysis team to help read the log and diagnose the problem.
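Since the generic exception hides the real cause, a first pass over a test log can be automated. The Python sketch below is one possible approach, not an existing Mozilla tool: the `summarize_log` helper is hypothetical, and it only relies on the line prefixes visible in the excerpts above. The sample lines are copied from those excerpts, in no particular order.

```python
def summarize_log(lines):
    """Pull out the lines that distinguish a crash from an ordinary failure."""
    summary = {"assertions": [], "process_crashes": 0, "npobject_errors": 0}
    for line in lines:
        if line.startswith("###!!!"):
            # Fatal IPC assertions are logged with this prefix.
            summary["assertions"].append(line.strip())
        elif line.startswith("PROCESS-CRASH"):
            summary["process_crashes"] += 1
        elif "Error calling method on NPObject" in line:
            summary["npobject_errors"] += 1
    return summary


sample = [
    "197 ERROR TEST-UNEXPECTED-FAIL | /tests/modules/plugin/test/test_painting.html | "
    "[SimpleTest/SimpleTest.js, window.onerror] An error occurred - "
    "Error calling method on NPObject! at "
    "http://localhost:8888/tests/modules/plugin/test/test_painting.html:105",
    "PROCESS-CRASH | automation.py | application crashed (minidump found)",
    "###!!! [RPCChannel][Child][/builds/moz2_slave/mozilla-central-linux/build/ipc/glue/"
    "RPCChannel.cpp:276] Assertion (mDeferred.empty() || 1 == mDeferred.size()) failed.  "
    "expected mDeferred to have 0 or 1 items, but it has %lu (triggered by rpc)",
]

report = summarize_log(sample)
print(report["process_crashes"], report["npobject_errors"], len(report["assertions"]))
```

A report with an assertion line is a strong hint that the NPObject error is a symptom of a process abort rather than an ordinary failed plugin call, but it does not replace reading the log.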

Firefox 3.6

Thursday, January 21st, 2010

We released Firefox 3.6 today. If you are currently running Firefox, choose “Check for Updates” from the Help menu. If you aren’t, go get Firefox 3.6 now! One of our most popular new features is Personas, which you can use to style Firefox the way you want. We’ve also made Firefox faster, more responsive, and more secure than ever.

Multi-Process Plugins

Tuesday, December 15th, 2009

Yesterday I landed multi-process plugin support in mozilla-central. By default, this capability is disabled, because there are still some serious bugs. But if you are willing to suffer some temporary instability, we could really use some help testing Minefield nightlies with out-of-process plugins (OOPP).

Currently only Windows and Linux support multi-process plugins: Mac support requires additional work. To turn OOPP on, visit about:config, find the pref dom.ipc.plugins.enabled, set it to true, and restart your browser. Please report any crashes or instability in Bugzilla: product “Core”, component “Plug-Ins”. Please make bug reports as detailed as possible:

  • Operating system: please be specific about Windows versions, since Windows XP and Windows Vista deliver some Windows events differently;
  • Page visited;
  • Plugin data from about:plugins;
  • Whether turning IPC off fixes the problem (Note: flipping the pref usually requires restarting the browser to take effect).

There is one major known bug right now: any plugin which is installed in a path with spaces fails to load. On Windows, this affects almost everything except Flash. I hope to have this fixed in tomorrow’s nightly. There is a tracking bug for all the known issues which prevent us from turning on OOPP by default.

Please direct any questions about this work to the mozilla.dev.tech.plugins discussion list.

Mozilla Status Updates

Tuesday, December 15th, 2009

If you work on Mozilla, how do you coordinate with other people? How do you let people know what you’re working on and ask for help without burdening your coworkers with unwanted email? As part of coordinating the Electrolysis project, I created a webtool which allows people to post status reports in a low-touch way. The Mozilla Status Board allows members of the Mozilla community to post status updates which will be distributed to other people on their teams, and to the public.

A status report is simple: you list the items that have been accomplished and what you plan to work on next. Finally, you can list items that other members of your team may need to see, such as review requests, links to posted designs, or even vacation days.

One of the important design considerations was not forcing users into one communication medium. Users group themselves into teams, and each user can decide whether to receive email updates from their team or subscribe to a web feed. For example, if you want to see my personal status updates, you can visit my status page or subscribe to my personal feed. And if I want to use a feed reader, I can subscribe to the posts of everyone on my team. Hint: to change your email settings, visit the preferences page linked from the header.

Everyone in the Mozilla community is invited to use the status board. In order to keep spammers away, registrations require a password: ask somebody who has already registered, or ask in irc.mozilla.org #developers, or ask any Mozilla employee.

The status board was written in Python using CherryPy and Genshi. The code is hosted at hg.mozilla.org and I am happy to take patches or suggestions. At some point I will probably try to transfer the site from my own server to some Mozilla server, but I’ll make sure that links keep working and data is migrated.

If you’ve been using the status board already, note that I just fixed a bug in the email system: daily/weekly emails were being delivered incorrectly, so starting tonight they should be delivered correctly.

Update: Fixed an issue where Firefox wouldn’t remember your username correctly; renamed “Tags” to “Coordination” to make its intended purpose more obvious, and enabled markdown.

Multi-Process Fennec

Friday, October 30th, 2009

Today Joe Drew, Olli Pettay, and I have gotten Mobile Firefox (Fennec) working with a separate process for rendering. It’s a significant achievement, because even though we had to hack out some Fennec features, it’s already a fairly functional browser. Olli made a screencast showing the browser in action:

Getting Fennec working was difficult partly because the mobile Firefox code uses a different drawing system: instead of displaying a native scrollable widget, the mobile code uses a cache of “tiles” to display the web page. This allows them to display certain kinds of content over the web page, as well as have better control and speed when scrolling, zooming, and performing other interactions.

In order to get all this working with multiple processes, the group attacked pieces of the problem separately. Joe Drew implemented a new method on the canvas element: asyncDrawXULElement. This call, very similar to drawWindow, will asynchronously ask the content process to draw a tile (or part of a tile).

Olli implemented various interaction fixes: forwarding mouse events from the tiles to the content process, forwarding some important events such as MozAfterPaint from the content process back to the chrome process, and fixing widget focus in the embedded browser so that keystrokes are sent to it correctly.

Finally, I modified the Mozilla frame loader and subdocument frame such that “remote frames” could work correctly even without a docshell. I then hacked up the Fennec sources so that it would also work without a docshell, mainly by commenting out the security UI and zoom-to-element features which require additional information from the content process.

Now that it’s working, we hope to be able to bring additional developers in to fix up the features which we hacked around, fix DOM features which are currently broken such as link targeting, and start getting much better measurements for interactive performance and memory usage.

Mousewheel Zoom Eureka!

Wednesday, October 28th, 2009

In Firefox, you can make the page text larger and smaller by holding down the Control key and rolling your mouse scroll wheel up and down. Before continuing to read, I want you to think about which direction makes more sense: should scrolling down make the page get bigger or smaller?

(more…)

IPDL: The Inter-Process Protocol Definition Language

Tuesday, September 8th, 2009

IPDL is the language that Mozilla is using to describe all the messages between processes. Invented by Chris Jones, the IPDL language makes it easier for us to write type-safe and secure code by generating a lot of the basic validation code involved with messages.

When Chris was at Mozilla Headquarters a few weeks ago, he presented a tech-talk on IPDL which has been recorded and is available for download or viewing:

In the presentation, Chris explains the motivations for IPDL, demonstrates basic usage, and answers many questions about the limitations and benefits of using IPDL. I encourage anyone who is interested in Mozilla’s Multi-Process work to watch the presentation.

There are also some IPDL protocols in the tree if you’d like to read them: see the dom/ipc directory in the Electrolysis branch.
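The idea of generating validation code from a declarative description can be illustrated with a toy Python analogy. To be clear, this is not IPDL syntax and not how the IPDL compiler works; the protocol, message names, and fields below are invented purely to show the general pattern of declaring messages once and deriving the checks from the declaration.

```python
def make_validator(protocol):
    """Build a message checker from a declarative protocol description.

    `protocol` maps a message name to a dict of field name -> expected type.
    """
    def validate(name, payload):
        fields = protocol.get(name)
        if fields is None:
            raise ValueError(f"unknown message: {name}")
        if set(payload) != set(fields):
            raise ValueError(f"{name}: expected fields {sorted(fields)}")
        for field, expected in fields.items():
            if not isinstance(payload[field], expected):
                raise TypeError(f"{name}.{field}: expected {expected.__name__}")
        return True

    return validate


# Hypothetical protocol description, for illustration only.
TOY_PROTOCOL = {
    "Paint": {"x": int, "y": int, "width": int, "height": int},
    "SetURL": {"url": str},
}

validate = make_validator(TOY_PROTOCOL)
print(validate("SetURL", {"url": "about:blank"}))
```

Because the receiving side rejects anything that does not match the declaration, hand-written message handlers never see a malformed or unexpected message, which is the kind of boilerplate the post says IPDL generates.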

Taras Owns libjar

Wednesday, August 19th, 2009

When Taras Glek stepped up and made major improvements to the performance of libjar, we couldn’t resist making him the module owner. libjar hasn’t had an official owner for as long as I can remember. I would occasionally do reviews, as would the networking owners (biesi/bz), but the code was effectively orphaned and unmaintained. Taras is already changing that in a big way. In a quick IRC conversation with Brendan today, Taras became the new owner of the JAR code. Congratulations Taras!