Mozilla Layout Classes: nsIFrame class hierarchy

Friday, January 9th, 2009

The Mozilla layout code uses frame objects to lay out the DOM on the screen. (These are entirely different from <frame> nodes in the DOM.) All Mozilla frames inherit from a single C++ abstract class, nsIFrame. As part of a project I’m working on to separate the frame classes from XPCOM, I used dehydra to generate a graph of all the frame types in Mozilla.

See the graph in SVG format.

What I really want for Christmas is a web-based interactive graph viewer for this type of content. I’ve seen a couple closed-source things in Java, but nothing really exciting or hackable.

Note: Viewing this graph in Safari won’t work, because the image is much larger than a single screen, and Safari doesn’t provide scrollbars for SVG that overflows. Feel free to download it and view it in Inkscape, though, or get Firefox!

Generated Documentation, part 2

Wednesday, November 26th, 2008

As I noted previously, I’ve been using our static analysis tools to generate documentation for the Mozilla string classes.

All of the code to generate this documentation is now checked in to mozilla-central. To regenerate documentation or hack the scripts, you will first need to build with static-checking enabled. Then, simply run the following command:

make -C xpcom/analysis classapi

To automatically upload the documentation to the Mozilla Developer Center, run the following command:

MDC_USER="Your Username" MDC_PASSWORD="YourPassword" make -C xpcom/analysis upload_classapi

One of the really exciting things about the Dehydra static-analysis project is that the analysis is not baked into any compiler. You can version your analysis scripts as part of your source code, run them from within your build system, and change them as your analysis needs change.

For example, I decided that a class inheritance diagram would help people understand the Mozilla string classes. So I modified the documentation script to produce graphviz output in addition to the standard XML markup. I then process the graphviz output to PNG with an imagemap and upload it to MDC along with the other output as an attachment (see note 1).

The output is available now. I’m still looking for volunteers to improve the output as well as the source comments to make it all clearer!

1. There is a MediaWiki extension so you can put graphviz markup directly in a wiki page and it will be transformed automatically. However, this extension currently doesn’t work on the Mozilla Developer Center. It’s being tracked in bug 463464 if you’re interested.

Generating Documentation With Dehydra

Tuesday, September 30th, 2008

One of the common complaints about the Mozilla string code is that it’s very difficult to know what methods are available on a given class. Reading the code is very difficult because it’s hidden behind a complex set of #defines, it’s parameterized for both narrow and wide strings, and because we have a deep and complex string hierarchy. The Mozilla Developer Center has a string guide, but not any useful reference documentation.

With a little hand-holding, static analysis tools can produce very useful reference documentation, which other tools simply cannot make. For example, because a static analysis tool knows the accessibility of methods, you can create a reference document that contains only the public API of a class. I spent parts of yesterday and today tweaking a Dehydra script to produce a string reference. I’m working with Eric Shepherd to figure out the best way to automatically upload the data to the Mozilla Developer Center, but I wanted to post a sample for comment. This is the public API of nsACString:

I am trying to keep the format of this document similar to the format we use for interfaces on MDC. It’s a bit challenging, because C++ classes have overloaded method names and frequently have many methods. In the method summary, I have grouped together all the methods with the same name.

Once the output and format are tweaked, I can actually hook the entire generation and upload process to a makefile target, and either run it on my local machine or hook it up to a buildbot. I used E4X to do the actual XML generation. It was a learning experience… I don’t think I’m a fan. I want Genshi for JavaScript. Making conditional constructs in E4X is slightly ugly, and making looping constructs is really painful: my kingdom for an XML generator so that I don’t have to loop and append to an XMLList.

Code Analysis and Rewriting at OSCON 2008

Friday, July 18th, 2008

I’m going to be at OSCON next week. Taras and I will be hosting a BoF session on using automatic analysis and rewriting tools for open-source projects on Wednesday evening. You’re welcome to come and learn about our tools, watch demos, or just heckle and meet!

I’ve been to OSCON once before. I got a chance to meet people in person I only knew via email, and also meet some new people who had read my blog or otherwise knew of me. It was a blast. I hope to meet even more people this time around. Many of the official sessions don’t look that exciting to me, so I might spend a decent amount of time at OSCamp or just talking to interesting people. If you’d like to meet me and can’t make the BoF session, you can probably catch me at the Mozilla booth.

I’ll be staying the following weekend in Portland, so if you have recommendations of restaurants or sights that I shouldn’t miss while I’m there, comments welcome.

Static-Checking Tinderbox Online

Monday, June 30th, 2008

Today I set up a buildbot/tinderbox for the mozilla-central codebase built with static checking. This allows us to enforce annotations such as NS_FINAL_CLASS and NS_STACK_CLASS throughout our codebase. See the static-analysis-bsmedberg tree on the mozilla-central tinderbox.

Stack-only Classes

As an example, today I annotated nsAutoString as a stack class. If someone commits code which allocates an nsAutoString on the heap, the static-checking tinderbox will turn red.

I have a set of similar patches in the works which mark various helper classes as stack-only. These patches are needed because the XPCOMGC rewrites treat stack-only classes differently from regular heap-allocated types.

Help Wanted

A while back Dave Mandelin wrote an analysis of outparam usage. This is now running and producing warnings. I would like to find some help to go through the fairly large number of warnings this analysis produces and find the real bugs and fix the bogus warnings in the analysis.

The most important warning to check is

warning: outparam not written on NS_SUCCEEDED

This indicates a condition where the analyzer can’t prove that an outparam was written, but the method returned NS_SUCCEEDED anyway. This can lead to uninitialized memory errors and odd latent bugs. If you’d like to help, please hop over to the #mmgc IRC channel, and dmandelin or I can help walk you through the analysis/fixing process.
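To make the warning concrete, here is a minimal sketch of the pattern it flags. The nsresult typedef, the constants, and the NS_SUCCEEDED function are simplified stand-ins for the real XPCOM macros, and LookupBuggy/LookupFixed are hypothetical functions invented for illustration; this is not output from the analysis itself.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-ins for the XPCOM result type and helpers (assumptions:
// the real ones are macros defined in the XPCOM headers).
typedef std::uint32_t nsresult;
const nsresult NS_OK = 0;
const nsresult NS_ERROR_FAILURE = 0x80004005;
inline bool NS_SUCCEEDED(nsresult aRv) { return (aRv & 0x80000000) == 0; }

// Hypothetical buggy getter: the "not found" path reports success but
// never writes the outparam, so the caller reads whatever was there before.
inline nsresult LookupBuggy(int aKey, int* aResult) {
  if (aKey == 1) {
    *aResult = 100;
    return NS_OK;
  }
  return NS_OK;  // outparam not written on NS_SUCCEEDED
}

// Hypothetical fixed getter: every success path writes the outparam, and
// the "not found" path reports failure.
inline nsresult LookupFixed(int aKey, int* aResult) {
  *aResult = 0;
  if (aKey == 1) {
    *aResult = 100;
    return NS_OK;
  }
  return NS_ERROR_FAILURE;
}
```

A caller that only checks NS_SUCCEEDED on the buggy version will happily use a stale or uninitialized value, which is exactly the kind of latent bug the warning is meant to surface.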

Local Machine

Because the dehydra/treehydra codebase is still in flux prior to the 1.0 release, I am currently maintaining this on one of my local machines, so if it goes down please don’t pester Mozilla’s IT or release teams. Once dehydra 1.0 is released, we will turn on static checking on the main unit-test tinderboxes maintained by the Mozilla release team.

Has GCC Dehydra Replaced Elsa?

Sunday, March 9th, 2008

No.

GCC Dehydra allows us to do analysis passes on our existing code. In the far future it may also allow us to do optimization passes. But it does not have the ability to do code rewriting, and probably won’t gain that ability any time soon. In order to be able to do C++->C++ transformations, you have to have correct end positions for all statements/tokens, not just beginning positions. GCC does not track this information, and making it do so would be a massive undertaking.

Mozilla2 still lives in a dual world where automatic code refactoring is done using the elsa/oink toolchain, while static analysis is taking place using the GCC toolchain.

Statically Checking the Mozilla Codebase

Wednesday, March 5th, 2008

In the header file that declares class nsAutoString, there is an important comment: “Do not allocate this class on the heap”. This rule, buried deep in a header file that almost nobody reads, is a small example of a problem that plagues the Mozilla codebase: it’s easy to write incorrect code.

Mozilla, and XPCOM in particular, uses a meta-language on top of C++. This meta-language of helper classes and typesafe templates allows experienced XPCOM coders to avoid some of the complexities of XPCOM refcounting and memory management. Unfortunately, it is possible, even easy, to use this meta-language incorrectly or inefficiently. The following code, while correct C++, is incorrect “Mozilla C++”:

class Foo
{
private:
  nsAutoString mStr;
  ...
};

void Bad()
{
  Foo *foo = new Foo(); // allocated nsAutoString on the heap!
}

Errors such as this are especially insidious because the code works correctly; it is merely an inefficient use of memory which can add up quickly.
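A rough sketch of why the plain C++ compiler can’t catch this on its own, and why the memory adds up. InlineBufferString and HeapBackedString are hypothetical stand-ins (not the real nsAutoString or nsString), and the 64-character inline capacity is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a stack-only string with an inline buffer.
class InlineBufferString {
public:
  InlineBufferString() : mLength(0) { mInline[0] = 0; }
  // Plain C++ can forbid *direct* heap allocation of this class...
  static void* operator new(std::size_t) = delete;
  static void* operator new[](std::size_t) = delete;
  std::size_t Length() const { return mLength; }
  const char16_t* Data() const { return mInline; }
private:
  static const std::size_t kInlineCapacity = 64;  // assumed size
  char16_t mInline[kInlineCapacity];
  std::size_t mLength;
};

// Hypothetical heap-backed string: only a pointer and a length live inline.
class HeapBackedString {
public:
  HeapBackedString() : mData(nullptr), mLength(0) {}
  std::size_t Length() const { return mLength; }
  const char16_t* Data() const { return mData; }
private:
  char16_t* mData;
  std::size_t mLength;
};

class FooWithInline { InlineBufferString mStr; };
class FooWithHeap { HeapBackedString mStr; };

// ...but this still compiles: `new FooWithInline` uses the global operator
// new, so the member's deleted operator new is never consulted.
inline std::size_t HeapBytesForFooWithInline() {
  FooWithInline* foo = new FooWithInline();
  delete foo;
  return sizeof(FooWithInline);
}
```

Every heap-allocated FooWithInline carries the whole inline buffer whether or not it is used, and the language offers no way to reject the indirect allocation. That gap is what an application-specific static check has to fill.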

Taras and David have been working on a solution which will allow application-specific rules such as “only allocate nsAutoString on the stack” to be enforced at compile-time. It is called Dehydra GCC. It is a tool which translates the internal GCC representation of C++ code into a JavaScript object model, and allows application authors to write analysis passes as scripts.

This past week I hooked up Dehydra to the Mozilla build system. It is now possible to configure --with-static-checking=/path/to/gcc_dehydra.so and Dehydra will enforce a few basic rules: classes annotated with NS_STACK_CLASS may only be allocated on the stack, and classes annotated with NS_FINAL_CLASS may not be subclassed. For an example of a more complicated analysis, see bug 420933, ensuring that XPCOM methods handle outparams correctly.

Dehydra currently only works on Linux, and building it is a bit complicated: a custom patched GCC is required, as well as a spidermonkey package. Complete directions are available on the Mozilla Developer Center.

Dehydra is still a work in progress. We are working to complete the following tasks:

We are actively looking for hackers to help out with this project. There are many different kinds of tasks people can help with:

  • Write analysis passes to check for common bad-code patterns in Mozilla or other projects
  • Implement a webtool that allows users to browse code exactly
  • Help package up GCC-dehydra in RPMs and .deb packages for easy installation into Linux distros

I am very excited about the prospect of using static checking tools such as this one to improve our code quality and development cycle, and I’m looking forward to new and unexpected uses for this code! In future posts I will cover basics of writing a dehydra script.

Using Dehydra to Detect Problematic Code

Tuesday, October 16th, 2007

In XPCOMGC, the behavior of nsCOMPtr is very different from its current behavior:

  • nsCOMPtr should only be used as a class member, never on the stack. Taras is working on a rewriting script that will replace nsCOMPtr<nsIFoo> on the stack with a raw nsIFoo* (more on that later).
  • the purpose of nsCOMPtr is not to ensure correct reference counting (there is no reference-counting!); instead it serves to enforce write barriers, so that MMgc can properly perform incremental GC.

I was able to rewrite nsCOMPtr so that existing code could mostly use the existing API. There is, however, one major difference: getter_AddRefs cannot return a pointer directly to the nsCOMPtr instance. Instead, it must save the value in a local variable and call nsCOMPtr.set() to preserve the write-barrier semantics. It does this using a temporary class:

/**
 * nsGetterAddRefs is used for XPCOM out parameters that need to be assigned
 * to nsCOMPtr members. We can't pass the address of nsCOMPtr.mRawPtr directly
 * because of the need to set the write barrier.
 */
template <class T>
class nsGetterAddRefs
{
public:
  explicit
  nsGetterAddRefs(nsCOMPtr<T> &aSmartPtr) :
    mTempPtr(aSmartPtr),
    mTargetSmartPtr(aSmartPtr)
  {
    // nothing else to do
  }

  ~nsGetterAddRefs()
  {
    mTargetSmartPtr = mTempPtr;
  }

  operator T**()
  {
    return &mTempPtr;
  }

private:
  T* mTempPtr;
  nsCOMPtr<T> &mTargetSmartPtr;
};

template <class T>
inline
nsGetterAddRefs<T>
getter_AddRefs(nsCOMPtr<T> &aSmartPtr)
{
  return nsGetterAddRefs<T>(aSmartPtr);
}

For the vast majority of cases where code makes a simple getter call, this works fine:

nsresult rv = something->GetAFoo(getter_AddRefs(mFoo));

However, if you test or use the returned value after you get it in the same statement, the value won’t be assigned yet:

Bad:

if (NS_SUCCEEDED(something->GetAFoo(getter_AddRefs(mFoo))) && mFoo)

Also bad:

NS_SUCCEEDED(something->GetAFoo(getter_AddRefs(mFoo))) && mFoo->DoSomething();

In the XPCOMGC world, both of these cases will fail because the ~nsGetterAddRefs destructor runs after the dereference of mFoo. Once we remove stack comptrs, this is not a common occurrence, but it does happen occasionally.
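The ordering problem can be reproduced in a few lines of standalone C++. This is a simplified model, not the XPCOMGC code: BarrierPtr, GetterHelper, getter, and GetAFoo are hypothetical names standing in for nsCOMPtr, nsGetterAddRefs, getter_AddRefs, and an XPCOM getter. The key fact is that a temporary is destroyed at the end of the full expression that created it.

```cpp
#include <cassert>

struct Foo { int value = 42; };

// Hypothetical stand-in for the XPCOMGC nsCOMPtr: writes go through set().
template <class T>
class BarrierPtr {
public:
  void set(T* aPtr) { mRawPtr = aPtr; }  // a write barrier would go here
  T* get() const { return mRawPtr; }
private:
  T* mRawPtr = nullptr;
};

// Hypothetical stand-in for nsGetterAddRefs: hands out the address of a
// temporary pointer and copies it into the target in the destructor.
template <class T>
class GetterHelper {
public:
  explicit GetterHelper(BarrierPtr<T>& aTarget)
    : mTempPtr(aTarget.get()), mTarget(aTarget) {}
  ~GetterHelper() { mTarget.set(mTempPtr); }
  operator T**() { return &mTempPtr; }
private:
  T* mTempPtr;
  BarrierPtr<T>& mTarget;
};

template <class T>
inline GetterHelper<T> getter(BarrierPtr<T>& aTarget) {
  return GetterHelper<T>(aTarget);
}

// Hypothetical getter with an outparam; always succeeds.
inline bool GetAFoo(Foo** aResult) {
  static Foo sFoo;
  *aResult = &sFoo;
  return true;
}

// Bad: the helper's destructor has not run yet when member.get() is
// evaluated, so the member still looks null.
inline bool CheckInSameStatement() {
  BarrierPtr<Foo> member;
  bool sawValue = GetAFoo(getter(member)) && member.get() != nullptr;
  return sawValue;
}

// Fine: the temporary is destroyed at the end of the first statement,
// so the member is assigned before it is tested.
inline bool CheckInNextStatement() {
  BarrierPtr<Foo> member;
  bool ok = GetAFoo(getter(member));
  return ok && member.get() != nullptr;
}
```

In this model CheckInSameStatement() returns false even though the getter succeeded, while CheckInNextStatement() returns true. Splitting the call and the test into separate statements is the fix for both of the bad patterns above.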

Checking for this kind of pattern is the perfect job for dehydra: check it out.

Dehydra is cool

Thursday, October 11th, 2007

In a lonely corner of Mozilla-land, Taras Glek has been working on tools to support automated analysis and rewriting of the Mozilla codebase. I hadn’t gotten a chance to play with these tools until recently, when I started working on mozilla2 code refactoring pretty much full-time. And Taras’ tools are amazing!

For instance, for XPCOMGC we needed to identify Mozilla XPCOM classes that need to inherit from GCObject. Taras has an uber-cool tool called Dehydra that reflects the C++ AST into JavaScript: with a little handholding, I was able to write the logic for this task in a 127-line JS file!

Processing (122/2004): ./security/manager/ssl/src/nsSmartCardMonitor.ii...  running dehydra... done.

It takes about 4 hours to run this script on every file in the tree (using multiple cores could make this much faster). Then I use a python script to post-process the output and create gcobject.patch (truncated example patch).

The possible uses of dehydra are almost limitless: it’s very likely that in due course we will do nightly dehydra runs with scripts that detect bad code patterns; it’s also possible to set up a tryserver to run dehydra over a build tree.

See the Mozilla wiki for more information about dehydra, including links to a PDF specification and more information from Taras’ blog.