Statically Checking the Mozilla Codebase
Wednesday, March 5th, 2008
In the header file that declares class nsAutoString, there is an important comment: “Do not allocate this class on the heap”. This rule, buried deep in a header file that almost nobody reads, is a small example of a problem that plagues the Mozilla codebase: it’s easy to write incorrect code.
Mozilla, and XPCOM in particular, uses a meta-language on top of C++. This meta-language of helper classes and typesafe templates allows experienced XPCOM coders to avoid some of the complexities of XPCOM refcounting and memory management. Unfortunately, it is possible, even easy, to use this meta-language incorrectly or inefficiently. The following code, while correct C++, is incorrect “Mozilla C++”:
    class Foo {
    private:
      nsAutoString mStr;
      ...
    };

    void Bad() {
      Foo *foo = new Foo(); // allocated nsAutoString on the heap!
    }
Errors such as this are especially insidious because the code works correctly; it merely uses memory inefficiently, and that waste can add up quickly.
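To see why the waste adds up, here is a minimal sketch using a toy class with a fixed inline buffer. ToyAutoString, ToyFoo, InlineBytesFor, and the 64-character capacity are all hypothetical names and numbers chosen for illustration; this is not the real nsAutoString, only the general shape of a string class designed to live on the stack.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Assumed buffer size, for illustration only.
const std::size_t kToyInlineCapacity = 64;

// Hypothetical stand-in for a stack-oriented string class (NOT nsAutoString).
// It carries a fixed inline buffer so short strings need no separate allocation.
class ToyAutoString {
public:
    ToyAutoString() : mLength(0) { mInline[0] = '\0'; }

    // Copies at most kToyInlineCapacity - 1 characters into the inline buffer.
    void Assign(const char* aText) {
        std::size_t len = std::strlen(aText);
        if (len >= kToyInlineCapacity) {
            len = kToyInlineCapacity - 1;
        }
        std::memcpy(mInline, aText, len);
        mInline[len] = '\0';
        mLength = len;
    }

    std::size_t Length() const { return mLength; }
    const char* get() const { return mInline; }

private:
    char mInline[kToyInlineCapacity];
    std::size_t mLength;
};

// An object embedding the toy string pays for the whole inline buffer,
// whether or not the string ever grows long enough to use it.
class ToyFoo {
public:
    ToyAutoString mStr;
};

// Bytes of inline buffer carried by aCount long-lived ToyFoo objects.
inline std::size_t InlineBytesFor(std::size_t aCount) {
    return aCount * kToyInlineCapacity;
}
```

On the stack, that buffer is reclaimed as soon as the function returns. Inside a long-lived heap object, every instance keeps it for its whole lifetime, even when the string holds only a few characters.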
Taras and David have been working on a solution which will allow application-specific rules such as “only allocate nsAutoString on the stack” to be enforced at compile-time. It is called Dehydra GCC. It is a tool which translates the internal GCC representation of C++ code into a JavaScript object model, and allows application authors to write analysis passes as scripts.
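To make the idea of an application-specific rule concrete, here is a toy model of the “stack only” check, written in plain C++. It is only a sketch of the reasoning, not Dehydra’s scripting interface (analysis passes are written in JavaScript), and every name in it (ToyClassInfo, ToyClassTable, ContainsStackOnly, HeapAllocationIsViolation) is hypothetical. The point it illustrates is that a class is unsafe to allocate on the heap if it is marked stack-only or embeds a member that is.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy description of a class: whether it is marked stack-only,
// and the type names of its by-value members.
struct ToyClassInfo {
    bool stackOnly;
    std::vector<std::string> memberTypes;
};

typedef std::map<std::string, ToyClassInfo> ToyClassTable;

// Returns true if aName is stack-only itself, or embeds (directly or
// indirectly) a member whose type is stack-only.
inline bool ContainsStackOnly(const ToyClassTable& aTable,
                              const std::string& aName,
                              std::set<std::string>& aSeen) {
    ToyClassTable::const_iterator it = aTable.find(aName);
    if (it == aTable.end()) {
        return false;  // unknown types are treated as unrestricted
    }
    if (it->second.stackOnly) {
        return true;
    }
    if (!aSeen.insert(aName).second) {
        return false;  // already examined; do not walk it again
    }
    const std::vector<std::string>& members = it->second.memberTypes;
    for (std::size_t i = 0; i < members.size(); ++i) {
        if (ContainsStackOnly(aTable, members[i], aSeen)) {
            return true;
        }
    }
    return false;
}

// A "new aName" expression violates the rule if the type contains
// any stack-only class.
inline bool HeapAllocationIsViolation(const ToyClassTable& aTable,
                                      const std::string& aName) {
    std::set<std::string> seen;
    return ContainsStackOnly(aTable, aName, seen);
}
```

In this model, a table in which nsAutoString is marked stack-only and Foo has an nsAutoString member would report new Foo() as a violation, which is the situation in the Bad() example above.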
This past week I hooked up Dehydra to the Mozilla build system. It is now possible to configure with --with-static-checking=/path/to/gcc_dehydra.so, and Dehydra will then enforce a few basic rules: classes annotated with NS_STACK_CLASS may only be allocated on the stack, and classes annotated with NS_FINAL_CLASS may not be subclassed. For an example of a more complicated analysis, see bug 420933, which ensures that XPCOM methods handle outparams correctly.
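As a rough sketch of what annotated code might look like: the macro names NS_STACK_CLASS and NS_FINAL_CLASS come from this post, but their placement in the class declaration and their definitions here are assumptions. In this sketch they expand to nothing so that the code compiles with any C++ compiler; in a static-checking build they would presumably expand to something the plugin can read. ToyAutoCounter, ToyLeaf, and DepthInsideScope are hypothetical names.

```cpp
#include <cassert>

// Assumption: without static checking the annotations expand to nothing,
// so the rules are documented in the source but not enforced.
#ifndef NS_STACK_CLASS
#define NS_STACK_CLASS
#endif
#ifndef NS_FINAL_CLASS
#define NS_FINAL_CLASS
#endif

// Hypothetical stack-only helper: increments a counter for the duration
// of a scope and decrements it again on destruction.
class NS_STACK_CLASS ToyAutoCounter {
public:
    explicit ToyAutoCounter(int& aCounter) : mCounter(aCounter) { ++mCounter; }
    ~ToyAutoCounter() { --mCounter; }

private:
    int& mCounter;
};

// Hypothetical class that is not meant to be subclassed.
class NS_FINAL_CLASS ToyLeaf {
public:
    int Value() const { return 42; }
};

// Intended use: the helper lives on the stack and cleans up on scope exit.
inline int DepthInsideScope(int& aCounter) {
    ToyAutoCounter guard(aCounter);
    return aCounter;
}

// The kinds of code the two rules are meant to reject (left commented out):
//   ToyAutoCounter* p = new ToyAutoCounter(n);  // heap allocation of a stack class
//   class Derived : public ToyLeaf {};          // subclassing a final class
```

Because the annotations compile away in an ordinary build, they cost nothing at runtime; only a build configured with static checking would turn the commented-out lines into errors.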
Dehydra currently works only on Linux, and building it is a bit complicated: a custom-patched GCC is required, as well as a SpiderMonkey package. Complete directions are available on the Mozilla Developer Center.
Dehydra is still a work in progress. We are working to complete the following tasks:
- Treehydra: a sub-project to reflect the detailed GCC control-flow tree, instead of the higher-level syntax tree currently reflected by Dehydra, which will allow for more detailed and accurate analyses
- Port Dehydra to Mac
- Allow Dehydra to process C, Objective-C, and Objective-C++ (currently it only processes C++)
- Port Dehydra to MinGW (probably cross-compiled from Linux for sanity)
- Make the tinderbox build with static checking
We are actively looking for hackers to help out with this project. There are many different kinds of tasks people can help with:
- Write analysis passes to check for common bad-code patterns in Mozilla or other projects
- Implement a webtool that allows users to browse code exactly
- Help package up GCC-dehydra in RPMs and .deb packages for easy installation into Linux distros
I am very excited about the prospect of using static checking tools such as this one to improve our code quality and development cycle, and I’m looking forward to new and unexpected uses for this code! In future posts I will cover the basics of writing a Dehydra script.