Archive for the 'Mozilla' Category

Release branches in mozilla-central

Thursday, October 9th, 2008

On the Firefox development branches, the version number always carries a “pre” suffix: for example, 3.1b1pre. This makes it easy to distinguish nightly builds from release builds. To produce a release, the release team creates a “minibranch”. This minibranch exists for the following reasons:

  • To allow bumping the version numbers to the release version, for example: 3.1b1.
  • To isolate the release process and allow the main development tree to re-open as quickly as possible.
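
For the curious, creating such a minibranch with stock Mercurial commands might look roughly like this (a sketch, not the actual release automation):

    hg branch GECKO191b1_20081007_RELBRANCH   # names the branch for the next commit
    # (edit the version files so that 3.1b1pre becomes 3.1b1)
    hg commit -m "Bump version numbers for the 3.1b1 release"
    hg push -f   # the new branch adds a head, so force is required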

This is a long-standing tradition from CVS, but we hadn’t done it in Mercurial before 3.1b1. This week, mozilla-central grew a new branch: GECKO191b1_20081007_RELBRANCH. Pushing this branch to mozilla-central caused some unexpected side effects for developers:

  1. Developers who issued a normal hg pull -u got the following message:
    adding 171 changesets with 234 changes to 110 files (+1 heads)
    not updating, since new heads added
    (run 'hg heads' to see heads, 'hg merge' to merge)

    Yes, a new head was added; but this head is on a named branch and shouldn’t affect developers who aren’t on that branch. This is a bug in Mercurial that will be fixed in future versions. To work around the problem, just run hg up, which will update you to the latest revision of the default branch.

  2. hg heads shows the heads of all branches. Normally, developers working on the default branch don’t care about heads on other branches, and don’t want release-branch heads showing up when they issue the hg heads command. The Mercurial developers are aware of this issue and will fix it in a future version. In the meantime, use the following command to see only the heads of the default branch: hg heads default.

Note: even with the above bugs fixed, hg pull -u isn’t an exact equivalent of hg pull followed by hg up: if no new changes are available from the remote server, hg pull -u performs no update at all. This only affects trees where the working directory is not at the tip revision. This slightly unintuitive behavior is considered a feature by the Mercurial developers, not a bug.
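
Putting the workarounds together, a sequence like the following should update a tree safely while release branches exist:

    hg pull
    hg up             # update to the latest revision of the branch you are on
    hg heads default  # optional: show only the heads of the default branch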

A Good Bookstore

Tuesday, October 7th, 2008

I was sad to read about the closing of Olsson’s Books and Records in Washington, D.C. When I lived in D.C., I worked two blocks from the Olsson’s downtown; I would drop in frequently during lunch, and I spent an impressive fraction of my salary there.

While the prices and selection at Olsson’s were both competitive, the staff was what made the store most impressive and enjoyable: it was clear that everyone there loved books. After I became a “regular”, they remembered my tastes and interests, gave me relevant recommendations, and were really good salesmen.

I’m sure that the “new” Barnes and Noble on 12th Street had something to do with Olsson’s decline: Olsson’s probably couldn’t compete with the extended hours and the café. I don’t want to malign the huge chains: I enjoy spending a couple of hours in a mega-super-duper Barnes and Noble. I’ve noticed that when a new store opens, the staff is really helpful and knowledgeable, but after a year or so the quality of the staff at the information desk starts dropping, sometimes dramatically. How can you be a really helpful bookseller if you’ve never heard of St. Augustine or Orson Scott Card? Why can’t the mega-chains keep book-loving sales staff?

Parsing IDL in Python

Tuesday, October 7th, 2008

One of the current pain points in our build system is the xpidl compiler, a binary tool used to generate C++ headers and XPT files from XPIDL input files. The code depends on libIDL, which in turn depends on glib. Because this tool is a build-time requirement, we have to build a host version, and in most cases we also build a target version to put in the SDK package.

Getting glib and libIDL on Linux systems is not very difficult: all the major distros have developer packages for them. But getting libIDL and glib on Windows and Mac can be quite painful. On Windows, we have had to create our own custom static-library versions of this code which are compatible with all the different versions of Microsoft Visual C++. On Mac you can get them from MacPorts, but as far as I know they are not available as universal binaries, which means that you can’t cross-compile a target xpidl.

Parsing IDL, while not trivial, is not so complicated that it requires huge binary code libraries. So a while back I reimplemented the XPIDL parser using Python and the PLY (Python Lex-Yacc) parsing library. The core parsing grammar and object model are only 1200 lines of code.

Because we don’t have any unit tests for xpidl, I chose to use A-B testing against the output of the binary xpidl: the header output of the Python xpidl should match, byte for byte, the header output of the binary xpidl. I wrote a myrules.mk file which would automatically build and compare both versions during a build. This turned out to be a royal pain, because the libIDL parser is not very consistent and has bugs:

  • Some, but not all, attributes are re-ordered: instead of keeping their original order, they are sorted according to an internal glib hash function.
  • The code which associates doc-comments with IDL items is buggy: in many cases the comment is associated with a later item.

I had to add some nasty temporary hacks to work around these issues. And finally, reproducing the wacky whitespace of the binary tool wasn’t worthwhile, so I started comparing the results using diff -w -B. But with these hacks and changes, both xpidl compilers produce identical C++ headers.

I completed the code to produce C++ headers during a couple of not-quite-vacation days, but I didn’t write any code to produce XPT files. I shelved the project as an attractive waste of time, until jorendorff needed an IDL parser to produce quick-stub C++ code. Jason managed to take my existing code and hook a quick-stub generator up to it. The Python xpidl parser and quick-stub generator are both used in the codebase.

Currently, we’re still using the old binary xpidl to produce C++ headers and XPT files. If somebody is interested, I’d really like help adding code to produce XPT files from the new parser, so that we can ditch the old binary code completely.

If you ever need to use Python to parse an interesting grammar, I highly recommend PLY. If you turn on optimization it performs very well, and it has very good support for detailed error reporting.

A Parent’s Most Important Job

Wednesday, October 1st, 2008

I remember clearly when I first read The Tipping Point. The book was a good read and thought-provoking, but I remember it most clearly because of a small section near the end:

This [study] is, if you think about it, a rather extraordinary finding. Most of us believe that we are like our parents because of some combination of genes and, more important, of nurture — that parents, to a large extent, raise us in their own image. But if that is the case, if nurture matters so much, then why did the adopted kids not resemble their adoptive parents at all? The Colorado study isn’t saying that genes explain everything and that environment doesn’t matter. On the contrary, all of the results strongly suggest that our environment plays as big — if not bigger — a role as heredity in shaping personality and intelligence. What it is saying is that whatever that environmental influence is, it doesn’t have a lot to do with parents. It’s something else, and what Judith Harris argues is that that something else is the influence of peers.

Why, Harris asks, do the children of recent immigrants almost never retain the accent of their parents? How is it the children of deaf parents manage to learn how to speak as well and as quickly as children whose parents speak to them from the day they were born? The answer has always been that language is a skill acquired laterally — that what children pick up from other children is as, or more, important in the acquisition of language as what they pick up at home. What Harris argues is that this is also true more generally, that the environmental influence that helps children become who they are — that shapes their character and personality — is their peer group.

Expressed this way, I think it’s easy to come to the wrong conclusion: that parents have little influence over their children. A more useful inference would be:

A parent’s most important duty is to find the best possible peers for their children.

Generating Documentation With Dehydra

Tuesday, September 30th, 2008

One of the common complaints about the Mozilla string code is that it’s very difficult to know what methods are available on a given class. Reading the code is very difficult because it’s hidden behind a complex set of #defines, because it’s parameterized for both narrow and wide strings, and because we have a deep and complex string hierarchy. The Mozilla Developer Center has a string guide, but no useful reference documentation.

With a little hand-holding, static analysis tools can produce very useful reference documentation which other tools simply cannot. For example, because a static analysis tool knows the accessibility of each method, it can produce a reference document that contains only the public API of a class. I spent parts of yesterday and today tweaking a Dehydra script to produce a string reference. I’m working with Eric Shepherd to figure out the best way to automatically upload the data to the Mozilla Developer Center, but I wanted to post a sample for comment. This is the public API of nsACString:

I am trying to keep the format of this document similar to the format we use for interfaces on MDC. It’s a bit challenging, because C++ classes have overloaded method names and frequently have many methods. In the method summary, I have grouped together all the methods with the same name.

Once the output and format are tweaked, I can hook the entire generation and upload process to a makefile target, and either run it on my local machine or on a buildbot. I used E4X to do the actual XML generation. It was a learning experience… I don’t think I’m a fan. I want Genshi for JavaScript. Making conditional constructs in E4X is slightly ugly, and making looping constructs is really painful: my kingdom for an XML generator, so that I don’t have to loop and append to an XMLList.

Salmon Cakes

Friday, September 26th, 2008

I love crab cakes, but at least here in Johnstown, refrigerated crab meat is expensive enough that making crab cakes on a regular basis is impractical. There is an affordable alternative that tastes almost as good: salmon cakes. Canned salmon is inexpensive and makes a great substitute; you can find it near the canned tuna at pretty much any decent supermarket.

Ingredients

  • 2 cans salmon (15.75oz each)
  • 1 cup breadcrumbs
  • lots of pepper
  • Spices:
    • 1 teaspoon ground mustard
    • 1 teaspoon paprika
    • 1/2 teaspoon cumin
    • 1/2 teaspoon red pepper flakes
    • Or whatever else strikes your fancy
  • 1 large onion, diced fine
  • 2 eggs, lightly beaten
  • bacon fat or frying oil (peanut, canola, sunflower, or soy oil)

Hardware

  • mixing bowl
  • can opener
  • fine strainer
  • griddle or large skillet (cast iron is best, but any heavy pan will do)
  • Metal spatula-like device: an offset spatula is best
  • Wire rack for draining: for best results, turn the rack upside-down in contact with newspaper.

Preparation

  1. Drain the salmon into a strainer. Pick through the fish and remove any backbone or other large bones, if present.
  2. In a mixing bowl, combine the breadcrumbs and spices and toss.
  3. Add the salmon, eggs, and onion to the bowl. Combine the ingredients with your hands. The mixture should be somewhat sticky. If it is dry, add another egg.
  4. Form the cakes with your hands:
    • The cakes can be any size from half-fist to fist sized. The cake should be a disc about twice as wide as it is thick… I can typically make 10 large-ish cakes from this recipe.
    • Squeeze in both hands to compact into roughly the correct shape.
    • While holding in the palm of one hand, cup your other hand around the outside of the cake to form it into a round.
  5. Heat the griddle on medium heat and add the frying fat.
  6. When a drop of water flicked into the fat sizzles gently (3-4 minutes), add the cakes. It’s OK to place them close together.
  7. Turn when the first side is brown… I prefer a dark mahogany (~7 minutes), but many people prefer a more golden color (~5 minutes).
  8. When the second side is done, remove to the wire rack for draining and cover with foil. Serve immediately.

Serving Suggestions

  • For a dipping sauce prepare sour cream with chives, or tartar sauce if you’re feeling very traditional.
  • Salmon cakes work well as a main dish, but you could also make smaller ones as hors d’œuvre or in a surf-n-turf combo.
  • On a cold day, pair with a warm vegetable soup.
  • On a warm day, pair with a cucumber salad.
  • Serve with Sauvignon Blanc or Corona.

Notes

Canned salmon typically has a lot of added salt. You don’t need to add any, and I’d avoid salted seasoning blends (Old Bay) as well. Because the canned salmon is fully cooked, feel free to taste the mixture for seasoning before frying.

I’ve seen recipes where the cakes are breaded before frying, typically with crushed saltine crackers. I can’t for the life of me figure out why.

If you are like me and instinctively add garlic to any dish calling for diced onions, please resist the temptation.

Allocated Memory and Shared Library Boundaries

Friday, September 26th, 2008

When people get started with XPCOM, one of the most confusing rules is how to pass data across XPCOM boundaries. Take the following method:

IDL markup

string getFoo();

C++ generated method signature

nsresult GetFoo(char **aResult);

Diagram showing transfer of allocation ‘ownership’ from the implementation method to the calling method

C++ Implementation

The aResult parameter is called an “out parameter”. The implementation of this method is responsible for allocating memory and setting *aResult:

nsresult
Object::GetFoo(char **aResult)
{
  // Allocate a string to pass back
  *aResult = NS_Alloc(4);

  // In real life, check for out-of-memory!
  strcpy(*aResult, "foo");

  return NS_OK;
}

C++ Caller

The caller, once it is finished with the data, is responsible for freeing it.

char *foo;
myIFace->GetFoo(&foo);
// do something with foo
NS_Free(foo);

The important thing to note is that the code doesn’t allocate memory with malloc, and doesn’t free it with free. All memory that is passed across XPCOM boundaries must be allocated with NS_Alloc and freed with NS_Free.

We have this rule because of mismatched allocators. Depending on your operating system and the position of the moon, each shared library may have its own malloc heap. If you malloc memory in one shared library and free it in a different library, the heap of each library may get corrupted and cause mysterious crashes. By forcing everyone to use the NS_Alloc/Free functions, we know that all code is using the same malloc heap.

Helper Functions

In most cases, there are helper functions which make following the rules much easier. On the implementation side, the ToNewUnicode and ToNewCString functions convert an existing nsAString/nsACString to an allocated raw buffer.
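
For example, the implementation of GetFoo above could use ToNewCString (a sketch, assuming the object stores its data in an nsCString member named mFoo):

#include "nsReadableUtils.h" // for ToNewCString

nsresult
Object::GetFoo(char **aResult)
{
  // ToNewCString allocates a buffer with the XPCOM allocator and
  // copies the string data into it; the caller frees it with NS_Free.
  *aResult = ToNewCString(mFoo);
  return *aResult ? NS_OK : NS_ERROR_OUT_OF_MEMORY;
}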

On the caller side, you should almost always use helper classes such as nsXPIDLString to automatically handle these memory issues:

Better C++ Caller

nsXPIDLCString foo;
myIFace->GetFoo(getter_Copies(foo));
// do something with foo

Impact on Extension Authors

It is especially important for extension authors to follow this advice on Windows. The Windows version of Firefox uses a custom version of the Windows C runtime which we’ve patched to include the high-performance jemalloc allocator. Extension authors typically link the C runtime statically, which guarantees that their malloc/free heap will not match the Firefox heap.

What’s so bad about a liquidity crisis?

Wednesday, September 24th, 2008

I’ve been trying to follow the news and commentary about the “bailout” and financial markets in some detail; but there must be some obvious background knowledge I’m missing. From watching bits of the congressional hearing yesterday, and reading the newspapers, it seems that the major purpose of the bailout is “restore liquidity to the markets”, which seems to be an economist’s synonym for “make sure the markets can still loan people money”.

What would happen if the markets stopped loaning people money? For consumers at least, there would be some short-term pain: people have been expecting to be able to use easy credit, so they haven’t saved money for a new car, Christmas presents, and so forth. The housing market would certainly change, and housing prices would drop even further because of a lack of buyers. But would that actually significantly disrupt the economy? Wouldn’t the population save their money for a few years, limp along with their old car, and then buy a new one with saved cash?

Presumably after all the existing bad securities are untangled, some banks will start to be able to loan money again, and these lenders will set stricter requirements on collateral and verified ability to repay loans.

Perhaps the consequences would be more serious if business credit disappeared: capitalizing a new business or making capital improvements to an existing business pretty much requires credit. If we want to preserve this essential use of credit to keep the real economy strong (as opposed to the speculative market economy), isn’t there a way the U.S. government could guarantee this kind of business capital credit much more cheaply than $700B, and let the chips fall where they may everywhere else?

I invite the blogosphere to link me up to classic economic treatises and modern articles which could help me understand how a liquidity crisis would cause the economy to simply collapse.

Call for Help: Boehm+jemalloc

Wednesday, September 24th, 2008

At the Firefox summit we decided to change tack on XPCOMGC and try to use Boehm instead of MMgc. Overall I think this was a really good decision. With help from Graydon, I even have some Linux builds that use Boehm under the hood (most memory is not considered collectable; only string buffers are collected objects at this point).

Unfortunately, while Boehm is a pretty good collector, it doesn’t do so well at memory allocation and fragmentation. Heap usage is between 1.5x and 2x that of standard Firefox using jemalloc. What I really want is a combination of jemalloc and Boehm, taking the best features from each:

Boehm Features:

  • Fast and threadsafe conservative collector
  • Smart rooting of all thread stacks and static data
  • Incremental marking with hardware write barriers [1]
  • Option for parallel collection [2]
  • Ability to intermingle collected and non-collected memory

jemalloc features:

  • Better overall memory usage, primarily due to lower fragmentation
  • Very tight and well-performing allocator

Help Wanted

I’m looking for somebody who’s willing to painstakingly combine the best of these two allocators: either port the jemalloc low-fragmentation design to Boehm, or port the Boehm collection mechanism to the jemalloc allocator. If you’re interested, please contact me. The lack of a solution to this problem blocks any serious plans for further work on XPCOMGC.

Notes

  1. The key word is hardware. The MMgc solution failed because altering our codebase to have correct programmatic write barriers was going to involve boiling the ocean. And even with smart pointers, a standard MMgc write barrier involves a lot of overhead.
  2. In Boehm, parallel collection doesn’t work with most incremental collection, and so we may not actually decide to use it; avoiding large pauses with incremental collection is more important.

When linking, the order of your command-line can be important

Monday, September 22nd, 2008

Occasionally, people come into the #xulrunner or #extdev channels with a question about compiling XPCOM components. The question often goes something like this:

<IRCGuy> I’m following a tutorial on making XPCOM components, but I can’t seem to get them to compile. Can anyone tell me what my problem is?

Hint for asking a good question: IRCGuy needs to tell us some combination of 1) what tutorial he’s following, 2) what the failing command is, and 3) what the error message is.

This time, IRCGuy’s compile command and error message are:

IRCGuy@IRCGuy-france:/mnt/data/IRCGuy/public/project/xpcom-test$ make
g++  -I/usr/local/include/xulrunner-1.9/unstable -I/usr/local/include/xulrunner-1.9/stable -L/usr/local/lib/xulrunner-devel-1.9/lib -Wl,-rpath-link,/usr/local/bin  -lxpcomglue_s -lxpcom -lnspr4 -fno-rtti -fno-exceptions -shared -Wl,-z,defs  france2.cpp -o france2.so
/tmp/cceFg2dD.o: In function `NSGetModule':
france2.cpp:(.text+0x38c): undefined reference to `NS_NewGenericModule2(nsModuleInfo const*, nsIModule**)'

IRCGuy’s problem is one of link ordering: with most Unix-like linkers, it is very important to list object files and libraries in the correct order. The general order to follow is:

  1. Object files
  2. Static libraries, from specific to general
  3. Dynamic libraries

If an object file needs a symbol, the linker will only resolve that symbol from static libraries that appear later on the link line.
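
As a minimal illustration (assuming main.o calls a function defined in a static library libfoo.a in the current directory):

g++ -L. -lfoo main.o -o app   # fails: libfoo.a is searched before main.o needs anything from it
g++ main.o -L. -lfoo -o app   # works: main.o’s undefined symbols are resolved from libfoo.a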

The corrected command:

g++  -I/usr/local/include/xulrunner-1.9/unstable -I/usr/local/include/xulrunner-1.9/stable -fno-rtti -fno-exceptions -shared -Wl,-z,defs  france2.cpp -L/usr/local/lib/xulrunner-devel-1.9/lib -Wl,-rpath-link,/usr/local/bin  -lxpcomglue_s -lxpcom -lnspr4 -o france2.so

Bonus tip: the correct linker flags for linking XPCOM components can be found in the Mozilla Developer Center article on XPCOM Glue. As noted in the article, XPCOM components should use the “Dependent Glue” linker strategy.