Archive for 2008

A Parent’s Most Important Job

Wednesday, October 1st, 2008

I remember clearly when I first read The Tipping Point. It was a good, thought-provoking read, but I remember it most clearly because of a small section near the end:

This [study] is, if you think about it, a rather extraordinary finding. Most of us believe that we are like our parents because of some combination of genes and, more important, of nurture — that parents, to a large extent, raise us in their own image. But if that is the case, if nurture matters so much, then why did the adopted kids not resemble their adoptive parents at all? The Colorado study isn’t saying that genes explain everything and that environment doesn’t matter. On the contrary, all of the results strongly suggest that our environment plays as big — if not bigger — a role as heredity in shaping personality and intelligence. What it is saying is that whatever that environmental influence is, it doesn’t have a lot to do with parents. It’s something else, and what Judith Harris argues is that that something else is the influence of peers.

Why, Harris asks, do the children of recent immigrants almost never retain the accent of their parents? How is it the children of deaf parents manage to learn how to speak as well and as quickly as children whose parents speak to them from the day they were born? The answer has always been that language is a skill acquired laterally — that what children pick up from other children is as, or more, important in the acquisition of language as what they pick up at home. What Harris argues is that this is also true more generally, that the environmental influence that helps children become who they are — that shapes their character and personality — is their peer group.

Expressed this way, I think it’s easy to come to the wrong conclusion: that parents have little influence over their children. A more useful inference would be:

A parent’s most important duty is to find the best possible peers for their children.

Generating Documentation With Dehydra

Tuesday, September 30th, 2008

One of the common complaints about the Mozilla string code is that it’s very difficult to know what methods are available on a given class. Reading the code is very difficult because it’s hidden behind a complex set of #defines, because it’s parameterized for both narrow and wide strings, and because we have a deep and complex string hierarchy. The Mozilla Developer Center has a string guide, but no useful reference documentation.

With a little hand-holding, static analysis tools can produce very useful reference documentation that other tools simply cannot. For example, because a static analysis tool knows the accessibility of methods, you can create a reference document that contains only the public API of a class. I spent parts of yesterday and today tweaking a Dehydra script to produce a string reference. I’m working with Eric Shepherd to figure out the best way to automatically upload the data to the Mozilla Developer Center, but I wanted to post a sample for comment. This is the public API of nsACString:

I am trying to keep the format of this document similar to the format we use for interfaces on MDC. It’s a bit challenging, because C++ classes have overloaded method names and frequently have many methods. In the method summary, I have grouped together all the methods with the same name.

Once the output and format are tweaked, I can actually hook the entire generation and upload process to a makefile target, and either run it on my local machine or hook it up to a buildbot. I used E4X to do the actual XML generation. It was a learning experience… I don’t think I’m a fan. I want Genshi for JavaScript. Making conditional constructs in E4X is slightly ugly, and making looping constructs is really painful: my kingdom for an XML generator so that I don’t have to loop and append to an XMLList.

Salmon Cakes

Friday, September 26th, 2008

I love crab cakes. But at least here in Johnstown, refrigerated crab meat is expensive enough that making crab cakes on a regular basis is impractical. There is an affordable alternative that tastes almost as good: salmon cakes. Canned salmon is inexpensive and is a great substitute; you can find it near the canned tuna at pretty much any decent supermarket.

Ingredients

  • 2 cans salmon (14.75oz each)
  • 1 cup breadcrumbs
  • lots of pepper
  • Spices:
    • 1 teaspoon ground mustard
    • 1 teaspoon paprika
    • 1/2 teaspoon cumin
    • 1/2 teaspoon red pepper flakes
    • Or whatever else strikes your fancy
  • 1 large onion, diced fine
  • 2 eggs, lightly beaten
  • bacon fat or frying oil (peanut, canola, sunflower, or soy oil)

Hardware

  • mixing bowl
  • can opener
  • fine strainer
  • griddle or large skillet (cast iron is best, but any heavy pan will do)
  • metal spatula-like device: an offset spatula is best
  • wire rack for draining: for best results, turn the rack upside-down in contact with newspaper

Preparation

  1. Drain the salmon into a strainer. Pick through the fish and remove any backbone or other large bones, if present.
  2. In a mixing bowl, combine the breadcrumbs and spices and toss.
  3. Add the salmon, eggs, and onion to the bowl. Combine the ingredients with your hands. The mixture should be somewhat sticky. If it is dry, add another egg.
  4. Form the cakes with your hands:
    • The cakes can be any size from half-fist to fist sized. The cake should be a disc about twice as wide as it is thick… I can typically make 10 large-ish cakes from this recipe.
    • Squeeze in both hands to compact into roughly the correct shape.
    • While holding in the palm of one hand, cup your other hand around the outside of the cake to form it into a round.
  5. Heat the griddle on medium heat and add the frying fat.
  6. When a drop of water gently sizzles in the fat (3-4 minutes), add the cakes. It’s ok to place them close together.
  7. Turn when the first side is brown… I prefer a dark mahogany (~7 minutes), but many people prefer a more golden color (~5 minutes).
  8. When the second side is done, remove to the wire rack for draining and cover with foil. Serve immediately.

Service Suggestions

  • For a dipping sauce, prepare sour cream with chives, or tartar sauce if you’re feeling very traditional.
  • Salmon cakes work well as a main dish, but you could also make smaller ones as hors d’œuvre or in a surf-n-turf combo.
  • On a cold day, pair with a warm vegetable soup.
  • On a warm day, pair with a cucumber salad.
  • Serve with Sauvignon Blanc or Corona.

Notes

Canned salmon typically has a lot of added salt. You don’t need to add any salt, and I’d avoid salted seasoning blends (Old Bay) as well. Because the salmon is fully cooked, feel free to taste the mixture for seasoning before frying.

I’ve seen recipes where the cakes are breaded before frying, typically with crushed saltine crackers. I can’t for the life of me figure out why.

If you are like me and instinctively add garlic to any dish calling for diced onions, please resist the temptation.

Allocated Memory and Shared Library Boundaries

Friday, September 26th, 2008

When people get started with XPCOM, one of the most confusing rules is the one governing how data is passed across XPCOM boundaries. Take the following method:

IDL markup

string getFoo();

C++ generated method signature

nsresult GetFoo(char **aResult);

[Diagram showing transfer of allocation ownership from the implementation method to the calling method]

C++ Implementation

The aResult parameter is called an “out parameter”. The implementation of this method is responsible for allocating memory and setting *aResult:

nsresult
Object::GetFoo(char **aResult)
{
  // Allocate a string to pass back
  *aResult = NS_Alloc(4);

  // In real life, check for out-of-memory!
  strcpy(*aResult, "foo");

  return NS_OK;
}

C++ Caller

After it is finished with the data, the caller is responsible for freeing it.

char *foo;
myIFace->GetFoo(&foo);
// do something with foo
NS_Free(foo);

The important thing to note is that the code doesn’t allocate memory with malloc, and doesn’t free it with free. All memory that is passed across XPCOM boundaries must be allocated with NS_Alloc and freed with NS_Free.

We have this rule because of mismatched allocators. Depending on your operating system and the position of the moon, each shared library may have its own malloc heap. If you malloc memory in one shared library and free it in a different library, the heap of each library may get corrupted and cause mysterious crashes. By forcing everyone to use the NS_Alloc/Free functions, we know that all code is using the same malloc heap.
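To make the failure mode concrete, here is a sketch of the pattern this rule forbids. It reuses the hypothetical GetFoo example from above; whether it crashes immediately depends on the platform and runtime, which is exactly what makes these bugs so hard to track down.

Broken C++ Example

// DON'T DO THIS: the implementation and the caller may live in different
// shared libraries, each with its own C runtime heap.
nsresult
Object::GetFoo(char **aResult)
{
  // Allocated on the implementing library's heap.
  *aResult = (char*) malloc(4);
  strcpy(*aResult, "foo");
  return NS_OK;
}

// ...meanwhile, in a different shared library:
char *foo;
myIFace->GetFoo(&foo);
// Freed on the caller's heap -- may corrupt both heaps and crash later.
free(foo);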

Helper Functions

In most cases, there are helper functions which make following the rules much easier. On the implementation side, the ToNewUnicode and ToNewCString functions convert an existing nsAString/nsACString to an allocated raw buffer.
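For example, the GetFoo implementation above could be written more simply with ToNewCString. This is a sketch that assumes the object stores its value in an nsCString member named mFoo; ToNewCString (declared in nsReadableUtils.h) returns a buffer allocated with NS_Alloc, so the caller can free it with NS_Free.

Simpler C++ Implementation

nsresult
Object::GetFoo(char **aResult)
{
  // ToNewCString allocates the copy with NS_Alloc on our behalf.
  *aResult = ToNewCString(mFoo);
  return *aResult ? NS_OK : NS_ERROR_OUT_OF_MEMORY;
}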

On the caller side, you should almost always use helper classes such as nsXPIDLString to automatically handle these memory issues:

Better C++ Caller

nsXPIDLCString foo;
myIFace->GetFoo(getter_Copies(foo));
// do something with foo

Impact on Extension Authors

It is especially important for extension authors to follow this advice on Windows. The Windows version of Firefox uses a custom version of the Windows C runtime, which we’ve patched to include the high-performance jemalloc allocator. Extension authors should link the C runtime statically, which guarantees that their malloc/free heap will not match the one Firefox uses.


What’s so bad about a liquidity crisis?

Wednesday, September 24th, 2008

I’ve been trying to follow the news and commentary about the “bailout” and financial markets in some detail, but there must be some obvious background knowledge I’m missing. From watching bits of the congressional hearing yesterday and reading the newspapers, it seems that the major purpose of the bailout is to “restore liquidity to the markets”, which appears to be an economist’s synonym for “make sure the markets can still loan people money”.

What would happen if the markets stopped loaning people money? For consumers at least, there would be some short-term pain: people have been expecting to be able to use easy credit, so they haven’t saved money for a new car, Christmas presents, and so forth. The housing market would certainly change, and housing prices would drop even further because of a lack of buyers. But would that actually significantly disrupt the economy? Wouldn’t the population save their money for a few years, limp along with their old car, and then buy a new one with saved cash?

Presumably after all the existing bad securities are untangled, some banks will start to be able to loan money again, and these lenders will set stricter requirements on collateral and verified ability to repay loans.

Perhaps the consequences are more serious if business credit disappeared: capitalizing a new business or making capital improvements to an existing business pretty much requires credit. If we want to preserve this essential use of credit to keep the real economy strong (not the speculative market economy), isn’t there a way the U.S. government could guarantee this kind of credit for business capital loans much more cheaply than $700B, and let the chips fall where they might everywhere else?

I invite the blogosphere to link me up to classic economic treatises and modern articles which could help me understand how a liquidity crisis would cause the economy to simply collapse.

Call for Help: Boehm+jemalloc

Wednesday, September 24th, 2008

At the Firefox summit we decided to change tack on XPCOMGC and try to use Boehm instead of MMgc. Overall I think this was a really good decision. With help from Graydon, I even have some Linux builds that use Boehm under the hood (most memory is not considered collectable; only string buffers are collected objects at this point).

Unfortunately, while Boehm is a pretty good collector, it doesn’t do so well at memory allocation and fragmentation. Heap usage is between 1.5x and 2x that of standard Firefox using jemalloc. What I really want is a combination of jemalloc and Boehm, taking the best features from each:

Boehm Features:

  • Fast and threadsafe conservative collector
  • Smart rooting of all thread stacks and static data
  • Incremental marking with hardware write barriers [1]
  • Option for parallel collection [2]
  • Ability to intermingle collected and non-collected memory

jemalloc features:

  • Better overall memory usage, primarily due to lower fragmentation
  • Very tight and well-performing allocator

Help Wanted

I’m looking for somebody who’s willing to painstakingly combine the best of these two allocators: either port the jemalloc low-fragmentation design to Boehm, or port the Boehm collection mechanism to the jemalloc allocator. If you’re interested, please contact me. Getting a solution to this problem really blocks any serious plans for further work on XPCOMGC.

Notes

  1. The key word is hardware. The MMgc solution failed because altering our codebase to have correct programmatic write barriers was going to involve boiling the ocean. And even with smart pointers, a standard MMgc write barrier involves a lot of overhead.
  2. In Boehm, parallel collection doesn’t work with most incremental collection, and so we may not actually decide to use it; avoiding large pauses with incremental collection is more important.

When linking, the order of your command-line can be important

Monday, September 22nd, 2008

Occasionally, people will come on the #xulrunner or #extdev channel with a question about compiling XPCOM components. The question often goes something like this:

<IRCGuy> I’m following a tutorial on making XPCOM components, but I can’t seem to get them to compile. Can anyone tell me what my problem is?

Hint for asking a good question: IRCGuy needs to tell us some combination of 1) what tutorial he’s following, 2) what the failing command is, or 3) what the error message is.

This time, IRCGuy’s compile command and error message are:

IRCGuy@IRCGuy-france:/mnt/data/IRCGuy/public/project/xpcom-test$ make
g++  -I/usr/local/include/xulrunner-1.9/unstable -I/usr/local/include/xulrunner-1.9/stable -L/usr/local/lib/xulrunner-devel-1.9/lib -Wl,-rpath-link,/usr/local/bin  -lxpcomglue_s -lxpcom -lnspr4 -fno-rtti -fno-exceptions -shared -Wl,-z,defs  france2.cpp -o france2.so
/tmp/cceFg2dD.o: In function `NSGetModule':
france2.cpp:(.text+0x38c): undefined reference to `NS_NewGenericModule2(nsModuleInfo const*, nsIModule**)'

IRCGuy’s problem is one of link ordering: with most unix-like linkers, it is very important to list object files and libraries in the correct order. The general order to follow is:

  1. Object files
  2. Static libraries – from most specific to most general
  3. Dynamic libraries

If an object file needs a symbol, the linker will only resolve that symbol in static libraries that are later in the link line.

The corrected command:

g++  -I/usr/local/include/xulrunner-1.9/unstable -I/usr/local/include/xulrunner-1.9/stable -fno-rtti -fno-exceptions -shared -Wl,-z,defs  france2.cpp -L/usr/local/lib/xulrunner-devel-1.9/lib -Wl,-rpath-link,/usr/local/bin  -lxpcomglue_s -lxpcom -lnspr4 -o france2.so

Bonus tip: correct linker flags for linking XPCOM components can be found in the Mozilla Developer Center article on the XPCOM Glue. As noted in the article, XPCOM components want to use the “Dependent Glue” linker strategy.

Are you going to start plating soon?

Monday, September 22nd, 2008

I’m hoping to start blogging more regularly, and to model my blog after The Old New Thing. So I’m planning on posting in pairs: one technical post related to my work or Mozilla, and one non-technical post about my family, music, or other things I find interesting.

I think I am raising Food Network junkies. I was making BLT sandwiches for lunch the other day, and my three-year-old daughter Claire was hungry. She asked me “are you going to start plating soon?”

She adores Michael Symon, and often when we turn on Dinner: Impossible she says “I love Michael Symon, he’s beeeaautiful.” Claire wants to be an Iron Chef when she grows up. Ellie really enjoys it when Cat Cora is the Iron Chef, or when there’s a female challenger in general. And they all watch Good Eats with rapt attention.

It’s amazing to me how much and how quickly they learn. We play a “how do we cook it” game: I pick an ingredient, and they tell me how you’d cook it. Ellie, who is four years old, recently said to me, “Daddy, if you don’t heat the carrot pieces in a pan with oil, they won’t be soft in your carrot stew.” In their play-kitchen, they are always concocting soups and baked goods, and arguing about ingredients.

Profiling Dromaeo Testcases with Shark

Thursday, September 4th, 2008

I’m taking a break from garbage collection for a week or so: I got stuck, and there are lots of other things going on that I wanted to help with. Yesterday and today’s project was profiling some DOM testcases.

Two days ago, Jason landed a great patch to minimize the XPConnect overhead of DOM calls (fast-path DOM). Prior to this patch, many profiles of DOM scripting were dominated by XPConnect overhead (marshaling calls from JS to binary XPCOM). So I decided to re-do some of these profiles and see if there were any easy wins lurking, now that the noise was gone. I first ran the Dromaeo tests in a build from mozilla-central and compared the results to Safari on the same machine. Now I’m taking some of the comparatively worst performers and using Shark to profile the tests.

I figured that getting shark to profile individual tests would require some major hacking. But it turns out that Dromaeo already has support for wrapping tests with calls to generate Shark profiles! All I needed to do was hack a little bit to generate a single profile at a time.

I started by profiling the following test: DOM Modification (Prototype): update(). mozilla-central was 8x slower than Safari on this test.

  1. Start with a shark-enabled Firefox.
  2. Download or clone Dromaeo from here.
  3. Type `make web` to build a local copy of Dromaeo.
  4. Start shark for programmatic control as documented here.
  5. Point your browser at the test like so:
    file:///builds/dromaeo/web/index.html?dom-modify-prototype&shark=update&numTests=1
  6. Shark should do a little dance and pop up a profile viewer. For a quick overview on using the Shark profile viewer, see Vlad’s blog.
  7. By using the top-down view, I quickly discovered that over 70% of runtime was spent in a single function:

    [Screenshot: Shark top-down view]

  8. By double-clicking this function, I could see a heatmap of execution within the function: just two lines of code were responsible for most of the time!
    [Screenshot: heatmap of execution within jsregexp.cpp]

  9. This was more than enough evidence to file a bug.
  10. After a bit of conversation with Brian Crowder on IRC, I found that my initial hypothesis was wrong: the JS_ISSPACE macro was not really to blame. Every time the code encountered a \s or \S in a regular expression character class, it would loop over all 65,536 characters in the Unicode basic plane and ask a series of lookup tables “is this character a space?” Because there are only a handful of actual whitespace characters, I could replace this large loop with a small table of whitespace character ranges (a rough sketch of the idea follows this list).

  11. The patch made this particular test 77% faster, from 850ms to 195ms.
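Here is a rough, hypothetical sketch of the range-table idea (not the actual jsregexp.cpp patch): the whitespace characters are kept as a short list of inclusive ranges, so classifying a character means scanning a handful of entries instead of consulting lookup tables 65,536 times. The range list below is only illustrative; the real set comes from the engine’s Unicode tables.

#include <stdint.h>
#include <stdio.h>

struct CharRange { uint16_t start, end; };

// Illustrative whitespace ranges in the basic multilingual plane.
static const CharRange kWhitespaceRanges[] = {
    {0x0009, 0x000D}, {0x0020, 0x0020}, {0x00A0, 0x00A0},
    {0x2000, 0x200A}, {0x2028, 0x2029}, {0x3000, 0x3000},
};

static bool IsWhitespace(uint16_t c)
{
    // Scan the short range table instead of a 65,536-entry loop.
    for (unsigned i = 0; i < sizeof(kWhitespaceRanges) / sizeof(kWhitespaceRanges[0]); ++i)
        if (c >= kWhitespaceRanges[i].start && c <= kWhitespaceRanges[i].end)
            return true;
    return false;
}

int main()
{
    printf("U+0020: %d\n", IsWhitespace(0x0020));  // 1 (space)
    printf("U+0041: %d\n", IsWhitespace(0x0041));  // 0 ('A')
    return 0;
}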

I’ve already filed a bug on another test and will be working through at least four more significant slowdowns. Doing this profiling has been a lot of fun, and a nice change of pace from the garbage collection slog. I really encourage anyone who has a Mac to spend a little time with Shark and a performance issue: it actually makes visualizing and analyzing performance problems fun.

Teaching wget About Root Certificates

Wednesday, August 27th, 2008

I am setting up some temporary tinderboxes to repack localization builds. Because I don’t trust the DNS service from my home ISP, I wanted to download builds from ftp.mozilla.org using HTTPS. It turns out this was quite the challenging task, due to the following cute and relatively useless error message:

ERROR: Certificate verification error for ftp.mozilla.org: unable to get local issuer certificate
To connect to ftp.mozilla.org insecurely, use '--no-check-certificate'.

What this really means is “your copy of wget/OpenSSL didn’t come with any root certificates, and HTTPS just isn’t going to work until you get them and I know about them.”

Getting Root Certificates

The best way to get the root certificates you need is at this website. It has a tool that will convert the root certificates built into Mozilla NSS into the PEM format that OpenSSL expects. It also has pre-converted PEM files available for download if you’re lazy.

Installing cacert.pem into MozillaBuild (Windows)

To install cacert.pem so that it works with MozillaBuild:

  1. Copy cacert.pem to c:/mozilla-build/wget/cacert.pem
  2. Create the following configuration file at c:/mozilla-build/wget/wget.ini:
    ca_certificate=c:/mozilla-build/wget/cacert.pem

Ted filed a bug about setting this up automatically for a future version of MozillaBuild.

Installing cacert.pem on Mac

The following instructions assume you got your wget from MacPorts using port install wget.

  1. Copy cacert.pem to /opt/local/etc/cacert.pem
  2. Create the following configuration file at /opt/local/etc/wgetrc:
    ca_certificate=/opt/local/etc/cacert.pem